
Test shows AI not ready to take over for doctors just yet

AI was unable to pass one of the qualifying radiology examinations - suggesting the technology is not yet ready to replace human medics.

Laptop computer with a glowing circuit brain, an artificial intelligence concept.
(Golden Dayz via Shutterstock)

By Chris Dyer via SWNS

Artificial intelligence is not ready to take over from doctors and perform major surgery or examinations, experts said after a bot program failed a major radiology test.

AI is increasingly being used for some tasks that doctors carry out, such as interpreting radiographs, X-rays and scans to help diagnose a range of conditions.

But an AI was unable to pass one of the qualifying radiology examinations - suggesting the technology is not yet ready to replace human medics, scientists said.

Researchers compared the performance of a commercially available AI tool with that of 26 radiologists, most of them aged between 31 and 40 and 62 percent of them female.

All the human candidates had passed the Fellowship of the Royal College of Radiologists (FRCR) exam the previous year. The test is taken by UK trainees to qualify as radiology consultants.

They developed 10 ‘mock’ rapid reporting exams based on one of the three modules that make up the qualifying FRCR paper, a module designed to test candidates for speed and accuracy.

Each mock exam was made up of 30 radiographs at the same level of difficulty and breadth of knowledge as, or higher than, that expected for the real FRCR exam.

To pass, candidates had to correctly interpret at least 27 - 90 percent - of the 30 images within 35 minutes.
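In other words, the pass rule combines an accuracy threshold with a time limit. A minimal sketch of that rule in Python - the function name is a hypothetical assumption for illustration, while the 27-of-30 (90 percent) and 35-minute figures come from the article:

```python
# Illustrative sketch of the mock exam pass rule described above.
# "passes_mock_exam" is an assumed name, not anything from the study.

def passes_mock_exam(correct: int, minutes_taken: float,
                     pass_mark: int = 27, time_limit: float = 35.0) -> bool:
    """True only if both the accuracy and the time criteria are met."""
    return correct >= pass_mark and minutes_taken <= time_limit

print(passes_mock_exam(correct=27, minutes_taken=34.0))  # True
print(passes_mock_exam(correct=26, minutes_taken=20.0))  # False: below 27 of 30
```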

The AI candidate had been trained to assess chest and bone - known as musculoskeletal - radiographs for several conditions including fractures, swollen and dislocated joints, and collapsed lungs.

Allowances were made for images relating to body parts that the AI had not been trained on, which were deemed “uninterpretable”, scientists said.

When uninterpretable images were excluded from the analysis, the AI achieved an average overall accuracy of 79.5 percent and passed two of the 10 mock FRCR exams, while the average radiologist achieved an accuracy of 84.8 percent and passed four of the 10.

The sensitivity - or the ability to correctly identify patients with a particular condition - for the AI candidate was 83.6 percent, compared with 84.1 percent for the radiologists tested.

The specificity - or the ability to correctly pick out patients without a certain illness - was 75.2 percent for the AI and 87.3 percent across all the humans who took the exams.
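For readers unfamiliar with the metrics, both are simple proportions computed from the four possible outcomes of a reading. A minimal illustrative sketch in Python - the counts below are made-up examples, not data from the study:

```python
# Illustrative only: how sensitivity and specificity are calculated.
# The example counts are assumptions, not figures from the BMJ study.

def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of genuinely abnormal images correctly flagged as abnormal."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Share of genuinely normal images correctly reported as normal."""
    return true_neg / (true_neg + false_pos)

# A hypothetical batch of 100 abnormal and 100 normal radiographs:
print(f"sensitivity = {sensitivity(true_pos=84, false_neg=16):.1%}")  # 84.0%
print(f"specificity = {specificity(true_neg=75, false_pos=25):.1%}")  # 75.0%
```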

A team of experienced surgeons performing a complex operation in a well-lit operating room.
(Juice Verve via Shutterstock)

Of the 148 out of 300 radiographs that more than 90 percent of radiologists interpreted correctly, the AI candidate was correct on 134 - or 91 percent - and incorrect on the remaining 14 - or nine percent, the researchers said.

Of the 20 out of 300 radiographs that more than half of the radiologists interpreted incorrectly, the AI candidate was incorrect on 10 - or 50 percent - and correct on the remaining half.

Scientists found that radiologists slightly overestimated the likely performance of the AI, assuming it would perform almost as well as they did on average and outperform them in at least three of the 10 mock exams.

But this was not the case, the researchers said, adding in their report on the study: “On this occasion, the artificial intelligence candidate was unable to pass any of the 10 mock examinations when marked against similarly strict criteria to its human counterparts, but it could pass two of the mock examinations if special dispensation was made by the RCR to exclude images that it had not been trained on.”

More training and revision were "strongly recommended" by the researchers, particularly for cases the AI considered "non-interpretable", such as abdominal radiographs and those of the axial skeleton - the bones of the head and trunk of a vertebrate.

AI may help ease medics' workflows, but human input is still crucial at this stage of the technology, scientists said.

Researchers said using artificial intelligence “has untapped potential to further facilitate efficiency and diagnostic accuracy to meet an array of healthcare demands”.

But the experts added that doing so appropriately “implies educating physicians and the public better about the limitations of artificial intelligence and making these more transparent”.

The research in the AI field is "buzzing", the experts said, and this study showed that one major aspect of radiology - passing the FRCR exam needed to qualify for practice - still benefits from the human touch.

These are observational findings, and the researchers looked at only one AI tool, the scientists said.

The study also used only mock exams, which were not sat under timed, supervised exam conditions, so radiologists may not have felt as much pressure to do their best as they would in a real exam, the experts added.

However, this study is one of the more comprehensive comparisons between radiologists and AI to date, providing a wide range of scores and results for analysis, the researchers said.

The study was published in the Christmas issue of The BMJ.

