Provider organizations need AI trainers to ensure quality outcomes

Even the best algorithm can take months of verification before its performance is trusted, a healthcare systems expert says. So IT leaders should budget for the training process whenever they evaluate any artificial intelligence technology.
By Bill Siwicki
11:12 AM

Matt Hollingsworth, cofounder and chief innovation officer at Carta Healthcare, with his mother, his inspiration for starting the company

Photo: Matt Hollingsworth

This year, healthcare data experts will begin to take on a new role as artificial intelligence trainers, contends Matt Hollingsworth, cofounder and chief innovation officer at Carta Healthcare, a healthcare AI systems company.

"Although the adoption of AI in healthcare is nothing new, there will continue to be a growing need for AI technology in 2024 and beyond," he said. "With an overall lack of manpower in healthcare, as seen in nursing and staff shortage trends, AI looks like the best solution for retaining existing manpower at competitive compensation rates while increasing efficiency in workflow and improving clinician job satisfaction.

"According to the National Library of Medicine, the key to successful AI implementation is to do it in a clinically relevant way that clinical caregivers can get behind," he continued. "It's not only about the technology, it's about how technology and caregivers work together in a trusted way to believe in, train and commit their AI solutions to provide long-term value."

We interviewed Hollingsworth to better understand his beliefs on the need for AI trainers in healthcare.

Q. You say healthcare provider organizations today need artificial intelligence trainers. Why?

A. Fundamentally, it is because no class of AI produces output of high enough quality to be trusted for a given task until it has been verified to perform that specific task well. To perform that verification, you need subject matter experts – and we call these people AI trainers.

Let's make this concrete. Imagine you want to build a system that will chat with patients to answer their clinical questions about diagnoses they've received (like a chatbot WebMD). In principle, generative AI could do that. Here is what happens when you ask ChatGPT to give you some details about a clinical diagnosis:

The first answer is entirely wrong and would mislead any patient who received it. The second answer is perfectly fine. Before you send your product out into the wild to answer patients' questions, there is one question you absolutely must answer: How often is it right, and how often is it wrong?

Once you have that answer, you must decide whether it is good enough to help solve your problem or if it will cause more harm than good. How do you do that? In this case, you ask many questions and then have an "AI trainer" verify the output and score it based on accuracy. Then you take those findings and decide whether or not it's good enough. Unless you don't care whether your product works, there isn't any way around this – someone needs to check the veracity of the output.
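The scoring loop described above – collect outputs, have a trainer label each one, compute an accuracy rate, then make a go/no-go call – can be sketched in a few lines. This is a minimal illustration, not anything from Carta Healthcare; the names (`Answer`, `good_enough`) and the 95% threshold are assumptions chosen purely for the example.

```python
# Minimal sketch of the AI-trainer scoring process described above.
# All names and the accuracy threshold are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Answer:
    question: str
    ai_response: str
    trainer_verdict: bool  # True if the AI trainer judged the response accurate

def accuracy(scored: list[Answer]) -> float:
    """Fraction of AI responses the trainer marked as accurate."""
    if not scored:
        return 0.0
    return sum(a.trainer_verdict for a in scored) / len(scored)

def good_enough(scored: list[Answer], threshold: float = 0.95) -> bool:
    """Deployment decision: does measured accuracy clear the bar?"""
    return accuracy(scored) >= threshold
```

On the two-answer ChatGPT example from the interview – one wrong, one fine – this would measure 50% accuracy, well below any plausible bar for a patient-facing tool.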

Today, no single "generalized AI" algorithm can take any problem and perform it at a human level. So, for any class of AI you choose – in the example above, a generative text AI model, ChatGPT – you must verify its performance against a specific, quantifiable problem before knowing whether the algorithm will add value to solving your problem. We call the people who do that verification process "AI trainers."

Q. What does the role of an AI trainer look like? What exactly do they need to be doing?

A. The role of an AI trainer is multifaceted and involves critical evaluation of an AI algorithm's outputs using real-world data. This professional assesses whether the AI's performance aligns with expected outcomes and accuracy. The scope and methods of an AI trainer's work depend highly on the AI algorithm's specific application.

For example, in scenarios where the AI algorithm is tasked with responding to patient inquiries about medical diagnoses, the AI trainer must evaluate the responses for their relevance and correctness. This involves comparing the AI's answers with verified medical information to ensure accuracy.

The AI trainer's role becomes more intricate in more complex applications, such as when an AI algorithm is designed to estimate blood loss during surgery through image analysis. Here, they must measure the blood loss independently and then compare these measurements with the AI's estimates, ensuring the AI's precision in real-time medical situations.
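For a numeric task like blood-loss estimation, the trainer's comparison naturally becomes an error measurement rather than a right/wrong verdict. A toy sketch, with mean absolute error as the metric and a 50 mL tolerance chosen purely for illustration (neither appears in the interview, and the tolerance is not a clinical standard):

```python
# Sketch: compare AI blood-loss estimates (mL) against a trainer's
# independent measurements. The 50 mL tolerance is an illustrative
# assumption, not a validated clinical threshold.

def mean_absolute_error(ai_estimates: list[float], measured: list[float]) -> float:
    """Average absolute gap between AI estimates and trainer measurements."""
    assert len(ai_estimates) == len(measured), "one measurement per case"
    return sum(abs(a - m) for a, m in zip(ai_estimates, measured)) / len(measured)

def within_tolerance(ai_estimates: list[float],
                     measured: list[float],
                     tolerance_ml: float = 50.0) -> bool:
    """Deployment decision for a numeric estimator."""
    return mean_absolute_error(ai_estimates, measured) <= tolerance_ml
```

The point is the same as in the chatbot case: the trainer produces an independent ground truth, and the decision to deploy rests on a quantified comparison against it.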

Similarly, if the AI summarizes clinical documentation, the AI trainer must verify the AI-generated summaries are comprehensive and reflect the key points of the actual documents. This involves a detailed comparison between the AI's output and the original clinical records.

Lastly, in cases where the AI assists in detecting missed billing codes, the AI trainer's job is to confirm the codes suggested by the AI are relevant and applicable. They must cross-reference the AI's suggestions with the medical services provided, ensuring billing is accurate and comprehensive.

In summary, an AI trainer's role is crucial in validating and refining AI algorithms across various domains, ensuring the AI's output is technically correct, practically applicable and reliable in real-world scenarios.

Q. What titles or roles that exist today at hospitals and health systems need to take on the responsibilities of AI trainer, and why them?

A. In implementing AI in hospitals and health systems, the roles best suited to take on the responsibilities of an AI trainer are those professionals who already possess deep subject matter expertise in the specific tasks the AI is designed to perform. However, it's important to note these professionals would need additional training in AI to effectively bridge the gap between their domain expertise and the technical aspects of the technology.

Here's a breakdown of specific roles and why they are suitable.

Q&A Bot – The ideal AI trainer for an AI handling patient questions about diagnoses would be a doctor. Doctors have the necessary medical knowledge and experience to assess the accuracy and appropriateness of AI-generated responses. Their expertise in diagnosis and patient communication is crucial for ensuring the AI provides medically accurate and contextually relevant answers.

Blood Loss Estimation – An operating room nurse is well-placed to train the AI. OR nurses have firsthand experience in surgical settings and are skilled in assessing patient conditions during surgery, including estimating blood loss. Their practical knowledge is vital for training an AI to analyze images and estimate blood loss accurately.

Clinical Summary – Doctors, physician assistants or nurse practitioners could effectively manage this task. These professionals are experienced in creating and interpreting detailed clinical documentation. Their expertise is essential to ensure AI-generated summaries of clinical documentation are accurate and include all critical medical information.

Billing Coding – A coding specialist is the most appropriate choice for training an AI in billing coding. Coding specialists have a comprehensive understanding of medical billing codes and their application in various healthcare scenarios. Their role in ensuring accurate and efficient billing aligns with the AI's purpose, making them ideal for training and overseeing the AI in this area.

In each of these cases, the selected professionals already have the domain knowledge and experience in the tasks that AI aims to automate or assist. The additional requirement for them to be effective AI trainers is a foundational understanding of AI principles and operations. This knowledge can be acquired through specialized training, enabling them to bridge their subject matter expertise with the technical nuances of AI algorithms and applications.

Q. When vendors with AI in their systems are involved, who should the AI trainers be working with and how should they be working as a conduit between vendors and users?

A. This highly specialized process has only been around at scale for a few years now, so it's unlikely hospitals have these people just sitting around yet. At least in the near future, the most common model will be having the vendors with AI in their systems provide the trainers necessary to implement their product via a services contract, most likely during the technology implementation.

These trainers will need to be subject matter experts both in the task at hand and the AI algorithms themselves, and having the latter experience will be infinitely easier if the trainer is an employee of the company that made the algorithm in the first place.

Generally, this will take the form of a hospital-employed AI trainer working with a larger team of vendor-side AI trainers during the implementation phase of any given project. The hospital-side trainer will be setting performance requirements, sanity-checking the output, and doing spot checks to ensure they trust the output of the process.

The vendor-side folks will be doing the grunt work of collecting enough examples to establish statistical certainty about performance and working with the rest of the team to address any shortcomings that surface during the implementation process.
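Getting "certainty on the performance" is ultimately a sample-size question: how many trainer-scored examples are needed before the measured accuracy is trustworthy? A rough sketch using the normal-approximation confidence interval – a standard statistical choice, not something prescribed in the interview:

```python
# Rough sketch: how tight is our estimate of the AI's accuracy,
# given how many examples the trainers have scored so far?
import math

def accuracy_confidence_interval(correct: int, total: int, z: float = 1.96):
    """Normal-approximation 95% CI for a measured accuracy rate.

    z = 1.96 corresponds to 95% confidence; this is the textbook
    approximation, adequate away from accuracies near 0 or 1.
    """
    p = correct / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

With 90 of 100 answers correct, the interval is roughly (0.84, 0.96) – too wide to distinguish "acceptable" from "harmful" for many tasks. Scoring 1,000 examples narrows it considerably, which is why the vendor-side collection work is genuinely labor-intensive.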

Q. What would you say are a couple big-picture issues AI trainers should be addressing with healthcare AI users?

A. The most important thing is to remind everyone we don't currently have – and likely will not have in our lifetimes – a generalized AI algorithm that can do any task we throw at it, so this training process is essential.

If a given task/AI algorithm pair doesn't have the trainer's blessing – in the form of a quantified accuracy measurement on real-life data at the institution where it is being deployed – then users shouldn't use that AI tool. My earlier ChatGPT example illustrates why: the model is great at answering GRE questions but not so great at answering clinical questions.

The blind application of any AI system to a task is a sure formula for disaster.

Budget for this training if you're buying an AI solution. Fundamentally, this is similar to a clinical trial for a medical device. We would never use an EKG machine whose manufacturer didn't verify its accuracy. Similarly, we should only use AI algorithms once we've verified they work.

The primary difference is that, unlike patient physiology, the data AI algorithms consume varies wildly from institution to institution based on IT infrastructure and documentation practices. And these algorithms are not sentient. They can't just correct themselves magically because your EHR documents weight in kg when the place where the algorithm was trained documented weight in lbs.
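The kg-versus-lbs mismatch is exactly the kind of silent failure a per-institution verification pass can catch. A toy plausibility check, with a cutoff chosen purely for illustration (not a validated clinical rule):

```python
# Toy sanity check for the kg-vs-lbs problem described above.
# The cutoff (adult weights above 250 in a kg field look suspicious)
# is an illustrative heuristic, not a validated clinical rule.

LB_PER_KG = 2.20462

def flag_suspected_lbs(weights_kg: list[float], cutoff: float = 250.0) -> list[bool]:
    """Flag values that look like pounds recorded in a kilograms field."""
    return [w > cutoff for w in weights_kg]

def lbs_to_kg(weight_lb: float) -> float:
    """Convert a pounds value to kilograms."""
    return weight_lb / LB_PER_KG
```

A check this simple still has to be run against each institution's actual data, because the documentation conventions it guards against differ site by site – which is the interviewee's larger point.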

That means any algorithms touching data in any EHR or similar system must be verified at each institution rather than globally. As a result, this is often a labor-intensive process, and people expecting this to happen overnight will be disappointed.

Even the best AI algorithm can still take months of performance verification. As such, leaders should ensure they budget for the training process whenever they evaluate any AI solution. If they expect to "turn on the AI" and have it add value, they will be disappointed 100% of the time.

Follow Bill's HIT coverage on LinkedIn: Bill Siwicki
Email him: bsiwicki@himss.org
Healthcare IT News is a HIMSS Media publication.
