
Deliverable 2: Principles to Support the Development and Deployment of Artificial Intelligence or Machine Learning-Enabled Medical Devices across Jurisdictions

United Kingdom Department of Health and Social Care, December 30, 2021


Contents

  1. Introduction
  2. Current context
  3. Vision for the future
  4. Supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions
  5. Conclusion

Introduction

As part of the objectives of the G7's health track artificial intelligence (AI) governance workstream 2021, member states committed to the creation of 2 complementary papers:

  1. The first paper seeks to define the phases and good practice for clinically evaluating artificial intelligence or machine learning (AI/ML) enabled medical devices.
  2. The second seeks to define and agree good practice for assessing the suitability of AI/ML-enabled medical devices developed in one G7 country for deployment in another G7 country.

These papers should be read in combination to gain a more complete picture of the G7's stance on the governance of AI in health.

This paper is the result of a concerted effort by G7 nations to contribute to the harmonisation of principles that support the development and deployment of AI/ML-enabled medical devices in health and care. It builds on existing international work in this area.

Current context

AI/ML-enabled medical devices are becoming increasingly prevalent in clinical settings around the world. Accompanying this growth are national and international efforts to increase scrutiny of the performance and safety of AI/ML-enabled medical devices.

Challenges remain with governing the use of AI/ML-enabled medical devices in healthcare, particularly in assessing and comparing the performance and safety of models. A lack of internationally recognised principles for reporting AI models means that assessing AI/ML-enabled medical devices against a standard of care – or against one another – is currently challenging and imprecise.[footnote 1]

Current efforts to assess the development and deployment of AI/ML-enabled medical devices across jurisdictions tend to consider only model performance. Given the safety concerns with AI, there is a need to explore how further factors – such as how a model was developed and how robust it is – can be taken into account when supporting the development and deployment of AI/ML-enabled medical devices.

As G7 nations, we have a key role to play in setting principles for what good looks like and supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions. This would enable greater transparency on the quality of AI/ML-enabled medical devices, and promote patient safety and responsible innovation.

We aim to progress our understanding and discussions on the following key questions. How do we, as G7 nations:

Vision for the future

As G7 countries, we are well positioned to lead the conversations on the development and adoption of safe, effective and performant AI/ML-enabled medical devices in health settings around the world.

As countries develop governance processes for AI/ML-enabled medical devices deployed in healthcare, it is an opportune moment for G7 nations to lead this conversation and encourage consensus on the principles for supporting the development and deployment of AI/ML-enabled medical devices, promoting patient safety and fostering innovation.

Internationally recognised principles for the development and deployment of AI/ML-enabled medical devices across jurisdictions could allow for the comparison of AI/ML-enabled medical devices deployed in different countries.

With many AI/ML-enabled medical devices in healthcare trained on data from one country but deployed in another, internationally recognised mechanisms to test the robustness and generalisability of AI/ML-enabled medical devices could support countries as they navigate a growing new marketplace. More information about models, and more standardised reporting of the data sets behind them, is necessary for models to safely traverse borders. These mechanisms also have the potential to be used in fast-tracking future regulatory approvals, as appropriate, from country to country.

As G7 nations, we recognise the need to work together to define this set of principles for supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions. This will accelerate international adoption of AI/ML-enabled medical devices in healthcare.

We aim to build a set of values in our countries for supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions that:

Supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions

We are committed to the international harmonisation of principles that can be used at different stages of the development cycle of AI/ML-enabled medical devices. The principles laid out below should be read in conjunction with the discussion outlined in Deliverable 1: principles for the evaluation of AI/ML enabled medical devices to assure safety, effectiveness and ethicality.

1. Understanding the data

Supporting the development and deployment of AI/ML-enabled medical devices is dependent on good data governance and management practices. It is also contingent on manufacturers understanding and providing information about how data used by AI/ML-enabled medical devices is collected, organised, labelled and processed (even if this is carried out by a contracted third party).

By understanding and improving the quality of the data that is being fed into these models we can work towards mitigating bias and discrimination and foster fairness and equity.
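
For illustration only, the following minimal sketch (in Python, assuming a hypothetical tabular training set with a demographic column and a binary label) shows the kind of automated data-quality report that can surface missingness and outcome imbalance across subgroups before a model is trained:

    import pandas as pd

    def data_quality_report(df, label_col, group_col):
        """Summarise record counts, label prevalence and missingness per subgroup.

        Large gaps between subgroups are an early warning of potential bias.
        """
        return pd.DataFrame({
            "n": df.groupby(group_col).size(),
            "label_prevalence": df.groupby(group_col)[label_col].mean(),
            "missing_fraction": df.drop(columns=[label_col])
                                  .isna().any(axis=1)
                                  .groupby(df[group_col]).mean(),
        })

    # Hypothetical example: a small chest X-ray triage training set
    df = pd.DataFrame({
        "sex":   ["F", "F", "M", "M", "M", "F"],
        "age":   [34, None, 51, 62, 45, 29],
        "label": [1, 0, 1, 1, 0, 0],
    })
    print(data_quality_report(df, label_col="label", group_col="sex"))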

a. Provide information about data collection

b. Provide information about data quality

c. Provide information about data organisation

We recognise the need to ensure:

d. Provide information about data labelling

Data labels are used to train AI/ML-enabled medical devices by providing a ground truth for the model to learn from. Within healthcare, labels can also be used to compare AI/ML-enabled medical device performance with a reference standard (for example, the performance of clinicians).

Inconsistent or poor-quality labels can lower model performance or introduce biases into AI/ML-enabled medical devices.

Accepted best available methods for developing a reference data set and labelling data ensure clinically relevant and well characterised data is collected, and that the limitations of the reference data are understood.
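
As a minimal sketch (assuming two hypothetical annotators have labelled the same cases), inter-annotator agreement measures such as Cohen's kappa can be used to quantify label consistency before a reference data set is accepted:

    from collections import Counter

    def cohens_kappa(labels_a, labels_b):
        """Cohen's kappa: agreement between two raters, corrected for chance."""
        n = len(labels_a)
        observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
        # Chance agreement from each rater's marginal label frequencies
        freq_a, freq_b = Counter(labels_a), Counter(labels_b)
        expected = sum(freq_a[k] * freq_b.get(k, 0) for k in freq_a) / n**2
        return (observed - expected) / (1 - expected)

    # Two hypothetical radiologists labelling the same 10 scans (1 = abnormal)
    rater_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
    rater_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
    print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")  # 0.60: moderate agreement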

e. Provide information about data processing

Data processing includes mitigating inconsistencies or limitations in the data ('data cleaning') and manipulating the raw data ('feature engineering'). How data is processed can materially affect model performance, so processing steps should be recorded transparently.[footnote 2]
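
As a hedged illustration (the column names, plausibility ranges and sentinel values below are hypothetical), a processing pipeline of this kind might look as follows:

    import numpy as np
    import pandas as pd

    def clean_and_engineer(df):
        """Example processing pipeline; every step should be documented."""
        df = df.copy()
        # Data cleaning: null out physiologically implausible values, then impute
        df.loc[~df["heart_rate"].between(20, 300), "heart_rate"] = np.nan
        df["heart_rate"] = df["heart_rate"].fillna(df["heart_rate"].median())
        # Feature engineering: derive a model input from raw measurements
        df["bmi"] = df["weight_kg"] / df["height_m"] ** 2
        return df

    raw = pd.DataFrame({
        "heart_rate": [72, 999, None],   # 999 is a sentinel for 'not recorded'
        "weight_kg":  [70.0, 82.5, 55.0],
        "height_m":   [1.75, 1.80, 1.62],
    })
    print(clean_and_engineer(raw))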

2. Understanding the model

AI is a rapidly changing field. Standardising the reporting of how AI/ML-enabled medical devices are developed and the metrics used to calculate model performance can improve transparency and make models easier to compare across jurisdictions.

a. Reporting guidelines

Reporting guidelines could be useful tools for assessing whether developers adhered to good scientific principles, and could be incorporated into the process of supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions.

Internationally recognised guidelines are currently being updated to include AI-driven technologies, such as the SPIRIT-AI and CONSORT-AI extensions for clinical trial protocols and reports.
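
Alongside such guidelines, reporting can be made machine readable. The sketch below illustrates a simple structured record in the spirit of the 'model facts labels' cited in footnote 1; the field names and values are illustrative assumptions, not a prescribed schema:

    import json
    from dataclasses import dataclass, field, asdict

    @dataclass
    class ModelFacts:
        """Illustrative machine-readable summary of an AI/ML-enabled medical device."""
        name: str
        intended_use: str
        training_data: str        # provenance of the training data
        evaluation_metrics: dict  # metric name -> value on held-out data
        limitations: list = field(default_factory=list)

    facts = ModelFacts(
        name="SepsisRisk v1.2 (hypothetical)",
        intended_use="Early warning of sepsis in adult inpatients",
        training_data="EHR data from two hospitals, 2015 to 2019",
        evaluation_metrics={"sensitivity": 0.88, "specificity": 0.79},
        limitations=["Not validated in paediatric populations"],
    )
    print(json.dumps(asdict(facts), indent=2))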

b. Metrics

A range of different metrics (such as specificity and sensitivity) can be used to evaluate the performance of AI/ML-enabled medical devices. Which metrics are applicable will depend on the model's use case.

Transparency about, and justification of, the metrics used – stated in advance wherever possible – will help in assessing whether a metric gives a misleading representation of the model's performance.

Where feasible, standardised metrics could also allow for more seamless support of the development and deployment of AI/ML-enabled medical devices across jurisdictions.
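
A minimal sketch of how two commonly used metrics, sensitivity and specificity, are computed from binary predictions; the labels and predictions below are hypothetical:

    def sensitivity_specificity(y_true, y_pred):
        """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
        tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
        tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
        fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
        fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
        return tp / (tp + fn), tn / (tn + fp)

    # Hypothetical held-out test labels and model predictions
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

Because a model can trade sensitivity against specificity simply by moving its decision threshold, reporting a single metric in isolation can be misleading – which is why specifying metrics in advance matters.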

c. Information to users

Users should be provided with ready access to clear, contextually relevant information that is appropriate for the intended audience (such as clinicians or patients).

3. Robustness

Once deployed in the real world, AI/ML-enabled medical devices receive input data that is often noisy, contains defects or changes over time – all of which can lower model performance or introduce biases. Data quality can be especially variable within healthcare due to factors unique to each clinical setting – magnifying any robustness concerns.

Due to the valid – and well documented – safety concerns with deploying AI models in healthcare, robustness should be an explicit consideration when supporting the development and deployment of AI/ML-enabled medical devices across jurisdictions.

To champion transparency and foster fairness and equity, the following areas are a particular priority:

a. Corruption robustness

Noise, defects or changes to training datasets may lower model performance or introduce biases into the modelling.[footnote 3] Healthcare data can be subtly altered by factors specific to individual clinical settings – making this a particular concern for AI/ML-enabled medical devices.

Performance will never be static against all different data inputs – thresholds (informed by standards of care) will be required to determine acceptable ranges of model performance.
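
A hedged sketch of such a check: the model is re-evaluated on inputs with synthetic noise added and compared against a pre-agreed performance threshold. The toy model, noise levels and threshold are all assumptions for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    def model_predict(x):
        """Placeholder classifier: flags samples whose mean feature value exceeds 0.5."""
        return (x.mean(axis=1) > 0.5).astype(int)

    def accuracy_under_corruption(x, y, noise_std):
        """Accuracy after adding Gaussian noise, simulating sensor or site variation."""
        x_noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        return float((model_predict(x_noisy) == y).mean())

    x = rng.uniform(0, 1, size=(1000, 8))
    y = model_predict(x)                   # clean-data labels for this toy model
    THRESHOLD = 0.90                       # pre-agreed acceptable performance

    for noise_std in (0.0, 0.1, 0.3):
        acc = accuracy_under_corruption(x, y, noise_std)
        flag = "OK" if acc >= THRESHOLD else "BELOW THRESHOLD"
        print(f"noise_std={noise_std}: accuracy={acc:.3f} [{flag}]")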

b. Testing robustness

Data sets to test robustness could include:

c. Post-market surveillance

Changes in deployment environments may introduce unintended bias or new ways in which data can become corrupted. Monitoring for model degradation and unintended bias, and mitigating data corruption, must be a continuous process for all deployed AI/ML-enabled medical devices.
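
A minimal sketch of what continuous monitoring might look like in code, assuming a hypothetical stream of predictions and confirmed outcomes: rolling performance is compared with the level observed at approval, and a sustained drop triggers review:

    from collections import deque

    class PerformanceMonitor:
        """Rolling-window accuracy monitor with a degradation alert."""
        def __init__(self, baseline_accuracy, tolerance=0.05, window=500):
            self.baseline = baseline_accuracy
            self.tolerance = tolerance
            self.results = deque(maxlen=window)   # 1 = correct, 0 = incorrect

        def record(self, prediction, outcome):
            self.results.append(int(prediction == outcome))

        def degraded(self):
            if len(self.results) < self.results.maxlen:
                return False                      # not enough data yet
            current = sum(self.results) / len(self.results)
            return current < self.baseline - self.tolerance

    # Hypothetical usage inside a deployment pipeline
    monitor = PerformanceMonitor(baseline_accuracy=0.92)
    monitor.record(prediction=1, outcome=1)
    if monitor.degraded():
        print("Model performance degraded: trigger clinical safety review")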

Frequency of post-deployment model testing will depend on, among other factors:

d. Adversarial robustness

A malicious actor could deliberately design perturbations within a data set or model input to mislead AI/ML-enabled medical devices or change modelling outputs – a particular risk in the field of imaging.
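
The fast gradient sign method (FGSM) is a standard way to probe this kind of vulnerability. Below is a toy sketch on a logistic-regression 'image' classifier in pure NumPy; the weights, image and perturbation budget are illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(1)
    w = rng.normal(size=64)                # toy weights for a flattened 8x8 'image'
    b = 0.0

    def predict_prob(x):
        """Probability of the positive class under a toy logistic model."""
        return 1.0 / (1.0 + np.exp(-(x @ w + b)))

    def fgsm(x, y, epsilon):
        """Fast gradient sign method: nudge each pixel to increase the loss.

        For logistic regression, d(loss)/dx = (p - y) * w, so the attack
        adds epsilon times the sign of that gradient.
        """
        grad = (predict_prob(x) - y) * w
        return np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)

    x = rng.uniform(0, 1, size=64)         # a hypothetical normalised image
    y = 1.0
    x_adv = fgsm(x, y, epsilon=0.05)
    print(f"clean p={predict_prob(x):.3f}, adversarial p={predict_prob(x_adv):.3f}")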

However, it is more likely that adversarial examples arise in health and care due to human error caused by frontline pressures or staff fatigue.

Reliability of deployment infrastructure environments and cyber security protections are key considerations for defending AI/ML-enabled medical devices against malicious activity.

Conclusion

As G7 nations, we are committed to working together – taking into consideration different jurisdictional frameworks and in close co-ordination with other international initiatives – to promote the harmonisation of principles and eventually standards for how we support the development and deployment of AI/ML-enabled medical devices across jurisdictions.

We will:

  1. There is some work underway on the standardisation of model reporting, such as Sendak MP, Gao M, Brajer N and others. 'Presenting machine learning model information to clinical end users with model facts labels.' npj Digital Medicine 2020: volume 3, article 41.

  2. If the manufacturer utilises AI regulatory sandboxes, this should be made transparent in order to enable the user to evaluate if data has been processed legally.

  3. This will need to be balanced against point 1a.ii.


Source: UK Government, Department of Health and Social Care

