A Quantitative Analysis of Explainable Artificial Intelligence Techniques as Applied to Machine Learning Models for Breast Cancer Classification
Submitted to the University of York in partial fulfilment of the requirements for the degree of MSc Computer Science with Artificial Intelligence
Artificial Intelligence (AI) is being applied to an increasing number of facets of our lives. From recommendation engines through to complex systems designed to drive vehicles on public roads, there seems to be no end to the scope and variety of tasks to which AI techniques are being applied. These scenarios naturally sit on a spectrum from what are referred to as low-stakes applications to those that would be described as high-stakes. For example, a private individual may have less interest in YouTube’s recommendation engine providing appropriate and interesting content than they would in a system charged with detecting the presence of malignant cancers within tissue samples.
Machine Learning (ML) models can produce undesirable results or raise worrying questions. An individual’s life chances could be severely impacted by the misclassification of a tissue sample as benign instead of malignant. Where an individual has been harmed or exposed to potential harm, it is reasonable for that individual and the applicable regulatory bodies to ask why this happened so that suitable action can be taken. Developers of and researchers into AI systems may also be able to build better systems and refine their approaches if those systems can describe why their outputs were determined. The need for explainability has been recognised for many years: it was acknowledged during the development of expert systems in the late 1970s and early 1980s, and within AI it is now the focus of the sub-field of eXplainable Artificial Intelligence (XAI).
This research will attempt to determine the following:
- Can the Optimal Sparse Decision Tree (OSDT) computation technique developed by Hu, Rudin, and Seltzer be applied to build interpretable binary classification ML models?
- How do such ML models compare, in terms of accuracy, with competing models developed using an alternative classification technique? (An illustrative sketch of such a comparison follows this list.)
- How do such ML models compare, in terms of interpretability, with competing models developed using an alternative classification technique?
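To make the intended comparison concrete, the minimal sketch below shows how an accuracy and sparsity comparison between two binary classifiers might be set up in Python. It assumes scikit-learn's Breast Cancer Wisconsin dataset and uses a leaf-constrained CART decision tree as a stand-in for a sparse, interpretable tree; these are illustrative assumptions only, and the constrained CART model is not the OSDT algorithm of Hu, Rudin, and Seltzer.

```python
# Illustrative sketch only: a leaf-constrained CART tree (a stand-in for a
# sparse, interpretable tree -- NOT the OSDT algorithm itself) is compared
# against an unconstrained CART tree on the Breast Cancer Wisconsin dataset.
# The dataset choice and hyperparameters are assumptions for illustration.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Binary classification data: malignant vs. benign tumour measurements.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# "Sparse" stand-in: a tree restricted to a small number of leaves,
# trading accuracy for interpretability.
sparse_tree = DecisionTreeClassifier(max_leaf_nodes=8, random_state=42)
# Competing technique: an unconstrained CART tree.
full_tree = DecisionTreeClassifier(random_state=42)

for name, model in [("sparse-style tree", sparse_tree),
                    ("unconstrained tree", full_tree)]:
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    # Leaf count serves here as a crude proxy for interpretability.
    print(f"{name}: accuracy={acc:.3f}, leaves={model.get_n_leaves()}")
```

In the research itself, the constrained tree would be replaced by a model produced by an OSDT implementation, with accuracy and a suitable interpretability measure recorded for both models.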