Introductory Reading List

With the COVID-19 pandemic impacting every aspect of our lives around the globe, many people are taking an interest in epidemiological modelling for the first time. With this in mind, we have prepared a set of resources to serve as an introduction to the field for people with different levels of interest and prior knowledge.



The main audience for each section is indicated next to its entry in the contents and under its section header, using the following labels:
NTF - New to the field
GI - General interest in the field, without wanting to learn the details of methodology and best practice
PM - Practitioners and modellers

Common Terms



As with any technical field, several commonly used terms may be unfamiliar to newcomers. Below is a list of such terms for scientific modelling.*

*Note: These entries should not be considered perfect definitions. They are simple descriptions of the concepts, intended to aid understanding.

  • Scientific modelling - The act of using a technical methodology to represent (and generally simplify) a system in order to better understand that system. Common types include simulation and statistical modelling. It should be noted that a single article can combine several types of modelling. For example, statistical modelling is often used to understand the details of the system before simulations of the system are created.

  • Statistical modelling - Gathering data from a system to look for trends and patterns. A lot of scientific enquiry and machine learning falls under this category.

  • Simulation - Modelling the details of a system to replicate or predict the behaviour of the system. Common types include agent-based modelling and equation-based modelling.

  • Equation-based models - Also known as EBMs or mathematical modelling. These models are created by using specific ‘simplifying’ assumptions about the system so that it can be described by equations. Physics, for example, is characterized by EBMs that describe the motion of objects.

  • Agent-based models - Also known as ABMs or individual-based models. These are computer simulations that model a system as an environment that contains agents. The ways in which the agents interact with the environment and with each other are fixed in advance, but this type of model is characterized by emergent behaviour. See the encyclopaedia article for more information.

  • Emergent behaviour - The effect where many simple individual interactions can cause large and complex group dynamics. See the flocking behaviour in the Boid model in the encyclopaedia article.

  • Non-pharmaceutical interventions - Policies enforced by governments to limit the spread of disease without the use of pharmaceuticals. Examples include social distancing and the use of masks.
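
To make the difference between equation-based and agent-based models concrete, the sketch below simulates the same simple epidemic both ways. It is only an illustration and is not taken from any of the papers listed here: the SIR (susceptible, infected, recovered) structure, the function names, and the parameters `beta` (daily transmission rate) and `gamma` (daily recovery probability) are all assumptions chosen for this example.

```python
import random


def sir_equation_based(n, i0, beta, gamma, days):
    """Equation-based sketch: the population is described by three numbers
    (s, i, r) that are updated once per day by fixed equations."""
    s, i, r = float(n - i0), float(i0), 0.0
    infected = [i]
    for _ in range(days):
        new_inf = beta * s * i / n   # assumes a perfectly mixed population
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        infected.append(i)
    return infected


def sir_agent_based(n, i0, beta, gamma, days, seed=0):
    """Agent-based sketch: every individual has its own state and changes
    state at random, so repeated runs with different seeds differ."""
    rng = random.Random(seed)
    states = ["I"] * i0 + ["S"] * (n - i0)
    infected = [i0]
    for _ in range(days):
        n_inf = states.count("I")
        # Chance that one susceptible agent is infected today, assuming each
        # infected agent independently transmits with probability beta / n.
        p_inf = 1 - (1 - beta / n) ** n_inf
        next_states = []
        for state in states:
            if state == "S" and rng.random() < p_inf:
                next_states.append("I")
            elif state == "I" and rng.random() < gamma:
                next_states.append("R")
            else:
                next_states.append(state)
        states = next_states
        infected.append(states.count("I"))
    return infected


if __name__ == "__main__":
    ebm = sir_equation_based(1000, 5, 0.3, 0.1, 100)
    abm = sir_agent_based(1000, 5, 0.3, 0.1, 100, seed=1)
    print("Equation-based peak:", round(max(ebm)))
    print("Agent-based peak:   ", max(abm))
```

The equation-based version always gives the same curve, while the agent-based version gives a different curve for each seed. Neither function includes the heterogeneity (households, age groups, movement) that makes real agent-based models worth their extra cost.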

General Introduction to Agent-Based Modelling


Agent-based Modelling - Gallagher and Bryson

This is an encyclopaedia article on agent-based modelling and is, therefore, a very good introduction to the field. It provides historical context for the methodology and explains its usage both conceptually and practically. The Boid model by Craig Reynolds is used as an illustrative example.




A Comparison of Agent-Based Models and Equation-Based Models for Infectious Disease Epidemiology - Hunter et al.

This paper uses an ABM and an EBM to study the course of measles outbreaks in early 20th century Irish towns. It explains the methods used in particularly comprehensible plain English and finds that the extra information provided by the agent-based model is worth the extra time needed to set up and run it. This result is common for the small populations being studied.
This paper provides a good introduction to both types of modelling with minimal maths and jargon, useful to those looking to learn more about the field of disease modelling from other backgrounds.

Mathematical and computational approaches to epidemic modeling: a comprehensive review - Duan et al.

In this review article, the authors describe three major types of epidemic modelling: mathematical models, complex network models and agent-based models. They explore existing work in each area and compare the key benefits and limitations of each type of model. This paper can serve as a good introduction to epidemic modelling by covering the key ideas in each area of research, including important equations. It should be noted that other authors may consider complex network models to be a subcategory of ABMs.
This is a technical but comprehensive guide to epidemic modelling, especially useful for finding more literature to read or as a snapshot of the state of the field in 2015.


Using Simulation Results, Statistics and Uncertainty


Fixed-time descriptive statistics underestimate extremes of epidemic curve ensembles - Juul et al.

The key lesson from this paper is useful to anyone using scientific data (whether experimental or simulated): No matter how good the data, if the statistical work around it is not done correctly then the conclusions can be spurious.
Worries about how people use data have been mentioned in our interviews with academics. Similarly, the need for scientists to give accurate and trustworthy information has been mentioned in our interviews with policymakers. Thus, all practitioners must think deeply and carefully about how they show and describe their data.
The particular contribution of this paper is a set of methods to describe extreme events (e.g. the maximum number of people hospitalised over the course of an epidemic) or likelihoods of particular circumstances (e.g. having some number of people hospitalised each day for several days).
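
The problem named in the paper's title can be illustrated with a small synthetic example. The sketch below is not the method proposed in the paper. It only shows, under assumed data, how a statistic computed separately at each time point can hide the extremes of the individual curves. The bell-shaped curves, the function names and all numbers are hypothetical.

```python
import math
import statistics


def bump(peak_day, height, days, width=5.0):
    """A synthetic bell-shaped epidemic curve that peaks on peak_day."""
    return [height * math.exp(-((t - peak_day) / width) ** 2)
            for t in range(days)]


def fixed_time_median(ensemble):
    """Median across the ensemble, computed separately for each day."""
    return [statistics.median(values) for values in zip(*ensemble)]


def median_of_peaks(ensemble):
    """Median of the per-curve maxima, a statistic of whole curves."""
    return statistics.median(max(curve) for curve in ensemble)


# Nine simulated outbreaks with the same peak height (100) but different timing.
ensemble = [bump(peak_day, 100.0, days=120) for peak_day in range(20, 101, 10)]

if __name__ == "__main__":
    print("Peak of the fixed-time median curve:", max(fixed_time_median(ensemble)))
    print("Median of the individual peaks:     ", median_of_peaks(ensemble))
```

Every curve in this ensemble peaks at 100, yet the day-by-day median curve never comes close to that value, because the peaks fall on different days. A reader shown only the median curve would badly underestimate the peak demand.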


Hybrid Models


A Hybrid Epidemic Model: Combining the Advantages of Agent-Based and Equation-Based Approaches - Bobashev et al.

This paper creates a hybrid model to simulate the entire process of an epidemic. In the early stages of an epidemic, when small variations can lead to large differences in the outcome, it takes advantage of the ability of ABMs to replicate the details and heterogeneity of the real world. Later, when the epidemic spread has stabilised and simplifying assumptions become more accurate, the hybrid model switches to a less computationally expensive EBM.
Though this hybrid model is useful in its own right, it is worth learning a broader lesson from this paper: All types of modelling have strengths and weaknesses. Therefore, it can be beneficial to use multiple methods and compare the results.
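
The general idea of switching from an individual-level simulation to equations can be sketched as follows. This is a minimal illustration, not the model from the paper: the SIR structure, the function name `hybrid_sir`, and in particular the switching rule (change mode once the number of infected reaches a hypothetical `switch_at` threshold) are assumptions made for this example.

```python
import random


def hybrid_sir(n, i0, beta, gamma, days, switch_at, seed=0):
    """Run random individual-level updates while the outbreak is small, then
    switch (one way) to deterministic equations once it has grown.
    Returns a list of (mode, number_infected) pairs, one per day."""
    rng = random.Random(seed)
    s, i, r = n - i0, i0, 0
    mode = "agent"
    history = []
    for _ in range(days):
        if mode == "agent" and i >= switch_at:
            mode = "equation"   # assumed switching rule, see lead-in
        if mode == "agent":
            # One random draw per individual: expensive but captures chance.
            p_inf = 1 - (1 - beta / n) ** i
            new_inf = sum(rng.random() < p_inf for _ in range(s))
            new_rec = sum(rng.random() < gamma for _ in range(i))
        else:
            # One equation per group: cheap but ignores chance.
            new_inf = beta * s * i / n
            new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        history.append((mode, i))
    return history


if __name__ == "__main__":
    for day, (mode, infected) in enumerate(hybrid_sir(1000, 5, 0.3, 0.1, 60, 50), 1):
        if day % 10 == 0:
            print(day, mode, round(infected))
```

In the individual-level phase the cost of each day grows with the population size, while in the equation phase it is constant, which is the trade-off that motivates a hybrid design.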


Existing Research Groups


Because of the impact that COVID-19 has had on the world, many researchers from diverse fields have contributed to the global research effort. As a result, it can be hard for people outside of academia to know the best places to seek information. Of course, there are national and international agencies, such as the World Health Organization and the various national public health agencies, dedicated to collating and communicating the science to policymakers. Beyond these agencies, the two research groups listed below have contributed significantly to the global understanding of this disease.

Imperial College COVID-19 Response Team

The Imperial College team has produced multiple models for making predictions around several problems faced throughout the pandemic. In Report 9, they present their only agent-based model and use it to predict the effects of non-pharmaceutical interventions on a population level in Great Britain and the US. For more specific problems, they tend to use mathematical models. Rather than standard compartmental models, these tend to be specific models tailored to each problem. Examples include Reports 16 and 19.
They also use a lot of statistical modelling and machine learning to study existing data. In fact, statistical models seem to have become the focus of their work as the pandemic has progressed and data has become more available. In Reports 13, 20, 21 and 23, they use a “semi-mechanistic Bayesian machine learning model”. This is their favoured model for tracking epidemics as they progress by estimating transmission intensity and attack rates conditional on the reported number of deaths.

The University of Washington, Institute for Health Metrics and Evaluation (IHME)

This institute was well established before the pandemic and has contributed several publications to support policymaking around COVID-19. Although most of its work is statistical (as has been common in the broader field), two papers (below) discuss simulation modelling.

The model comparison paper looks at several popular models that were used to predict the evolution of the epidemic. All of the models compared are mathematical or statistical, but the research groups behind them are worth looking into if you want to find more high-quality research.

The forecasting publication uses an agent-based model to predict hospital demand in the United States and in countries within the European Economic Area. It may be enlightening to compare this to Report 27 from the Imperial College team to see how similar problems are studied using different methods.


Publishing Best Practice


The three frameworks listed below (the ODD protocol, Pattern-oriented modelling and TRACE) are all designed to help people designing, using or learning about agent-based models improve the scientific process around their work. They are also designed to be compatible. Although they all originate from the field of ecological modelling, they are suitable for all types of agent-based modelling.

The ODD Protocol for Describing Agent-Based and Other Simulation Models - Grimm et al.

For anyone creating and publishing agent-based models, the ODD (Overview, Design concepts, Details) protocol is a standardized way to explain everything from the initial purpose of the design to code-level details of the agents and the environment. The ODD protocol is a structured and formalised way to explain a model precisely in plain English (though equations and tables of variables are common in the sections where more detail is expected). Having a standard structure makes it easier for authors to make sure that they give enough detail to make their results reproducible and makes it easier for readers to find the information they are looking for within large descriptions of complex models.

The button links to the latest guideline for using the protocol, one of the supplements to the latest paper. The links below are papers explaining the thought process behind the protocol and are the references to be used for any publication that uses the ODD protocol.

Grimm et al. 2020

Grimm et al. 2006

Pattern-Oriented Modeling of Agent-Based Complex Systems: Lessons from Ecology - Grimm et al.

Pattern-oriented modelling is a framework that adapts the scientific method to the field of agent-based modelling. Particularly, it evaluates those models based on their ability to replicate real patterns. It also gives an algorithmic conception of scientific theory rather than an analytic one (since analytic conceptualisation is best suited to EBMs).

Towards better modelling and decision support: Documenting model development, testing, and analysis using TRACE - Grimm et al.

While the ODD protocol focuses on the details of the model to allow easy replication, TRACE is used to describe the entire scientific process that every simulation model is part of. It starts with problem formulation and provides a structured framework for describing the evaluation and validation processes that should happen during the design of any model.


Agent-Based Modelling Simulation Tools


Agent Based Modelling and Simulation tools: A review of the state-of-art software - Abar et al.

This review article gives a comprehensive list of agent-based modelling and simulation tools. Though the wider paper provides good context and more detail, the main figure (Fig. 1 in the reference) provides most of the information that a practitioner would need to decide which tool is best for them. For those looking to use agent-based modelling but without a lot of programming experience, there are tools designed for education with simple user interfaces but limited capabilities. Research-capable tools are available for most common programming languages (Python, Basic, C/C++) but most tools are built around Java.