Strengths and limitations of large databases in lung cancer radiation oncology research

Vikram Jairam, Henry S. Park


The use of large databases in radiation oncology research has risen substantially. The advantages of these datasets include a large sample size and inclusion of a diverse population of patients in a real-world setting. Such observational studies hold promise in enhancing our understanding of questions for which evidence is conflicting or absent in lung cancer radiotherapy. However, it is critical that investigators understand the strengths and limitations of large databases in order to avoid the common pitfalls that beset observational analyses. This review begins by outlining the data variables available in the major registries used most often in observational analyses. This is followed by a discussion of the types of radiotherapy-related questions that can be addressed using such datasets, accompanied by examples from the lung cancer literature. Finally, we describe some limitations of observational research and techniques to mitigate bias and confounding. We hope that clinicians and researchers find this review helpful for designing new research studies and interpreting published analyses.