Tools for Reproducible Real-World Analyses
This course focuses on the concepts and tools of reproducible research and reporting for modern data analyses. The need for tools that support reproducibility in health economics and outcomes research is growing rapidly as analyses of real-world data become more frequent, involve larger datasets, and employ more complex computations. The course will cover the principles of structuring and organizing a modern data analysis, literate statistical analysis tools, formal version control, software testing and debugging, and developing reproducible reports. Numerous real-world examples and an interactive class exercise will reinforce the concepts and tools introduced. Exercises will use RStudio Cloud, so participants who wish to gain hands-on experience should bring their laptops.
- What is reproducible research?
- Why is reproducibility so important?
- How do we get there?
- Organizing data
- Writing clear code
- Disseminating code & findings
- Catching mistakes
Course Materials and Exercises
Coding Style & Culture
- Writing system software: code comments
- Style guide from Google
- The Tidyverse style guide
- Software Carpentry: best practices for writing R code
- Code review best practices
- RStudio webinars
- RStudio cheat sheets
- R for Data Science textbook (also available as paperback via Amazon)
- Advanced R programming (covers many advanced topics)
- Good Practices for Real-World Data Studies of Treatment and/or Comparative Effectiveness: Recommendations from the Joint ISPOR-ISPE Special Task Force on Real-World Evidence in Health Care Decision Making
- Reproducibility checklist
- rOpenSci – a non-profit initiative that fosters reproducible research
- Coursera: Reproducible Research course