 
			
				
			
Summer School: Big Data and Statistics for Bench Scientists
Lead PI
Abstract
Northeastern University hosted a Summer School, entitled Big Data and Statistics for Bench Scientists, in the summers of 2016, 2017, and 2018. The attendees were graduate and post-graduate life scientists, most of whom worked in wet labs that generate large datasets.
Unlike other educational efforts that emphasize genomic applications, this School targeted scientists working with other experimental technologies. Mass spectrometry-based proteomics and metabolomics were the main focus; however, the School was also appropriate for scientists working with other assays, such as nuclear magnetic resonance (NMR) spectroscopy and protein arrays. This large community has traditionally been underserved by educational efforts in computation and statistics, and the School aimed to fill this gap.
The Summer School was motivated by feedback from smaller short courses previously co-organized or co-instructed by the PI, and covered theoretical and practical aspects of the design and analysis of large-scale experimental datasets.
The Summer School had a modular format, with eight 20-hour modules scheduled in two parallel tracks over two consecutive weeks. Each module could be taken independently. The modules were (1) Processing raw mass spectrometric data from proteomic experiments using Skyline, (2) Beginner’s R, (3) Processing raw mass spectrometric data from metabolomic experiments using OpenMS, (4) Intermediate R, (5) Beginner’s guide to statistical experimental design and group comparison, (6) Specialized statistical methods for detecting differentially abundant proteins and metabolites, (7) Statistical methods for discovery of biomarkers of disease, and (8) Introduction to systems biology and data integration. Each module introduced the necessary statistical and computational methodology and included extensive hands-on practical sessions. Each module was organized by instructors with extensive interdisciplinary teaching experience and supported by several teaching assistants.
All the course materials, including videos of the lectures and of the practical sessions, are publicly available free of charge.