Course Team

| Instructor | Prof. Alejandro Schuler (he/his) [email protected] |
| --- | --- |
| GSIs | |

Audience

This class is for anyone who wants to be able to think critically about data analysis. The only requirements are pre-calculus and proficiency in R programming with the tidyverse to the level of exploratory data analysis. If you’re feeling shaky on these topics, here are some helpful resources:

| Subject | Resources |
| --- | --- |
| Pre-Calculus | https://www.khanacademy.org/math/precalculus |

Apart from this, undergraduate calculus and linear algebra are certainly helpful but not required.

Content

This course will provide you with the ability to think critically about how to answer complex public health questions with data. The fundamental problem in doing this is that data are random: if you do the exact same study twice, you’ll get different datasets and different answers. How can we define the truth we’re after?
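As a minimal sketch of what “data are random” means, the simulation below runs the same hypothetical study twice and compares the answers. It is written in Python purely for illustration (the course itself uses R), and the population values `TRUE_MEAN` and `TRUE_SD`, the sample size, and the helper `run_study` are all made-up assumptions rather than anything from the course materials.

```python
import random
import statistics

# Hypothetical "truth" we are after (e.g. a population mean); values are made up.
TRUE_MEAN = 120.0
TRUE_SD = 15.0


def run_study(seed, n=200):
    """Simulate one study: draw n observations and return the sample mean."""
    rng = random.Random(seed)  # a separate random stream for each study
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    return statistics.mean(sample)


# Same design, same population, same analysis; only the random draw differs.
estimate_a = run_study(seed=1)
estimate_b = run_study(seed=2)

print(f"study A estimate: {estimate_a:.2f}")
print(f"study B estimate: {estimate_b:.2f}")
print(f"truth:            {TRUE_MEAN:.2f}")
```

The two estimates land near the truth but do not agree with each other or with it exactly, which is why we need a precise way to say what the fixed target is and how far an estimate is likely to be from it.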

<aside> <img src="/icons/target_gray.svg" alt="/icons/target_gray.svg" width="40px" /> Part I Learning Goals:

  1. translate between mathematical notation, diagrams, English prose, and code
  2. demonstrate fundamental ideas in probability and statistics (e.g. conditioning, independence, expectation, consistency) using multivariate simulation
  3. distinguish between estimands and estimates </aside>

In the first part of the course we’ll develop unambiguous language that we can use to talk about the sort of randomness we observe in the real world. This language takes shape as the fundamentals of probability and statistics: random variables, estimands, estimators, sampling distributions, etc. Once we have those frameworks we can work within them to clearly define what it is we’re after when we analyze data and transparently quantify our uncertainty in the result. Think of this as building up a spaceship piece by piece 🛠️ 🚀: by itself it might seem bewildering, but you’re going to need it to get where you want to go. And you’re definitely going to need it to be airtight!

| Week | Topic |
| --- | --- |
| Jan 16 | Course Intro |
| Jan 23 | Probability and Random Variables |
| Jan 30 | Distributions |
| Feb 6 | Dependent Random Variables |
| Feb 13 | Data-Generating Processes |
| Feb 20 | Expectation |
| Feb 27 | Estimators |
| Mar 6 | Inference |
| Mar 13 | Causality |
| Mar 20 | Identification and Models |

<aside> <img src="/icons/target_gray.svg" alt="/icons/target_gray.svg" width="40px" /> Part II Learning Goals:

  1. distinguish between average-conditional and marginal estimands and verbally interpret them
  2. state the assumptions under which a generalized linear model coefficient is unbiased for a marginal causal effect
  3. state the assumptions under which maximum likelihood and sandwich variance estimators for generalized linear model coefficients are consistent
  4. use simulation to evaluate consistency and coverage of marginal effect estimators under different data-generating processes
  5. apply generalized linear models to infer marginal causal effects from randomized interventional data using plug-in estimation </aside>

In the second part of the course we’ll take the spaceship out for a quick spin and learn the controls 🚀🌙 . Specifically, we’ll apply our statistical thinking to one of the most basic problem settings for epidemiology: figuring out whether one treatment works better than another. We’ll dig into linear regression and see that despite its rampant use and popularity it is not always a viable approach. We’ll see that it depends on whether or not the treatment is experimentally randomized, what method is used to quantify uncertainty, and exactly what we mean by “treatment effect”. Understanding these nuances will require us to carefully apply the concepts we learned in the first part of the course.
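To make the role of randomization concrete, here is a rough simulation sketch, again in Python purely for illustration (the course itself uses R). Everything in it is a hypothetical assumption: the true effect of 2, the confounder `severity`, the treatment probabilities, and the helper `difference_in_means`. It compares a simple difference in means when treatment is assigned by coin flip versus when sicker patients are more likely to be treated.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # hypothetical true treatment effect on the outcome


def difference_in_means(randomized, n=20000, seed=0):
    """Simulate n patients and return mean(treated) - mean(control)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        severity = rng.gauss(0.0, 1.0)  # hypothetical confounder
        if randomized:
            p_treat = 0.5  # coin flip, unrelated to severity
        else:
            p_treat = 0.8 if severity > 0 else 0.2  # sicker patients treated more often
        is_treated = rng.random() < p_treat
        # Outcome depends on both treatment and severity, plus noise.
        outcome = TRUE_EFFECT * is_treated + 3.0 * severity + rng.gauss(0.0, 1.0)
        (treated if is_treated else control).append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


randomized_estimate = difference_in_means(randomized=True)
observational_estimate = difference_in_means(randomized=False)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"observational estimate: {observational_estimate:.2f}")
```

Under these assumptions the randomized estimate sits close to the true effect, while the observational one is pulled well away from it because severity drives both treatment and outcome. The same analysis gives a trustworthy answer in one setting and a misleading one in the other.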

| Week | Topic |
| --- | --- |
| Apr 3 | Linear Regression: Mechanics |
| Apr 10 | Linear Regression: Inference |
| Apr 17 | Logistic Regression: Mechanics |
| Apr 24 | Logistic Regression: Inference |
| May 1 | Observational Inference |

My hope is that by the end of the course you’ll be a reasonably competent spaceship pilot and mechanic 👨‍🚀 . You won’t know how to get everywhere and solve every breakdown, but you’ll have the fundamental tools to figure it out from the ground up. You’ll also be a veteran of the linear regression circuit, especially as it pertains to the estimation of causal effects.

Components and Grading

This course is relatively simple in what I expect from students and how grades are assigned. There are weekly lectures (online), labs, and open hours. Most of your grade comes from the weekly assignments, with smaller portions from weekly lecture reflections and labs. There is no midterm or final. Each week will look like this:

| Week | Monday - Friday | Monday Night |
| --- | --- | --- |
| $k$ | watch week $k$ lectures<br>go to week $k$ lab<br>work on week $k-1$ assignment | submit week $k$ reflection<br>submit week $k-1$ assignment<br>get back graded week $k-2$ assignment |

<aside> <img src="/icons/warning_lightgray.svg" alt="/icons/warning_lightgray.svg" width="40px" /> required components

Lectures and Reflections

<aside> <img src="/icons/delivery-truck_gray.svg" alt="/icons/delivery-truck_gray.svg" width="40px" /> Lectures can be found on the [lectures section on bcourses]. Lecture reflections are due every week on Monday by 11:59 PM and can be completed [on bcourses].

</aside>

<aside> <img src="/icons/gradebook_gray.svg" alt="/icons/gradebook_gray.svg" width="40px" /> Lecture reflections are worth 10% of your total grade in this course. They are graded based on completion: you must write something meaningful for each question each week to get credit for that week’s reflection. All weekly reflections are worth the same amount.

</aside>

Labs

<aside> <img src="/icons/delivery-truck_gray.svg" alt="/icons/delivery-truck_gray.svg" width="40px" /> Labs are led live during whatever section you signed up for. Your lab work must be submitted via [bcourses] at the end of your lab section.

</aside>

<aside> <img src="/icons/gradebook_gray.svg" alt="/icons/gradebook_gray.svg" width="40px" /> Labs are worth 30% of your total grade in this course. They are graded based on attendance. If you attend and follow what was done that week you will get full credit. All labs are worth the same amount.

You are allowed to miss two labs without it impacting your grade in the course.

FOR THE ONLINE VERSION (241W): we will still host synchronous labs (on Zoom, to be scheduled during the first week of class) but your attendance is optional. These sessions will be recorded so you have the opportunity to watch them later. You can complete the lab work on your own time and submit it on [bcourses] to get full credit. Students who attend the lab will complete the lab work at that time and can submit it at the end of the session.

</aside>

Assignments

<aside> <img src="/icons/delivery-truck_gray.svg" alt="/icons/delivery-truck_gray.svg" width="40px" /> Assignments can be found [on the assignments section on bcourses]. They are due every week on Friday by 5 PM PST via the [submission system on bcourses] and must be submitted as PDF documents.

</aside>

<aside> <img src="/icons/gradebook_gray.svg" alt="/icons/gradebook_gray.svg" width="40px" /> Assignments are worth 60% of your total grade in this course. They are graded based on correctness: each question will have a point value indicated on the assignment, and partial credit is awarded in increments of one quarter of that question’s point value. All assignments are worth the same amount.

All assignments must be completed individually.

To ensure timely grading and feedback, we generally cannot accept late work. If you have extenuating circumstances, contact the course staff. Also note that at the end of the semester your two lowest-scoring assignments will be dropped.

</aside>

<aside> <img src="/icons/help-alternate_lightgray.svg" alt="/icons/help-alternate_lightgray.svg" width="40px" /> optional components

Open Hours

<aside> <img src="/icons/delivery-truck_gray.svg" alt="/icons/delivery-truck_gray.svg" width="40px" /> Open hours are optional and can be scheduled at [link]. Open hours are held in [room].

</aside>

Open hours are a space for you to get your subject-matter questions answered directly by me or a GSI in a one-on-one or small-group setting. Since lectures are asynchronous, we’ve made extra in-person time so we can take your questions. Come say hi!

bCourses Discussions

<aside> <img src="/icons/delivery-truck_gray.svg" alt="/icons/delivery-truck_gray.svg" width="40px" /> The course discussion board can be found at [link].

</aside>

We encourage you to post your questions and thoughts on the course discussion board on bCourses. Extra credit will be given to folks who write up particularly thoughtful answers or questions!

</aside>

Pedagogical Approach