Preface
1
Four Research Steps
1.1
Why Statistics?
1.2
Research Question
1.2.1
Variables
1.2.2
Inference and Prediction
1.2.3
Population and Sample
1.2.4
Parameter and Statistic
1.3
Data Collection
1.3.1
Variability and Bias
1.3.2
What Makes A Good Sample?
1.3.3
Sampling Schemes
1.3.4
Observational and Experimental Studies
1.4
Data Analysis
2
Descriptive Statistics
2.1
Types of Variables
2.2
Summaries of Qualitative Variables
2.2.1
Frequency
2.2.2
Plots
2.3
Summaries of Quantitative Variables
2.3.1
Stem Plots
2.3.2
Quantiles
2.3.3
Five-number Summary
2.3.4
Box Plots
2.3.5
Histogram
2.3.6
Measure of Centrality
2.3.7
Measure of Variability
2.3.8
Shape of the Distribution
2.3.9
Robustness to Outliers
3
The Laws of Probability
3.1
What is Probability?
3.1.1
Randomness
3.1.2
Probabilistic Experiment
3.1.3
Three Views of Probability
3.1.4
Mathematical Definition of Probability
3.2
Addition Rule
3.2.1
Venn Diagram
3.2.2
Set Operations
3.2.3
Addition Rule for Mutually Exclusive Events
3.2.4
Addition Rule for Non-mutually-exclusive Events
3.3
Multiplication Rule
3.3.1
Conditional Probability
3.3.2
Multiplication Rule
3.3.3
Independence
3.4
Law of Total Probability
3.4.1
Conditional Probability of the Complement
3.4.2
Law of Total Probability
3.4.3
Tree Diagram
4
Random Variables
4.1
Random Variables
4.1.1
What is A Random Variable?
4.1.2
Observed Values
4.1.3
Defining Probabilities and Events with Random Variables
4.1.4
Types of Random Variables
4.2
Probability Function
4.2.1
Probability Distribution
4.2.2
Probability Function
4.2.3
Discrete and Continuous Probability Functions
4.3
Probability Function for Discrete Random Variables
4.3.1
Probability Mass Function
4.3.2
Cumulative Distribution Function
4.3.3
Expectation
4.3.4
Bernoulli Distribution
4.4
Probability Function for Continuous Random Variables
4.4.1
Corresponding Definitions for Continuous Random Variables
4.4.2
Continuous Uniform Distribution
4.5
Comparison of Discrete and Continuous Random Variables
5
Normal Distribution
5.1
Normal Distribution
5.1.1
Probability Density Function
5.1.2
Expected Value and Variance
5.1.3
Shape of the Distribution
5.1.4
Effect of the Mean (\(\mu\))
5.1.5
Effect of the Variance (\(\sigma^2\))
5.1.6
The Empirical Rule of Normal Distribution
5.1.7
Ubiquity of Normal Distribution
5.2
Standard Normal Distribution
5.2.1
Probability Density Function
5.2.2
\(Z\)-table
5.3
\(Z\)-score
5.3.1
Linearity of Normal Distribution
5.3.2
\(Z\)-score
5.3.3
Examples
5.4
Calculate Normal Probabilities in R
6
Sampling Distribution of the Mean
6.1
Sampling Distribution
6.1.1
Sample Random Variables
6.1.2
Independent Trials
6.1.3
Population Distribution
6.1.4
Sampling Distribution
6.2
Sample Mean
6.2.1
Sample Mean Random Variable
6.2.2
Observed Sample Means
6.2.3
Sampling Distribution of the Sample Mean
6.2.4
Expectation and Variance
6.2.5
Sample Mean of Independent Normal Draws
6.2.6
Central Limit Theorem
7
Confidence Intervals
7.1
Pivotal Quantity
7.1.1
Linearity of Normal Distribution
7.1.2
Pivotal Quantity for the Mean of a Normal Population
7.1.3
Pivotal Quantity for Population Mean via CLT
7.1.4
Pivotal Quantity for \(\mu\) When \(\sigma\) is Unknown
7.2
Confidence Intervals
7.2.1
What is a Confidence Interval?
7.2.2
Interpret a Confidence Interval
7.3
Confidence Interval for \(\mu\) When \(\sigma\) is Known
7.3.1
Construct the Confidence Interval
7.3.2
Margin of Error
7.3.3
Determining the Minimum Sample Size \(n\)
7.4
Confidence Interval for \(\mu\) When \(\sigma\) is Unknown
7.4.1
Construct the Confidence Interval
7.4.2
Determine the Sample Size
7.5
Summary
8
Hypothesis Tests
8.1
Hypotheses
8.1.1
Research Questions
8.1.2
Hypotheses
8.1.3
Example: Hypothesis About A Specific Value of \(\mu\)
8.2
Hypothesis Test
8.3
Error of a Test
8.4
One-sample Hypothesis Tests of the Mean
8.4.1
Testing \(H_0: \mu = \mu_0\) When \(\sigma\) is Known
8.4.2
Testing \(H_0: \mu = \mu_0\) When \(\sigma\) is Unknown
8.4.3
Some More Examples
9
Two-sample Tests
9.1
Independent Two-Sample Tests
9.1.1
Two-sample Test
9.1.2
Distribution of the Difference of Two Normal Means
9.1.3
When \(\sigma_1\) and \(\sigma_2\) are Known
9.1.4
When \(\sigma_1\) and \(\sigma_2\) are Unknown and \(\sigma_1 = \sigma_2 = \sigma\)
9.1.5
When \(\sigma_1\) and \(\sigma_2\) are Unknown and \(\sigma_1 \ne \sigma_2\)
9.1.6
Summary
9.2
Wald-type Tests and Confidence Intervals
9.2.1
From Hypothesis Testing to Confidence Interval
9.2.2
From Confidence Interval to Hypothesis Testing
9.3
\(t\)-test for Paired Data
9.4
\(t\)-test for Proportions
9.4.1
One-sample Test of Proportion
9.4.2
Two-sample Test of Proportion
9.5
Summary of Wald-type Hypothesis Tests
9.5.1
Two-sided vs One-sided Tests
9.5.2
Table of Wald-tests
10
Other Useful Tests
10.1
ANOVA
10.1.1
Assumptions
10.1.2
Hypotheses
10.1.3
Analysis in R
10.2
Chi-square Test
10.2.1
Assumptions
10.2.2
Hypotheses
10.2.3
Analysis in R
10.3
Levene’s Test
10.3.1
Hypotheses
10.3.2
Analysis in R
10.4
QQ Plot
10.5
Nonparametric Tests
10.6
Wilcoxon’s Rank Sum Test
10.6.1
Assumptions
10.6.2
Hypotheses
10.6.3
Analysis in R
10.7
Kruskal-Wallis Test
10.7.1
Assumptions
10.7.2
Hypotheses
10.7.3
Analysis in R
10.8
Summary of the Tests
11
Correlation
11.1
Covariance
11.1.1
Population Covariance
11.1.2
Relationship with Variance
11.1.3
Interpretation
11.1.4
Sample Covariance
11.2
Correlation Coefficient
11.2.1
Population Correlation Coefficient
11.2.2
Interpretation
11.2.3
Sample Correlation Coefficient
11.3
Correlation Test
11.4
Summary
12
Linear Regression
12.1
Simple Linear Regression
12.1.1
The Model
12.1.2
Fitting the Model to the Sample Data
12.1.3
The Least Squares Regression Line
12.1.4
Interpretation
12.2
Linear Regression Inference
12.2.1
Assumptions
12.2.2
Inference
12.3
Goodness of Fit of the Model
12.3.1
Residuals
12.3.2
Coefficient of Determination (\(R^2\))
12.3.3
Influential Points
12.3.4
Checking Assumptions
12.4
Predicted Value
12.5
Multiple Linear Regression
Appendix
A
Introduction to R
A.1
Why R?
A.2
Installing R and RStudio
A.3
The RStudio Interface
A.4
Helper Codes
A.4.1
Packages
A.4.2
Getting Help
A.4.3
Leaving Comments
A.5
Basic Calculations
A.6
Objects
A.7
Vectors
A.7.1
Vector Creation
A.7.2
Vector Operations
A.7.3
Extracting Elements from Vectors
A.8
Data Sets
A.8.1
Data Frame
A.8.2
Load Data Files
A.8.3
Working Directory
A.8.4
Extracting Elements from a Data Frame
B
Introduction to Rmarkdown
B.1
Install Rmarkdown
B.2
Components of an Rmarkdown File
C
Data Set
References
A First Course In Statistics