\Question{Normal Distribution}
Recall the following facts about the normal distribution: if $X \sim \mathcal{N}(\mu, \sigma^2)$, then the random variable $Z = (X - \mu)/\sigma$ is standard normal, i.e.\ $Z \sim \mathcal{N}(0, 1)$. There is no closed-form expression for the CDF of the standard normal distribution, so we define $\Phi(z) = \Pr[Z \leq z]$. You may express your answers in terms of $\Phi(z)$.
The average jump of a certain frog is $3$ inches. However, because
of the wind, the frog does not always go exactly $3$ inches. A zoologist
tells you that the distance the frog travels is normally distributed
with mean $3$ and variance $1/4$.
\begin{Parts}
\Part What is the probability that the frog jumps more than $4$ inches?
\nosolspace{1cm}
\Part What is the probability that the distance the frog jumps is between
$2$ and $4$ inches?
\nosolspace{1cm}
\end{Parts}
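As a numeric sanity check (not a substitute for expressing the answers in terms of $\Phi$), both probabilities can be evaluated with Python's standard library, using the identity $\Phi(z) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}(z/\sqrt{2})\bigr)$:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 3.0, 0.5          # mean 3, variance 1/4

# (a) Pr[X > 4] = 1 - Phi((4 - mu) / sigma) = 1 - Phi(2)
p_a = 1 - Phi((4 - mu) / sigma)

# (b) Pr[2 < X < 4] = Phi(2) - Phi(-2)
p_b = Phi((4 - mu) / sigma) - Phi((2 - mu) / sigma)

print(round(p_a, 4))  # about 0.0228
print(round(p_b, 4))  # about 0.9545
```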
\Question{Binomial CLT}
In this question we will explicitly see why the central limit theorem holds for the binomial distribution as the
number of coin tosses grows.
Let $X$ be the random variable denoting the total number of heads in $n$ independent tosses of a fair coin.
\begin{Parts}
\Part Compute the mean and variance of $X$. Show that $\mu = \E[X] = n/2$ and $\sigma^2=\var X =n/4$.
\Part Prove that $\Pr[X=k]={n \choose k}/2^n$.
\Part Show by using Stirling's formula that
$$\Pr[X=k]\simeq \frac{1}{\sqrt{2\pi}}\Bigl(\frac{n}{2k}\Bigr)^k\Bigl(\frac{n}{2(n-k)}\Bigr)^{n-k}\sqrt{\frac{n}{k(n-k)}}.$$
In general we expect $2k$ and $2(n-k)$ to be close to $n$ for the probability to be non-negligible. When
this happens we expect $\displaystyle \sqrt{\frac{n}{k(n-k)}}$ to be close to
$\displaystyle \sqrt{\frac{n}{(n/2)\times(n/2)}}=\frac{2}{\sqrt{n}}$. So replace that part of the formula
by $2/\sqrt{n}$.
\label{q:bclt-c}
\Part In order to normalize $X$, we subtract the mean and divide by the standard deviation.
Let $Y=(X-\mu)/\sigma$ be the normalized version of $X$. Note that $Y$ is a discrete random variable.
Determine the set of values that $Y$ can take. What is the distance $d$ between two consecutive values?
\Part Let $X=k$ correspond to the event $Y=t$. Then $X\in [k-0.5,
k+0.5]$ corresponds to $Y\in [t-d/2, t+d/2]$. For conceptual
simplicity, it is reasonable to assume that the mass at point $t$
is distributed uniformly on the interval $[t-d/2,t+d/2]$.
We can capture this with the idea of a ``probability density''
and say that the probability density on this interval is just
$\Pr[Y=t]/d=\Pr[X=k]/d$.
\vspace{0.5em}
Compute $k$ as a function of $t$. Then substitute that for $k$ in
the approximation you have from part~\ref{q:bclt-c} to find an approximation for $\Pr[Y=
t]/d$. Show that the end result is equivalent to:
$$\frac{1}{\sqrt{2\pi}}\Bigl[\Bigl(1+\frac{t}{\sqrt{n}}\Bigr)^{1+t/\sqrt{n}}\Bigl(1-\frac{t}{\sqrt{n}}\Bigr)^{1-t/\sqrt{n}}\Bigr]^{-n/2}$$
\Part As you can see, we have expressions of the form $(1+x)^{1+x}$
in our approximation. To simplify them, write $(1+x)^{1+x}$ as
$\exp((1+x)\ln(1+x))$ and then replace $(1+x)\ln(1+x)$ by its
Taylor series.
The Taylor series up to the $x^2$ term is $(1+x)\ln(1+x)\simeq
x+x^2/2+\cdots$ (feel free to verify this by hand). Use this to
simplify the approximation from the last part. In the end you
should get the familiar formula that appears inside the CLT:
$$\frac{1}{\sqrt{2\pi}}\exp\Bigl( - \frac{t^2}{2} \Bigr).$$
(The CLT is essentially taking a sum with lots of tiny slices and
approximating it by an integral of this function. Because the
slices are tiny, dropping all the higher-order terms in the Taylor
expansion is justified.)
\end{Parts}
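The chain of approximations above can be checked numerically. The sketch below, with the illustrative choice $n = 100$, compares the exact value of $\Pr[Y=t]/d$ against the limiting Gaussian density $\frac{1}{\sqrt{2\pi}} e^{-t^2/2}$:

```python
from math import comb, sqrt, pi, exp

n = 100
mu, sigma = n / 2, sqrt(n) / 2
d = 1 / sigma                               # spacing between consecutive values of Y

for k in (50, 55, 60):
    t = (k - mu) / sigma
    density = (comb(n, k) / 2**n) / d       # exact Pr[Y = t] / d
    gauss = exp(-t**2 / 2) / sqrt(2 * pi)   # CLT limit
    print(f"t = {t:+.1f}: {density:.4f} vs {gauss:.4f}")
```

Even at $n = 100$ the two columns agree to about three decimal places, and the agreement improves as $n$ grows.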
% why_is_it_gaussian.text
\Question{Why Is It Gaussian?}
Let $X$ be a normally distributed random variable with mean $\mu$ and variance $\sigma^2$. Let $Y = aX+b$, where $a$ and $b$ are real numbers with $a \neq 0$.
Show explicitly that $Y$ is normally distributed with mean $a\mu + b$ and variance $a^2\sigma^2$.
(Your proof should be more explicit than what's in the class notes.
One approach is to start with the cumulative distribution function of $Y$ and use it to derive the probability density function of $Y$.)
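Before (or after) writing the proof, a quick Monte Carlo check, with arbitrary illustrative values of $\mu$, $\sigma$, $a$, and $b$, can confirm the claimed mean and variance of $Y$:

```python
import random

random.seed(0)
mu, sigma = 2.0, 3.0        # illustrative choices, not part of the problem
a, b = -1.5, 4.0

# Sample X ~ N(mu, sigma^2) and transform to Y = aX + b.
ys = [a * random.gauss(mu, sigma) + b for _ in range(200_000)]

mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)

print(mean_y)   # close to a*mu + b = 1.0
print(var_y)    # close to a^2 * sigma^2 = 20.25
```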
\Question{Deriving Chebyshev's Inequality}
Recall Markov's Inequality, which applies for non-negative $X$ and $\alpha > 0$: $$\Pr[X\geq\alpha]\leq\frac{\E[X]}{\alpha}$$
Use an appropriate substitution for $X$ and $\alpha$ to derive Chebyshev's Inequality, stated below, where $\mu$ denotes the expected value of $Y$ and $k > 0$:
$$\Pr[|Y-\mu|\geq k]\leq\frac{\Var Y}{k^2}$$
\nosolspace{0.5cm}
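A small numeric example illustrates the inequality you are deriving. Here $Y$ is a fair six-sided die roll (an arbitrary choice), so $\mu = 3.5$ and $\Var Y = 35/12$:

```python
import random

random.seed(1)
mu, var = 3.5, 35 / 12      # mean and variance of a fair die roll
k = 2

samples = [random.randint(1, 6) for _ in range(100_000)]
freq = sum(abs(y - mu) >= k for y in samples) / len(samples)

# |Y - 3.5| >= 2 happens exactly when Y is 1 or 6, so the true
# probability is 1/3; Chebyshev's bound 35/48 ~ 0.729 must exceed it.
print(freq)
print(var / k**2)
```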
\Question{Markov's Inequality and Chebyshev's Inequality}
A random variable $X$ has variance $\Var{X} = 9$ and expectation $\Ex{X}=2$. Furthermore, the value of $X$ is never greater than $10$. Given this information, provide either a proof or a counterexample for the following statements.
\begin{Parts}
\Part $\Ex{X^2} = 13$.\label{markov-chebyshev-part-a}
\Part $\Pr[X = 2] > 0$.
\Part $\Pr[X \geq 2] = \Pr[X \leq 2]$.
\Part $\Pr[X \leq 1] \leq 8/9$.
\Part $\Pr[X \geq 6] \leq 9/16$.
\Part $\Pr[X \geq 6] \leq 9/32$.
\end{Parts}
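Part~\ref{markov-chebyshev-part-a} hinges on the identity $\Var{X} = \Ex{X^2} - \Ex{X}^2$. The sketch below computes $\Ex{X^2}$ from that identity, and also exhibits one two-point distribution consistent with all the given constraints, which is handy for probing the remaining statements:

```python
# E[X^2] from the variance identity: Var X = E[X^2] - (E[X])^2.
var_x, mean_x = 9, 2
second_moment = var_x + mean_x**2
print(second_moment)   # 13

# A two-point distribution satisfying the constraints:
# X = -1 or X = 5, each with probability 1/2.
values, probs = [-1, 5], [0.5, 0.5]
mean = sum(v * p for v, p in zip(values, probs))
var = sum((v - mean) ** 2 * p for v, p in zip(values, probs))
print(mean, var)       # 2 9.0, and X never exceeds 10
```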
\Question{Practical Confidence Intervals}
\begin{enumerate}[(a)]
\item It's New Year's Eve, and you're re-evaluating your finances for the next year. Based on previous spending patterns, you know that you spend \$1500 per month on average, with a standard deviation of \$500, and each month's expenditure is independently and identically distributed. As a poor college student, you also don't have any income. How much should you have in your bank account if you don't want to go broke this year, with probability at least 95\%?
\item As a UC Berkeley CS student, you're always thinking about ways to become the next billionaire in Silicon Valley. After hours of brainstorming, you've finally cut your list of ideas down to 10, all of which you want to implement at the same time. A venture capitalist has agreed to back all 10 ideas, as long as your net return from implementing the ideas is positive with at least 95\% probability.
Suppose that implementing an idea requires $50$ thousand dollars, and your start-up then succeeds with probability $p$, generating $150$ thousand dollars in revenue (for a net gain of $100$ thousand dollars), or fails with probability $1 - p$ (for a net loss of $50$ thousand dollars). The success of each idea is independent of every other. What is the condition on $p$ that you need to satisfy to secure the venture capitalist's funding?
\item One of your start-ups uses error-correcting codes, which can recover the original message as long as at least $1000$ packets are received (not erased). Each packet gets erased independently with probability $0.8$. How many packets should you send such that you can recover the message with probability at least 99\%?
\end{enumerate}
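For part (a), one possible line of attack is to bound the total yearly spend with Chebyshev's inequality. The sketch below follows that route (using the two-sided bound; a one-sided bound would give a smaller answer):

```python
from math import sqrt, ceil

months, mean_m, sd_m = 12, 1500, 500
mean_total = months * mean_m        # 18000
var_total = months * sd_m**2        # 3,000,000, by independence of months

# Pr[|S - 18000| >= k] <= var/k^2; set the right side to 0.05
# and solve for k, then budget the mean plus that slack.
k = sqrt(var_total / 0.05)
budget = mean_total + k
print(ceil(budget))   # about $25,746
```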
\Question{Quadratic Regression}
In this question, we will find the best quadratic estimator of $Y$ given $X$. First, some notation: let $\mu_i$ be the $i$th moment of $X$, i.e.\ $\mu_i = \E[X^i]$. Also, define $\beta_1 = \E[XY]$ and $\beta_2 = \E[X^2 Y]$. For simplicity, we will assume that $\E[X] = \E[Y] = 0$ and $\E[X^2] = \E[Y^2] = 1$. (Note that this poses no loss of generality, because we can always transform the random variables by subtracting their means and dividing by their standard deviations.) We claim that the best quadratic estimator of $Y$ given $X$ is
\[
\hat{Y} = \frac{1}{\mu_3^2 - \mu_4 + 1} (a X^2 + b X + c)
\]
where
\begin{align*}
a &= \mu_3 \beta_1 - \beta_2, \\
b &= (1 - \mu_4) \beta_1 + \mu_3 \beta_2, \\
c &= -\mu_3 \beta_1 + \beta_2.
\end{align*}
Your task is to prove the Projection Property for $\hat{Y}$.
\begin{Parts}
\Part Prove that $\E[Y - \hat{Y}] = 0$.
\Part Prove that $\E[(Y - \hat{Y})X] = 0$.
\Part Prove that $\E[(Y - \hat{Y})X^2] = 0$.
\end{Parts}
Any quadratic function of $X$ is a linear combination of $1$, $X$, and $X^2$. Hence, these equations together imply that $Y - \hat{Y}$ is orthogonal to any quadratic function of $X$, and so $\hat{Y}$ is the best quadratic estimator of $Y$.
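The projection property can also be verified numerically. If the sample itself is standardized and every moment is computed empirically, the claimed coefficients solve the normal equations under the empirical distribution, so all three expectations vanish up to floating-point error; the joint distribution below is an arbitrary illustrative choice:

```python
import random

random.seed(0)
n = 10_000
# Any joint distribution works; here Y is a noisy quadratic in X.
xs0 = [random.gauss(0, 1) for _ in range(n)]
ys0 = [x**2 + 0.5 * x + random.gauss(0, 1) for x in xs0]

def standardize(v):
    m = sum(v) / n
    s = (sum((t - m) ** 2 for t in v) / n) ** 0.5
    return [(t - m) / s for t in v]

xs, ys = standardize(xs0), standardize(ys0)
E = lambda v: sum(v) / n

mu3, mu4 = E([x**3 for x in xs]), E([x**4 for x in xs])
b1 = E([x * y for x, y in zip(xs, ys)])
b2 = E([x**2 * y for x, y in zip(xs, ys)])

a = mu3 * b1 - b2
b = (1 - mu4) * b1 + mu3 * b2
c = -mu3 * b1 + b2
yhat = [(a * x**2 + b * x + c) / (mu3**2 - mu4 + 1) for x in xs]

res = [y - yh for y, yh in zip(ys, yhat)]
print(E(res))                                   # ~ 0
print(E([r * x for r, x in zip(res, xs)]))      # ~ 0
print(E([r * x**2 for r, x in zip(res, xs)]))   # ~ 0
```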
\Question{LLSE and Graphs}
Consider a graph with $n$ vertices numbered $1$ through $n$, where $n$ is an integer $\ge 2$. For each pair of distinct vertices, we add an undirected edge between them independently with probability $p$. Let $D_1$ be the random variable representing the degree of vertex 1, and let $D_2$ be the random variable representing the degree of vertex 2.
\begin{Parts}
\Part Compute $\E[D_1]$ and $\E[D_2]$.
\Part Compute $\var(D_1)$.
\Part Compute $\cov(D_1, D_2)$.
\Part Using the information from the first three parts, what is $L(D_2 \mid D_1)$?
\end{Parts}
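A Monte Carlo sketch (with illustrative values $n = 6$, $p = 0.3$) can be used to check the first three parts. Only the edges incident to vertices 1 and 2 matter for $D_1$ and $D_2$, and the covariance comes entirely from the shared edge $\{1, 2\}$:

```python
import random

random.seed(0)
n, p, trials = 6, 0.3, 200_000

d1s, d2s = [], []
for _ in range(trials):
    e12 = random.random() < p     # shared edge {1, 2}
    d1 = e12 + sum(random.random() < p for _ in range(n - 2))
    d2 = e12 + sum(random.random() < p for _ in range(n - 2))
    d1s.append(d1)
    d2s.append(d2)

m1, m2 = sum(d1s) / trials, sum(d2s) / trials
cov = sum((u - m1) * (v - m2) for u, v in zip(d1s, d2s)) / trials

print(m1, m2)   # each close to (n-1)p = 1.5
print(cov)      # close to p(1-p) = 0.21
```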