---
title: "Homework 4"
author: "Garland Durham"
date: "January 23, 2017"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
require(astsa)
```

### 2.1 (a)

```{r}

Q = factor( cycle( jj ))
trend = time(jj) - 1970
fit01 = lm( log(jj) ~ 0 + trend + Q )
summary(fit01)
```
### 2.1 (b) 

Average increase in log earnings is the coefficient on trend, 16.7\%.

### 2.1 (c) 

The average change from Q3 to Q4 is the difference in the coefficients
on Q3 and Q4 plus 0.25 times the annual increase (from (b)),
0.882 - 1.151 + 0.25*0.167 = -0.227 

### 2.1 (d) 

Multicollinearity (because of quarterly dummies).

### 2.1 (e)

```{r}
ts.plot( cbind(log(jj), fitted(fit01)), col=1:2, type="o", main="JJ, quarterly earnings per share", ylab="Dollars")
plot.ts(residuals(fit01), type="o", main="JJ earnings, residuals", ylab="Dollars")
acf(residuals(fit01), main="ACF, JJ residuals")
```

The residuals do not look like white noise.  There is too much autocorrelation.


### 2.2 (a)
```{r}
temp = tempr-mean(tempr)
temp2 = temp^2
trend = time(cmort)
part4 = lag(part,-4)
plot( cbind(Mortality=cmort, Temperature=temp, Particulate=part), main="LA mortality")
A = ts.intersect( cmort, trend, temp, temp2, part, part4, dframe=TRUE )
fit01 = lm( cmort ~ trend + temp + temp2 + part + part4, data=A )
summary(fit01)
```

Lagged particulates appears to be significant.


### 2.2(b)
```{r}
pairs( cbind( Mort=A$cmort, Temp=A$temp, Part=A$part, Lag_part=A$part4 ))
cor(   cbind( Mort=A$cmort, Temp=A$temp, Part=A$part, Lag_part=A$part4 ))
```

Both are positively correlated and have very significant positive regression coefficients.  Although "lagged particulates" is more highly correlated with "cmort" than is contemporaneous "particulates," the regression coefficient is smaller.  There must be some interactions with "Temp" or "Temp\^2".


### 2.8 (a)

```{r}
plot( cbind(varve,log(varve)), main="Varve" )
n = length(varve)
var(varve[1:(n/2)])
var(varve[(n/2+1):n])
var(log(varve[1:(n/2)]))
var(log(varve[(n/2+1):n]))
hist(varve)
hist(log(varve))
```

Taking log helps, but we really need to log at log differences.


### 2.8 (b)

```{r}
t=seq(260,357)
plot(t,log(varve[t]),type="l", main="Log Varve", ylab="Log Varve")
t=seq(500,572)
plot(t,log(varve[t]),type="l", main="Log Varve", ylab="Log Varve")
```

Observations 260-357 and 500-572 both exhibit behavior similar to recent global warming.

### 2.8 (c)
```{r}
acf(log(varve), main="Log Varve")
```

Lots of autocorrelation.

### 2.8 (d)
```{r}
u = diff(log(varve))
plot(u, main="Change in Log Varve")
acf(u, main="Change in Log Varve")
```

Much better!  Log differences correspond to percentage changes from one year to the next.

### 2.8 (e) 

The ACF cuts off after 1 lag, so a moving average with one lag may be appropriate.

### 2.8 (f)

```{r}
(gamma0 = var(u))
(rho1 = cor( ts.intersect(u,lag(u)))[1,2])
(gamma1 = rho1*gamma0)
```

From $\rho_1=\gamma_1/\gamma_0=\theta/(1+\theta^2)$, we get $\rho_1\theta^2-\theta+\rho_1=0$.
Using the quadratic equation to solve, we get 
$$
\theta = (1\pm \sqrt{1-4\rho_1^2})/(2\rho_1).
$$
Then, from $\gamma_1 = \theta\sigma^2_w$, we get
$$
\sigma^2_w = \gamma_1/\theta.
$$
Plugging our estimates for $\rho_1$ and $\gamma_1=\rho_1\gamma_0$, we get

```{r}
(  theta1 = (1+sqrt(1-4*rho1^2))/(2*rho1)  )
(  theta2 = (1-sqrt(1-4*rho1^2))/(2*rho1)  )
(sigmaw2 = gamma1/theta2)  
```
where we have chosen $\theta_2$ to get an invertible model (we will discuss this in more detail in Chapter 3).

### 2.9 (a)
```{r}
plot(soi, main="SOI")
trend=time(soi)
fit01 = lm( soi~trend )
summary(fit01)
y = residuals(fit01)
n=length(soi)
```


The trend is significantly downwards.


### 2.9 (b)


```{r}
f = seq(1,(n-1)/2)/n
A = rep(0,length(f))
for (i in 1:length(f) ) {
  fit = lm( y ~ cos(2*pi*f[i]*seq(1,n)) + sin(2*pi*f[i]*seq(1,n)))
  A[i] = sqrt(fit$coefficients[2]^2 + fit$coefficients[3]^2)
}

plot( f, A, type="l", ylab="Amplitude", xlab="Frequency", main="SOI, Periodogram")
s = sort( A, index.return=TRUE, decreasing=TRUE )
ii = s$ix[1:10]
f[ii]
1/f[ii]
```


Or,
```{r}
f = seq(1,(n-1)/2)/n
I = fft(y)[1+seq(1,(n-1)/2)]
A = 2/n*abs(I)

plot(f, A, type="l", ylab="Amplitude", xlab="Frequency", main="SOI, Periodogram")
s = sort( A, index.return=TRUE, decreasing=TRUE)
ii = s$ix[1:10]
f[ii]
1/f[ii]
```

The main peak occurs at a period of about 1 year.  The second peak occurs at a period of about 4 years.


## 2.11 (a)
```{r}
plot( cbind(oil,gas), main="Oil and gas" )
plot(cbind(log(oil),log(gas)), main="Log oil and gas" )

```

Maybe a random walk?  They don't look stationary.

## 2.11 (c)
```{r}
poil = diff(log(oil))
pgas = diff(log(gas))
plot( cbind( Oil=poil, Gas=pgas), main="Oil and gas, log differences" )
acf( poil, lag.max=100, main="Oil, log differences")
acf( pgas, lag.max=100, main="Gas, log differences")
```

These look pretty much like white noise
.
## 2.11 (d)
```{r}
ccf( poil, pgas, main="Oil and gas, log differences" )
```


## 2.11 (e)
```{r}
lag2.plot( poil, pgas, 3 )
```

The relationships look pretty weak except at lag 0.


## 2.11 (f)
```{r}
indi = ifelse(poil<0,0,1)
mess = ts.intersect( pgas, poil, poilL = lag(poil,-1), indi )
summary(fit <- lm( pgas ~ poil + poilL + indi, data=mess ))
plot( residuals(fit), type="l" )
acf(  residuals(fit) )
```

There is weak evidence of asymmetry. The residuals look pretty good, but there are a couple of big outliers and possibly volatility clustering. (They don't look much different if the asymmetry is omitted.)