# Covariance

Variance measures the spread of a single variable. When a data set contains two variables, they may be linearly related. Covariance and correlation both describe that relationship: covariance indicates its direction, and correlation measures its strength. Unlike variance, which is never negative, covariance can be positive or negative, depending on how the variables behave together. If larger values of one variable tend to correspond to larger values of the other, the covariance is positive; if larger values of one tend to correspond to smaller values of the other, the covariance is negative.

## Definition

Covariance is a measure of how two variables in a given data set vary together. Let $E[x]$ be the expected value of a variable $x$, and $E[y]$ be the expected value of a variable $y$. Then the covariance between $x$ and $y$ is given by,

$cov(x,\ y)$ = $E[(x - E[x])(y - E[y])]$
= $E[xy] - E[x]E[y]$
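
The step between the two forms uses only the linearity of expectation and the fact that $E[x]$ and $E[y]$ are constants. A short sketch of the expansion:

$cov(x,\ y)$ = $E[(x - E[x])(y - E[y])]$

= $E[xy - xE[y] - yE[x] + E[x]E[y]]$

= $E[xy] - E[x]E[y] - E[x]E[y] + E[x]E[y]$

= $E[xy] - E[x]E[y]$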

For a data set of $N$ pairs $(x_i, y_i)$, the expected values can be replaced by the means $\mu_x$ and $\mu_y$ of the data. Hence, the covariance between $x$ and $y$ can be given as,

$cov(x,y)$ = $\sum_{i=1}^{N}$ $\frac{(x_i-\mu_x )(y_i-\mu_y )}{N}$

## Formula

The formula for the covariance of $x$ and $y$ is,

$cov(x,y)$ = $\sum_{i=1}^{N}$ $\frac{(x_i-\bar{x} )(y_i-\bar{y} )}{N}$

where $\bar{x}$ is the mean of variable $x$, $\bar{y}$ is the mean of variable $y$, and $N$ is the total number of pairs.
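
As an illustration, the covariance of a small data set can be computed directly: average the products of the deviations from the two means, dividing by $N$. The function name `covariance` and the sample data below are hypothetical, chosen only for this sketch.

```python
def covariance(xs, ys):
    """Population covariance: mean product of deviations from the means."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum (x_i - mean_x) * (y_i - mean_y) over all pairs, then divide by N.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Illustrative data: y rises with x in the first case and falls in the second.
xs = [1, 2, 3, 4, 5]
print(covariance(xs, [2, 4, 6, 8, 10]))   # 4.0
print(covariance(xs, [10, 8, 6, 4, 2]))   # -4.0
```

Note that some libraries divide by $N - 1$ instead of $N$ (the sample covariance), so their results may differ slightly from this sketch.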

To calculate covariance, take each pair of observations in turn, find how far each value lies from the mean of its variable, multiply the two deviations, and average the products over all pairs. We get a large positive value if there is a strong positive relationship between the two variables, and a large negative value if there is a strong negative relationship between the two variables.

Covariance will be zero if there is no linear relationship between the variables.
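
The converse needs some care: a covariance of zero rules out a linear trend, but the variables may still be related in some other way. A minimal sketch, using made-up data in which $y = x^2$ exactly (the helper `covariance` is a hypothetical name for the population formula):

```python
def covariance(xs, ys):
    """Population covariance: mean product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# y is completely determined by x, yet the positive and negative
# deviation products cancel because x is symmetric around zero.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]   # [4, 1, 0, 1, 4]
print(covariance(xs, ys))  # 0.0
```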

## Properties of Covariance

For variables $x$, $y$ and $z$, the properties of covariance are:

1) $cov(x, y)$ = $cov(y, x)$

2) $cov(x, x)$ = $var(x)$

3) $cov(x + y, z)$ = $cov(x, z) + cov(y, z)$

4) $cov(c \cdot x, y)$ = $c \cdot cov(x, y)$, where $c$ is a constant

5) $cov(a + b \cdot x, y)$ = $b \cdot cov(x, y)$, where $a$ and $b$ are constants
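
These properties can be checked numerically on a small data set. The sketch below is only a sanity check, not a proof; the helper `cov`, the data and the constants are all hypothetical values chosen for illustration.

```python
import math


def cov(xs, ys):
    """Population covariance: mean product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((p - mean_x) * (q - mean_y) for p, q in zip(xs, ys)) / n


def var(xs):
    """Population variance: mean squared deviation from the mean."""
    mean_x = sum(xs) / len(xs)
    return sum((p - mean_x) ** 2 for p in xs) / len(xs)


x = [2.0, 4.0, 6.0, 9.0]
y = [1.0, 3.0, 2.0, 8.0]
z = [5.0, 1.0, 4.0, 2.0]
a, b, c = 7.0, 3.0, -2.0


def close(u, v):
    # Compare with a tolerance, since floating-point results are rarely exact.
    return math.isclose(u, v, rel_tol=1e-9, abs_tol=1e-12)


checks = {
    "1) symmetry": close(cov(x, y), cov(y, x)),
    "2) cov(x, x) = var(x)": close(cov(x, x), var(x)),
    "3) additivity": close(cov([p + q for p, q in zip(x, y)], z),
                           cov(x, z) + cov(y, z)),
    "4) scaling": close(cov([c * p for p in x], y), c * cov(x, y)),
    "5) shift and scale": close(cov([a + b * p for p in x], y),
                                b * cov(x, y)),
}

for name, holds in checks.items():
    print(name, holds)
```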

## Covariance & Correlation

Correlation measures the strength of the linear relationship between two variables, that is, how closely a change in one variable is accompanied by a change in the other. Its value always lies between $-1$ and $+1$. Covariance also describes how two variables vary together and can be negative or positive, but its magnitude depends on the units of the variables and is not bounded. Correlation is obtained by dividing the covariance by the standard deviations $\sigma_{x}$ and $\sigma_{y}$ of the two variables,
$cor(x, y)$ = $\frac{cov(x,y)}{\sigma_{x}\ \sigma_{y}}$
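
A minimal sketch of this relationship, assuming population statistics (division by $N$) throughout. The function names and the data are hypothetical. It also illustrates that rescaling one variable changes the covariance but leaves the correlation essentially unchanged.

```python
import math


def covariance(xs, ys):
    """Population covariance: mean product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sigma_x = math.sqrt(covariance(xs, xs))  # cov(x, x) is var(x)
    sigma_y = math.sqrt(covariance(ys, ys))
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return covariance(xs, ys) / (sigma_x * sigma_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
ys_scaled = [100 * y for y in ys]  # same data in different units

print(covariance(xs, ys), covariance(xs, ys_scaled))    # 4.0 400.0
print(correlation(xs, ys), correlation(xs, ys_scaled))  # both approximately 1.0
```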