Simulating the distribution and cross-correlation of wind farm output, E.ON Engineering
Question:
Can the output of a portfolio of wind farms be modelled such that the
resultant multivariate time series have the correct auto correlation,
cross correlations and distribution structure?
Background:
With the current expansion of renewable generation, and especially wind
power, it is important to understand the effect a portfolio of wind
farms has on their combined output. Larger and larger wind farms are
planned which will be geographically dispersed, many being offshore,
and will connect directly to the transmission network. Currently,
small wind farms connect into the distribution network, so
their uncertainty is ‘hidden’ amongst the demand
uncertainty. For large farms connected directly to the transmission
network this is no longer the case, and the uncertainty in their
output needs to be modelled in order to better control the network
as a whole.
Mathematics:
The output from a single wind farm is highly intermittent, but one way of
reducing this uncertainty is to combine its output with that of
another wind farm. The more wind farms that are added to
the portfolio, the less intermittent the total output becomes.
However, this argument assumes that each wind farm’s
output is completely independent of the other wind farms in the
portfolio. This cannot be the case for a large portfolio, and
therefore the degree to which the wind farms are correlated must be
taken into account. The primary influence on wind farm
interdependency is the distance between them. Two wind farms in the
Highlands are less likely to have independent outputs than one wind
farm in the Highlands and another in Cornwall. This relationship
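between distance, d, and correlation, ρ, is believed to be
exponential in nature, i.e. of a form such as
$$\rho(d) = e^{-d/D},$$
where D is a characteristic decay distance to be estimated from the data.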
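As an illustration of how such a relationship could be turned into a correlation matrix for a portfolio, here is a minimal Python sketch. It assumes the simple form rho(d) = exp(-d / D); the decay distance `decay_km`, the function name and the planar site coordinates are all hypothetical and would need to be replaced by values fitted to measured data.

```python
import math

def exp_correlation_matrix(coords_km, decay_km):
    """Correlation matrix under the ASSUMED form rho(d) = exp(-d / decay_km).

    coords_km: list of (x, y) site positions in km (hypothetical planar grid).
    decay_km:  assumed decay distance D; not a value given in the brief.
    """
    n = len(coords_km)
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = math.dist(coords_km[i], coords_km[j])  # Euclidean distance
            corr[i][j] = math.exp(-d / decay_km)
    return corr

# Two nearby sites and one distant site (made-up coordinates).
sites = [(0.0, 0.0), (30.0, 40.0), (600.0, 800.0)]
corr = exp_correlation_matrix(sites, decay_km=500.0)
```

With these made-up inputs the two nearby sites (50 km apart) are far more strongly correlated than either is with the distant site, which is the qualitative behaviour described above.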
Another property that is important in modelling a portfolio of wind farms is the autocorrelation of the wind resource at each location. For the purposes of this question we will only consider the first lag autocorrelation, such that the wind speed at any site at time t is highly correlated with the wind speed at that site at time t-1 (time steps in electricity modelling are usually half an hour). Values are typically of the order of 0.95 or higher. Therefore an autoregressive AR(1) model for the wind at each site should be of the form
$$x_t = \phi\, x_{t-1} + \varepsilon_t, \qquad \phi \approx 0.95 \text{ or higher},$$
where x_t is the (suitably transformed) wind speed at time t and ε_t is a random innovation.
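A minimal sketch of such a process for a single site is given below, assuming Gaussian innovations scaled so that the series has unit variance. The function names and the choice of one year of half-hourly steps (17520) are illustrative assumptions only.

```python
import math
import random

def simulate_ar1(phi, n_steps, seed=0):
    """Stationary Gaussian AR(1): x_t = phi * x_{t-1} + sqrt(1 - phi^2) * e_t."""
    rng = random.Random(seed)
    innovation_scale = math.sqrt(1.0 - phi * phi)  # keeps Var(x_t) = 1
    x = rng.gauss(0.0, 1.0)  # start from the stationary distribution
    series = [x]
    for _ in range(n_steps - 1):
        x = phi * x + innovation_scale * rng.gauss(0.0, 1.0)
        series.append(x)
    return series

def lag1_autocorrelation(series):
    """Sample first-lag autocorrelation of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

series = simulate_ar1(phi=0.95, n_steps=17520)  # one year of half hours
estimated_phi = lag1_autocorrelation(series)
```

The sample autocorrelation `estimated_phi` should be close to the chosen value of 0.95, up to sampling error.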
The final property of any simulated time series of wind farm outputs is the distribution. Wind speeds are well known to follow a Weibull distribution, which is described by two parameters, scale and shape. Therefore, for a given set of shape and scale parameters (one pair for each wind farm), each simulated time series should follow the correct Weibull distribution.
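For reference, the two-parameter Weibull distribution has a closed-form cumulative distribution function and inverse, which is what makes it convenient for simulation by inversion. The sketch below is illustrative; the function names and the example shape and scale values are assumptions, not fitted parameters.

```python
import math

def weibull_cdf(x, shape, scale):
    """F(x) = 1 - exp(-(x / scale)^shape) for x >= 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_inverse_cdf(u, shape, scale):
    """Quantile function: x = scale * (-ln(1 - u))^(1 / shape), for 0 <= u < 1."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

# Example with made-up parameters: shape 2.0, scale 8.0 m/s.
median_speed = weibull_inverse_cdf(0.5, shape=2.0, scale=8.0)
```

Feeding uniform random numbers on [0, 1) through `weibull_inverse_cdf` yields Weibull-distributed wind speeds with the chosen shape and scale.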

Progress to Date:
To date it has been possible to combine the cross correlations and the
distributions of a portfolio of wind farms using standard copula
theory: generate a multivariate set of normal random numbers with a
known Spearman rank correlation coefficient, transform them to uniform
random numbers and then invert these into the distribution required.
Spearman rank correlations are preserved through these
transformations whereas Pearson correlations are not, but for normal
variables there exists a standard relationship between the two
coefficients. However, this does not give the correct autocorrelations.
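The copula steps described above can be sketched for a pair of sites as follows. This is an illustration under stated assumptions rather than the code actually used: it relies on the standard bivariate-normal relationship rho_Pearson = 2 sin(pi * rho_Spearman / 6), and all function names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def spearman_to_pearson(rho_s):
    """Pearson correlation of a bivariate normal with Spearman correlation rho_s."""
    return 2.0 * math.sin(math.pi * rho_s / 6.0)

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def ranks(values):
    """Rank of each value (0 = smallest); ties are not expected for continuous data."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def spearman(a, b):
    return pearson(ranks(a), ranks(b))

def sample_weibull_pair(rho_s, shape_a, scale_a, shape_b, scale_b, n, seed=0):
    """Gaussian copula: correlated normals -> uniforms -> Weibull marginals."""
    rng = random.Random(seed)
    rho = spearman_to_pearson(rho_s)
    std_normal = NormalDist()
    out_a, out_b = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Clamp below 1 so that log(1 - u) is always defined.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        out_a.append(scale_a * (-math.log(1.0 - u1)) ** (1.0 / shape_a))
        out_b.append(scale_b * (-math.log(1.0 - u2)) ** (1.0 / shape_b))
    return out_a, out_b

speeds_a, speeds_b = sample_weibull_pair(0.8, 2.0, 8.0, 1.8, 10.0, n=5000)
achieved_spearman = spearman(speeds_a, speeds_b)
```

Because each sample is drawn independently of the previous one, the output has the requested rank correlation and marginals but no autocorrelation, which is the shortcoming noted above.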
In addition, an algorithm devised by Biller and Nelson at Northwestern University claims to be able to meet all three properties using a technique known as VARTA (vector autoregressive to anything). VARTA takes several AR(1) processes and adjusts the correlation matrix to the desired correlation via a number of individual matching problems. The new AR(1) process is then transformed using inverse Johnson marginal distributions. Although this seems to work for a small set of wind farms (n<=4), above this level the algorithm struggles and the software developed gives error messages.
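For comparison, a much cruder construction than VARTA is sketched below. It is not the Biller and Nelson algorithm: it assumes the same first-lag coefficient `phi` at every site, drives a Gaussian vector AR(1) process with correlated innovations, and then maps each component to a Weibull marginal by inversion. No correlation matching is attempted, so the Pearson correlations of the transformed series only approximate the values specified for the underlying normals. All names and parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def simulate_portfolio(corr, phi, shapes, scales, n_steps, seed=0):
    """Rows of wind speeds, one column per site (ASSUMES a common phi)."""
    rng = random.Random(seed)
    n = len(corr)
    lower = cholesky(corr)
    std_normal = NormalDist()

    def correlated_normals():
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        return [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(n)]

    z = correlated_normals()  # start from the stationary distribution
    innovation_scale = math.sqrt(1.0 - phi * phi)  # keeps Cov(z_t) = corr
    speeds = []
    for _ in range(n_steps):
        row = []
        for i in range(n):
            u = min(std_normal.cdf(z[i]), 1.0 - 1e-12)
            row.append(scales[i] * (-math.log(1.0 - u)) ** (1.0 / shapes[i]))
        speeds.append(row)
        shock = correlated_normals()
        z = [phi * z[i] + innovation_scale * shock[i] for i in range(n)]
    return speeds

corr = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.3], [0.2, 0.3, 1.0]]  # made-up values
speeds = simulate_portfolio(corr, 0.95, [2.0, 2.0, 1.8], [8.0, 9.0, 10.0], 17520)
site0 = [row[0] for row in speeds]
site1 = [row[1] for row in speeds]
cross_corr = pearson(site0, site1)
lag1_corr = pearson(site0[1:], site0[:-1])
```

The restriction to a single `phi` is what keeps the stationary covariance of the normals equal to `corr`; allowing site-specific coefficients would require solving for a different innovation covariance.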
Requirements:
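In conclusion, it is proposed that an algorithm be developed
that will simulate the wind at each site from a portfolio of wind
farms such that the simulated time series have the correct
cross correlations between sites, the correct first lag
autocorrelation at each site and the correct distribution at each site.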
If it is necessary for some of the hypothesised distributions to be modified (e.g. not to be Weibull) then a distribution that gives an equally good fit to the measured data would be acceptable.
Data will be provided in the form of an Access database containing wind speed measurements for 7 locations. (Warning: Not all the data is synchronised, but there should be sufficient to test any algorithm.)
The solution should be easy to code in a software package such as Matlab, and be able to handle a portfolio of about 20 wind farms. It should be able to run an annual simulation on a PC in 10 minutes. The preference is for a pragmatic solution rather than mathematical purity.
Whether the technique/algorithm uses any of the ideas already studied is completely up to the academics involved.
Problem presenter: Tom Allerton,
E.ON Engineering.