31 Data assimilation

Data assimilation is about determining a model state from observations.

31.1 Objective analysis

Until the second half of the 20th century, meteorologists had to draw weather maps manually based on the available data, which is called analysis. The initial states of the first operational model runs were also determined in this way [30]. This was time-consuming, inefficient and error-prone. Therefore, people quickly began to automate this process, which is known as objective analysis. The analysis is objective because it no longer depends on the person who creates it. Such analyses essentially involve interpolations and conversions of the measured variables into model variables.

31.2 Optimum interpolation

A disadvantage of interpolations is that they can become very inaccurate if there are no measurement data in the nearby area. At the beginning of operational numerical weather forecasting in the 1950s, there were only conventional measurements, most of them in Europe and the USA. There were hardly any weather observations in the southern hemisphere. An obvious solution is to use a climatological background state or the previous model run as a replacement for measurements in regions where hardly any measurements are taken. This allows the information from the areas with many observations to propagate to the regions with few observations using the physical laws simulated by the model.

The so-called optimum interpolation is about combining information from the observation data and a background state in an optimal way. This means that the result (the analysis) should be as close as possible to the real atmospheric state at the time of analysis.This is not necessarily the state that leads to the best prediction. For this purpose, a cost function $J$ is used, which contains two terms:

A portion $J_B$ that represents the difference between the analysis and the background state.
A portion $J_y$ representing the difference between the analysis and the observations.

As a formula, one writes

\[ \begin{align} J\left(\mathbf{x}\right) = J_B\left(\mathbf{x}, \mathbf{x}_B\right) + J_y\left(\mathbf{x}, \mathbf{y}\right). \end{align} \]

31.2.1 An observation

31.2.1.1 A model degree of freedom

First, imagine that one has only a single observation $y$ and that the model consists of only a single number $x$. As an ansatz $J'$ for $J$, one takes the Euclidean norm as a guide and writes

\[ \begin{align} J'\left(x\right) = \sqrt{J_B'\left(x, x_B\right)^2 + J_y'\left(x, y\right)^2} = \sqrt{\alpha_B\left(x - x_B\right)^2 + \alpha_y\left[x - \newhat{I}\left(y\right)\right]^2}. \end{align} \]

Here $\alpha_B$, $\alpha_y$ are two weighting factors with

\[ \begin{align} \alpha_B + \alpha_y = 1. \end{align} \]

Since the cost function is to be minimized and the root increases strictly monotonically, the notation of the cost function is simplified to:

\[ \begin{align} J\left(x\right) = J_B\left(x, x_B\right) + J_y\left(x, y\right) = \frac{1}{2}\alpha_B\left(x - x_B\right)^2 + \frac{1}{2}\alpha_y\left[x - \newhat{I}\left(y\right)\right]^2. \end{align} \]

The factor $\frac{1}{2}$ is a convention and does not change the minimum of the cost function. Since in the case of conventional observations there are usually more model degrees of freedom than measured values, the replacement

\[ \begin{align} \text{model} - \text{interpolated observation} &\to \text{observation} - \text{interpolated model}\\ \Rightarrow x - \newhat{I}\left(y\right) &\to y - \newhat{h}\left(x\right)\tag{31.6}\label{eq:replace_for_oi} \end{align} \]

leads to a smaller problem. Here $\newhat{h}$ is a generally nonlinear operator, the so-called observation operator. The cost function thus reads

\[ \begin{align} J\left(x\right) = \frac{1}{2}\alpha_B\left(x - x_B\right)^2 + \frac{1}{2}\alpha_y\left[y - \newhat{h}\left(x\right)\right]^2.\tag{31.7}\label{eq:3dvar_cost_one_obs} \end{align} \]

The minimum of the cost function satisfies

\[ \begin{align} \nabla J = 0.\tag{31.8}\label{eq:3dvar_criterion} \end{align} \]

This equation is also necessary and sufficient for the minimum in the multidimensional case: since $J$ is quadratic, $\nabla J$ is linear with constant terms that can be made to vanish by a coordinate transformation. The matrix defining $\nabla J$ is moreover symmetric. Since the entries on the main diagonal are all non-zero, this matrix is invertible. Hence there is only one $\mathbf{x}$ with $\nabla J\left(\mathbf{x}\right) = \mathbf{0}$. Because of $\lim_{\left|\mathbf{x}\right| \to \infty} J = \infty$, this is a minimum.

In order to be able to treat the problem with linear algebra, $\newhat{h}$ is linearized at the point $x_B$, i.e.,

\[ \begin{align} \newhat{h}\left(x\right) = \newhat{h}\left(x_B\right) + \newhat{H}\left(x - x_B\right) + \mathcal{O}\left[\left(x - x_B\right)^2\right], \end{align} \]

where

\[ \begin{align} \newhat{H}\left(x\right) \coloneqq \newhat{h}'\left(x_B\right) \end{align} \]

is the derivative of $\newhat{h}$ at the point $x_B$. Substituting this into Eq. (31.7), neglecting an approximation sign, one obtains

\[ \begin{align} J\left(x\right) = \frac{1}{2}\alpha_B\left(x - x_B\right)^2 + \frac{1}{2}\alpha_y\left[y - \newhat{h}\left(x_B\right) - \newhat{H}\left(x - x_B\right)\right]^2.\tag{31.11}\label{eq:3dvar_cost_simple} \end{align} \]

The derivative of this reads

\[ \begin{align} J'\left(x\right) = \alpha_B\left(x - x_B\right) - \alpha_y\newhat{H}\left[y - \newhat{h}\left(x_B\right) - \newhat{H}\left(x - x_B\right)\right].\tag{31.12}\label{eq:3dvar_deriv_1obs} \end{align} \]

With only one measurement and one model variable, $\newhat{H}$ is simply a number,

\[ \begin{align} \newhat{H} = H. \end{align} \]

The cost function should be dimensionless. From a dimensional analysis one obtains

\[ \begin{align} \left[\alpha_B\right] = \frac{1}{\left[x^2\right]}, & {} & \left[\alpha_y\right] = \frac{1}{\left[y^2\right]}. \end{align} \]

A sensible ansatz for the weighting factors is the inverse variances:

\[ \begin{align} \alpha_B = \frac{1}{\sigma_B^2}, & {} & \alpha_y = \frac{1}{\sigma_y^2} \end{align} \]

Here $\sigma_B$ is the standard deviation of the background state and $\sigma_y$ is the standard deviation of the measurement $y$. This can also be justified more formally by assuming that both the error of the background state and of the measurement are normally distributed, i.e.,

\[ \begin{align} P\left(x, y\right) \propto \exp\left[-\frac{\left(x - x_B\right)^2}{2\sigma_B^2}\right]\exp\left[-\frac{\left(y - \newhat{h}\left(x\right)\right)^2}{2\sigma_B^2}\right] = \exp\left[-\frac{\left(x - x_B\right)^2}{2\sigma_B^2} - \frac{\left(y - \newhat{h}\left(x\right)\right)^2}{2\sigma_B^2}\right]. \end{align} \]

$P\left(x, y\right)$ is the probability density for $x$ under the assumption that $y$ has occurred. One maximizes this by minimizing the negative logarithm, i.e.

\[ \begin{align} P\left(x, y\right)\text{ maximal} \Rightarrow \frac{\left(x - x_B\right)^2}{2\sigma_B^2} + \frac{\left(y - \newhat{h}\left(x\right)\right)^2}{2\sigma_B^2}\text{ minimal.} \end{align} \]

From Eq. (31.12), using Eq. (31.8), one obtains

\[ \begin{align} \frac{1}{\sigma_B^2}\left(x - x_B\right) - \frac{H}{\sigma_y^2}\left[y - \newhat{h}\left(x_B\right) - H\left(x - x_B\right)\right] &= \frac{1}{\sigma_B^2}\left(x - x_B\right) - \frac{H}{\sigma_y^2}\left[y - \newhat{h}\left(x_B\right)\right] + \frac{H^2}{\sigma_y^2}\left(x - x_B\right) = 0\nonumber\\ \Rightarrow\left(\frac{1}{\sigma_B^2} + \frac{H^2}{\sigma_y^2}\right)\left(x - x_B\right) - \frac{H}{\sigma_y^2}\left[y - \newhat{h}\left(x_B\right)\right] &= 0\nonumber\\ \Rightarrow x = x_B + \frac{H}{\sigma_y^2\left(\frac{1}{\sigma_B^2} + \frac{H^2}{\sigma_y^2}\right)}\left[y - \newhat{h}\left(x_B\right)\right] &= x_B + \frac{H}{\frac{\sigma_y^2}{\sigma_B^2} + H^2}\left[y - \newhat{h}\left(x_B\right)\right]. \end{align} \]

Two interesting limiting cases follow from this:

In the case $\frac{\sigma_y^2}{\sigma_B^2} \to \infty$, which corresponds to very imprecise measurements $y$, one has $x = x_B$. In this case, the analysis is determined only from the background state.
In the case $\frac{\sigma_B^2}{\sigma_y^2} \to \infty$, which corresponds to a very imprecise background state $x_B$, one has

\[ \begin{align} x &= x_B + \frac{H}{H^2}\left[y - \newhat{h}\left(x_B\right)\right] = x_B + \frac{y - \newhat{h}\left(x_B\right)}{H}. \end{align} \]

In this case, the analysis is determined only from the measurements.

31.2.1.2 Multiple degrees of model freedom

Now assume that a model state is not determined by a single number $x$, but by a state vector $\mathbf{x}$ of length $N$. For this case, the cost function Eq. (31.11) generalizes to

\[ \begin{align} J\left(\mathbf{x}\right) = \frac{1}{2}\left(\mathbf{x} - \mathbf{x}_B\right)^TB^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) + \frac{1}{2}\frac{1}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right) - \mathbf{H}\left(\mathbf{x} - \mathbf{x}_B\right)\right]^2.\tag{31.20}\label{eq:3dvar_cost_1_obs} \end{align} \]

$\mathbf{H}$ is a $1 \times N$ vector. $B \in \mathbb{R}^{N \times N}$ is the covariance matrix of the errors of the background state. This matrix has diagonal form; on its main diagonal are the variances of the background state:

\[ \begin{align} B = \left(\begin{array}{cccc} \sigma_{0, B}^2 & 0 & \dots & 0\\ 0 & \sigma_{1, B}^2 & 0 & \vdots\\ \vdots & \dots & \ddots & 0\\ 0 & \dots & 0 & \sigma_{N - 1, B}^2 \end{array}\right) \end{align} \]

From this it follows

\[ \begin{align} B^{-1} = \left(\begin{array}{cccc} \frac{1}{\sigma_{0, B}^2} & 0 & \dots & 0\\ 0 & \frac{1}{\sigma_{1, B}^2} & 0 & \vdots\\ \vdots & \dots & \ddots & 0\\ 0 & \dots & 0 & \frac{1}{\sigma_{N - 1, B}^2} \end{array}\right). \end{align} \]

Thus one obtains

\[ \begin{align} \frac{1}{2}\left(\mathbf{x} - \mathbf{x}_B\right)^TB^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) = \sum_{i = 0}^{N - 1}\frac{\left(x_i - x_{i, B}\right)^2}{2\sigma_{i, B}^2}. \end{align} \]

From this one obtains

\[ \begin{align} \left(\nabla J\right)_i = \frac{\left(x_i - x_{i, B}\right)}{\sigma_{i, B}^2} - \frac{H_i}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right) - \mathbf{H}\left(\mathbf{x} - \mathbf{x}_B\right)\right]. \end{align} \]

This can be written in the compact form

\[ \begin{align} \nabla J = B^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) - \frac{\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right) - \mathbf{H}\left(\mathbf{x} - \mathbf{x}_B\right)\right]. \end{align} \]

From Eq. (31.8) it thus follows in this case

\[ \begin{align} B^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) &= \frac{\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right) - \mathbf{H}\left(\mathbf{x} - \mathbf{x}_B\right)\right]\nonumber\\ \Rightarrow B^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) + \frac{\mathbf{H}^T}{\sigma_y^2}\left[\mathbf{H}\left(\mathbf{x} - \mathbf{x}_B\right)\right] &= \frac{\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right)\right]\nonumber\\ \Rightarrow\left(B^{-1} + \frac{\mathbf{H}^T\mathbf{H}}{\sigma_y^2}\right)\left(\mathbf{x} - \mathbf{x}_B\right) &= \frac{\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right)\right]\nonumber \end{align} \] \[ \begin{align} \Rightarrow\left(1 + B\frac{\mathbf{H}^T\mathbf{H}}{\sigma_y^2}\right)\left(\mathbf{x} - \mathbf{x}_B\right) &= \frac{B\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right)\right]\nonumber\\ \Rightarrow\mathbf{x} - \mathbf{x}_B &= \left(1 + B\frac{\mathbf{H}^T\mathbf{H}}{\sigma_y^2}\right)^{-1}\frac{B\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right)\right]\nonumber \end{align} \]

\[ \begin{align} \Rightarrow\mathbf{x} &= \mathbf{x}_B + \left(B\frac{\mathbf{H}^T\mathbf{H}}{\sigma_y^2} + 1\right)^{-1}\frac{B\mathbf{H}^T}{\sigma_y^2}\left[y - \newhat{h}\left(\mathbf{x}_B\right)\right]. \end{align} \]

31.2.2 Any number of observations

Now assume that there are not one but $M$ observations. For this case, the cost function Eq. (31.20) is generalized to

\[ \begin{eqnarray} J\left(\mathbf{x}\right) = \frac{1}{2}\left(\mathbf{x} - \mathbf{x}_B\right)^TB^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) + \frac{1}{2}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right) - H\left(\mathbf{x} - \mathbf{x}_B\right)\right)^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right) - H\left(\mathbf{x} - \mathbf{x}_B\right)\right).\tag{31.27}\label{eq:3d-var_cost_function} \end{eqnarray} \]

Here $R$ is the covariance matrix of the errors of the observations. Since these are predominantly statistical (and not systematic) errors, one can assume that it has diagonal form. One has

\[ \begin{align} \left(\mathbf{y} - H\mathbf{x}\right)^TR^{-1}\left(\mathbf{y} - H\mathbf{x}\right) &= \mathbf{y}^TR^{-1}\mathbf{y} - \mathbf{y}^TR^{-1}H\mathbf{x} - \mathbf{x}^TH^TR^{-1}\mathbf{y} + \mathbf{x}^TH^TR^{-1}H\mathbf{x}. \end{align} \]

One now computes

\[ \begin{align} \mathbf{y}^TR^{-1}H\mathbf{x} &= \left(R^{-1}H\mathbf{x}\right)^T\mathbf{y} = \mathbf{x}^TH^TR^{-1}\mathbf{y}. \end{align} \]

Here it was used that $R^{-1}$ is orthogonal. Thus one has

\[ \begin{align} \left(\mathbf{y} - H\mathbf{x}\right)^TR^{-1}\left(\mathbf{y} - H\mathbf{x}\right) &= \mathbf{y}^TR^{-1}\mathbf{y} - 2\mathbf{x}^TH^TR^{-1}\mathbf{y} + \mathbf{x}^TH^TR^{-1}H\mathbf{x}\nonumber\\ &= \mathbf{y}^TR^{-1}\mathbf{y} - 2\mathbf{x}^TH^TR^{-1}\mathbf{y} + \left(\mathbf{x}^TH^TR^{-1}H\right)^T\cdot\mathbf{x}\nonumber\\ &= \mathbf{y}^TR^{-1}\mathbf{y} - 2\mathbf{x}^TH^TR^{-1}\mathbf{y} + \left(H^TR^{-1}H\mathbf{x}\right)\cdot\mathbf{x}.\tag{31.30}\label{eq:3d-var_deriv} \end{align} \]

It was taken into account that because of

\[ \begin{align} \left(H^TR^{-1}H\right)^T &= H^TR^{-1}H \end{align} \]

the matrix $H^TR^{-1}H$ is orthogonal. If one forms the gradient of Eq. (31.30) (with respect to $\mathbf{x}$), one obtains

\[ \begin{align} \nabla\left[\left(\mathbf{y} - H\mathbf{x}\right)^TR^{-1}\left(\mathbf{y} - H\mathbf{x}\right)\right] &= -2H^TR^{-1}\mathbf{y} + 2H^TR^{-1}H\mathbf{x} = -2H^TR^{-1}\left(\mathbf{y} - H\mathbf{x}\right). \end{align} \]

From this it follows

\[ \begin{eqnarray} \nabla J = B^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) - H^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right) - H\left(\mathbf{x} - \mathbf{x}_B\right)\right). \end{eqnarray} \]

From Eq. (31.8) it thus follows

\[ \begin{align} B^{-1}\left(\mathbf{x} - \mathbf{x}_B\right) - H^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right) - H\left(\mathbf{x} - \mathbf{x}_B\right)\right) &= 0\nonumber\\ \Leftrightarrow\left(\mathbf{x} - \mathbf{x}_B\right) - BH^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right) - H\left(\mathbf{x} - \mathbf{x}_B\right)\right) &= 0\nonumber\\ \Leftrightarrow\left(BH^TR^{-1}H + 1\right)\left(\mathbf{x} - \mathbf{x}_B\right) - BH^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right)\right) &= 0\nonumber \end{align} \]

\[ \begin{align} \Leftrightarrow\left(BH^TR^{-1}H + 1\right)\left(\mathbf{x} - \mathbf{x}_B\right) = BH^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right)\right).\tag{31.34}\label{eq:oi_lin} \end{align} \]

This is the system of linear equations valid for $\mathbf{x}$. One can further transform this into

\[ \begin{align} \mathbf{x} &= \mathbf{x}_B + \left(BH^TR^{-1}H + 1\right)^{-1}BH^TR^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right)\right).\tag{31.35}\label{eq:oi_result_pre} \end{align} \]

The matrix

\[ \begin{align} K \coloneqq \left(BH^TR^{-1}H + 1\right)^{-1}BH^TR^{-1}\tag{31.36}\label{eq:gain_first_version} \end{align} \]

is called gain matrix. To calculate the right-hand side of this equation, one must invert an $N \times N$ matrix.

One can simplify the expression for the gain matrix somewhat further. Let $A \in \mathbb{R}^{N \times M}$. Then one has

\[ \begin{align} A &= A\left(HA + 1\right)\left(HA + 1\right)^{-1}\nonumber\\ \Leftrightarrow A &= \left(AHA + A\right)\left(HA + 1\right)^{-1}\nonumber\\ \Leftrightarrow A &= \left(AH + 1\right)A\left(HA + 1\right)^{-1}\nonumber\\ \Leftrightarrow\left(AH + 1\right)^{-1}A &= A\left(HA + 1\right)^{-1}\nonumber. \end{align} \]

Defining now

\[ \begin{align} A \coloneqq BH^TR^{-1}, \end{align} \]

it follows

\[ \begin{align} K = \left(BH^TR^{-1}H + 1\right)^{-1}BH^TR^{-1} = BH^TR^{-1}\left(HBH^TR^{-1} + 1\right)^{-1}. \end{align} \]

To calculate the right-hand side of this equation, an $M \times M$ matrix must now be inverted, which for $M < N$ is more efficient than Eq. (31.36). One can further simplify this to

\[ \begin{align} K = BH^T\left(HBH^T + R\right)^{-1} \end{align} \]

Putting this into Eq. (31.35), one obtains

\[ \begin{align} \mathbf{x} &= \mathbf{x}_B + BH^T\left(HBH^T + R\right)^{-1}\left(\mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right)\right).\tag{31.40}\label{eq:oi_result} \end{align} \]

31.2.3 Special problems in the treatment of the wind field

The wind field can be assimilated as part of the OI. However, with coarse model resolutions, this has the disadvantage that the wind obtained in this way is often further away from the balance equation than is usual on the synoptic scale and thus excites sound and gravity waves that are too strong. These then cause forecast errors. Therefore, it may make sense to determine the wind field not on the basis of measurements, but by solving the balance equation. This is an example of the fact that OI maximizes the quality of the analysis rather than of the forecast.

31.2.4 Consideration of hydrostatics

If one assumes hydrostatics, i.e. the validity of the basic hydrostatic equation Eq. (13.122), the thermodynamic state of the atmosphere is simplified considerably: usually two thermodynamic quantities $q_1$, $q_2$ are sufficient to determine this. If an equation of the form

\[ \begin{align} \frac{\partial q_1}{\partial z} = f\left(q_2\right) \end{align} \]

holds, one can, however, replace the specification of the quantity $q_1$ in the entire atmosphere by its specification on a single surface, for example the earth's surface or a pressure surface. Requiring the exact validity of a statement about the result of data assimilation is called strong constraint.

31.3 3D-Var

The inversion of the $M \times M$ matrix $HBH^T + R$ is very costly if the number of individual observations is large. To see this, note that a radar image, for example, consists of a very large number of individual observations. This is a disadvantage of OI. It could be avoided by undoing the replacement Eq. (31.6). This is not directly possible with remote sensing data because, for example, spectral radiance on a satellite cannot easily be converted into temperatures. A solution to this is presented in this section.

The 3D-Var method is about solving Eq. (31.40) iteratively without explicitly computing the gain matrix. Define

\[ \begin{align} \delta\mathbf{x} &\coloneqq \mathbf{x} - \mathbf{x}_B,\\ \mathbf{d}_B &\coloneqq \mathbf{y} - \newhat{h}\left(\mathbf{x}_B\right),\\ \end{align} \]

Eq. (31.34) then reads

\[ \begin{align} \left(BH^TR^{-1}H + 1\right)\delta\mathbf{x} = BH^TR^{-1}\mathbf{d}_B. \end{align} \]

Multiplying by $B^{-1}$ yields

\[ \begin{align} \left(H^TR^{-1}H + B^{-1}\right)\delta\mathbf{x} = H^TR^{-1}\mathbf{d}_B. \end{align} \]

So here an $N \times N$ matrix must be inverted. From this one would expect that for $N < M$, i.e. if there are more observations than model degrees of freedom, this equation is more efficient than Eq. (31.40).

31.3.1 Iterative solution process

31.3.2 Outer and inner loop

31.3.2.1 The outer loop

31.3.2.2 The inner loop

31.3.3 Weak constraints

Another advantage of 3D-Var over OI is that constraints can be enforced without much effort by including them in the cost function. This is called a weak constraint. The procedure presented in Sect. 31.2.4 can be formulated as a weak constraint by including a term

\[ \begin{align} J \propto \frac{1}{2}\int_A\left(\frac{\partial p}{\partial z} + g\rho\right)^2d^3r \end{align} \]

in the cost function.

31.3.3.1 Balance equation

The horizontal wind is closely linked to the thermodynamic quantities on the synoptic scale due to the approximate validity of the geostrophic balance and the balance equation. It can therefore be viewed as a diagnostic variable. Computing it directly from geostrophy in the form of Eq. (13.187) is not possible in the tropics, due to the divergence problem presented in Sect. 13.10.1. Therefore, one can instead use the balance equation (15.137) or the linearized balance equation Eq. (15.138). This filters out both sound and gravity waves, since both are associated with divergences.

Analogously to the case of hydrostatics, it is again possible here to either require the validity of the balance equation exactly or, which is easier, to include it in the cost function.

31.4 4D-Var

4D-Var is based on 3D-Var, but additionally takes the time dimension into account. To do this, the cost function $J$ is generalized in the form

\[ \begin{eqnarray} J = J_{\text{3D-Var}} + \sum_{n = 1}^NJ_n, \end{eqnarray} \]

here, $N$ is the number of time steps involved (apart from the analysis time itself) and $J_n$ is the cost function at step $n$. $J_{\text{3D-Var}}$ is defined as in Eq. (31.27). Let $\mathbf{Y}_n$ be the vector of observations at time step $n$. At each time step, this vector can have a different size and a different meaning. Let $\mathbf{X}_n$ be the state vector of the model atmosphere at time step $n$, which is determined from the analysis state $\mathbf{X}$ via

\[ \begin{eqnarray} \mathbf{X}_n = M_n\mathbf{X}, \end{eqnarray} \]

Here $M_n$ is the integration of the model from the analysis time to time step $n$. To linearize this integration, one computes a so-called tangent linear model operator $M$, which is a matrix. Then one has

\[ \begin{eqnarray} M_n\approx M^n, \end{eqnarray} \]

which becomes increasingly inaccurate as $n$ increases. Thus one has

\[ \begin{eqnarray} J_n = \left(\mathbf{Y}_n - H_nM^n\mathbf{X}\right)^TA_{O, n}\left(\mathbf{Y}_n - H_nM^n\mathbf{X}\right). \end{eqnarray} \]

31.4.1 Outer and inner loop

31.4.1.1 The outer loop

31.4.1.2 The inner loop

31.5 Interpolation of another model

If the initial state of another model is available (for example on a longitude-latitude grid), it is entirely possible to simply interpolate it onto the desired model grid and convert it into the prognostic variables. Here, the analysis of the other model is interpreted as an „observation“ and the assumption is made that the error of the data assimilation of the other model is negligible compared to the error of the forecast of one's own model. This assumption is often justified, and even if it is not justified, it may still make sense to use this method in order to devote the computing capacity freed up by the simplicity of this procedure to the model run.