Differential Evolution - Markov Chain

4 minute read

Introduction

The Differential Evolution - Markov Chain1 is an optimization algorithm for real parameter spaces. It provides posterior distribution samples of the parameters, rather than point estimates. This helps quantify the uncertainty in that parameter.

The Differential Evolution2 component of the algorithm is used to generate candidate solutions, which are either accepted or rejected according to the Metropolis ratio. Since candidate solutions are generated by this method, the proposal or jumping distribution does not have to be tuned. Practically, this translates to the algorithm producing good posterior on difficult optimization problems and avoiding local minima.

Like Differential Evolution, a population of solutions is evolved to constitute the Markov Chain. Again, this presents a great practical advantage because model evaluation can be parallelized. This saves a lot of time when the most expensive step is evaluating the model.

I used this algorithm to estimate the parameters of my model3 that tracks the flow of viruses from water used to irrigate plants to the final produce. Human viruses, if present in the irrigation water, can internalize into the plant tissue and potentially cause infections in the people who eat such produce.

Commented MATLAB code

The commented MATLAB code implementing this algorithm can be downloaded from the Github associated with my viral transport model here.

Check out the repository!

Example usage

To demonstrate the syntax and usage of the function, here is a minimal example. Suppose we have some data that we know is normally distributed. The mean (μ or mu) and standard deviation (σ or sigma) are unknown and to be estimated by DE-MC. The following snippet sets up this problem and calls the DE-MC function.

% Generate normally distributed data
mu_real = 4; sig_real = 1;
data = randn(5000,1)*sig_real + mu_real;

% Define the inputs for DE-MC
maxGen = 1000;
Npop = 6;
d = 2;
lb = [-15,0]; ub = [15,15];
genlist = 1:maxGen;
reset = 2;
x2.data = data;

solset = DE_MC(@example_objective,d,lb,ub,maxGen,Npop,x2,reset);

Model Diagnostics

The first step to diagnosing the model output is visualizing the objective function. This is easily done with the following snippet:

plot(solset.Flist, 'linewidth', 2);
xlabel('Generations'); ylabel('Objective value');
set(gca, 'linewidth', 2, 'fontsize', 14);

Plot of the objective function value across generations

The different colors represent the different chains which were run in parallel. Eyeballing the plot suggests convergence, but I sometimes like to look at the log of the objective functions or the post warm-up (or burn-in) period:

set(gcf, 'Position', [100, 100, 900, 400])
subplot(121)
plot(log10(solset.Flist), 'linewidth', 2);
xlabel('Generations'); ylabel('Log_{10}(Objective value)');
set(gca, 'linewidth', 2, 'fontsize', 14);

subplot(122)
plot(maxGen/2:maxGen, solset.Flist(maxGen/2:end,:), 'linewidth', 2);
xlabel('Generations'); ylabel('Objective value');
set(gca, 'linewidth', 2, 'fontsize', 14);

Plot of log objective value and the objective value after warm-up

We do not see any outlier chains in the log objective plot (left). Also there is good mixing in the objective plot after the warm-up period, suggesting convergence to the optimum. One can also look at the mixing of the parameters with:

lw = 1.5;
set(gcf, 'Position', [100, 100, 900, 400])
subplot(121)
plot(maxGen/2:maxGen, solset.Xlist(maxGen/2:end,:,1), 'linewidth', lw);
hold on; yline(mu_real, '--', 'linewidth', lw); hold off
xlabel('Generations'); ylabel('\mu');
set(gca, 'linewidth', 2, 'fontsize', 14);
subplot(122)
plot(maxGen/2:maxGen, solset.Xlist(maxGen/2:end,:,2), 'linewidth', lw);
hold on; yline(sig_real, '--', 'linewidth', lw); hold off
xlabel('Generations'); ylabel('\sigma');
set(gca, 'linewidth', 2, 'fontsize', 14);

Plot of the sampled posteriors over generations

Again, we see good mixing without any outlier chains. The original values (mu_real and sig_real) are given by the dotted lines. Since the chains oscillate around the assumed mu_real and sig_real, convergence is confirmed. (Note that the oscillations are small as indicated by the Y axis units).

Note 1

In this example case, it was easy to verify convergence by comparing with mu_real and sigma_real. Yet, in practice, convergence cannot be confirmed easily, only supported. Some convergence metrics are described here.

Obtaining posteriors

The parameter posteriors can be extracted from the chains with the following snippet, taking care to remove the warm-up period as follows:

X_chains = solset.Xlist(maxGen/2:end,:,:); % Remove warm-up
X_post = nan(size(X_chains,3),size(X_chains,1)*size(X_chains,2));
for ind=1:size(X_chains,3)
    X_post(ind,:) = reshape(X_chains(:,:,ind), 1, size(X_post,2));
end
mu_post = X_post(1,:); sig_post = X_post(2,:);

These samples can then be used to summarize the inferred parameter or can be plotted:

fprintf('Predicted mu = %2.2f, sigma = %2.2f \n', mean(mu_post), ...
mean(sig_post));
figure(); h = gca();
s = scatter(mu_post, sig_post, 'o', 'filled', 'SizeData', 20,...
'markeredgecolor', 'none','markerfacecolor', [0.5, 0.15, 0.15]);
xlabel('\mu'); ylabel('\sigma'); hold on;
alpha(s,.125)
ksdensity(h, X_post', 'PlotFcn', 'contour'); hold off;
h2 = findobj(h, 'type', 'contour');
h2.LineColor = 'k'; h2.LineWidth = 1;
set(gca, 'linewidth', 2, 'fontsize', 14)
legend([s, h2], {'Samples', 'Contour'})
>> Predicted mu = 4.01, sigma = 1.01

Scatter plot of the sampled posteriors

Note 2

If you use or modify the DE_MC code in your research, please cite 3!

References

  1. Ter Braak, C. J. F. (2006). A Markov Chain Monte Carlo version of the genetic algorithm Differential Evolution: Easy Bayesian computing for real parameter spaces. Statistics and Computing, 16(3), 239–249. https://doi.org/10.1007/s11222-006-8769-1 

  2. Storn, R., & Price, K. (1997). Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces. Journal of Global Optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328 

  3. Chandrasekaran, S., & Jiang, S. C. (2018). A dynamic transport model for quantification of norovirus internalization in lettuce from irrigation water and associated health risk. Science of The Total Environment, 643, 751–761. https://doi.org/10.1016/j.scitotenv.2018.06.158  2

Updated:

Comments powered by Talkyard.