**Monte Carlo Simulation** (or Method) is a probabilistic numerical technique used to estimate the outcome of a given uncertain (stochastic) process. In other words, it is a method for simulating processes that cannot be modelled explicitly, which is usually the case when our process involves random variables.

In this article, I give you a brief background of this technique, show what steps you have to follow to implement it and, at the end, present two examples of problems solved using Monte Carlo in the Python programming language.

**1. A bit of history**

Monte Carlo Simulation, like many other numerical methods, was invented before the advent of modern computers — it was developed during World War II by two mathematicians: Stanisław Ulam and John von Neumann. At that time, they were both involved in the Manhattan Project, and they came up with this technique to simulate a chain reaction in highly enriched uranium. Simply put, they were simulating an atomic explosion.

Solving the “neutron diffusion” model was too complex to describe and solve explicitly, especially given that at the time they had only IBM punch-card machines and, later, a computer called ENIAC. During a stay in hospital, Stanisław Ulam was killing boredom by playing cards when a new idea struck him. After returning to work, he shared his novel idea with his colleague from the laboratory, John von Neumann. The development of the new solving method got the codename “Monte Carlo”. The method was based on random sampling and statistics. Thanks to it, both mathematicians were able to speed up the calculations, make remarkably good predictions, and deliver useful and urgently needed results to the project.

While working at Los Alamos National Laboratory, Stanisław Ulam published the first unclassified document describing Monte Carlo Simulation in 1949.

**2. Applications now**

Currently, due to its ease of implementation and the availability of high computing power, this technique is widely used across various industries. Let us look at some documented use cases.

Health:

- “A random walk Monte Carlo simulation study of COVID-19-like infection spread”
- “Human health risk assessment of toxic elements in South Korean cabbage, Kimchi, using Monte Carlo simulation”

Finance:

- “Sensitivity estimation of conditional value at risk using randomized quasi-Monte Carlo”
- “Complex system in finance: Monte Carlo evaluation of first passage time density function”

Production:

- “Robustness evaluation of production plans using Monte Carlo Simulation”
- “Monte Carlo Tree Search for online decision making in smart industrial production”

Transport:

- Modelling collision risk management

Engineering/Science:

- “Design and optimization of graphene quantum dot-based luminescent solar concentrator using Monte-Carlo simulation”
- “Risk analysis of an underground gas storage facility using a physics-based system performance model and Monte Carlo simulation”
- “Nonlinear robust fault diagnosis of power plant gas turbine using Monte Carlo-based adaptive threshold approach”

**3. Backbone of the Monte Carlo Method**

The core concept behind the Monte Carlo Simulation is repeated random sampling from a given set of probability distributions. These can be of any type, e.g.: normal, uniform, triangular, Beta, Gamma, you name it.

To use this technique, you have to follow four main steps:

- Identify all input components of the process and how they interact, e.g., do they sum up or subtract?
- Define parameters of the distributions.
- Sample from each of the distributions and integrate the results based on point 1.
- Repeat the process as many times as you want.
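
The four steps above can be sketched in a few lines of NumPy. The two input distributions and their parameters below are placeholders, not taken from any specific process:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

N = 10_000  # step 4: number of Monte Carlo repetitions

# Steps 1-2: two hypothetical inputs that combine additively; parameters are placeholders.
a = rng.normal(loc=5.0, scale=1.0, size=N)               # a normally distributed quantity
b = rng.triangular(left=2.0, mode=3.0, right=5.0, size=N)  # a triangular one

# Step 3: combine the samples according to the additive relationship.
total = a + b

# Statistics of `total` now estimate the behaviour of the whole process.
print(f"mean = {total.mean():.2f}, std = {total.std():.2f}")
```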

While running this simulation, your resultant parameter (e.g. cost or risk) will converge toward the Normal Distribution, even if the source distributions are different, as long as the components are combined additively. This is an effect of the Central Limit Theorem, and it is one of the reasons why this technique became immensely popular in various industries.

**4. Implementation in Python — basics**

Monte Carlo Simulation can be easily implemented in any programming language; in this case we will use Python. The NumPy library will be very handy here, as it ships with implementations of the most popular probability distributions. For example:

- numpy.random.normal — the Normal Distribution
- numpy.random.triangular — a triangular distribution
- numpy.random.uniform — a uniform distribution
- numpy.random.weibull — a Weibull distribution
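
Note that these `numpy.random.*` functions are the legacy API; current NumPy documentation recommends the `Generator` interface instead, which exposes the same distributions as methods. A quick sketch:

```python
import numpy as np

rng = np.random.default_rng()  # modern replacement for the legacy np.random.* calls

normal_samples = rng.normal(loc=0.0, scale=1.0, size=5)
triangular_samples = rng.triangular(left=0.0, mode=0.5, right=1.0, size=5)
uniform_samples = rng.uniform(low=-0.1, high=0.1, size=5)
weibull_samples = rng.weibull(a=1.5, size=5)  # `a` is the shape parameter

print(normal_samples)
```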

**5. Example 1**

Let’s assume we have a process composed of 3 stages (X1, X2, X3). Each has an average duration (5, 10 and 15 minutes, respectively) that varies following the normal distribution, and we know their standard deviations (all 1 minute).

We want to know: what is the probability that the whole process will take longer than 34 minutes?

This example is trivial enough to be solved by hand, which we will do later to validate the Monte Carlo result.

We know all the individual components, so let’s define the relationship between them (it’s additive): T = X1 + X2 + X3.

Now we can start coding. The single-component can be represented with a short function:
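
The article’s original listing is not reproduced here; a minimal version consistent with the description (a normally distributed stage duration) might look like this:

```python
import numpy as np

rng = np.random.default_rng()

def stage_duration(mean: float, std: float, size: int = 1) -> np.ndarray:
    """Sample the duration of a single process stage from a normal distribution."""
    return rng.normal(loc=mean, scale=std, size=size)

# Example: one sample of the first stage (mean 5 min, std 1 min)
print(stage_duration(5.0, 1.0))
```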

The Monte Carlo simulation uses this function as a basic building block. The number of iterations for this use case is set to 10 000, but you can change it. The last section of the code estimates the probability of exceeding the limit of 34 minutes (as the fraction of samples above it).
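
A self-contained sketch of such a simulation (not the article’s original listing) could be:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

N = 10_000                      # number of Monte Carlo iterations
means = [5.0, 10.0, 15.0]       # average stage durations in minutes
stds = [1.0, 1.0, 1.0]          # standard deviations in minutes
LIMIT = 34.0                    # time limit in minutes

# Sample each stage N times and sum them (the relationship is additive).
total_time = sum(rng.normal(m, s, size=N) for m, s in zip(means, stds))

# Estimate the probability of exceeding the limit as the fraction of samples above it.
p_exceed = np.mean(total_time > LIMIT)
print(f"Probability of exceeding the time limit: {100 * p_exceed:.3f} %")
```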

After running the code we get an answer similar to the following (it will vary slightly every time you run it):

`Probability of exceeding the time limit: 1.035 %`

Now we can plot a histogram of the estimated parameter (total time). We can clearly see it follows the normal distribution.

Let’s verify our results with hand calculations.

Since a sum of independent normal variables is itself normal, the total time follows a normal distribution with parameters: μ = 5 + 10 + 15 = 30 minutes and σ = √(1² + 1² + 1²) = √3 ≈ 1.73 minutes.

In order to calculate the probability, we first find the z-score: z = (34 − 30) / 1.73 ≈ 2.31.

We then read the p-value from the z-score table. The right-tailed probability is P(T > 34) = 1 − Φ(2.31) ≈ 0.0104, i.e. about 1.04%.
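
The hand calculation can also be reproduced with the standard normal CDF, using only the Python standard library:

```python
import math

mu = 5 + 10 + 15                 # mean of the total time: 30 minutes
sigma = math.sqrt(1 + 1 + 1)     # std of a sum of independent normals: sqrt(3)

z = (34 - mu) / sigma            # z-score of the 34-minute limit
p_right_tail = 0.5 * math.erfc(z / math.sqrt(2))  # right tail: 1 - Phi(z)

print(f"z = {z:.3f}, P(T > 34) = {100 * p_right_tail:.3f} %")
```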

As you remember, our Monte Carlo simulation gave a result of 1.035%, which is pretty close.

**6. Example 2**

In this example, let’s assume we want to assemble three blocks inside a container of a given width. Nominal dimensions are shown in the picture below. We see that, by design, there is a nominal gap of 0.5 mm.

However, the real dimensions of the three blocks and the container can vary for technological reasons. For the sake of demonstration, let’s assume that none of these variations follows the normal distribution. The three blocks will follow the triangular distributions shown below, and the container’s width will follow a uniform distribution in a range of ±0.1 mm.

Now, by simply adding up the extreme values, we can see that in the worst-case scenario the blocks total 17 mm while the container is only 16.4 mm wide, meaning that in this case we cannot fit them all in.

The question is: what is the probability that we won’t be able to fit all the blocks into the container?

In this case, the relationship between the components looks like this: gap = container width − (X1 + X2 + X3).

By modifying the previous code, we obtain a function to sample from the triangular distribution. The same can be done to obtain a uniform sampling function.
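
A sketch of the two sampling helpers (the article’s listings are not reproduced; the container’s 16.5 mm nominal width is inferred from the worst-case value of 16.4 mm and the ±0.1 mm spread):

```python
import numpy as np

rng = np.random.default_rng()

def sample_triangular(left: float, mode: float, right: float, size: int = 1) -> np.ndarray:
    """Sample a block dimension from a triangular distribution."""
    return rng.triangular(left, mode, right, size)

def sample_uniform(low: float, high: float, size: int = 1) -> np.ndarray:
    """Sample the container width from a uniform distribution."""
    return rng.uniform(low, high, size)

# Example: container nominal width 16.5 mm with a +/-0.1 mm uniform spread
print(sample_uniform(16.4, 16.6))
```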

A modified core code for the MC simulation:
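
Since the article’s figures are not reproduced here, the triangular parameters below are placeholders, chosen only so that the blocks’ maxima sum to 17 mm as in the worst case; the printed numbers will therefore differ somewhat from the article’s results:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

N = 10_000  # number of Monte Carlo iterations

# Placeholder triangular parameters (left, mode, right) for the three blocks.
blocks = [(1.9, 2.0, 2.3), (3.9, 4.0, 4.3), (9.9, 10.0, 10.4)]

# Container width: uniform spread of +/-0.1 mm around the 16.5 mm nominal.
container = rng.uniform(16.4, 16.6, size=N)

# Gap left after assembling the three blocks inside the container.
gap = container - sum(rng.triangular(l, m, r, size=N) for l, m, r in blocks)

p_no_fit = np.mean(gap < 0)  # the blocks do not fit when the gap is negative
print(f"Probability of not fitting the blocks: {100 * p_no_fit:.3f} %")
print(f"The mean gap width will be {gap.mean():.2f}mm "
      f"with a standard deviation of {gap.std():.2f}mm.")
```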

After running the above code we get an answer in the order of:

`Probability of not fitting the blocks: 5.601 %`

After checking the mean and standard deviation we can say even more about the expected gap dimension:

`The mean gap width will be 0.33mm with a standard deviation of 0.2mm.`

Now let’s plot a histogram of the estimated parameter (gap width) to show that it follows the normal distribution even though none of the input distributions is of this type.

You may now wonder how the result varies with an increasing number of samples. To check that, see the graph below, which shows the gap-width estimate with a 95% confidence interval as a function of sample size (from 100 to 7000 samples):
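
The data behind such a convergence study can be generated as follows (again with placeholder block and container distributions, since the article’s exact figures are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

def estimate_gap(n):
    """Return the mean gap and a 95% confidence half-width for n samples."""
    container = rng.uniform(16.4, 16.6, size=n)
    blocks = sum(rng.triangular(l, m, r, size=n)
                 for l, m, r in [(1.9, 2.0, 2.3), (3.9, 4.0, 4.3), (9.9, 10.0, 10.4)])
    gap = container - blocks
    half_width = 1.96 * gap.std(ddof=1) / np.sqrt(n)  # CI shrinks as 1/sqrt(n)
    return gap.mean(), half_width

for n in [100, 1000, 7000]:
    mean, hw = estimate_gap(n)
    print(f"n={n:5d}: gap = {mean:.3f} +/- {hw:.3f} mm")
```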

From this graph it’s evident that the mean of the estimated value doesn’t change significantly, but its spread decreases with the number of samples: larger samples give a smaller spread of results between runs. However, since the confidence interval shrinks in proportion to 1/√n, at some point adding more samples brings diminishing returns.

**7. Summary**

As you’ve seen, Monte Carlo is a very simple idea, yet a very powerful one. After reading this article, I hope you understand the core concept of the method, when to use it, and how to implement it in Python.
