Graphics Processing Units

This page reports some of the simulations that I ran on Graphics Processing Units (GPUs). A GPU is a specialized electronic circuit with a highly parallel architecture, which makes it more efficient than a CPU at running algorithms that process large blocks of data in parallel (see this nice video by Nvidia). GPUs were originally designed to accelerate the rendering of 3D computer graphics, but it didn’t take long for scientists to realize that their massively parallel capabilities were well suited for efficiently analyzing large volumes of data in many fields of scientific research, including artificial intelligence and computational neuroscience. In particular, it has been shown that, when simulating biologically realistic neural circuits, GPUs can outperform high-performance computing clusters and state-of-the-art neuromorphic hardware in terms of speed and energy consumption.

The numerical calculations reported in this page were parallelized on a GeForce® GTX 1660 Ti using the open source compiler Numba v0.54.1 with CUDA Toolkit 11.3.1.

An EVGA GeForce® GTX 1660 Ti graphics card

This GPU is based on the Turing architecture (without ray-tracing and tensor cores), and it features 24 streaming multiprocessors and 1,536 CUDA cores.

METHODS

The partial differential equations (PDEs) and partial integro-differential equations (PIDEs) considered in this page were parallelized on the GPU and solved with the method of lines, given homogeneous boundary conditions (i.e. the solution was set to zero at the boundaries of the domain) and normalized initial conditions. The partial derivatives on the right-hand side of the equations were approximated by the 4th-order central difference formula, and the resulting system of ODEs was solved by implementing the explicit Runge-Kutta 4 method with Numba, while the integrals of the solutions over their multi-dimensional domains were calculated through the Newton-Cotes rule of degree 6. The stochastic differential equations (SDEs) were parallelized and solved with the Euler-Maruyama scheme, with the sources of random noise generated by Numba’s xoroshiro128+ algorithm. Because of the stochastic nature of the model, its probability density function was estimated by solving the SDE system 10,000 times through a Monte Carlo method.
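
As a concrete illustration of the quadrature step, the composite Newton-Cotes rule of degree 6 can be implemented with a fixed seven-point weight stencil. The following is a minimal one-dimensional sketch (the actual integrals were computed over multi-dimensional domains), and it assumes that the number of grid intervals is a multiple of 6; all names and values are illustrative.

import numpy as np

def newton_cotes_6(f, h):
    # Composite closed Newton-Cotes rule of degree 6:
    # each panel of 6 intervals is weighted by h/140 * (41, 216, 27, 272, 27, 216, 41)
    w = np.array([41.0, 216.0, 27.0, 272.0, 27.0, 216.0, 41.0]) / 140.0
    total = 0.0
    for start in range(0, f.size - 1, 6):
        total += h * np.dot(w, f[start:start + 7])
    return total

# hypothetical usage: integrate sin(x) on [0, pi] with 6*50 intervals
n = 6 * 50 + 1
x = np.linspace(0.0, np.pi, n)
h = x[1] - x[0]
print(newton_cotes_6(np.sin(x), h))  # should be close to 2.0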

2D Heat Equation

The heat equation is a linear PDE which models how heat diffuses over time through a given region. When the region is two-dimensional, the heat equation reads as follows:

$$\frac{\partial u}{\partial t} = D\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$$

where u is the temperature function, t denotes time, (x,y) are the space coordinates, and D is the diffusion coefficient, which describes how fast heat diffuses through space.

I solved the heat equation on a 334×334 grid, meaning that I approximated the PDE by a system of 111,556 ODEs. The computation time was 3.5 seconds for 6,000 integration time steps.

Numerical solution of the 2D heat equation on GPU
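
To give an idea of how such a solver can be organized, here is a minimal Numba sketch that combines the ingredients described in the Methods section (4th-order central differences in space, explicit RK4 in time, homogeneous boundary conditions). It is not the exact code behind the figures: the diffusion coefficient, time step, and initial condition are illustrative.

import numpy as np
from numba import cuda

@cuda.jit
def heat_rhs(u, du, D, h):
    # Right-hand side of the 2D heat equation, 4th-order central differences.
    # Only interior points are updated; the two outermost layers stay fixed,
    # which enforces the homogeneous boundary conditions.
    i, j = cuda.grid(2)
    n, m = u.shape
    if 2 <= i < n - 2 and 2 <= j < m - 2:
        c = 1.0 / (12.0 * h * h)
        u_xx = (-u[i - 2, j] + 16.0 * u[i - 1, j] - 30.0 * u[i, j]
                + 16.0 * u[i + 1, j] - u[i + 2, j]) * c
        u_yy = (-u[i, j - 2] + 16.0 * u[i, j - 1] - 30.0 * u[i, j]
                + 16.0 * u[i, j + 1] - u[i, j + 2]) * c
        du[i, j] = D * (u_xx + u_yy)

@cuda.jit
def add_scaled(out, u, k, c):
    # out = u + c * k (elementwise), builds the intermediate RK4 stages.
    i, j = cuda.grid(2)
    if i < u.shape[0] and j < u.shape[1]:
        out[i, j] = u[i, j] + c * k[i, j]

@cuda.jit
def rk4_update(u, k1, k2, k3, k4, dt):
    # u <- u + dt/6 * (k1 + 2*k2 + 2*k3 + k4)
    i, j = cuda.grid(2)
    if i < u.shape[0] and j < u.shape[1]:
        u[i, j] += dt / 6.0 * (k1[i, j] + 2.0 * k2[i, j] + 2.0 * k3[i, j] + k4[i, j])

n = 334
D, h, dt = 1.0, 1.0 / (n - 1), 1e-7    # illustrative values; dt chosen well below the stability limit
u0 = np.zeros((n, n))
u0[n // 2, n // 2] = 1.0 / h**2        # illustrative normalized initial condition (point source)
d_u = cuda.to_device(u0)
# stage buffers start at zero, so their boundary layers stay at zero throughout
tmp, k1, k2, k3, k4 = (cuda.to_device(np.zeros((n, n))) for _ in range(5))
threads = (16, 16)
blocks = ((n + 15) // 16, (n + 15) // 16)
for _ in range(6000):
    heat_rhs[blocks, threads](d_u, k1, D, h)
    add_scaled[blocks, threads](tmp, d_u, k1, 0.5 * dt)
    heat_rhs[blocks, threads](tmp, k2, D, h)
    add_scaled[blocks, threads](tmp, d_u, k2, 0.5 * dt)
    heat_rhs[blocks, threads](tmp, k3, D, h)
    add_scaled[blocks, threads](tmp, d_u, k3, dt)
    heat_rhs[blocks, threads](tmp, k4, D, h)
    rk4_update[blocks, threads](d_u, k1, k2, k3, k4, dt)
u = d_u.copy_to_host()

Each RK4 stage is computed by launching the right-hand-side kernel on the GPU, so the solution never leaves device memory during the time loop.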

3D Heat Equation

When heat diffuses through a three-dimensional region, the heat equation reads as follows:

$$\frac{\partial u}{\partial t} = D\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right)$$

I solved this equation on a 200×200×200 grid (8,000,000 ODEs), resulting in a computation time of 6.2 minutes for 6,000 integration time steps.

Numerical solution of the 3D heat equation on GPU
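
Moving from two to three dimensions mainly changes the kernel indexing and the launch configuration. Below is a minimal sketch of the 3D right-hand-side kernel, with illustrative names and parameter values; the time stepping is the same as in the 2D case.

import numpy as np
from numba import cuda

@cuda.jit
def heat_rhs_3d(u, du, D, h):
    # Same 4th-order central difference stencil as in 2D, applied along x, y and z.
    i, j, k = cuda.grid(3)
    nx, ny, nz = u.shape
    if 2 <= i < nx - 2 and 2 <= j < ny - 2 and 2 <= k < nz - 2:
        c = 1.0 / (12.0 * h * h)
        u_xx = (-u[i - 2, j, k] + 16.0 * u[i - 1, j, k] - 30.0 * u[i, j, k]
                + 16.0 * u[i + 1, j, k] - u[i + 2, j, k]) * c
        u_yy = (-u[i, j - 2, k] + 16.0 * u[i, j - 1, k] - 30.0 * u[i, j, k]
                + 16.0 * u[i, j + 1, k] - u[i, j + 2, k]) * c
        u_zz = (-u[i, j, k - 2] + 16.0 * u[i, j, k - 1] - 30.0 * u[i, j, k]
                + 16.0 * u[i, j, k + 1] - u[i, j, k + 2]) * c
        du[i, j, k] = D * (u_xx + u_yy + u_zz)

# hypothetical launch configuration for a 200x200x200 grid
n = 200
threads = (8, 8, 8)
blocks = tuple((n + t - 1) // t for t in threads)
d_u = cuda.to_device(np.zeros((n, n, n)))
d_du = cuda.to_device(np.zeros((n, n, n)))
heat_rhs_3d[blocks, threads](d_u, d_du, 1.0, 1.0 / (n - 1))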

McKean-Vlasov-Fokker-Planck Equation

The McKean-Vlasov-Fokker-Planck equation is a nonlinear PIDE that describes the probability density function of stochastic systems of interacting units in the thermodynamic (or mean-field) limit. In particular, in [Baladron et al 2012] this equation was used to study the temporal evolution of stochastic, densely-connected neural networks. When the neurons are interconnected by chemical synapses and the network dynamics follows the FitzHugh-Nagumo model, the McKean-Vlasov-Fokker-Planck equation reads as follows:

where p is the probability density function of the network, t denotes time, V and w represent, respectively, the membrane potential and the adaptation variable of the neurons, and y is a function denoting the fraction of open ion channels in the chemical synapses. Moreover, the drift (dri) and diffusion (dif) terms in the equation have the following mathematical expressions:

I refer to [Baladron et al 2012] for a description of the network parameters and functions reported above.

I solved the McKean-Vlasov-Fokker-Planck equation on a 297×300×334 grid (29,759,400 ODEs), resulting in a computation time of 1.6 hours for 6,000 integration time steps. Note that in [Baladron et al 2012] the PIDE was solved on expensive Tesla C2050 cards (launch price: $2,499), while nowadays it is possible to efficiently solve the equation on laptops with much cheaper GPUs (launch price of the GTX 1660 Ti: $279).

The following video shows the temporal evolution of the marginal probability distributions P(V,w,t), P(V,y,t) and P(w,y,t), which I obtained by integrating out one of the three network variables from the numerical solution p(V,w,y,t). Note the formation of a limit cycle representing the emergence of spiking activity in the network. The cycle is particularly visible during the temporal evolution of the probability density P(V,w,t) (red function in the video).
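
As an illustration, each marginal can be obtained by integrating the numerical solution over one of its variables. The sketch below uses NumPy's trapezoidal rule for brevity (the simulations used the degree-6 Newton-Cotes rule described in the Methods section), and it assumes the solution array is ordered as (V, w, y); the grid limits and array names are illustrative.

import numpy as np

# p holds the density p(V, w, y) at one time step, on a (nV, nw, ny) grid (illustrative)
nV, nw, ny = 297, 300, 334
V = np.linspace(-3.0, 3.0, nV)   # hypothetical grid limits
w = np.linspace(-2.0, 2.0, nw)
y = np.linspace(0.0, 1.0, ny)
p = np.random.rand(nV, nw, ny)   # placeholder for the numerical solution
p /= np.trapz(np.trapz(np.trapz(p, y, axis=2), w, axis=1), V)  # normalize to unit mass

P_Vw = np.trapz(p, y, axis=2)    # P(V, w, t): integrate out y
P_Vy = np.trapz(p, w, axis=1)    # P(V, y, t): integrate out w
P_wy = np.trapz(p, V, axis=0)    # P(w, y, t): integrate out V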

Stochastic Differential Equations and Monte Carlo Simulations

Here I consider a system of 5N SDEs describing the temporal evolution of a network of N recurrently connected Hodgkin-Huxley neurons with chemical synapses:

In this model, V represents the membrane potential of the neurons, n, m, h are the gating variables (associated with potassium channel activation, sodium channel activation, and sodium channel inactivation, respectively), and y is a function denoting the fraction of open ion channels in the chemical synapses. The variables B and W denote 6N independent sources of noise; see [Baladron et al 2012] for more details and for a description of the remaining network parameters and functions reported above.

For N = 10, 100, 1000 (meaning that the system is composed of 50, 500, 5000 SDEs with 60, 600, 6000 sources of noise, respectively) and ten thousand Monte Carlo simulations, the computation time was, respectively, 2.1 minutes, 21.6 minutes, and 198 minutes (i.e. 3.3 hours) for 1,000 integration time steps. The following video shows the formation of action potentials in a network of N = 10 neurons (for simplicity, only 50 of the simulated membrane potentials are shown), together with the marginal probability density function of two randomly chosen neurons.
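
Here is a minimal sketch of how such Monte Carlo runs can be parallelized with Numba, with one CUDA thread per realization and the built-in xoroshiro128+ generator. For brevity the drift and diffusion are those of a scalar Ornstein-Uhlenbeck process rather than the full Hodgkin-Huxley system, and all names and parameter values are illustrative.

import math
import numpy as np
from numba import cuda
from numba.cuda.random import create_xoroshiro128p_states, xoroshiro128p_normal_float32

@cuda.jit
def euler_maruyama(states, out, x0, theta, sigma, dt, n_steps):
    # One thread integrates one realization of dx = -theta*x dt + sigma dW.
    tid = cuda.grid(1)
    if tid < out.shape[0]:
        x = x0
        for _ in range(n_steps):
            dW = math.sqrt(dt) * xoroshiro128p_normal_float32(states, tid)
            x += -theta * x * dt + sigma * dW
        out[tid] = x

n_sim = 10_000
threads = 128
blocks = (n_sim + threads - 1) // threads
states = create_xoroshiro128p_states(blocks * threads, seed=1)
out = cuda.device_array(n_sim, dtype=np.float32)
euler_maruyama[blocks, threads](states, out, 1.0, 1.0, 0.5, 1e-3, 1000)
samples = out.copy_to_host()  # e.g. histogram these to estimate the density

Since every realization is independent, the Monte Carlo dimension maps naturally onto CUDA threads, which is what makes the 10,000 repetitions tractable on a single consumer GPU.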
