  ©2000-2007 Nabla Ltd.

3.4 Running applications in parallel

This section describes how to run OpenFOAM in parallel on distributed processors. The method of parallel computing used by OpenFOAM is known as domain decomposition, in which the geometry and associated fields are broken into pieces and allocated to separate processors for solution. Parallel computation involves three steps: decomposition of the mesh and fields; running the application in parallel; and post-processing the decomposed case. Each step is described in the following sections. Parallel running uses the public-domain Local Area Multicomputer (LAM) implementation of the standard Message Passing Interface (MPI). OpenFOAM can also be run using the MPICH implementation of MPI, which is described in section A.1.

3.4.1 Decomposition of mesh and initial field data

The mesh and fields are decomposed using the decomposePar utility. The underlying aim is to break up the domain with minimal effort but in such a way as to guarantee a fairly economic solution. The geometry and fields are broken up according to a set of parameters specified in a dictionary named decomposeParDict, which must be located in the system directory of the case of interest. An example decomposeParDict dictionary can be copied from the interFoam/damBreak tutorial if the user requires one; the dictionary entries within it are reproduced below:


    // * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //


    numberOfSubdomains 4;

    method          simple;

    simpleCoeffs
    {
        n               (2 2 1);
        delta           0.001;
    }

    hierarchicalCoeffs
    {
        n               (1 1 1);
        delta           0.001;
        order           xyz;
    }

    metisCoeffs
    {
        processorWeights
        (
            1
            1
            1
            1
        );
    }

    manualCoeffs
    {
        dataFile        "";
    }

    distributed     no;

    roots
    (
    );


    // ************************************************************************* //

The user has a choice of four methods of decomposition, specified by the method keyword as described below.

simple
Simple geometric decomposition in which the domain is split into pieces by direction, e.g. 2 pieces in the x-direction, 1 in the y-direction, etc.
hierarchical
Hierarchical geometric decomposition, which is the same as simple except that the user specifies the order in which the directional split is done, e.g. first in the y-direction, then the x-direction, etc.
metis
METIS decomposition which requires no geometric input from the user and attempts to minimise the number of processor boundaries. The user can specify a weighting for the decomposition between processors which can be useful on machines with differing performance between processors.
manual
Manual decomposition, where the user directly specifies the allocation of each cell to a particular processor.
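
To make the simple method concrete, the following Python sketch assigns cells to subdomains by cutting the domain into equal-count slabs along each direction. It is an illustration only, not the decomposePar implementation: the function name simple_decompose, the representation of the mesh as a list of cell-centre tuples, and the numbering of subdomains (x varying fastest) are all assumptions, and the delta coefficient is ignored.

```python
def simple_decompose(centres, n):
    """Assign each cell to a subdomain index for n = (nx, ny, nz).

    centres: list of (x, y, z) cell-centre tuples (hypothetical mesh input).
    Returns one subdomain index per cell.
    """
    nx, ny, nz = n
    n_cells = len(centres)
    slab = []  # slab[d][i] is the slab index of cell i along direction d
    for d, pieces in enumerate((nx, ny, nz)):
        # Rank the cells along direction d, then cut into equal-count slabs.
        order = sorted(range(n_cells), key=lambda i: centres[i][d])
        index = [0] * n_cells
        for rank, cell in enumerate(order):
            index[cell] = rank * pieces // n_cells
        slab.append(index)
    # Combine the three slab indices into one subdomain number (assumed order).
    return [slab[0][i] + nx * (slab[1][i] + ny * slab[2][i])
            for i in range(n_cells)]
```

For a 4 x 4 layer of cells and n = (2 2 1), as in the listing above, each of the four subdomains receives one 2 x 2 quadrant of the layer.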

For each method there is a set of coefficients specified in a subdictionary of decomposeParDict, named <method>Coeffs as shown in the dictionary listing. The full set of keyword entries in the decomposeParDict dictionary is explained in Table 3.4.


Compulsory entries
numberOfSubdomains   Total number of subdomains                      N
method               Method of decomposition                         simple/hierarchical/metis/manual

simpleCoeffs entries
n                    Number of subdomains in x, y, z                 (nx ny nz)
delta                Cell skew factor                                Typically, 10^-3

hierarchicalCoeffs entries
n                    Number of subdomains in x, y, z                 (nx ny nz)
delta                Cell skew factor                                Typically, 10^-3
order                Order of decomposition                          xyz/xzy/yxz...

metisCoeffs entries
processorWeights     List of weighting factors for allocation of     (<wt1>...<wtN>)
                     cells to processors; <wt1> is the weighting
                     factor for processor 1, etc.; weights are
                     normalised so can take any range of values.

manualCoeffs entries
dataFile             Name of file containing data of allocation      "<fileName>"
                     of cells to processors

Distributed data entries (optional) -- see subsection 3.4.3
distributed          Is the data distributed across several disks?   yes/no
roots                Root paths to case directories; <rt1> is the    (<rt1>...<rtN>)
                     root path for node 1, etc.


Table 3.4: Keywords in decomposeParDict dictionary.
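
Two of the entries in the table lend themselves to a quick sanity check before decomposePar is run. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that for the geometric methods the product nx*ny*nz should equal numberOfSubdomains (as it does in the sample dictionary, where 2 x 2 x 1 = 4) and that "normalised" means the weights are scaled to sum to 1.

```python
from math import prod


def check_geometric_coeffs(number_of_subdomains, n):
    """Return True if nx*ny*nz matches numberOfSubdomains (assumed rule)."""
    return prod(n) == number_of_subdomains


def normalise_weights(weights):
    """Scale processorWeights so that they sum to 1 (assumed normalisation)."""
    total = sum(weights)
    return [w / total for w in weights]
```

With the sample dictionary, check_geometric_coeffs(4, (2, 2, 1)) holds, and the four equal processorWeights each normalise to 0.25, so any common value would give the same split.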

The decomposePar utility is executed in the normal manner by typing


    decomposePar <root> <case>
On completion, a set of subdirectories will have been created in the case directory, one for each processor. The directories are named processorN, where N = 0, 1, ... is the processor number. Each contains a time directory, holding the decomposed field descriptions, and a constant/polyMesh directory, holding the decomposed mesh description.
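
The expected directory layout can be checked with a few lines of Python. This is a minimal sketch, not part of OpenFOAM: the function name missing_processor_paths is hypothetical, and it assumes the decomposed time directory is named 0 unless told otherwise.

```python
from pathlib import Path


def missing_processor_paths(case_dir, n_procs, time_name="0"):
    """List the expected processorN subdirectories that do not exist.

    time_name is an assumption: the name of the decomposed time directory.
    """
    case = Path(case_dir)
    missing = []
    for i in range(n_procs):
        for sub in (Path(time_name), Path("constant") / "polyMesh"):
            relative = Path(f"processor{i}") / sub
            if not (case / relative).is_dir():
                missing.append(relative.as_posix())
    return missing
```

An empty list returned for the case directory and the value of numberOfSubdomains suggests the decomposition produced every directory described above.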

3.4.2 Running a decomposed case

A decomposed OpenFOAM case is run in parallel using the LAM implementation of MPI (LAM/MPI). The user must first start a LAM multicomputer using the lamboot executable as described in subsubsection 3.4.2.1. The OpenFOAM application can then be executed on the allocated LAM nodes, using mpirun as described in subsubsection 3.4.2.2.

3.4.2.1 Starting a LAM multicomputer

A file must be created that contains the host names of the machines that will make up the LAM multicomputer. The file can be given any name and located at any path. In the following description we shall refer to such a file by the generic name, including full path, <machines>.

The <machines> file contains the names of the machines listed one machine per line. The names must correspond to a fully resolved hostname in the /etc/hosts file of the machine on which the LAM is started. The list must contain the name of the machine running the LAM. Where a machine node contains more than one processor, the node name may be followed by the entry cpu=n, where n is the number of processors LAM should run on that node.

For example, let us imagine a user wishes to run LAM from machine aaa on the following machines: aaa; bbb, which has 2 processors; and ccc. The <machines> file would contain:


    aaa
    bbb cpu=2
    ccc
LAM is then started by executing:


    lamboot -v <machines>
A message similar to the following is printed to the screen for our current example (for a process ID of 1000):


    LAM 7.0.6 - Indiana University

    n0<1000> ssi:boot:base:linear: booting n0 (aaa)
    n0<1000> ssi:boot:base:linear: booting n1 (bbb)
    n0<1000> ssi:boot:base:linear: booting n2 (ccc)
    n0<4332> ssi:boot:base:linear: finished
A process called lamd will then be running on the system. To stop the process at any time, for example before restarting LAM with lamboot, simply type:


    lamhalt -d

3.4.2.2 Running applications in parallel

An application is run in parallel using mpirun.


    mpirun -np <nProcs> <foamExec> <root> <case> <otherArgs>
       -parallel < /dev/null >& log &
where: <nProcs> is the number of processors; <foamExec> is the executable, e.g. icoFoam; and, the output is redirected to a file named log. For example, if icoFoam is run on 4 nodes on the cavity tutorial in the $OpenFOAM_RUN/tutorials/icoFoam directory, then the following command should be executed:


    mpirun -np 4 icoFoam $OpenFOAM_RUN/tutorials/icoFoam cavity
        -parallel < /dev/null >& log &

The user may also choose to run on a selection of the CPUs running under LAM by listing them in the command line of mpirun. In our example, if we wished to run the same job using 2 CPUs on nodes aaa and ccc, nodes n0 and n2 respectively, we would type:


    mpirun n0,2 -np 2 icoFoam $OpenFOAM_RUN/tutorials/icoFoam cavity
        -parallel < /dev/null >& log &
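
When many jobs are launched from a script, the argument order shown in the two commands above can be assembled programmatically. The following Python sketch only builds the argument list; it does not run mpirun, and the shell redirection (< /dev/null >& log &) is left to the caller. The function name mpirun_args and its parameters are hypothetical.

```python
def mpirun_args(n_procs, foam_exec, root, case, nodes=None, other_args=()):
    """Build the mpirun argument list in the order used in this section.

    nodes is an optional LAM node selection string such as "n0,2".
    """
    args = ["mpirun"]
    if nodes is not None:
        args.append(nodes)
    args += ["-np", str(n_procs), foam_exec, root, case]
    args += list(other_args)
    args.append("-parallel")
    return args
```

Calling mpirun_args(2, "icoFoam", "$OpenFOAM_RUN/tutorials/icoFoam", "cavity", nodes="n0,2") reproduces the words of the second command above, with the environment variable left unexpanded for the shell.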

3.4.3 Distributing data across several disks

Data files may need to be distributed if, for example, only local disks are used in order to improve performance. In this case, the user may find that the root path to the case directory differs between machines. The paths must then be specified in the decomposeParDict dictionary using the distributed and roots keywords. The distributed entry should read


    distributed  yes;
and the roots entry is a list of root paths, <root0>, <root1>, . . . , for each node


    roots
    <nRoots>
    (
       "<root0>"
       "<root1>"
       ...
    );
where <nRoots> is the number of roots.

Each of the processorN directories should be placed in the case directory at each of the root paths specified in the decomposeParDict dictionary. The system directory and files within the constant directory must also be present in each case directory. Note: the files in the constant directory are needed, but the polyMesh directory is not.
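
The distributed and roots entries follow a fixed pattern, so they can be generated from a list of root paths. The Python sketch below produces text in the form shown earlier in this subsection, with <nRoots> taken from the length of the list; the function name and the example paths are hypothetical.

```python
def distributed_entries(roots):
    """Return decomposeParDict text for the distributed and roots entries."""
    lines = ["distributed  yes;", "", "roots", str(len(roots)), "("]
    # Each root path is written as a quoted string, one per line.
    lines += [f'    "{root}"' for root in roots]
    lines.append(");")
    return "\n".join(lines) + "\n"
```

For instance, distributed_entries(["/disk1/run", "/disk2/run"]) yields a roots list of size 2, with one quoted path per node.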

3.4.4 Post-processing parallel processed cases

When post-processing cases that have been run in parallel the user has two options:

  • reconstruction of the mesh and field data to recreate the complete domain and fields, which can be post-processed as normal;
  • post-processing each segment of decomposed domain individually.

3.4.4.1 Reconstructing mesh and data

After a case has been run in parallel, it can be reconstructed for post-processing. The case is reconstructed by merging the sets of time directories from each processorN directory into a single set of time directories. The reconstructPar utility performs such a reconstruction by executing the command:


    reconstructPar <root> <case>
When the data is distributed across several disks, it must first be copied to the local case directory for reconstruction.
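
Before reconstructing, it can be useful to confirm that every processorN directory holds the same set of time directories, particularly after copying distributed data back to one disk. The Python sketch below lists the time directories common to all processors. It is not part of reconstructPar; the function names are hypothetical, and it assumes a time directory is any subdirectory whose name parses as a number.

```python
from pathlib import Path


def _is_time_name(name):
    """True if a directory name parses as a number (assumed time format)."""
    try:
        float(name)
    except ValueError:
        return False
    return True


def common_time_directories(case_dir, n_procs):
    """Return time directory names present in every processorN directory."""
    common = None
    for i in range(n_procs):
        proc = Path(case_dir) / f"processor{i}"
        times = {p.name for p in proc.iterdir()
                 if p.is_dir() and _is_time_name(p.name)}
        common = times if common is None else common & times
    return sorted(common or (), key=float)
```

A time that appears in some processor directories but not in this list points to an incomplete copy or an interrupted run.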

3.4.4.2 Post-processing decomposed cases

The user may post-process decomposed cases using the dxFoam post-processor, described in section 7.2. To prepare for post-processing, the user must add the full path of the case directory to the caseRoots entry in the user's $HOME/.foam2.3/controlDict file (not the case controlDict file). For example, viewing the parallel tutorial case damBreakFine requires:


    "$OpenFOAM_RUN/tutorials/interFoam/damBreakFine"
The user will be able to select the individual processor directories, i.e. processor0, processor1, . . . , from the Case menu in the Read Data window of dxFoam as described in subsection 7.2.1.