FWIW: Open MPI 4.1.2 has been released -- you can probably stop using an RC 
release.

I think you're probably running into something that is just a fact of life
with parallel jobs.  Especially when multiple MPI processes (potentially on
different nodes) produce a lot of output simultaneously, their stdout/stderr
lines can get interleaved with each other.

Can you check for convergence a different way?
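
For example, each process could append its convergence records to its own
file instead of stdout, so that no two processes ever share a stream.  Here is
a minimal sketch of that idea in Python.  Your application is presumably not
written in Python, so treat this purely as an illustration.  The helper names
(convergence_log_path, append_record), the file naming scheme, and the record
layout are all assumptions on my part; OMPI_COMM_WORLD_RANK is the environment
variable that Open MPI sets for each process it launches.

```python
import os
from pathlib import Path


def convergence_log_path(directory, rank=None):
    """Return a per-rank log path (the naming scheme is an assumption)."""
    if rank is None:
        # Open MPI exports OMPI_COMM_WORLD_RANK to each launched process;
        # fall back to rank 0 when running outside mpirun.
        rank = int(os.environ.get("OMPI_COMM_WORLD_RANK", "0"))
    return Path(directory) / f"convergence.rank{rank:05d}.log"


def append_record(path, tag, iteration, values, label):
    """Append one complete record as a single line and return that line."""
    fields = " ".join(f"{v:.3E}" for v in values)
    line = f"*{tag}* {iteration} {fields}  {label}\n"
    # Only this process writes to `path`, so lines cannot interleave.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line)
    return line
```

The convergence checker would then read the per-rank files rather than the
combined stdout/stderr file; warnings and other diagnostics can keep going to
stdout as they do today.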

--
Jeff Squyres
jsquy...@cisco.com

________________________________________
From: users <users-boun...@lists.open-mpi.org> on behalf of Fisher (US), Mark S 
via users <users@lists.open-mpi.org>
Sent: Thursday, December 2, 2021 10:48 AM
To: users@lists.open-mpi.org
Cc: Fisher (US), Mark S
Subject: [OMPI users] stdout scrambled in file

We are using Mellanox HPC-X MPI, based on Open MPI 4.1.1RC1, and are having
issues with output lines occasionally being scrambled together. This causes
problems for our convergence-checking code, since we put convergence data
there. We are not using any mpirun options for stdout; we just redirect
stdout/stderr to a file before we run the mpirun command, so all output goes
there. We had a similar issue with Intel MPI in the past and used the
-ordered-output option to fix it, but I do not see a similar option for Open
MPI. See the example below. Is there any way to ensure that a line from a
process gets its own line in the output file?


The data in red below is scrambled and should look like the cleaned-up
version. You can see that a line from a different process was inserted inside
a line from another process, and the rest of that line ended up a couple of
lines down.

ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10 1.004E-01 sec

*IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06  aerosurfs
*IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05  aerosurfs
*IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06  Aircraft-Total
*IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05  Aircr Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699  1625     12
Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
aft-Total
*IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04 -4.961E-06  Aircraft-OML
*IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04 -2.791E-05  Aircraft-OML


Cleaned up version:

ZONE   0 : Min/Max CFL= 5.000E-01 1.500E+01 Min/Max DT= 8.411E-10 1.004E-01 sec

*IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06  aerosurfs
*IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05  aerosurfs
*IGSTAB* 1626 7.392E-02 2.470E-01 -9.075E-04 8.607E-03 -5.911E-04 -4.945E-06  Aircraft-Total
*IGMNTAERO* 1626 -6.120E-04 1.406E-02 6.395E-04 4.473E-08 3.112E-04 -2.785E-05  Aircraft-Total
 Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  699  1625     12
Warning: BCFD: US_UPDATEQ: izon, iter, nBadpmin:  111  1626      6
*IGSTAB* 1626 6.623E-02 2.137E-01 -9.063E-04 8.450E-03 -5.485E-04 -4.961E-06  Aircraft-OML
*IGMNTAERO* 1626 -6.118E-04 -1.602E-02 6.404E-04 5.756E-08 3.341E-04 -2.791E-05  Aircraft-OML
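
For reference, a checker can at least detect records damaged this way by
accepting only lines that match the full record shape. The sketch below is an
illustration, not our actual checking code: the function name parse_record,
the regular expression, and the optional set of known labels are assumptions
based only on the sample output above (a tag between asterisks, an iteration
number, six values in E notation, and a label). It cannot recover a scrambled
record; it only rejects it.

```python
import re

# One value in the format seen in the sample output, e.g. -9.075E-04.
_NUM = r"[-+]?\d\.\d{3}E[-+]\d{2}"

# *TAG* <iteration> <six values> <label>, with nothing else on the line.
RECORD = re.compile(
    rf"^\*(?P<tag>[A-Z]+)\*\s+(?P<iter>\d+)"
    rf"(?P<values>(?:\s+{_NUM}){{6}})\s+(?P<label>\S+)\s*$"
)


def parse_record(line, labels=None):
    """Return (tag, iteration, values, label), or None if the line is damaged.

    `labels` is an optional set of expected labels (hypothetical parameter);
    when given, a record with a truncated label such as "Aircr" is rejected.
    """
    match = RECORD.match(line)
    if match is None:
        return None
    label = match["label"]
    if labels is not None and label not in labels:
        return None
    values = [float(v) for v in match["values"].split()]
    return match["tag"], int(match["iter"]), values, label
```

With the scrambled sample above, the intact records parse normally, while the
*IGMNTAERO* line that has the warning spliced into it, the warning lines, and
the stray "aft-Total" fragment all return None.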

Thanks!
