Hi Marks,

Thanks for pointing out possible problems with our system. I will talk to the system admin about these issues.
Fhokrul

> Date: Fri, 29 Jan 2010 09:47:53 -0600
> From: L-marks at northwestern.edu
> To: wien at zeus.theochem.tuwien.ac.at
> Subject: [Wien] Fwd: MPI segmentation fault
>
> I've edited your information down (too large for the list), and am
> including it so others can see if they run into similar problems.
>
> In essence, you have a mess and you are going to have to talk to your
> sysadmin (hikmpn) to get things sorted out. Issues:
>
> a) You have openmpi-1.3.3. This works for small problems, fails for
> large ones. This needs to be updated to 1.4.0 or 1.4.1 (the older
> versions of openmpi have bugs).
> b) The openmpi was compiled with ifort 10.1 but you are using 11.1.064
> for Wien2k -- could lead to problems.
> c) The openmpi was compiled with gcc and ifort 10.1, not icc and ifort
> which could lead to problems.
> d) The fftw library you are using was compiled with gcc not icc, this
> could lead to problems.
> e) Some of the shared libraries are in your LD_LIBRARY_PATH, you will
> need to add -x LD_LIBRARY_PATH to how mpirun is called (in
> $WIENROOT/parallel_options) -- look at man mpirun.
> f) I still don't know what the stack limits are on your machine --
> this can lead to severe problems in lapw0_mpi
>
> ---------- Forwarded message ----------
> From: Fhokrul Islam <fhokrul.islam at lnu.se>
> Date: Fri, Jan 29, 2010 at 9:16 AM
> Subject: MPI segmentation fault
> To: "L-marks at northwestern.edu" <L-marks at northwestern.edu>
>
> Below are the information that you requested. I would like to mention
> that MPI worked fine when I used it for a bulk
> 8 atom system. But for surface supercell of 96 atom it crashes at lapw0.
>
> Thanks,
> Fhokrul
>
> >> 1) Please do "ompi_info " and paste the output to the end of your
> >> response to this email.
>
> 1. [eishfh at milleotto s110]$ ompi_info
> Package: Open MPI hikmpn at milleotto.local Distribution
> Open MPI: 1.3.3
> Prefix: /home/hikmpn/local
> Configured architecture: x86_64-unknown-linux-gnu
> Configure host: milleotto.local
> Configured by: hikmpn
> Fortran90 bindings size: small
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: ifort
> Fortran77 compiler abs: /sw/pkg/intel/10.1/bin//ifort
> Fortran90 compiler: ifort
> Fortran90 compiler abs: /sw/pkg/intel/10.1/bin//ifort
>
> >> 2) Also paste the output of "echo $LD_LIBRARY_PATH"
>
> 2. [eishfh at milleotto s110]$ echo $LD_LIBRARY_PATH
> /home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
>
> >> 3) If you have in your .bashrc a "ulimit -s unlimited" please edit
> >> this (temporarily) out, then ssh into one of the child nodes.
>
> After editing .bashrc file I did the following from the child node:
>
> 3. [eishfh at mn012 ~]$ which mpirun
> /home/hikmpn/local/bin/mpirun
>
> 4. [eishfh at mn012 ~]$ which lapw0_mpi
> /disk/global/home/eishfh/Wien2k_09_2/lapw0_mpi
>
> 5. [eishfh at mn012 ~]$ echo $LD_LIBRARY_PATH
> -bash:
> home/eishfh/fftw-2.1.5-gcc/lib/:/home/hikmpn/local/lib/:/sw/pkg/intel/11.1.064//lib/intel64:/sw/pkg/mkl/10.0/lib/em64t:/lib64:/usr/lib64:/usr/X11R6/lib64:/lib:/usr/lib:/usr/X11R6/lib:/usr/local/lib
>
> 6. [eishfh at mn012 ~]$ ldd $WIENROOT/lapw0_mpi
> libmkl_intel_lp64.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_intel_lp64.so (0x00002ab5610d3000)
> libmkl_sequential.so =>
> /sw/pkg/mkl/10.0/lib/em64t/libmkl_sequential.so (0x00002ab5613d9000)
> libmkl_core.so => /sw/pkg/mkl/10.0/lib/em64t/libmkl_core.so
> (0x00002ab561566000)
> libiomp5.so => /sw/pkg/intel/11.1.064//lib/intel64/libiomp5.so
> (0x00002ab561738000)
> libsvml.so => /sw/pkg/intel/11.1.064//lib/intel64/libsvml.so
> (0x00002ab5618e9000)
> libimf.so => /sw/pkg/intel/11.1.064//lib/intel64/libimf.so
> (0x00002ab562694000)
> libifport.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifport.so.5
> (0x00002ab562a28000)
> libifcoremt.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libifcoremt.so.5
> (0x00002ab562b61000)
> libintlc.so.5 =>
> /sw/pkg/intel/11.1.064//lib/intel64/libintlc.so.5 (0x00002ab562e05000)
>
> --
> Laurence Marks
> Department of Materials Science and Engineering
> MSE Rm 2036 Cook Hall
> 2220 N Campus Drive
> Northwestern University
> Evanston, IL 60208, USA
> Tel: (847) 491-3996 Fax: (847) 491-7820
> email: L-marks at northwestern dot edu
> Web: www.numis.northwestern.edu
> Chair, Commission on Electron Crystallography of IUCR
> www.numis.northwestern.edu/
> Electron crystallography is the branch of science that uses electron
> scattering and imaging to study the structure of matter.
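
The checks behind items (a) to (c) and (f) of the quoted list can be sketched as a small shell script. This is only an illustration and not part of the original exchange: `ompi_field` is a hypothetical helper name, and the sample text is copied from the ompi_info output quoted above. On a real node one would pipe the output of `ompi_info` itself into the helper instead of the sample.

```shell
#!/bin/sh
# Hypothetical helper (not a WIEN2k or Open MPI tool): print the value of one
# "Label: value" field from ompi_info-style text read on stdin.
ompi_field() {
    # $1 is the field label exactly as ompi_info prints it, e.g. "Open MPI".
    # The colon directly after the label keeps "C compiler" from matching
    # "C compiler absolute" or "C++ compiler".
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

# Sample lines copied from the ompi_info output quoted in this thread.
sample='                Open MPI: 1.3.3
              C compiler: gcc
  Fortran90 compiler abs: /sw/pkg/intel/10.1/bin//ifort'

ompi_version=$(printf '%s\n' "$sample" | ompi_field "Open MPI")
c_compiler=$(printf '%s\n' "$sample" | ompi_field "C compiler")
mpi_ifort=$(printf '%s\n' "$sample" | ompi_field "Fortran90 compiler abs")

# Items (a)-(c): which Open MPI is installed and which compilers built it.
echo "Open MPI version:      $ompi_version"
echo "C compiler:            $c_compiler"
echo "Fortran compiler path: $mpi_ifort"

# Item (f): the stack limit seen by this shell (a number in kB, or "unlimited").
echo "Stack limit:           $(ulimit -s)"
```

Running the same lines on a child node (after ssh) shows whether the compute nodes see the same Open MPI build and stack limit as the login node. Item (e) is a separate change: Open MPI's mpirun accepts `-x LD_LIBRARY_PATH` to export that variable to the processes it launches, and per the quoted message it belongs in the mpirun call in $WIENROOT/parallel_options.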