I agree with Gus - check your stack size. This isn't occurring in OMPI itself, so I suspect it is in the system setup.
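FWIW, a quick way to confirm whether the stack limit is the culprit is to check the limit that the remotely launched processes actually see - the value set in the shell where you type mpirun is not propagated to the ranks started on the other instance, so it has to be raised on every node. A rough sketch, reusing the ./myhosts hostfile from the original post (one hostname per line assumed; adjust to your setup):

  # print the soft stack limit as seen by each MPI-launched process
  mpirun -np 2 --hostfile ./myhosts bash -c 'echo "$(hostname): $(ulimit -s)"'

  # raise it persistently on every node (assumes passwordless ssh and sudo)
  while read -r host _; do
    ssh "$host" "echo '* - stack unlimited' | sudo tee -a /etc/security/limits.conf"
  done < ./myhosts

  # log out/in (or reboot the instances) so the new limit takes effect, then
  ulimit -s unlimited
  mpirun -np 2 --hostfile ./myhosts hpcc

If both instances already report "unlimited" and the segfault persists, then at least the stack size can be ruled out.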

On Apr 3, 2013, at 10:17 AM, Reza Bakhshayeshi <reza.b2...@gmail.com> wrote:

> Thanks for your answers.
>
> @Ralph Castain:
> Do you mean the error I receive?
> It's the output when I'm running the program:
>
> *** Process received signal ***
> Signal: Segmentation fault (11)
> Signal code: Address not mapped (1)
> Failing at address: 0x1b7f000
> [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6a84b524a0]
> [ 1] hpcc(HPCC_Power2NodesMPIRandomAccessCheck+0xa04) [0x423834]
> [ 2] hpcc(HPCC_MPIRandomAccess+0x87a) [0x41e43a]
> [ 3] hpcc(main+0xfbf) [0x40a1bf]
> [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6a84b3d76d]
> [ 5] hpcc() [0x40aafd]
> *** End of error message ***
> [ ][[53938,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 4164 on node 192.168.100.6 exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
>
> @Gus Correa:
> I did it both on the server and on the instances, but it didn't solve the problem.
>
>
> On 3 April 2013 19:14, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Reza
>
> Check the system stacksize first ('limit stacksize' or 'ulimit -s').
> If it is small, you can try to increase it before you run the program.
> Say (tcsh):
>
> limit stacksize unlimited
>
> or (bash):
>
> ulimit -s unlimited
>
> I hope this helps,
> Gus Correa
>
>
> On 04/03/2013 10:29 AM, Ralph Castain wrote:
> Could you perhaps share the stacktrace from the segfault? It's impossible to advise you on the problem without seeing it.
>
>
> On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi <reza.b2...@gmail.com <mailto:reza.b2...@gmail.com>> wrote:
>
> Hi
> I have installed the HPCC benchmark suite and Open MPI on private cloud instances.
> Unfortunately I get a Segmentation fault error, mostly when I run it simultaneously on two or more instances with:
> mpirun -np 2 --hostfile ./myhosts hpcc
>
> Everything is on Ubuntu Server 12.04 (updated)
> and this is my make.intel64 file:
>
> # - shell --------------------------------------------------------------
> # ----------------------------------------------------------------------
> #
> SHELL = /bin/sh
> #
> CD = cd
> CP = cp
> LN_S = ln -s
> MKDIR = mkdir
> RM = /bin/rm -f
> TOUCH = touch
> #
> # ----------------------------------------------------------------------
> # - Platform identifier ------------------------------------------------
> # ----------------------------------------------------------------------
> #
> ARCH = intel64
> #
> # ----------------------------------------------------------------------
> # - HPL Directory Structure / HPL library ------------------------------
> # ----------------------------------------------------------------------
> #
> TOPdir = ../../..
> INCdir = $(TOPdir)/include
> BINdir = $(TOPdir)/bin/$(ARCH)
> LIBdir = $(TOPdir)/lib/$(ARCH)
> #
> HPLlib = $(LIBdir)/libhpl.a
> #
> # ----------------------------------------------------------------------
> # - Message Passing library (MPI) --------------------------------------
> # ----------------------------------------------------------------------
> # MPinc tells the C compiler where to find the Message Passing library
> # header files, MPlib is defined to be the name of the library to be
> # used. The variable MPdir is only used for defining MPinc and MPlib.
> #
> MPdir = /usr/lib/openmpi
> MPinc = -I$(MPdir)/include
> MPlib = $(MPdir)/lib/libmpi.so
> #
> # ----------------------------------------------------------------------
> # - Linear Algebra library (BLAS or VSIPL) -----------------------------
> # ----------------------------------------------------------------------
> # LAinc tells the C compiler where to find the Linear Algebra library
> # header files, LAlib is defined to be the name of the library to be
> # used. The variable LAdir is only used for defining LAinc and LAlib.
> #
> LAdir = /usr/local/ATLAS/obj64
> LAinc = -I$(LAdir)/include
> LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
> #
> # ----------------------------------------------------------------------
> # - F77 / C interface --------------------------------------------------
> # ----------------------------------------------------------------------
> # You can skip this section if and only if you are not planning to use
> # a BLAS library featuring a Fortran 77 interface. Otherwise, it is
> # necessary to fill out the F2CDEFS variable with the appropriate
> # options. **One and only one** option should be chosen in **each** of
> # the 3 following categories:
> #
> # 1) name space (How C calls a Fortran 77 routine)
> #
> # -DAdd_      : all lower case and a suffixed underscore (Suns,
> #               Intel, ...), [default]
> # -DNoChange  : all lower case (IBM RS6000),
> # -DUpCase    : all upper case (Cray),
> # -DAdd__     : the FORTRAN compiler in use is f2c.
> #
> # 2) C and Fortran 77 integer mapping
> #
> # -DF77_INTEGER=int   : Fortran 77 INTEGER is a C int, [default]
> # -DF77_INTEGER=long  : Fortran 77 INTEGER is a C long,
> # -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
> #
> # 3) Fortran 77 string handling
> #
> # -DStringSunStyle  : The string address is passed at the string loca-
> #                     tion on the stack, and the string length is then
> #                     passed as an F77_INTEGER after all explicit
> #                     stack arguments, [default]
> # -DStringStructPtr : The address of a structure is passed by a
> #                     Fortran 77 string, and the structure is of the
> #                     form: struct {char *cp; F77_INTEGER len;},
> # -DStringStructVal : A structure is passed by value for each Fortran
> #                     77 string, and the structure is of the form:
> #                     struct {char *cp; F77_INTEGER len;},
> # -DStringCrayStyle : Special option for Cray machines, which uses
> #                     Cray fcd (fortran character descriptor) for
> #                     interoperation.
> #
> F2CDEFS =
> #
> # ----------------------------------------------------------------------
> # - HPL includes / libraries / specifics -------------------------------
> # ----------------------------------------------------------------------
> #
> HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
> HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm
> #
> # - Compile time options -----------------------------------------------
> #
> # -DHPL_COPY_L           force the copy of the panel L before bcast;
> # -DHPL_CALL_CBLAS       call the cblas interface;
> # -DHPL_CALL_VSIPL       call the vsip library;
> # -DHPL_DETAILED_TIMING  enable detailed timers;
> #
> # By default HPL will:
> #    *) not copy L before broadcast,
> #    *) call the BLAS Fortran 77 interface,
> #    *) not display detailed timing information.
> #
> HPL_OPTS = -DHPL_CALL_CBLAS
> #
> # ----------------------------------------------------------------------
> #
> HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
> #
> # ----------------------------------------------------------------------
> # - Compilers / linkers - Optimization flags ---------------------------
> # ----------------------------------------------------------------------
> #
> CC = /usr/bin/mpicc
> CCNOOPT = $(HPL_DEFS)
> CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops
> #CCFLAGS = $(HPL_DEFS)
> #
> # On some platforms, it is necessary to use the Fortran linker to find
> # the Fortran internals used in the BLAS library.
> #
> LINKER = /usr/bin/mpif90
> LINKFLAGS = $(CCFLAGS)
> #
> ARCHIVER = ar
> ARFLAGS = r
> RANLIB = echo
> #
> # ----------------------------------------------------------------------
>
> Would you please help me figure this problem out?
>
> Regards,
> Reza
> _______________________________________________
> users mailing list
> us...@open-mpi.org <mailto:us...@open-mpi.org>
> http://www.open-mpi.org/mailman/listinfo.cgi/users