I agree with Gus - check your stack size. This isn't occurring in OMPI itself, 
so I suspect it is in the system setup.
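A quick way to see what the launched processes actually get (a rough sketch; adjust the hostfile path to yours) is to have mpirun run a shell instead of hpcc:

  # print the hostname and the stack limit as seen by each launched process
  mpirun -np 2 --hostfile ./myhosts sh -c 'echo "$(hostname): stack=$(ulimit -s)"'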


On Apr 3, 2013, at 10:17 AM, Reza Bakhshayeshi <reza.b2...@gmail.com> wrote:

> Thanks for your answers.
> 
> @Ralph Castain: 
> Do you mean the error I receive?
> This is the output when I run the program:
> 
>   *** Process received signal ***
>   Signal: Segmentation fault (11)
>   Signal code: Address not mapped (1)
>   Failing at address: 0x1b7f000
>   [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6a84b524a0]
>   [ 1] hpcc(HPCC_Power2NodesMPIRandomAccessCheck+0xa04) [0x423834]
>   [ 2] hpcc(HPCC_MPIRandomAccess+0x87a) [0x41e43a]
>   [ 3] hpcc(main+0xfbf) [0x40a1bf]
>   [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f6a84b3d76d]
>   [ 5] hpcc() [0x40aafd]
>   *** End of error message ***
> [ ][[53938,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> --------------------------------------------------------------------------
> mpirun noticed that process rank 1 with PID 4164 on node 192.168.100.6 exited on signal 11 (Segmentation fault).
> --------------------------------------------------------------------------
> 
> @Gus Correa:
> I did it both on the server and on the instances, but it didn't solve the problem.
> 
> 
> On 3 April 2013 19:14, Gus Correa <g...@ldeo.columbia.edu> wrote:
> Hi Reza
> 
> Check the system stacksize first ('limit stacksize' or 'ulimit -s').
> If it is small, you can try to increase it
> before you run the program.
> Say (tcsh):
> 
> limit stacksize unlimited
> 
> or (bash):
> 
> ulimit -s unlimited
> 
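> If mpirun is launching onto other nodes over ssh, the larger limit also has
> to be in effect in those non-interactive sessions, not just in the shell
> where you type mpirun. One possible way (a sketch, assuming pam_limits is
> applied to your ssh logins) is /etc/security/limits.conf on every node:
> 
>   # /etc/security/limits.conf  (illustrative values)
>   *    soft    stack    unlimited
>   *    hard    stack    unlimited
> 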
> I hope this helps,
> Gus Correa
> 
> 
> On 04/03/2013 10:29 AM, Ralph Castain wrote:
> Could you perhaps share the stacktrace from the segfault? It's
> impossible to advise you on the problem without seeing it.
> 
> 
> On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi <reza.b2...@gmail.com> wrote:
> 
> Hi
> I have installed the HPCC benchmark suite and Open MPI on private cloud
> instances.
> Unfortunately, I mostly get a segmentation fault when I run it
> simultaneously on two or more instances with:
> mpirun -np 2 --hostfile ./myhosts hpcc
> 
> Everything is on Ubuntu Server 12.04 (updated),
> and this is my make.intel64 file:
> 
> # ----------------------------------------------------------------------
> # - shell --------------------------------------------------------------
> # ----------------------------------------------------------------------
> #
> SHELL = /bin/sh
> #
> CD = cd
> CP = cp
> LN_S = ln -s
> MKDIR = mkdir
> RM = /bin/rm -f
> TOUCH = touch
> #
> # ----------------------------------------------------------------------
> # - Platform identifier ------------------------------------------------
> # ----------------------------------------------------------------------
> #
> ARCH = intel64
> #
> # ----------------------------------------------------------------------
> # - HPL Directory Structure / HPL library ------------------------------
> # ----------------------------------------------------------------------
> #
> TOPdir = ../../..
> INCdir = $(TOPdir)/include
> BINdir = $(TOPdir)/bin/$(ARCH)
> LIBdir = $(TOPdir)/lib/$(ARCH)
> #
> HPLlib = $(LIBdir)/libhpl.a
> #
> # ----------------------------------------------------------------------
> # - Message Passing library (MPI) --------------------------------------
> # ----------------------------------------------------------------------
> # MPinc tells the C compiler where to find the Message Passing library
> # header files, MPlib is defined to be the name of the library to be
> # used. The variable MPdir is only used for defining MPinc and MPlib.
> #
> MPdir = /usr/lib/openmpi
> MPinc = -I$(MPdir)/include
> MPlib = $(MPdir)/lib/libmpi.so
> #
> # ----------------------------------------------------------------------
> # - Linear Algebra library (BLAS or VSIPL) -----------------------------
> # ----------------------------------------------------------------------
> # LAinc tells the C compiler where to find the Linear Algebra library
> # header files, LAlib is defined to be the name of the library to be
> # used. The variable LAdir is only used for defining LAinc and LAlib.
> #
> LAdir = /usr/local/ATLAS/obj64
> LAinc = -I$(LAdir)/include
> LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
> #
> # ----------------------------------------------------------------------
> # - F77 / C interface --------------------------------------------------
> # ----------------------------------------------------------------------
> # You can skip this section if and only if you are not planning to use
> # a BLAS library featuring a Fortran 77 interface. Otherwise, it is
> # necessary to fill out the F2CDEFS variable with the appropriate
> # options. **One and only one** option should be chosen in **each** of
> # the 3 following categories:
> #
> # 1) name space (How C calls a Fortran 77 routine)
> #
> # -DAdd_ : all lower case and a suffixed underscore (Suns,
> # Intel, ...), [default]
> # -DNoChange : all lower case (IBM RS6000),
> # -DUpCase : all upper case (Cray),
> # -DAdd__ : the FORTRAN compiler in use is f2c.
> #
> # 2) C and Fortran 77 integer mapping
> #
> # -DF77_INTEGER=int : Fortran 77 INTEGER is a C int, [default]
> # -DF77_INTEGER=long : Fortran 77 INTEGER is a C long,
> # -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
> #
> # 3) Fortran 77 string handling
> #
> # -DStringSunStyle : The string address is passed at the string loca-
> # tion on the stack, and the string length is then
> # passed as an F77_INTEGER after all explicit
> # stack arguments, [default]
> # -DStringStructPtr : The address of a structure is passed by a
> # Fortran 77 string, and the structure is of the
> # form: struct {char *cp; F77_INTEGER len;},
> # -DStringStructVal : A structure is passed by value for each Fortran
> # 77 string, and the structure is of the form:
> # struct {char *cp; F77_INTEGER len;},
> # -DStringCrayStyle : Special option for Cray machines, which uses
> # Cray fcd (fortran character descriptor) for
> # interoperation.
> #
> F2CDEFS =
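> # Example (illustration only, assuming the defaults marked above): an
> # explicit setting would look like
> #   F2CDEFS = -DAdd_ -DF77_INTEGER=int -DStringSunStyle
> # It is left empty here since HPL_OPTS selects -DHPL_CALL_CBLAS below,
> # so the Fortran 77 BLAS interface is not used.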
> #
> # ----------------------------------------------------------------------
> # - HPL includes / libraries / specifics -------------------------------
> # ----------------------------------------------------------------------
> #
> HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
> HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm
> #
> # - Compile time options -----------------------------------------------
> #
> # -DHPL_COPY_L force the copy of the panel L before bcast;
> # -DHPL_CALL_CBLAS call the cblas interface;
> # -DHPL_CALL_VSIPL call the vsip library;
> # -DHPL_DETAILED_TIMING enable detailed timers;
> #
> # By default HPL will:
> # *) not copy L before broadcast,
> # *) call the BLAS Fortran 77 interface,
> # *) not display detailed timing information.
> #
> HPL_OPTS = -DHPL_CALL_CBLAS
> #
> # ----------------------------------------------------------------------
> #
> HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
> #
> # ----------------------------------------------------------------------
> # - Compilers / linkers - Optimization flags ---------------------------
> # ----------------------------------------------------------------------
> #
> CC = /usr/bin/mpicc
> CCNOOPT = $(HPL_DEFS)
> CCFLAGS = $(HPL_DEFS) -fomit-frame-pointer -O3 -funroll-loops
> #CCFLAGS = $(HPL_DEFS)
> #
> # On some platforms, it is necessary to use the Fortran linker to find
> # the Fortran internals used in the BLAS library.
> #
> LINKER = /usr/bin/mpif90
> LINKFLAGS = $(CCFLAGS)
> #
> ARCHIVER = ar
> ARFLAGS = r
> RANLIB = echo
> #
> # ----------------------------------------------------------------------
> 
> Would you please help me figure this problem out?
> 
> Regards,
> Reza
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users