Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Ake Sandgren
On Wed, 2013-12-18 at 11:47 -0500, Noam Bernstein wrote: > Yes - I never characterized it fully, but we attached with gdb to every > single vasp running process, and all were stuck in the same > call to MPI_Allreduce() every time. It's only happening on rather large > jobs, so it's not the easi

[OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Ake Sandgren
Hi! Not sure if this has been caught already or not, but there is a typo in opal/memoryhooks/memory.h in 1.6.5. #ifndef OPAL_MEMORY_MEMORY_H #define OPAl_MEMORY_MEMORY_H Note the lower case "l" in the define. /Åke S.
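Because the misspelled #define never matches the #ifndef test, the guard is ineffective and a repeated include would re-declare the header's contents. A minimal sketch of what the corrected guard would look like (the header body itself is omitted here):

    /* opal/memoryhooks/memory.h -- sketch of the corrected include guard only;
     * the actual declarations in the header are omitted */
    #ifndef OPAL_MEMORY_MEMORY_H
    #define OPAL_MEMORY_MEMORY_H   /* was "OPAl_MEMORY_MEMORY_H" with a lower-case "l" */

    /* ... existing declarations ... */

    #endif /* OPAL_MEMORY_MEMORY_H */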

Re: [OMPI users] Calling MPI_send MPI_recv from a fortran subroutine

2013-02-28 Thread Ake Sandgren
On Fri, 2013-03-01 at 01:24 +0900, Pradeep Jha wrote: > Sorry for those mistakes. I addressed all three problems > - I put "implicit none" at the top of the main program > - I initialized tag. > - changed MPI_INT to MPI_INTEGER > - "send_length" should be just "send", it was a typo. > > > But the

[OMPI users] libmpi_f90 shared lib version number change in 1.6.3

2013-01-12 Thread Ake Sandgren
Hi! Was the change for libmpi_f90 in VERSION intentional or a typo? This is from openmpi 1.6.3 libmpi_f90_so_version=4:0:1 1.6.1 had libmpi_f90_so_version=2:0:1

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 07:14 -0800, Ralph Castain wrote: > > Well, it isn't :-) > > configure says: > > --- MCA component grpcomm:pmi (m4 configuration macro) > > checking for MCA component grpcomm:pmi compile mode... dso > > checking if user requested PMI support... no > > checking if MCA component

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 07:00 -0800, Ralph Castain wrote: > On Jan 3, 2013, at 6:52 AM, Ake Sandgren wrote: > > > On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote: > >> On Jan 3, 2013, at 3:01 AM, Ake Sandgren wrote: > >> > >>> On Thu, 201

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote: > On Jan 3, 2013, at 3:01 AM, Ake Sandgren wrote: > > > On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote: > >> On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > >>> Hi! > >>> &

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote: > On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > > Hi! > > > > The grpcomm component hier seems to have vanished between 1.6.1 and > > 1.6.3. > > Why? > > It seems that the version of sl

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > Hi! > > The grpcomm component hier seems to have vanished between 1.6.1 and > 1.6.3. > Why? > It seems that the version of slurm we are using (not the latest at the > moment) is using it for startup. >

[OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
Hi! The grpcomm component hier seems to have vanished between 1.6.1 and 1.6.3. Why? It seems that the version of slurm we are using (not the latest at the moment) is using it for startup.

Re: [OMPI users] fortran bindings for MPI_Op_commutative

2012-09-27 Thread Ake Sandgren
On Thu, 2012-09-27 at 16:31 +0200, Ake Sandgren wrote: > Hi! > > Building 1.6.1 and 1.6.2 I seem to be missing the actual Fortran > bindings for MPI_Op_commutative and a bunch of other functions. > > My configure is > ./configure --enable-orterun-prefix-by-default --

[OMPI users] fortran bindings for MPI_Op_commutative

2012-09-27 Thread Ake Sandgren
mpi_init_ is there (as a weak symbol) as it should be. All compilers give me the same result. Any ideas why?

[OMPI users] Bug in openmpi 1.5.4 in paffinity

2011-09-04 Thread Ake Sandgren
Hi! I'm getting a segfault in hwloc_setup_distances_from_os_matrix in the call to hwloc_bitmap_or due to objs or objs[i]->cpuset being freed and containing garbage; objs[i]->cpuset has its infinite member < 0. I only get this when using slurm with cgroups, asking for 2 nodes with 1 cpu each. The cpuset is t

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
On Wed, 2010-09-22 at 14:16 +0200, Ake Sandgren wrote: > On Wed, 2010-09-22 at 07:42 -0400, Jeff Squyres wrote: > > This is a problem with the Pathscale compiler and old versions of GCC. See: > > > > > > http://www.open-mpi.org/faq/?category=building#pathscale-

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
> ---------- > > [host1:29931] 3 more processes have sent help message > > help-mpi-errors.txt / mpi_errors_are_fatal > > [host1:29931] Set MCA parameter "orte_base_help_aggregate" to 0 to see > > all help / error messages > > > > There are no problems when Open MPI 1.4.2 is built with GCC (GCC 4.1.2). > > No problems are found with Open MPI 1.2.6 and PathScale either.

[OMPI users] opal_mutex_lock(): Resource deadlock avoided

2010-05-06 Thread Ake Sandgren
this is most likely caused by our setup. openmpi version is 1.4.2 (fails with 1.3.3 too). Filesystem used is GPFS. openmpi built with mpi-threads but without progress-threads.

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Ake Sandgren
non-mpi related packets coming in on the sockets will sometimes cause havoc. We've been getting http traffic in the jobs' stdout/err sometimes. That really makes the users confused :-) And yes, we are going to block this but we haven't had time...

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
__asm__ __volatile__ ( SMPLOCK "cmpxchgl %3,%2 \n\t" "sete %0 \n\t" : "=qm" (ret), "+a" (oldval), "+m" (*addr) : "q"(newval)

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
On Wed, 2010-02-10 at 08:21 -0500, Jeff Squyres wrote: > On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote: > > > According to people who know asm statements fairly well (compiler > > developers), it should be > > > static inline int opal_atomic_cmpset

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
well (compiler developers), it should be

    static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                             int32_t oldval, int32_t newval)
    {
        unsigned char ret;
        __asm__ __volatile__ (
            SMPLOCK "cmpxchgl %3,%2 \n\t"
                    "sete %0 \n\t"
            : "=qm" (ret), "=a" (oldval), "=m" (*addr)
            : "q"(newval), "2"(*addr), "1"(oldval)
            : "memory", "cc");
        return (int)ret;
    }

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Ake Sandgren
ring since the demise of SiCortex last year. I hope they will be able to release a new version fairly soon. In my opinion (working mostly with Fortran codes, shudder) it is the best compiler around. Although they have had problems over the years in coming out with fixes for bugs in a timely fas

Re: [OMPI users] Problems compiling OpenMPI 1.4 with PGI 9.0-3

2010-01-07 Thread Ake Sandgren
ould try the 1.4.1-rc1 > which should work with PGI-10 and see if it fixes your problems too. Our PGI 9.0-3 doesn't have any problems building openmpi 1.3.3 or 1.4

[OMPI users] openmpi 1.4 and pgi 10

2010-01-04 Thread Ake Sandgren
Hi! config/libtool.m4 has a bug when pgi 10 is used. The lines: pgCC* | pgcpp*) # Portland Group C++ compiler case `$CC -V` in *pgCC\ [[1-5]]* | *pgcpp\ [[1-5]]*) match pgi 10.0 as well (the [[1-5]] glob matches the leading "1" of "10"), but 10.0 doesn't have the --instantiation_dir flag.

Re: [OMPI users] MPI_Irecv segmentation fault

2009-09-22 Thread Ake Sandgren
It should be MPI_Irecv(buffer, 1, ...) > The segfault disappears if I comment out the MPI_Irecv call in > recv_func so I'm assuming that there's something wrong with the > parameters that I'm passing to it. Thoughts?
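For context, a minimal sketch of the suggested call, posting a non-blocking receive of a single MPI_INT; the function and variable names are illustrative, not code from the original post:

    #include <mpi.h>

    /* Sketch only: receive exactly ONE int non-blockingly, as suggested above.
     * recv_one, src and tag are placeholder names. */
    static int recv_one(int src, int tag)
    {
        int buffer;
        MPI_Request req;

        /* count = 1: one MPI_INT is expected, not the buffer size in bytes */
        MPI_Irecv(&buffer, 1, MPI_INT, src, tag, MPI_COMM_WORLD, &req);

        /* ... other work could overlap with the transfer here ... */

        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* complete before using buffer */
        return buffer;
    }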

Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
On Fri, 2009-09-11 at 13:18 +0200, Ake Sandgren wrote: > Hi! > > The following code shows a bad behaviour when running over openib. Oops. Red face big time. I happened to run the IB test between two systems that don't have IB connectivity. Goes and hides in a dark corner... --

[OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
does. But I think that it should be allowed to behave as it does. This example is a bit engineered but there are codes where a similar situation can occur, i.e. the Bcast sender doing lots of other work after the Bcast before the next MPI call. VASP is a candidate for this.
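As an illustration, a minimal sketch of the usage pattern described above; the computation loop and all names are placeholders, not code from the original post:

    #include <mpi.h>

    /* Sketch of the pattern described above: the root broadcasts, then does a
     * long stretch of computation with no further MPI calls, while the other
     * ranks move on to their next MPI operation. */
    static void bcast_then_work(double *data, int n)
    {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Bcast(data, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            /* stand-in for "lots of other work after the Bcast" */
            for (int i = 0; i < n; i++)
                data[i] = data[i] * data[i];
        }

        /* the root reaches its next MPI call much later than the other ranks */
        MPI_Barrier(MPI_COMM_WORLD);
    }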

[OMPI users] Need help with tuning of IB for OpenMPI 1.3.3

2009-08-25 Thread Ake Sandgren
n keep up a lot better but not completely. OS: CentOS5.3 (OFED 1.3.2 and 1.4.2 tested) HW: Mellanox MT25208 InfiniHost III Ex (128MB)

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-28 Thread Ake Sandgren
ts comes to a complete standstill at the integer bsbr tests > >> It consumes cpu all the time but nothing happens. > > > > Actually if I'm not too impatient it will progress but VERY slowly. > > A complete run of the blacstest takes +30min cpu-time... > >> From