On Wed, 2013-12-18 at 11:47 -0500, Noam Bernstein wrote:
> Yes - I never characterized it fully, but we attached with gdb to every
> single running vasp process, and all were stuck in the same
> call to MPI_allreduce() every time. It's only happening on rather large
> jobs, so it's not the easi
Hi!
Not sure if this has been caught already or not, but there is a typo in
opal/memoryhooks/memory.h in 1.6.5.
#ifndef OPAL_MEMORY_MEMORY_H
#define OPAl_MEMORY_MEMORY_H
Note the lower case "l" in the define.
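Presumably the fix is just to spell the two lines consistently, e.g. (sketch only, header body elided):
#ifndef OPAL_MEMORY_MEMORY_H
#define OPAL_MEMORY_MEMORY_H    /* was: OPAl_MEMORY_MEMORY_H */

/* ... rest of the header unchanged ... */

#endif /* OPAL_MEMORY_MEMORY_H */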
/Åke S.
On Fri, 2013-03-01 at 01:24 +0900, Pradeep Jha wrote:
> Sorry for those mistakes. I addressed all three problems:
> - I put "implicit none" at the top of the main program
> - I initialized tag.
> - I changed MPI_INT to MPI_INTEGER
> - "send_length" should be just "send"; it was a typo.
>
>
> But the
Hi!
Was the change for libmpi_f90 in VERSION intentional or a typo?
This is from openmpi 1.6.3
libmpi_f90_so_version=4:0:1
1.6.1 had
libmpi_f90_so_version=2:0:1
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
On Thu, 2013-01-03 at 07:14 -0800, Ralph Castain wrote:
> > Well, it isn't :-)
> > configure says:
> > --- MCA component grpcomm:pmi (m4 configuration macro)
> > checking for MCA component grpcomm:pmi compile mode... dso
> > checking if user requested PMI support... no
> > checking if MCA component
On Thu, 2013-01-03 at 07:00 -0800, Ralph Castain wrote:
> On Jan 3, 2013, at 6:52 AM, Ake Sandgren wrote:
>
> > On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote:
> >> On Jan 3, 2013, at 3:01 AM, Ake Sandgren wrote:
> >>
> >>> On Thu, 201
On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote:
> On Jan 3, 2013, at 3:01 AM, Ake Sandgren wrote:
>
> > On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote:
> >> On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote:
> >>> Hi!
> >>>
On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote:
> On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote:
> > Hi!
> >
> > The grpcomm component hier seems to have vanished between 1.6.1 and
> > 1.6.3.
> > Why?
> > It seems that the version of sl
On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote:
> Hi!
>
> The grpcomm component hier seems to have vanished between 1.6.1 and
> 1.6.3.
> Why?
> It seems that the version of slurm we are using (not the latest at the
> moment) is using it for startup.
>
Hi!
The grpcomm component hier seems to have vanished between 1.6.1 and
1.6.3.
Why?
It seems that the version of slurm we are using (not the latest at the
moment) is using it for startup.
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90
On Thu, 2012-09-27 at 16:31 +0200, Ake Sandgren wrote:
> Hi!
>
> Building 1.6.1 and 1.6.2, I seem to be missing the actual Fortran
> bindings for MPI_Op_commutative and a bunch of other functions.
>
> My configure is
> ./configure --enable-orterun-prefix-by-default --
mpi_init_ is there (as a weak symbol) as it should be.
All compilers give me the same result.
Any ideas why?
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
Hi!
I'm getting a segfault in hwloc_setup_distances_from_os_matrix in the
call to hwloc_bitmap_or, due to objs or objs[i]->cpuset having been freed
and containing garbage; objs[i]->cpuset has infinite < 0.
I only get this when using slurm with cgroups, asking for 2 nodes with 1
cpu each. The cpuset is t
On Wed, 2010-09-22 at 14:16 +0200, Ake Sandgren wrote:
> On Wed, 2010-09-22 at 07:42 -0400, Jeff Squyres wrote:
> > This is a problem with the Pathscale compiler and old versions of GCC. See:
> >
> >
> > http://www.open-mpi.org/faq/?category=building#pathscale-
> > ----------
> > [host1:29931] 3 more processes have sent help message
> > help-mpi-errors.txt / mpi_errors_are_fatal
> > [host1:29931] Set MCA parameter "orte_base_help_aggregate" to 0 to see
> > all help / error messages
> >
> > There are no problems when Open MPI 1.4.2 is built with GCC (GCC 4.1.2).
> > No problems are found with Open MPI 1.2.6 and PathScale either.
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
this is most likely
caused by our setup.
The openmpi version is 1.4.2 (it fails with 1.3.3 too).
The filesystem used is GPFS.
openmpi was built with mpi-threads but without progress-threads.
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46
non-MPI related packets
coming in on the sockets will sometimes cause havoc.
We've been getting HTTP traffic in the jobs' stdout/err sometimes. That
really confuses the users :-)
And yes, we are going to block this, but we haven't had time...
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
__asm__ __volatile__ (
    SMPLOCK "cmpxchgl %3,%2 \n\t"
            "sete %0 \n\t"
    : "=qm" (ret), "+a" (oldval), "+m" (*addr)
    : "q"(newval)
On Wed, 2010-02-10 at 08:21 -0500, Jeff Squyres wrote:
> On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote:
>
> > According to people who know asm statements fairly well (compiler
> > developers), it should be
>
> > static inline int opal_atomic_cmpset
well (compiler
developers), it should be
static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
        SMPLOCK "cmpxchgl %3,%2 \n\t"
                "sete %0 \n\t"
        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
        : "q"(newval), "2"(*addr), "1"(oldval)
        : "memory", "cc");
    return (int)ret;
}
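For context, a minimal usage sketch of how a compare-and-set like this is typically used (the lock variable and spin loop below are illustrative only, not Open MPI code; assumes <stdint.h> and the function above):
/* Hypothetical sketch: spin until we atomically change the lock word
 * from 0 to 1; opal_atomic_cmpset_32() returns non-zero on success. */
static volatile int32_t my_lock = 0;

static void acquire_my_lock(void)
{
    while (!opal_atomic_cmpset_32(&my_lock, 0, 1)) {
        /* busy-wait; a real lock would back off or yield here */
    }
}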
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
ring since the demise of SiCortex
last year. I hope they will be able to release a new version fairly
soon.
In my opinion (working mostly with Fortran codes, shudder) it is the
best compiler around, although they have had problems over the years in
coming out with fixes for bugs in a timely fas
ould try the 1.4.1-rc1
> which should work with PGI-10 and see if it fixes your problems too.
Our PGI 9.0-3 doesn't have any problems building openmpi 1.3.3 or 1.4
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
Hi!
config/libtool.m4 has a bug when pgi 10 is used.
The lines:
pgCC* | pgcpp*)
# Portland Group C++ compiler
case `$CC -V` in
*pgCC\ [[1-5]]* | *pgcpp\ [[1-5]]*)
match pgi 10.0 as well (the "[1-5]" glob also matches the leading "1" of
"10.0"), but 10.0 doesn't have the --instantiation_dir flag.
--
Ake San
It should be MPI_Irecv(buffer, 1, ...)
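i.e. something along these lines, as a sketch only (the buffer type, source rank and tag below are placeholders, not taken from the original code):
#include <mpi.h>

/* Sketch: post a non-blocking receive for one MPI_INT and complete it.
 * The source rank and tag are placeholders. */
static void recv_one_int(void)
{
    int         buffer;
    MPI_Request request;
    MPI_Status  status;

    MPI_Irecv(&buffer, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &request);
    /* ... other work ... */
    MPI_Wait(&request, &status);
}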
> The segfault disappears if I comment out the MPI_Irecv call in
> recv_func so I'm assuming that there's something wrong with the
> parameters that I'm passing to it. Thoughts?
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea,
On Fri, 2009-09-11 at 13:18 +0200, Ake Sandgren wrote:
> Hi!
>
> The following code shows a bad behaviour when running over openib.
Oops. Red Face big time.
I happened to run the IB test between two systems that don't have IB
connectivity.
Goes and hides in a dark corner...
--
does.
But I think that it should be allowed to behave as it does.
This example is a bit contrived, but there are codes where a similar
situation can occur, i.e. the Bcast sender doing lots of other work
after the Bcast before its next MPI call. VASP is a candidate for this.
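As a rough sketch of the pattern I mean (the function and the local work loop are made up for illustration, not taken from any real code):
#include <mpi.h>

/* The root broadcasts, then does a long stretch of purely local work
 * before its next MPI call; the other ranks move on almost immediately. */
static void bcast_then_local_work(int rank, double *data, int n)
{
    MPI_Bcast(data, n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < n; i++) {      /* long local computation, */
            data[i] = data[i] * data[i];   /* no MPI calls in here    */
        }
    }

    /* the root only re-enters the MPI library here, much later */
    MPI_Barrier(MPI_COMM_WORLD);
}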
--
Ake Sandgren,
n keep up a lot better but not
completely.
OS: CentOS5.3 (OFED 1.3.2 and 1.4.2 tested)
HW: Mellanox MT25208 InfiniHost III Ex (128MB)
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126
Mobile: +46 70 77
ts comes to a complete standstill at the integer bsbr tests
> >> It consumes cpu all the time but nothing happens.
> >
> > Actually, if I'm not too impatient it will progress, but VERY slowly.
> > A complete run of the blacstest takes 30+ min of cpu-time...
> >> From