On Sep 20, 2013, at 1:00 PM, Lloyd Brown wrote:
> It is interesting to me, though, that I need to explicitly exclude
> lo/127.0.0.1 in this case, but when I'm on an Ethernet-only node, and I
> just do the plain "mpirun ./appname", I don't have to exclude anything,
> and it figures out to use em1,
On 09/20/2013 12:48 PM, Noam Bernstein wrote:
On Sep 20, 2013, at 11:52 AM, Gus Correa wrote:
Hi Noam
Could it be that Torque, or probably more likely NFS,
is too slow to create/make available the PBS_NODEFILE?
What if you insert a "sleep 2",
or whatever number of seconds you want,
before the mpiexec command line?
1 - How do I check the BTLs available? Something like "ompi_info | grep
-i btl"? If so, here's the list:
> MCA btl: ofud (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: openib (MCA v2.0, API v2.0, Component v1.6.3)
> MCA btl: self (MCA v2.0, API v2.0, Component v1.6.3)
On Sep 20, 2013, at 11:52 AM, Gus Correa wrote:
> Hi Noam
>
> Could it be that Torque, or probably more likely NFS,
> is too slow to create/make available the PBS_NODEFILE?
>
> What if you insert a "sleep 2",
> or whatever number of seconds you want,
> before the mpiexec command line?
> Or maybe better, a "ls -l $PBS_NODEFILE; cat $PBS_NODEFILE",
On Sep 20, 2013, at 12:27 PM, Lloyd Brown wrote:
> Interesting. I was taking the approach of "only exclude what you're
> certain you don't want" (the native IB and TCP/IPoIB stuff) since I
> wasn't confident enough in my knowledge of the OpenMPI internals to
> know what I should explicitly include.
Interesting. I was taking the approach of "only exclude what you're
certain you don't want" (the native IB and TCP/IPoIB stuff) since I
wasn't confident enough in my knowledge of the OpenMPI internals to
know what I should explicitly include.
However, taking Jeff's suggestion, this does seem to
Looks like Ralph noticed that we fixed this on the trunk and forgot to bring it
over to v1.7. I just committed it on v1.7 in r29215. Give it a whirl in
tonight's v1.7 nightly tarball.
On Sep 20, 2013, at 7:00 AM, Siegmar Gross wrote:
> Hi,
>
> I tried to install openmpi-1.7.3a1r29213 on "openSuSE Linux 12.1",
Hi Noam
Could it be that Torque, or probably more likely NFS,
is too slow to create/make available the PBS_NODEFILE?
What if you insert a "sleep 2",
or whatever number of seconds you want,
before the mpiexec command line?
Or maybe better, a "ls -l $PBS_NODEFILE; cat $PBS_NODEFILE",
just to make
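Gus's suggestion (a sleep plus an "ls -l"/"cat" sanity check before mpiexec) can be sketched as a small guard function in the job script. This is only a sketch, not something tested against Torque; the retry count and the 2-second delay are arbitrary assumptions:

```shell
#!/bin/bash
# Poll for the scheduler-generated node file before handing it to
# mpiexec, in case Torque/NFS is slow to make it visible on the node.
# Retry count (5) and delay (2 s) are assumptions, not Torque defaults.
wait_for_nodefile() {
    local nodefile="${1:-$PBS_NODEFILE}"
    local tries="${2:-5}"
    local i
    for ((i = 0; i < tries; i++)); do
        if [ -s "$nodefile" ]; then
            ls -l "$nodefile"   # show that the file exists and is non-empty
            cat "$nodefile"     # and which hosts it lists
            return 0
        fi
        sleep 2                 # give NFS/Torque time to catch up
    done
    echo "ERROR: $nodefile still missing after $tries checks" >&2
    return 1
}
```

In a job script this would run as "wait_for_nodefile && mpiexec ./appname", so an intermittently missing node file produces a clear diagnostic instead of the ORTE_ERROR_LOG file-open failure in ras_tm_module.c.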
Sorry for the delay replying -- I actually replied on the original thread
yesterday, but it got hung up in my outbox and I didn't notice that it didn't
actually go out until a few moments ago. :-(
I'm *guessing* that this is a problem with your local icpc installation.
Can you compile / run ot
Correct -- it doesn't make sense to specify both include *and* exclude: by
specifying one, you're implicitly (but exactly/precisely) specifying the other.
My suggestion would be to use positive notation, not negative notation. For
example:
mpirun --mca btl tcp,self --mca btl_tcp_if_include eth0
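The same positive-notation settings can also be placed in a per-user MCA parameter file instead of on every mpirun command line. A minimal sketch of "$HOME/.openmpi/mca-params.conf", assuming the Ethernet interface really is named eth0 on these nodes (check with ifconfig or "ip addr"):

```
# Use only the TCP and self BTLs, and pin TCP traffic to the
# Ethernet interface (interface name eth0 is an assumption).
btl = tcp,self
btl_tcp_if_include = eth0

# Equivalent negative form -- do NOT set both at once, they conflict:
# btl_tcp_if_exclude = lo,ib0
```

Command-line "--mca" values override this file, so it is a convenient default while still allowing per-job experiments.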
I can't tell if this is a busted compiler installation or not. The first error
is:
-
/usr/include/c++/4.6.3/bits/stl_algobase.h(573): error: type name is not allowed
const bool __simple = (__is_trivial(_ValueType1)
^
detected during
I don't think you are allowed to specify both include and exclude options at
the same time as they conflict - you should either exclude ib0 or include eth0
(or whatever).
My guess is that the various nodes are trying to communicate across disjoint
networks. We've seen that before when, for example,
> The trouble is when I try to add some "--mca" parameters to force it to
> use TCP/Ethernet, the program seems to hang. I get the headers of the
> "osu_bw" output, but no results, even on the first case (1 byte payload
> per packet). This is occurring on both the IB-enabled nodes, and on the
> Ethernet-only nodes.
Hi, all.
We've got a couple of clusters running RHEL 6.2, and have several
centrally-installed versions/compilations of OpenMPI. Some of the nodes
have 4xQDR Infiniband, and all the nodes have 1 gigabit ethernet. I was
gathering some bandwidth and latency numbers using the OSU/OMB tests,
and not
On Sep 20, 2013, at 10:36 AM, Noam Bernstein wrote:
>
> On Sep 20, 2013, at 10:22 AM, Reuti wrote:
>
>>
>> Is the location for the spool directory local or shared by NFS? Disk full?
>
> No - locally mounted, and far from full on all the nodes.
Another new observation, which may shift the
On Sep 20, 2013, at 10:22 AM, Reuti wrote:
>
> Is the location for the spool directory local or shared by NFS? Disk full?
No - locally mounted, and far from full on all the nodes.
Noam
Hi,
Am 20.09.2013 um 16:12 schrieb Noam Bernstein:
> On Sep 20, 2013, at 10:04 AM, Noam Bernstein wrote:
>
>> Never mind - I was sure that my earlier tests showed that the $PBS_NODEFILE
>> was there, but now it seems like every time the job fails it's because this
>> file really is missing.
On Sep 20, 2013, at 10:04 AM, Noam Bernstein wrote:
>
> Never mind - I was sure that my earlier tests showed that the $PBS_NODEFILE
> was there, but now it seems like every time the job fails it's because this
> file really is missing. Time to check why torque isn't always creating
> the nodefile.
On Sep 20, 2013, at 9:55 AM, Noam Bernstein wrote:
>
> This is completely unrepeatable - resubmitting the same job almost
> always works the second time around. The line appears to be
> associated with looking for the torque/maui generated node file,
> and when I do something like
> echo $PBS_NODEFILE
Hi - we've been using openmpi for a while, but only for the last few months
with torque/maui. Intermittently (maybe 1/10 jobs), we get mpi jobs that fail
with the error:
[compute-2-4:32448] [[52041,0],0] ORTE_ERROR_LOG: File open failure in file
ras_tm_module.c at line 142
[compute-2-4:32448] [
Hi,
I tried to install openmpi-1.7.3a1r29213 on "openSuSE Linux 12.1",
"Solaris 10 x86_64", and "Solaris 10 sparc" with "Sun C 5.12" and
gcc-4.8.0 in 64-bit mode. Unfortunately "make" breaks with the same
error for both compilers on both Solaris platforms.
tyr openmpi-1.7.3a1r29213-SunOS.sparc.6
Output of "make V=1" is attached. Again the same error. If the Intel
compiler is using C++ headers from gfortran, how can we avoid this?
On Fri, Sep 20, 2013 at 11:07 AM, Bert Wesarg wrote:
> Hi,
>
> On Fri, Sep 20, 2013 at 4:49 AM, Syed Ahsan Ali wrote:
>> I am trying to compile openmpi-1.6.5 on fc16.x86_64 with icc and ifort
Hi,
On Fri, Sep 20, 2013 at 4:49 AM, Syed Ahsan Ali wrote:
> I am trying to compile openmpi-1.6.5 on fc16.x86_64 with icc and ifort
> but getting the subject error. config.out and make.out is attached.
> Following command was used for configure
>
> ./configure CC=icc CXX=icpc FC=ifort F77=ifort