What kind of system was this on? ssh, slurm, ...?
> On Jul 28, 2016, at 1:55 PM, Blosch, Edwin L wrote:
>
> I am running cases that are starting just fine and running for a few hours,
> then they die with a message that seems like a startup type of failure.
> Message shown below. The messag
Actually, what Saliya describes sounds like a bug - those procs must all be
assigned to the same comm_world.
Saliya: are you sure they are not? What ranks are you seeing?
> On Jul 29, 2016, at 12:12 PM, Udayanga Wickramasinghe
> wrote:
>
> Hi,
> I think orte/ompi-mca forward number of environ
at 12:18 PM, Ralph Castain wrote:
>
> Actually, what Saliya describes sounds like a bug - those procs must all be
> assigned to the same comm_world.
>
> Saliya: are you sure they are not? What ranks are you seeing?
>
>
>> On Jul 29, 2016, at 12:12 PM, Udayanga
Typical practice would be to put a ./myprogram in there to avoid any possible
confusion with a “myprogram” sitting in your $PATH. We should search the PATH
to find your executable, but the issue might be that it isn’t your PATH on a
remote node.
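For example (program name and process count here are just placeholders):
mpirun -np 4 ./myprogram     # explicit path - no reliance on PATH
mpirun -np 4 myprogram       # only works if PATH is set correctly on every node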
So the question is: are you launching strictly lo
> David Schneider
> SLAC/LCLS
>
> From: users [users-boun...@lists.open-mpi.org] on behalf of Ralph Castain
> [r...@open-mpi.org]
> Sent: Friday, July 29, 2016 5:19 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] mpirun won't find programs from the PATH
> environ
rs-boun...@lists.open-mpi.org] On Behalf Of Ralph
> Castain
> Sent: Thursday, July 28, 2016 4:07 PM
> To: Open MPI Users
> Subject: EXTERNAL: Re: [OMPI users] Question on run-time error "ORTE was
> unable to reliably start"
>
> What kind of system was this on? s
Hmmm...we'll have to check the configure logic as I don't think you should be
getting that message. Regardless, it isn't something of concern - you can turn
it "off" by adding
-mca btl ^usnic
on your command line, or configuring OMPI --enable-mca-no-build=btl-usnic
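For example, a full invocation might look like this (executable and proc count are placeholders):
mpirun -np 4 -mca btl ^usnic ./a.out
or, at build time:
./configure --enable-mca-no-build=btl-usnic ...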
On Mar 22, 2014, at 10:00 P
I suspect the root cause of the problem here lies in how MPI messages are
progressed. OMPI doesn't have an async progress method (yet), and so messaging
on both send and recv ends is only progressed when the app calls the MPI
library. It sounds like your app issues an isend or recv, and then spe
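As an illustration only (this is not your code), the usual workaround is to poke the library periodically so it can progress the outstanding request - something like:

#include <mpi.h>

static void do_some_local_work(void) { /* placeholder for application work */ }

/* Sketch: keep an isend moving by calling MPI_Test between chunks of work,
 * since OMPI only progresses messages from inside MPI calls. */
void send_with_progress(void *buf, int count, int dest, MPI_Comm comm)
{
    MPI_Request req;
    int done = 0;

    MPI_Isend(buf, count, MPI_BYTE, dest, 0, comm, &req);
    while (!done) {
        do_some_local_work();                       /* your computation */
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);   /* drives progress */
    }
}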
The "updated"field in the orte_job_t structure is only used to help reduce the
size of the launch message sent to all the daemons. Basically, we only include
info on jobs that have been changed - thus, it only gets used when the app
calls comm_spawn. After every launch, we automatically change i
Looks good - thanks!
On Mar 24, 2014, at 4:55 AM, tmish...@jcity.maeda.co.jp wrote:
>
> Hi Ralph,
>
> I tried to improve checking for mapping-too-low and fixed a minor
> problem in rmaps_rr.c file. Please see attached patch file.
>
> 1) Regarding mapping-too-low, in future we'll have a larger s
Or use --display-map to see the process to node assignments
Sent from my iPhone
> On Mar 27, 2014, at 11:47 AM, Gus Correa wrote:
>
> PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none,
> in which case -report-bindings won't report anything.
>
> So, if you are using the default,
> you can
Agreed - Jeff and I discussed this just this morning. I will be updating FAQ
soon
Sent from my iPhone
> On Mar 27, 2014, at 9:24 AM, Gus Correa wrote:
>
> <\begin hijacking this thread>
>
> I second Saliya's thanks to Tetsuya.
> I've been following this thread, to learn a bit more about
> how
Oooh...it's Jeff's fault!
Fwiw you can get even more detailed mapping info with --display-devel-map
Sent from my iPhone
> On Mar 27, 2014, at 2:58 PM, "Jeff Squyres (jsquyres)"
> wrote:
>
>> On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)"
>> wrote:
>>
>> Yes, I no
Yes, that is correct
Ralph
On Thu, Mar 27, 2014 at 4:15 PM, Gus Correa wrote:
> On 03/27/2014 05:58 PM, Jeff Squyres (jsquyres) wrote:
>
>> On Mar 27, 2014, at 4:06 PM, "Sasso, John (GE Power & Water, Non-GE)"
>>
> wrote:
>
>>
>> Yes, I noticed that I could not find --display-map in any of t
You make a good point, Gus - let me throw the thread open for suggestions on
how to resolve that problem. We've heard similar concerns raised about other
features we've added to OMPI over the years, but I'm not sure of the best way
to communicate such information.
Do we need a better web page,
Unfortunately, Jeff just went on vacation for a week, so we won't be able to
address this right away. I know he spent a bunch of time making sure everything
worked okay with gfortran, so I expect there is something odd in the setup -
but I'm afraid I don't know all the details
On Mar 30, 2014,
Hmmm...indeed, it looks like the default versions may be out-of-date. Here is a
table showing the required rev levels:
http://www.open-mpi.org/svn/building.php
On Apr 1, 2014, at 8:26 AM, Blosch, Edwin L wrote:
> I am getting some errors building 1.8 on RHEL6. I tried autoreconf as
> sugges
Yeah, it's a change we added to resolve a problem when Slurm is configured
with TaskAffinity set. It's harmless, but annoying - I'm trying to figure
out a solution.
On Wed, Apr 2, 2014 at 11:35 AM, Dave Goodell (dgoodell) wrote:
> On Apr 2, 2014, at 12:57 PM, Filippo Spiga
> wrote:
>
> > I st
I'm having trouble understanding your note, so perhaps I am getting this wrong.
Let's see if I can figure out what you said:
* your perl command fails with "no route to host" - but I don't see any host in
your cmd. Maybe I'm just missing something.
* you tried running a couple of "mpirun", but
You haven't provided enough information to even guess at the issue - please see
my response to your last post on this question
On Apr 3, 2014, at 9:53 AM, Nisha Dhankher -M.Tech(CSE)
wrote:
> how btl_tcp_endpoint.c error 113 can be solved while executing mpiblast on
> rocks 6.0 which uses ope
We'd be happy to add them both to our examples section, and to our regression
test area, if okay with you. Feel free to send them to me offlist.
Thanks!
Ralph
On Apr 3, 2014, at 1:44 PM, Saliya Ekanayake wrote:
> Hi,
>
> I've been working on some applications in our group where I've been usin
2 hours elapsed on rocks 6.0 cluster with 12
> virtual nodes on pc's ...2 on each using virt-manager , 1 gb ram to each
>
>
>
> On Thu, Apr 3, 2014 at 8:37 PM, Ralph Castain wrote:
> I'm having trouble understanding your note, so perhaps I am getting this
>
rent
> compute nodes after partitioning of database.
> And sir have you done mpiblast ?
Nope - but that isn't the issue, is it? The issue is with the MPI setup.
>
>
> On Fri, Apr 4, 2014 at 4:48 AM, Ralph Castain wrote:
> What is "mpiformatdb"? We don't have
Fixed in r31308 and scheduled for inclusion in 1.8.1
Thanks
Ralph
On Apr 2, 2014, at 12:17 PM, Ralph Castain wrote:
> Yeah, it's a change we added to resolve a problem when Slurm is configured
> with TaskAffinity set. It's harmless, but annoying - I'm trying to fig
cks itself installed, configured openmpi and mpich on its
> own through hpc roll.
>
>
> On Fri, Apr 4, 2014 at 9:25 AM, Ralph Castain wrote:
>
> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE)
> wrote:
>
>> thank you Ralph.
>> Yes cluster is heterogeneous..
On Apr 4, 2014, at 7:39 AM, Reuti wrote:
> Am 04.04.2014 um 05:55 schrieb Ralph Castain:
>
>> On Apr 3, 2014, at 8:03 PM, Nisha Dhankher -M.Tech(CSE)
>> wrote:
>>
>>> thank you Ralph.
>>> Yes cluster is heterogeneous...
>>
>> And did
Okay, so if you run mpiBlast on all the non-name nodes, everything is okay?
What do you mean by "names nodes"?
On Apr 4, 2014, at 7:32 AM, Nisha Dhankher -M.Tech(CSE)
wrote:
> no it does not happen on names nodes
>
>
> On Fri, Apr 4, 2014 at 7:51 PM, Ralph Cast
Running out of file descriptors sounds likely here - if you have 20 procs/node,
and fully connect, each node will see 20*220 connections (you don't use tcp
between procs on the same node), with each connection requiring a file
descriptor.
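As a rough worked example with those numbers (purely illustrative): if the 20 procs/node span a dozen nodes, each proc talks to roughly 220 off-node peers, so a node carries about 20 * 220 = 4400 sockets before counting anything the app itself opens. You can check and (if permitted) raise the per-process limit with:
ulimit -n        # show the current limit
ulimit -n 8192   # raise it for the current shell (value is just an example)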
On Apr 4, 2014, at 11:26 AM, Vince Grimes wrote:
> De
It sounds like you don't have a balance between sends and recvs somewhere -
i.e., some apps send messages, but the intended recipient isn't issuing a recv
and waiting until the message has been received before exiting. If the
recipient leaves before the isend completes, then the isend will never
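A minimal sketch of what a balanced pair looks like (names and tag are arbitrary, not taken from your code):

#include <mpi.h>

/* Rank 0 isends, rank 1 posts the matching recv, and the sender waits for
 * completion before moving on - neither side exits with a request pending. */
void balanced_exchange(double *buf, int n, MPI_Comm comm)
{
    int rank;
    MPI_Comm_rank(comm, &rank);

    if (rank == 0) {
        MPI_Request req;
        MPI_Isend(buf, n, MPI_DOUBLE, 1, 99, comm, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* complete before exiting */
    } else if (rank == 1) {
        MPI_Recv(buf, n, MPI_DOUBLE, 0, 99, comm, MPI_STATUS_IGNORE);
    }
}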
What version of OMPI are you attempting to install?
Also, using /usr/local as your prefix is a VERY, VERY BAD idea. Most OS
distributions come with a (typically old) version of OMPI installed in the
system area. Overlaying that with another version can easily lead to the errors
you show.
You s
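A typical safe setup looks something like this (the install path is just an example):
./configure --prefix=$HOME/openmpi-install
make all install
export PATH=$HOME/openmpi-install/bin:$PATH
export LD_LIBRARY_PATH=$HOME/openmpi-install/lib:$LD_LIBRARY_PATH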
Looks like bit-rot has struck the sequential mapper support - I'll revive it
for 1.8.1
On Apr 6, 2014, at 7:17 PM, Chen Bill wrote:
> Hi ,
>
> I just tried the openmpi 1.8, but I found the feature --mca rmaps seq doesn't
> work.
>
> for example,
>
> >mpirun -np 4 -hostfile hostsfle --mca rm
Nope - make uninstall will not clean everything out, which is one reason we
don't recommend putting things in a system directory
On Apr 6, 2014, at 8:44 AM, Kamal wrote:
> Hi Hamid,
>
> So I can uninstall just by typing
>
> ' make uninstall ' right ?
>
> what does ' make -j2 ' do ?
>
> T
tion/lib:$LD_LIBRARY_PATH
>
> best of luck.
>
>
> On Sun, Apr 6, 2014 at 5:45 PM, Kamal wrote:
> Hi Ralph,
>
> I use OMPI 1.8 for Macbook OS X mavericks.
>
> As you said I will create a new directory to install my MPI files.
>
> Thanks for your reply,
>
Looks to me like the problem is here:
/bin/.: Permission denied.
Appears you don't have permission to exec bash??
On Apr 7, 2014, at 1:04 PM, Blosch, Edwin L wrote:
> I am submitting a job for execution under SGE. My default shell is /bin/csh.
> The script that is submitted has #!/bin/bash
I doubt that the rsh launcher is getting confused by the cmd you show below.
However, if that command is embedded in a script that changes the shell away
from your default shell, then yes - it might get confused. When the rsh
launcher spawns your remote orted, it attempts to set some envars to e
hell as bash.
>
> But telling it to check the remote shell did the trick.
>
> Thanks
>
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, April 07, 2014 4:12 PM
> To: Open MPI Users
> Subject: Re: [OMPI
I suspect it all depends on when you start the clock. If the data is sitting in
the file at time=0, then the file I/O method will likely be faster as every
proc just reads its data in parallel - no comm required as it is all handled by
the parallel file system.
I confess I don't quite understan
that can be heavily
optimized with pre-fetch and memory caching.
>
>
> On Tue, Apr 8, 2014 at 4:45 PM, Ralph Castain wrote:
> I suspect it all depends on when you start the clock. If the data is sitting
> in the file at time=0, then the file I/O method will likely be faster as
What version of OMPI are you using? We have a "seq" mapper that does what you
want, but the precise cmd line option for directing to use it depends a bit on
the version.
On Apr 9, 2014, at 9:22 AM, Gan, Qi PW wrote:
> Hi,
>
> I have a problem when setting the processes of a parallel job wi
Wow - that's an ancient one. I'll see if it can be applied to 1.8.1. These
things don't automatically go across - it requires that someone file a request
to move it - and I think this commit came into the trunk after we branched for
the 1.7 series.
On Apr 9, 2014, at 12:05 PM, Richard Shaw wr
Just to ensure I understand what you are saying: it appears that 1.8 is much
faster than 1.6.5 with the default settings, but slower when you set
btl=tcp,self?
This seems rather strange. I note that the 1.8 value is identical in the two
cases, but somehow 1.6.5 went much faster in the latter ca
. What you can do to compensate is add the --novm option
to mpirun (or use the "state_novm_select=1" MCA param) which reverts back to
the 1.6.5 behavior.
On Apr 10, 2014, at 7:00 AM, Ralph Castain wrote:
> Just to ensure I understand what you are saying: it appears that 1.8 is much
Just add "-mca rmaps seq" to your command line, then. The mapper will take your
hostfile (no rankfile) and map each proc sequentially to the listed nodes. You
need to list each node once for each proc - something like this:
nodeA
nodeB
nodeB
nodeC
nodeA
nodeC
...
would produce your described pa
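Putting it together (hostfile name, process count, and executable are placeholders), the command would look something like:
mpirun -np 6 -hostfile myhosts -mca rmaps seq ./myprogram
with the six lines above in myhosts, so ranks 0-5 land on nodeA, nodeB, nodeB, nodeC, nodeA, nodeC in that order.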
10, 2014, at 8:02 AM, Richard Shaw wrote:
> Okay. Thanks for having a look Ralph!
>
> For future reference, is there a better process I can go through if I find
> bugs like this that makes sure they don't get forgotten?
>
> Thanks,
> Richard
>
>
> On 10 April 2014
On Apr 10, 2014, at 7:58 AM, Victor Vysotskiy
wrote:
> Dear Ralph,
>
>> it appears that 1.8 is much faster than 1.6.5 with the default settings, but
>> slower when you set btl=tcp,self?
>
> Precisely. However, with the default settings both versions are much slower
> compared to other MPI
I shaved about 30% off the time - the patch is waiting for 1.8.1, but you can
try it now (see the ticket for the changeset):
https://svn.open-mpi.org/trac/ompi/ticket/4510#comment:1
I've added you to the ticket so you can follow what I'm doing. Getting any
further improvement will take a little
Interesting data. Couple of quick points that might help:
option B is equivalent to --map-by node --bind-to none. When you bind to every
core on the node, we don't bind you at all since "bind to all" is exactly
equivalent to "bind to none". So it will definitely run slower as the threads
run ac
I'm a little confused - the "no_tree_spawn=true" option means that we are *not*
using tree spawn, and so mpirun is directly launching each daemon onto its
node. Thus, this requires that the host mpirun is on be able to ssh to every
other host in the allocation.
You can debug the rsh launcher by
Please see:
http://www.open-mpi.org/faq/?category=rsh#ssh-keys
short answer: you need to be able to ssh to the remote hosts without a password
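If you haven't set that up yet, the usual recipe is something like this (hostname is a placeholder):
ssh-keygen -t rsa            # accept the defaults, empty passphrase
ssh-copy-id othernode        # repeat for every host in your allocation
ssh othernode hostname       # should now return without prompting for a password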
On Apr 11, 2014, at 1:09 AM, Lubrano Francesco
wrote:
> Dear MPI users,
> I have a problem with open-mpi (version 1.8).
> I'm just beginning to undes
The problem is with the tree-spawn nature of the rsh/ssh launcher. For
scalability, mpirun only launches a first "layer" of daemons. Each of those
daemons then launches another layer in a tree-like fanout. The default pattern
is such that you first notice it when you have four nodes in your allo
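If you would rather have mpirun launch every daemon directly (trading away some scalability), you can disable the tree with something along the lines of:
mpirun -mca plm_rsh_no_tree_spawn 1 ...
Otherwise, every node needs to be able to ssh to the others without a password.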
I'm afraid our suspend/resume support only allows the signal to be applied to
*all* procs, not selectively to some. For that matter, I'm unaware of any
MPI-level API for hitting a proc with a signal - so I'm not sure how you would
programmatically have rank0 suspend some other ranks.
On Apr 11,
Hmmm...well, first ensure you configured --enable-debug, and then add "-mca
plm_base_verbose 10 --debug-daemons" to your mpirun cmd line. This will tell
you what is happening during the launch.
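For example (hosts and executable are placeholders):
mpirun -mca plm_base_verbose 10 --debug-daemons -np 2 -host node1,node2 ./a.out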
On Apr 13, 2014, at 12:31 PM, Lubrano Francesco
wrote:
> Sorry for my late reply
> I tried previou
I'm confused - how are you building OMPI?? You normally have to do:
1. ./configure --prefix=<install-dir>   (this is where you would add --enable-debug)
2. make clean all install
You then run your mpirun command as you've done.
On Apr 14, 2014, at 12:52 AM, Lubrano Francesco
wrote:
> I can't set --en
On Apr 13, 2014, at 11:42 AM, Allan Wu wrote:
> Thanks, Ralph!
>
> Adding MCA parameter 'plm_rsh_no_tree_spawn' solves the problem.
>
> If I understand correctly, the first layer of daemons are three nodes, and
> when there are more than three nodes the second layer of daemons are spawn.
>
I'm still poking around, but would appreciate a little more info to ensure I'm
looking in the right places. How many nodes are you running your application
across for your verification suite? I suspect it isn't just one :-)
On Apr 10, 2014, at 9:19 PM, Ralph Castain wrote:
Have you tried a typical benchmark (e.g., NetPipe or OMB) to ensure the problem
isn't in your program? Outside of that, you might want to explicitly tell it to
--bind-to core just to be sure it does so - it's supposed to do that by
default, but might as well be sure. You can check by adding --re
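Something like this shows both the bindings and your program's behavior (process count and program name are placeholders):
mpirun --bind-to core --report-bindings -np 16 ./your_app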
Have you tried using a debugger to look at the resulting core file? It will
probably point you right at the problem. Most likely a case of overrunning
some array when #temps > 5
On Tue, Apr 15, 2014 at 10:46 AM, Oscar Mojica wrote:
> Hello everybody
>
> I implemented a parallel simulated annea
st3:07134] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././.]
> [host4:10282] MCW rank 1 bound to socket 0[core 0[hwt 0]]: [B/././.]
>
>
>
>
> On Tue, Apr 15, 2014 at 8:39 PM, Ralph Castain wrote:
>
>> Have you tried a typical benchmark (e.g., NetPipe or OMB) to ensure
Thanks Victor! Sorry for the problem, but appreciate you bringing it to our
attention.
Ralph
On Wed, Apr 16, 2014 at 5:16 AM, Victor Vysotskiy <
victor.vysots...@teokem.lu.se> wrote:
> Hi,
>
> I just will confirm that the issue has been fixed. Specifically, with the
> latest OpenMPI v1.8.1a1r3
The Java bindings are written on top of the C bindings, so you'll be able
to use those networks just fine from Java :-)
On Wed, Apr 16, 2014 at 2:27 PM, Saliya Ekanayake wrote:
> Thank you Nathan, this is what I was looking for. I'll try to build
> OpenMPI 1.8 and get back to this thread if I
Unfortunately, each execution of mpirun has no knowledge of where the procs
have been placed and bound by another execution of mpirun. So what is
happening is that the procs of the two jobs are being bound to the same
cores, thus causing contention.
If you truly want to run two jobs at the same ti
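One blunt workaround (the options and names here are placeholders, not necessarily what the rest of this message suggested) is to disable binding for both jobs:
mpirun --bind-to none -np 8 ./job1 &
mpirun --bind-to none -np 8 ./job2 &
The procs then float across the cores and at least won't be pinned onto the same ones.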
Sounds like either a routing problem or a firewall. Are there multiple NICs on
these nodes? Looking at the quoted NIC in your error message, is that the
correct subnet we should be using?
Have you checked to ensure no firewalls exist on that subnet between the nodes?
On Apr 24, 2014, at 8:41 A
Hmmm...we haven't heard of a problem like that, but if you don't have Xeon Phi
devices on your machine, one simple workaround would be to add
--enable-mca-no-build=btl-scif
to your configure line
On Apr 25, 2014, at 10:22 AM, Andrus, Brian Contractor wrote:
> All,
>
> I have been unable to com
We don't fully support THREAD_MULTIPLE, and most definitely not when using IB.
We are planning on extending that coverage in the 1.9 series
On Apr 25, 2014, at 2:22 PM, Markus Wittmann wrote:
> Hi everyone,
>
> I'm using the current Open MPI 1.8.1 release and observe
> non-deterministic deadl
My bad - forgot to remove a stale line of code exposed by the
--enable-heterogeneous option. Fixed in r31567
Sorry about that...
On Apr 30, 2014, at 8:11 AM, Siegmar Gross
wrote:
> Hi,
>
> I tried to install openmpi-1.9a1r31561 on my machines (openSUSE
> Linux 12.1 x86_64, Solaris 10 x86_64,
have the CPU bindings shown as
>> well
>>
>> * If using "--report-bindings --bind-to-core" with OpenMPI 1.4.1 then the
>> bindings on just the head node are shown. In 1.6.1, full bindings across
>> all hosts are shown. (I'd have to read release notes on this...)
Hmmm...just testing on my little cluster here on two nodes, it works just fine
with 1.8.2:
[rhc@bend001 v1.8]$ mpirun -n 2 --map-by node ./a.out
In rank 0 and host= bend001 Do Barrier call 1.
In rank 0 and host= bend001 Do Barrier call 2.
In rank 0 and host= bend001 Do Barrier call 3.
In r
There is a known bug in the 1.8.1 release whereby daemons failing to start on a
remote node will cause a silent failure. This has been fixed for the upcoming
1.8.2 release, but you might want to use one of the nightly 1.8.2 snapshots in
the interim.
Most likely causes:
* not finding the requir
Hmmm...that is indeed odd. What version of OMPI are you using?
You might try adding "-mca plm rsh" to your cmd line - this will ensure the
launcher isn't trying to use srun under the covers. However, it shouldn't have
built the slurm support if you specifically asked us not to do so.
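For example (hosts and executable are placeholders):
mpirun -mca plm rsh -np 4 -host node1,node2 ./a.out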
On May 7
ssee(s), and may not be passed on to, or made available for use by any
> person other than the addressee(s). Any and every liability resulting from
> any electronic transmission is ruled out.
> If you are not the intended recipient, please contact the sender by reply
> email and dest
What version are you talking about?
On May 13, 2014, at 11:13 PM, Hamed Mortazavi wrote:
> Hi all,
>
> in make check for openmpi on a mac I see following error message, has anybody
> ever run to this error? any solutions?
>
> Best,
>
> Hamed,
> raw extraction in 1 microsec
>
> Example 3.
You might give it a try with 1.8.1 or the nightly snapshot from 1.8.2 - we
updated ROMIO since the 1.6 series, and whatever fix is required may be in the
newer version
On May 14, 2014, at 6:52 AM, CANELA-XANDRI Oriol
wrote:
> Hello,
>
> I am using MPI IO for writing/reading a block cyclic
What are the interfaces on these machines?
On May 14, 2014, at 7:45 AM, Siegmar Gross
wrote:
> Hi,
>
> I just installed openmpi-1.8.2a1r31742 on my machines (Solaris 10
> Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with
> Sun C5.12 and still have the following problem.
>
> tyr
auber wrote:
> Is there an ETA for 1.8.2 general release instead of snapshot?
>
> Thanks, -- bennet
>
> On Wed, May 14, 2014 at 10:17 AM, Ralph Castain wrote:
>> You might give it a try with 1.8.1 or the nightly snapshot from 1.8.2 - we
>> updated ROMIO since the
Hmmm...well, that's an interesting naming scheme :-)
Try adding "-mca oob_base_verbose 10 --report-uri -" on your cmd line and let's
see what it thinks is happening
On May 14, 2014, at 9:06 AM, Siegmar Gross
wrote:
> Hi Ralph,
>
>> What are the interfaces on these machines?
>
> tyr fd1026
Just committed a potential fix to the trunk - please let me know if it worked
for you
On May 14, 2014, at 11:44 AM, Siegmar Gross
wrote:
> Hi Ralph,
>
>> Hmmm...well, that's an interesting naming scheme :-)
>>
>> Try adding "-mca oob_base_verbose 10 --report-uri -" on your cmd line
>> and le
FWIW: I believe we no longer build the slurm support by default, though I'd
have to check to be sure. The intent is definitely not to do so.
The plan we adjusted to a while back was to *only* build support for schedulers
upon request. Can't swear that they are all correctly updated, but that was
t for various schedulers, and so just finding the
required headers isn't enough to know that the scheduler is intended for use.
So we wind up building a bunch of useless modules.
On May 14, 2014, at 3:09 PM, Ralph Castain wrote:
> FWIW: I believe we no longer build the slurm suppo
On May 14, 2014, at 3:21 PM, Jeff Squyres (jsquyres) wrote:
> On May 14, 2014, at 6:09 PM, Ralph Castain wrote:
>
>> FWIW: I believe we no longer build the slurm support by default, though I'd
>> have to check to be sure. The intent is definitely not to do so.
>
you don't tell me to not-build"
Tough set of compromises as it depends on the target audience. Sys admins
prefer the "build only what I say", while users (who frequently aren't that
familiar with the inners of a system) prefer the "build all" mentality.
On Ma
Just sniffing around the web, I found that this is a problem caused by newer
versions of gcc. One reporter stated that they resolved the problem by adding
"-fgnu89-inline" to their configuration:
"add the compiler flag "-fgnu89-inline" (because of an issue where old glibc
libraries aren't compa
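That would translate to something like this at configure time (purely illustrative - passing the flag via CFLAGS is an assumption, and the install dir is a placeholder):
./configure CFLAGS=-fgnu89-inline --prefix=<install-dir>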
more than one scheduler.
>
> Maxime
>
> On 2014-05-14 19:09, Ralph Castain wrote:
>> Jeff and I have talked about this and are approaching a compromise. Still
>> more thinking to do, perhaps providing new configure options to "only build
>> what I ask
It is an unrelated bug introduced by a different commit - causing mpirun to
segfault upon termination. The fact that you got the hostname to run indicates
that this original fix works, so at least we know the connection logic is now
okay.
Thanks
Ralph
On May 15, 2014, at 3:40 AM, Siegmar Gros
Hi Gus
The issue is that you have to work thru all the various components (leafing
thru the code base) to construct a list of all the things you *don't* want to
build. By default, we build *everything*, so there is no current method to
simply "build only what I want".
For those building static
What do you mean "goes through orte component"? It will still call into the
orte code base, but will use PMI to do the modex.
On May 15, 2014, at 12:54 PM, Hadi Montakhabi wrote:
> Hello,
>
> I am trying to utilize pmi instead of orte, but I come across the following
> problem.
> I do configu
I'm not sure of the issue, but so far as I'm aware the cpus-per-proc
functionality continued to work thru all those releases and into today. Yes,
the syntax changed during the 1.7 series to reflect a broader desire to
consolidate options into something that could be contained in a minimum number
e rte framework, namely orte and pmi.
> The question is whether pmi could be used independent from orte? Or it needs
> orte to function?
>
> Peace,
> Hadi
>
>
> On Thu, May 15, 2014 at 2:59 PM, Ralph Castain wrote:
> What do you mean "goes through orte component&q
This is on a Windows box? If so, I don't know if anyone built/posted a 64-bit
release version for Windows (you might check the OMPI site and see if there is
something specific for 64-bit), and we don't support Windows directly any more.
You might also look at the cygwin site for a downloadable v
On May 15, 2014, at 2:34 PM, Fabricio Cannini wrote:
> Em 15-05-2014 07:29, Jeff Squyres (jsquyres) escreveu:
>> I think Ralph's email summed it up pretty well -- we unfortunately have (at
>> least) two distinct groups of people who install OMPI:
>>
>> a) those who know exactly what they want
On May 15, 2014, at 4:15 PM, Maxime Boissonneault
wrote:
> On 2014-05-15 18:27, Jeff Squyres (jsquyres) wrote:
>> On May 15, 2014, at 6:14 PM, Fabricio Cannini wrote:
>>
>>> Alright, but now I'm curious as to why you decided against it.
>>> Could please elaborate on it a bit ?
>> OMPI has
Nobody is disagreeing that one could find a way to make CMake work - all we are
saying is that (a) CMake has issues too, just like autotools, and (b) we have
yet to see a compelling reason to undertake the transition...which would have
to be a *very* compelling one.
On May 15, 2014, at 4:45 PM
.
>
> Josh
>
>
> On Thu, May 15, 2014 at 4:13 PM, Ralph Castain wrote:
> I wouldn't trust that PMI component in the RTE framework - it was only
> created as a test example for that framework. It is routinely broken and not
> maintained, and can only be used i
you might try the nightly 1.8.2 build - there were some additional patches to
fix the darned tkr support. I'm afraid getting all the various compilers to
work correctly with it has been a major pain.
On May 15, 2014, at 5:01 PM, W Spector wrote:
> Hi Jeff and the list,
>
> A year ago, we had
wrote:
> Ralph is right.
> I used 1.8, and after digging into it, I noticed it doesn't even compile the
> pmi component. When I tried to configure without orte, I could see the errors
> while compiling.
> It looks like it is well broken!
>
> Peace,
> Hadi
>
>
Done - will be in nightly 1.8.2 tarball generated later today.
On May 16, 2014, at 2:57 AM, Siegmar Gross
wrote:
> Hi,
>
>> This bug should be fixed in tonight's tarball, BTW.
> ...
>>> It is an unrelated bug introduced by a different commit -
>>> causing mpirun to segfault upon termination.
On May 16, 2014, at 1:03 PM, Fabricio Cannini wrote:
> Em 16-05-2014 10:06, Jeff Squyres (jsquyres) escreveu:
>> On May 15, 2014, at 8:00 PM, Fabricio Cannini
>> wrote:
>>
Nobody is disagreeing that one could find a way to make CMake
work - all we are saying is that (a) CMake has iss