Something is broken in the trunk.
# mpirun -np 2 -H host1,host2 ./osu_latency
--
Some of the requested hosts are not included in the current allocation.
The requested hosts were specified with --host as:
host1,host2
Please
Sorry about that. I removed a field in a structure, then 'svn up' seems
to have added it back, so we were using a field that should not even
exist in a couple places.
Should be fixed in r17757
Tim
Gleb Natapov wrote:
Something is broken in the trunk.
# mpirun -np 2 -H host1,host2 ./osu_lat
On Thu, Mar 06, 2008 at 07:49:13AM -0500, Tim Prins wrote:
> Sorry about that. I removed a field in a structure, then 'svn up' seems
> to have added it back, so we were using a field that should not even
> exist in a couple places.
>
> Should be fixed in r17757
Works again. Thanks
--
Tim and I talked about this on IM. We'd like to amend the proposal:
1. Remove these tests from make check, but leave them in SVN per the
original proposal.
2. File a ticket to make carto selection not fail when no components
are found (I filed https://svn.open-mpi.org/trac/ompi/ticket/1232).
I believe I have at least helped reduce this with r17761. I added the
ability for procs to detect that their "lifeline" connection (either the HNP
for unity routed, or their local daemon for tree) has been lost and
gracefully abort.
Let me know if that helps
Ralph
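
A minimal sketch of the lifeline idea described above, assuming a process keeps a stream socket open to its lifeline and treats end-of-file on that socket as "lifeline lost". This is not the code from r17761; `lifeline_lost` and `abort_if_lifeline_lost` are hypothetical names used only for illustration, and plain Python sockets stand in for ORTE's own communication layer.

```python
import socket


def lifeline_lost(conn: socket.socket) -> bool:
    """Return True if the peer end of `conn` has gone away.

    Assumption: the lifeline is a connected stream socket. A non-blocking
    peek that returns zero bytes means the peer closed the connection.
    """
    conn.setblocking(False)
    try:
        data = conn.recv(1, socket.MSG_PEEK)  # peek, do not consume
    except BlockingIOError:
        return False  # nothing to read, but the link is still open
    except ConnectionResetError:
        return True   # peer vanished abruptly
    finally:
        conn.setblocking(True)
    return data == b""


def abort_if_lifeline_lost(conn: socket.socket, cleanup) -> bool:
    """Run `cleanup` (hypothetical graceful-abort hook) if the lifeline is gone.

    Returns True when the abort path was taken, False otherwise.
    """
    if lifeline_lost(conn):
        cleanup()
        return True
    return False
```

In a real process the check would be driven by the event loop rather than polled, and the cleanup hook would end by exiting the process; the sketch only shows the detection step.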
On 3/4/08 9:37 PM, "Aurélien B
Hello
I've been doing some work on fault response within the system, and finally
realized something I should probably have seen a while back. Perhaps I am
misunderstanding somewhere, so forgive the ignorance if so.
When we designed ORTE some time in the deep, dark past, we had envisioned
that peop
The checkpoint/restart work that I have integrated does not respond to
failure at the moment. If a failure happens I want ORTE to terminate
the entire job. I will then restart the entire job from a checkpoint
file. This follows the 'all fall down' approach that users typically
expect when
Ah - ok, thanks for clarifying! I'm happy to leave it around, but wasn't
sure if/where it fit into anyone's future plans.
Thanks
Ralph
On 3/6/08 9:13 AM, "Josh Hursey" wrote:
> The checkpoint/restart work that I have integrated does not respond to
> failure at the moment. If a failure happen
In the usual place:
http://www.open-mpi.org/software/ompi/v1.2/
It contains a few changes, such as the new
pml_ob1_use_early_completion MCA parameter:
http://svn.open-mpi.org/svn/ompi/branches/v1.2/NEWS
--
Jeff Squyres
Cisco Systems
Hi All,
The "first" (actually rc2) release candidate of Open MPI v1.2.6 is now up:
http://www.open-mpi.org/software/ompi/v1.2/
Please run it through its paces as best you can.
--
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
tmat...@gmail.com || timat...@open-mpi.org
I'm a bright..
Aside from what Josh said, we are working right now at UTK on orted/MPI
recovery (without killing/respawning all). So far we have had no use for
the errmgr, but I'm quite sure this would be the smartest place to
put all the mechanisms we are trying now.
Aurelien
On Mar 6, 2008, at 11:17, Ralph Casta
This still has a race condition... which can be dealt with using
opal_atomic stuff.
See below.
On Thu, Mar 6, 2008 at 2:35 PM, wrote:
> Author: rhc
> Date: 2008-03-06 14:35:57 EST (Thu, 06 Mar 2008)
> New Revision: 17766
> URL: https://svn.open-mpi.org/trac/ompi/changeset/17766
>
> Log:
> F
On Mar 5, 2008, at 1:50 PM, Greg Watson wrote:
Looking back through the mailing list, I can only see two references
that seem relevant to this. One was titled "Major reduction in ORTE"
and does allude to the event model changes. The other "OMPI/ORTE and
tools" talks about "alternative methods of
In ompi/contrib/vt/vt/extlib/otf/acinclude.m4, in the macros WITH_DEBUG
and WITH_VERBOSE, dubious constructs such as
AC_CACHE_CHECK([debug],
[debug],
[debug=])
are used. These have the following problems:
* Cache variables need to match *_cv_* in order to actually be saved
(
Thanks Tim - good suggestion! Had to modify your proposed code a tad to get
it to compile and work, but it is definitely a cleaner solution.
Ralph
On 3/6/08 1:34 PM, "Tim Mattox" wrote:
> This still has a race condition... which can be dealt with using
> opal_atomic stuff.
> See below.
>
> On
FYI: since I was the one who stirred up the hornet's nest a while
ago :-), I thought I'd update everyone -- we're actually *not* going
to use libev anymore. We're simply going to update to a newer version
of libevent, which seems to have all the things that we care about
(better performanc
Hello,
I've just stumbled over three testsuite failures on GNU/Linux x86,
with an out-of-tree build (mkdir build; cd build;
../ompi_trunk/configure -C). Hope I'm not completely off-topic here...
Cheers,
Ralf
PASS: ompi_bitmap
-
Nope, you're not off-topic at all. This has been a debate among us
developers for a few days now... :-)
The issue is that these tests are now doing something that assumes that
OMPI has been installed. We've sent an RFC around to the developers
proposing how to fix it (easy solution: just r