I can confirm that both 1.3rc6 and 1.2.9rc2 now build fine for me.
-Paul
George Bosilca wrote:
Paul,
Thanks for noticing the Elan problem. It appears we're missing one patch in
the 1.3 branch (https://svn.open-mpi.org/trac/ompi/changeset/20122). I'll
file a CMR asap.
Thanks,
george.
On Jan 13, 2
Brian and I were just reminiscing today that we sometimes got up to
b56 before a successful LAM release. So I think we're still doing ok
here.
:p
On Jan 14, 2009, at 4:44 PM, Tim Mattox wrote:
Hi All,
The sixth (yes 6!) release candidate of Open MPI v1.3 is now
available:
http://www.o
Hi All,
The sixth (yes 6!) release candidate of Open MPI v1.3 is now available:
http://www.open-mpi.org/software/ompi/v1.3/
Please run it through its paces as best you can.
Anticipated release of 1.3 is tomorrow morning.
This only has a fix for a segfault in coll_hierarch_component.c
with resp
r20275 looks good. I suggest that we CMR that into 1.3 and get rc6 rolled
and tested. (actually, Jeff just did the CMR...so off to rc6)
--brad
On Wed, Jan 14, 2009 at 1:16 PM, Edgar Gabriel wrote:
> so I am not entirely sure why the bug only happened on trunk, it could in
> theory also appear
Sorry, I have searched the whole day for a solution to that problem, but
unfortunately, I'm clueless :-( I cannot say which flag causes the
compile error. Furthermore, I'm also unable to reproduce this error on
several other platforms.
The coding style in the source file in question also doesn't look s
so I am not entirely sure why the bug only happened on trunk; it could
in theory also appear on v1.3 (is there a difference in how
pointer_arrays are handled between the two versions?)
Anyway, it passes now on both with changeset 20275. We should probably
move that over to 1.3 as well, whether
So, if it looks okay on 1.3...then there should not be anything holding up
the release, right? Otherwise, George, we need to decide whether or not
this is a blocker, or if we go ahead and release with this as a known issue
and schedule the fix for 1.3.1. My vote is to go ahead and release, but
The crcpw component is in the PML framework. The following should be
the MCA parameter you are looking for:
pml_crcpw_verbose=20
You can use the 'ompi_info' command to find out more information about
the available MCA parameters. For example, to find this one you can use
the following:
ompi_i
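Something along these lines should work (the exact ompi_info syntax may
differ slightly between versions, and './my_app' is just a placeholder):
  ompi_info --param pml crcpw
  mpirun -mca pml_crcpw_verbose 20 ./my_app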
I think you'd like to know more than just how many procs are local.
E.g., if the chunk or eager limits are changed much, that would impact
how much memory you'd like to allocate.
A phone chat is all right for me, though so far all I've heard is that
no one understands the code!
But, maybe w
In case a parallel debugger is not available, I'm using mpirun
*blahblah* xterm -e gdb [app_name] and this works pretty well as long
as ssh forwards the X11 display.
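For example, something like this usually does the trick (the process count
and program name are just placeholders):
  mpirun -np 2 xterm -e gdb ./my_app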
Hope this helps,
george.
On Jan 14, 2009, at 13:04, Edgar Gabriel wrote:
I'm already debugging it. The good news is
On Jan 14, 2009, at 1:04 PM, Edgar Gabriel wrote:
I'm already debugging it. The good news is that it only seems to
appear with trunk; with 1.3 (after copying the new tuned module
over), all the tests pass.
Ooooh.. that *is* good news!
Now if somebody can tell me a trick on how to tell mpir
Excellent; I'm starting an MTT run now, including hierarch variants.
Note that this MTT will take many hours (5-10?).
On Jan 14, 2009, at 1:04 PM, Tim Mattox wrote:
Hi All,
The fifth release candidate of Open MPI v1.3 is now available:
http://www.open-mpi.org/software/ompi/v1.3/
Please run i
Hi All,
The fifth release candidate of Open MPI v1.3 is now available:
http://www.open-mpi.org/software/ompi/v1.3/
Please run it through its paces as best you can.
Anticipated release of 1.3 is tonight/tomorrow. (again)
--
Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
tmat...@gmail.com
I'm already debugging it. The good news is that it only seems to appear
with trunk; with 1.3 (after copying the new tuned module over), all the
tests pass.
Now if somebody can tell me a trick on how to tell mpirun not to kill the
debugger under my feet, then I could even see where the problem occ
All these errors are in MPI_Finalize; it should not be that hard
to find. I'll take a look later this afternoon.
george.
On Jan 14, 2009, at 06:41, Tim Mattox wrote:
Unfortunately, although this fixed some problems when enabling
hierarch coll,
there is still a segfault in two of IU's
Hi,
What variable should I set to increase the verbosity of the crcpw component?
I've tried "ompi_crcpw_verbose=20" and "crcpw_base_verbose=20". How
can I figure out the name of the variable?
Regards,
Caciano
To followup for the web archives -- we discussed this more off-list.
AFAIK, compiling Open MPI -- including its memory registration cache
-- works fine in 32 bit mode, even on 64 bit platforms (there was some
confusion between virtual and physical memory addresses and who uses
what [OMPI *
We -may- be able to do a more formal XML output at some point. The
problem will be the natural interleaving of stdout/err from the
various procs due to the async behavior of MPI. Mpirun receives
fragmented output in the forwarding system, limited by the buffer
sizes and the amount of data w
I also know little about that part of the code, but agree that does
seem weird. Seeing as we know how many local procs there are before we
get to this point, I would think we could be smart about our memory
pool size. You might not need to dive into the sm BTL to get the info
you need - if
Whoa, this analysis rocks. :-) I'm going through trying to grok it
all...
Just wanted to say: kudos for this.
On Jan 14, 2009, at 1:14 AM, Eugene Loh wrote:
RFC: Fragmented sm Allocations
WHAT: Dealing with the fragmented allocations of sm BTL FIFO
circular buffers (CB) during MPI_Init(
Ya, that does seem weird to me, but I never fully grokked the whole
mpool / allocator scheme (I haven't had to interact with that part of
the code much).
Would it be useful to get on the phone and discuss this stuff?
On Jan 14, 2009, at 1:11 AM, Eugene Loh wrote:
Thanks for the reply. I
Ralph,
The only time we use the resolved names is when we get a map, so we
consider them part of the map output.
If quasi-XML is all that will ever be possible with 1.3, then you may
as well leave as-is and we will attempt to clean it up in Eclipse. It
would be nice if a future version of
If your timer is actually generating an interrupt to the process, then
that could be the source of the problem. I believe the event library
also treats interrupts as events, and assigns them the highest
priority. So every one of your interrupts would cause the event
library to stop what it
I haven't reviewed the code either, but really do appreciate someone
taking the time for such a thorough analysis of the problems we have
all observed for some time!
Thanks Eugene!!
On Jan 14, 2009, at 5:05 AM, Tim Mattox wrote:
Great analysis and suggested changes! I've not had a chance
Is there some code that can be fixed instead? I.e., is this feature
totally incompatible with whatever RPM compiler flags are used, or is
it just some coding style that these particular flags don't like?
On Jan 14, 2009, at 5:05 AM, Matthias Jurenz wrote:
Another workaround should be to d
Great analysis and suggested changes! I've not had a chance yet
to look at your hg branch, so this isn't a code review... Barring a
bad code review, I'd say these changes should all go in the trunk
for inclusion in 1.4.
2009/1/14 Eugene Loh :
>
>
> RFC: Fragmented sm Allocations
>
> WHAT: Dealin
Unfortunately, although this fixed some problems when enabling hierarch coll,
there is still a segfault in two of IU's tests that only shows up when we set
-mca coll_hierarch_priority 100
See this MTT summary for how the failures improved on the trunk,
but note that there are still two that segfault
Another workaround would be to disable the I/O tracing feature of VT
by adding the configure option
'--with-contrib-vt-flags=--disable-iotrace'
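For example, a configure line along these lines (any other options are
whatever you would normally pass):
  ./configure --with-contrib-vt-flags=--disable-iotrace [other options]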
That will have the effect that the upcoming OMPI RPMs will have no support
for I/O tracing, but in our opinion that is not so bad...
Furthermore, we
RFC: Fragmented sm Allocations
WHAT: Dealing with the fragmented allocations of sm BTL FIFO
circular buffers (CB) during MPI_Init().
Also:
Improve handling of error codes.
Automate the sizing of the mmap file.
WHY: To reduce consumption of sha
Thanks for the reply. I kind of understand, but it's rather weird. The
BTL calls mca_mpool_base_module_create() to create a pool of memory, but
the BTL has no say in how big a pool to create? Could you imagine
having a memory allocation routine ("malloc" or something) that didn't
allow you t
Here we go by the book :)
https://svn.open-mpi.org/trac/ompi/ticket/1749
george.
On Jan 13, 2009, at 23:40, Jeff Squyres wrote:
Let's debate tomorrow when people are around, but first you have to
file a CMR... :-)
On Jan 13, 2009, at 10:28 PM, George Bosilca wrote:
Unfortunately, this