On Tue, 11 Aug 2009, Rainer Keller wrote:
When compiling on systems with MX or Portals, we offer MTLs and BTLs.
If MTLs are used, the PML/CM is loaded as well as the PML/OB1.
Question 1: Is favoring OB1 over CM required for any MTL (MX, Portals, PSM)?
George has in the past had strong feelin
On Wed, 5 Aug 2009, Josh Hursey wrote:
On Aug 5, 2009, at 11:35 AM, Brian W. Barrett wrote:
Josh -
Just in case it wasn't clear -- if you're only looking for a symbol in the
executable (which you know is there), you do *NOT* have to dlopen() the
executable first (you do with
cutable, in
which case you might be better off just using dlsym() directly. If you're
looking for a symbol first place it's found, then you can just do:
dlsym(RTLD_DEFAULT, symbol);
The lt_dlsym only really helps if you're running on really obscure
platforms which don't su
On Sun, 2 Aug 2009, Ralph Castain wrote:
Perhaps a bigger question needs to be addressed - namely, does the ob1 code
need to be refactored?
Having been involved a little in the early discussion with bull when we
debated over where to put this, I know the primary concern was that the code
not
If you're
looking for a symbol first place it's found, then you can just do:
dlsym(RTLD_DEFAULT, symbol);
The lt_dlsym only really helps if you're running on really obscure
platforms which don't support dlsym and loading "preloaded" components.
Brian
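A minimal sketch of the direct dlsym() route described above, assuming a
glibc-style platform where RTLD_DEFAULT is exposed once _GNU_SOURCE is
defined. The helper name find_symbol_anywhere is made up for illustration
and is not part of Open MPI or libltdl:

```c
#define _GNU_SOURCE   /* assumption: needed on glibc to expose RTLD_DEFAULT */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the address of the first definition of
 * `symbol` found in the default search order (the executable and the
 * libraries loaded with it), or NULL if the lookup fails. */
void *find_symbol_anywhere(const char *symbol)
{
    void *addr;

    dlerror();                         /* clear any stale error state */
    addr = dlsym(RTLD_DEFAULT, symbol);
    if (dlerror() != NULL) {           /* non-NULL means the lookup failed */
        return NULL;
    }
    return addr;
}
```

Note that a symbol defined in the executable itself is only visible this
way if it was exported to the dynamic symbol table (with GNU toolchains,
for example, by linking with -rdynamic), and that older glibc versions
need -ldl at link time.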
On Wed, 29 Jul 2009, Jeff Squyres wrote:
On Jul 28, 2009, at 1:56 PM, Ralf Wildenhues wrote:
- support files are not versioned (e.g., show_help text files)
- include files are not versioned (e.g., mpi.h)
- OMPI's DSOs actually are versioned, but more work would be needed
in this area to make t
What are you trying to do with lt_dlopen? It seems like you should always
go through the MCA base utilities. If one's missing, adding it there
seems like the right mechanism.
Brian
On Wed, 29 Jul 2009, Josh Hursey wrote:
George suggested that to me as well yesterday after the meeting. So we
On Thu, 23 Jul 2009, Jeff Squyres wrote:
There are two solutions I can think of. Which should we do?
a. Pass the (max?) PML header size down into the BTL during
initialization such that the btl_XXX_eager_limit can
represent the max MPI data payload size (i.e., the BTL can siz
The current autodetect implementation seems like the wrong approach to me.
I'm rather unhappy the base functionality was hacked up like it was
without any advanced notice or questions about original design intent.
We seem to have a set of base functions which are now more unreadable than
before
On Wed, 15 Jul 2009, Lisandro Dalcin wrote:
The MPI 2-1 standard says:
"MPI_PROC_NULL is a valid target rank in the MPI RMA calls
MPI_ACCUMULATE, MPI_GET, and MPI_PUT. The effect is the same as for
MPI_PROC_NULL in MPI point-to-point communication. After any RMA
operation with rank MPI_PROC_NUL
On Thu, 25 Jun 2009, Eugene Loh wrote:
I spoke with Brian and Jeff about this earlier today. Presumably, up through
1.2, mca_btl_component_progress would poll and if it received a message
fragment would return. Then, presumably in 1.3.0, behavior was changed to
keep polling until the FIFO wa
All -
Jeff, Eugene, and I had a long discussion this morning on the sm BTL flow
management issues and came to a couple of conclusions.
* Jeff, Eugene, and I are all convinced that Eugene's addition of polling
the receive queue to drain acks when sends start backing up is required
for deadloc
On Wed, 24 Jun 2009, Eugene Loh wrote:
Brian Barrett wrote:
Or go to what I proposed and USE A LINKED LIST! (as I said before, not an
original idea, but one I think has merit) Then you don't have to size the
fifo, because there isn't a fifo. Limit the number of send fragments any
one p
I think that sounds like a rational path forward. Another, more long
term, option would be to move from the FIFOs to a linked list (which can
even be atomic), which is what MPICH does with nemesis. In that case,
there's never a queue to get backed up (although the receive queue for
collective
Well, this may just be another sign that the push of the DDT to OPAL is a
bad idea. That's been my opinion from the start, so I'm biased. But OPAL
was intended to be single process systems portability, not MPI crud.
Brian
On Mon, 1 Jun 2009, Rainer Keller wrote:
Hmm, OK, I see.
However, I
I have to agree with Jeff's concerns.
Brian
On Mon, 1 Jun 2009, Jeff Squyres wrote:
Hmm. I'm not sure that I like this commit.
George, Brian, and I specifically kept Fortran out of (the non-generated code
in) opal because the MPI layer is the *only* layer that uses Fortran. There
was one
On Thu, 14 May 2009, Jeff Squyres wrote:
On May 14, 2009, at 2:22 PM, Brian W. Barrett wrote:
We actually took pains to *not* do that; we *used* to do that and
explicitly
took it out. :-\ IIRC, it had something to do with dlopen'ing
libmpi.so...?
Actually, I think that was something
On Thu, 14 May 2009, Ralf Wildenhues wrote:
Hi Brian,
* Brian W. Barrett wrote on Thu, May 14, 2009 at 08:22:58PM CEST:
Actually, I think that was something else. Today, libopen-rte.la lists
libopen-pal.la as a dependency and libmpi.la lists libopen-rte.la. I had
removed the dependency of
On Thu, 14 May 2009, Jeff Squyres wrote:
On May 14, 2009, at 1:46 PM, Ralf Wildenhues wrote:
A more permanent workaround could be in OpenMPI to list each library
that is used *directly* by some other library as a dependency. Sigh.
We actually took pains to *not* do that; we *used* to do tha
On Wed, 6 May 2009, Ralph Castain wrote:
Any thoughts on this? Should we change it?
Yes, we should change this (IMHO) :).
If so, who wants to be involved in the re-design? I'm pretty sure it would
require some modification of the paffinity framework, plus some minor mods to
the odls framewo
first days of Open
MPI...
Thanks
Edgar
Brian W. Barrett wrote:
On Thu, 30 Apr 2009,
On Thu, 30 Apr 2009, Ralph Castain wrote:
well, that's only because the code's doing something it shouldn't.
Have a look at comm_cid.c:185 - there's the check we added to the
multi-threaded case (which was the only case when we added it).
The cid generation should never generate a number larger
On Thu, 30 Apr 2009, Edgar Gabriel wrote:
Brian W. Barrett wrote:
When we added the CM PML, we added a pml_max_contextid field to the PML
structure, which is the max size cid the PML can handle (because the
matching interfaces don't allow 32 bits to be used for the cid). At the
same
t. this is not new, so if there is a
discrepancy between what the comm structure assumes that a cid is and what
the pml assumes, then this was in the code since the very first days of Open
MPI...
Thanks
Edgar
Brian W. Barrett wrote:
On Thu, 30 Apr 2009, Ralph Castain wrote:
We seem to have hit
On Thu, 30 Apr 2009, Ralph Castain wrote:
We seem to have hit a problem here - it looks like we are seeing a
built-in limit on the number of communicators one can create in a
program. The program basically does a loop, calling MPI_Comm_split each
time through the loop to create a sub-communicato
I'm going to stay out of the debate about whether Andy correctly
characterized the two points you brought up as a distributed OS or not.
Sandia's position on these two points remains the same as I previously
stated when the question was distributed OS or not. The primary goal of
the Open MPI
On Wed, 11 Mar 2009, Andrew Lumsdaine wrote:
Hi all -- There is a meta question that I think is underlying some of the
discussion about what to do with BTLs etc. Namely, is Open MPI an MPI
implementation with a portable run time system -- or is it a distributed OS
with an MPI interface? It s
On Wed, 11 Mar 2009, Richard Graham wrote:
Brian,
Going back over the e-mail trail it seems like you have raised two
concerns:
- BTL performance after the change, which I would take to be
- btl latency
- btl bandwidth
- Code maintainability
- repeated code changes that impact a large number
nst it in the default OMPI configuration; other RTEs that want to
do more meaningful stuff will need to provide more meaningful
implementations of the stubs and hooks.
- Hopefully the teleconference time tomorrow works out for Rich (his
communications were unclear on this point). Otherwise,
I, not surprisingly, have serious concerns about this RFC. It assumes that
the ompi_proc issues and bootstrapping issues (the entire point of the
move, as I understand it) can both be solved, but offers no proof to
support that claim. Without those two issues solved, we would be left
with an on
On Wed, 4 Mar 2009, George Bosilca wrote:
I'm churning a lot and not making much progress, but I'll try chewing on
that idea (unless someone points out it's utterly ridiculous). I'll look
into having PML ignore sendi functions altogether and just make the
"send-immediate" path work fast with
On Tue, 3 Mar 2009, Brian W. Barrett wrote:
On Tue, 3 Mar 2009, Jeff Squyres wrote:
1.3.1rc3 had a race condition in the ORTE shutdown sequence. The only
difference between rc3 and rc4 was a fix for that race condition. Please
test ASAP:
http://www.open-mpi.org/software/ompi/v1.3
On Tue, 3 Mar 2009, Jeff Squyres wrote:
On Mar 3, 2009, at 3:31 PM, Eugene Loh wrote:
First, this behavior is basically what I was proposing and what George
didn't feel comfortable with. It is arguably no compromise at all. (Uggh,
why must I be so honest?) For eager messages, it favors BTL
On Tue, 3 Mar 2009, Eugene Loh wrote:
First, this behavior is basically what I was proposing and what George didn't
feel comfortable with. It is arguably no compromise at all. (Uggh, why must
I be so honest?) For eager messages, it favors BTLs with sendi functions,
which could lead to those
On Tue, 3 Mar 2009, Jeff Squyres wrote:
1.3.1rc3 had a race condition in the ORTE shutdown sequence. The only
difference between rc3 and rc4 was a fix for that race condition. Please
test ASAP:
http://www.open-mpi.org/software/ompi/v1.3/
I'm sorry, I've failed to test rc1 & rc2 on Catam
Hi Wayne -
Sorry for the delay. I'm the author of that code, and am currently trying
to finish my dissertation, so I've been a bit behind.
Anyway, at present, the compiler_args field only works on a single token.
So you can't have something looking for -tp p7. I thought about how to do
thi
On Mon, 23 Feb 2009, Jeff Squyres wrote:
On Feb 23, 2009, at 10:37 AM, Eugene Loh wrote:
I sense an opening here and rush in for the kill...
:-)
And, why does the PML pass a BTL argument into the sendi function? First,
the BTL argument is not typically used. Second, if the BTL sendi func
At a high level, it seems reasonable to me. I am not familiar enough with
the sendi code, however, to have a strong opinion either way.
Brian
On Mon, 23 Feb 2009, Jeff Squyres wrote:
Sounds reasonable to me. George / Brian?
On Feb 21, 2009, at 2:11 AM, Eugene Loh wrote:
What: Eliminate
I have no objections to this change
Brian
On Tue, 10 Feb 2009, Greg Koenig wrote:
RFC: Rename several OMPI_* names to OPAL_*
WHAT: Rename several #define values that encode the prefix "OMPI_" to
instead encode the prefix "OPAL_" throughout the entire Open MPI source code
tree. Also, elimina
On Sat, 7 Feb 2009, Jeff Squyres wrote:
End result: I guess I'm a little surprised that the difference is that clear
-- does a function call really take 10ns? I'm also surprised that the
layered C version has significantly more jitter than the non-layered version;
I can't really explain that.
The selection logic for the PML is very confusing and doesn't follow the
standard priority selection. The reasons for this are convoluted and not
worth discussing here. The bottom line, however, is that the OB1 PML will
be the default *UNLESS* the PSM (PathScale/Qlogic) MTL can be chosen, in
I think this sounds reasonable, if (and only if) MPI_Accumulate is
properly handled. The interface for calling the op functions was broken
in some fairly obvious way for accumulate when I was writing the one-sided
code. I think I had to call some supposedly internal bits of the
interface to m
That was my thought exactly. And since the point of the notifier
component is to return a *useful* description of what failure the BTL had
(like IB ran out of resource X again), that will be lost if we just push
that up to the next layer.
Just my $0.02, of course.
Brian
On Thu, 4 Dec 2008,
On Thu, 6 Nov 2008, Jeff Squyres wrote:
WHAT: Rename libopen-rte to be libompi-rte (and probably other supporting
text)
WHY: ORTE is really quite specific to OMPI. We decided long ago that ORTE
would not split off from OMPI, and it has been specifically tailored for
OMPI. Indeed, Ralph has
bzero is not a gnu-ism -- it's in POSIX.1. Either bzero or memset is
correct and used throughout OMPI.
Brian
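For reference, a small sketch of the portable spelling. bzero() was
specified by POSIX.1-2001 (marked legacy there) and is absent from
POSIX.1-2008, while memset() is ISO C. The helper name zero_buffer is
hypothetical and only illustrates the equivalence:

```c
#include <string.h>   /* memset, size_t */

/* Hypothetical helper: clear `len` bytes starting at `buf`.
 * memset(buf, 0, len) has the same effect as bzero(buf, len). */
void zero_buffer(void *buf, size_t len)
{
    memset(buf, 0, len);
}
```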
On Thu, 21 Aug 2008, Jeff Squyres wrote:
IIRC, bzero is a gnu-ism. We should probably use memset instead.
On Aug 21, 2008, at 5:40 AM, George Bosilca wrote:
Terry,
We use the fe
On Sun, 20 Jul 2008 21:13:48 +0200, Ralf Wildenhues
wrote:
> * Funda Wang wrote on Sun, Jul 20, 2008 at 05:29:57AM CEST:
>> I'm currently building openmpi 1.2.6 under Mandriva cooker, and its
>> default LDFLAGS is "-Wl,--as-needed -Wl,--no-undefined".
>>
>> But openmpi 1.2.6 builds failed with:
>
> After starting, we decided that changing the MCA base revision number
> to 2.0.0 also meant changing *ALL* the framework version numbers.
> This is because the same components from framework compiled with
> MCA base version 1.x.x would not be binary compatible when compiled
> with MCA b
Responding to both of Ralph's e-mails in one, just to confuse people :).
First, the issue of the recursive locks... Back in the day, ompi_proc_t
instances could be created as a side effect of other operations.
Therefore, to maintain sanity, the procs were implicitly added to the
master proc l
On Mon, 7 Jul 2008, Terry Dontje wrote:
Just curious: has anyone done comparisons of latency measurements as one
changes the size of a job? That is, changing the size of the job (and number
of nodes used) and just taking the half roundtrip latency of two of the
processes in the job. I am rough
As long as we don't go back to libptmalloc2 linked into libmpi, I don't
have strong objections.
Brian
On Thu, 3 Jul 2008, Jeff Squyres wrote:
WHAT: make mpi_leave_pinned=1 by default when a BTL is used that would
benefit from it (when possible; 0 when not, obviously)
WHY: Several reasons:
On Thu, 26 Jun 2008, Jeff Squyres wrote:
On Jun 26, 2008, at 3:08 PM, George Bosilca wrote:
Here is the solution I propose. If you think there is any problem with it,
please let me know asap.
Move the progress function from the BML layer back into the PML. Then the
PML will have a way to ch
general philosophy of: running out of the box always works just
fine, but if you/the sysadmin is smart, you can get performance
improvements.
On Jun 23, 2008, at 4:18 PM, Shipman, Galen M. wrote:
I concur
- galen
On Jun 23, 2008, at 3:44 PM, Brian W. Barrett wrote:
That sounds like a reas
proc abort since it would know that other procs that
detected similar capabilities may well have selected that PML. For now,
though, this would solve the problem.
Make sense?
Ralph
On 6/23/08 1:31 PM, "Brian W. Barrett" wrote:
The problem is that we default to OB1, but that's
o force a modex just
for that - if so, then perhaps this could again be a settable option to
avoid requiring non-scalable behavior for those of us who want scalability?
On 6/23/08 1:21 PM, "Brian W. Barrett" wrote:
The selection code was added because frequently high speed interconnec
On Mon, 23 Jun 2008, Jeff Squyres wrote:
On Jun 23, 2008, at 3:17 PM, Brian W. Barrett wrote:
Just because it's volatile doesn't mean that adds are atomic. There's at
least one place in the PML (or used to be) where two threads could
decrement that counter at the same time.
The selection code was added because frequently high speed interconnects
fail to initialize properly due to random stuff happening (yes, that's a
horrible statement, but true). We ran into a situation with some really
flaky machines where most of the processes would choose CM, but a couple
woul
Just because it's volatile doesn't mean that adds are atomic. There's at
least one place in the PML (or used to be) where two threads could
decrement that counter at the same time.
Brian
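To illustrate the point about volatile, here is a sketch that contrasts a
volatile decrement with an atomic one. It uses C11 <stdatomic.h> purely
for illustration; it is not Open MPI's own atomic layer, and both function
names are invented:

```c
#include <stdatomic.h>

/* Racy: volatile forces the load and the store to really happen, but the
 * load/subtract/store sequence can still interleave with another thread,
 * so two concurrent callers may both store the same value. */
int racy_decrement(volatile int *counter)
{
    *counter = *counter - 1;
    return *counter;
}

/* Safe: one indivisible read-modify-write. atomic_fetch_sub returns the
 * previous value, so subtract 1 to report the new one. */
int atomic_decrement(atomic_int *counter)
{
    return atomic_fetch_sub(counter, 1) - 1;
}
```

With two threads calling racy_decrement() on a counter that starts at 2,
the final value can be 1 instead of 0; atomic_decrement() cannot lose an
update that way.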
On Mon, 23 Jun 2008, Jeff Squyres wrote:
I see in a few places in ob1 we do things like this:
OPAL_TH
On Thu, 19 Jun 2008, Terry Dontje wrote:
But my concern is not the raw performance of MPI_Iprobe in this case but more
of an interaction between MPI and an application. The concern is if it takes
2 MPI_Iprobes to get to the real message (instead of one) then could this
induce a synchronizatio
On Wed, 18 Jun 2008, Terry Dontje wrote:
Jeff Squyres wrote:
Perhaps we did that as a latency optimization...?
George / Brian / Galen -- do you guys know/remember why this was done?
On the surface, it looks like it would be ok to call progress and check
again to see if it found the match. C
I'm sure it was a latency optimization, just like the old test behavior.
Personally, I'd call opal_progress blindly, then walk through the queue.
Doing the walk the queue, call opal_progress, walk the queue thing seems
like too much work for iprobe. Test, sure. iProbe... eh.
Brian
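A toy sketch of the ordering suggested above (call progress once, then
walk the queue once). Everything here is hypothetical: toy_queue,
toy_progress and toy_iprobe are stand-ins for illustration and do not
correspond to real Open MPI data structures or to opal_progress itself:

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 8

/* Toy model: `pending` is the tag of a message that has arrived on the
 * wire but has not yet been moved into the queue (-1 if none). */
struct toy_queue {
    int    tags[TOY_QUEUE_MAX];
    size_t count;
    int    pending;
};

/* Stand-in for a progress engine: deliver the pending message, if any. */
static void toy_progress(struct toy_queue *q)
{
    if (q->pending >= 0 && q->count < TOY_QUEUE_MAX) {
        q->tags[q->count++] = q->pending;
        q->pending = -1;
    }
}

/* Progress first, then a single walk of the queue. */
bool toy_iprobe(struct toy_queue *q, int tag)
{
    toy_progress(q);
    for (size_t i = 0; i < q->count; i++) {
        if (q->tags[i] == tag) {
            return true;
        }
    }
    return false;
}
```

The alternative order (walk, progress, walk again) does two scans when
the first one misses; this version always does exactly one.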
On Wed,
Brad unfortunately figured out I had done something to annoy the gods of
mercurial and the repository below didn't contain all the changes
advertised (and in fact didn't work). I've since rebuilt the repository
and verified it works now. I'd recommend deleting your existing clones of
the repo
On Thu, 5 Jun 2008, Jeff Squyres wrote:
I just noticed that heterogeneous MPI support is enabled by default.
Do we really want this? Doesn't it add a little overhead (probably
noticeable on shared memory)?
I'd be comfortable with users specifically having to enable
heterogeneous support via co
On Wed, 4 Jun 2008, Paul H. Hargrove wrote:
Brian states
This will allow users to turn ptmalloc2 support on/off at application
link time instead of MPI compile time.
Where I assume "MPI compile time" means when the MPI *implementation* is
compiled.
Correct.
So what about LD_PRELOAD? Can
Hi all -
Sorry this is so late, but it took a couple of iterations with a couple of
people to get right from a technology standpoint. All mistakes in this
proposal are my fault.
What: Fix the ptmalloc2 problem
How: Remove it from the default path
When: This weekend? For the 1.3 branch
The pro
On Wed, 28 May 2008, Roland Dreier wrote:
>- gleb asks: don't we want to avoid the system call when possible?
>- patrick: a single syscall can be/is cheaper than a reg cache
> lookup in user space
This doesn't really make sense -- syscall + cache lookup in kernel is
"obviously" mor
On Thu, 22 May 2008, Terry Dontje wrote:
The major difference here is that libmyriexpress is not being included
in mainline Linux distributions. Specifically: if you can find/use
libmyriexpress, it's likely because you have that hardware. The same
*used* to be true for libibverbs, but is no lo
't live without mpi_leave_pinned so
threads are back.
Jeff Squyres wrote:
On May 21, 2008, at 4:37 PM, Brian W. Barrett wrote:
ptmalloc2 is not *required* by the openib btl. But it is required on
Linux if you want to use the mpi_leave_pinned functionality. I see
one function call to __
On Wed, 21 May 2008, Jeff Squyres wrote:
I'm only concerned about the case where there's an IB card, the user
expects the IB card to be used, and the IB card isn't used.
Can you put in a site wide
btl = ^tcp
to avoid the problem? If the IB card fails, then you'll get
unreachable MPI errors.
On Wed, 21 May 2008, Jeff Squyres wrote:
On May 21, 2008, at 4:17 PM, Don Kerr wrote:
Just want to make sure what I think I see is true:
Linux build. openib btl requires ptmalloc2 and ptmalloc2 requires
posix
threads, is that correct?
ptmalloc2 is not *required* by the openib btl. But it
On Wed, 21 May 2008, Jeff Squyres wrote:
On May 21, 2008, at 3:38 PM, Jeff Squyres wrote:
It would be great if libibverbs could return two different error
messages
- one for "there's no IB card in this machine" and one for "there's
an IB
card here, but we can't initialize it". I think that wo
d running.
So we're only talking about the Open MPI warning message here. More
below.
On May 21, 2008, at 12:17 PM, Brian W. Barrett wrote:
2. An out-of-the-box "mpirun a.out" will print warning messages in
perfectly valid/good configurations (no verbs-capable hardware, but
just hap
On Wed, 21 May 2008, Jeff Squyres wrote:
2. An out-of-the-box "mpirun a.out" will print warning messages in
perfectly valid/good configurations (no verbs-capable hardware, but
just happen to have libibverbs installed). This is a Big Deal.
Which is easily solved with a better error message, as
Pasha) wrote:
I agree with Brian. We could add to the warning message a detailed
description of how to disable it.
Pasha
Brian W. Barrett wrote:
I think having a parameter to turn off the warning is a great idea. So
great in fact, that it already exists in the trunk and v1.2 :)! Setting
the def
I think having a parameter to turn off the warning is a great idea. So
great in fact, that it already exists in the trunk and v1.2 :)! Setting
the default value for the btl_base_warn_component_unused flag from 0 to 1
will have the desired effect.
I'm not sure I agree with setting the default
On Tue, 13 May 2008, Don Kerr wrote:
I believe similar operations are used in other areas of Open MPI; a
place to start looking would be opal/util/if.c.
Yes, opal/util/if.h and opal/util/net.h provide a portable interface to
almost everything that comes from getifaddrs().
Brian
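For context, a short sketch of the kind of information getifaddrs()
itself returns on platforms that provide it (it is a BSD/Linux interface,
not ISO C). This calls the system function directly and does not use the
opal/util API; the function name count_ipv4_addresses is invented for
the example:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <ifaddrs.h>
#include <stddef.h>

/* Count the interface entries that carry an IPv4 address.
 * Returns -1 if getifaddrs() fails. */
int count_ipv4_addresses(void)
{
    struct ifaddrs *list = NULL;
    int n = 0;

    if (getifaddrs(&list) != 0) {
        return -1;
    }
    for (struct ifaddrs *ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
        /* ifa_addr may be NULL for interfaces without an address */
        if (ifa->ifa_addr != NULL && ifa->ifa_addr->sa_family == AF_INET) {
            n++;
        }
    }
    freeifaddrs(list);
    return n;
}
```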
On Tue, 6 May 2008, Jeff Squyres wrote:
On May 5, 2008, at 6:27 PM, Steve Wise wrote:
There is a larger question regarding why the remote node is still
polling the hca and not shutting down, but my immediate question is
if it is an acceptable fix to simply disregard this "error" if it
is an iW
On Mon, 21 Apr 2008, Ralph H Castain wrote:
So it appears to be a combination of memchecker=yes automatically requiring
valgrind, and the override on the configure line of a param set by a
platform file not working.
So I can't speak to the valgrind/memchecker issue, but can to the
platform/co
George -
Good catch -- that's going to cause a problem :). But I think we should
add yet another check to also make sure that we're on Linux. So the three
tests would be:
1) Am I on a platform that we have timer assembly support for?
(That's the long list of architectures that we rec
On Fri, 21 Mar 2008, Regan Russell wrote:
I am having problems with the Assembler section of the GNU autoconf stuff on
OpenMPI.
Is anyone willing to work with me to get this up and running...?
As a warning, MIPS / IRIX is not currently on the list of Open MPI
supported platforms, so there m
Hi all -
Does anyone know why we go through the modex receive for the local
process in ompi_proc_get_info()? It doesn't seem like it's necessary, and
it causes some problems on platforms that don't implement the modex (since
it zeros out useful information determined during the init step)
about going to 2.2 now or not)
On Mar 19, 2008, at 12:26 PM, Brian W. Barrett wrote:
Hi all -
Now that Libtool 2.2 has gone stable (2.0 was skipped entirely), it
probably makes sense to update the version of Libtool used to build
the
nightly tarball and releases for the trunk (and eventually
Hi all -
Now that Libtool 2.2 has gone stable (2.0 was skipped entirely), it
probably makes sense to update the version of Libtool used to build the
nightly tarball and releases for the trunk (and eventually v1.3) from the
nightly snapshot we have been using to the stable LT 2.2 release.
I'v
Jeff / George -
Did you add a way to specify which event modules are used? Because epoll
pushes the socket list into the kernel, I can see how it would screw up
BLCR. I bet everything would work if we forced the use of poll / select.
Brian
On Tue, 18 Mar 2008, Jeff Squyres wrote:
Crud, ok
On Fri, 22 Feb 2008, Adrian Knoth wrote:
I see three approaches:
a) remove lo globally (in if.c). I expect objections. ;)
I object! :). But for a good reason -- it'll break things. Someone
tried this before, and the issue is when a node (like a laptop) only has
lo -- then there are no
Out of curiosity, why is the one-sided rdma component struck from 1.3? As
far as I'm aware, the code is in the trunk and ready for release.
Brian
On Mon, 11 Feb 2008, Brad Benton wrote:
All:
The latest scrub of the 1.3 release schedule and contents is ready for
review and comment. Please use
On Fri, 8 Feb 2008, Ralph Castain wrote:
1. event library
2. ROMIO
3. VT
4. backtrace
5. PLPA - this one is a little less obvious, but still being released as a
separate package
6. libNBC
Sorry to Ralph, but I clipped everything from his e-mail, then am going to
make references to it. oh wel
On Mon, 4 Feb 2008, Muhammad Atif wrote:
I am trying to port xensockets to openmpi. In principle, I have the
framework and everything, but there seems to be a small issue, I cannot
get libevent (or OPAL) to give callbacks for receive (or send) for
xensockets. I have tried to implement native c
Automake forces v7 mode so that Solaris tar can untar the tarball, IIRC.
Brian
On Thu, 24 Jan 2008, Aurélien Bouteiller wrote:
According to posix, tar should not limit the file name length. Only
the v7 implementation of tar is limited to 99 characters. GNU tar has
never been limited in the num
Nope, I think that's a valid approach. For some reason, I believe it
was problematic for the OpenIB guys to do that at the time we were
hacking up that code. But if it works, it sounds like a much better
approach.
When you make the change to the openib mpool, I'd also
MORECORE_CANNOT_TRIM
On Fri, 14 Dec 2007, Adrian Knoth wrote:
Should we consider moving towards these mapped addresses? The
implications:
- less code, only one socket to handle
- better FD consumption
- breaks WinXP support, but not Vista/Longhorn or later
- requires non-default kernel runtime setting on Op
On Wed, 12 Dec 2007, Gleb Natapov wrote:
On Wed, Dec 12, 2007 at 03:46:10PM -0500, Richard Graham wrote:
This is better than nothing, but really not very helpful for looking at the
specific issues that can arise with this, unless these systems have several
parallel networks, with tests that wil
On Tue, 11 Dec 2007, Gleb Natapov wrote:
I did a rewrite of matching code in OB1. I made it much simpler and 2
times smaller (which is good, less code - less bugs). I also got rid
of huge macros - very helpful if you need to debug something. There
is no performance degradation, actually I even
On Mon, 10 Dec 2007, Peter Wong wrote:
Open MPI defines its own malloc (by default), so malloc of glibc
is not called.
But, without calling malloc of glibc, the allocator of libhugetlbfs
to back text and dynamic data by large pages, e.g., 16MB pages on
POWER systems, is not used.
Indeed, we ca
On Thu, 6 Dec 2007, Tim Prins wrote:
Tim Prins wrote:
First, in opal_condition_wait (condition.h:97) we do not release the
passed mutex if opal_using_threads() is not set. Is there a reason for
this? I ask since this violates the way condition variables are supposed
to work, and it seems like t
OS X enforces a no duplicate symbol rule when flat namespaces are in use
(the default on OS X). If all the libraries are two-level namespace
libraries (libSystem.dylib, aka libm.dylib is two-level), then duplicate
symbols are mostly ok.
Libtool by default forces a flat namespace in sharedlibr
To me, (a) is dumb and (c) isn't a non-starter.
The whole point of the component system is to separate concerns. Routing
topology and collectives operations are two different concerns. While
there's some overlap (a topology-aware collective doesn't make sense when
using the unity routing st
On Wed, 28 Nov 2007, Jeff Squyres wrote:
We've had a few users complain about trying to use THREAD_MULTIPLE
lately and having it not work.
Here's a proposal: why don't we disable it (at least in the 1.2
series)? Or, at the very least, put in a big stderr warning that is
displayed when THREAD_M
Hi all -
Lisa Glendenning, who's working on a Portals one-sided component,
discovered that the test onesided/test_start1.c in our repository is
incorrect. It assumes that MPI_Win_start is non-blocking, but the
standard says that "MPI_WIN_START is allowed to block until the
corresponding MPI_
On Mon, 5 Nov 2007, Torsten Hoefler wrote:
On Mon, Nov 05, 2007 at 04:57:19PM -0500, Brian W. Barrett wrote:
This is extremely tricky to do. How do you know which environment
variables to forward (foo in this case) and which not to (hostname).
SLURM has a better chance, since it's linux
This is extremely tricky to do. How do you know which environment
variables to forward (foo in this case) and which not to (hostname).
SLURM has a better chance, since it's linux only and generally only run on
tightly controlled clusters. But there's a whole variety of things that
shouldn't b