Ralph Castain wrote:
Not quite that simple, Patrick. Think of things like MPI_Sendrecv, where
the "send" call is below that of the user's code.
You have a point, Ralph. Although, that would be 8 more lines to add to
the user MPI code to define an MPI_Sendrecv macro :-)
Seriously, this particu
George Bosilca wrote:
I know the approach "because we can". We develop an MPI library, and we
should keep it that way. Our main focus should not diverge to provide
I would join George in the minority on this one. "Because we can" is a
slippery slope; there is value in keeping things simple, h
Jeff,
Jeff Squyres wrote:
ignored it whenever presenting competitive data. The 1,000,000th time I
saw this, I gave up arguing that our competitors were not being fair and
simply changed our defaults to always leave memory pinned for
OpenFabrics-based networks.
Instead, you should have tol
Jeff Squyres wrote:
Why not? The "owning" process can do the touch; then it'll be
affinity'ed properly. Right?
Yes, that's what I meant by forcing allocation. From the thread, it
looked like nobody touched the pages of the mapped file. If it's already
done, no need to write in the whole fil
George Bosilca wrote:
performance hit on the startup time. And second, we will have to find a
pretty smart way to do this or we will completely break the memory
affinity stuff.
I didn't look at the code, but I sure hope that the SM init code does
touch each page to force allocation, otherwise
Hi Christian,
Christian Siebert wrote:
I just gave the new release 1.3.1 a go. While Ethernet and InfiniBand
seem to work properly, I noticed that Myrinet/GM compiles fine but gives
a segmentation violation in the first attempt to communicate (MPI_Send
in a simple "hello world" application). I
Eugene Loh wrote:
Possibly, you meant to ask how one does directed polling with a wildcard
source MPI_ANY_SOURCE. If that was your question, the answer is we
punt. We report failure to the ULP, which reverts to the standard code
path.
Sorry, I meant ANY_SOURCE. If you poll only the queue th
Eugene,
All my remarks are related to the receive side. I think the send side
optimizations are fine, but don't take my word for it.
Eugene Loh wrote:
> To recap:
> 1) The work is already done.
How do you do "directed polling" with ANY_TAG? How do you ensure you
check all incoming queues from t
Hi Eugene,
Eugene Loh wrote:
>> replace the fifo’s with a single link list per process in shared
>> memory, with senders to this process adding match envelopes
>> atomically, with each process reading its own link list (multiple
> *) Doesn't strike me as a "simple" change.
Actually, it's muc
Jeff Squyres wrote:
Gaah! I specifically asked Patrick and George about this and they said
that the README text was fine. Grr...
When I looked into it at the time, I vaguely remember that _both_ PMLs were
initialized but CM was eventually used because it was the last one. It
looked broken, but it
Richard Graham wrote:
Yes - it is polling volatile memory, so has to load from memory on every
read.
Actually, it will poll in cache, and only load from memory when the
cache coherency protocol invalidates the cache line. Volatile semantics
only prevent compiler optimizations.
It does not m
Jeff Squyres wrote:
- There's a big chunk of text about MX that I have no idea if it's still
up-to-date / correct or not.
Looks good to me.
Patrick
Gentlemen,
I have been looking at a data corruption issue with the MX BTL or MTL in
the 1.3 branch when trying to use the MX registration cache. The related
ticket is #1525, opened by Tim.
In 1.3, mallopt() is used to never trim memory, in replacement of the
malloc overload by ptmalloc2. MX provides
Jeff Squyres wrote:
WHAT: make mpi_leave_pinned=1 by default when a BTL is used that would
benefit from it (when possible; 0 when not, obviously)
Comments?
The probable reason the registration cache (aka leave_pinned) is disabled
by default is that it may be unsafe. Even if you use mallopt to
Hi Roland,
Roland Dreier wrote:
Stick in a separate library then?
I don't think we want the complexity in the kernel -- I personally would
argue against merging it upstream; and given that the userspace solution
is actually faster, it becomes pretty hard to justify.
Memory registration has al
Hi Gleb,
Gleb Natapov wrote:
Not just that but also when swapping out or pagefault happens so even no
page pinning is needed. But HW should be designed to work with changing
page mappings and I am not sure that Mellanox HW is designed for that. What
about Myricom HW?
Yes, we can support it. Howev
Jeff Squyres wrote:
That would also be great. I don't know anything about these mmu
notifiers (I'm not much of a kernel guy), but anything that allows us
It's what Quadrics used for years in Tru64. Instead of trying to catch
at user-level all instances when the page table of a process is
Hi Jeff,
Jeff Squyres wrote:
the topic of the memory hooks came up again. Brian was wondering if
we should [finally] revisit this topic -- there's a few things that
could be done to make life "better". Two things jump to mind:
- using mallopt on Linux
What about using the (probably) upc
Brian W. Barrett wrote:
With MX, it's one initialization call (mx_init), and it's not clear from
the errors it can return that you can differentiate between the two cases.
If you run mx_init() on a machine without the MX driver loaded or no NIC
detected by the driver, you get a specific error
Paul,
Paul H. Hargrove wrote:
discuss what tests we will run, but it will probably be a very minimal
set. Once we both have MTT setup and running GM tests, we should
compare configs to avoid overlap (and thus increase coverage).
That would be great. I have only one 32-node 2G cluster I can u
Hi Paul,
Paul H. Hargrove wrote:
The fact that this has gone unfixed for 2 months suggests to me that
nobody is building the GM BTL. So, how would I go about checking ...
a) ...if there exists any periodic build of the GM BTL via MTT?
We are deploying MTT on all our clusters. Right now, we
Lenny Verkhovsky wrote:
We would like to add SDP support for OPENMPI.
SDP can be used to accelerate job start ( oob over sdp ) and IPoIB
performance.
I fail to see the reason to pollute the TCP BTL with IB-specific SDP stuff.
For the oob, this is arguable, but doesn't SDP allow for *transpa
Hi Peter,
Peter Wong wrote:
Open MPI defines its own malloc (by default), so malloc of glibc
is not called.
But without calling glibc's malloc, the libhugetlbfs allocator, which
backs text and dynamic data with large pages (e.g., 16MB pages on
POWER systems), is not used.
You could modify ptmall
Hi Gleb,
Gleb Natapov wrote:
In the case of TCP, the kernel is kind enough to progress messages for you,
but only if there was enough space in the kernel's internal buffers. If there
was no place there, TCP BTL will also buffer messages in userspace and
will, eventually, have the same problem.
Occasion
Richard Graham wrote:
The real problem, as you and others have pointed out is the lack of
predictable time slices for the progress engine to do its work, when relying
on the ULP to make calls into the library...
The real, real problem is that the BTLs should handle progression at
their level, s
Jeff Squyres wrote:
This is not a problem in the current code base.
Remember that this is all in the context of Galen's proposal for
btl_send() to be able to return NOT_ON_WIRE -- meaning that the send
was successful, but it has not yet been sent (e.g., openib BTL
buffered it because it ra
Hi Bogdan,
Bogdan Costescu wrote:
I made some progress: if I configure with "--without-memory-manager"
(along with all other options that I mentioned before), then it works.
This was inspired by the fact that the segmentation fault occured in
ptmalloc2. I have previously tried to remove the MX
Jeff Squyres wrote:
Let's take a step back and see exactly what we *want*. Then we can
talk about how to have an interface for it.
I must be missing something but why is the bandwidth/latency passed by
the user (by whatever means)? Would it be easier to automagically get
these values by pr
segfault. For static
registrations, the ones that are the real problem with fork because of
the infinite exposure, it's much simpler to use MAP_SHARED...
Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
the last
partial page of the buffer, it can happen for any pinned page.
Patrick
ed to registration cache.
The right way to fix the fork problem is to fix the memory registration
problem in the OS itself. It's not going to happen anytime soon, so it
requires another hack (forcing VM duplication of registered pages at
fork time).
Patrick
he is playing with fire.
My 2 cents.
Patrick
Jeff Squyres (jsquyres) wrote:
-----Original Message-----
From: devel-boun...@open-mpi.org
[mailto:devel-boun...@open-mpi.org] On Behalf Of Patrick Geoffray
Sent: Wednesday, June 28, 2006 1:23 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] Best bw/lat performance for microbenchmark
ected
messages, the host CPU overhead and the ability to progress.
All of these metrics are measured by existing benchmarks; do you want to
write one that covers everything, or something like IMB?
Patrick