What: Completely revamp the BTL RDMA interface (btl_put, btl_get) to
better match what is needed for MPI one-sided.
Why: I am preparing to push an enhanced MPI-3 one-sided component that
makes use of network RDMA and atomic operations to provide a fast truly
one-sided implementation. Before I ca
Ah, gotcha.
On Nov 4, 2014, at 5:41 PM, Steve Wise wrote:
> Correct: I don't see the bug in the 1.8.4rc1 release.
>
>
> On 11/4/2014 4:33 PM, Nathan Hjelm wrote:
>> Looks like there is no issue in 1.8.4 except for the message coalescing
>> bug. Ralph, Howard, and I agree that disabling messag
Correct: I don't see the bug in the 1.8.4rc1 release.
On 11/4/2014 4:33 PM, Nathan Hjelm wrote:
Looks like there is no issue in 1.8.4 except for the message coalescing
bug. Ralph, Howard, and I agree that disabling message coalescing for
1.8.4 is the safest way forward. We can back-port the re
Looks like there is no issue in 1.8.4 except for the message coalescing
bug. Ralph, Howard, and I agree that disabling message coalescing for
1.8.4 is the safest way forward. We can back-port the real fix for an
eventual 1.8.5. Message rates no longer seem to care about message
coalescing in the o
There is one other bug fix to address the message coalescing bug. The
rest is the BTL RDMA revamp.
If there is a need I can probably pull those out and apply them to
master sooner than SC.
-Nathan
On Tue, Nov 04, 2014 at 10:11:26PM +, Jeff Squyres (jsquyres) wrote:
> It sounds like this fix
That sounds fine, but I think Steve's point is that he is being bitten by this
bug now, so it would probably be good to even include this one particular fix
in 1.8.4.
On Nov 4, 2014, at 5:24 PM, Nathan Hjelm wrote:
> Going to put the RFC out today with a timeout of about 2 weeks. This
> will
Going to put the RFC out today with a timeout of about 2 weeks. This
will give me some time to talk with other Open MPI developers
face-to-face at SC14.
If the RFC fails I will still bring that and a couple of other fixes
into the master.
-Nathan
On Tue, Nov 04, 2014 at 04:06:45PM -0600, Steve W
It sounds like this fix should be merged in soon.
Nathan: are your other changes bug fixes, or part of your BTL revamp branch?
On Nov 4, 2014, at 5:06 PM, Steve Wise wrote:
> Ok, sounds like I should let you continue the good work! :) When do you plan
> to merge this into ompi proper?
>
>
Ok, sounds like I should let you continue the good work! :) When do you
plan to merge this into ompi proper?
On 11/4/2014 3:58 PM, Nathan Hjelm wrote:
That certainly addresses part of the problem. I am working on a complete
revamp of the btl RDMA interface. It contains this fix:
https://gith
That certainly addresses part of the problem. I am working on a complete
revamp of the btl RDMA interface. It contains this fix:
https://github.com/hjelmn/ompi/commit/66fa429e306beb9fca59da0a4554e9b98d788316
-Nathan
On Tue, Nov 04, 2014 at 03:27:23PM -0600, Steve Wise wrote:
> I found the bug.
I'll issue a pull request for this and the other change I'm making.
On 11/4/2014 3:27 PM, Steve Wise wrote:
I found the bug. Here is the fix:
[root@stevo1 openib]# git diff
diff --git a/opal/mca/btl/openib/btl_openib_component.c
b/opal/mca/btl/openib/btl_openib_component.c
index d876e21..8a
I found the bug. Here is the fix:
[root@stevo1 openib]# git diff
diff --git a/opal/mca/btl/openib/btl_openib_component.c
b/opal/mca/btl/openib/btl_openib_component.c
index d876e21..8a5ea82 100644
--- a/opal/mca/btl/openib/btl_openib_component.c
+++ b/opal/mca/btl/openib/btl_openib_component.c
I have run into the issue as well. I will open a pull request for 1.8.4
as part of a patch fixing the coalescing issues.
-Nathan
On Tue, Nov 04, 2014 at 02:50:30PM -0600, Steve Wise wrote:
> On 11/4/2014 2:09 PM, Steve Wise wrote:
> >Hi,
> >
> >I'm running ompi top-o-tree from github and seeing a
On 11/4/2014 2:09 PM, Steve Wise wrote:
Hi,
I'm running ompi top-o-tree from github and seeing an openib btl issue
where the qp/srq configuration is incorrect for the given device id.
This works fine in 1.8.4rc1, but I see the problem in top-of-tree. A
simple 2 node IMB-MPI1 pingpong fails
Hi,
I'm running ompi top-o-tree from github and seeing an openib btl issue
where the qp/srq configuration is incorrect for the given device id.
This works fine in 1.8.4rc1, but I see the problem in top-of-tree. A
simple 2 node IMB-MPI1 pingpong fails to get the ranks setup. I see
this logg
Hi Folks,
Per the request for a yes/yesifneedbe/no poll, and because Doodle limits
changing the options of an existing poll, a new Doodle poll for deciding on
the date for the next developers f2f is at:
https://doodle.com/zzaupgxge9y6medu
There is also a wiki page for the meeting:
https://github.com/open-mpi/ompi/wiki
That would be correct - we restored some configure flags that are required to
make multi-thread programs work. Jeff can probably provide more info.
> On Nov 4, 2014, at 9:15 AM, Alina Sklarevich
> wrote:
>
> Hi,
>
> We observe a hang when running the multi-threading support test "latency.c"
Hi OMPI folks,
We're planning to hold another developers face to face in Q1 2015.
Currently, we're thinking of holding the face to face either the
last week of January, or one of the first two weeks of February.
The format will be similar to the previous f2f in Chicago - start
on Monday afternoon
Hi,
We observe a hang when running the multi-threading support test "latency.c"
(attached to this report), which uses MPI_THREAD_MULTIPLE.
The hang happens immediately at the beginning of the test and is reproduced
in the v1.8 release branch.
The command line to reproduce the behavior is:
$ mpir
> On Nov 4, 2014, at 12:44 AM, Gilles Gouaillardet
> wrote:
>
> Ralph,
>
> On 2014/11/04 1:54, Ralph Castain wrote:
>> Hi folks
>>
>> Looking at the over-the-weekend MTT reports plus at least one comment on the
>> list, we have the following issues to address:
>>
>> * many-to-one continues
All,
the TU Dresden would like to talk a little bit about the current state
of VampirTrace in Open MPI, its successor Score-P [1] and the future of
the collaboration at the SC'14 BoF. I think a 5min talk to present the
basic idea for Score-P project would be great to have, following an open
d
Ralph,
On 2014/11/04 1:54, Ralph Castain wrote:
> Hi folks
>
> Looking at the over-the-weekend MTT reports plus at least one comment on the
> list, we have the following issues to address:
>
> * many-to-one continues to fail. Shall I just assume this is an unfixable
> problem or a bad test and i
Thanks, Nathan. After a bit more investigation yesterday, this was our
conclusion too; that it is a longstanding bug in OpenIB BTL we just
happened to start triggering the broken flow with some recent changes made
to the default max_lmc parameter. Let us know if you need anything from our
end.
Jos
Ah, okay - thanks for clarifying that!
> On Nov 3, 2014, at 9:12 PM, Gilles Gouaillardet
> wrote:
>
> That works too since pthread is mandatory now
> (i previously made a RFC and removing the --with-threads configure option is
> in my todo list)
>
> On 2014/11/04 14:10, Ralph Castain wrote:
>
That works too since pthread is mandatory now
(i previously made a RFC and removing the --with-threads configure
option is in my todo list)
On 2014/11/04 14:10, Ralph Castain wrote:
> Curious - why put it under condition of pthread config? I just added it to
> the "if solaris" section - i.e., add
Curious - why put it under condition of pthread config? I just added it to the
“if solaris” section - i.e., add the flag if we are under solaris, regardless
of someone asking for thread support. Since we require that libevent be
thread-enabled, it seemed safer to always ensure those flags are se
Ralph,
FYI, here is attached the patch i am working on (still testing ...)
aa207ad2f3de5b649e5439d06dca90d86f5a82c2 should be reverted then.
Cheers,
Gilles
On 2014/11/04 13:56, Paul Hargrove wrote:
> Ralph,
>
> You will see from the message I sent a moment ago that -D_REENTRANT on
> Solaris a