Steve Wise wrote:
I hope you guys are documenting this in a way that makes this issue
extremely clear to both uDAPL and OFA verbs (is this the right naming?)
users. Maybe it's been done already, but is it possible to emit some
sort of loud warning/error when the accept()'ing side tries to send
general-boun...@lists.openfabrics.org wrote:
> On Wed, 2007-05-09 at 17:55 -0700, Andrew Friedley wrote:
>>
>> Steve Wise wrote:
>>> On Wed, 2007-05-09 at 16:15 -0700, Andrew Friedley wrote:
Steve Wise wrote:
> There have been a series of discussions on the ofa general list
> about th
On Wed, 2007-05-09 at 15:01 -0700, Sean Hefty wrote:
> > The reason it is hard or impossible to solve this in the DAPL layer is
> > that any rdma operation on the QP affects the state of that QP and the
> > associated CQs. In addition, if you use an RDMA send to enforce this you
> > impact the othe
On Wed, 2007-05-09 at 17:55 -0700, Andrew Friedley wrote:
>
> Steve Wise wrote:
> > On Wed, 2007-05-09 at 16:15 -0700, Andrew Friedley wrote:
> >> Steve Wise wrote:
> >>> There have been a series of discussions on the ofa general list about
> >>> this issue, and the conclusion to date is that it c
devel-boun...@open-mpi.org wrote:
> Steve Wise wrote:
>> There have been a series of discussions on the ofa general list about
>> this issue, and the conclusion to date is that it cannot be resolved
>> in the rdma-cm or iwarp-cm code of the linux rdma stack. Mainly
>> because sending an RDMA messa
On Wed, 2007-05-09 at 17:46 -0700, Andrew Friedley wrote:
> > Therefore, the only truly safe thing for an iWARP btl to do (or a
> > udapl btl since that is also an iWARP btl) is to have the active
> > layer send an MPI Layer "nop" of some kind immediately after
> > establishing the connection if t
Steve Wise wrote:
On Wed, 2007-05-09 at 16:15 -0700, Andrew Friedley wrote:
Steve Wise wrote:
There have been a series of discussions on the ofa general list about
this issue, and the conclusion to date is that it cannot be resolved in
the rdma-cm or iwarp-cm code of the linux rdma stack. Ma
general-boun...@lists.openfabrics.org wrote:
>> Therefore, the only truly safe thing for an iWARP btl to do (or a
>> udapl btl since that is also an iWARP btl) is to have the active
>> layer send an MPI Layer "nop" of some kind immediately after
>> establishing the connection if there is nothing el
Therefore, the only truly safe thing for an iWARP btl to do (or a
udapl btl since that is also an iWARP btl) is to have the active
layer send an MPI Layer "nop" of some kind immediately after
establishing the connection if there is nothing else to send.
This is fine for an iWARP/RDMACM/whatev
On Wed, 2007-05-09 at 16:15 -0700, Andrew Friedley wrote:
>
> Steve Wise wrote:
> > There have been a series of discussions on the ofa general list about
> > this issue, and the conclusion to date is that it cannot be resolved in
> > the rdma-cm or iwarp-cm code of the linux rdma stack. Mainly be
Understood, and I agree.
FWIW: note that the CONNECTED state that I referred to is internal to
OMPI's endpoint abstraction (not an iwarp/udapl/verbs/etc. state).
It's part of our connection dance protocol.
On May 9, 2007, at 5:33 PM, Caitlin Bestler wrote:
Jeff Squyres wrote:
- The ot
Jeff Squyres wrote:
>
> - The other peer (the receiver of the connection) must wait
> to send its pending fragment(s) until it receives the first
> frag from the connection initiator. This can be accomplished
> either with another flag on the OMPI module struct or perhaps
> making it part of the
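
The ordering rule discussed in this thread (the side that initiated the connection sends first, sending a "nop" if it has nothing else to send, while the accepting side holds its pending fragments until the first frag arrives) can be sketched as a small state machine. This is only an illustration under stated assumptions, not OMPI's actual endpoint code: `Role`, `Endpoint`, `on_established`, `on_receive`, and the list used as a stand-in for the send queue are all hypothetical names invented for the example.

```python
from collections import deque
from enum import Enum, auto


class Role(Enum):
    ACTIVE = auto()   # side that issued the connect (hypothetical label)
    PASSIVE = auto()  # side that accepted the connection (hypothetical label)


class Endpoint:
    """Toy endpoint that enforces 'the connection initiator sends first'."""

    def __init__(self, role, wire):
        self.role = role
        self.wire = wire          # plain list standing in for the send queue
        self.pending = deque()    # fragments held back until sending is allowed
        self.may_send = False

    def on_established(self):
        # Only the initiator may send as soon as the connection is up.
        if self.role is Role.ACTIVE:
            self.may_send = True
            if self.pending:
                self._flush()
            else:
                # Nothing queued: send a no-op so the peer is unblocked.
                self.wire.append("NOP")

    def on_receive(self, frag):
        # The first frag from the initiator releases the accepting side.
        if self.role is Role.PASSIVE and not self.may_send:
            self.may_send = True
            self._flush()

    def send(self, frag):
        if self.may_send:
            self.wire.append(frag)
        else:
            self.pending.append(frag)

    def _flush(self):
        while self.pending:
            self.wire.append(self.pending.popleft())
```

In this sketch an accepting endpoint that calls `send()` early simply queues the fragment; it goes out only after `on_receive()` has seen the initiator's first message, which may be the no-op.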
I talked with Steve a bunch on the phone about this.
1. This "connector must RDMA first" issue is an iWARP restriction --
it's not specific to udapl or verbs. For example, if you try to use
udapl with iWARP on Solaris, you'll have the same issue (I have no
idea whether you have iWARP drive
>
> 2) OMPI is not adhering to the iwarp protocol requirement that the ULP,
> in this case OMPI, initiating the iwarp connection (the side issuing the
> dat_ep_connect() or rdma_connect()) _MUST_ be the first to send an RDMA
> message. So if an OMPI process _accepts_ an rdma connection, the
I guess I have not read enough about iwarp yet but if iwarp is sitting
below ib verbs or udapl in the stack and is trying to impose
restrictions which ib verbs or udapl do not adhere to then maybe iwarp
is in the wrong place in the ofed stack.
Having said that I do agree the OMPI community nee
On Wed, 2007-05-09 at 16:27 -0400, Donald Kerr wrote:
> So then I agree with Andrew, I think you are trying to impose
> restrictions on uDAPL which are not part of the Spec.
>
true, but if you want a single btl for IB and IW, then you'll need to
address this issue in some way...
So then I agree with Andrew, I think you are trying to impose
restrictions on uDAPL which are not part of the Spec.
-DON
Steve Wise wrote:
On Wed, 2007-05-09 at 16:20 -0400, Donald Kerr wrote:
I'm missing some context here. Where are you plugging iwarp and OMPI
together?
ofed-1.2 su
On Wed, 2007-05-09 at 16:20 -0400, Donald Kerr wrote:
> I'm missing some context here. Where are you plugging iwarp and OMPI
> together?
ofed-1.2 supports iwarp and the chelsio rnic. It can be accessed
directly via the ofa verbs and ofa rdma-cm _as well as_ via udapl.
I'm attempting to run OMP
I'm missing some context here. Where are you plugging iwarp and OMPI
together?
Steve Wise wrote:
On Wed, 2007-05-09 at 11:42 -0400, Donald Kerr wrote:
I agree OMPI trac ticket #890 should cover this. I will test the
suggested fix, just removing that one line from btl_udapl.c, on Solaris.
I
Steve Wise wrote:
There have been a series of discussions on the ofa general list about
this issue, and the conclusion to date is that it cannot be resolved in
the rdma-cm or iwarp-cm code of the linux rdma stack. Mainly because
sending an RDMA message involves the ULP's work queue and complet
Hi all -
After a minor hiccup last night, nightly tarballs for the trunk (and
eventually v1.3 branch) are now made with AC 2.61, AM 1.10, and LT
2.1a. Don't forget the mandatory update of AC and AM for the trunk
this coming Saturday morning!
Brian
On Wed, 2007-05-09 at 11:42 -0400, Donald Kerr wrote:
> I agree OMPI trac ticket #890 should cover this. I will test the
> suggested fix, just removing that one line from btl_udapl.c, on Solaris.
> I am still not set up on Linux so hopefully Steve can confirm there.
>
All,
First, I haven't tes
I agree OMPI trac ticket #890 should cover this. I will test the
suggested fix, just removing that one line from btl_udapl.c, on Solaris.
I am still not set up on Linux so hopefully Steve can confirm there.
-DON
Jeff Squyres wrote:
FWIW, I would marginally prefer if this bug is tracked in t
On May 9, 2007, at 10:30 AM, Steve Wise wrote:
Agreed. Enabling udapl will get OMPI over iwarp immediately (and
hopefully in ofed-1.2). Post ofed-1.2, I think OMPI _should_ create a
rdma-cm btl. That's the plan...
Yes and no. Please see my other reply about an "rdma cm" BTL...
--
Jeff Squ
FWIW, I would marginally prefer if this bug is tracked in the Open
MPI trac ticket system, not the OFA bugzilla (Steve W. will have
write access there as soon as Chelsio submits their OMPI 3rd party
contribution agreement). We've traditionally [mostly] tracked OMPI
bugs in the OMPI bug sys
Although as Boris pointed out, perhaps the hack in OMPI is no longer
needed at all...
On Wed, 2007-05-09 at 08:41 -0500, Steve Wise wrote:
> 606 opened to track the udapl change.
>
> 607 opened to track the ompi change to remove the port number stashing
> hack.
>
> Status: I have a patch from
On Wed, 2007-05-09 at 08:37 +0300, Or Gerlitz wrote:
> Andrew Friedley wrote:
> > Jeff Squyres wrote:
> FWIW, yes, adding RDMA CM support has actually been on my to-do list
> for a while, but it keeps getting bumped by higher priority items.
> It would be *much* better if some iWARP
606 opened to track the udapl change.
607 opened to track the ompi change to remove the port number stashing
hack.
Status: I have a patch from Arlin to test today. I will test with that
patch and with the OMPI port hack removed. Stay tuned...
Steve.
On Tue, 2007-05-08 at 15:47 -0700, Arlin
On May 9, 2007, at 1:37 AM, Or Gerlitz wrote:
Doing a bit of zoom out from the "how to make ofed's udapl work for
ompi" thread, my thinking is that the ompi udapl btl enablement is
actually only the first step, where for production/longterm/etc you
want to have an rdmacm btl.
I think this