Re: [openib-general] IB mcast question

2006-08-15 Thread Dotan Barak
Hi guys.

On Monday 14 August 2006 23:33, Sean Hefty wrote:
> Steve Wise wrote:
> > So is this replicating done in the mthca hca?
>
> As just an FYI, I didn't see anything wrong in the mthca driver either when I
> was looking at this problem.
>
> > Since one app is getting the mcast packet, can I assume the opensm code
> > is doing the right thing switch/port wise?
>
> That seems like a fairly safe assumption.
>
> > Should the SM get join requests for both applications that join the
> > group on the same host?  Or only the first one?
>
> Only the first join request should make it to the SA.  The second join
> request is fulfilled by ib_multicast.  This is what makes ib_multicast suspect.

What exactly is the scenario that you are running?

We have a test (over the verbs) that has 1 server and n clients.
All of the clients create QPs and attach them to the (same) multicast group
(without any join).
The server sends m messages and all of the clients get those messages in every
QP.

This test passes when it is executed on one HCA and on two HCAs (without any
switch in the middle).

Dotan

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
Can you send me this code?

I suspect the main difference is that I'm using librdmacm to join and
leave mcast groups.








Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
Just throwing out ideas here:

Maybe something in the ib_sa_mcmember_rec is prohibiting replication on
the HCA?  And maybe ib_multicast is incorrectly building this record...

struct ib_sa_mcmember_rec {
        union ib_gid mgid;
        union ib_gid port_gid;
        __be32       qkey;
        __be16       mlid;
        u8           mtu_selector;
        u8           mtu;
        u8           traffic_class;
        __be16       pkey;
        u8           rate_selector;
        u8           rate;
        u8           packet_life_time_selector;
        u8           packet_life_time;
        u8           sl;
        __be32       flow_label;
        u8           hop_limit;
        u8           scope;
        u8           join_state;
        int          proxy_join;
};







Re: [openib-general] IB mcast question

2006-08-15 Thread Roland Dreier
Steve> Just throwing out ideas here: Maybe something in the
Steve> ib_sa_mcmember_rec is prohibiting replication on the HCA?
Steve> And maybe ib_multicast is incorrectly building this
Steve> record...

Shouldn't make a difference -- if one copy of the packet arrives at
the HCA then none of the SA stuff matters as far as replicating it to
multiple QPs.

 - R.




Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
How about qp attributes?  

pkeys?

qkeys?


On Tue, 2006-08-15 at 07:15 -0700, Roland Dreier wrote:
> Steve> Just throwing out ideas here: Maybe something in the
> Steve> ib_sa_mcmember_rec is prohibiting replication on the HCA?
> Steve> And maybe ib_multicast is incorrectly building this
> Steve> record...
>
> Shouldn't make a difference -- if one copy of the packet arrives at
> the HCA then none of the SA stuff matters as far as replicating it to
> multiple QPs.
>
>  - R.





Re: [openib-general] IB mcast question

2006-08-15 Thread Roland Dreier
Steve> How about qp attributes?  pkeys? qkeys?

Good question -- yes, the QPs will need to be set up with the right
keys for packets to appear.  It's definitely something to check.

If different mcmembers are used for the first join of the group and
subsequent joins by another QP, that could explain the problem.

 - R.




Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> Steve> How about qp attributes?  pkeys? qkeys?
>
> Good question -- yes, the QPs will need to be set up with the right
> keys for packets to appear.  It's definitely something to check.

The qkeys used by the RDMA CM sound like they may be the problem.  I'll verify
this and see how to fix it if so.

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> The qkeys used by the RDMA CM sound like they may be the problem.  I'll verify
> this and see how to fix it if so.

If I set the qkeys for the QPs and MCMemberRecord to 0, I can get this to work
now.  The RDMA CM uses a qkey = port number for UD QPs, and a qkey = IPv4
address for MCMemberRecords.

A potential fix I see for this is to use the same qkey for all UD QPs and
multicast groups created by the RDMA CM.  Otherwise we restrict UD QPs to using
a single destination (remote UD QP or multicast group.)
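
A minimal sketch of the mismatch described above. It is only a model: the
struct and function names are hypothetical and are not part of librdmacm or
the verbs API. The two Q_Key rules (port number for a UD QP, IPv4 address for
the MCMemberRecord) come from the description above, and the receive check is
a simplified form of the UD rule that a QP accepts a datagram only when the
Q_Key in the packet equals its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model type; not a librdmacm or verbs structure. */
struct model_ud_qp {
    uint32_t qkey;              /* Q_Key programmed into the QP */
};

/* Rule described above: a UD QP created by the RDMA CM gets qkey = port. */
uint32_t model_qp_qkey_from_port(uint16_t port)
{
    return (uint32_t)port;
}

/* Rule described above: the MCMemberRecord gets qkey = IPv4 address. */
uint32_t model_group_qkey_from_ipv4(uint32_t ipv4_host_order)
{
    return ipv4_host_order;
}

/* Simplified UD receive check: accept only when the Q_Keys are equal. */
bool model_qp_accepts(const struct model_ud_qp *qp, uint32_t pkt_qkey)
{
    assert(qp);
    return qp->qkey == pkt_qkey;
}
```

With a QP Q_Key derived from a port such as 7174 and a group Q_Key derived
from 224.10.10.10, model_qp_accepts() returns false; with one shared Q_Key on
both sides it returns true, which is the direction of the proposed fix.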

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
On Tue, 2006-08-15 at 09:58 -0700, Sean Hefty wrote:
> > The qkeys used by the RDMA CM sound like they may be the problem.  I'll
> > verify this and see how to fix it if so.
>
> If I set the qkeys for the QPs and MCMemberRecord to 0, I can get this to work
> now.  The RDMA CM uses a qkey = port number for UD QPs, and a qkey = IPv4
> address for MCMemberRecords.
>
> A potential fix I see for this is to use the same qkey for all UD QPs and
> multicast groups created by the RDMA CM.  Otherwise we restrict UD QPs to
> using a single destination (remote UD QP or multicast group.)

I was marching to the same tune!  But I have a few points needing
clarification.

In my IP-centric mind, the sender specifies the IP mcast address and a
remote port.  All hosts with subscribers to the IP mcast address get the
packet, and all sockets on those hosts that are bound to the dst_port
receive a copy.  Other sockets on those hosts that joined the IP mcast
group but are bound to different ports will _not_ get a copy of the
packet.  In addition, the sender's local port number doesn't matter at
all in the equation.  Now how does that translate to qkeys, UD QPs, and
IB mcast?
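
The UDP/IP delivery rule stated above can be written down as a small model.
This is a sketch, not sockets code: struct model_udp_sock and
model_sock_receives() are hypothetical names, and the only inputs are the
ones the paragraph mentions (joined groups, bound port, destination port).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_GROUPS 4

/* Hypothetical model of a UDP socket; not a real sockets structure. */
struct model_udp_sock {
    uint16_t bound_port;                /* local port the socket is bound to */
    uint32_t groups[MODEL_MAX_GROUPS];  /* joined IPv4 multicast addresses */
    size_t   ngroups;                   /* number of valid entries in groups */
};

/* True when the socket gets a copy of a datagram sent to group:dst_port. */
bool model_sock_receives(const struct model_udp_sock *s,
                         uint32_t group, uint16_t dst_port)
{
    assert(s && s->ngroups <= MODEL_MAX_GROUPS);
    if (s->bound_port != dst_port)
        return false;                   /* bound to a different port */
    for (size_t i = 0; i < s->ngroups; i++)
        if (s->groups[i] == group)
            return true;                /* joined, and the port matches */
    return false;                       /* never joined this group */
}
```

The sender's source port does not appear anywhere in the function, which
matches the point that it plays no part in delivery.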

It sounds to me like the remote_qkey is used to identify the mcast group
when sending a mcast -and- to identify the set of qps on each host that
should receive the incoming mcast packets.  Is this true?  







Re: [openib-general] IB mcast question

2006-08-15 Thread Hal Rosenstock
On Tue, 2006-08-15 at 12:58, Sean Hefty wrote:
> > The qkeys used by the RDMA CM sound like they may be the problem.  I'll
> > verify this and see how to fix it if so.
>
> If I set the qkeys for the QPs and MCMemberRecord to 0, I can get this to work
> now.  The RDMA CM uses a qkey = port number for UD QPs, and a qkey = IPv4
> address for MCMemberRecords.
>
> A potential fix I see for this is to use the same qkey for all UD QPs and
> multicast groups created by the RDMA CM.  Otherwise we restrict UD QPs to
> using a single destination (remote UD QP or multicast group.)

Doesn't the QKey need to be the same as the one used for the IPoIB
broadcast group (for the partition in question) per IPoIB RFC ? It
should also be the one returned in the SA MCMemberRecord response.

-- Hal

 
 - Sean
 
 





Re: [openib-general] IB mcast question

2006-08-15 Thread Hal Rosenstock
On Tue, 2006-08-15 at 14:18, Sean Hefty wrote:
> > > A potential fix I see for this is to use the same qkey for all UD QPs and
> > > multicast groups created by the RDMA CM.  Otherwise we restrict UD QPs to
> > > using a single destination (remote UD QP or multicast group.)
> >
> > Doesn't the QKey need to be the same as the one used for the IPoIB
> > broadcast group (for the partition in question) per IPoIB RFC ? It
> > should also be the one returned in the SA MCMemberRecord response.
>
> It shouldn't.  The RDMA CM multicast groups are separate from those used by
> ipoib.

Is the IP address only used locally to construct the MGID ? What does
the MGID look like ? What signature does it use if any ?

-- Hal

 - Sean





Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> In my IP-centric mind, the sender specifies the IP mcast address and a
> remote port.  All hosts with subscribers to the IP mcast address get the
> packet, and all sockets on those hosts that are bound to the dst_port
> receive a copy.  Other sockets on those hosts that joined the IP mcast
> group but are bound to different ports will _not_ get a copy of the
> packet.  In addition, the sender's local port number doesn't matter at
> all in the equation.  Now how does that translate to qkeys, UD QPs, and
> IB mcast?

Currently, the IP address is mapped to an MGID.  Senders and receivers are
required to subscribe to the multicast group in order to receive packets from
the multicast group.  (The UD QPs must be attached to the group to get the
packet.)  The port number is not used.

Is it possible for an IP socket to receive packets from multiple multicast
groups?

> It sounds to me like the remote_qkey is used to identify the mcast group
> when sending a mcast -and- to identify the set of qps on each host that
> should receive the incoming mcast packets.  Is this true?

I think the QKey usage in the RDMA CM needs to be redone.  If we look at just UD
QP transfers, in order to support one-to-many data transfers, all of the QPs
need to have the same QKey.

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> Is the IP address only used locally to construct the MGID ? What does
> the MGID look like ? What signature does it use if any ?

The IP address may also be used to look up routing information in order to
bind to a local device.  The address is then used locally to construct the MGID.
The MGID looks a lot like the ipoib MGIDs, with byte 8 being 0x01.

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Hal Rosenstock
On Tue, 2006-08-15 at 14:33, Sean Hefty wrote:
> > Is the IP address only used locally to construct the MGID ? What does
> > the MGID look like ? What signature does it use if any ?
>
> The IP address may also be used to look up routing information in order to
> bind to a local device.  The address is then used locally to construct the MGID.
> The MGID looks a lot like the ipoib MGIDs, with byte 8 being 0x01.

One of the reserved bytes in the MGID is 1 rather than 0 and it's using
an IPv4 signature (0x401b) ?

Where does the qkey come from on the creation of the group ?

-- Hal

 - Sean





Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> One of the reserved bytes in the MGID is 1 rather than 0 and it's using
> an IPv4 signature (0x401b) ?

It uses a signature of 0x4001 to avoid conflicts with ipoib groups.
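
A rough sketch of what such an MGID could look like. The byte layout below is
an assumption: it follows the IPoIB IPv4 multicast mapping (0xff, a
flags/scope byte, a 16-bit signature, the P_Key, then the low 28 bits of the
IPv4 group address) with the signature passed in by the caller. The function
name model_build_mgid() is hypothetical and the real RDMA CM code may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Sketch only: fill a 16-byte MGID laid out like the IPoIB IPv4 multicast
 * mapping, but with a caller-supplied 16-bit signature.  The exact layout
 * used by the RDMA CM is an assumption here.
 */
void model_build_mgid(uint8_t mgid[16], uint16_t signature,
                      uint16_t pkey, uint32_t ipv4_host_order)
{
    assert(mgid);
    memset(mgid, 0, 16);
    mgid[0] = 0xff;                         /* multicast GID prefix */
    mgid[1] = 0x12;                         /* flags = 1, scope = 2 */
    mgid[2] = (uint8_t)(signature >> 8);    /* signature, high byte */
    mgid[3] = (uint8_t)(signature & 0xff);  /* signature, low byte */
    mgid[4] = (uint8_t)(pkey >> 8);         /* P_Key, high byte */
    mgid[5] = (uint8_t)(pkey & 0xff);       /* P_Key, low byte */
    /* Low 28 bits of the IPv4 group address fill the low 28 bits. */
    mgid[12] = (uint8_t)((ipv4_host_order >> 24) & 0x0f);
    mgid[13] = (uint8_t)(ipv4_host_order >> 16);
    mgid[14] = (uint8_t)(ipv4_host_order >> 8);
    mgid[15] = (uint8_t)ipv4_host_order;
}
```

Called with signature 0x4001, P_Key 0xffff and 224.10.10.10 (0xe00a0a0a), the
result starts with ff12:4001 and ends in 000a:0a0a; passing 0x401b instead
gives the ipoib-style prefix, so the two kinds of group cannot collide.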

> Where does the qkey come from on the creation of the group ?

The qkey is the same as the IPv4 address.

I need to spend some time looking at the QKeys of the QPs and the multicast
group to understand how one of the receivers worked.

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
[adding back to list]

On Tue, 2006-08-15 at 11:59 -0700, Sean Hefty wrote:

> > For type SOCK_DGRAM (UDP), the socket will receive packets from multiple
> > subscribed ip mcast groups iff the dst_port of the incoming packet
> > matches the port to which the socket is bound...
>
> This is what I was referring to.  I'm really not familiar with IP multicast
> beyond what I read in a book while coding the RDMA CM.  It sounds like we
> might be able to use the QKey as the port number for the QP to mimic the
> behavior.
>
> The RDMA CM sets the QKey for UD QPs to the port number, but sets the QKey of
> a multicast group to the IPv4 address.
>
> > NOTE: I'm just trying to understand how this works in IB.  I'm not
> > necessarily advocating it should behave exactly like ip mcast/udp.
>
> Clients need to create a UD QP.  When they join a multicast group, they get
> an MGID, MLID, and QKey.  The UD QP needs to attach to the MGID / MLID, and
> have a matching QKey.  Today, the RDMA CM assigns a QKey to a UD QP when it's
> created; it doesn't know if it will join a multicast group or not.
 

Looking at the mckey code, I see that the code calls rdma_get_dst_attr()
to get the remote qpn/qkey + the ah_attrs for the mcast group (which is
the dst addr in this case).  Then it creates an ibv_ah.  Later when
sending, the SEND WR contains both the ah and the remote qpn/qkey.  

Why are these separated?  Isn't an address handle needed for each
destination QP?  If so, then why is the remote qpn/qkey also needed to
transmit a datagram?

Trying to understand how ah's relate to qpn/qkeys...








Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> Why are these separated?  Isn't an address handle needed for each
> destination QP?  If so, then why is the remote qpn/qkey also needed to
> transmit a datagram?

The address handle doesn't include QPN/QKey information.  Maybe think of them
more as specifying the path to some port.
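
To make the split concrete, here is a small sketch with stand-in types. The
struct and function names are hypothetical; they only mirror the division
described above between the path (address handle) and the per-datagram
QPN/QKey that travel in the send work request. The one outside fact used is
that IB reserves QPN 0xFFFFFF as the destination for multicast.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an address handle: path information only. */
struct model_ah {
    uint16_t dlid;              /* destination LID (an MLID for multicast) */
    uint8_t  sl;                /* service level */
    uint8_t  port_num;          /* local port to send from */
};

/* Hypothetical stand-in for the UD part of a send work request. */
struct model_ud_send {
    const struct model_ah *ah;  /* how to reach the destination port */
    uint32_t remote_qpn;        /* which QP behind that port */
    uint32_t remote_qkey;       /* Q_Key given for this datagram */
};

#define MODEL_MCAST_QPN 0xFFFFFFu   /* QPN reserved for multicast */

/* Build a send descriptor for a multicast group reachable through ah. */
struct model_ud_send model_mcast_send(const struct model_ah *ah,
                                      uint32_t group_qkey)
{
    assert(ah);
    struct model_ud_send wr = { ah, MODEL_MCAST_QPN, group_qkey };
    return wr;
}
```

Because the handle carries no QPN or Q_Key, the same handle can in principle
be reused for sends to different QPs behind the same destination port.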

- Sean




Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
On Tue, 2006-08-15 at 12:39 -0700, Sean Hefty wrote:
> > Why are these separated?  Isn't an address handle needed for each
> > destination QP?  If so, then why is the remote qpn/qkey also needed to
> > transmit a datagram?
>
> The address handle doesn't include QPN/QKey information.  Maybe think of them
> more as specifying the path to some port.

Ok.

From what I can tell via experimentation, the qkey of the mcast group
doesn't need to have any relation to the qkeys of the qps. 

I was able to create a mcast group with the mc qkey==0xe00a0a0a, and 3
apps joined this group, but their qp qkeys were 0 (I changed
ucma_init_ud_qp() to set the qp qkey to 0).  One app sent to the mcgroup
ah/qkey/qpn and the other two received the packet.  Does that make
sense?

So maybe all we need is the concept of REUSE_PORT to allow multiple
librdma users to create cm_ids with the same local port.  Currently this
isn't allowed.  If we do this, then all processes that want to exchange
mcast packets would create cm_ids and do rdma_resolve_addr() with the
same src port number on all systems. 

Senders send to the ah/remote_qpn/remote_qkey of the mcast group.  This
routes packets to all IB ports that have subscribers.  Then since the
sender's qp has the same qkey as all the group participants each qp will
receive a copy of the packet.

The mcast setup code in librdma doesn't need to change.  IE the qkey can
remain the ip mcast address.

I think this will work.  It is similar to UDP/IP/MCAST...

Or am I all wet?

whatchathink?






Re: [openib-general] IB mcast question

2006-08-15 Thread Roland Dreier
Steve> I was able to create a mcast group with the mc
Steve> qkey==0xe00a0a0a, and 3 apps joined this group, but their
Steve> qp qkeys were 0 (I changed ucma_init_ud_qp() to set the qp
Steve> qkey to 0).  One app sent to the mcgroup ah/qkey/qpn and
Steve> the other two received the packet.  Does that make sense?

In theory the Q_Key of a multicast group record is the Q_Key you're
supposed to use when sending to the group.  Of course nothing enforces
this but I don't really like abusing things this way.

 - R.




Re: [openib-general] IB mcast question

2006-08-15 Thread Steve Wise
On Tue, 2006-08-15 at 13:17 -0700, Roland Dreier wrote:
> Steve> I was able to create a mcast group with the mc
> Steve> qkey==0xe00a0a0a, and 3 apps joined this group, but their
> Steve> qp qkeys were 0 (I changed ucma_init_ud_qp() to set the qp
> Steve> qkey to 0).  One app sent to the mcgroup ah/qkey/qpn and
> Steve> the other two received the packet.  Does that make sense?
>
> In theory the Q_Key of a multicast group record is the Q_Key you're
> supposed to use when sending to the group.  Of course nothing enforces
> this but I don't really like abusing things this way.

I _did_ send the message to the qkey of the mcg.






Re: [openib-general] IB mcast question

2006-08-15 Thread Hal Rosenstock
On Tue, 2006-08-15 at 16:07, Steve Wise wrote:
> On Tue, 2006-08-15 at 12:39 -0700, Sean Hefty wrote:
> > > Why are these separated?  Isn't an address handle needed for each
> > > destination QP?  If so, then why is the remote qpn/qkey also needed to
> > > transmit a datagram?
> >
> > The address handle doesn't include QPN/QKey information.  Maybe think of
> > them more as specifying the path to some port.
>
> Ok.
>
> From what I can tell via experimentation, the qkey of the mcast group
> doesn't need to have any relation to the qkeys of the qps.

That may be what is happening but I don't think that is correct (per the
IBA spec).

> I was able to create a mcast group with the mc qkey==0xe00a0a0a,

Don't we need to be careful about controlled Q_Keys as well ?

-- Hal

> and 3 apps joined this group, but their qp qkeys were 0 (I changed
> ucma_init_ud_qp() to set the qp qkey to 0).  One app sent to the mcgroup
> ah/qkey/qpn and the other two received the packet.  Does that make
> sense?
>
> So maybe all we need is the concept of REUSE_PORT to allow multiple
> librdma users to create cm_ids with the same local port.  Currently this
> isn't allowed.  If we do this, then all processes that want to exchange
> mcast packets would create cm_ids and do rdma_resolve_addr() with the
> same src port number on all systems.
>
> Senders send to the ah/remote_qpn/remote_qkey of the mcast group.  This
> routes packets to all IB ports that have subscribers.  Then since the
> sender's qp has the same qkey as all the group participants each qp will
> receive a copy of the packet.
>
> The mcast setup code in librdma doesn't need to change.  IE the qkey can
> remain the ip mcast address.
>
> I think this will work.  It is similar to UDP/IP/MCAST...
>
> Or am I all wet?
>
> whatchathink?
 
 
 
 





Re: [openib-general] IB mcast question

2006-08-15 Thread Sean Hefty
> > Steve> I was able to create a mcast group with the mc
> > Steve> qkey==0xe00a0a0a, and 3 apps joined this group, but their
> > Steve> qp qkeys were 0 (I changed ucma_init_ud_qp() to set the qp
> > Steve> qkey to 0).  One app sent to the mcgroup ah/qkey/qpn and
> > Steve> the other two received the packet.  Does that make sense?
> >
> > In theory the Q_Key of a multicast group record is the Q_Key you're
> > supposed to use when sending to the group.  Of course nothing enforces
> > this but I don't really like abusing things this way.
>
> I _did_ send the message to the qkey of the mcg.

I didn't think that this was supposed to work.

Is the QKey going out on the wire the QKey from the send WR, or that associated
with the QP?  I think the QKey going out on the wire is the latter, which just
happens to make it work.
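
One possible explanation, sketched as code. It assumes the controlled Q_Key
rule as I understand it from the IBA spec (worth checking against the spec
before relying on it): when the most significant bit of the Q_Key in the send
work request is set, the Q_Key from the QP context is put on the wire
instead. model_wire_qkey() is a hypothetical helper, not a verbs call.

```c
#include <stdint.h>

/*
 * Q_Key placed in the outgoing packet under the assumed controlled Q_Key
 * rule: a work request Q_Key with the top bit set is replaced by the
 * Q_Key of the sending QP.
 */
uint32_t model_wire_qkey(uint32_t wr_qkey, uint32_t qp_qkey)
{
    return (wr_qkey & 0x80000000u) ? qp_qkey : wr_qkey;
}
```

0xe00a0a0a has its top bit set, so under this rule a sender whose QP Q_Key is
0 would put 0 on the wire, which matches receivers whose QP Q_Keys are 0. A
group Q_Key below 0x80000000 would go out unchanged.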

- Sean




Re: [openib-general] IB mcast question

2006-08-14 Thread Sean Hefty
> However, if I run 2 instances of the app that reads mcasts and dumps
> them to stdout, I only get the mcast packets delivered to one of the
> applications.  Namely the first one who joins the group seems to get the
> mcasts.  I know for UDP/IP multicast, all applications bound to the same
> port and joined to the IP mcast addr will get a copy of incoming mcast
> packets.  Is this not true for IB mcast?   It appears not based on my
> tests...

My testing revealed the same issue, and I was unable to locate the root cause of
the problem.  I was not able to confirm that this configuration had ever been
successfully tested.

- Sean




Re: [openib-general] IB mcast question

2006-08-14 Thread Roland Dreier
Steve> However, if I run 2 instances of the app that reads mcasts
Steve> and dumps them to stdout, I only get the mcast packets
Steve> delivered to one of the applications.  Namely the first one
Steve> who joins the group seems to get the mcasts.  I know for
Steve> UDP/IP multicast, all applications bound to the same port
Steve> and joined to the IP mcast addr will get a copy of incoming
Steve> mcast packets.  Is this not true for IB mcast?  It appears
Steve> not based on my tests...

This should work -- multicast packets should be replicated to all
attached UD QPs.  There is likely a bug in the librdma multicast support.

 - R.




Re: [openib-general] IB mcast question

2006-08-14 Thread Roland Dreier
Sean> My testing revealed the same issue, and I was unable to
Sean> locate the root cause of the problem.  I was not able to
Sean> confirm that this configuration had ever been successfully
Sean> tested.

Are you positive ibv_attach_mcast() is called on all the QPs, and that
the MGID is passed correctly in to all calls?

 - R.




Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
On Mon, 2006-08-14 at 12:43 -0700, Roland Dreier wrote:
> Sean> My testing revealed the same issue, and I was unable to
> Sean> locate the root cause of the problem.  I was not able to
> Sean> confirm that this configuration had ever been successfully
> Sean> tested.
>
> Are you positive ibv_attach_mcast() is called on all the QPs, and that
> the MGID is passed correctly in to all calls?
>
>  - R.

I'll let you know...









Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
On Mon, 2006-08-14 at 12:31 -0700, Sean Hefty wrote:
> > However, if I run 2 instances of the app that reads mcasts and dumps
> > them to stdout, I only get the mcast packets delivered to one of the
> > applications.  Namely the first one who joins the group seems to get the
> > mcasts.  I know for UDP/IP multicast, all applications bound to the same
> > port and joined to the IP mcast addr will get a copy of incoming mcast
> > packets.  Is this not true for IB mcast?   It appears not based on my
> > tests...
>
> My testing revealed the same issue, and I was unable to locate the root cause
> of the problem.  I was not able to confirm that this configuration had ever
> been successfully tested.
>
> - Sean

Hmm.  Ok. I'll debug this.  I need to get this working...

Steve.







Re: [openib-general] IB mcast question

2006-08-14 Thread Sean Hefty
Roland Dreier wrote:
> Are you positive ibv_attach_mcast() is called on all the QPs, and that
> the MGID is passed correctly in to all calls?

Yes - ibv_attach_mcast() is being called with the same MLID, MGID by both 
receiving processes.  That doesn't necessarily mean that there's not a bug in 
ib_multicast or the RDMA CM; I just couldn't locate any.

- Sean




Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
On Mon, 2006-08-14 at 12:42 -0700, Roland Dreier wrote:
> Steve> However, if I run 2 instances of the app that reads mcasts
> Steve> and dumps them to stdout, I only get the mcast packets
> Steve> delivered to one of the applications.  Namely the first one
> Steve> who joins the group seems to get the mcasts.  I know for
> Steve> UDP/IP multicast, all applications bound to the same port
> Steve> and joined to the IP mcast addr will get a copy of incoming
> Steve> mcast packets.  Is this not true for IB mcast?  It appears
> Steve> not based on my tests...
>
> This should work -- multicast packets should be replicated to all
> attached UD QPs.  There is likely a bug in the librdma multicast support.

So is this replicating done in the mthca hca?  

Since one app is getting the mcast packet, can I assume the opensm code
is doing the right thing switch/port wise?

Should the SM get join requests for both applications that join the
group on the same host?  Or only the first one?

Steve.







Re: [openib-general] IB mcast question

2006-08-14 Thread Hal Rosenstock
Steve,
 
IB only replicates once per node (and also not on the incoming port if there 
are any members).
 
The SM tracks join states (full, non member, send only member) for a port. It 
doesn't matter whether the SM gets duplicated join requests for a port. It 
would just indicate that was OK and the node would still only get one packet 
per send from another port in that group. It's up to the HCA to replicate the 
multicast packet to all its QPs which are part of that group.
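
The per-HCA replication step can be pictured with a small model. Everything
here is hypothetical (the names are not mthca or verbs identifiers). The
Q_Key comparison is an added assumption, based on the general UD rule that a
QP drops datagrams whose Q_Key does not match its own; it is not part of the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one QP attachment; not an mthca structure. */
struct model_attach {
    uint32_t qp_num;            /* QP attached to the group */
    uint32_t qp_qkey;           /* Q_Key programmed into that QP */
    uint8_t  mgid[16];          /* group the QP is attached to */
};

/*
 * Count the attached QPs that would get a copy of one incoming multicast
 * packet: the QP must be attached to the packet's MGID and, in this
 * model, its Q_Key must equal the Q_Key carried by the packet.
 */
size_t model_hca_replicate(const struct model_attach *tbl, size_t n,
                           const uint8_t mgid[16], uint32_t pkt_qkey)
{
    size_t copies = 0;

    assert(tbl || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (memcmp(tbl[i].mgid, mgid, 16) != 0)
            continue;           /* attached to a different group */
        if (tbl[i].qp_qkey != pkt_qkey)
            continue;           /* Q_Key mismatch: no copy for this QP */
        copies++;
    }
    return copies;
}
```

Two QPs attached to the same MGID with the same Q_Key yield two copies from a
single packet on the wire; if one of them has a different Q_Key, only one
copy is delivered, which looks just like "only the first app receives".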
 
-- Hal



From: [EMAIL PROTECTED] on behalf of Steve Wise
Sent: Mon 8/14/2006 4:18 PM
To: Roland Dreier
Cc: openib-general
Subject: Re: [openib-general] IB mcast question



On Mon, 2006-08-14 at 12:42 -0700, Roland Dreier wrote:
 Steve However, if I run 2 instances of the app that reads mcasts
 Steve and dumps them to stdout, I only get the mcast packets
 Steve delivered to one of the applications.  Namely the first one
 Steve who joins the group seems to get the mcasts.  I know for
 Steve UDP/IP multicast, all applications bound to the same port
 Steve and joined to the IP mcast addr will get a copy of incoming
 Steve mcast packets.  Is this not true for IB mcast?  It appears
 Steve not based on my tests...

 This should work -- multicast packets should be replicated to all
 attached UD QPs.  There is likely a bug in the librdma multicast support.


So is this replicating done in the mthca hca? 

Since one app is getting the mcast packet, can I assume the opensm code
is doing the right thing switch/port wise?

Should the SM get join requests for both applications that join the
group on the same host?  Or only the first one?

Steve.











Re: [openib-general] IB mcast question

2006-08-14 Thread Sean Hefty
Steve Wise wrote:
> So is this replicating done in the mthca hca?

As just an FYI, I didn't see anything wrong in the mthca driver either when I
was looking at this problem.

> Since one app is getting the mcast packet, can I assume the opensm code
> is doing the right thing switch/port wise?

That seems like a fairly safe assumption.

> Should the SM get join requests for both applications that join the
> group on the same host?  Or only the first one?

Only the first join request should make it to the SA.  The second join request
is fulfilled by ib_multicast.  This is what makes ib_multicast suspect.
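
The behaviour described above amounts to reference counting joins per port.
A minimal sketch follows; the names are hypothetical and this is not the
ib_multicast code. Forwarding only the last leave is my assumption, chosen as
the mirror image of forwarding only the first join.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-port state for one multicast group. */
struct model_group {
    unsigned int local_members; /* local joins currently outstanding */
};

/* Returns true when this join must be forwarded to the SA. */
bool model_join(struct model_group *g)
{
    assert(g);
    return g->local_members++ == 0;     /* only the first join goes out */
}

/* Returns true when this leave must be forwarded to the SA. */
bool model_leave(struct model_group *g)
{
    assert(g && g->local_members > 0);
    return --g->local_members == 0;     /* only the last leave goes out */
}
```

In this model the second joiner never talks to the SA, so whatever it gets
back (MLID, Q_Key and so on) must come from the locally cached result of the
first join.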

- Sean




Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
On Mon, 2006-08-14 at 13:33 -0700, Sean Hefty wrote:
> Steve Wise wrote:
> > So is this replicating done in the mthca hca?
>
> As just an FYI, I didn't see anything wrong in the mthca driver either when I
> was looking at this problem.

Ok.  I added printks in the mcast attach/detach and they're firing as
expected:

vic18:/home/swise/zip # dmesg
mthca_multicast_attach qp_num 406 gid ff124001:000a0aff lid c003
mthca_multicast_attach qp_num 407 gid ff124001:000a0aff lid c003
mthca_multicast_detach qp_num 406 gid ff124001:000a0aff lid c003
mthca_multicast_detach qp_num 407 gid ff124001:000a0aff lid c003


> > Should the SM get join requests for both applications that join the
> > group on the same host?  Or only the first one?
>
> Only the first join request should make it to the SA.  The second join
> request is fulfilled by ib_multicast.  This is what makes ib_multicast suspect.


I'll look into this module...

Thanks,

Stevo.






Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
  
> > Only the first join request should make it to the SA.  The second join
> > request is fulfilled by ib_multicast.  This is what makes ib_multicast suspect.
>
> I'll look into this module...

ib_multicast takes care of sending the join/leave info to the SA, right?
It keeps track of _when_ to leave, for instance.  So since opensm -is-
getting the join and setting up the group, and the mcast packet is being
passed to the first member who joined, then I don't think ib_multicast
can mess up the subsequent members, can it?  

I confirmed that mthca was called to attach both qps to the mgid/mlid,
so this makes me think ib_multicast worked ok. 

I'm new to IB mcast, so I'm learning, but it seems like the mthca
firmware maybe isn't doing the right thing here.  

Any suggestions on how to further debug this? 

BTW my HCAs are at the latest firmware.  I just had them upgraded.

Steve.












Re: [openib-general] IB mcast question

2006-08-14 Thread Roland Dreier
    Steve> So is this replicating done in the mthca hca?

Yes, it should be.  There may be a bug in the mthca kernel multicast
code for handling multiple QPs attached to the same group.

    Steve> Since one app is getting the mcast packet, can I assume the
    Steve> opensm code is doing the right thing switch/port wise?

Yep.

    Steve> Should the SM get join requests for both applications that
    Steve> join the group on the same host?  Or only the first one?

No, there should only be one join request for a given port.

 - R.





Re: [openib-general] IB mcast question

2006-08-14 Thread Sean Hefty
Steve Wise wrote:
> ib_multicast takes care of sending the join/leave info to the SA, right?
> It keeps track of _when_ to leave, for instance.  So since opensm -is-
> getting the join and setting up the group, and the mcast packet is being
> passed to the first member who joined, then I don't think ib_multicast
> can mess up the subsequent members, can it?

In theory, it shouldn't mess up subsequent members.  While the first join is
active, subsequent join / leave requests to that same group should be queued.
After the first join completes, subsequent joins should get a copy of the
MCMemberRecord that was returned by the SA.

(This is a slight simplification, with the actual operation determined by the
type of join operation that occurs.  But for the RDMA CM, this is what should
happen.)

> I'm new to IB mcast, so I'm learning, but it seems like the mthca
> firmware maybe isn't doing the right thing here.

This was my suspicion, but I couldn't be certain.  It would help if anyone can
say that they've successfully tested this sort of multicast configuration,
i.e. two QPs from the same HCA in the same group.

- Sean
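
The join handling described in this message can be modelled roughly as follows. This is only a sketch: the first join for a group on a port goes to the SA, joins that arrive while it is outstanding are queued, every member receives a copy of the returned record, and only the last leave reaches the SA. All names (`mc_group`, `mc_join`, `mc_sa_reply`, `mc_leave`) are hypothetical and are not the ib_multicast API, and the record is reduced to an MGID string and an MLID.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MC_MAX_MEMBERS 8   /* arbitrary limit for this sketch */

/* Stand-in for the MCMemberRecord fields that matter here (assumption). */
struct mc_rec {
    char     mgid[40];
    uint16_t mlid;
};

enum mc_state { MC_IDLE, MC_JOINING, MC_JOINED };

/* Hypothetical per-port state for one multicast group. */
struct mc_group {
    enum mc_state  state;
    struct mc_rec  rec;                      /* valid once state == MC_JOINED */
    struct mc_rec *waiting[MC_MAX_MEMBERS];  /* joins queued for the SA reply */
    int            nwaiting;
    int            members;                  /* local joins not yet left */
    int            sa_joins_sent;            /* join requests that reached the SA */
    int            sa_leaves_sent;           /* leave requests that reached the SA */
};

/*
 * Local join.  Returns 1 if *out was filled from the cached record,
 * 0 if the request was queued behind an outstanding SA join, -1 if full.
 */
int mc_join(struct mc_group *g, struct mc_rec *out)
{
    if (g->members >= MC_MAX_MEMBERS)
        return -1;
    g->members++;

    if (g->state == MC_JOINED) {
        *out = g->rec;              /* later joins get a copy of the record */
        return 1;
    }
    if (g->state == MC_IDLE) {
        g->state = MC_JOINING;      /* only the first join goes to the SA */
        g->sa_joins_sent++;
    }
    g->waiting[g->nwaiting++] = out;
    return 0;
}

/* SA reply for the outstanding join: cache it and complete queued joins. */
void mc_sa_reply(struct mc_group *g, const struct mc_rec *reply)
{
    assert(g->state == MC_JOINING);
    g->rec = *reply;
    g->state = MC_JOINED;
    for (int i = 0; i < g->nwaiting; i++)
        *g->waiting[i] = g->rec;
    g->nwaiting = 0;
}

/* Local leave.  Only the last member leaving triggers a leave to the SA. */
void mc_leave(struct mc_group *g)
{
    assert(g->state == MC_JOINED && g->members > 0);
    if (--g->members == 0) {
        g->sa_leaves_sent++;
        g->state = MC_IDLE;
    }
}
```

In this model two applications joining the same group on one port produce exactly one SA join and one SA leave, which matches what one would expect to see in the SM logs if this layer behaves as described.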




Re: [openib-general] IB mcast question

2006-08-14 Thread Hal Rosenstock
This is not the main issue (the lack of replication is), but I don't think a
subsequent join from the same port does any harm, although ib_multicast
shouldn't be doing this. It would matter in terms of the leave, though.
 
-- Hal
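
One way to read the point about the leave, sketched under the assumption that the SA records group membership per port rather than counting join requests: a repeated join from the same port changes nothing, but a single leave then removes the port even if another local user still expects traffic. The types and functions below (`sa_group`, `sa_join`, `sa_leave`, `sa_is_member`) are hypothetical and do not come from opensm.

```c
#include <assert.h>
#include <stdbool.h>

#define SA_MAX_PORTS 16   /* arbitrary size for this sketch */

/* Hypothetical SA-side view of one group: one membership flag per port. */
struct sa_group {
    bool member[SA_MAX_PORTS];
};

/* Joining an already-joined port changes nothing: no count is kept. */
void sa_join(struct sa_group *g, int port)
{
    assert(port >= 0 && port < SA_MAX_PORTS);
    g->member[port] = true;
}

/* A single leave drops the port, however many joins preceded it. */
void sa_leave(struct sa_group *g, int port)
{
    assert(port >= 0 && port < SA_MAX_PORTS);
    g->member[port] = false;
}

bool sa_is_member(const struct sa_group *g, int port)
{
    assert(port >= 0 && port < SA_MAX_PORTS);
    return g->member[port];
}
```

Under that assumption the host side has to do the counting itself and send the leave only when its last local member is gone.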








Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
On Tue, 2006-08-15 at 00:38 +0300, Hal Rosenstock wrote:
> This is not the main issue (the lack of replication is), but I don't
> think a subsequent join from the same port does any harm, although
> ib_multicast shouldn't be doing this. It would matter in terms of the
> leave, though.

The osm logs seem to show only one join_mgrp request, when the first app
joins, and one leave_mgrp when the 2nd app exits.

So I think the interaction with OSM is okeydokey.

Steve.







Re: [openib-general] IB mcast question

2006-08-14 Thread Steve Wise
I added some debug printks in mthca_multicast_attach().

Roland, does this look ok to you? It seems correct to me:

# dmesg
mthca_multicast_attach qp_num 406 gid ff124001:000a0a0a lid c003
mthca_multicast_attach line 167 - found mgm, hash a20, prev , index a20
mthca_multicast_attach line 197 - updated mgm gid: mgm gid ff124001:000a0a0a
mthca_multicast_attach line 219 - writing mgm: mgm->qp[0] 8406 (BE)
mthca_multicast_attach qp_num 407 gid ff124001:000a0a0a lid c003
mthca_multicast_attach line 167 - found mgm, hash a20, prev , index a20
mthca_multicast_attach line 197 - updated mgm gid: mgm gid ff124001:000a0a0a
mthca_multicast_attach line 219 - writing mgm: mgm->qp[1] 8407 (BE)



On Mon, 2006-08-14 at 14:30 -0700, Sean Hefty wrote:
> Steve Wise wrote:
> > ib_multicast takes care of sending the join/leave info to the SA, right?
> > It keeps track of _when_ to leave, for instance.  So since opensm -is-
> > getting the join and setting up the group, and the mcast packet is being
> > passed to the first member who joined, then I don't think ib_multicast
> > can mess up the subsequent members, can it?
>
> In theory, it shouldn't mess up subsequent members.  While the first join is
> active, subsequent join / leave requests to that same group should be queued.
> After the first join completes, subsequent joins should get a copy of the
> MCMemberRecord that was returned by the SA.
>
> (This is a slight simplification, with the actual operation determined by the
> type of join operation that occurs.  But for the RDMA CM, this is what should
> happen.)
>
> > I'm new to IB mcast, so I'm learning, but it seems like the mthca
> > firmware maybe isn't doing the right thing here.
>
> This was my suspicion, but I couldn't be certain.  It would help if anyone can
> say that they've successfully tested this sort of multicast configuration,
> i.e. two QPs from the same HCA in the same group.
>
> - Sean





Re: [openib-general] IB mcast question

2006-08-14 Thread Roland Dreier
> I added some debug printks in mthca_multicast_attach().
>
> Roland, does this look ok to you? It seems correct to me:
>
> # dmesg
> mthca_multicast_attach qp_num 406 gid ff124001:000a0a0a lid c003
> mthca_multicast_attach line 167 - found mgm, hash a20, prev , index a20
> mthca_multicast_attach line 197 - updated mgm gid: mgm gid ff124001:000a0a0a
> mthca_multicast_attach line 219 - writing mgm: mgm->qp[0] 8406 (BE)
> mthca_multicast_attach qp_num 407 gid ff124001:000a0a0a lid c003
> mthca_multicast_attach line 167 - found mgm, hash a20, prev , index a20
> mthca_multicast_attach line 197 - updated mgm gid: mgm gid ff124001:000a0a0a
> mthca_multicast_attach line 219 - writing mgm: mgm->qp[1] 8407 (BE)

You're two steps ahead.  Yeah, that looks fine to me.

 - R.
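
For readers following the printk output, here is a simplified model of what the attach path appears to do. Each multicast group has an entry with a list of QP slots, attaching writes the QP number into the first free slot, and replication means delivering an incoming packet to every valid slot. The structure, the slot count, and the valid-bit encoding are assumptions for illustration only; this is not the mthca data layout and says nothing about what the firmware actually does.

```c
#include <assert.h>
#include <stdint.h>

#define MGM_QP_SLOTS 8            /* assumed number of QP slots per entry */
#define MGM_QP_VALID (1u << 31)   /* assumed "slot in use" flag */

/* Hypothetical multicast group entry: one slot per attached QP. */
struct mgm_entry {
    uint32_t qp[MGM_QP_SLOTS];
};

/* Attach: store qpn in the first free slot.  Returns the slot, or -1 if full. */
int mgm_attach(struct mgm_entry *m, uint32_t qpn)
{
    assert((qpn & MGM_QP_VALID) == 0);   /* QP numbers must not use the flag bit */
    for (int i = 0; i < MGM_QP_SLOTS; i++) {
        if (!(m->qp[i] & MGM_QP_VALID)) {
            m->qp[i] = qpn | MGM_QP_VALID;
            return i;
        }
    }
    return -1;
}

/* Detach: clear the slot holding qpn.  Returns 0 on success, -1 if absent. */
int mgm_detach(struct mgm_entry *m, uint32_t qpn)
{
    for (int i = 0; i < MGM_QP_SLOTS; i++) {
        if (m->qp[i] == (qpn | MGM_QP_VALID)) {
            m->qp[i] = 0;
            return 0;
        }
    }
    return -1;
}

/* Replicate: list every QP that should receive a copy.  Returns the count. */
int mgm_replicate(const struct mgm_entry *m, uint32_t *out)
{
    int n = 0;
    for (int i = 0; i < MGM_QP_SLOTS; i++) {
        if (m->qp[i] & MGM_QP_VALID)
            out[n++] = m->qp[i] & ~MGM_QP_VALID;
    }
    return n;
}
```

With QPs 0x406 and 0x407 attached, as in the log, this model would deliver each incoming packet to both. The problem reported in the thread is that only one of the two receives it.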
