Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
>So basically what you are saying is that the TClass and FlowLabel act >as some kind of global dis-ambiguation that lets all SAs know that the >tuple MUST be matched with >on each side. Sort of... My reasoning is that if you look at a packet traveling from the source QP to the destination QP, a

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Jason Gunthorpe
On Fri, Feb 09, 2007 at 03:08:12PM -0800, Sean Hefty wrote: > The route itself is determined using the SGID, DGID, TClass, FlowLabel. > So, as long as the two queries match on these fields, I would think that it > would work. So basically what you are saying is that the TClass and FlowLabel ac

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> The hard part is the global distribution of this information. The best idea I can come up with for locating remote SAs is to have the SAs assign themselves a specific Unicast Global GID Assigned Value. So, each SA gives themselves a GID similar to: 64-bit subnet prefix :: 1. Hosts on remote
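A minimal sketch of the addressing idea in the message above, assuming a GID is a 128-bit value made of a 64-bit subnet prefix followed by a 64-bit interface ID, and that the trailing ":: 1" means an interface ID of 1. The function name `sa_gid` and the example prefix are hypothetical and not taken from the thread; Python's `ipaddress` module is used only because GIDs share the IPv6 textual format.

```python
import ipaddress

# Assumption: the ":: 1" in the proposal means "interface ID = 1".
SA_INTERFACE_ID = 1


def sa_gid(subnet_prefix: int, interface_id: int = SA_INTERFACE_ID) -> ipaddress.IPv6Address:
    """Combine a 64-bit subnet prefix and a 64-bit interface ID into a 128-bit GID."""
    if not 0 <= subnet_prefix < 2**64:
        raise ValueError("subnet prefix must fit in 64 bits")
    if not 0 <= interface_id < 2**64:
        raise ValueError("interface ID must fit in 64 bits")
    # The upper 64 bits carry the subnet prefix, the lower 64 bits the interface ID.
    return ipaddress.IPv6Address((subnet_prefix << 64) | interface_id)


if __name__ == "__main__":
    # Hypothetical global subnet prefix, chosen only for illustration.
    print(sa_gid(0x20010DB800000001))
```

Under this scheme a host that knows a remote subnet's prefix could compute the SA's GID without any lookup; how the request is then routed to that GID is the part the thread leaves open.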

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Michael Arndt
Hi, below the two files missing, sender.h and helper.c. Thanks Michael # Sender. h ## // Includes #include #include #include #include #include #include // Defines --

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> Sean: Even if you can query both SA's there isn't enough information > to force things to use the same router path in each direction. My assumption is that the remote SA contains the necessary information about how a packet coming from the local SGID to the remote DGID would be routed on the

Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-09 Thread Steve Wise
> All 4 above cases were tested by injecting random error in > iw_conn_req_handler() and running rdma_bw/krping, they were > confirmed. I added the BUG_ON() to confirm the earlier check > for id_priv->refcount==0 should always be true (and could be > removed). Can you post the test case you're us

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Jason Gunthorpe
On Fri, Feb 09, 2007 at 04:45:29PM -0500, Hal Rosenstock wrote: > >Off hand I don't see that the existing path record query structure > >has enough information to do this.. Particularly, in cases > >where each subnet has more than 1 router port there is no real > >guarantee that qu

[openib-general] MVAPICH 0.9.9-beta release is available

2007-02-09 Thread Dhabaleswar Panda
The MVAPICH team is pleased to announce the availability of MVAPICH 0.9.9-beta with the following NEW features: - Message coalescing support to enable reduction of per Queue-pair send queues for reduction in memory requirement on large scale clusters. This design also increases the small messa

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 15:34, Sean Hefty wrote: > > the /missing part (right now) is locating the SA on that > > remote subnet if this is a needed function. > > Maybe we can expose this to SA clients through a ServiceRecord? That might be one way if there were a standardized service name for SA an

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 15:19, Michael Arndt wrote: > Hi, > > > It is strange, I did similar thing (you can see in > > management/diags/src/mcm_rereg_test.c) and it worked fine for me. > > What location is that? > > >Which libibumad version you are using? Also I understand you did some > >changes

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 14:20, Jason Gunthorpe wrote: > On Fri, Feb 09, 2007 at 12:58:51PM -0500, Hal Rosenstock wrote: > > > For simplicity, assume a single path. My assumption in this case was > > > that the > > > SLID/DLID values would be reversed. That is, the LIDs are relative to > > > the

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Sasha Khapyorsky
On Fri, 2007-02-09 at 21:19 +0100, Michael Arndt wrote: > Hi, > > > It is strange, I did similar thing (you can see in > > management/diags/src/mcm_rereg_test.c) and it worked fine for me. > > What location is that? Do git clone git://git.openfabrics.org/~halr/management and find this as man

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> the /missing part (right now) is locating the SA on that > remote subnet if this is a needed function. Maybe we can expose this to SA clients through a ServiceRecord? This doesn't solve how the two SAs find each other (or any of the other difficult stuff), but with this and the path record q

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Michael Arndt
Hi, > It is strange, I did similar thing (you can see in > management/diags/src/mcm_rereg_test.c) and it worked fine for me. What location is that? >Which libibumad version you are using? Also I understand you did some >changes in the stack, is it related to user_mad? Could you publish this? I

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> - A kind of inter-subnet path record query is needed that can > return a local and remote GRH and LRH. These four structures need to > be *linked* so that: >- Side A GRH.SGID = active side's Port GID >- Side A GRH.DGID = passive side's Port GID >- Side A LRH.SLID = any active side

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 14:56, Sean Hefty wrote: > I don't see a way to issue the SA query to the remote subnet though. Even though SA queries can go intersubnet as they are GMPs and can contain a GRH, the /missing part (right now) is locating the SA on that remote subnet if this is a needed functio

Re: [openib-general] patches to 2.6.19.1 kernel for switch Operation

2007-02-09 Thread Hal Rosenstock
Suri, On Mon, 2007-02-05 at 12:31, Suresh Shelvapille wrote: > Hal: > > We are upgrading to 2.6.19.1 kernel and I finally ported the changes > required for Switch operation from my current kernel (2.6.12) version. > > I have tested these changes for a switch with different SM(s). But I need > t

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Jason Gunthorpe
On Fri, Feb 09, 2007 at 12:58:51PM -0500, Hal Rosenstock wrote: > > For simplicity, assume a single path. My assumption in this case was that > > the > > SLID/DLID values would be reversed. That is, the LIDs are relative to the > > local > > subnet, not the SGID. But if I set the SGID = DGID

Re: [openib-general] Immediate data question

2007-02-09 Thread Tang, Changqing
> > > >Not for the receiver, but the sender will be severely slowed down by > >having to wait for the RNR timeouts. > > RNR = Receiver Not Ready so by definition, the data flow > isn't going to > progress until the receiver is ready to receive data. If a > receive QP > enters RNR for a RC,

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 13:38, Michael Arndt wrote: > Hi, > > > I have no clue; I don't really understand what you have changed so it is > > hard to know. > > For example: if I send ten SMPs like: > > for (i=0;i<10;i++){ > umad_send(portid, agentid, msg, len, timeout, repeats); > }

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Sasha Khapyorsky
Hi Michael, On Fri, 2007-02-09 at 19:38 +0100, Michael Arndt wrote: > Hi, > > > I have no clue; I don't really understand what you have changed so it is > > hard to know. > > For example: if I send ten SMPs like: > > for (i=0;i<10;i++){ > umad_send(portid, agentid, msg, len, timeout

Re: [openib-general] Open MPI rpmbuild fails in OFED-1.2

2007-02-09 Thread Jeff Squyres
New SRPM on server that munges the %build section into the %install section. Yuck. :-) On Feb 7, 2007, at 11:42 AM, Vladimir Sokolovsky wrote: > Hi Jeff, > Please remove %build macro from the RPM spec file. > On SuSE distros it removes RPM_BUILD_ROOT. > > Executing(%build): /bin/sh -e /var/t

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Michael Arndt
Hi, > I have no clue; I don't really understand what you have changed so it is > hard to know. For example: if I send ten SMPs like: for (i=0;i<10;i++){ umad_send(portid, agentid, msg, len, timeout, repeats); } timeout > 0! then only the first one is sent and all other umad_

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 12:14, Michael Arndt wrote: > Hi, > > > umad_send takes the timeout in msec. 100 msec is too short. Try > > something on the order of seconds. Note also that negative 'timeout_ms' > > value makes the kernel wait for the reply forever. > > I have tried many values, but soone

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 12:22, Sean Hefty wrote: > > SLID corresponding to SGID and a DLID for some IB router on the subnet > > which can route to the remote DGID. > > This was my assumption as well. > > > An SM is free to choose SLID and DLID to supply to if there are multiple > > LIDs for the por

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-09 Thread Sean Hefty
> + member = kzalloc(sizeof *member, gfp_mask); > + if (!member) > + return ERR_PTR(-ENOMEM); This appears okay to replace with kmalloc. > + group = kzalloc(sizeof *group, gfp_mask); > + if (!group) > + return NULL; > + We would need additional

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> SLID corresponding to SGID and a DLID for some IB router on the subnet > which can route to the remote DGID. This was my assumption as well. > An SM is free to choose SLID and DLID to supply to if there are multiple > LIDs for the ports in question it can choose alternates. The key here is > wh

Re: [openib-general] Unknown SMP Recv

2007-02-09 Thread Michael Arndt
Hi, > umad_send takes the timeout in msec. 100 msec is too short. Try > something on the order of seconds. Note also that negative 'timeout_ms' > value makes the kernel wait for the reply forever. I have tried many values, but sooner or later the umad_send broke down, which is bad because the

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Sean Hefty
> I have a follow up question to this.. With CM how is the SL for each > side determined? I'm looking through the code here and it looks like > the SL of the active side is passed in the REQ to the passive side (ie > both sides are the same) But cma_query_ib_route does not set the > reversible bit

[openib-general] [PATCH] for-2.6.21 Declare iwch_ev_dispatch in iwch.h

2007-02-09 Thread Steve Wise
Declare iwch_ev_dispatch in iwch.h Remove the extern declaration from iwch.c and put it in iwch.h Signed-off-by: Steve Wise <[EMAIL PROTECTED]> --- drivers/infiniband/hw/cxgb3/iwch.c |2 -- drivers/infiniband/hw/cxgb3/iwch.h |2 ++ 2 files changed, 2 insertions(+), 2 deletions(-) diff

Re: [openib-general] [PATCH] OpenSM/osm_ucast_lash.c: In osm_get_lash_sl, fix SL when CA ports on same switch

2007-02-09 Thread Hal Rosenstock
On Fri, 2007-02-09 at 11:05, Dale Purdy wrote: > We have successfully tested this bug fix Thanks. > and would like to see it > pushed into the 1.2 branch. Already pushed for ofed_1_2. I will be sending a note to Vlad to pick these up and it should be in alpha. -- Hal > Dale > > On Thu, 8 Feb

Re: [openib-general] [PATCH] OpenSM/osm_ucast_lash.c: In osm_get_lash_sl, fix SL when CA ports on same switch

2007-02-09 Thread Dale Purdy
We have successfully tested this bug fix and would like to see it pushed into the 1.2 branch. Dale On Thu, 8 Feb 2007, Hal Rosenstock wrote: > OpenSM/osm_ucast_lash.c: In osm_get_lash_sl, fix SL when CA ports on same > switch This change resolves an issue with strange SL assignment when two HC

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-09 Thread Hal Rosenstock
Arkady, On Fri, 2007-02-09 at 10:32, Kanevsky, Arkady wrote: > Hal, > unfortunately, IBTA punted on this issue. > We considered it for IBTA CM IP address annex but at the end > could not handle all the cases. Thanks. Any idea if this issue might be addressed (no pun intended) or whether it is le

Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-09 Thread Tom Tucker
Roland: This looks bad. Lemme noodle... On Thu, 2007-02-08 at 20:23 -0800, Roland Dreier wrote: > BTW, while looking at iwcm.c, I noticed the following highly dubious > code for the first time: > > static int iwcm_deref_id(struct iwcm_id_private *cm_id_priv) > { > int r

Re: [openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes

2007-02-09 Thread Roland Dreier
Michael> What about the mthca memory registration patches? I Michael> thought they are on their way. Should I repost? Sorry, I forgot about that. Yes, please resend the latest state. ___ openib-general mailing list openib-general@openib.org ht

Re: [openib-general] [PATCH] IPOIB: Use a GRH when appropriate for unicast packets

2007-02-09 Thread Kanevsky, Arkady
Hal, unfortunately, IBTA punted on this issue. We considered it for IBTA CM IP address annex but at the end could not handle all the cases. Thanks, Arkady Kanevsky email: [EMAIL PROTECTED] Network Appliance Inc. phone: 781-768-5395 1601 Trapelo Rd. - Suite 16.

Re: [openib-general] dapl broken for iWARP

2007-02-09 Thread Kanevsky, Arkady
Mike, this is not a DAPL issue. There are 2 ways to deal with it. One is for all ULPs to use private data to exchange CM info. yes, some ULPs, like SDP do that in hello world message. Another is to let CM handle it. This way ULP does not have to deal with it. This is analogous to the IBTA CM IP ad

Re: [openib-general] dapl broken for iWARP

2007-02-09 Thread Steve Wise
On Fri, 2007-02-09 at 10:15 -0500, Kanevsky, Arkady wrote: > Steve, > what is an issue of using > max_qp_rd_atom and max_qp_init_rd_atom > beside the bad name? its a hack. But Bob already asked to do this, so I guess I will. We still don't ensure interoperability with DAPL consumers. A global

Re: [openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes

2007-02-09 Thread Steve Wise
> I understand, I did not get that. > > But for example create_read_req_cqe builds it in software. > It could build ib_wc instead. > Reads are handled in a slightly different manner. This is due to the fact that the T3 HW can complete a read out of order. For example: POST READ POST WRITE Th

Re: [openib-general] dapl broken for iWARP

2007-02-09 Thread Kanevsky, Arkady
Steve, what is an issue of using max_qp_rd_atom and max_qp_init_rd_atom beside the bad name? Thanks, Arkady Kanevsky

Re: [openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes

2007-02-09 Thread Michael S. Tsirkin
> Quoting r. Steve Wise <[EMAIL PROTECTED]>: > Subject: Re: [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes > > On Fri, 2007-02-09 at 08:51 +0200, Michael S. Tsirkin wrote: > > > > Also I agree with MST, I would like to see the core/ subdirectory die > > > > completely. > > > > > > > > > > ok ok..

Re: [openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes

2007-02-09 Thread Steve Wise
On Fri, 2007-02-09 at 08:23 -0600, Steve Wise wrote: > On Fri, 2007-02-09 at 08:51 +0200, Michael S. Tsirkin wrote: > > > > Also I agree with MST, I would like to see the core/ subdirectory die > > > > completely. > > > > > > > > > > ok ok...I'll kill the subdir... > > > > It's not just the dire

Re: [openib-general] [PATCH 0/5] iw_cxgb3 - misc cleanup and fixes

2007-02-09 Thread Steve Wise
On Fri, 2007-02-09 at 08:51 +0200, Michael S. Tsirkin wrote: > > > Also I agree with MST, I would like to see the core/ subdirectory die > > > completely. > > > > > > > ok ok...I'll kill the subdir... > > It's not just the directory BTW. Stuff like building completions in > t3_cqe format and the

Re: [openib-general] [PATCH] RDMA/iwcm: Bugs in cm_conn_req_handler()

2007-02-09 Thread Tom Tucker
Kumar: I _LOVE_ the patch and the fact that you're making this code better. I just want to tweak it a little bit... * Please convince yourself (and me ;-)) that the iw_cm_destroy_id can never block where you've put it. I'll bet that it's fine, but convince yourself too. Your comment scared me a

Re: [openib-general] [PATCH TRIVIAL] osmtest: use more descriptive constant names

2007-02-09 Thread Hal Rosenstock
On Thu, 2007-02-08 at 18:16, Sasha Khapyorsky wrote: > Use more descriptive constant names for osmtest flows. > > Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]> Thanks. Applied (to master and ofed_1_2). -- Hal

[openib-general] [PATCH ofed-1.2] ofa_user.spec: fix installation path for ehca.driver

2007-02-09 Thread Stefan Roscher
Hi Vladimir, we tested the newest ofed1.2 package and found out that ehca.driver file is not copied into /usr/local/ofed/etc/libibverbs.d/ This patch add the installation path for ehca.driver to ofa_user.spec. Please ensure you first apply the ofa_user.spec patch I sent yesterday: http://openib.

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Thu, 2007-02-08 at 23:37, Jason Gunthorpe wrote: > On Thu, Feb 08, 2007 at 03:43:24PM -0800, Sean Hefty wrote: > > > Looking at the problem more, I think that the issue extends to the remote > > > port > > > LID as well. My expectation with a local path record query is that the > > > SLID is

Re: [openib-general] Problem is routing CM REQ

2007-02-09 Thread Hal Rosenstock
On Thu, 2007-02-08 at 18:43, Sean Hefty wrote: > > Looking at the problem more, I think that the issue extends to the remote > > port > > LID as well. My expectation with a local path record query is that the > > SLID is > > the local port, and the DLID is the local router. This should be >

[openib-general] ofa_1_2_kernel 20070209-0200 daily build status

2007-02-09 Thread vlad
-2.6.12 Passed on ia64 with linux-2.6.16 Passed on ia64 with linux-2.6.14 Passed on ppc64 with linux-2.6.18 Passed on ia64 with linux-2.6.15 Passed on ia64 with linux-2.6.17 Failed: Build failed on ia64 with linux-2.6.16.21-0.8-default Log: /home/vlad/tmp/ofa_1_2_kernel-20070209-0200_linux-2.6.16.21-0.8

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-09 Thread Michael S. Tsirkin
> > Or, have you or anyone else at Voltaire read over the > > code in addition to using it? Do you see anything that should be > > cleaned up? > > OK, I most the the review i did (and interaction with Sean to add changes) was > on the rdma_cm: add multicast communication support patch, and i was

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-09 Thread Or Gerlitz
On 2/9/07, Roland Dreier <[EMAIL PROTECTED]> wrote: > I plan to review to multicast stuff next week and I hope to merge it for > 2.6.21 thanks, good news! > Or, have you or anyone else at Voltaire read over the > code in addition to using it? Do you see anything that should be > cleaned up? OK

Re: [openib-general] please pull for 2.6.21: fix + add IB multicast support

2007-02-09 Thread Michael S. Tsirkin
> Quoting Roland Dreier <[EMAIL PROTECTED]>: > Subject: Re: please pull for 2.6.21: fix + add IB multicast support > > I merged the "increment port number" and "remove redundant '_wq'" > patches from git.openfabrics.org/~shefty/scm/rdma-dev.git for-roland > > I plan to review to multicast stuff n