Re: [openib-general] APM support in openib stack

2006-10-30 Thread Sean Hefty
> Sean, I'm not familiar with the cm.c code, but I believe that the > following patch will solve this issue: > > Index: last_stable/drivers/infiniband/core/cm.c > === > --- last_stable.orig/drivers/infiniband/core/cm.c 2006-10-

Re: [openib-general] APM support in openib stack

2006-10-30 Thread Venkatesh Babu
I tried this patch and it is working fine. Now if I remove both the cables connected to the destination, I get IB_WC_RETRY_EXC_ERR on the first outstanding WR on a CQ, as expected. With this patch I think all my APM-related issues are resolved. Dotan, you can check this fix into the OFED s

Re: [openib-general] APM support in openib stack

2006-10-30 Thread Venkatesh Babu
Thanks for sending the patch. I will try this out and let you know the results. VBabu Dotan Barak wrote: > Hi. > > > Venkatesh Babu wrote: > >> I don't think there is any event which says "path1 is back again". >> It is the application which needs to load the alternate path. The HW >> just

Re: [openib-general] APM support in openib stack

2006-10-29 Thread Dotan Barak
Hi. Venkatesh Babu wrote: > I don't think there is any event which says "path1 is back again". It > is the application which needs to load the alternate path. The HW just > sends an event IB_EVENT_PORT_ACTIVE when the port comes up. Upon receipt of > this event the application has to see if th

Re: [openib-general] APM support in openib stack

2006-10-29 Thread Tziporet Koren
Venkatesh Babu wrote: > Any comments on the issue described in the following email ? > > It doesn't look like a firmware problem. I had got the APM working on > the same Mellanox HCA cards with IBGD 1.8.2 stack. With OFED 1.0 stack I > am getting the following problem. I guess it is some problem i

Re: [openib-general] APM support in openib stack

2006-10-27 Thread Venkatesh Babu
I don't think there is any event which says "path1 is back again". It is the application which needs to load the alternate path. The HW just sends an event IB_EVENT_PORT_ACTIVE when the port comes up. Upon receipt of this event the application has to see if there exists a path from this port t

Re: [openib-general] APM support in openib stack

2006-10-27 Thread somenath
Sean Hefty wrote: > Tang, Changqing wrote: > >> Is there a way (not limited to cm) to know that path 1 is back again and >> reload it as the new alternate path ? If path 1 is down, we can not set >> it as alternate path, right ? >> >> If we can not bring a path back on the fly, the usage of APM is lim

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Venkatesh Babu
Any comments on the issue described in the following email ? It doesn't look like a firmware problem. I had got the APM working on the same Mellanox HCA cards with IBGD 1.8.2 stack. With OFED 1.0 stack I am getting the following problem. I guess it is some problem in initializing the timers to th

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Venkatesh Babu
I see your point. With my ULP module this was not the case, since this was the only module which was registering during the module init and unregistering during module exit. From the generic OFED stack's perspective, you are right. This condition needs to be addressed. This possibility exists in ot

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Venkatesh Babu
Yes, this is possible too. The alternate path specified in ib_send_cm_lap() can be used to set cm_id_priv->alt_av. Then in ib_cm_init_qp_attr() cm_id_priv->alt_av can be used to initialize the fields for making path_mig_state transitions to IB_MIG_REARM or IB_MIG_MIGRATED. VBabu Sean Hefty wrote

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Roland Dreier
> All the patches are attached to the bugzilla bug reports. bugzilla really isn't the best way to discuss patches. In the future, I would suggest sending the patches to openib-general so that everyone can see them and discuss them. - R.

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Sean Hefty
Venkatesh Babu wrote: > The patch for ib_sa_serv_notice_hdlr() handles multiple callers > registering for the same event by maintaining a linked list of callback > handlers. So I didn't think that the multiple user count was necessary. > The actual registration with the MAD layer happens only wh

Re: [openib-general] APM support in openib stack

2006-10-26 Thread Sean Hefty
Venkatesh Babu wrote: > Loading the alternate path may not necessarily indicate the failover. It > is just saying that the alternate path is available. Failover actually > happens only when any component of the primary path, such as the local port, > remote port, or a switch port, goes down. We should get some ev

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Venkatesh Babu
The patch for ib_sa_serv_notice_hdlr() handles multiple callers registering for the same event by maintaining a linked list of callback handlers. So I didn't think that the multiple user count was necessary. The actual registration with the MAD layer happens only when the ib_sa module is initia
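
[Editor's sketch] The linked list of callback handlers mentioned here could look roughly like the standalone C below. It is not the code from the patch: the callback signature and the notice_register(), notice_unregister(), and notice_dispatch() names are hypothetical, and locking is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callback type; the real patch may use a different signature. */
typedef void (*notice_cb)(int event, void *context);

struct notice_handler {
    notice_cb cb;
    void *context;
    struct notice_handler *next;
};

static struct notice_handler *handler_list = NULL;

/* Add a handler at the head of the list. Returns 0 on success, -1 on failure. */
int notice_register(notice_cb cb, void *context)
{
    struct notice_handler *h;

    if (cb == NULL)
        return -1;
    h = malloc(sizeof(*h));
    if (h == NULL)
        return -1;
    h->cb = cb;
    h->context = context;
    h->next = handler_list;
    handler_list = h;
    return 0;
}

/* Remove the first handler matching cb and context. Returns -1 if not found. */
int notice_unregister(notice_cb cb, void *context)
{
    struct notice_handler **pp = &handler_list;

    while (*pp != NULL) {
        struct notice_handler *h = *pp;
        if (h->cb == cb && h->context == context) {
            *pp = h->next;   /* unlink before freeing */
            free(h);
            return 0;
        }
        pp = &h->next;
    }
    return -1;
}

/* Deliver one event to every registered handler; returns how many were called. */
int notice_dispatch(int event)
{
    int called = 0;
    struct notice_handler *h;

    for (h = handler_list; h != NULL; h = h->next) {
        assert(h->cb != NULL);   /* guaranteed by notice_register() */
        h->cb(event, h->context);
        called++;
    }
    return called;
}

/* Example handler: counts events in the int that context points to. */
void count_events(int event, void *context)
{
    (void)event;
    ++*(int *)context;
}
```

With a list like this, several callers can register for the same event and each gets its own callback, which is why a separate user count may not be needed in that design.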

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Venkatesh Babu
All the patches are attached to the bugzilla bug reports. Yes, it is possible to enhance ib_cm_init_qp_attr(), but it requires a new parameter *alternate_path and changing it might break the compatibility with the existing code. So I thought it is better to add a new interface, so that it won't

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Sean Hefty
Tang, Changqing wrote: > Is there a way (not limited to cm) to know that path 1 is back again and > reload it as the new alternate path ? If path 1 is down, we can not set > it as alternate path, right ? > > If we can not bring a path back on the fly, the usage of APM is limited. Yes - the usage is c

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Tang, Changqing
> How do you know that the old path is back ? Do you have a notification > handler called, or you need to query periodically ? The cm doesn't know that it's back. It records t

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Sean Hefty
> How do you know that the old path is back ? Do you have a notification > handler called, or you need to query periodically ? The cm doesn't know that it's back. It records the original path used when sending the REQ and uses that same path for all future messages. - Sean

Re: [openib-general] APM support in openib stack

2006-10-25 Thread Tang, Changqing
How do you know that the old path is back ? Do you have a notification handler called, or you need to query periodically ? Thanks. --CQ > another question: If one brings back the old path (after the first > failure) and use the old path record > to do lap/apr then reenabling migration using L

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Sean Hefty
>The problem is for the remote node where the RC QP listen accepted >this connection (Passive side). There is no way for this node to know >that port failure has occurred on the Active side. So it requires some >interfaces to get this notification. So ib_sa_serv_notice_hdlr() >interface as desc

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Sean Hefty
>but this function > >cm_init_qp_rts_attr(struct cm_id_private *cm_id_priv, > struct ib_qp_attr *qp_attr, > int *qp_attr_mask) > >does perform the state transition using >if (cm_id_priv->alt_av.ah_attr.dlid) { >*qp_attr_mask |= IB_QP_PATH_MIG_STATE; >

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Sean Hefty
> I have implemented this interface and proposing this in bug#172. You >have to patch it manually to get it working. Can you post the patch to the list? If the functionality is needed, I'd like to queue it for 2.6.20. Although, I'd prefer if the existing ib_cm_init_qp_attr() routine were used wi

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Venkatesh Babu
I have implemented this interface and am proposing it in bug#172. You have to patch it manually to get it working. VBabu somenath wrote: > I don't see that API (ib_cm_init_rearm_attr()) in the build I am using. > > so I use a function like this: > > modifyqp_rearm(con_t *con) { >struct ib_q

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Venkatesh Babu
Sean Hefty wrote: > somenath wrote: > >> any idea when this stuff will get done? > > > As a rough estimate, Q1, though I will try to get to it after adding > support for SA events, which likely would be near late November, early > December. The framework in the ib_cm is there; I just need a w

Re: [openib-general] APM support in openib stack

2006-10-24 Thread somenath
I don't see that API (ib_cm_init_rearm_attr()) in the build I am using. so I use a function like this: modifyqp_rearm(con_t *con) { struct ib_qp_attr qp_attr; int qp_attr_mask =0; int ib_stat =0; memset(&qp_attr, 0, sizeof(struct ib_qp_attr)); qp_attr.qp_state = IB_QPS_RTS;

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Venkatesh Babu
1. No, my application has to make the state transition from RTS to RTS. In case of failover set path_mig_state to IB_MIG_MIGRATED. In case of rearming call the function ib_cm_init_rearm_attr(). This function is defined in the bug#172. 2. No, I have my own ULP module which sits alongside the SDP
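
[Editor's sketch] The two software actions named in point 1 (force a failover, then rearm) can be pictured as a small state machine. The standalone C below is a simplified model, not driver code: the local enum only mirrors the three path migration states, the function names are hypothetical, and the set of permitted transitions is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the three path migration states (migrated, rearm, armed). */
enum mig_state { MIG_MIGRATED, MIG_REARM, MIG_ARMED };

/*
 * Software forces a failover while the QP stays in RTS.
 * Assumption: only meaningful when an alternate path is armed.
 * Returns 0 on success, -1 if the request is rejected.
 */
int request_failover(enum mig_state *s)
{
    assert(s != NULL);
    if (*s != MIG_ARMED)
        return -1;
    *s = MIG_MIGRATED;
    return 0;
}

/*
 * Software asks for rearming after a new alternate path has been loaded.
 * Assumption: only allowed once a migration has completed.
 */
int request_rearm(enum mig_state *s)
{
    assert(s != NULL);
    if (*s != MIG_MIGRATED)
        return -1;
    *s = MIG_REARM;
    return 0;
}

/* Models the hardware moving from rearm to armed once both sides are ready. */
void hw_armed(enum mig_state *s)
{
    assert(s != NULL);
    if (*s == MIG_REARM)
        *s = MIG_ARMED;
}
```

In this model, a full cycle is armed, failover to migrated, rearm request, and back to armed once the hardware reports both ends ready.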

Re: [openib-general] APM support in openib stack

2006-10-24 Thread somenath
Sean Hefty wrote: > somenath wrote: > >> any idea when this stuff will get done? > > > As a rough estimate, Q1, though I will try to get to it after adding > support for SA events, which likely would be near late November, early > December. The framework in the ib_cm is there; I just need a way

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Sean Hefty
somenath wrote: > any idea when this stuff will get done? As a rough estimate, Q1, though I will try to get to it after adding support for SA events, which likely would be near late November, early December. The framework in the ib_cm is there; I just need a way to signal that failover has oc

Re: [openib-general] APM support in openib stack

2006-10-24 Thread somenath
Sean Hefty wrote: >> 1. does re-enabling Migration (as defined in vol1 of ib spec in >> 17.2.8.1.4) work for you? >> (I mean after the 1st path failure, you do lap/apr packet transfer) > > > I believe that there's other issues that need to be fixed for this to > fully work. The ib_cm uses the p

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Sean Hefty
> 1. does re-enabling Migration (as defined in vol1 of ib spec in > 17.2.8.1.4) work for you? > (I mean after the 1st path failure, you do lap/apr packet transfer) I believe that there's other issues that need to be fixed for this to fully work. The ib_cm uses the primary_path specified during

Re: [openib-general] APM support in openib stack

2006-10-24 Thread somenath
>> (I was setting rnr_retry_count = 6, retry_count = 6, >>> packet_life_time field of struct ib_sa_path_rec to 15 and also tried >>> with 12) >>> 2. Send some traffic over RC QP >>> 3. Disconnect the cable belonging to

Re: [openib-general] APM support in openib stack

2006-10-24 Thread Venkatesh Babu
setting rnr_retry_count = 6, retry_count = 6, >> packet_life_time field of struct ib_sa_path_rec to 15 and also tried >> with 12) >> 2. Send some traffic over RC QP >> 3. Disconnect the cable belonging to the primary path >> 4. Obviously traffic stops since there are

Re: [openib-general] APM support in openib stack

2006-10-24 Thread somenath
cable belonging to the primary path > 4. Obviously traffic stops since there are no paths to the > destination. For the outstanding WRs in the RC QP I do get a callback > from the verbs layer describing the first WR that it failed due to > error IB_WC_RETRY_EXC_ERR and for all

Re: [openib-general] APM support in openib stack

2006-10-20 Thread Venkatesh Babu
I get IB_WC_WR_FLUSH_ERR. I will close this RC QP. VBabu Date: Mon, 16 Oct 2006 14:03:50 -0700 From: "Sean Hefty" <[EMAIL PROTECTED]> Subject: Re: [openib-general] APM support in openib stack To: [EMAIL PROTECTED] Cc: openib-general@openib.org Message-ID: <[EMAIL PROTECTED]> Content-Ty

Re: [openib-general] APM support in openib stack

2006-10-16 Thread Sean Hefty
somenath wrote: >> Doesn't ib_cm_init_qp_attr() set this for you? > > No, it doesn't. it returns me > attr_mask=0x12d181 > port=0x0 alt_port=0x0 Okay - there was a fix to the cm.c file (svn rev 8267) that added setting the alternate port number when initializing the QP attributes. Appar

Re: [openib-general] APM support in openib stack

2006-10-16 Thread somenath
Sean Hefty wrote: >>now I am able to create connections with pri_path = path1, alt_path = path2 >>with the following change in the code. >> >>I specify the port_num and alt_port_num before calling ib_modify_qp() to >>change >>state to RTR (earlier I was changing this when modifying state to >>IB_QP

Re: [openib-general] APM support in openib stack

2006-10-16 Thread Sean Hefty
>now I am able to create connections with pri_path = path1, alt_path = path2 >with the following change in the code. > >I specify the port_num and alt_port_num before calling ib_modify_qp() to >change >state to RTR (earlier I was changing this when modifying state to >IB_QPS_INIT). Doesn't ib_cm_in

Re: [openib-general] APM support in openib stack

2006-10-16 Thread somenath
Sean Hefty wrote: >>>pri_path = alt_path = path 1 >>>works >>> >>> >>> >>no, I haven't tested that. I can try that too, if u think that can >>provide useful info.. >> >> > >I misunderstood one of your earlier e-mails then. I threw together a test case >to try this, and it worked for me.

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Sean Hefty
>> pri_path = alt_path = path 1 >> works >> > >no, I haven't tested that. I can try that too, if u think that can >provide useful info.. I misunderstood one of your earlier e-mails then. I threw together a test case to try this, and it worked for me. Can you see if the same works for you? If no

Re: [openib-general] APM support in openib stack

2006-10-13 Thread somenath
somenath wrote: > Sean Hefty wrote: > >> Can you confirm that I have your test failure correctly? >> >> pri_path = path 1 >> alt_path = NULL >> works > > > > yes. > >> >> pri_path = path 2 >> alt_path = NULL >> works >> > yes. > >> pri_path = alt_path = path 1 >> works >> > > no, I haven't tested

Re: [openib-general] APM support in openib stack

2006-10-13 Thread somenath
Sean Hefty wrote: > Can you confirm that I have your test failure correctly? > > pri_path = path 1 > alt_path = NULL > works yes. > > pri_path = path 2 > alt_path = NULL > works > yes. > pri_path = alt_path = path 1 > works > no, I haven't tested that. I can try that too, if u think that can

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Sean Hefty
Can you confirm that I have your test failure correctly? pri_path = path 1 alt_path = NULL works pri_path = path 2 alt_path = NULL works pri_path = alt_path = path 1 works pri_path = alt_path = path 2 works pri_path = path 1 alt_path = path 2 fails Is the only difference between path 1 and 2

Re: [openib-general] APM support in openib stack

2006-10-13 Thread somenath
Sean Hefty wrote: > somenath wrote: > >> I must be missing something here: do I have to do anything with: >> >> 1.path_mig_state field of ib_qp_attr? > > > This should be set when transitioning to RTS. The field is set by the > ib_cm in ib_cm_init_qp_attr. (Actually, are you using this call to

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Sean Hefty
somenath wrote: > I must be missing something here: do I have to do anything with: > > 1.path_mig_state field of ib_qp_attr? This should be set when transitioning to RTS. The field is set by the ib_cm in ib_cm_init_qp_attr. (Actually, are you using this call to set your QP attributes?) > 2.

Re: [openib-general] APM support in openib stack

2006-10-13 Thread somenath
Sean Hefty wrote: >>ib_sa_path_rec_get( >> device, >> HCA_PRM_PORT, /* first port =1, second >>port=2 */ >> >> > >Note that this tells the ib_sa module which port to send the request out on. >It's separate from the actual path information being req

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Sean Hefty
> ib_sa_path_rec_get( >device, >HCA_PRM_PORT, /* first port =1, second >port=2 */ Note that this tells the ib_sa module which port to send the request out on. It's separate from the actual path information being requested, which is based on the GID

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Michael Krause
At 11:24 AM 10/13/2006, Sean Hefty wrote: > >3. in req_handler() we follow the same steps as we have done without APM.. > > i.e. create qpairs, change qp state to RTR and then send REP. > > > >however, when trying to change state to RTR using ib_modify_qp() I get > >an error (-22). > > > >two in

Re: [openib-general] APM support in openib stack

2006-10-13 Thread somenath
Sean Hefty wrote: >>3. in req_handler() we follow the same steps as we have done without APM.. >> i.e. create qpairs, change qp state to RTR and then send REP. >> >>however, when trying to change state to RTR using ib_modify_qp() I get >>an error (-22). >> >>two info: same code will work if I pa

Re: [openib-general] APM support in openib stack

2006-10-13 Thread Sean Hefty
>3. in req_handler() we follow the same steps as we have done without APM.. > i.e. create qpairs, change qp state to RTR and then send REP. > >however, when trying to change state to RTR using ib_modify_qp() I get >an error (-22). > >two info: same code will work if I pass alt_path as NULL or ch

[openib-general] APM support in openib stack

2006-10-12 Thread somenath
Hi, I am trying to use the APM support in the openib kernel stack and am facing some problems. Here are the steps I follow: 1. First resolve both paths, primary and alternate. 2. Send REQ using: active_param.primary_path = path; active_param.alternate_path = alt_path; ib_send_cm_req( cm_id, &