Venkatesh Babu wrote:

> 1. Yes, I can rearm the alternate path by sending LAP and APR messages.

does the qpair go to rearm state just by sending LAP and APR messages?
I mean, you don't have to change the QP state to REARM explicitly?

>
> 2. I was sending some network traffic (netperf) while doing these
> failovers.
>

so, I assume it's SDP's APM feature that gets tested? is that true?

thanks, som.

> VBabu
>
> somenath wrote:
>
>> hi Venkatesh:
>>
>> Two questions:
>>
>> 1. does re-enabling Migration (as defined in vol1 of ib spec in
>> 17.2.8.1.4) work for you?
>> (I mean after the 1st path failure, you do lap/apr packet transfer)
>>
>> 2. What applications are you testing with?
>>
>> thanks, som.
>>
>> Venkatesh Babu wrote:
>>
>>> I have added a couple of patches to the OFED stack as described in
>>> bug#160, bug#172, and bug#159 and with these successfully tested the
>>> APM functionality, except for one issue.
>>>
>>> Configuration:
>>> 2 Nodes
>>> CPU: AMD Opteron(tm) Processor 252 Dual processor
>>> CA type: MT25208
>>> Firmware version: 5.1.4
>>> OS: CentOS release 4.2
>>> IB: OFED 1.0
>>>
>>> 2 Flextronics 24 port switches
>>>
>>> Node1 Port1 connected to Switch1
>>> Node1 Port2 connected to Switch2
>>> Node2 Port1 connected to Switch1
>>> Node2 Port2 connected to Switch2
>>>
>>> Node1 : Active side of the RC QP
>>> Node2 : Passive side of the RC QP
>>>
>>> Test1:
>>> Failover simulation on Node1
>>> 1. Simulate the port1 failure, RC QP migrates the path to port2
>>> 2. Simulate the port1 UP to rearm the alternate path from port1
>>> 3. Simulate the port2 failure, RC QP migrates the path to port1
>>> 4. Simulate the port2 UP to rearm the alternate path from port2
>>>
>>> Test2:
>>> Real failover by manually pulling the cable
>>> 1. Simulate the failover/failback by pulling cable of Node1 port1
>>> 2. Simulate the failover/failback by pulling cable of Node1 port2
>>> 3. Simulate the failover/failback by pulling cable of Node2 port1
>>> 4. Simulate the failover/failback by pulling cable of Node2 port2
>>>
>>>
>>> ISSUE:
>>> If I pull both cables then there are no paths to the
>>> destination, so the RC QP connection is supposed to tear down. But it
>>> is not working.
>>>
>>> 1. Create an RC QP and load both primary and alternate path
>>> (I was setting rnr_retry_count = 6, retry_count = 6,
>>> packet_life_time field of struct ib_sa_path_rec to 15 and also tried
>>> with 12)
>>> 2. Send some traffic over RC QP
>>> 3. Disconnect the cable belonging to the primary path
>>> 4. It smoothly fails over to the alternate path and it becomes the
>>> primary path.
>>> No effect on the traffic on that RC QP
>>> 5. Remove the second cable belonging to the new primary path.
>>> 6. Obviously traffic stops since there are no paths to the
>>> destination. But for the outstanding WRs in the RC QP I don't get
>>> any callback from the verbs layer describing whether it succeeded or
>>> failed due to some error like IB_WC_RETRY_EXC_ERR.
>>> When I query the RC QP properties it still shows that it is in
>>> IB_QPS_RTS state.
>>>
>>>
>>> Without APM functionality it behaves correctly -
>>> 1. Create an RC QP and load only the primary path
>>> (I was setting rnr_retry_count = 6, retry_count = 6,
>>> packet_life_time field of struct ib_sa_path_rec to 15 and also tried
>>> with 12)
>>> 2. Send some traffic over RC QP
>>> 3. Disconnect the cable belonging to the primary path
>>> 4. Obviously traffic stops since there are no paths to the
>>> destination. For the outstanding WRs in the RC QP I do get a
>>> callback from the verbs layer reporting that the first WR failed
>>> due to error IB_WC_RETRY_EXC_ERR and for all other WRs I get
>>> IB_WC_WR_FLUSH_ERR.
>>> I will close this RC QP.
>>>
>>> VBabu
>>>
>>> Date: Mon, 16 Oct 2006 14:03:50 -0700
>>> From: "Sean Hefty" <[EMAIL PROTECTED]>
>>> Subject: Re: [openib-general] APM support in openib stack
>>> To: [EMAIL PROTECTED]
>>> Cc: openib-general@openib.org
>>> Message-ID: <[EMAIL PROTECTED]>
>>> Content-Type: text/plain; charset=iso-8859-1; format=flowed
>>>
>>> somenath wrote:
>>>
>>>>>>> Doesn't ib_cm_init_qp_attr() set this for you?
>>>>>
>>>>> No, it doesn't. it returns me
>>>>> attr_mask= 0x12d181
>>>>> port=0x0 alt_port=0x0
>>>
>>> Okay - there was a fix to the cm.c file (svn rev 8267) that added
>>> setting the alternate port number when initializing the QP
>>> attributes. Apparently that fix did not make it into the release
>>> that you're using.
>>>
>>> - Sean
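
For reference, the attr_mask value quoted in the thread (0x12d181) can be
decoded bit by bit to see which QP attributes ib_cm_init_qp_attr() claimed
to have filled in. The following is only a minimal sketch: the helper name
decode_qp_attr_mask() is hypothetical, and the bit positions are an
assumption based on how enum ib_qp_attr_mask is laid out in the kernel's
ib_verbs.h as I recall it, so please check them against the tree you are
actually building.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Assumed layout of enum ib_qp_attr_mask (verify against ib_verbs.h in
 * your own tree; these values are not taken from the thread).
 */
struct qp_mask_bit {
	unsigned int bit;
	const char *name;
};

static const struct qp_mask_bit qp_attr_bits[] = {
	{ 1u << 0,  "IB_QP_STATE" },
	{ 1u << 1,  "IB_QP_CUR_STATE" },
	{ 1u << 2,  "IB_QP_EN_SQD_ASYNC_NOTIFY" },
	{ 1u << 3,  "IB_QP_ACCESS_FLAGS" },
	{ 1u << 4,  "IB_QP_PKEY_INDEX" },
	{ 1u << 5,  "IB_QP_PORT" },
	{ 1u << 6,  "IB_QP_QKEY" },
	{ 1u << 7,  "IB_QP_AV" },
	{ 1u << 8,  "IB_QP_PATH_MTU" },
	{ 1u << 9,  "IB_QP_TIMEOUT" },
	{ 1u << 10, "IB_QP_RETRY_CNT" },
	{ 1u << 11, "IB_QP_RNR_RETRY" },
	{ 1u << 12, "IB_QP_RQ_PSN" },
	{ 1u << 13, "IB_QP_MAX_QP_RD_ATOMIC" },
	{ 1u << 14, "IB_QP_ALT_PATH" },
	{ 1u << 15, "IB_QP_MIN_RNR_TIMER" },
	{ 1u << 16, "IB_QP_SQ_PSN" },
	{ 1u << 17, "IB_QP_MAX_DEST_RD_ATOMIC" },
	{ 1u << 18, "IB_QP_PATH_MIG_STATE" },
	{ 1u << 19, "IB_QP_CAP" },
	{ 1u << 20, "IB_QP_DEST_QPN" },
};

/*
 * Hypothetical helper: writes the names of the set bits, lowest bit
 * first and separated by " | ", into out. Returns how many known bits
 * were written. Stops early if the buffer is too small; the string is
 * always NUL-terminated.
 */
int decode_qp_attr_mask(unsigned int mask, char *out, size_t outlen)
{
	size_t used = 0;
	size_t i;
	int count = 0;

	if (outlen == 0)
		return 0;
	out[0] = '\0';

	for (i = 0; i < sizeof(qp_attr_bits) / sizeof(qp_attr_bits[0]); i++) {
		int n;

		if (!(mask & qp_attr_bits[i].bit))
			continue;

		n = snprintf(out + used, outlen - used, "%s%s",
			     count ? " | " : "", qp_attr_bits[i].name);
		if (n < 0 || (size_t)n >= outlen - used)
			break;	/* out of room: keep what fits */

		used += (size_t)n;
		count++;
	}
	return count;
}
```

Under that assumed layout, decode_qp_attr_mask(0x12d181, buf, sizeof(buf))
with a 256-byte buffer reports eight bits: IB_QP_STATE, IB_QP_AV,
IB_QP_PATH_MTU, IB_QP_RQ_PSN, IB_QP_ALT_PATH, IB_QP_MIN_RNR_TIMER,
IB_QP_MAX_DEST_RD_ATOMIC and IB_QP_DEST_QPN. If the layout holds,
IB_QP_ALT_PATH is set in the mask even though alt_port came back as 0x0,
which would fit Sean's explanation that the alternate port number was
simply not being filled in by that release.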