I tried this patch and it is working fine. Now if I remove both of the cables connected to the destination, I get IB_WC_RETRY_EXC_ERR on the first outstanding WR on a CQ, as expected.
With this patch I think all my APM-related issues are resolved. Dotan, you can check this fix into the OFED svn. Thanks for providing the fix.

VBabu

> I checked the code of the file cm.c (of OFED 1.1) and the attribute
> alt_timeout is not mentioned anywhere in this code.
> I believe that the value of this attribute is set to zero, which means
> that the QP will wait an infinite time for the answer (that will never come).
>
> Venkatesh, can you check this issue by querying the QP attributes
> after the path was migrated?
> I think that you will find that the value of the timeout attribute is
> zero.
>
> Sean, I'm not familiar with the cm.c code, but I believe that the
> following patch will solve this issue:
>
> Index: last_stable/drivers/infiniband/core/cm.c
> ===================================================================
> --- last_stable.orig/drivers/infiniband/core/cm.c	2006-10-29 16:58:08.000000000 +0200
> +++ last_stable/drivers/infiniband/core/cm.c	2006-10-29 17:31:57.000000000 +0200
> @@ -3221,6 +3221,7 @@ static int cm_init_qp_rtr_attr(struct cm
>  		if (cm_id_priv->alt_av.ah_attr.dlid) {
>  			*qp_attr_mask |= IB_QP_ALT_PATH;
>  			qp_attr->alt_port_num = cm_id_priv->alt_av.port->port_num;
> +			qp_attr->alt_timeout = cm_id_priv->alt_av.packet_life_time;
>  			qp_attr->alt_ah_attr = cm_id_priv->alt_av.ah_attr;
>  		}
>  		ret = 0;
>
> thanks
> Dotan
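
For reference, a minimal sketch of why a zero alt_timeout hangs instead of reporting an error. It assumes the usual verbs encoding of the 5-bit local ACK timeout field: 0 disables the timer (wait forever), any other value means 4.096 usec * 2^timeout. The helper name ack_timeout_usec and the ACK_TIMEOUT_INFINITE sentinel are made up for illustration and are not part of cm.c or libibverbs.

```c
#include <stdint.h>

/* Hypothetical sentinel returned for the "wait forever" case. */
#define ACK_TIMEOUT_INFINITE (-1.0)

/*
 * Convert an encoded QP timeout / alt_timeout value to microseconds.
 * Assumed encoding: 5-bit field, 0 = no timeout (infinite wait),
 * otherwise 4.096 usec * 2^timeout.
 */
double ack_timeout_usec(uint8_t timeout)
{
    timeout &= 0x1f;                  /* only the low 5 bits are used */
    if (timeout == 0)
        return ACK_TIMEOUT_INFINITE;  /* the QP never gives up on the ACK */
    return 4.096 * (double)(1ULL << timeout);
}
```

With alt_timeout left at 0 the retry timer never fires on the alternate path, so no IB_WC_RETRY_EXC_ERR can be generated; with a value such as 14 the QP retransmits after roughly 67 msec and eventually exhausts its retry count.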