[openib-general] kdapltest regression? failing now...
I am not sure when this started, but after updating to top of trunk*, I
can no longer get kdapltest to work properly. Both ipoib and sdp are
working.

Both server and client are returning an error: DAT_INVALID_HANDLE. This
is coming from ib_create_qp(). With debugging turned on:

[EMAIL PROTECTED] ~]# ./kdapltest -T S -D mthca0a -d
kDAPL: dapl_ia_open (mthca0a, 8, 81000b806308, 81000b8062d8)
kDAPL: dapl_ia_open () returns 0x0
kDAPL: dapl_pz_create (81001ba165c8, 81000b8062e0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x20, 81000b8062e8)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0xa0, 81000b8062f0)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x10, 81000b806300)
kDAPL: dapl_evd_kcreate (81001ba165c8, 8, 1, upcall, 0x40, 81000b8062f8)
kDAPL: dapl_ep_create (81001ba165c8, 81001b9442c8, 81001ba164b0, 81001ba166e0, 81001ba22050, , 81000b806318)
kDAPL: dapl_ib_qp_alloc: ib_create_qp failed = -22
kDAPL: dapl_evd_free (81001ba22050)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba22168)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba166e0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_evd_free (81001ba164b0)
kDAPL: dapl_evd_free () returns 0x0
kDAPL: dapl_pz_free (81001b9442c8)
kDAPL: dapl_ia_query (81001ba165c8, , , 81001bba7b28)
kDAPL: dapl_ia_query () returns 0x0
kDAPL: dapl_ia_close (81001ba165c8, 1)
kDAPL: dapl_evd_free (81001ba167f8)
kDAPL: dapl_evd_free () returns 0x0
Server_Cmd.debug: 1
Server_Cmd.dapl_name: mthca0a
DT_cs_Server: IA mthca0a opened
DT_cs_Server: PZ created
DT_cs_Server: dat_ep_create error: DAT_INVALID_HANDLE
DT_cs_Server: Waiting for clients to all go away...
DT_cs_Server: Cleaning up ...
DT_cs_Server: IA mthca0a closed
DT_cs_Server (mthca0a): Exiting.
TEST INSTANCE 0
TEST return code = 1

Also, the ib_at module prints this out now when you ping (after running
kdapltest)...

ib_at: ib_at_arp_work: Process IB ARP ip 192.168.0.26 gid 0xfe82c9010a99e031

-tduffy

* running x86_64 SMP, 2.6.12-rc4, gcc 4.0.0-6, OpenIB r2414, opensm r2414
  2 machines back-2-back

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
Re: [openib-general] kdapltest regression? failing now...
I'm looking into this Tom.

The following code was added to hw/mthca/mthca_qp.c on Friday (starting
on line 1233):

    if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
        qp->sq.max_gs > dev->limits.max_sg ||
        qp->rq.max_gs > dev->limits.max_sg)
            return -EINVAL;

If anyone knows what we have set incorrectly, please let me know.

Thanks,
james

On Thu, 19 May 2005, Tom Duffy wrote:

tduffy> Both server and client are returning an error: DAT_INVALID_HANDLE. This
tduffy> is coming from ib_create_qp().
tduffy> [...]
tduffy> kDAPL: dapl_ib_qp_alloc: ib_create_qp failed = -22
tduffy> [...]
Re: [openib-general] kdapltest regression? failing now...
For what it's worth, this is the check that we are failing:

    qp->sq.max_gs > dev->limits.max_sg

(qp->sq.max_gs + 2 > dev->limits.max_sg is also true, but
qp->transport == MLX is not).

On Thu, 19 May 2005, James Lentini wrote:

> The following code was added to hw/mthca/mthca_qp.c on Friday (starting
> on line 1233):
> [...]
> If anyone knows what we have set incorrectly, please let me know.
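
The branch analysis in this message can be checked in isolation with a small user-space sketch. The struct and enum definitions below are hypothetical stand-ins that model only the fields named in the check; they are not the driver's real types, and the values in any caller are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins: only the fields used by the check are modelled. */
enum sg_transport { RC, UC, UD, MLX };

struct sg_wq     { int max_gs; };            /* scatter/gather entries per work request */
struct sg_limits { int max_sg; };            /* device-wide scatter/gather limit */
struct sg_dev    { struct sg_limits limits; };
struct sg_qp {
    enum sg_transport transport;
    struct sg_wq sq;                         /* send queue */
    struct sg_wq rq;                         /* receive queue */
};

/*
 * Returns 0 when the requested sizes fit the device, -EINVAL otherwise.
 * -EINVAL is -22 on Linux, which matches "ib_create_qp failed = -22"
 * in the kDAPL debug output.
 */
int check_sg_limits(const struct sg_dev *dev, const struct sg_qp *qp)
{
    assert(dev != NULL && qp != NULL);

    if ((qp->transport == MLX && qp->sq.max_gs + 2 > dev->limits.max_sg) ||
        qp->sq.max_gs > dev->limits.max_sg ||
        qp->rq.max_gs > dev->limits.max_sg)
        return -EINVAL;

    return 0;
}
```

For a non-MLX QP the first clause is skipped because of the `&&`, so a send queue that asks for more entries than `max_sg` is rejected by the second clause alone, which is the situation described above.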
Re: [openib-general] kdapltest regression? failing now...
I think I figured this out. DAPL was assuming a particular maximum
scatter gather list size. I'm going to change it to query for this value.

Hopefully I'll have a fix shortly.

james

On Thu, 19 May 2005, James Lentini wrote:

> For what it's worth, this is the check that we are failing:
>
>     qp->sq.max_gs > dev->limits.max_sg
> [...]
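
The idea in this message, asking the device for its limit instead of assuming one, might look roughly like the sketch below. Everything here is hypothetical: `device_attr` stands in for whatever a device query reports, `ep_request` for the sizes an endpoint asks for, and `clamp_to_device` is an illustrative helper, not a DAPL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the attributes a device query would report. */
struct device_attr { int max_sge; };

/* Hypothetical endpoint request: entries wanted per work request. */
struct ep_request { int send_sge; int recv_sge; };

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/*
 * Clamp the requested scatter/gather sizes to the queried device limit
 * rather than relying on a hard-coded maximum.
 */
void clamp_to_device(const struct device_attr *attr, struct ep_request *req)
{
    assert(attr != NULL && req != NULL);

    req->send_sge = min_int(req->send_sge, attr->max_sge);
    req->recv_sge = min_int(req->recv_sge, attr->max_sge);
}
```

A request that already fits is left unchanged; only values above the reported limit are reduced.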
Re: [openib-general] kdapltest regression? failing now...
I committed a fix for this in revision 2420. The problem turned out to be
that DAPL wasn't initializing the max_inline_data value of the QP attr's
cap structure.

Let me know if you still have any problems.

There is a patch in the pipeline that will remove the IBAT printout you
mentioned.

james

On Thu, 19 May 2005, James Lentini wrote:

> I think I figured this out. DAPL was assuming a particular maximum
> scatter gather list size. I'm going to change it to query for this value.
> [...]
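
The class of bug described here, a capability field left uninitialized so that the driver sees whatever happened to be in memory, can be avoided by zeroing the whole structure before filling it in. The sketch below is an assumption-laden illustration: `qp_cap` is a hypothetical user-space mirror of a QP capability structure (only `max_inline_data` is named in the thread), and `init_qp_cap` is not the actual fix from revision 2420.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of a QP capability structure. */
struct qp_cap {
    unsigned max_send_wr;
    unsigned max_recv_wr;
    unsigned max_send_sge;
    unsigned max_recv_sge;
    unsigned max_inline_data;
};

/*
 * Zero the structure first so that any field not set explicitly
 * (here max_inline_data) has a defined value of 0.
 */
void init_qp_cap(struct qp_cap *cap,
                 unsigned send_wr, unsigned recv_wr,
                 unsigned send_sge, unsigned recv_sge)
{
    assert(cap != NULL);

    memset(cap, 0, sizeof(*cap));
    cap->max_send_wr  = send_wr;
    cap->max_recv_wr  = recv_wr;
    cap->max_send_sge = send_sge;
    cap->max_recv_sge = recv_sge;
}
```

Zeroing first also covers fields added to the structure later, which is why it is often preferred over assigning each member by hand.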
Re: [openib-general] kdapltest regression? failing now...
On Thu, 2005-05-19 at 21:43 -0400, James Lentini wrote:
> I committed a fix for this in revision 2420. The problem turned out to be
> that DAPL wasn't initializing the max_inline_data value of the QP attr's
> cap structure.
>
> Let me know if you still have any problems.

Good job. All is well now. Tested working.

Thanks,

-tduffy