On Wed, 2005-06-08 at 16:44, Tom Duffy wrote:
> This just in! Don't know if was caused by your patches or mine, but I
> ran the server on 192.168.0.26, and tried to connect with the client, it
> didn't connect right away, so I did a control-c. Then, the kernel
> panic'ed:
>
> [EMAIL PROTECTED] ~]# ./kdapltest -T Q -d -s 192.168.0.26 -D mthca0a
> Server Name: 192.168.0.26
> Server Net Address: 192.168.0.26
> DT_cs_Client: Starting Test ...
> DT_cs_Client: IA mthca0a opened
> DT_cs_Client: EP created
> ***** DAPL Characteristics *****
> Provider: mthca0a Version 1.0 DAPL 1.2
> Adapter: Generic InfiniBand HCA by Linux Version 0.0
> Supporting:
> 64512 EPs with 65535 DTOs and 0 in RDMA/RDs and 0ut RDMA/RDs each
> 65408 EVDs of up to 65535 entries (default S/R size is 256/256)
> IOVs of up to 28 elements
> 131056 LMRs (and 131056 RMRs) of up to 0xffffffffffffffff bytes
> Maximum MTU 0x80000000 bytes, RDMA 0x80000000 bytes
> Maximum Private data size 92 bytes
> ***** ***** ***** ***** ***** *****
> DT_cs_Client: Posting 1 recv buffer
> DT_cs_Client: Connect Endpoint
> DT_cs_Client: Await connection ...
>
> <-- I DID A CONTROL-C HERE -->
>
> [EMAIL PROTECTED] ~]# dapl_path_comp_handler: path resolution failed -110
> retry 1802201964!!!
> d<ap4>l_ibpa_atht:_c romeqp__henand:dl peren: d epff_pfftr81
> 000x65ab622b6bbe46b0 6bal6bre6bad6by
> completed? status 3
> dapl_path_comp_handler: path resolution failed -110 retry 1802201965!!!
> dapl_path_comp_handler: ep_ptr 0x6b6b6b6b6b6b6b6b
> general protection fault: 0000 [1] SMP
> CPU 0
> Modules linked in: kdapltest ib_dat_provider dat ib_at ib_ipoib ib_sdp ib_cm
> md5 ipv6 parport_pc lp parport autofs4 nfs lockd rfcomm l2cap bluetooth
> pcmcia yenta_socket rsrc_nonstatic pcmcia_core sunrpc ext3 jbd dm_mod video
> container button battery ac ohci_hcd tpm_nsc tpm i2c_amd756 i2c_core ib_mthca
> ib_sa ib_mad ib_core tg3 floppy xfs exportfs mptscsih mptbase sd_mod scsi_mod
> Pid: 11881, comm: ib_at_wq/0 Not tainted 2.6.12-rc6openib
> RIP: 0010:[<ffffffff88309c2d>]
> <ffffffff88309c2d>{:ib_dat_provider:dapl_evd_connection_callback+67}
> RSP: 0018:ffff8100218b9dc8 EFLAGS: 00010296
> RAX: 6b6b6b6b6b6b6b6b RBX: ffff81005a22be40 RCX: 0000000000004008
> RDX: ffff810075baaaf8 RSI: ffffffff883141a0 RDI: 0000000000000048
> RBP: ffff81005a22be70 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000000
> R10: 0000000000000010 R11: 0000000000000010 R12: ffff81007d9b8c50
> R13: ffff81005a22be40 R14: 0000000000000292 R15: ffffffff882f5280
> FS: 00002aaaaaad7d60(0000) GS:ffffffff804e7880(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00007fffffd1dce6 CR3: 00000000769db000 CR4: 00000000000006e0
> Process ib_at_wq/0 (pid: 11881, threadinfo ffff8100218b8000, task
> ffff81004167caa0)
> Stack: ffff81004167caa0 6b6b6b6b6b6b6b6b 0000000000000000 000040080000006e
> ffff810075baaaf8 ffff81003fdc8f10 0000000000000000 0000000000000003
> ffff8100218b9e48 ffffffff88302d18
> Call Trace:<ffffffff88302d18>{:ib_dat_provider:dapl_path_comp_handler+416}
> <ffffffff882f529b>{:ib_at:req_comp_work+27}
> <ffffffff8014912c>{worker_thread+476}
> <ffffffff801327c0>{default_wake_function+0}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff80148f50>{worker_thread+0}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff8014da59>{kthread+217} <ffffffff80133d10>{schedule_tail+64}
> <ffffffff8010f6db>{child_rip+8}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff8014d980>{kthread+0} <ffffffff8010f6d3>{child_rip+0}
>
>
> Code: 48 8b 80 90 00 00 00 48 89 44 24 28 c7 44 24 30 00 00 00 00
> RIP <ffffffff88309c2d>{:ib_dat_provider:dapl_evd_connection_callback+67} RSP
> <ffff8100218b9dc8>
> <<0>3>geSlneabra cl orprruotpteciotin:on s ftaaurtlt=f: ff00f80010 [072]5b
> aa<4af>S8,MP l en
> 51CP2 =
> 1R e<dz4>on
> e:M 0odx5ula2escf l07in1/ke0xd 5ain2c:f0 k71da.
> teLastst iusb_erda: t_[<prffovffidfferff8 d83at02 3fib5>_a]t(
> daibpl_i_dpoesibtro iy_b_cms_dpid +0ibxb_c9/m0 xbmde 5[ ibip_dv6at
> _pparorpviordter_p])c l
> p<04>f0 p:a rp6bor t6 ba ut6bofs 64b n 6fsb l6bock 6db rf6bco mm6d l 62cba
> p6b bl 6uebto 6otbh 6bpc m6ciba 6bye
> aP_sreocv keobtj:<4 s> tarsrtrc=f_nffonf8st10at07ic5b aapc8emc0,ia
> l_cenor=5e1 2 nRrpedcz<on4e> : ex0xt3170 jfcbd2a 5/dm0x_m17od0fc v2aid5.eo
> Lcoasntt aiusneerr: [bu<fttffonfff bffat88te0ary21 91ac>]
> (ohkmciem_h_acdllo tc+pm0x_n61sc/0 xetp0 m[ xfi2s]c_)am
> d750060: i2 0c_0co 0re0 i00b_ m0th0ca 00 i b_00sa 00ib _m00ad 0 i1b _c00or
> ec 0tg a32 fl0o6pp 0y0 xf00s 0ex0p
> tf01s0 :mp 0ts0c si02h m00ptb 0as0e 00sd _m00od 0 s0c si00_mo 0d2
> 4>Pi 0d:0 <114>88 82,0 coa3mm : 06ib_ 0at0_w 0q/01 N00ot <
> taiNentxted o 2bj.6:. s12ta-rrtc6=fopffenf8ib10
> 075RIbaP:ad 01001, 0:le[<n=ff51ff2
> ffRe88dz30on9ce:2d 0>]x1 70<4fc><2af5f/ff0xff17ff0f88c230a59c.
> >L{:asibt _dusater_p:
> >ro[<viffdeffr:ffdaffpl88_e0avd21_c91on>]nec(ktimeonm__calallolbc+ac0xk+6167/0}xe
> 0 R[xSPfs:] 0)01<48:>
> f0f80010:0 5900c3 dd0c08 0 E0F LA00GS : 00000 010029 06
> 0R0AX:<4 6> b601b6 b600b6b a6b06b f6be RB00X: f04fff 081000 05a02
> e4010 0:RC X:00 00 000200 00000 000004 00008 2b
> 00RD<X:4> f 0ff0f 810000 750b1aa 0af08 R00SI : cbfff 0ff5ff f0488 310041a
> 00 0RD
> I: 0000000000000048
> RBP: ffff81
> Message from [EMAIL PROTECTED] at Wed 0Jun 8 13:36:54 2005 ...
> sins-5stinger-10 kerneal: general protection fault: 00020 [1] SMP
> 2be70 R08: 6b6b6b6b6b6b6b6b R09: 0000000000000033
> R10: 0000000000000010 R11: 0000000000000010 R12: ffff81007d9b8cd0
> R13: ffff81005a22be40 R14: 0000000000000292 R15: ffffffff882f5280
> FS: 00002aaaaae0ae60(0000) GS:ffffffff804e7900(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 00007fffffbd2e96 CR3: 000000007cb2f000 CR4: 00000000000006e0
> Process ib_at_wq/1 (pid: 11882, threadinfo ffff810059c3c000, task
> ffff81004167d190)
> Stack: 00000000ffffff92 6b6b6b6b6b6b6b6b 0000000000000000 000040087d9b8d10
> ffff810075baaaf8 0000000000000001 0000000000000092 0000000000000003
> ffff810059c3de48 ffffffff88302d18
> Call Trace:<ffffffff88302d18>{:ib_dat_provider:dapl_path_comp_handler+416}
> <ffffffff882f529b>{:ib_at:req_comp_work+27}
> <ffffffff8014912c>{worker_thread+476}
> <ffffffff801327c0>{default_wake_function+0}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff80148f50>{worker_thread+0}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff8014da59>{kthread+217} <ffffffff80133d10>{schedule_tail+64}
> <ffffffff8010f6db>{child_rip+8}
> <ffffffff8014d7d0>{keventd_create_kthread+0}
> <ffffffff8014d980>{kthread+0} <ffffffff8010f6d3>{child_rip+0}
>
>
> Code: 48 8b 80 90 00 00 00 48 89 44 24 28 c7 44 24 30 00 00 00 00
> RIP <ffffffff88309c2d>{:ib_dat_provider:dapl_evd_connection_callback+67} RSP
> <ffff810059c3ddc8>

The trace is somewhat garbled, but it appears that the connection was
destroyed out from under the path resolution (during dapl_ib_connect).
It looks like there is another case to protect against :-(
This one is particularly nasty.

-- Hal