Hi Roman,

IFF_XMIT_DST_RELEASE was enabled in my code. I have disabled it, and the same kernel panic still occurs.

[60576.286449] BUG: unable to handle kernel NULL pointer dereference at (null)
[60576.286507] IP: [<f85eab7c>] myri10ge_xmit+0x42d/0x99b [myri10ge]
[60576.286566] *pde = 00000000
[60576.286607] Oops: 0002 [#1] SMP
[60576.286650] last sysfs file: /sys/module/vt/parameters/default_utf8
[60576.286702] Modules linked in: click proclikefs 8021q garp stp loop snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore myri10ge inet_lro snd_page_alloc ioatdma processor pcspkr i2c_i801 evdev i2c_core button ext3 jbd mbcache sg sd_mod sr_mod crc_t10dif cdrom ata_generic ata_piix uhci_hcd libata ehci_hcd usbcore scsi_mod e1000e nls_base igb dca thermal thermal_sys [last unloaded: click]
[60576.287076]
[60576.287111] Pid: 17085, comm: kclick Not tainted (2.6.32-5-686 #1) System Product Name
[60576.287194] EIP: 0060:[<f85eab7c>] EFLAGS: 00010246 CPU: 0
[60576.287243] EIP is at myri10ge_xmit+0x42d/0x99b [myri10ge]
[60576.287291] EAX: 00000000 EBX: 00000000 ECX: 00000420 EDX: f62c3000
[60576.287342] ESI: 00000000 EDI: c9051800 EBP: f62c4be0 ESP: cf1d1e9c
[60576.287392] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[60576.287440] Process kclick (pid: 17085, ti=cf1d0000 task=f5d5d500 task.ti=cf1d0000)
[60576.287521] Stack:
[60576.287557] c9050480 00000000 c90504a0 00000032 f5d9a000 00000000 f5d9a3a0 00000000
[60576.287623] <0> f5d9a840 35f5504a 00000420 00000000 f65a33cc 09d3cfff 00000000 00000000
[60576.287721] <0> 00000001 00000000 0000000c 00000001 01d80000 c3008100 f5d5d500 00003039
[60576.287848] Call Trace:
[60576.287960] [<fa715ad1>] ? _ZN8ToDevice12queue_packetEP6PacketP12netdev_queue+0xe1/0x190 [click]
[60576.288118] [<fa715c6d>] ? _ZN8ToDevice8run_taskEP4Task+0xed/0x290 [click]
[60576.288242] [<fa712c5f>] ? _ZN10FromDevice8run_taskEP4Task+0x7f/0x110 [click]
[60576.288363] [<fa6cb9d6>] ? _ZN12RouterThread6driverEv+0x1a6/0x4b0 [click]
[60576.288449] [<fa6afdcd>] ? click_lalloc+0x2d/0x60 [click]
[60576.288526] [<fa6a7f50>] ? _ZN6VectorIPvE7reserveEi+0x90/0xb0 [click]
[60576.288638] [<fa7550aa>] ? _ZL11click_schedPv+0xda/0x1a0 [click]
[60576.288749] [<fa754fd0>] ? _ZL11click_schedPv+0x0/0x1a0 [click]
[60576.288802] [<c1003d47>] ? kernel_thread_helper+0x7/0x10
[60576.288849] Code: 00 00 00 c7 44 24 60 00 00 00 00 8b 5f 54 8b 4f 50 29 d9 89 4c 24 28 8b 55 20 8b 45 18 21 d0 89 44 24 2c 89 c6 8b 45 14 c1 e6 04 <89> 3c 30 8b 5c 24 18 31 c0 8b 97 ac 00 00 00 8b 8b a4 00 00 00
[60576.289115] EIP: [<f85eab7c>] myri10ge_xmit+0x42d/0x99b [myri10ge] SS:ESP 0068:cf1d1e9c
[60576.289201] CR2: 0000000000000000
[60576.289714] ---[ end trace dfb4d575adb09f5c ]---

Thanks,
Ricard

----------------original message-----------------
From: "Roman Chertov" [email protected]
To: "Ricard Vilalta" [email protected]
Date: Tue, 11 Jan 2011 09:18:51 -0800
-------------------------------------------------

> Does the crash occur immediately on the first packet, or does it occur
> randomly?
>
> Can you check if IFF_XMIT_DST_RELEASE is defined in your code? I am curious if
> this code could be causing an issue.
>     if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
>         skb_dst_drop(skb1);
>
> Roman
>
> On Tue, 11 Jan 2011 09:21:32 +0100 Ricard Vilalta [email protected]
> wrote
>
>> Hi Roman,
>>
>> To me it looks like the myri10ge module is crashing. What is strange is that
>> everything runs smoothly when running Click in user mode.
>> As I told you in my previous e-mail, the kernel module also runs smoothly
>> when using dev_queue_xmit (not only with the Myricom NIC but also with the
>> Intel 1Gb Ethernet NIC).
>>
>> Thanks for your time,
>>
>> Ricard
>>
>> Jan 7 11:18:29 strongest-2 kernel: [160300.209737] click: starting router thread pid 21038 (f3d20180)
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541141] BUG: unable to handle kernel NULL pointer dereference at (null)
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541229] IP: [<f846aba4>] myri10ge_xmit+0x455/0x9c3 [myri10ge]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541284] *pde = 00000000
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541325] Oops: 0002 [#1] SMP
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541369] last sysfs file: /sys/module/vt/parameters/default_utf8
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541420] Modules linked in: click proclikefs 8021q garp stp loop snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd i2c_i801 soundcore snd_page_alloc i2c_core ioatdma button myri10ge inet_lro evdev pcspkr processor ext3 jbd mbcache sg sr_mod sd_mod cdrom crc_t10dif ata_generic ata_piix libata scsi_mod uhci_hcd e1000e ehci_hcd usbcore nls_base igb dca thermal thermal_sys [last unloaded: scsi_wait_scan]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541834] Pid: 21038, comm: kclick Not tainted (2.6.32-5-686 #1) System Product Name
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541917] EIP: 0060:[<f846aba4>] EFLAGS: 00010246 CPU: 0
>> Jan 7 11:20:52 strongest-2 kernel: [160443.541967] EIP is at myri10ge_xmit+0x455/0x9c3 [myri10ge]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542015] EAX: 00000000 EBX: 00000000 ECX: 0000040a EDX: f6338800
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542066] ESI: 00000000 EDI: f50b9080 EBP: f63383e0 ESP: f671de9c
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542117] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542166] Process kclick (pid: 21038, ti=f671c000 task=f5c24840 task.ti=f671c000)
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542247] Stack:
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542282] 00000000 00000000 00000000 00000000 f5d6b800 00000000 f5d6bba0 00000000
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542349] <0> f5d6c040 c1386ba0 0000040a 00000000 00000000 09cc77ff 00000000 00000000
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542447] <0> 00000000 00000000 0000000c 00008100 f5c20000 c1007569 00000286 0000000f
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542574] Call Trace:
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542615] [<c1007569>] ? sched_clock+0x5/0x7
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542664] [<c126c876>] ? schedule+0x78f/0x7dc
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542786] [<fa141a48>] ? _ZN8ToDevice12queue_packetEP6PacketP12netdev_queue+0xf8/0x1c0 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.542948] [<fa155bc5>] ? _ZN13FullNoteQueue4pushEiP6Packet+0x1a5/0x220 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543102] [<fa141bfd>] ? _ZN8ToDevice8run_taskEP4Task+0xed/0x290 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543225] [<fa13ebbf>] ? _ZN10FromDevice8run_taskEP4Task+0x7f/0x110 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543347] [<fa0f79d6>] ? _ZN12RouterThread6driverEv+0x1a6/0x4b0 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543433] [<fa0dbdcd>] ? click_lalloc+0x2d/0x60 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543511] [<fa0d3f50>] ? _ZN6VectorIPvE7reserveEi+0x90/0xb0 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543623] [<fa180f7a>] ? _ZL11click_schedPv+0xda/0x1a0 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543734] [<fa180ea0>] ? _ZL11click_schedPv+0x0/0x1a0 [click]
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543786] [<c1003d47>] ? kernel_thread_helper+0x7/0x10
>> Jan 7 11:20:52 strongest-2 kernel: [160443.543833] Code: 00 00 00 c7 44 24 60 00 00 00 00 8b 5f 54 8b 4f 50 29 d9 89 4c 24 28 8b 55 20 8b 45 18 21 d0 89 44 24 2c 89 c6 8b 45 14 c1 e6 04 <89> 3c 30 8b 5c 24 18 31 c0 8b 97 ac 00 00 00 8b 8b a4 00 00 00
>> Jan 7 11:20:52 strongest-2 kernel: [160443.544100] EIP: [<f846aba4>] myri10ge_xmit+0x455/0x9c3 [myri10ge] SS:ESP 0068:f671de9c
>> Jan 7 11:20:52 strongest-2 kernel: [160443.544186] CR2: 0000000000000000
>> Jan 7 11:20:52 strongest-2 kernel: [160443.544714] ---[ end trace b519e64f26b52814 ]---
>>
>>
>> ----------------original message-----------------
>> From: "Roman Chertov" [email protected]
>> To: [email protected], "Ricard Vilalta" [email protected]
>> Date: Mon, 10 Jan 2011 09:33:37 -0800
>> -------------------------------------------------
>>
>>
>> > Ricard,
>> >
>> > Do you have the stack trace of the kernel panic?
>> >
>> > Roman
>> >
>> > On Mon, 10 Jan 2011 12:58:00 +0100 Ricard Vilalta [email protected]
>> > wrote
>> >
>> >> Hi all,
>> >> I am not a Linux kernel guru and I would like your support. I have
>> >> lately been working with Myricom NICs (using the Linux kernel module
>> >> myri10ge) and it was impossible to forward packets from another Ethernet
>> >> interface to the Myricom interface. This led to a kernel panic. I have
>> >> been trying several solutions and the only one which seems to be working
>> >> is using dev_queue_xmit instead of hard_start_xmit in ToDevice. I am
>> >> using patchless Click with Linux kernel 2.6.32-35.
>> >>
>> >> I have fully tested the change and it seems to work for me. What do you
>> >> think of this change? Could it lead to other types of errors or
>> >> unexpected behavior?
>> >>
>> >> Thanks in advance.
>> >>
>> >> Best regards,
>> >> Ricard
>> >>
>> >> --
>> >> ______________________________________________________________
>> >>
>> >> Ricard Vilalta
>> >> Research Engineer
>> >> Optical Networking Area (ONA) http://wikiona.cttc.es/
>> >> CTTC - Centre Tecnològic de Telecomunicacions de Catalunya
>> >> Parc Mediterrani de la Tecnologia (PMT)
>> >> Av. Carl Friedrich Gauss 7,
>> >> 08860 Castelldefels (Barcelona), Spain
>> >> http://www.cttc.es/
>> >> Phone: +34 93 396 71 70 (ext. 2232). Fax: +34 93 645 29 01
>> >> E-mail: [email protected]
>> >>
>> >>
>> >> _______________________________________________
>> >> click mailing list
>> >> [email protected]
>> >> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
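
For anyone reading the first trace in this thread, here is a rough way to see what the faulting instruction was doing. In an x86 oops, the byte in angle brackets on the Code: line is the first byte of the instruction at EIP. The sketch below (Python; the helper names are my own and not part of any kernel tool) picks that byte out of the Code: line, decodes only the single "89 /r" store-with-SIB form that appears here, and combines it with the register values from the dump. It is a minimal sketch under those assumptions, not a disassembler.

```python
# Register order used by the 3-bit register fields in ModRM and SIB bytes.
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# "Code:" line and register values copied from the first oops in this thread.
CODE = ("00 00 00 c7 44 24 60 00 00 00 00 8b 5f 54 8b 4f 50 29 d9 89 4c 24 28 "
        "8b 55 20 8b 45 18 21 d0 89 44 24 2c 89 c6 8b 45 14 c1 e6 04 <89> 3c 30 "
        "8b 5c 24 18 31 c0 8b 97 ac 00 00 00 8b 8b a4 00 00 00")
REGISTERS = {"eax": 0x00000000, "ebx": 0x00000000, "ecx": 0x00000420,
             "edx": 0xF62C3000, "esi": 0x00000000, "edi": 0xC9051800}


def parse_code_line(code):
    """Return (byte_values, index_of_the_byte_at_EIP) for an oops Code: line."""
    tokens = code.split()
    fault = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    data = [int(tok.strip("<>"), 16) for tok in tokens]
    return data, fault


def decode_store_with_sib(insn):
    """Decode 'mov r32 -> [base + index*scale]' (opcode 0x89, mod=00, SIB).

    Only this one encoding is handled; anything else raises ValueError.
    """
    opcode, modrm, sib = insn[0], insn[1], insn[2]
    if opcode != 0x89 or (modrm >> 6) != 0 or (modrm & 7) != 4:
        raise ValueError("not the 0x89 + SIB form handled by this sketch")
    src = REGS32[(modrm >> 3) & 7]
    scale = 1 << (sib >> 6)
    index_bits = (sib >> 3) & 7
    base_bits = sib & 7
    if index_bits == 4 or base_bits == 5:
        # index=100b means "no index"; base=101b with mod=00 means disp32.
        raise ValueError("SIB special case not handled by this sketch")
    return src, REGS32[base_bits], REGS32[index_bits], scale


data, fault = parse_code_line(CODE)
src, base, index, scale = decode_store_with_sib(data[fault:fault + 3])
address = (REGISTERS[base] + REGISTERS[index] * scale) & 0xFFFFFFFF

print(f"faulting byte offset in Code: line: {fault}")
print(f"instruction: mov %{src},(%{base},%{index},{scale})")
print(f"store target: {address:#010x}")
```

If I am decoding this correctly, the instruction stores EDI to the address EAX + ESI. Both registers are zero in the dump, so the store goes to address 0, which agrees with CR2 and with the write bit set in "Oops: 0002". Which pointer in myri10ge_xmit was NULL is something only the driver source for this kernel can tell.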
