Hi all,

I am currently implementing a Click-based opportunistic packet combination engine for use on top of IEEE 802.11. I've unit-tested my implementation fairly extensively in user space, and partly in kernel space, and haven't had any issues so far. I recently moved on to integration testing. The code seems to run fine in-kernel, except that roughly 6 out of 10 times, click-uninstall causes a kernel panic in interrupt context on cleanup.

The panic seems to be related to my code on the tx output path, since it appears only for a few particular configurations, and only when some of my elements are introduced. The configuration that triggers the panic is attached. I've also manually copied the oops trace from the screen and attached it to this email. A register dump using sysrq does not produce any additional useful info, so I have excluded it. It seems like the panic is triggered somewhere in ToDevice::run_task.

I realize that some brain-dead bug in my code is probably at fault, and I am currently double-checking everything I've written. I am posting here mainly because I am not sure whether this is a ToDevice bug that I am inadvertently triggering.

Additionally, I'm having trouble getting ksymoops to run with Click. Any ideas on how I go about it? (I've also tried using kexec/kdump, but these seem very twitchy about which kernel config is used, and have issues with the one I am using.)

Vivek

--
Vivek Raghunathan, PhD student, University of Illinois, Urbana-Champaign
Contact Details: 1012 W. Clark St #31, Urbana IL 61801
ph: 217-766-1868 (cell) 217-333-7541 (off)

Sep 18 23:51:44 localhost kernel: [4309934.322000] scheduling while atomic: click-uninstall/0x00000001/4756
Sep 18 23:51:44 localhost kernel: [4309934.322000] [schedule+1468/1680] schedule+0x5bc/0x690
Sep 18 23:51:44 localhost kernel: [4309934.322000] [dev_ioctl+622/992] dev_ioctl+0x26e/0x3e0
Sep 18 23:51:44 localhost kernel: [4309934.322000] [pg0+257585981/1053733888] journal_stop+0x17d/0x2a0 [jbd]
Sep 18 23:51:44 localhost kernel: [4309934.322000] [wait_for_completion+136/224] wait_for_completion+0x88/0xe0
Sep 18 23:51:44 localhost kernel: [4309934.322000] [default_wake_function+0/32] default_wake_function+0x0/0x20
Sep 18 23:51:44 localhost kernel: [4309934.323000] [synchronize_rcu+52/64] synchronize_rcu+0x34/0x40
Sep 18 23:51:44 localhost kernel: [4309934.323000] [wakeme_after_rcu+0/16] wakeme_after_rcu+0x0/0x10
Sep 18 23:51:44 localhost kernel: [4309934.323000] [unregister_netdevice+247/576] unregister_netdevice+0xf7/0x240
Sep 18 23:51:44 localhost kernel: [4309934.323000] [unregister_netdev+16/32] unregister_netdev+0x10/0x20
Sep 18 23:51:44 localhost kernel: [4309934.323000] [pg0+268235567/1053733888] _ZN8FromHost7cleanupEN7Element12CleanupStageE+0x9f/0xd0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.323000] [pg0+268023399/1053733888] _ZN6RouterD1Ev+0x427/0x460 [click]
Sep 18 23:51:44 localhost kernel: [4309934.323000] [pg0+268023481/1053733888] _ZN6Router5unuseEv+0x19/0x40 [click]
Sep 18 23:51:44 localhost kernel: [4309934.323000] [pg0+268413973/1053733888] _Z11kill_routerv+0x15/0x30 [click]
Sep 18 23:51:44 localhost kernel: [4309934.324000] [pg0+268415537/1053733888] _Z12write_configRK6StringP7ElementPvP12ErrorHandler+0x21/0x1d0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.324000] [pg0+267998862/1053733888] _ZNK7Handler10call_writeERK6StringP7ElementbP12ErrorHandler+0x16e/0x1e0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.324000] [pg0+268429044/1053733888] handler_flush+0x4f4/0x500 [click]
Sep 18 23:51:44 localhost kernel: [4309934.325000] [filp_close+35/128] filp_close+0x23/0x80
Sep 18 23:51:44 localhost kernel: [4309934.325000] [sys_close+92/160] sys_close+0x5c/0xa0
Sep 18 23:51:44 localhost kernel: [4309934.325000] [sysenter_past_esp+84/121] sysenter_past_esp+0x54/0x79
Sep 18 23:51:44 localhost kernel: [4309934.327000] scheduling while atomic: click-uninstall/0x00000001/4756
Sep 18 23:51:44 localhost kernel: [4309934.327000] [schedule+1468/1680] schedule+0x5bc/0x690
Sep 18 23:51:44 localhost kernel: [4309934.327000] [extract_entropy+124/176] extract_entropy+0x7c/0xb0
Sep 18 23:51:44 localhost kernel: [4309934.327000] [pneigh_queue_purge+47/80] pneigh_queue_purge+0x2f/0x50
Sep 18 23:51:44 localhost kernel: [4309934.327000] [neigh_ifdown+139/208] neigh_ifdown+0x8b/0xd0
Sep 18 23:51:44 localhost kernel: [4309934.327000] [pneigh_queue_purge+47/80] pneigh_queue_purge+0x2f/0x50
Sep 18 23:51:44 localhost kernel: [4309934.327000] [wait_for_completion+136/224] wait_for_completion+0x88/0xe0
Sep 18 23:51:44 localhost kernel: [4309934.327000] [default_wake_function+0/32] default_wake_function+0x0/0x20
Sep 18 23:51:44 localhost kernel: [4309934.327000] [synchronize_rcu+52/64] synchronize_rcu+0x34/0x40
Sep 18 23:51:44 localhost kernel: [4309934.327000] [wakeme_after_rcu+0/16] wakeme_after_rcu+0x0/0x10
Sep 18 23:51:44 localhost kernel: [4309934.327000] [unregister_netdevice+362/576] unregister_netdevice+0x16a/0x240
Sep 18 23:51:44 localhost kernel: [4309934.327000] [unregister_netdev+16/32] unregister_netdev+0x10/0x20
Sep 18 23:51:44 localhost kernel: [4309934.327000] [pg0+268235567/1053733888] _ZN8FromHost7cleanupEN7Element12CleanupStageE+0x9f/0xd0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.327000] [pg0+268023399/1053733888] _ZN6RouterD1Ev+0x427/0x460 [click]
Sep 18 23:51:44 localhost kernel: [4309934.328000] [pg0+268023481/1053733888] _ZN6Router5unuseEv+0x19/0x40 [click]
Sep 18 23:51:44 localhost kernel: [4309934.328000] [pg0+268413973/1053733888] _Z11kill_routerv+0x15/0x30 [click]
Sep 18 23:51:44 localhost kernel: [4309934.328000] [pg0+268415537/1053733888] _Z12write_configRK6StringP7ElementPvP12ErrorHandler+0x21/0x1d0 [click]
ZNK7Handler10call_writeERK6StringP7ElementbP12ErrorHandler+0x16e/0x1e0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.329000] [pg0+268429044/1053733888] handler_flush+0x4f4/0x500 [click]
Sep 18 23:51:44 localhost kernel: [4309934.329000] [filp_close+35/128] filp_close+0x23/0x80
Sep 18 23:51:44 localhost kernel: [4309934.329000] [sys_close+92/160] sys_close+0x5c/0xa0
Sep 18 23:51:44 localhost kernel: [4309934.329000] [sysenter_past_esp+84/121] sysenter_past_esp+0x54/0x79
Sep 18 23:51:44 localhost kernel: [4309934.344000] Unable to handle kernel NULL pointer dereference at virtual address 00000020
Sep 18 23:51:44 localhost kernel: [4309934.344000] printing eip:
Sep 18 23:51:44 localhost kernel: [4309934.344000] d12e5ac6
Sep 18 23:51:44 localhost kernel: [4309934.344000] *pde = 00000000
Sep 18 23:51:44 localhost kernel: [4309934.344000] Oops: 0000 [#1]
Sep 18 23:51:44 localhost kernel: [4309934.344000] PREEMPT
Sep 18 23:51:44 localhost kernel: [4309934.344000] Modules linked in: click proclikefs rfcomm l2cap bluetooth nvram uinput ppdev radeon drm speedstep_centrino cpufreq_userspace cpufreq_stats freq_table cpufreq_powersave cpufreq_ondemand cpufreq_conservative video ibm_acpi container button battery ac ipv6 dm_mod af_packet md_mod lp pcmcia e100 joydev ipw2200 snd_intel8x0 tsdev mii ieee80211 ieee80211_crypt yenta_socket rsrc_nonstatic pcmcia_core snd_ac97_codec snd_ac97_bus ide_cd cdrom snd_pcm_oss snd_mixer_oss parport_pc parport floppy psmouse serio_raw snd_pcm rtc ehci_hcd hw_random uhci_hcd intel_agp agpgart pcspkr usbcore snd_timer shpchp pci_hotplug snd soundcore snd_page_alloc evdev ext3 jbd mbcache ide_disk ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old pdc202xx_new opti621 ns87415 it821x hpt366 hpt34x generic cy82c693 cs5535 cs5530 cs5520 cmd64x atiixp amd74xx alim15x3 aec62xx thermal processor fan
Sep 18 23:51:44 localhost kernel: [4309934.344000] CPU: 0
Sep 18 23:51:44 localhost kernel: [4309934.344000] EIP: 0060:[pg0+268249798/1053733888] Not tainted VLI
Sep 18 23:51:44 localhost kernel: [4309934.344000] EFLAGS: 00210246 (2.6.16.13 #2)
Sep 18 23:51:44 localhost kernel: [4309934.344000] EIP is at _ZN8ToDevice8run_taskEP4Task+0x356/0x3d0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.344000] eax: 00000020 ebx: c87ce8d4 ecx: 000059ea edx: 00004c2c
Sep 18 23:51:44 localhost kernel: [4309934.344000] esi: c87ce880 edi: 00000000 ebp: 00000000 esp: c207ff8c
Sep 18 23:51:44 localhost kernel: [4309934.344000] ds: 007b es: 007b ss: 0068
Sep 18 23:51:44 localhost kernel: [4309934.344000] Process kclick (pid: 4377, threadinfo=c207e000 task=c2fc2030)
Sep 18 23:51:44 localhost kernel: [4309934.344000] Stack: <0>007ce8d4 00000014 00000080 c272aec0 000d1ad5 d12a7f16 c87ce880 c87ce8d4
Sep 18 23:51:44 localhost kernel: [4309934.344000] c272af5c 00000010 00000020 d13190dd 00000010 c207e000 c2fc2030 c272aec0
Sep 18 23:51:44 localhost kernel: [4309934.344000] 00000000 d130e4e8 c272aec0 d130e460 00000000 00000000 00000000 c1001005
Sep 18 23:51:44 localhost kernel: [4309934.344000] Call Trace:
Sep 18 23:51:44 localhost kernel: [4309934.344000] [pg0+267996950/1053733888] _ZN12RouterThread6driverEv+0x146/0x2e0 [click]
Sep 18 23:51:44 localhost kernel: [4309934.344000] [pg0+268460253/1053733888] _ZN6VectorIiE7reserveEi+0x2d/0x90 [click]
Sep 18 23:51:44 localhost kernel: [4309934.344000] [pg0+268416232/1053733888] _Z11click_schedPv+0x88/0x170 [click]
Sep 18 23:51:44 localhost kernel: [4309934.344000] [pg0+268416096/1053733888] _Z11click_schedPv+0x0/0x170 [click]
Sep 18 23:51:44 localhost kernel: [4309934.344000] [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Sep 18 23:51:44 localhost kernel: [4309934.344000] Code: c8 85 c0 7e f0 eb aa 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 ba 01 00 00 00 b9 01 00 00 00 e9 a2 fd ff ff 90 8b 86 b0 00 00 00 <8b> 00 85 86 b4 00 00 00 0f 85 a8 fd ff ff 31 d2 e9 a6 fd ff ff

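
The [click] frames in the trace above are printed as mangled C++ names (for example, the faulting EIP is in _ZN8ToDevice8run_taskEP4Task). As a reading aid, here is a minimal user-space sketch of turning those names back into C++ signatures. It assumes GCC or Clang with the Itanium C++ ABI, which provides abi::__cxa_demangle in <cxxabi.h>. The helper names demangle and demangle_oops_symbol are hypothetical, made up for this illustration; they are not part of Click or ksymoops, and this is not a replacement for ksymoops.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle one Itanium-ABI C++ symbol name (GCC/Clang only).
// If the demangler rejects the name (e.g. a plain C symbol such as
// handler_flush is expected to be rejected), the input is returned unchanged.
std::string demangle(const char *mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char *out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return mangled;
    std::string result(out);
    std::free(out);
    return result;
}

// Hypothetical helper: strip the "+0xOFFSET/0xLENGTH" suffix that the oops
// prints after each symbol, then demangle what is left.
std::string demangle_oops_symbol(const std::string &entry) {
    std::string::size_type plus = entry.find('+');
    return demangle(entry.substr(0, plus).c_str());
}
```

For instance, passing "_ZN8ToDevice8run_taskEP4Task+0x356/0x3d0" to demangle_oops_symbol yields ToDevice::run_task(Task*), and "_ZN6RouterD1Ev+0x427/0x460" yields Router::~Router(). The c++filt tool from binutils does the same job from the command line.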
AddressInfo(MyIP 10.1.1.2/8);
AddressInfo(MyEther 00:11:25:2D:7D:33)
// Remember to change masks like cls_broad whenever BroadcastAddr changes
AddressInfo(BroadcastAddr 10.255.255.255);
// AddressInfo(RemoteIP 10.1.1.1/8);
// AddressInfo(RemoteEther 00:11:25:47:EA:7B)

// Combiner for fak0
// SplayCombiner = 222 = 0xde
frmhst::FromHost(fak0, MyIP, ETHER MyEther) -> cls_arp::Classifier(12/0806, 12/0800, -);
cls_arp[0] -> passq::Queue;
cls_arp[1] -> cls_broad::Classifier(16/0affffff, -);
cls_arp[2] -> passq;
cls_broad[0] -> passq;
cls_broad[1] -> combq::PureQueue;
passq -> [0]pr_sch::PrioSched;
ruledb::RuleDB;
nbr::NbrTable;
comb::Combiner(RULEDB ruledb, PUREQUEUE combq, NBRTABLE nbr);
comb->ipenc::IPEncap(0xde, MyIP, BroadcastAddr);
ipenc->ethenc::SplayEtherEncapDstFix(MyEther);
ethenc->[1]pr_sch;
pr_sch -> Print(testtx) -> ToDevice(eth0);
////////////////////////////////////
FromDevice(eth0) -> Print(test_rx, 100) -> SetPacketType(HOST) -> ToHost(fak0);

Comments on what the elements are:

1. PureQueue is a 1/0 variant of SimpleQueue that provides a push-based interface for queueing, and allows dequeuing from an arbitrary position inside the queue using a public method. This is comprehensively unit-tested.
2. NbrTable and RuleDB are database elements that only store data and provide an external access interface to add/delete entries. These are comprehensively unit-tested.
3. Combiner uses NbrTable and RuleDB to decide which packets to extract from PureQueue when its downstream initiates a pull. The variant used here is a base class that does nothing: it ignores NbrTable and RuleDB and simply extracts packets from the PureQueue in FIFO order.
4. SplayEtherEncapDstFix copies the Ethernet destination address from the annotation field. This annotation is set by Combiner.
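
To make the PureQueue/Combiner interaction described above concrete, here is a minimal stand-alone sketch in plain C++. It is not the actual element code and does not use the Click API: FakePacket, PureQueueSketch, extract_at and combiner_pull are hypothetical names invented for this illustration, and the bounded capacity with drop-on-full is an assumption borrowed from how queues usually behave, not something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for a Click packet; only an id is modeled.
struct FakePacket {
    int id;
};

// Sketch of a push-only queue: packets enter via push(), and leave only
// through a public method call rather than through an output port.
class PureQueueSketch {
public:
    explicit PureQueueSketch(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Push path. Returns false (packet dropped) when the queue is full;
    // the drop policy is an assumption of this sketch.
    bool push(FakePacket p) {
        if (q_.size() >= capacity_)
            return false;
        q_.push_back(p);
        return true;
    }

    std::size_t size() const { return q_.size(); }

    // Remove the packet at position pos (0 = oldest) and store it in out.
    // Returns false and leaves out untouched if pos is out of range.
    bool extract_at(std::size_t pos, FakePacket &out) {
        if (pos >= q_.size())
            return false;
        out = q_[pos];
        q_.erase(q_.begin() + static_cast<std::ptrdiff_t>(pos));
        return true;
    }

private:
    std::size_t capacity_;
    std::deque<FakePacket> q_;
};

// Base-class behaviour from item 3: ignore any neighbour table or rule
// database and hand the oldest queued packet to the downstream pull.
bool combiner_pull(PureQueueSketch &q, FakePacket &out) {
    return q.extract_at(0, out);
}
```

A smarter Combiner variant would pick a position other than 0 based on its rule and neighbour state and call extract_at with it. In this sketch the pull simply returns false when the queue is empty.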
