Sorry, clicked send too fast... I think this line causes the kernel panic: https://github.com/torvalds/linux/blob/v3.14/net/core/skbuff.c#L1039 But this is weird, because from what I have read on this mailing list, OVS doesn't allow shared skbs.
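For reference, here is a minimal user-space sketch of the check as I understand it. In the kernel, skb_shared() reports whether the skb's reference count (users) differs from 1, and pskb_expand_head() calls BUG() when that is true. The mock_* names below are hypothetical stand-ins, not kernel APIs, and the sketch returns an error code where the kernel would hit BUG().

```c
#include <stdbool.h>

/* Hypothetical, stripped-down stand-in for struct sk_buff.
 * Only the reference count matters for this check. */
struct mock_skb {
    int users; /* number of holders of this skb */
};

/* Models skb_shared(): an skb is "shared" when it has more than
 * one holder, i.e. its reference count is not exactly 1. */
static bool mock_skb_shared(const struct mock_skb *skb)
{
    return skb->users != 1;
}

/* Models only the guard at the top of pskb_expand_head().
 * Assumption: the real function calls BUG() on a shared skb;
 * this sketch returns -1 instead so it can run in user space. */
static int mock_pskb_expand_head(const struct mock_skb *skb)
{
    if (mock_skb_shared(skb))
        return -1; /* kernel: BUG() */
    return 0;      /* kernel: continues and reallocates the head */
}
```

So the trace would be consistent with some path handing a shared skb to skb_checksum_help(), which then reaches pskb_expand_head() through __pskb_pull_tail(). That is only my reading of the trace, not a confirmed diagnosis.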
I tried turning off GRO: it made the receive side slower, but it eventually crashed too.

Any ideas?

Thanks.
-Simon

On Thu, Feb 12, 2015 at 8:32 PM, Xu (Simon) Chen <[email protected]> wrote:
> Hi folks,
>
> I can now consistently reproduce a kernel panic on my system. I am using
> OVS 2.3.0 on 3.14.29 kernel, a sender and a receiver (two VMs) on two
> identical hypervisors, using VXLAN tunnel connecting the two VMs. Iperf is
> used inside of VMs for generating traffic. The sender side has no problem,
> while the hypervisor with the receiving VM consistently crashes after
> certain amount of time (or rather packets).
>
> The kernel panic seems to be related to skb_shared check inside of
> pskb_expand_head function:
>
> [ 7318.405112] ------------[ cut here ]------------
> [ 7318.409796] kernel BUG at net/core/skbuff.c:1041!
> [ 7318.414563] invalid opcode: 0000 [#1] SMP
> [ 7318.418868] Modules linked in: ip6table_filter ip6_tables xt_mac xt_tcpudp xt_state xt_physdev xt_set xt_multiport iptable_filter iptable_nat nf_nat_ipv4 nf_nat iptable_raw ip_tables x_tables ip_set_hash_ip ip_set nfnetlink vhost_net vhost macvtap macvlan tun veth openvswitch(O) gre vxlan libcrc32c bridge 8021q garp stp llc bonding joydev hid_generic usbhid hid deflate ctr twofish_generic twofish_avx_x86_64 nfsd twofish_x86_64_3way twofish_x86_64 twofish_common auth_rpcgss oid_registry nfs_acl camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 nfs lockd serpent_avx_x86_64 fscache serpent_sse2_x86_64 xts serpent_generic sunrpc blowfish_generic blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic cbc cmac binfmt_misc xcbc rmd160 sha512_generic sha256_generic hmac crypto_null af_key xfrm_algo iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal coretemp kvm_intel kvm crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd microcode evdev ehci_pci sb_edac ehci_hcd edac_core usbcore lpc_ich ioatdma i2c_i801 usb_common mfd_core tpm_tis wmi tpm acpi_cpufreq processor thermal_sys button nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack ipmi_devintf ipmi_si ipmi_msghandler loop tcp_scalable autofs4 ext4 crc16 jbd2 mbcache crc32c btrfs xor raid6_pq dm_mod mlx4_en(O) sg sd_mod crc_t10dif crct10dif_common igb isci i2c_algo_bit ahci libsas i2c_core libahci dca mlx4_core(O) megaraid_sas scsi_transport_sas ptp libata pps_core compat(O) scsi_mod
> [ 7318.568195] CPU: 14 PID: 54124 Comm: vhost-54120 Tainted: G O 3.14.25-ts1 #1
> [ 7318.576227] Hardware name: Supermicro SYS-F617R2-R72+/X9DRFR, BIOS 3.0b 04/24/2014
> [ 7318.583944] task: ffff887f25dde240 ti: ffff883ef6a32000 task.ti: ffff883ef6a32000
> [ 7318.591562] RIP: 0010:[<ffffffff813eb634>] [<ffffffff813eb634>] pskb_expand_head+0x234/0x270
> [ 7318.600295] RSP: 0018:ffff887f7f103978 EFLAGS: 00010202
> [ 7318.605770] RAX: 0000000000000002 RBX: ffff887f23417700 RCX: 0000000000000020
> [ 7318.613016] RDX: 00000000000002ee RSI: 0000000000000000 RDI: ffff887f23417700
> [ 7318.620278] RBP: ffff887f7f1039b8 R08: 000000005ff00000 R09: ffff887f2113e040
> [ 7318.627523] R10: 00000000ffffee43 R11: 0000000000000002 R12: 0000000000000000
> [ 7318.634805] R13: ffff887f23417700 R14: 000000000000000d R15: ffff887f23417700
> [ 7318.642032] FS: 0000000000000000(0000) GS:ffff887f7f100000(0000) knlGS:0000000000000000
> [ 7318.650238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 7318.656110] CR2: 00002ba78155e680 CR3: 0000007f1f8cd000 CR4: 00000000001427e0
> [ 7318.663378] Stack:
> [ 7318.665464] ffff887f7f1039f8 ffffffff8142ae14 ffffffffa07b50f0 ffff887f23417700
> [ 7318.673226] ffff887f7f103b58 ffff887f7f103a70 000000000000000d ffff887f23417700
> [ 7318.681000] ffff887f7f103a08 ffffffff813eb6fc ffff887f23417700 000001f823417700
> [ 7318.688790] Call Trace:
> [ 7318.691309] <IRQ>
> [ 7318.693290] [<ffffffff8142ae14>] ? nf_hook_slow+0x74/0x130
> [ 7318.699285] [<ffffffffa07b50f0>] ? deliver_clone+0x60/0x60 [bridge]
> [ 7318.705710] [<ffffffff813eb6fc>] __pskb_pull_tail+0x4c/0x330
> [ 7318.711571] [<ffffffff813f8ca7>] skb_checksum_help+0x147/0x1a0
> [ 7318.717599] [<ffffffffa07de8b0>] queue_userspace_packet+0x3f0/0x440 [openvswitch]
> [ 7318.725289] [<ffffffffa07dfcd5>] ovs_dp_upcall+0x65/0x70 [openvswitch]
> [ 7318.732037] [<ffffffffa07dc7b6>] do_execute_actions+0x366/0xc00 [openvswitch]
> [ 7318.739403] [<ffffffff8142ae14>] ? nf_hook_slow+0x74/0x130
> [ 7318.745072] [<ffffffff812a7c9a>] ? arch_fast_hash2+0xa/0x10
> [ 7318.750883] [<ffffffffa07dc7ec>] do_execute_actions+0x39c/0xc00 [openvswitch]
> [ 7318.758221] [<ffffffffa07b570d>] ? br_forward+0x5d/0x70 [bridge]
> [ 7318.764419] [<ffffffffa07dd0c6>] ovs_execute_actions+0x76/0x110 [openvswitch]
> [ 7318.771773] [<ffffffffa07dfd6f>] ovs_dp_process_packet_with_key+0x8f/0xf0 [openvswitch]
> [ 7318.779988] [<ffffffffa07e0efa>] ? ovs_flow_extract+0x89a/0xab0 [openvswitch]
> [ 7318.787355] [<ffffffffa07dfe10>] ovs_dp_process_received_packet+0x40/0x60 [openvswitch]
> [ 7318.795535] [<ffffffffa07e616a>] ovs_vport_receive+0x2a/0x30 [openvswitch]
> [ 7318.802634] [<ffffffffa07e7cf5>] netdev_frame_hook+0xc5/0x120 [openvswitch]
> [ 7318.809773] [<ffffffff813f9f42>] __netif_receive_skb_core+0x332/0x7f0
> [ 7318.816418] [<ffffffffa07e7c30>] ? netdev_create+0x150/0x150 [openvswitch]
> [ 7318.823475] [<ffffffff813fa426>] __netif_receive_skb+0x26/0x70
> [ 7318.829472] [<ffffffff813fa514>] process_backlog+0xa4/0x180
> [ 7318.835223] [<ffffffff813fa979>] net_rx_action+0x139/0x220
> [ 7318.840894] [<ffffffff81053218>] __do_softirq+0xf8/0x280
> [ 7318.846391] [<ffffffff81504b5c>] do_softirq_own_stack+0x1c/0x30
> [ 7318.852517] <EOI>
> [ 7318.854504] [<ffffffff81053425>] do_softirq+0x45/0x50
> [ 7318.860084] [<ffffffff813f9759>] netif_rx_ni+0x39/0x70
> [ 7318.865416] [<ffffffffa07f1ab3>] tun_get_user+0x413/0x840 [tun]
> [ 7318.871506] [<ffffffffa07f1f3a>] tun_sendmsg+0x5a/0x80 [tun]
> [ 7318.877357] [<ffffffffa0819e32>] handle_tx+0x382/0x400 [vhost_net]
> [ 7318.883712] [<ffffffffa0819ee5>] handle_tx_kick+0x15/0x20 [vhost_net]
> [ 7318.890333] [<ffffffffa080d4f6>] vhost_worker+0xf6/0x190 [vhost]
> [ 7318.896528] [<ffffffffa080d400>] ? vhost_log_access_ok+0x30/0x30 [vhost]
> [ 7318.903454] [<ffffffff81070c69>] kthread+0xc9/0xe0
> [ 7318.908412] [<ffffffff81070ba0>] ? flush_kthread_worker+0x80/0x80
> [ 7318.914674] [<ffffffff8150342c>] ret_from_fork+0x7c/0xb0
> [ 7318.920163] [<ffffffff81070ba0>] ? flush_kthread_worker+0x80/0x80
> [ 7318.926426] Code: 55 c0 e8 f0 38 d4 ff 48 8b 55 c0 84 c0 0f 85 0b ff ff ff e9 02 ff ff ff 0f 1f 80 00 00 00 00 41 81 cf 00 20 00 00 e9 1f fe ff ff <0f> 0b 0f 0b 44 89 fe 4c 89 ef e8 ad e8 ff ff 85 c0 74 12 48 89
> [ 7318.950040] RIP [<ffffffff813eb634>] pskb_expand_head+0x234/0x270
> [ 7318.956385] RSP <ffff887f7f103978>
> [ 7318.959988] ---[ end trace 221c17dcc65b8372 ]---
> [ 7319.076935] Kernel panic - not syncing: Fatal exception in interrupt
> [ 7319.086993] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> [ 7319.204560] Rebooting in 10 seconds..

_______________________________________________ discuss mailing list [email protected] http://openvswitch.org/mailman/listinfo/discuss
