Hi,

On OVS 2.4, I am getting soft lockups from medium-sized packet flood attacks. It seems that during the flood, most incoming packets do not match any existing rules, so new ones need to be created for every incoming packet, which overloads the system.

To mitigate this problem, I have a daemon that monitors the packets per second on the physical interface every 500 ms. If the pps is too high, it runs tcpdump, finds the destination address of the flood, and blocks traffic to that address (e.g. 1.2.3.4) by adding a flow:

    ovs-ofctl add-flow br-ex dl_type=0x0800,nw_dst=1.2.3.4,actions=drop

This usually works, but sometimes the flood is not detected quickly enough and the entire system freezes, so the daemon can only apply the block several minutes later.

Is there a way to limit the rate of unmatched packets to prevent the overload that locks up the system (so that some packets are dropped)? Then the daemon would be able to block the targeted address immediately by adding the drop flow. Alternatively, there may be a better way to do this without the custom daemon code.

Here's the vswitchd log (the kernel trace is attached):

2015-12-17T19:55:41.186Z|00137|dpif_netlink(handler10)|WARN|system@ovs-system: lost packet on port channel 3 of handler 0
2015-12-17T19:55:43.112Z|00146|ovs_rcu(urcu6)|WARN|blocked 1927 ms waiting for handler10 to quiesce
2015-12-17T19:55:45.248Z|00147|ovs_rcu(urcu6)|WARN|blocked 4063 ms waiting for handler10 to quiesce
2015-12-17T19:55:45.248Z|00148|ovs_rcu(urcu6)|WARN|blocked 4063 ms waiting for handler10 to quiesce
2015-12-17T19:57:10.314Z|00149|ovs_rcu(urcu6)|WARN|blocked 89129 ms waiting for handler10 to quiesce
2015-12-17T19:57:10.314Z|00150|ovs_rcu(urcu6)|WARN|blocked 89129 ms waiting for handler10 to quiesce
2015-12-17T19:57:10.314Z|00151|ovs_rcu(urcu6)|WARN|blocked 89129 ms waiting for handler10 to quiesce
2015-12-17T19:57:10.314Z|00152|ovs_rcu(urcu6)|WARN|blocked 89129 ms waiting for handler10 to quiesce
2015-12-17T19:57:57.234Z|00153|ovs_rcu(urcu6)|WARN|blocked 136049 ms waiting for handler10 to quiesce
2015-12-17T19:59:58.593Z|00154|ovs_rcu(urcu6)|WARN|blocked 257408 ms waiting for handler10 to quiesce
2015-12-17T20:01:42.685Z|726399|dpif(revalidator15)|WARN|Dropped 20 log messages in last 362 seconds (most recently, 362 seconds ago) due to excessive rate
2015-12-17T20:04:16.689Z|00155|ovs_rcu(urcu6)|WARN|blocked 515504 ms waiting for handler10 to quiesce
2015-12-17T20:05:42.479Z|00156|timeval(urcu6)|WARN|Unreasonably long 75330ms poll interval (0ms user, 0ms system)
2015-12-17T20:05:42.479Z|00157|timeval(urcu6)|WARN|context switches: 1 voluntary, 0 involuntary
2015-12-17T20:05:42.479Z|00158|coverage(urcu6)|INFO|Dropped 4 log messages in last 3203 seconds (most recently, 3165 seconds ago) due to excessive rate
2015-12-17T20:05:42.479Z|00159|coverage(urcu6)|INFO|Skipping details of duplicate event coverage for hash=8410616e

Thanks!
Favyen

trace.txt

[20195712.090625] BUG: soft lockup - CPU#2 stuck for 22s! [ovs-vswitchd:8558]
[20195712.104800] CPU: 2 PID: 8558 Comm: ovs-vswitchd Tainted: G W 3.14.39-031439-generic #201504211206
[20195712.104805] Hardware name: Supermicro X9SRH-7F/7TF/X9SRH-7F/7TF, BIOS 3.00 07/05/2013
[20195712.104810] task: ffff88017bd31910 ti: ffff8801b92a2000 task.ti: ffff8801b92a2000
[20195712.104814] RIP: 0010:[<ffffffffa04f7369>] [<ffffffffa04f7369>] nf_conntrack_tuple_taken+0x99/0x1b0 [nf_conntrack]
[20195712.104834] RSP: 0018:ffff88207fc437d8 EFLAGS: 00000246
[20195712.104838] RAX: ffff8801167e6c70 RBX: 000000000770c44e RCX: 00000000b612c443
[20195712.104842] RDX: 0000000000000001 RSI: 00000000fa8f0415 RDI: ffff88207fc43810
[20195712.104845] RBP: ffff88207fc437f8 R08: 00000000fea3ad6e R09: 000000007361453d
[20195712.104849] R10: ffff88207fc43828 R11: ffff88207fc43978 R12: ffff88207fc43748
[20195712.104852] R13: ffffffff8176d81d R14: ffff88207fc437f8 R15: ffff88207fc43810
[20195712.104857] FS: 00007fa272770980(0000) GS:ffff88207fc40000(0000) knlGS:0000000000000000
[20195712.104861] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20195712.104864] CR2: 000012b6d8042000 CR3: 0000001f38f8c000 CR4: 00000000001427e0
[20195712.104868] DR0: 0000000000000003 DR1: 00000000000000b0 DR2: 0000000000000001
[20195712.104871] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[20195712.104874] Stack:
[20195712.104877] ffff88020de67998 ffff88207fc43a00 0000000000007c5b 000000000000fc00
[20195712.104887] ffff88207fc43848 ffffffffa051c659 0000000000000000 000000009f66459e
[20195712.104895] 0000000000000000 0166459e00025000 0000000000000000 0106f7ef00000000
[20195712.104903] Call Trace:
[20195712.104907] <IRQ>
[20195712.104913] [<ffffffffa051c659>] nf_nat_used_tuple+0x29/0x30 [nf_nat]
[20195712.104934] [<ffffffffa051d623>] nf_nat_l4proto_unique_tuple+0xf3/0x190 [nf_nat]
[20195712.104945] [<ffffffffa051c659>] ? nf_nat_used_tuple+0x29/0x30 [nf_nat]
[20195712.104955] [<ffffffff8101ec59>] ? sched_clock+0x9/0x10
[20195712.104965] [<ffffffffa051d7f5>] tcp_unique_tuple+0x15/0x20 [nf_nat]
[20195712.104975] [<ffffffffa051ce40>] get_unique_tuple+0x110/0x260 [nf_nat]
[20195712.104989] [<ffffffffa051d017>] nf_nat_setup_info+0x87/0x360 [nf_nat]
[20195712.104998] [<ffffffff81092215>] ? run_posix_cpu_timers+0x45/0x290
[20195712.105007] [<ffffffffa0625104>] xt_snat_target_v0+0x34/0x40 [xt_nat]
[20195712.105017] [<ffffffffa04de645>] ipt_do_table+0x335/0x550 [ip_tables]
[20195712.105031] [<ffffffffa04f7459>] ? nf_conntrack_tuple_taken+0x189/0x1b0 [nf_conntrack]
[20195712.105042] [<ffffffffa05290b8>] nf_nat_rule_find+0x28/0xc0 [iptable_nat]
[20195712.105051] [<ffffffffa0529319>] nf_nat_ipv4_fn+0x1c9/0x280 [iptable_nat]
[20195712.105062] [<ffffffff8168c5e0>] ? ip_finish_output.part.42+0x440/0x440
[20195712.105071] [<ffffffffa0529504>] nf_nat_ipv4_out.part.6+0x14/0xd0 [iptable_nat]
[20195712.105079] [<ffffffffa0529605>] nf_nat_ipv4_out+0x45/0x50 [iptable_nat]
[20195712.105089] [<ffffffff816804ce>] nf_iterate+0x8e/0xd0
[20195712.105097] [<ffffffff8168c5e0>] ? ip_finish_output.part.42+0x440/0x440
[20195712.105104] [<ffffffff8168058d>] nf_hook_slow+0x7d/0x150
[20195712.105111] [<ffffffff8168c5e0>] ? ip_finish_output.part.42+0x440/0x440
[20195712.105119] [<ffffffff8168cf72>] ip_output+0x82/0x90
[20195712.105126] [<ffffffff81688e02>] ip_forward_finish+0x102/0x130
[20195712.105132] [<ffffffff816891b9>] ip_forward+0x389/0x440
[20195712.105139] [<ffffffff81686e51>] ip_rcv_finish+0x121/0x380
[20195712.105145] [<ffffffff81687726>] ip_rcv+0x286/0x380
[20195712.105168] [<ffffffffa015683d>] ? ixgbe_alloc_rx_buffers+0x7d/0xd0 [ixgbe]
[20195712.105177] [<ffffffff8164e002>] __netif_receive_skb_core+0x5e2/0x730
[20195712.105195] [<ffffffffa0156a20>] ? ixgbe_clean_rx_irq+0x190/0x260 [ixgbe]
[20195712.105202] [<ffffffff8164e171>] __netif_receive_skb+0x21/0x70
[20195712.105209] [<ffffffff8164eb61>] process_backlog+0xb1/0x190
[20195712.105216] [<ffffffff8164f559>] net_rx_action+0x139/0x250
[20195712.105224] [<ffffffff8107000f>] __do_softirq+0xef/0x330
[20195712.105235] [<ffffffff8176e4dc>] do_softirq_own_stack+0x1c/0x30
[20195712.105237] <EOI>
[20195712.105240] [<ffffffff81070305>] do_softirq+0x65/0x70
[20195712.105251] [<ffffffff810703a1>] __local_bh_enable_ip+0x91/0xa0
[20195712.105260] [<ffffffff817631b0>] _raw_spin_unlock_bh+0x20/0x40
[20195712.105268] [<ffffffff8167bc0f>] netlink_poll+0x11f/0x1a0
[20195712.105276] [<ffffffff81633da6>] sock_poll+0x116/0x130
[20195712.105285] [<ffffffff811e6554>] do_poll.isra.7+0x144/0x380
[20195712.105293] [<ffffffff811e76b9>] do_sys_poll+0x199/0x200
[20195712.105301] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105308] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105315] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105322] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105329] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105336] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105343] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105350] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105357] [<ffffffff811e6280>] ? __pollwait+0xf0/0xf0
[20195712.105365] [<ffffffff8101e4a9>] ? read_tsc+0x9/0x20
[20195712.105374] [<ffffffff810d870c>] ? ktime_get_ts+0x4c/0xe0
[20195712.105382] [<ffffffff811e6815>] ? poll_select_set_timeout+0x85/0xa0
[20195712.105388] [<ffffffff8102468d>] ? syscall_trace_leave+0xdd/0x150
[20195712.105396] [<ffffffff811e77fb>] SyS_poll+0x6b/0x100
[20195712.105403] [<ffffffff8176ccdf>] tracesys+0xe1/0xe6
[20195712.105406] Code: 8b 00 a8 01 74 21 e9 ff 00 00 00 0f 1f 80 00 00 00 00 49 8b 95 e8 09 00 00 65 ff 02 48 8b 00 a8 01 0f 85 e3 00 00 00 0f b6 50 37 <48> 8d 0c d5 00 00 00 00 48 c1 e2 06 48 29 ca 48 89 c1 48 83 c2
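
For reference, the detection steps of the daemon described in the message (sample the packet rate, pick the most-targeted destination from a tcpdump capture, add a drop flow) could look roughly like the Python sketch below. This is not the actual daemon code. The function names, the PPS_THRESHOLD value, and the assumption that tcpdump is run with -n so that addresses appear numerically are all hypothetical and chosen only for illustration. Only the ovs-ofctl flow syntax is taken from the message.

```python
import collections
import re

# Hypothetical threshold; the message does not say what "too high" means.
PPS_THRESHOLD = 100_000


def read_rx_packets(iface):
    """Read the kernel's received-packet counter for a Linux interface."""
    with open(f"/sys/class/net/{iface}/statistics/rx_packets") as f:
        return int(f.read())


def packets_per_second(prev_count, curr_count, interval_s):
    """Packet rate between two counter readings taken interval_s apart."""
    return (curr_count - prev_count) / interval_s


# Assumes numeric `tcpdump -n` IPv4 output such as
# "IP 10.0.0.5.1234 > 1.2.3.4.80: Flags [S], ..."; the trailing port is optional.
_DST_RE = re.compile(r"IP \S+ > (\d+\.\d+\.\d+\.\d+)(?:\.\d+)?:")


def top_destination(tcpdump_lines):
    """Return the most frequent destination IPv4 address, or None."""
    counts = collections.Counter()
    for line in tcpdump_lines:
        match = _DST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def drop_flow_command(bridge, dst_ip):
    """Build the ovs-ofctl argument list for the drop flow from the message."""
    return ["ovs-ofctl", "add-flow", bridge,
            f"dl_type=0x0800,nw_dst={dst_ip},actions=drop"]
```

A real loop would call read_rx_packets every 500 ms, compare packets_per_second against the threshold, and pass the result of drop_flow_command to subprocess.run. None of this addresses the freeze itself, since the sketch still depends on the daemon being scheduled during the flood.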
