Hi list, I have a very strange problem with my network. I have 2 internet connections: A - 1 Gbit, B - 100Mbps. Network layout:
    A      B
    |      |
    [ Brd1 ]
     /    \
  [L1]    [L2]
     \    /
    [ GW1 ]
 ...................
       Clients
 ...................

Brd1 runs bgpd and balances the traffic through L1 and L2.
L1 and L2 do traffic shaping.
GW1 does some packet filtering and balances the traffic through L1 and L2.
Every interface is gigabit (Realtek NICs).

I'm using IMQ on L1 and L2 to separate the traffic into 2 zones, international and local, with HTB for shaping.

The system works fine for some time, but when the traffic hits 200Mbps and occasionally bursts to 250-300Mbps, L1 and L2 behave strangely (packet loss > 30%, latency increased by 20ms). Sometimes they even hang, leaving me with only one solution: rebooting them.

I've checked the CPU usage; it stays around 80% during the highest traffic. I've examined the logs, and here is what I've found:

Feb 11 08:04:05 l1 kernel: cpu 0 cold: low 0, high 0, batch 1 used:0
Feb 11 08:04:05 l1 kernel: DMA32 per-cpu: empty
Feb 11 08:04:05 l1 kernel: Normal per-cpu:
Feb 11 08:04:05 l1 kernel: cpu 0 hot: low 0, high 186, batch 31 used:79
Feb 11 08:04:05 l1 kernel: cpu 0 cold: low 0, high 62, batch 15 used:52
Feb 11 08:04:05 l1 kernel: HighMem per-cpu: empty
Feb 11 08:04:05 l1 kernel: Free pages: 3032kB (0kB HighMem)
Feb 11 08:04:05 l1 kernel: Active:15050 inactive:8995 dirty:0 writeback:0 unstable:0 free:758 slab:102918 mapped:3203 pagetables:101
Feb 11 08:04:05 l1 kernel: DMA free:2016kB min:88kB low:108kB high:132kB active:28kB inactive:1092kB present:16384kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 495 495
Feb 11 08:04:05 l1 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 495 495
Feb 11 08:04:05 l1 kernel: Normal free:1016kB min:2800kB low:3500kB high:4200kB active:60172kB inactive:34888kB present:507584kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 0 0
Feb 11 08:04:05 l1 kernel: HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Feb 11 08:04:05 l1 kernel: lowmem_reserve[]: 0 0 0 0
...........
Feb 11 08:04:05 l1 kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Feb 11 08:04:05 l1 kernel: Free swap = 987956kB
Feb 11 08:04:05 l1 kernel: Total swap = 987956kB
Feb 11 08:04:05 l1 kernel: Free swap: 987956kB
Feb 11 08:04:05 l1 kernel: 130992 pages of RAM
Feb 11 08:04:05 l1 kernel: 0 pages of HIGHMEM
Feb 11 08:04:05 l1 kernel: 2137 reserved pages
Feb 11 08:04:05 l1 kernel: 28840 pages shared
Feb 11 08:04:05 l1 kernel: 0 pages swap cached
Feb 11 08:04:05 l1 kernel: 0 pages dirty
Feb 11 08:04:05 l1 kernel: 0 pages writeback
Feb 11 08:04:05 l1 kernel: 3203 pages mapped
Feb 11 08:04:05 l1 kernel: 102918 pages slab
Feb 11 08:04:05 l1 kernel: 101 pages pagetables
Feb 11 08:04:05 l1 kernel: ksoftirqd/0: page allocation failure. order:0, mode:0x20
Feb 11 08:04:05 l1 kernel: [<c0137fa6>] __alloc_pages+0x1e6/0x2b0
Feb 11 08:04:05 l1 kernel: [<c013ac50>] kmem_getpages+0x30/0x90
Feb 11 08:04:05 l1 kernel: [<c013b89c>] cache_grow+0x8c/0x120
Feb 11 08:04:05 l1 kernel: [<c013ba4f>] cache_alloc_refill+0x11f/0x1d0
Feb 11 08:04:05 l1 kernel: [<c013bd6f>] __kmalloc+0x4f/0x60
Feb 11 08:04:05 l1 kernel: [<c028f200>] __alloc_skb+0x40/0x130
Feb 11 08:04:05 l1 kernel: [<c023c4a0>] e1000_alloc_rx_buffers+0x60/0x360
Feb 11 08:04:05 l1 kernel: [<c023bc83>] e1000_clean_rx_irq+0x1d3/0x4a0
Feb 11 08:04:05 l1 kernel: [<c02649bb>] rtl8169_rx_fill+0x5b/0x70
Feb 11 08:04:05 l1 kernel: [<c023b4fa>] e1000_clean+0x9a/0x150
Feb 11 08:04:05 l1 kernel: [<c011d790>] ksoftirqd+0x0/0x80
Feb 11 08:04:05 l1 kernel: [<c0294ce1>] net_rx_action+0x61/0xe0
Feb 11 08:04:05 l1 kernel: [<c011d479>] __do_softirq+0x79/0x90
Feb 11 08:04:05 l1 kernel: [<c011d4b6>] do_softirq+0x26/0x30
Feb 11 08:04:05 l1 kernel: [<c011d7dd>] ksoftirqd+0x4d/0x80
Feb 11 08:04:05 l1 kernel: [<c012a3cc>] kthread+0x9c/0xb0
Feb 11 08:04:05 l1 kernel: [<c012a330>] kthread+0x0/0xb0
Feb 11 08:04:05 l1 kernel: [<c0100f65>] kernel_thread_helper+0x5/0x10

And it continues like this for a long, long time ....

Does anybody know what's wrong, or how I can fix this?

Thanks.
Andrei SANDU.
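
For anyone reading the memory dump above, the per-zone figures can be pulled out mechanically. The following is only a minimal sketch in Python, not a diagnosis: it reports any zone whose free memory is below its min watermark and the share of RAM pages held by slab. The helper names `zones_below_min` and `slab_share` are hypothetical, and the `LOG` excerpt is copied from the dump with the syslog prefix and trailing fields removed.

```python
import re

# Excerpt copied from the kernel log above (syslog prefix and trailing fields stripped).
LOG = """\
DMA free:2016kB min:88kB low:108kB high:132kB
Normal free:1016kB min:2800kB low:3500kB high:4200kB
130992 pages of RAM
102918 pages slab
"""

# "<zone> free:<n>kB min:<n>kB ..." lines from the dump.
ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB", re.MULTILINE)
# "<n> pages of RAM" and "<n> pages slab" lines from the dump.
COUNT_RE = re.compile(r"^(\d+) pages (of RAM|slab)$", re.MULTILINE)


def zones_below_min(log):
    """Return {zone: (free_kb, min_kb)} for zones with free memory under the min watermark."""
    return {
        name: (int(free), int(minimum))
        for name, free, minimum in ZONE_RE.findall(log)
        if int(free) < int(minimum)
    }


def slab_share(log):
    """Return the fraction of RAM pages accounted to slab."""
    counts = {label: int(n) for n, label in COUNT_RE.findall(log)}
    return counts["slab"] / counts["of RAM"]


if __name__ == "__main__":
    for zone, (free_kb, min_kb) in zones_below_min(LOG).items():
        print(f"{zone}: free {free_kb} kB is below min {min_kb} kB")
    print(f"slab: {slab_share(LOG):.0%} of RAM pages")
```

Run against the excerpt, this flags only the Normal zone (1016 kB free against a 2800 kB min watermark); the DMA zone is above its watermark.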
_______________________________________________
LARTC mailing list
LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/cgi-bin/mailman/listinfo/lartc