Stephen Hemminger wrote:
> On Wed, 24 May 2006 10:28:52 +0100
> "Daniel J Blueman" <[EMAIL PROTECTED]> wrote:
> 
> 
>>Having done some more stress testing with sky2 1.4 (in 2.6.17-rc4) and
>>the latest patch, I have found problems when streaming lots of data
>>out of the sky2 interface (e.g. via Samba serving a large file to a GigE
>>client). Ultimately, the interface will stop sending.
>>
>>Before this happens, I see lots of:
>>
>>kernel: lan0: hw csum failure.
>>kernel:  [__skb_checksum_complete+86/96] __skb_checksum_complete+0x56/0x60
>>kernel:  [tcp_error+300/512] tcp_error+0x12c/0x200
>>kernel:  [poison_obj+41/96] poison_obj+0x29/0x60
>>kernel:  [tcp_error+0/512] tcp_error+0x0/0x200
>>kernel:  [ip_conntrack_in+157/1072] ip_conntrack_in+0x9d/0x430
>>kernel:  [kfree_skbmem+8/128] kfree_skbmem+0x8/0x80
>>kernel:  [arp_process+102/1408] arp_process+0x66/0x580
>>kernel:  [check_poison_obj+36/416] check_poison_obj+0x24/0x1a0
>>kernel:  [arp_process+102/1408] arp_process+0x66/0x580
>>kernel:  [nf_iterate+99/144] nf_iterate+0x63/0x90
>>kernel:  [ip_rcv_finish+0/608] ip_rcv_finish+0x0/0x260
>>kernel:  [nf_hook_slow+89/240] nf_hook_slow+0x59/0xf0
>>kernel:  [ip_rcv_finish+0/608] ip_rcv_finish+0x0/0x260
>>kernel:  [ip_rcv+386/1104] ip_rcv+0x182/0x450
>>kernel:  [ip_rcv_finish+0/608] ip_rcv_finish+0x0/0x260
>>kernel:  [packet_rcv_spkt+216/320] packet_rcv_spkt+0xd8/0x140
>>kernel:  [netif_receive_skb+476/784] netif_receive_skb+0x1dc/0x310
>>kernel:  [sky2_poll+879/2096] sky2_poll+0x36f/0x830
>>kernel:  [_spin_lock_irqsave+9/16] _spin_lock_irqsave+0x9/0x10
>>kernel:  [run_timer_softirq+290/416] run_timer_softirq+0x122/0x1a0
>>kernel:  [net_rx_action+108/256] net_rx_action+0x6c/0x100
>>kernel:  [__do_softirq+66/160] __do_softirq+0x42/0xa0
>>kernel:  [do_softirq+78/96] do_softirq+0x4e/0x60
>>kernel:  =======================
>>kernel:  [do_IRQ+90/160] do_IRQ+0x5a/0xa0
>>kernel:  [remove_vma+69/80] remove_vma+0x45/0x50
>>kernel:  [common_interrupt+26/32] common_interrupt+0x1a/0x20
>>kernel:  [get_offset_pmtmr+151/3584] get_offset_pmtmr+0x97/0xe00
>>kernel:  [do_gettimeofday+26/208] do_gettimeofday+0x1a/0xd0
>>kernel:  [sys_gettimeofday+26/144] sys_gettimeofday+0x1a/0x90
>>kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
> 
> 
> 
> Whatever the netfilter chain is, it is trimming or altering the packet
> without clearing or updating the hardware checksum. It is not a driver
> problem; we saw the same thing with VLANs and ebtables already.


The call chain looks pretty messed up, but the point where the
invalid HW checksum is detected is TCP connection tracking,
which is basically the first thing netfilter does unless
you use the raw table. Conntrack does not modify packets,
so I doubt that netfilter is the culprit here.
Of course we had some big checksumming cleanups, so there is
a possibility of bugs there, but I did test them with sky2 and
HW checksumming, so I don't think that's the case.
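
For anyone following along, here is a minimal userspace sketch of the
failure mode Stephen describes: a full-frame hardware sum goes stale
once trailing bytes are trimmed, unless the trimmed bytes are subtracted
from it. This is not kernel code. The helper names (sum16, fold16,
csum_sub16, checksum_trim_demo) and the 12-byte frame are made up for
illustration, and it assumes the trim happens at an even offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Accumulate 16-bit big-endian words into a 32-bit sum (RFC 1071 style).
 * Fine for short buffers; long inputs would need carry handling. */
static uint32_t sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)                      /* odd trailing byte, zero-padded */
        sum += (uint32_t)buf[i] << 8;
    return sum;
}

/* Fold the carries back in to get a 16-bit ones'-complement sum. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones'-complement subtraction: remove `part` from `total`. */
static uint16_t csum_sub16(uint16_t total, uint16_t part)
{
    return fold16((uint32_t)total + (uint16_t)~part);
}

/* Hypothetical frame: 8 payload bytes followed by 4 trailer bytes. */
static const uint8_t frame[12] = {
    0x45, 0x00, 0x00, 0x1c, 0xde, 0xad, 0xbe, 0xef,   /* payload */
    0x12, 0x34, 0x56, 0x78                            /* trailer to trim */
};

static void checksum_trim_demo(void)
{
    uint16_t hw    = fold16(sum16(frame, 12));  /* sum over the whole frame */
    uint16_t want  = fold16(sum16(frame, 8));   /* sum over the trimmed packet */
    uint16_t fixed = csum_sub16(hw, fold16(sum16(frame + 8, 4)));

    printf("stale hw sum : 0x%04x\n", hw);
    printf("expected sum : 0x%04x\n", want);
    printf("adjusted sum : 0x%04x\n", fixed);
    assert(hw != want);     /* trimming alone leaves the hw sum stale */
    assert(fixed == want);  /* subtracting the trimmed bytes repairs it */
}
```

Calling checksum_trim_demo() from a main() shows the stale and adjusted
sums side by side. Code that trims a packet has to either adjust the
stored sum like this or drop it and let software verify the checksum.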

Daniel, is there an easy way to reproduce the checksum failure?