Re: lockups with netconsole on e1000 on media insertion
Steven Rostedt wrote:
> I don't have the card, so I can't test it. But if this works (after removing the previous patch) then this is the better solution.

I can confirm that this alone does not work for the simple unplug/re-plug cycle I described; it still locks up hard. I tried this alone on -rc6.

---
John Bäckstrand
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: assertion (cnt <= tp->packets_out) failed
Someone asked if I could try to trigger this assertion again. I'm afraid I probably cannot, since I didn't do anything special at the time. But I've got something even better for you all: a BUG from something TCP-related. Mind you, I am trying to track down a possibly hardware-related issue here, so if this bug does not make any sense, it might be my hardware! I would actually like to know whether this is likely hardware-related or not, since I have no idea if it's the RAM, CPU, motherboard or "only" a disk that is broken. I know _something_ is broken, because of the lockups and because an HDD diagnostic flagged a faulty disk, but only once; the disk is apparently fine 99% of the time.

---
John Bäckstrand

[148475.651000] [ cut here ]
[148475.651050] kernel BUG at net/ipv4/tcp_output.c:918!
[148475.651078] invalid operand: [#1]
[148475.651103] Modules linked in: sha256 aes_i586 dm_crypt ipt_state ipt_multiport ipt_MASQUERADE iptable_filter netconsole md5 ipv6 af_packet pdc202xx_new e1000 8139cp de2104x i2c_viapro via686a i2c_sensor i2c_core uhci_hcd usbcore 3c59x 8139too mii de4x5 crc32 parport_pc parport reiserfs dm_mod ip_nat_ftp iptable_nat ip_tables ip_conntrack_ftp ip_conntrack rtc unix
[148475.651378] CPU:0
[148475.651380] EIP:0060:[]Not tainted VLI
[148475.651383] EFLAGS: 00010287 (2.6.13-rc5sand4)
[148475.651464] EIP is at tcp_tso_should_defer+0xc9/0xe0
[148475.651494] eax: 002b ebx: ce49a660 ecx: 002c edx: ca008d00
[148475.651526] esi: 002c edi: 000e ebp: 99d57104 esp: c0865dec
[148475.651556] ds: 007b es: 007b ss: 0068
[148475.651582] Process tor (pid: 10849, threadinfo=c0864000 task=c6234530)
[148475.651602] Stack: ce49a660 002c ca008d00 99d57104 c02866fc ca008d00 ca008d00 ce49a660
[148475.651676]003a 0102 000e 0001 ca008d00 ca008d00 ca008d00 c9290034
[148475.651751]c0286a49 ca008d00 05b4 0001 c0254674 81dd5b2f 81dd5b2f 0010
[148475.651823] Call Trace:
[148475.651869] [] tcp_write_xmit+0xcc/0x3e0
[148475.651910] [] __tcp_push_pending_frames+0x39/0xd0
[148475.651947] [] kfree_skbmem+0x24/0x30
[148475.651988] [] tcp_rcv_established+0x26e/0x840
[148475.652033] [] tcp_v4_do_rcv+0x115/0x120
[148475.652072] [] tcp_v4_rcv+0x64f/0x890
[148475.652106] [] ip_local_deliver_finish+0x0/0x1c0
[148475.652150] [] nf_hook_slow+0x6e/0x100
[148475.652199] [] ip_local_deliver+0xe3/0x250
[148475.652234] [] ip_local_deliver_finish+0x0/0x1c0
[148475.652272] [] ip_rcv+0x355/0x4e0
[148475.652309] [] ip_rcv_finish+0x0/0x290
[148475.652347] [] netif_receive_skb+0x1f1/0x270
[148475.652394] [] process_backlog+0x7f/0x100
[148475.652431] [] net_rx_action+0x7a/0x120
[148475.652467] [] __do_softirq+0x7d/0x90
[148475.652509] [] do_softirq+0x26/0x30
[148475.652544] [] do_IRQ+0x1e/0x30
[148475.652588] [] common_interrupt+0x1a/0x20
[148475.652630] Code: db 74 1d 89 f8 0f af c2 39 f0 0f 46 f0 31 d2 89 f0 f7 f3 31 d2 39 c1 73 cb ba 01 00 00 00 eb c4 6b c2 03 31 d2 39 c1 77 bb eb ee <0f> 0b 96 03 20 2f 2e c0 eb 83 8b ba 7c 02 00 00 eb ee 90 8d 74
[148475.653330] <0>Kernel panic - not syncing: Fatal exception in interrupt
Re: [PATCH] netpoll can lock up on low memory.
Steven Rostedt wrote:
> In my last email, I stated that this discussion seems to have demonstrated that the e1000 driver's netpoll is indeed broken, and needs to be fixed. I submitted a patch for this earlier, but it's untested and someone who owns an e1000 needs to try it.

I can test this, but not right now: I'm trying, again, to track down my hard lockup issue, so I will run this machine until it locks up. It lasted 9 days at one point, so it could take some time, I'm afraid.

---
John Bäckstrand
Re: assertion (cnt <= tp->packets_out) failed
> Hang on a second, the original poster mentioned rc5. Is this really pristine rc5 with the one netpoll patch? If so then it can't be the patches we're talking about because they only went in days later.

Yes, I have no other patches in, so if it was not in -rc5, I was not running it.

---
John Bäckstrand
assertion (cnt <= tp->packets_out) failed
I get

KERNEL: assertion (cnt <= tp->packets_out) failed at net/ipv4/tcp_input.c (1476)

with 2.6.13-rc5, also with a small netpoll patch that shouldn't affect these things (see the thread "lockups with netconsole on e1000 on media insertion"). I have a fair number of dropped packets and overruns:

eth2      Link encap:Ethernet  HWaddr 00:50:DA:E0:BB:36
          inet addr:83.233.27.60  Bcast:83.233.27.255  Mask:255.255.255.0
          inet6 addr: fe80::250:daff:fee0:bb36/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9141685 errors:0 dropped:0 overruns:794 frame:0
          TX packets:10596040 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:950232746 (906.2 MiB)  TX bytes:804721505 (767.4 MiB)
          Interrupt:10 Base address:0x8800

eth3      Link encap:Ethernet  HWaddr 00:0E:0C:75:F1:2A
          inet addr:10.32.0.1  Bcast:10.255.255.255  Mask:255.255.0.0
          inet6 addr: fe80::20e:cff:fe75:f12a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:16000  Metric:1
          RX packets:16090188 errors:2329 dropped:4658 overruns:2329 frame:0
          TX packets:34370559 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1148661167 (1.0 GiB)  TX bytes:4000412315 (3.7 GiB)
          Base address:0x8400 Memory:e200-e202

ethtool -S eth3
NIC statistics:
     rx_packets: 16195970
     tx_packets: 34563822
     rx_bytes: 1258213074
     tx_bytes: 4205874656
     rx_errors: 2332
     tx_errors: 0
     rx_dropped: 2332
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 2332
     rx_no_buffer_count: 0
     rx_missed_errors: 2332
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 2981894
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 14143114962
     rx_csum_offload_good: 16195740
     rx_csum_offload_errors: 0

ethtool -S eth2
NIC statistics:
     tx_deferred: 0
     tx_multiple_collisions: 0
     rx_bad_ssd: 0

---
John Bäckstrand
Re: lockups with netconsole on e1000 on media insertion
Andi Kleen wrote:
> The patch was for 2.6.12, did a quick untested port to 2.6.13rc5.
>
> -Andi
>
> Only try a limited number to send packets in netpoll

Thanks, worked nicely!

---
John Bäckstrand
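
The patch itself is not reproduced in this message, so the following is only a user-space sketch of the general idea named in its title: cap the number of transmit attempts so that a device which never becomes ready (for example, an unplugged link) cannot make the sender spin forever. Every name here (`send_bounded`, `xmit_fn`, `MAX_SEND_TRIES`, `fake_xmit`, `dead_xmit`) is hypothetical and chosen for illustration; none of it is taken from the kernel's netpoll code or from Andi Kleen's patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver transmit hook: true means "sent". */
typedef bool (*xmit_fn)(void *ctx);

/* Hypothetical cap on attempts, picked arbitrarily for this sketch. */
#define MAX_SEND_TRIES 5

/*
 * Call xmit() at most max_tries times. On success, store the number of
 * attempts used and return true. If every attempt fails, store max_tries
 * and return false so the caller can drop the packet instead of looping.
 */
static bool send_bounded(xmit_fn xmit, void *ctx, int max_tries,
                         int *tries_used)
{
    int tries;

    assert(max_tries > 0);

    for (tries = 1; tries <= max_tries; tries++) {
        if (xmit(ctx)) {
            *tries_used = tries;
            return true;
        }
    }

    *tries_used = max_tries;
    return false;
}

/* Simulated busy device: fails until the counter behind ctx reaches 0. */
static bool fake_xmit(void *ctx)
{
    int *busy_left = ctx;

    if (*busy_left > 0) {
        (*busy_left)--;
        return false;
    }
    return true;
}

/* Simulated dead link: never accepts a packet. */
static bool dead_xmit(void *ctx)
{
    (void)ctx;
    return false;
}
```

With `fake_xmit` and a counter of 2, `send_bounded` succeeds on the third attempt. With `dead_xmit` it gives up after `MAX_SEND_TRIES` attempts and returns false. The cost of this approach is that a message can be lost when the cap is reached, which is usually the better outcome for a logging path than a hard lockup.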