On Wed, 2013-02-20 at 17:10 +0100, Urban Loesch wrote:
> Hi,
> 
> today I had a strange system hang on one of our new Dell PER620 machines.
> I'm running a self-compiled kernel, version 3.7.2, with the Linux-VServer
> patch included.
> 
> uname -a
> Linux dbhost04 3.7.2-vs2.3.5.5-rol-em64t #4 SMP Sun Feb 3 14:08:37 CET 2013 
> x86_64 GNU/Linux
> 
> The 15-minute load average is between 1 and 3.
> 
> 
> Today the system hung for a few seconds and I got the following errors in
> syslog multiple times within one second:
> 
> ...
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196338] WARNING: at 
> net/core/skbuff.c:573 skb_release_head_state+0xed/0x100()
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196338] Hardware name: PowerEdge 
> R620
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196352] Modules linked in:
> lru_cache netconsole configfs act_police cls_basic cls_flow cls_fw cls_u32
> sch_tbf sch_prio sch_hfsc sch_htb sch_ingress sch_sfq xt_statistic xt_CT
> xt_realm xt_LOG xt_connlimit iptable_raw xt_comment xt_nat xt_recent
> ipt_ULOG ipt_REJECT ipt_MASQUERADE ipt_ECN ipt_CLUSTERIP ipt_ah nf_nat_tftp
> nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp
> nf_nat_amanda nf_conntrack_tftp nf_conntrack_sane nf_conntrack_sip
> nf_conntrack_proto_udplite nf_conntrack_proto_sctp nf_conntrack_pptp
> nf_conntrack_proto_gre nf_conntrack_netlink nf_conntrack_netbios_ns
> nf_conntrack_broadcast nf_conntrack_irc ts_kmp nf_conntrack_h323
> nf_conntrack_amanda nf_conntrack_ftp xt_TPROXY xt_time nf_tproxy_core
> xt_TCPMSS xt_tcpmss xt_sctp xt_policy xt_pkttype xt_NFLOG nfnetlink_log
> xt_physdev xt_owner xt_NFQUEUE xt_multiport xt_mark xt_mac xt_limit
> xt_length xt_iprange xt_helper xt_hashlimit xt_DSCP xt_dscp xt_dccp
> xt_connmark xt_CLASSIFY iptable_nat nf_nat_ipv
> Feb 20 15:58:04 dbhost04 kernel: 4 nf_nat ip6t_REJECT nf_conntrack_ipv4
> xt_tcpudp nf_defrag_ipv4 xt_state nf_conntrack_ipv6 nf_defrag_ipv6
> xt_conntrack nf_conntrack iptable_mangle ip6table_raw ip6table_mangle
> nfnetlink ip6table_filter ip6_tables iptable_filter ip_tables x_tables
> ipmi_devintf ipmi_si ipmi_msghandler coretemp kvm_intel kvm
> ghash_clmulni_intel aesni_intel xts aes_x86_64 lrw gf128mul ablk_helper
> cryptd iTCO_wdt iTCO_vendor_support dcdbas microcode pcspkr joydev lpc_ich
> shpchp hed evbug hid_generic usbhid hid ahci libahci megaraid_sas tg3
> [last unloaded: drbd]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196368] Pid: 10942, comm: mysqld 
> Tainted: G        W    3.7.2-vs2.3.5.5-rol-em64t #4
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196368] Call Trace:
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196370]  <IRQ> [<ffffffff81053bff>] 
> warn_slowpath_common+0x7f/0xc0
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196371] [<ffffffff81594c52>] ? 
> skb_release_data+0xf2/0x110
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196372] [<ffffffff81053c5a>] 
> warn_slowpath_null+0x1a/0x20
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196373] [<ffffffff81594e9d>] 
> skb_release_head_state+0xed/0x100
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196374] [<ffffffff81594c86>] 
> __kfree_skb+0x16/0xa0
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196375] [<ffffffff8159521c>] 
> consume_skb+0x2c/0x80
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196379] [<ffffffffa000b0af>] 
> tg3_poll_work+0x5ef/0xdb0 [tg3]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196384] [<ffffffffa000b055>] ? 
> tg3_poll_work+0x595/0xdb0 [tg3]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196388] [<ffffffffa00145cf>] 
> tg3_poll+0x7f/0x390 [tg3]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196392] [<ffffffffa000b927>] ? 
> tg3_poll_msix+0xb7/0x140 [tg3]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196394] [<ffffffff815b9622>] 
> netpoll_poll_dev+0x162/0x580
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196395] [<ffffffff815b9bcc>] 
> netpoll_send_skb_on_dev+0x18c/0x3a0
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196398] [<ffffffff815ba0f7>] 
> netpoll_send_udp+0x277/0x290
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196400] [<ffffffffa03ae91f>] 
> write_msg+0xaf/0x100 [netconsole]
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196401] [<ffffffff81054959>] 
> call_console_drivers.constprop.16+0x99/0x100
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196403] [<ffffffff810553b9>] 
> console_unlock+0x3d9/0x420
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196404] [<ffffffff81055ca5>] 
> vprintk_emit+0x255/0x510
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196406] [<ffffffff8169f0b9>] 
> printk+0x61/0x63
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196407] [<ffffffff81031e8e>] 
> therm_throt_process+0x13e/0x180
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196408] [<ffffffff81032066>] 
> intel_thermal_interrupt+0x196/0x1a0
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196410] [<ffffffff810320c1>] 
> smp_thermal_interrupt+0x21/0x40
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196411] [<ffffffff816b1a1a>] 
> thermal_interrupt+0x6a/0x70
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196413]  <EOI> [<ffffffff816b0e19>] 
> ? system_call_fastpath+0x16/0x1b
> Feb 20 15:58:04 dbhost04 kernel: [1463997.196414] ---[ end trace 
> e3ec69533a534ff5 ]---
> ...
> 
> After the last message I got these entries in syslog, too:
> Feb 20 15:58:04 dbhost04 kernel: [1464001.755218] CPU18: Core power limit 
> normal
> Feb 20 15:58:04 dbhost04 kernel: [1464001.760038] Clocksource tsc unstable 
> (delta = 299966106527 ns)
> Feb 20 15:58:04 dbhost04 kernel: [1464001.769627] Switching to clocksource 
> hpet
> 
> I searched the archives for this error, but couldn't find a solution.
> My second PER620 has not shown this error so far.
> 
> Do you have any idea what could cause this problem?
> 
> I'm not subscribed to lkml; if you need more information, please contact me
> directly by email.
> 
> Many thanks for your help.
> Urban

CC netdev

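I guess tg3 needs to call dev_kfree_skb_any() here. The trace shows tg3's TX
completion path (tg3_poll_work, presumably with tg3_tx() inlined) being
entered via netpoll from the thermal interrupt handler, i.e. in hard IRQ
context, where dev_kfree_skb() triggers this warning.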

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index bdb0869..22d9e44 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -5942,7 +5942,7 @@ static void tg3_tx(struct tg3_napi *tnapi)
                pkts_compl++;
                bytes_compl += skb->len;
 
-               dev_kfree_skb(skb);
+               dev_kfree_skb_any(skb);
 
                if (unlikely(tx_bug)) {
                        tg3_tx_recover(tp);
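
For anyone not familiar with the helper, here is a rough user-space model of
why the _any() variant should avoid the warning. It is only a sketch, not
kernel code: every model_* name is made up for illustration, the single
model_in_hardirq flag stands in for the kernel's in_irq() || irqs_disabled()
check, and the linked list stands in for the per-CPU completion queue that
the TX softirq drains later. As far as I recall, the real WARN_ON(in_irq())
only fires for skbs that have a destructor; the model ignores that detail.

```c
/* User-space model only; none of these model_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_skb {
	int id;
	bool freed;
	struct model_skb *next;
};

/* Assumption: stands in for in_irq() || irqs_disabled(). */
static bool model_in_hardirq;
/* Assumption: stands in for the per-CPU completion queue. */
static struct model_skb *completion_queue;
/* Counts the events that would be WARN_ON(in_irq()) splats. */
static int warnings;

/* Models dev_kfree_skb(): frees right away, so it warns in hard IRQ. */
static void model_kfree_skb(struct model_skb *skb)
{
	if (model_in_hardirq)
		warnings++;
	skb->freed = true;
}

/* Models dev_kfree_skb_irq(): only queues the skb for a later free. */
static void model_kfree_skb_irq(struct model_skb *skb)
{
	skb->next = completion_queue;
	completion_queue = skb;
}

/* Models dev_kfree_skb_any(): picks the variant that is safe right now. */
static void model_kfree_skb_any(struct model_skb *skb)
{
	if (model_in_hardirq)
		model_kfree_skb_irq(skb);
	else
		model_kfree_skb(skb);
}

/* Models the softirq that later frees queued skbs outside hard IRQ. */
static int model_drain_completion_queue(void)
{
	bool saved = model_in_hardirq;
	int n = 0;

	model_in_hardirq = false;
	while (completion_queue) {
		struct model_skb *skb = completion_queue;

		completion_queue = skb->next;
		skb->next = NULL;
		model_kfree_skb(skb);
		n++;
	}
	model_in_hardirq = saved;
	return n;
}
```

In the model, a direct model_kfree_skb() with model_in_hardirq set bumps the
warning counter, while model_kfree_skb_any() just queues the skb and the
later drain frees it without a warning. That mirrors the netconsole case in
the report, where the free happens from inside the thermal interrupt.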


