Re: unkillable dpkg-query processes

2007-11-01 Thread David Miller
From: David Miller <[EMAIL PROTECTED]>
Date: Thu, 01 Nov 2007 15:01:13 -0700 (PDT)

> I'm working on a kernel patch for 2.6.23 that will allow you to get
> some useful debugging information in situations like this.
>
> I'll try to get you that patch by the end of tonight.

As promised, here is the patch below.

To trigger the debugging log, simply give the console
an "Alt-SysRq" then a "g".

On a serial console you can do this by giving a single
BREAK then a "g".

If you're having trouble triggering the sysrq on the
console, try instead:

bash# echo "g" >/proc/sysrq-trigger
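
For scripted use, the same write can be wrapped in a small helper. This is only a sketch: the function name `sysrq_send` and its optional second argument are made up for illustration, and the "g" command only produces output on a kernel that has the patch applied.

```shell
# Hypothetical helper (not part of the patch): write one SysRq command
# character to the trigger file. The path defaults to /proc/sysrq-trigger;
# the second argument exists only so the helper can be tried on a scratch file.
sysrq_send() {
    if [ ! -w "${2:-/proc/sysrq-trigger}" ]; then
        echo "cannot write ${2:-/proc/sysrq-trigger} (are you root?)" >&2
        return 1
    fi
    printf '%s' "$1" > "${2:-/proc/sysrq-trigger}"
}
```

Run as root, `sysrq_send g` followed by `dmesg | tail` should then show the register dump in the kernel log.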

Here is some sample output from my Niagara-2 system while
running a benchmark.  The current CPU is denoted by the
leading "*" character.

[81940.250994] SysRq : Show Global CPU Regs
[81940.251800] * CPU[  0]: TSTATE[e2001602] TPC[0055813c] TNPC[00558140] TASK[dd:2940]
[81940.252206]  TPC[NGbzero_loop+0x1c/0x38]
[81940.252422]   CPU[  1]: TSTATE[004411001607] TPC[0055c9bc] TNPC[0055c9c0] TASK[dd:2926]
[81940.252739]  TPC[atomic_sub_ret+0x4/0x30]
[81940.252936]   CPU[  2]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2899]
[81940.253238]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.253451]   CPU[  3]: TSTATE[e2001602] TPC[00558130] TNPC[00558134] TASK[dd:2929]
[81940.253776]  TPC[NGbzero_loop+0x10/0x38]
[81940.253993]   CPU[  4]: TSTATE[e2001602] TPC[00558124] TNPC[00558128] TASK[dd:2947]
[81940.254325]  TPC[NGbzero_loop+0x4/0x38]
[81940.254497]   CPU[  5]: TSTATE[004411001606] TPC[00495f94] TNPC[00495f98] TASK[dd:2908]
[81940.254893]  TPC[do_generic_mapping_read+0xbc/0x428]
[81940.255203]   CPU[  6]: TSTATE[11001607] TPC[0055fee8] TNPC[0055feec] TASK[dd:2920]
[81940.255699]  TPC[NG2copy_to_user+0x468/0x680]
[81940.256104]   CPU[  7]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2935]
[81940.256574]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.256972]   CPU[  8]: TSTATE[e2001602] TPC[00558124] TNPC[00558128] TASK[dd:2903]
[81940.257399]  TPC[NGbzero_loop+0x4/0x38]
[81940.257899]   CPU[  9]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2904]
[81940.258240]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.258482]   CPU[ 10]: TSTATE[e2001602] TPC[00558138] TNPC[0055813c] TASK[dd:2902]
[81940.258808]  TPC[NGbzero_loop+0x18/0x38]
[81940.258999]   CPU[ 11]: TSTATE[e2001602] TPC[00558120] TNPC[00558124] TASK[dd:2941]
[81940.259319]  TPC[NGbzero_loop+0x0/0x38]
[81940.259487]   CPU[ 12]: TSTATE[e2001602] TPC[00558130] TNPC[00558134] TASK[dd:2919]
[81940.259801]  TPC[NGbzero_loop+0x10/0x38]
[81940.260012]   CPU[ 13]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2950]
[81940.260350]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.260564]   CPU[ 14]: TSTATE[e2001602] TPC[00558134] TNPC[00558138] TASK[dd:2936]
[81940.260937]  TPC[NGbzero_loop+0x14/0x38]
[81940.261150]   CPU[ 15]: TSTATE[11001607] TPC[0055fee8] TNPC[0055feec] TASK[dd:2905]
[81940.261457]  TPC[NG2copy_to_user+0x468/0x680]
[81940.261677]   CPU[ 16]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2923]
[81940.261973]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.262167]   CPU[ 17]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2897]
[81940.262462]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.262643]   CPU[ 18]: TSTATE[e2001602] TPC[00558128] TNPC[0055812c] TASK[dd:2909]
[81940.262987]  TPC[NGbzero_loop+0x8/0x38]
[81940.263180]   CPU[ 19]: TSTATE[11001607] TPC[0055fee8] TNPC[0055feec] TASK[dd:2913]
[81940.263500]  TPC[NG2copy_to_user+0x468/0x680]
[81940.263901]   CPU[ 20]: TSTATE[e2001602] TPC[00558128] TNPC[0055812c] TASK[dd:2890]
[81940.264403]  TPC[NGbzero_loop+0x8/0x38]
[81940.264679]   CPU[ 21]: TSTATE[11001607] TPC[0055fee8] TNPC[0055feec] TASK[dd:2906]
[81940.265152]  TPC[NG2copy_to_user+0x468/0x680]
[81940.265535]   CPU[ 22]: TSTATE[11001607] TPC[0055feec] TNPC[0055fef0] TASK[dd:2918]
[81940.266075]  TPC[NG2copy_to_user+0x46c/0x680]
[81940.266448]   CPU[ 23]: TSTATE[11001607] TPC[0055fee8] TNPC[0055feec] TASK[dd:2900]
[81940.266942]  TPC[NG2copy_to_user+0x468/0x680]
[81940.267328]   CPU[ 24]: TSTATE[11001602] TPC[0049a618] TNPC[0049a61c] TASK[dd:2938]
[81940.267710]  
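
On a machine with many CPUs a dump like the one above is easier to read once it is collapsed into a count per kernel symbol. The following is a rough sketch, not part of the patch; the function name `summarize_tpc` is invented here, and it assumes the symbolic lines keep the `TPC[symbol+0xoff/0xsize]` shape shown in the sample.

```shell
# Hypothetical helper: read a "Show Global CPU Regs" dump on stdin and print
# how many CPUs were executing in each symbol, most frequent first.
# Only symbolic lines such as "TPC[NGbzero_loop+0x1c/0x38]" are counted;
# raw addresses such as "TPC[0055813c]" do not match the pattern.
summarize_tpc() {
    awk '
        match($0, /TPC\[[A-Za-z_][A-Za-z0-9_.]*\+0x/) {
            # Strip the leading "TPC[" (4 chars) and the trailing "+0x" (3 chars).
            sym = substr($0, RSTART + 4, RLENGTH - 7)
            count[sym]++
        }
        END { for (s in count) print count[s], s }
    ' | sort -k1,1nr -k2,2
}
```

Typical use would be `dmesg | summarize_tpc` right after triggering the dump.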

[PATCH] sparc64: remove duplicate includes

2007-11-01 Thread lizf

This patch removes duplicate includes in arch/sparc64.

Signed-off-by: Li Zefan <[EMAIL PROTECTED]>

---
 arch/sparc64/kernel/ds.c          |    1 -
 arch/sparc64/kernel/module.c      |    1 -
 arch/sparc64/kernel/sys_sparc32.c |    1 -
 arch/sparc64/kernel/sys_sunos32.c |    1 -
 arch/sparc64/kernel/time.c        |    2 --
 5 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/arch/sparc64/kernel/ds.c b/arch/sparc64/kernel/ds.c
index 9f472a7..eeb5a2f 100644
--- a/arch/sparc64/kernel/ds.c
+++ b/arch/sparc64/kernel/ds.c
@@ -6,7 +6,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
diff --git a/arch/sparc64/kernel/module.c b/arch/sparc64/kernel/module.c
index 5798715..158484b 100644
--- a/arch/sparc64/kernel/module.c
+++ b/arch/sparc64/kernel/module.c
@@ -11,7 +11,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include 
diff --git a/arch/sparc64/kernel/sys_sparc32.c b/arch/sparc64/kernel/sys_sparc32.c
index 78caff9..98c4688 100644
--- a/arch/sparc64/kernel/sys_sparc32.c
+++ b/arch/sparc64/kernel/sys_sparc32.c
@@ -51,7 +51,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include 
 #include 
diff --git a/arch/sparc64/kernel/sys_sunos32.c b/arch/sparc64/kernel/sys_sunos32.c
index 170d6ca..cfc22d3 100644
--- a/arch/sparc64/kernel/sys_sunos32.c
+++ b/arch/sparc64/kernel/sys_sunos32.c
@@ -57,7 +57,6 @@
 #include 
 
 /* For SOCKET_I */
-#include 
 #include 
 #include 
 
diff --git a/arch/sparc64/kernel/time.c b/arch/sparc64/kernel/time.c
index cd8c740..54bdb88 100644
--- a/arch/sparc64/kernel/time.c
+++ b/arch/sparc64/kernel/time.c
@@ -28,7 +28,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -47,7 +46,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 DEFINE_SPINLOCK(mostek_lock);
-- 
1.5.3.rc7

-
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: unkillable dpkg-query processes

2007-11-01 Thread Bernd Zeimetz


> The futex() calls are definitely from libnss-db.

And on Lenny/testing we have futex calls from libc6.
I haven't had time to come up with any instructions yet, as we have
public holidays today. I'll try to finish them tomorrow.

-- 
Bernd Zeimetz
<[EMAIL PROTECTED]> 


Re: unkillable dpkg-query processes

2007-11-01 Thread David Miller
From: Josip Rodin <[EMAIL PROTECTED]>
Date: Thu, 1 Nov 2007 22:40:37 +0100

> Given that it's still not catatonic, can I do something to provide some
> debugging information?

Not really. I'm working on a kernel patch for 2.6.23 that will
allow you to get some useful debugging information in situations
like this.

I'll try to get you that patch by the end of tonight.


Re: unkillable dpkg-query processes

2007-11-01 Thread Josip Rodin
Hi,

lebrun.d.o hasn't crashed in a while now, but it has this in the
process list:

buildd    2382  0.0  0.2   8144  4736 ?        Ss   Oct30   0:00 /usr/bin/perl /usr/bin/buildd
buildd    2407  0.0  0.5  13920 11296 ?        SN   Oct30   0:10  \_ /usr/bin/perl /usr/bin/sbuild --batch --stats-dir=/home/buildd/
buildd   18174  0.0  0.0      0     0 ?        ZNs  Oct30   0:00  \_ [su]

buildd   23305  100  1.6 1007296 33288 ?       RN   Oct30 3507:30 dpkg-query --status squashfs-source

At the same time:

% free
             total       used       free     shared    buffers     cached
Mem:       2073040    2021224      51816          0     196808      21144
-/+ buffers/cache:    1803272     269768
Swap:      1048688        104    1048584
% uptime
 22:38:36 up 2 days, 10:53,  1 user,  load average: 3.00, 3.01, 3.00

Given that it's still not catatonic, can I do something to provide some
debugging information?

(BTW, I'm subscribed to the sparclinux list now.)

-- 
 2. That which causes joy or happiness.


Re: Strange CPU occupation... and system hangs

2007-11-01 Thread BERTRAND Joël

BERTRAND Joël wrote:




and some processes are in D state:
Root gershwin:[/etc] > ps auwx | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       270  0.0  0.0      0     0 ?        D    Oct27   1:17 [pdflush]
root      3676  0.9  0.0      0     0 ?        D    Oct27  56:03 [nfsd]
root      5435  0.0  0.0      0     0 ?        D<   Oct27   3:16 [md7_raid1]
root      5438  0.0  0.0      0     0 ?        D<   Oct27   1:01 [kjournald]
root      5440  0.0  0.0      0     0 ?        D<   Oct27   0:33 [loop0]
root      5441  0.0  0.0      0     0 ?        D<   Oct27   0:05 [kjournald]
root     16442  0.0  0.0  20032  1208 pts/2    D+   13:23   0:00 iftop -i eth2


Why is md7_raid1 in D state? Same question for iftop.
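
A note on the listing itself: `grep D` matches any line that contains a capital D anywhere, not only tasks in uninterruptible sleep. A more selective filter can test the STAT column directly. This is a sketch only; the name `filter_dstate` is made up, and it assumes a procps-style `ps` that accepts `-eo pid,stat,wchan:32,args`, so that STAT is the second column.

```shell
# Hypothetical helper: keep the header line plus every row whose STAT field
# (second column) starts with "D", i.e. tasks in uninterruptible sleep.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Intended use (assumes procps ps; WCHAN names the kernel function the task sleeps in):
#   ps -eo pid,stat,wchan:32,args | filter_dstate
```

The WCHAN column often gives a first hint about what a D-state task is waiting for.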


	Some bad news: after ten or eleven hours, the kernel crashed on this
server. The last top screen was:


top - 04:59:46 up 4 days, 16:24,  3 users,  load average: 19.72, 19.22, 19.05
Tasks: 285 total,   5 running, 279 sleeping,   0 stopped,   1 zombie
Cpu(s):  0.0%us,  4.2%sy,  0.0%ni, 68.5%id, 27.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4139024k total,  4130800k used,     8224k free,    38984k buffers
Swap:  7815536k total,      304k used,  7815232k free,    79056k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 5426 root      15  -5     0    0    0 R  100  0.0 970:17.21 md_d0_raid5
26923 root      20   0  3120 1568 1112 R    2  0.0  13:32.24 top


...

	I have rebooted. There are no messages in the log files. I don't have
a screen attached, and I saw nothing on the serial console. In kern.log, I
have:
Oct 31 15:36:15 gershwin kernel: swapper: page allocation failure. order:2, mode:0x4020
Oct 31 15:36:15 gershwin kernel: Call Trace:
Oct 31 15:36:15 gershwin kernel:  [004b6568] __slab_alloc+0x1b0/0x720
Oct 31 15:36:15 gershwin kernel:  [004b87a8] __kmalloc_track_caller+0xb0/0xe0
Oct 31 15:36:15 gershwin kernel:  [00601d68] __alloc_skb+0x50/0x120
Oct 31 15:36:15 gershwin kernel:  [00642ee0] tcp_collapse+0x1e8/0x440
Oct 31 15:36:15 gershwin kernel:  [00643298] tcp_prune_queue+0x160/0x3a0
Oct 31 15:36:15 gershwin kernel:  [00643d08] tcp_data_queue+0x830/0xde0
Oct 31 15:36:15 gershwin kernel:  [00645d74] tcp_rcv_established+0x35c/0x840
Oct 31 15:36:15 gershwin kernel:  [0064cf7c] tcp_v4_do_rcv+0xe4/0x4a0
Oct 31 15:36:15 gershwin kernel:  [0064fdd8] tcp_v4_rcv+0xb00/0xb20
Oct 31 15:36:15 gershwin kernel:  [0062e2ac] ip_local_deliver+0x194/0x3a0
Oct 31 15:36:15 gershwin kernel:  [0062dd98] ip_rcv+0x360/0x6e0
Oct 31 15:36:15 gershwin kernel:  [00607f64] netif_receive_skb+0x1ec/0x480
Oct 31 15:36:15 gershwin kernel:  [005a5fe0] tg3_poll+0x6c8/0xc40
Oct 31 15:36:15 gershwin kernel:  [0060a940] net_rx_action+0x88/0x160
Oct 31 15:36:15 gershwin kernel:  [00468078] __do_softirq+0x80/0x100
Oct 31 15:36:15 gershwin kernel:  [0046815c] do_softirq+0x64/0x80
Oct 31 15:36:15 gershwin kernel: Mem-info:
Oct 31 15:36:15 gershwin kernel: Normal per-cpu:
Oct 31 15:36:15 gershwin kernel: CPU    0: Hot: hi:   90, btch:  15 usd:  15   Cold: hi:   30, btch:   7 usd:   5
Oct 31 15:36:15 gershwin kernel: CPU    1: Hot: hi:   90, btch:  15 usd:  31   Cold: hi:   30, btch:   7 usd:   4
Oct 31 15:36:15 gershwin kernel: CPU    2: Hot: hi:   90, btch:  15 usd:   4   Cold: hi:   30, btch:   7 usd:   3
Oct 31 15:36:15 gershwin kernel: CPU    3: Hot: hi:   90, btch:  15 usd:  82   Cold: hi:   30, btch:   7 usd:   2
Oct 31 15:36:15 gershwin kernel: CPU    4: Hot: hi:   90, btch:  15 usd:  84   Cold: hi:   30, btch:   7 usd:   0
Oct 31 15:36:15 gershwin kernel: CPU    5: Hot: hi:   90, btch:  15 usd:  65   Cold: hi:   30, btch:   7 usd:   4
Oct 31 15:36:15 gershwin kernel: CPU    6: Hot: hi:   90, btch:  15 usd:  85   Cold: hi:   30, btch:   7 usd:   6
Oct 31 15:36:15 gershwin kernel: CPU    7: Hot: hi:   90, btch:  15 usd:  69   Cold: hi:   30, btch:   7 usd:   4
Oct 31 15:36:15 gershwin kernel: CPU    8: Hot: hi:   90, btch:  15 usd:  11   Cold: hi:   30, btch:   7 usd:   5
Oct 31 15:36:15 gershwin kernel: CPU    9: Hot: hi:   90, btch:  15 usd:  75   Cold: hi:   30, btch:   7 usd:   1
Oct 31 15:36:15 gershwin kernel: CPU   10: Hot: hi:   90, btch:  15 usd:  84   Cold: hi:   30, btch:   7 usd:   2
Oct 31 15:36:15 gershwin kernel: CPU   11: Hot: hi:   90, btch:  15 usd:  13   Cold: hi:   30, btch:   7 usd:   1
Oct 31 15:36:15 gershwin kernel: CPU   12: Hot: hi:   90, btch:  15 usd:  17   Cold: hi:   30, btch:   7 usd:  23
Oct 31 15:36:15 gershwin kernel: CPU   13: Hot: hi:   90, btch:  15 usd:   7   Cold: hi:   30, btch:   7 usd:  25
Oct 31 15:36:15 gershwin kernel: CPU   14: Hot: hi:   90, btch:  15 usd:  64   Cold: hi:   30, btch:   7 usd:  27
Oct 31 15:36:15 gershwin kernel: CPU   15: Hot: hi:   90, btch:  15 usd:  12   Cold: hi:   30, btch:   7 usd:   6
Oct 31 15:36:15 gershwin kernel: CPU   16: