Interpreting call trace

2011-04-06 Thread Daniel Baluta
Hello,

I have the following stack trace:

[3992.172/0] BUG: soft lockup - CPU#0 stuck for 61s! [events/0:39]
[ 3992.215/0] Pid: 39, comm: events/0 Tainted: PW  2.6.32 #1 X8DTT-H
[ 3992.222/0] RIP: 0010:[8115f989]  [8115f989]
__write_lock_failed+0x9/0x20
[ 3992.231/0] RSP: 0018:880028203908  EFLAGS: 0287
[ 3992.236/0] RAX: 88063ce0bfd8 RBX: 880028203920 RCX: 814e1b00
[ 3992.243/0] RDX: 8800282144e0 RSI: 880028203abc RDI: 814e1b04
[ 3992.250/0] RBP: 8100c6f3 R08: 0004 R09: 880028203ab0
[ 3992.257/0] R10: 88061851fa08 R11:  R12: 880028203880
[ 3992.264/0] R13: 88060404f800 R14: 814e1b04 R15: 8101fd88
[ 3992.272/0] FS:  () GS:88002820()
knlGS:
[ 3992.280/0] CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
[ 3992.286/0] CR2: 7f9cc9fb9000 CR3: 01001000 CR4: 06f0
[ 3992.293/0] DR0:  DR1:  DR2: 
[ 3992.300/0] DR3:  DR6: 0ff0 DR7: 0400
[ 3992.307/0] Call Trace:
[ 3992.309/0]  IRQ  [81374e12] ? _write_lock_bh+0x22/0x30
[ 3992.316/0]  [81297751] ? neigh_create+0x2f1/0x6b0
[ 3992.321/0]  [81374f00] ? _read_unlock_bh+0x10/0x20
[ 3992.327/0]  [81296401] ? neigh_lookup+0xd1/0xf0
[ 3992.333/0]  [81297b96] ? neigh_event_ns+0x86/0xc0
[ 3992.338/0]  [812f2cdb] ? arp_process+0x9fb/0xbf0
[ 3992.344/0]  [812f2ffc] ? arp_rcv+0x11c/0x130

How could one interpret this?

For example having:
[ 3992.316/0]  [81297751] ? neigh_create+0x2f1/0x6b0
[ 3992.321/0]  [81374f00] ? _read_unlock_bh+0x10/0x20

means that _read_unlock_bh has been called  finished ? Or it has been
interrupted by
next function in trace (neigh_create).

Any other info about interpreting call traces are welcomed.


thanks,
Daniel.

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Interpreting call trace

2011-04-06 Thread Michael Blizek
Hi!

On 15:56 Wed 06 Apr , Daniel Baluta wrote:
...
 For example having:
 [ 3992.316/0]  [81297751] ? neigh_create+0x2f1/0x6b0
 [ 3992.321/0]  [81374f00] ? _read_unlock_bh+0x10/0x20
 
 means that _read_unlock_bh has been called  finished ? Or it has been
 interrupted by
 next function in trace (neigh_create).
 
 Any other info about interpreting call traces are welcomed.

Usually this means that _read_unlock_bh has called neigh_create. However, this
looks very unlikely to me. So maybe there was a buffer overflow on the stack
or any other form of memory corruption.

-Michi
-- 
programing a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com


___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


Re: Interpreting call trace

2011-04-06 Thread Mulyadi Santosa
Hi Daniel...



On Wed, Apr 6, 2011 at 19:56, Daniel Baluta daniel.bal...@gmail.com wrote:
 Hello,

 I have the following stack trace:

 [3992.172/0] BUG: soft lockup - CPU#0 stuck for 61s! [events/0:39]

alright, so we see soft lockup here so that means your machine is
not entirely frozen, right?

If that's true, then I think this entire call trace try to show that
there is lock contention somewhere along the trace...very likely
something grab a lock too long in fast path...


-- 
regards,

Mulyadi Santosa
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com
training: mulyaditraining.blogspot.com

___
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies