Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler
>>> Thomas, any chance you could try the patch below? >> I'm still testing but I couldn't break it until now. > Great, thanks a lot Thomas! The box is still running without a problem, it seems the bug is fixed. Thanks a lot, Thomas -- keep mailinglists in english, feel free to send PM in german

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Oleg Nesterov
On 07/01, Thomas Sattler wrote: > > > Thomas, any chance you could try the patch below? It is very, very stupid, > > it was done without any understanding of this code, and of course it is > > completely untested. I doubt very much it is correct, and even if it is > > correct it is definitely not

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler
> Thomas, any chance you could try the patch below? It is very, very stupid, > it was done without any understanding of this code, and of course it is > completely untested. I doubt very much it is correct, and even if it is > correct it is definitely not good. It would be great if Dmitry can take

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler
Thomas, any chance you could try the patch below? It is very, very stupid, it was done without any understanding of this code, and of course it is completely untested. I doubt very much it is correct, and even if it is correct it is definitely not good. It would be great if Dmitry can take a

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Oleg Nesterov
On 07/01, Thomas Sattler wrote: Thomas, any chance you could try the patch below? It is very, very stupid, it was done without any understanding of this code, and of course it is completely untested. I doubt very much it is correct, and even if it is correct it is definitely not good. It

Re: 2.6.22-rc6 spurious hangs

2007-07-01 Thread Thomas Sattler
Thomas, any chance you could try the patch below? I'm still testing but I couldn't break it until now. Great, thanks a lot Thomas! The box is still running without a problem, it seems the bug is fixed. Thanks a lot, Thomas -- keep mailinglists in english, feel free to send PM in german

Re: 2.6.22-rc6 spurious hangs

2007-06-30 Thread Oleg Nesterov
On 06/29, Markus Rechberger wrote: > > On 6/29/07, Mauro Carvalho Chehab <[EMAIL PROTECTED]> wrote: > >> Still we can't do this under cinergyt2->sem, because cinergyt2_query() > >> takes it too. This all looks very wrong to me, I hope maintainers can > >> explain. > > > >AFAIK, the driver authors

Re: 2.6.22-rc6 spurious hangs

2007-06-30 Thread Oleg Nesterov
On 06/29, Markus Rechberger wrote: On 6/29/07, Mauro Carvalho Chehab [EMAIL PROTECTED] wrote: Still we can't do this under cinergyt2->sem, because cinergyt2_query() takes it too. This all looks very wrong to me, I hope maintainers can explain. AFAIK, the driver authors are not working

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Markus Rechberger
On 6/29/07, Mauro Carvalho Chehab <[EMAIL PROTECTED]> wrote: > Still we can't do this under cinergyt2->sem, because cinergyt2_query() > takes it too. This all looks very wrong to me, I hope maintainers can > explain. AFAIK, the driver authors are not working anymore with CinergyT2. The last

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Mauro Carvalho Chehab
> Still we can't do this under cinergyt2->sem, because cinergyt2_query() > takes it too. This all looks very wrong to me, I hope maintainers can > explain. AFAIK, the driver authors are not working anymore with CinergyT2. The last patch we have on development tree from Holger is dated as Dec, 3

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Dmitry Torokhov wrote: > > Well, not really maintainer but I think the short term solution (at > least for the RC part) is to alter cinergyt2_query_rc to take > cinergyt2->sem only around cinergyt2_command(). The rest of the > polling function need not be protected as it does not tun
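The narrowing Dmitry suggests (hold the lock only around the device command, not around the rest of the polling function) can be sketched in userspace. This is a hedged Python model, not driver code: `sem` stands in for cinergyt2->sem, `command` for cinergyt2_command(), and `handle_events` is a hypothetical placeholder for whatever the polling function does with the reply. Whether this alone is enough for the driver is exactly what the thread goes on to discuss.

```python
import threading

# Stands in for cinergyt2->sem (an assumption of this model).
sem = threading.Lock()

def query_rc(command, handle_events):
    """Model of a polling work item that locks only around the device command."""
    # Critical section: only the command that talks to the device.
    with sem:
        reply = command()
    # The lock is already released here, so anything waiting for this
    # work item while holding sem is blocked only for the duration of
    # the command itself, not for the event handling that follows.
    handle_events(reply)
```

In this model the lock is held while `command` runs and released before `handle_events` is called.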

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Dmitry Torokhov
On 6/29/07, Ingo Molnar <[EMAIL PROTECTED]> wrote: * Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > ->disconnect_pending is used without any locks/barriers, perhaps > > > this is the reason. > > I misread cinergyt2_release, it checks !->disconnect_pending, so it is > very clear why

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Ingo Molnar
* Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > ->disconnect_pending is used without any locks/barriers, perhaps > > > this is the reason. > > I misread cinergyt2_release, it checks !->disconnect_pending, so it is > very clear why cinergyt2_query_rc() tries to take the mutex. > > > > I'll

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Ingo Molnar wrote: > * Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > Yes, I think cinergyt2 is buggy. > > > cinergyt2_release() does flush_scheduled_work() under cinergyt2->sem. > > flush_scheduled_work() hangs because cinergyt2_query_rc() waits for > > the same cinergyt2->sem. > >

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Ingo Molnar
* Oleg Nesterov <[EMAIL PROTECTED]> wrote: > Yes, I think cinergyt2 is buggy. > cinergyt2_release() does flush_scheduled_work() under cinergyt2->sem. > flush_scheduled_work() hangs because cinergyt2_query_rc() waits for > the same cinergyt2->sem. > > ->disconnect_pending is used without any
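The lock ordering Oleg describes can be modelled in userspace. The following is a minimal Python sketch under stated assumptions, not the driver code: `sem` stands in for cinergyt2->sem, `query_rc` for the queued cinergyt2_query_rc() work item, and waiting on `done` for flush_scheduled_work(). The names `run_release`, `flush_under_lock` and `flush_timeout` are hypothetical; the timeout exists only so that the model reports the hang instead of hanging.

```python
import threading

def run_release(flush_under_lock, flush_timeout=0.5):
    """Return True if the modelled flush completes, False if it would hang."""
    sem = threading.Lock()      # stands in for cinergyt2->sem
    done = threading.Event()    # set once the queued work item has finished

    def query_rc():             # stands in for the queued work item
        with sem:               # the work item needs the same lock
            pass
        done.set()

    worker = threading.Thread(target=query_rc)
    sem.acquire()               # the release path takes the lock first
    worker.start()              # the work item is now pending and blocks on sem

    if flush_under_lock:
        # Reported pattern: wait for the work item while still holding sem.
        flushed = done.wait(flush_timeout)
        sem.release()
    else:
        # Alternative ordering: drop sem first, then wait for the work item.
        sem.release()
        flushed = done.wait(flush_timeout)

    worker.join()               # the lock is free by now, so the worker finishes
    return flushed
```

Calling `run_release(True)` models the reported hang (the wait times out because the work item cannot take the lock), while `run_release(False)` lets the work item finish.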

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Thomas Sattler wrote: > > >> Jun 28 19:23:03 pearl cinergyt2_query_rc+0x0/0x2e9 [cinergyT2] > > > > cinergyt2_query_rc() hangs. I'll try to look tomorrow, but I know nothing > > about drivers/media/dvb/. > > Does this mean the problem is in the cinergyt2 driver? I'm having similar >

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Tomi Orava
Hello, > I'm observing seldom hangs with linux 2.6. I can't tell when exactly it > happened the first time, I think somewhere around 2.6.16 or 2.6.17. I > see it about once or twice a month. With absolutely nothing in the logs. > So far I asked for help: > The box I talk about is an IBM T41p

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Thomas Sattler
>> Jun 28 19:23:03 pearl cinergyt2_query_rc+0x0/0x2e9 [cinergyT2] > > cinergyt2_query_rc() hangs. I'll try to look tomorrow, but I know nothing > about drivers/media/dvb/. Does this mean the problem is in the cinergyt2 driver? I'm having similar problems with another box but with different

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Thomas Sattler
Jun 28 19:23:03 pearl cinergyt2_query_rc+0x0/0x2e9 [cinergyT2] cinergyt2_query_rc() hangs. I'll try to look tomorrow, but I know nothing about drivers/media/dvb/. Does this mean the problem is in the cinergyt2 driver? I'm having similar problems with another box but with different hardware.

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Tomi Orava
Hello, I'm observing seldom hangs with linux 2.6. I can't tell when exactly it happened the first time, I think somewhere around 2.6.16 or 2.6.17. I see it about once or twice a month. With absolutely nothing in the logs. So far I asked for help: snip The box I talk about is an IBM T41p

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Thomas Sattler wrote: Jun 28 19:23:03 pearl cinergyt2_query_rc+0x0/0x2e9 [cinergyT2] cinergyt2_query_rc() hangs. I'll try to look tomorrow, but I know nothing about drivers/media/dvb/. Does this mean the problem is in the cinergyt2 driver? I'm having similar problems with

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Ingo Molnar
* Oleg Nesterov [EMAIL PROTECTED] wrote: Yes, I think cinergyt2 is buggy. cinergyt2_release() does flush_scheduled_work() under cinergyt2->sem. flush_scheduled_work() hangs because cinergyt2_query_rc() waits for the same cinergyt2->sem. ->disconnect_pending is used without any

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Ingo Molnar wrote: * Oleg Nesterov [EMAIL PROTECTED] wrote: Yes, I think cinergyt2 is buggy. cinergyt2_release() does flush_scheduled_work() under cinergyt2->sem. flush_scheduled_work() hangs because cinergyt2_query_rc() waits for the same cinergyt2->sem.

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Ingo Molnar
* Oleg Nesterov [EMAIL PROTECTED] wrote: ->disconnect_pending is used without any locks/barriers, perhaps this is the reason. I misread cinergyt2_release, it checks !->disconnect_pending, so it is very clear why cinergyt2_query_rc() tries to take the mutex. I'll try to look

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Oleg Nesterov
On 06/29, Dmitry Torokhov wrote: Well, not really maintainer but I think the short term solution (at least for the RC part) is to alter cinergyt2_query_rc to take cinergyt2->sem only around cinergyt2_command(). The rest of the polling function need not be protected as it does not tun

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Dmitry Torokhov
On 6/29/07, Ingo Molnar [EMAIL PROTECTED] wrote: * Oleg Nesterov [EMAIL PROTECTED] wrote: ->disconnect_pending is used without any locks/barriers, perhaps this is the reason. I misread cinergyt2_release, it checks !->disconnect_pending, so it is very clear why cinergyt2_query_rc() tries

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Mauro Carvalho Chehab
Still we can't do this under cinergyt2->sem, because cinergyt2_query() takes it too. This all looks very wrong to me, I hope maintainers can explain. AFAIK, the driver authors are not working anymore with CinergyT2. The last patch we have on development tree from Holger is dated as Dec, 3 2004.

Re: 2.6.22-rc6 spurious hangs

2007-06-29 Thread Markus Rechberger
On 6/29/07, Mauro Carvalho Chehab [EMAIL PROTECTED] wrote: Still we can't do this under cinergyt2->sem, because cinergyt2_query() takes it too. This all looks very wrong to me, I hope maintainers can explain. AFAIK, the driver authors are not working anymore with CinergyT2. The last patch we

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
> Could you also show the result of sysrq-T ? I was so happy that I could trigger it that fast ... ... that I forgot to press Alt-Sysrq-t before reboot. :-( But, I could trigger it again. :-) This time I can offer: - Debug output from Oleg's patch (11x, every 30s) - Alt-Sysrq-t (3x, about 30s

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Oleg Nesterov
On 06/28, Thomas Sattler wrote: > > Here is the logfile. > Jun 28 18:26:03 pearl ERR!! events/0 flush hang: f573759c f573759c 5782 5782 > 0 1 > Jun 28 18:26:03 pearl CURR: 5557 5557 dvbd 0 129024 > Jun 28 18:26:03 pearl wq_barrier_func+0x0/0xd > Jun 28 18:26:03 pearl cache_reap+0x0/0xe3 > Jun 28

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
Here is the logfile. Thomas -- keep mailinglists in english, feel free to send PM in german messages.gz Description: application/gzip

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The corresponding part of my syslogs is attached, as well as my kernel config. >>> Could you try the patch below? It dumps some info when flush_workqueue() >>> hangs. >> I'm compiling a patched kernel right now. As I wrote

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Ingo Molnar
* Thomas Sattler <[EMAIL PROTECTED]> wrote: > >> As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The > >> corresponding part of my syslogs is attached, as well as my kernel config. > > > > Could you try the patch below? It dumps some info when flush_workqueue() > > hangs. > >

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
>> As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The >> corresponding part of my syslogs is attached, as well as my kernel config. > > Could you try the patch below? It dumps some info when flush_workqueue() > hangs. I'm compiling a patched kernel right now. As I wrote in my

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Oleg Nesterov
On 06/28, Thomas Sattler wrote: > > As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The > corresponding part of my syslogs is attached, as well as my kernel config. xs_connect() and release_dev() are blocked on flush_workqueue(). Perhaps this is OK, but may be not. Could you try

2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
Hi there ... I'm observing seldom hangs with linux 2.6. I can't tell when exactly it happened the first time, I think somewhere around 2.6.16 or 2.6.17. I see it about once or twice a month. With absolutely nothing in the logs. So far I asked for help: - in the -ck list Mon Sep 4 10:22:06 EST

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Oleg Nesterov
On 06/28, Thomas Sattler wrote: As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The corresponding part of my syslogs is attached, as well as my kernel config. xs_connect() and release_dev() are blocked on flush_workqueue(). Perhaps this is OK, but may be not. Could you try the

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The corresponding part of my syslogs is attached, as well as my kernel config. Could you try the patch below? It dumps some info when flush_workqueue() hangs. I'm compiling a patched kernel right now. As I wrote in my former

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Ingo Molnar
* Thomas Sattler [EMAIL PROTECTED] wrote: As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The corresponding part of my syslogs is attached, as well as my kernel config. Could you try the patch below? It dumps some info when flush_workqueue() hangs. I'm compiling a

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
As Ingo told me I run 'echo t > /proc/sysrq-trigger' this time. The corresponding part of my syslogs is attached, as well as my kernel config. Could you try the patch below? It dumps some info when flush_workqueue() hangs. I'm compiling a patched kernel right now. As I wrote in my former mail

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Oleg Nesterov
On 06/28, Thomas Sattler wrote: Here is the logfile. Jun 28 18:26:03 pearl ERR!! events/0 flush hang: f573759c f573759c 5782 5782 0 1 Jun 28 18:26:03 pearl CURR: 5557 5557 dvbd 0 129024 Jun 28 18:26:03 pearl wq_barrier_func+0x0/0xd Jun 28 18:26:03 pearl cache_reap+0x0/0xe3 Jun 28

Re: 2.6.22-rc6 spurious hangs

2007-06-28 Thread Thomas Sattler
Could you also show the result of sysrq-T ? I was so happy that I could trigger it that fast ... ... that I forgot to press Alt-Sysrq-t before reboot. :-( But, I could trigger it again. :-) This time I can offer: - Debug output from Oleg's patch (11x, every 30s) - Alt-Sysrq-t (3x, about 30s