On Tuesday 12 April 2005 01:46, Andrew Morton wrote:
> Claudio Martins <[EMAIL PROTECTED]> wrote:
> > I think I'm going to give a try to Neil's patch, but I'll have to apply
> > some patches from -mm.
>
> Just this one if you're using 2.6.12-rc2:
>
> --- 25/drivers/md/md.c~avoid-deadlock-in-sync
Nick Piggin wrote on Tuesday, April 12, 2005 4:09 AM
> Chen, Kenneth W wrote:
> > I like the patch a lot and already did bench it on our db setup. However,
> > I'm seeing a negative regression compare to a very very crappy patch (see
> > attached, you can laugh at me for doing things like that :-)
Nick Piggin wrote:
It is a bit subtle: get_request may only drop the lock and return NULL
(after retaking the lock) if we fail on a memory allocation. If we
just fail due to unavailable queue slots, then the lock is never
dropped. And the mem allocation can't fail, because it is a mempool
alloc wit
Nick Piggin wrote:
> Chen, Kenneth W wrote:
> > I like the patch a lot and already did bench it on our db setup. However,
> > I'm seeing a regression compared to a very, very crappy patch (see
> > attached; you can laugh at me for doing things like that :-).
OK - if we go that way, perhaps the followi
On Tue, Apr 12 2005, Nick Piggin wrote:
> Actually the patches I have sent you do fix real bugs, but they also
> make the block layer less likely to recurse into page reclaim, so it
> may be eg. hiding the problem that Neil's patch fixes.
Jens Axboe wrote on Tuesday, April 12, 2005 12:08 AM
> Can
On Tue, Apr 12 2005, Nick Piggin wrote:
> Actually the patches I have sent you do fix real bugs, but they also
> make the block layer less likely to recurse into page reclaim, so it
> may be eg. hiding the problem that Neil's patch fixes.
Can you push those to Andrew? I'm quite happy with the way
Claudio Martins <[EMAIL PROTECTED]> wrote:
>
> I think I'm going to give a try to Neil's patch, but I'll have to apply
> some
> patches from -mm.
Just this one if you're using 2.6.12-rc2:
--- 25/drivers/md/md.c~avoid-deadlock-in-sync_page_io-by-using-gfp_noio  Mon Apr 11 16:55:07 2005
+++ 25
On Monday 11 April 2005 23:59, Nick Piggin wrote:
>
> > OK, I'll try them in a few minutes and report back.
>
> I'm not overly hopeful. If they fix the problem, then it's likely
> that the real bug is hidden.
>
Well, the thing is, they do fix the problem. Or at least they hide it very
well ;
On Monday April 11, [EMAIL PROTECTED] wrote:
>
> Neil, have you had a look at the traces? Do they mean much to you?
>
Just looked.
bio_alloc_bioset seems implicated, as does sync_page_io.
sync_page_io used to use a 'struct bio' on the stack, but Jens Axboe
changed it to use bio_alloc (don't kno
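Neil's description, together with the patch title Andrew quoted earlier ("avoid-deadlock-in-sync_page_io-by-using-gfp_noio"), suggests a change of roughly this shape. This is a sketch only, not the actual md.c hunk; the surrounding function body and the bio setup are elided.

```
 /* sketch: allocate the bio with GFP_NOIO instead of GFP_KERNEL, so
  * the allocation cannot recurse into page reclaim and re-enter the
  * block/md layer while it is already writing out pages */
-	bio = bio_alloc(GFP_KERNEL, 1);
+	bio = bio_alloc(GFP_NOIO, 1);
```

GFP_NOIO forbids the allocator from starting new I/O to satisfy the request, which is the property needed on any path that is itself part of I/O completion or writeout.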
Claudio Martins wrote:
Right. I'm using two Seagate ATA133 disks (IDE controller is AMD-8111), each
with 4 partitions, so I get 4 md RAID1 devices. The first one, md0, is for
swap. The rest are
~$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md1        4.6G  1.9G  2.6
On Monday 11 April 2005 13:45, Nick Piggin wrote:
>
> No luck yet (on SMP i386). How many disks are you using in each
> raid1 array? You are using one array for swap, and one mounted as
> ext3 for the working area of the `stress` program, right?
>
Right. I'm using two Seagate ATA133 disks (ide
Nick Piggin wrote:
The common theme seems to be: try_to_free_pages, swap_writepage,
mempool_alloc, down/down_failed in .text.lock.md. Next I would suspect
md/raid1 - maybe some deadlock in an uncommon memory allocation
failure path?
I'll see if I can reproduce it here.
No luck yet (on SMP i386). Ho
On Sunday 10 April 2005 03:47, Andrew Morton wrote:
>
> Suggest you boot with `nmi_watchdog=0' to prevent the nmi watchdog from
> cutting in during long sysrq traces.
>
> Also, capture the `sysrq-m' output so we can see if the thing is out of
> memory.
Hi Andrew,
Thanks for the tip. I booted
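For reference, Andrew's two suggestions are a boot-time parameter and a console keystroke. A lilo-style illustration follows (the entry name and kernel path are made up; grub users append the same parameter to their kernel line):

```
# /etc/lilo.conf (illustrative entry)
image=/boot/vmlinuz-2.6.12-rc2
    label=test
    append="nmi_watchdog=0"

# Once booted, the sysrq-m memory report can also be triggered
# from a root shell instead of the console keyboard:
#   echo m > /proc/sysrq-trigger
```

Disabling the NMI watchdog matters here because a long sysrq-t trace holds the console long enough for the watchdog to decide a CPU is wedged and fire mid-dump.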
On Sunday 10 April 2005 03:53, Nick Piggin wrote:
>
> Looks like you may possibly have a memory allocation deadlock
> (although I can't explain the NMI oops).
>
> I would be interested to see if the following patch is of any
> help to you.
>
Hi Nick,
I'll build a kernel with your patch and r
On Sunday 10 April 2005 03:47, Andrew Morton wrote:
>
> Suggest you boot with `nmi_watchdog=0' to prevent the nmi watchdog from
> cutting in during long sysrq traces.
>
> Also, capture the `sysrq-m' output so we can see if the thing is out of
> memory.
OK, will do it ASAP and report back.
Tha
Claudio Martins <[EMAIL PROTECTED]> wrote:
>
> I repeated the test to try to get more output from alt-sysreq-T, but it
> oopsed again with even less output.
> By the way, I have also tested 2.6.11.6 and I get stuck processes in the
> same way. With 2.6.9 I get a hard lockup with no workin
On Tuesday 05 April 2005 03:12, Andrew Morton wrote:
> Claudio Martins <[EMAIL PROTECTED]> wrote:
> >While stress testing 2.6.12-rc2 on an HP DL145 I get processes stuck
> > in D state after some time.
> >This machine is a dual Opteron 248 with 2GB (ECC) on one node (the
> > other node has
Hi,
While stress testing 2.6.12-rc2 on an HP DL145 I get processes stuck in D
state after some time.
This machine is a dual Opteron 248 with 2GB (ECC) on one node (the other
node has no RAM modules plugged in, since this board works only with pairs).
I was using stress (http://weathe
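For anyone trying to reproduce: a `stress` invocation mixing VM pressure and disk load looks something like the following. The worker counts and sizes here are illustrative guesses; the exact mix Claudio used is not shown in the truncated message.

```
# illustrative only -- tune workers/sizes to the machine under test
stress --vm 4 --vm-bytes 512M --io 2 --hdd 2 --timeout 600
```

Driving swap hard (`--vm` workers touching more memory than is free) while generating dirty file pages (`--hdd`) is what pushes the box into the swap_writepage-under-reclaim path implicated in the traces above.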