Re: raid5 hang on get_active_stripe

2006-11-15 Thread dean gaudet
and i haven't seen it either... neil do you think your latest patch was 
hiding the bug?  'cause there was an iteration of an earlier patch which 
didn't produce much spam in dmesg but the bug was still there, then there 
is the version below which spams dmesg a fair amount but i didn't see the 
bug in ~30 days.

btw i've upgraded that box to 2.6.18.2 without the patch (it had some 
conflicts)... haven't seen the bug yet though (~10 days so far).

hmm i wonder if i could reproduce it more rapidly if i lowered 
/sys/block/mdX/md/stripe_cache_size.  i'll give that a go.

-dean


On Tue, 14 Nov 2006, Chris Allen wrote:

 You probably guessed that no matter what I did, I never, ever saw the problem
 when your
 trace was installed. I'd guess at some obscure timing-related problem. I can
 still trigger it
 consistently with a vanilla 2.6.17_SMP though, but again only when bitmaps are
 turned on.
 
 
 
 Neil Brown wrote:
  On Tuesday October 10, [EMAIL PROTECTED] wrote:

   Very happy to. Let me know what you'd like me to do.
   
  
  Cool thanks.
  
  At the end is a patch against 2.6.17.11, though it should apply against
  any later 2.6.17 kernel.
  Apply this and reboot.
  
  Then run
  
 while true
 do cat /sys/block/mdX/md/stripe_cache_active
sleep 10
 done > /dev/null
  
  (maybe write a little script or whatever).  Leave this running. It
  triggers the check for "has raid5 hung".  Make sure to change mdX to
  whatever is appropriate.
  
  Occasionally look in the kernel logs for
 "plug problem:"
  
  if you find that, send me the surrounding text - there should be about
  a dozen lines following this one.
  
  Hopefully this will let me know which is the last thing to happen: a plug
  or an unplug.
  If the last is a plug, then the timer really should still be
  pending, but isn't (this is impossible).  So I'll look more closely at
  that option.
  If the last is an unplug, then the 'Plugged' flag should really be
  clear but it isn't (this is impossible).  So I'll look more closely at
  that option.
  
  Dean is running this, but he only gets the hang every couple of
  weeks.  If you get it more often, that would help me a lot.
  
  Thanks,
  NeilBrown
  
  
  diff ./.patches/orig/block/ll_rw_blk.c ./block/ll_rw_blk.c
  --- ./.patches/orig/block/ll_rw_blk.c   2006-08-21 09:52:46.0 +1000
  +++ ./block/ll_rw_blk.c 2006-10-05 11:33:32.0 +1000
  @@ -1546,6 +1546,7 @@ static int ll_merge_requests_fn(request_
    * This is called with interrupts off and no requests on the queue and
    * with the queue lock held.
    */
  +static atomic_t seq = ATOMIC_INIT(0);
   void blk_plug_device(request_queue_t *q)
   {
   	WARN_ON(!irqs_disabled());
  @@ -1558,9 +1559,16 @@ void blk_plug_device(request_queue_t *q)
   		return;
   	if (!test_and_set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
  +		q->last_plug = jiffies;
  +		q->plug_seq = atomic_read(&seq);
  +		atomic_inc(&seq);
   		mod_timer(&q->unplug_timer, jiffies + q->unplug_delay);
   		blk_add_trace_generic(q, NULL, 0, BLK_TA_PLUG);
  -	}
  +	} else
  +		q->last_plug_skip = jiffies;
  +	if (!timer_pending(&q->unplug_timer) &&
  +	    !q->unplug_work.pending)
  +		printk("Neither Timer or work are pending\n");
   }
   EXPORT_SYMBOL(blk_plug_device);
  @@ -1573,10 +1581,17 @@ int blk_remove_plug(request_queue_t *q)
   {
   	WARN_ON(!irqs_disabled());
  -	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
  +	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
  +		q->last_unplug_skip = jiffies;
   		return 0;
  +	}
   	del_timer(&q->unplug_timer);
  +	q->last_unplug = jiffies;
  +	q->unplug_seq = atomic_read(&seq);
  +	atomic_inc(&seq);
  +	if (test_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
  +		printk("queue still (or again) plugged\n");
   	return 1;
   }
  @@ -1635,7 +1650,7 @@ static void blk_backing_dev_unplug(struc
   static void blk_unplug_work(void *data)
   {
   	request_queue_t *q = data;
  -
  +	q->last_unplug_work = jiffies;
   	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_IO, NULL,
   			q->rq.count[READ] + q->rq.count[WRITE]);
  @@ -1649,6 +1664,7 @@ static void blk_unplug_timeout(unsigned
   	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_TIMER, NULL,
   			q->rq.count[READ] + q->rq.count[WRITE]);
  +	q->last_unplug_timeout = jiffies;
   	kblockd_schedule_work(&q->unplug_work);
   }

  diff ./.patches/orig/drivers/md/raid1.c ./drivers/md/raid1.c
  --- ./.patches/orig/drivers/md/raid1.c  2006-08-10 17:28:01.0 +1000
  +++ ./drivers/md/raid1.c    2006-09-04 21:58:31.0 +1000
  @@ -1486,7 +1486,6 @@ static void raid1d(mddev_t *mddev)
   					d = conf->raid_disks;
   				d--;
   				rdev = 

Re: raid5 hang on get_active_stripe

2006-10-10 Thread Bas van Schaik
Hi all,

Neil Brown wrote:
 On Tuesday October 10, [EMAIL PROTECTED] wrote:
   
 Very happy to. Let me know what you'd like me to do.
 

 Cool thanks.
 (snip)
   
I don't know if it's useful information, but I'm encountering the same
problem here, in a totally different situation. I'm using Peter Breuer's
ENBD (you probably know him, since he started a discussion a while ago
about request retries with exponential timeouts and a communication
channel to RAID) to import a total of 12 devices from other machines,
composing those disks into 3 RAID5 arrays. Those 3 arrays are combined
into one VG with one LV, running CryptoLoop on top. Last, but not least,
a ReiserFS is created on the loopback device. I'm using the Debian Etch
stock 2.6.17 kernel, by the way.

When doing a lot of I/O on the ReiserFS (like a reiserfsck
--rebuild-tree), the machine suddenly gets stuck, I think after filling
its memory with buffers. I've been doing a lot of debugging with Peter;
attached you'll find a ps -axl with a widened WCHAN column, which shows
that some of the enbd-client processes get stuck in the RAID code. We've
not been able to find out how ENBD gets into the RAID code, but I don't
think that's really relevant right now. Here's the relevant part of ps:

ps ax -o f,uid,pid,ppid,pri,ni,vsz,rss,wchan:30,stat,tty,time,command
(only the relevant rows)

 F   UID   PID  PPID PRI  NI   VSZ  RSS WCHAN              STAT TT        TIME COMMAND
 (snip)
 5     0 26523     1  23   0  2140 1052 -                  Ss   ?     00:00:00 enbd-client iss01 1300 -i iss01-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndi
 5     0 26540     1  23   0  2140 1048 get_active_stripe  Ds   ?     00:00:00 enbd-client iss04 1300 -i iss04-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndl
 5     0 26552     1  23   0  2140 1044 -                  Ss   ?     00:00:00 enbd-client iss02 1200 -i iss02-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndf
 5     0 26556     1  23   0  2140 1048 -                  Ss   ?     00:00:00 enbd-client iss01 1100 -i iss01-hda5 -n 2 -e -m -b 4096 -p 30 /dev/nda
 5     0 26561     1  23   0  2140 1052 get_active_stripe  Ds   ?     00:00:00 enbd-client iss02 1100 -i iss02-hda5 -n 2 -e -m -b 4096 -p 30 /dev/ndb
 5     0 26564     1  23   0  2144 1052 -                  Ss   ?     00:00:00 enbd-client iss03 1200 -i iss03-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndg
 5     0 26568     1  23   0  2144 1052 -                  Ss   ?     00:00:00 enbd-client iss04 1200 -i iss04-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndh
 5     0 26581     1  23   0  2144 1052 -                  Ss   ?     00:00:00 enbd-client iss03 1100 -i iss03-hda5 -n 2 -e -m -b 4096 -p 30 /dev/ndc
 5     0 26590     1  23   0  2140 1048 -                  Ss   ?     00:00:00 enbd-client iss01 1200 -i iss01-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/nde
 5     0 26606     1  23   0  2144 1052 -                  Ss   ?     00:00:00 enbd-client iss02 1300 -i iss02-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndj
 5     0 26614     1  23   0  2144 1052 -                  Ss   ?     00:00:00 enbd-client iss03 1300 -i iss03-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndk
 5     0 26616     1  23   0  2144 1056 -                  Ss   ?     00:00:00 enbd-client iss04 1100 -i iss04-hda5 -n 2 -e -m -b 4096 -p 30 /dev/ndd
 5     0 26617 26523  24   0  2140  948 enbd_get_req       S    ?     00:00:00 enbd-client iss01 1300 -i iss01-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndi
 5     0 26618 26523  24   0  2140  948 enbd_get_req       S    ?     00:00:00 enbd-client iss01 1300 -i iss01-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndi
 5     0 26619 26540  24   0  2140  948 enbd_get_req       S    ?     00:00:01 enbd-client iss04 1300 -i iss04-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndl
 5     0 26620 26540  24   0  2140  948 enbd_get_req       S    ?     00:00:01 enbd-client iss04 1300 -i iss04-hdd -n 2 -e -m -b 4096 -p 30 /dev/ndl
 5     0 26621 26552  24   0  2140  948 get_active_stripe  D    ?     00:32:11 enbd-client iss02 1200 -i iss02-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndf
 5     0 26622 26552  24   0  2140  948 get_active_stripe  D    ?     00:32:18 enbd-client iss02 1200 -i iss02-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndf
 5     0 26623 26564  23   0  2144  956 enbd_get_req       S    ?     00:32:27 enbd-client iss03 1200 -i iss03-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndg
 5     0 26624 26564  24   0  2144  956 enbd_get_req       S    ?     00:32:37 enbd-client iss03 1200 -i iss03-hdc5 -n 2 -e -m -b 4096 -p 30 /dev/ndg
 5     0 26625 26568  24   0  2144  956 enbd_get_req       S    ?     00:35:35 enbd-client iss04 1200 -i iss04-hdc5 -n 2 -e -m -b 4096 -p 30 

Re: raid5 hang on get_active_stripe

2006-10-09 Thread Chris Allen
Ok, after more testing, this lockup happens consistently when
bitmaps are switched on and never when they are switched off.

Ideas anybody?




On Sun, Oct 08, 2006 at 12:25:46AM +0100, Chris Allen wrote:
 
 
 Neil Brown wrote:
 On Tuesday June 13, [EMAIL PROTECTED] wrote:
   
 Will that fix be in 2.6.17?
 
 
 
  Probably not.  We have had the last 'rc' twice and so I don't think
  it is appropriate to submit the patch at this stage.
  I probably will submit it for an early 2.6.17.x and for 2.6.16.y.
 
 
   
 
 What is the status of this?
 
 I've been experiencing exactly the same get_active_stripe lockup
 on a FC5 2.6.17-1.2187_FC5smp  stock kernel. Curiously we have ten 
 similar heavily loaded
 servers but only one of them experiences the problem. The problem 
 happens consistently after 24 hours or so
 when I hammer the raid5 array over NFS, but I've never managed to 
 trigger it with local access. I'd also
 say (anecdotally) that it only started happening since I added a bitmap 
 to my array.
 
 
 As with the other poster, the lockup is released by increasing 
 stripe_cache_size.
 
 
 
 
 


Re: raid5 hang on get_active_stripe

2006-10-09 Thread Neil Brown
On Monday October 9, [EMAIL PROTECTED] wrote:
 Ok, after more testing, this lockup happens consistently when
 bitmaps are switched on and never when they are switched off.
 
 Ideas anybody?
 

No. I'm completely stumped.
Which means it is probably something very obvious, but I keep looking
in the wrong place :-)

The interaction with bitmaps is interesting and might prove helpful.
I'll have another look at the code and see if it opens up some
possibilities. 

I've been working with Dean Gaudet who can reproduce it.  He has been
trying out patches that print out lots of debugging information when
the problem occurs.  This has narrowed it down a bit, but I'm still in
the dark.

It seems that blk_plug_device is setting a timer to go off in about
3msecs, but it gets deactivated without firing. This should only
happen when the device gets unplugged, but the device remains plugged.
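
To make the invariant concrete, here is a sketch distilled from the stock
2.6.17 code (simplified from the ll_rw_blk.c functions quoted elsewhere in
this thread; blk_queue_stopped() handling and blktrace calls omitted):
while QUEUE_FLAG_PLUGGED is set, the unplug timer must be pending, so a
plugged queue always gets unplugged.

	/* Simplified sketch of 2.6.17 block/ll_rw_blk.c; see the patch
	 * later in this thread for the real functions. */
	void blk_plug_device(request_queue_t *q)
	{
		/* the first plugger arms the ~3ms unplug timer */
		if (!test_and_set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
			mod_timer(&q->unplug_timer, jiffies + q->unplug_delay);
	}

	int blk_remove_plug(request_queue_t *q)
	{
		if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
			return 0;		/* was not plugged */

		del_timer(&q->unplug_timer);	/* flag clear, timer disarmed */
		return 1;
	}

The hang is the state this should rule out: the flag set, but no timer
pending, so nothing is left to unplug the queue.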

Are you happy to try a kernel.org kernel with a few patches and a
little shell script running?
The net result is that it detects when there is a problem, prints out
some trace information, and then gets the array going again.

Thanks,
NeilBrown


Re: raid5 hang on get_active_stripe

2006-10-09 Thread Chris Allen


Neil Brown wrote:

On Monday October 9, [EMAIL PROTECTED] wrote:
  

Ok, after more testing, this lockup happens consistently when
bitmaps are switched on and never when they are switched off.



Are you happy to try a kernel.org kernel with a few patches and a
little shell script running?
The net result is that it detects when there is a problem, prints out
some trace information, and then gets the array going again.


  


Very happy to. Let me know what you'd like me to do.


Re: raid5 hang on get_active_stripe

2006-10-09 Thread Neil Brown
On Tuesday October 10, [EMAIL PROTECTED] wrote:
 
 Very happy to. Let me know what you'd like me to do.

Cool thanks.

At the end is a patch against 2.6.17.11, though it should apply against
any later 2.6.17 kernel.
Apply this and reboot.

Then run

   while true
   do cat /sys/block/mdX/md/stripe_cache_active
  sleep 10
   done > /dev/null

(maybe write a little script or whatever).  Leave this running. It
triggers the check for "has raid5 hung".  Make sure to change mdX to
whatever is appropriate.

Occasionally look in the kernel logs for
   "plug problem:"

if you find that, send me the surrounding text - there should be about
a dozen lines following this one.

Hopefully this will let me know which is the last thing to happen: a plug
or an unplug.
If the last is a plug, then the timer really should still be
pending, but isn't (this is impossible).  So I'll look more closely at
that option.
If the last is an unplug, then the 'Plugged' flag should really be
clear but it isn't (this is impossible).  So I'll look more closely at
that option.
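
The raid5 hunks that detect the hang and print "plug problem:" are not
quoted below, so the following is only an illustrative guess at the
detector side, using the bookkeeping fields the ll_rw_blk.c hunks add
(the function name and output format are invented here, not Neil's code):

	static void report_plug_problem(request_queue_t *q)
	{
		printk("plug problem:\n");
		printk(" plugged=%d timer_pending=%d\n",
		       test_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags) ? 1 : 0,
		       timer_pending(&q->unplug_timer));
		printk(" last_plug=%lu plug_seq=%d last_plug_skip=%lu\n",
		       q->last_plug, q->plug_seq, q->last_plug_skip);
		printk(" last_unplug=%lu unplug_seq=%d last_unplug_skip=%lu\n",
		       q->last_unplug, q->unplug_seq, q->last_unplug_skip);
		printk(" last_unplug_timeout=%lu last_unplug_work=%lu now=%lu\n",
		       q->last_unplug_timeout, q->last_unplug_work, jiffies);
	}

Comparing plug_seq with unplug_seq is what tells which of the two
happened last.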

Dean is running this, but he only gets the hang every couple of
weeks.  If you get it more often, that would help me a lot.

Thanks,
NeilBrown


diff ./.patches/orig/block/ll_rw_blk.c ./block/ll_rw_blk.c
--- ./.patches/orig/block/ll_rw_blk.c   2006-08-21 09:52:46.0 +1000
+++ ./block/ll_rw_blk.c 2006-10-05 11:33:32.0 +1000
@@ -1546,6 +1546,7 @@ static int ll_merge_requests_fn(request_
  * This is called with interrupts off and no requests on the queue and
  * with the queue lock held.
  */
+static atomic_t seq = ATOMIC_INIT(0);
 void blk_plug_device(request_queue_t *q)
 {
WARN_ON(!irqs_disabled());
@@ -1558,9 +1559,16 @@ void blk_plug_device(request_queue_t *q)
return;
 
 	if (!test_and_set_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
+		q->last_plug = jiffies;
+		q->plug_seq = atomic_read(&seq);
+		atomic_inc(&seq);
 		mod_timer(&q->unplug_timer, jiffies + q->unplug_delay);
 		blk_add_trace_generic(q, NULL, 0, BLK_TA_PLUG);
-	}
+	} else
+		q->last_plug_skip = jiffies;
+	if (!timer_pending(&q->unplug_timer) &&
+	    !q->unplug_work.pending)
+		printk("Neither Timer or work are pending\n");
 }
 
 EXPORT_SYMBOL(blk_plug_device);
@@ -1573,10 +1581,17 @@ int blk_remove_plug(request_queue_t *q)
 {
WARN_ON(!irqs_disabled());
 
-	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
+	if (!test_and_clear_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags)) {
+		q->last_unplug_skip = jiffies;
 		return 0;
+	}

 	del_timer(&q->unplug_timer);
+	q->last_unplug = jiffies;
+	q->unplug_seq = atomic_read(&seq);
+	atomic_inc(&seq);
+	if (test_bit(QUEUE_FLAG_PLUGGED, &q->queue_flags))
+		printk("queue still (or again) plugged\n");
return 1;
 }
 
@@ -1635,7 +1650,7 @@ static void blk_backing_dev_unplug(struc
 static void blk_unplug_work(void *data)
 {
 	request_queue_t *q = data;
-
+	q->last_unplug_work = jiffies;
 	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_IO, NULL,
 			q->rq.count[READ] + q->rq.count[WRITE]);
 
@@ -1649,6 +1664,7 @@ static void blk_unplug_timeout(unsigned 
 	blk_add_trace_pdu_int(q, BLK_TA_UNPLUG_TIMER, NULL,
 			q->rq.count[READ] + q->rq.count[WRITE]);

+	q->last_unplug_timeout = jiffies;
 	kblockd_schedule_work(&q->unplug_work);
 }
 

diff ./.patches/orig/drivers/md/raid1.c ./drivers/md/raid1.c
--- ./.patches/orig/drivers/md/raid1.c  2006-08-10 17:28:01.0 +1000
+++ ./drivers/md/raid1.c2006-09-04 21:58:31.0 +1000
@@ -1486,7 +1486,6 @@ static void raid1d(mddev_t *mddev)
 					d = conf->raid_disks;
 				d--;
 				rdev = conf->mirrors[d].rdev;
-				atomic_add(s, &rdev->corrected_errors);
 				if (rdev &&
 				    test_bit(In_sync, &rdev->flags)) {
 					if (sync_page_io(rdev->bdev,
@@ -1509,6 +1508,9 @@ static void raid1d(mddev_t *mddev)
 						 s<<9, conf->tmppage, READ) == 0)
 						/* Well, this device is dead */
 						md_error(mddev, rdev);
+					else
+						atomic_add(s, &rdev->corrected_errors);
+
 				}
 			}
 		} else {

diff 

Re: raid5 hang on get_active_stripe

2006-10-07 Thread Chris Allen



Neil Brown wrote:

On Tuesday June 13, [EMAIL PROTECTED] wrote:
  

Will that fix be in 2.6.17?




Probably not.  We have had the last 'rc' twice and so I don't think
it is appropriate to submit the patch at this stage.
I probably will submit it for an early 2.6.17.x and for 2.6.16.y.


  


What is the status of this?

I've been experiencing exactly the same get_active_stripe lockup
on a FC5 2.6.17-1.2187_FC5smp  stock kernel. Curiously we have ten 
similar heavily loaded
servers but only one of them experiences the problem. The problem 
happens consistently after 24 hours or so
when I hammer the raid5 array over NFS, but I've never managed to 
trigger it with local access. I'd also
say (anecdotally) that it only started happening since I added a bitmap 
to my array.



As with the other poster, the lockup is released by increasing 
stripe_cache_size.








Re: raid5 hang on get_active_stripe

2006-06-02 Thread Neil Brown
On Friday June 2, [EMAIL PROTECTED] wrote:
 On Thu, 1 Jun 2006, Neil Brown wrote:
 
  I've got one more long-shot I would like to try first.  If you could
  backout that change to ll_rw_block, and apply this patch instead.
  Then when it hangs, just cat the stripe_cache_active file and see if
  that unplugs things or not (cat it a few times).
 
 nope that didn't unstick it... i had to raise stripe_cache_size (from 256 
 to 768... 512 wasn't enough)...
 
 -dean

Ok, thanks.
I still don't know what is really going on, but I'm 99.9863% sure this
will fix it, and is a reasonable thing to do.
(Yes, I lose a ';'.  That is deliberate).
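
For context, raid5_unplug_device() in this era looks roughly like the
sketch below (simplified from drivers/md/raid5.c; bitmap bookkeeping
omitted, details approximate). Unlike unplug_slaves(), it also removes
the array's own plug and re-activates the delayed stripes, which is
exactly the stuck state seen above:

	static void raid5_unplug_device(request_queue_t *q)
	{
		mddev_t *mddev = q->queuedata;
		raid5_conf_t *conf = mddev_to_conf(mddev);
		unsigned long flags;

		spin_lock_irqsave(&conf->device_lock, flags);
		if (blk_remove_plug(q))
			raid5_activate_delayed(conf);	/* delayed_list -> handle_list */
		md_wakeup_thread(mddev->thread);
		spin_unlock_irqrestore(&conf->device_lock, flags);

		unplug_slaves(mddev);	/* then unplug the member disks */
	}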

Please let me know what this proves, and thanks again for your
patience.

NeilBrown


Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid5.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~   2006-05-28 21:56:56.0 +1000
+++ ./drivers/md/raid5.c2006-06-02 17:24:07.0 +1000
@@ -285,7 +285,7 @@ static struct stripe_head *get_active_st
 				     < (conf->max_nr_stripes *3/4)
 				     || !conf->inactive_blocked),
 				    conf->device_lock,
-				    unplug_slaves(conf->mddev);
+				    raid5_unplug_device(conf->mddev->queue)
 			);
 			conf->inactive_blocked = 0;
 		} else


Re: raid5 hang on get_active_stripe

2006-05-30 Thread Neil Brown
On Tuesday May 30, [EMAIL PROTECTED] wrote:
 On Tue, 30 May 2006, Neil Brown wrote:
 
  Could you try this patch please?  On top of the rest.
  And if it doesn't fail in a couple of days, tell me how regularly the
  message
 	"kblockd_schedule_work failed"
  gets printed.
 
 i'm running this patch now ... and just after reboot, no freeze yet, i've 
 already seen a handful of these:
 
 May 30 17:05:09 localhost kernel: kblockd_schedule_work failed
 May 30 17:05:59 localhost kernel: kblockd_schedule_work failed
 May 30 17:08:16 localhost kernel: kblockd_schedule_work failed
 May 30 17:10:51 localhost kernel: kblockd_schedule_work failed
 May 30 17:11:51 localhost kernel: kblockd_schedule_work failed
 May 30 17:12:46 localhost kernel: kblockd_schedule_work failed
 May 30 17:14:14 localhost kernel: kblockd_schedule_work failed

1 every minute or so.  That's probably more than I would have
expected, but strongly lends evidence to the theory that this is the
problem.

I certainly wouldn't expect a failure every time kblockd_schedule_work
failed (in the original code), but the fact that it does fail
sometimes means there is a possible race which can cause the failure
that you experienced.
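
The debugging hunk being discussed is not quoted in this thread;
presumably it looked something like this sketch. kblockd_schedule_work()
wraps queue_work(), which returns 0 when the work_struct is still marked
pending, so an unplug requested by the timer can be silently dropped:

	static void blk_unplug_timeout(unsigned long data)
	{
		request_queue_t *q = (request_queue_t *)data;

		/* queue_work() returns 0 if unplug_work is still pending;
		 * the unplug requested by this timer tick is then lost. */
		if (!kblockd_schedule_work(&q->unplug_work))
			printk("kblockd_schedule_work failed\n");
	}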

So I am optimistic that the patch will have fixed the problem.  Please
let me know when you reach an uptime of 3 days.

Thanks,
NeilBrown


Re: raid5 hang on get_active_stripe

2006-05-30 Thread Neil Brown
On Tuesday May 30, [EMAIL PROTECTED] wrote:
 
 actually i think the rate is higher... i'm not sure why, but klogd doesn't 
 seem to keep up with it:
 
 [EMAIL PROTECTED]:~# grep -c kblockd_schedule_work /var/log/messages
 31
 [EMAIL PROTECTED]:~# dmesg | grep -c kblockd_schedule_work
 8192

# grep 'last message repeated' /var/log/messages
??

Obviously even faster than I thought.  I guess workqueue threads must
take a while to get scheduled...
I'm beginning to wonder if I really have found the bug after all :-(

I'll look forward to the results either way.

Thanks,
NeilBrown


Re: raid5 hang on get_active_stripe

2006-05-30 Thread dean gaudet
On Wed, 31 May 2006, Neil Brown wrote:

 On Tuesday May 30, [EMAIL PROTECTED] wrote:
  
  actually i think the rate is higher... i'm not sure why, but klogd doesn't 
  seem to keep up with it:
  
  [EMAIL PROTECTED]:~# grep -c kblockd_schedule_work /var/log/messages
  31
  [EMAIL PROTECTED]:~# dmesg | grep -c kblockd_schedule_work
  8192
 
 # grep 'last message repeated' /var/log/messages
 ??

um hi, of course :)  the paste below is approximately correct.

-dean

[EMAIL PROTECTED]:~# egrep 'kblockd_schedule_work|last message repeated' 
/var/log/messages
May 30 17:05:09 localhost kernel: kblockd_schedule_work failed
May 30 17:05:59 localhost kernel: kblockd_schedule_work failed
May 30 17:08:16 localhost kernel: kblockd_schedule_work failed
May 30 17:10:51 localhost kernel: kblockd_schedule_work failed
May 30 17:11:51 localhost kernel: kblockd_schedule_work failed
May 30 17:12:46 localhost kernel: kblockd_schedule_work failed
May 30 17:12:56 localhost last message repeated 22 times
May 30 17:14:14 localhost kernel: kblockd_schedule_work failed
May 30 17:16:57 localhost kernel: kblockd_schedule_work failed
May 30 17:17:00 localhost last message repeated 83 times
May 30 17:17:02 localhost kernel: kblockd_schedule_work failed
May 30 17:17:33 localhost last message repeated 950 times
May 30 17:18:34 localhost last message repeated 2218 times
May 30 17:19:35 localhost last message repeated 1581 times
May 30 17:20:01 localhost last message repeated 579 times
May 30 17:20:02 localhost kernel: kblockd_schedule_work failed
May 30 17:20:02 localhost kernel: kblockd_schedule_work failed
May 30 17:20:02 localhost kernel: kblockd_schedule_work failed
May 30 17:20:02 localhost last message repeated 23 times
May 30 17:20:03 localhost kernel: kblockd_schedule_work failed
May 30 17:20:34 localhost last message repeated 1058 times
May 30 17:21:35 localhost last message repeated 2171 times
May 30 17:22:36 localhost last message repeated 2305 times
May 30 17:23:37 localhost last message repeated 2311 times
May 30 17:24:38 localhost last message repeated 1993 times
May 30 17:25:01 localhost last message repeated 702 times
May 30 17:25:02 localhost kernel: kblockd_schedule_work failed
May 30 17:25:02 localhost last message repeated 15 times
May 30 17:25:02 localhost kernel: kblockd_schedule_work failed
May 30 17:25:02 localhost last message repeated 12 times
May 30 17:25:03 localhost kernel: kblockd_schedule_work failed
May 30 17:25:34 localhost last message repeated 1061 times
May 30 17:26:35 localhost last message repeated 2009 times
May 30 17:27:36 localhost last message repeated 1941 times
May 30 17:28:37 localhost last message repeated 2345 times
May 30 17:29:38 localhost last message repeated 2367 times
May 30 17:30:01 localhost last message repeated 870 times
May 30 17:30:01 localhost kernel: kblockd_schedule_work failed
May 30 17:30:01 localhost last message repeated 45 times
May 30 17:30:02 localhost kernel: kblockd_schedule_work failed
May 30 17:30:33 localhost last message repeated 1180 times
May 30 17:31:34 localhost last message repeated 2062 times
May 30 17:32:34 localhost last message repeated 2277 times
May 30 17:32:36 localhost kernel: kblockd_schedule_work failed
May 30 17:33:07 localhost last message repeated 1114 times
May 30 17:34:08 localhost last message repeated 2308 times
May 30 17:35:01 localhost last message repeated 1941 times
May 30 17:35:01 localhost kernel: kblockd_schedule_work failed
May 30 17:35:02 localhost last message repeated 20 times
May 30 17:35:02 localhost kernel: kblockd_schedule_work failed
May 30 17:35:33 localhost last message repeated 1051 times
May 30 17:36:34 localhost last message repeated 2002 times
May 30 17:37:35 localhost last message repeated 1644 times
May 30 17:38:36 localhost last message repeated 1731 times
May 30 17:39:37 localhost last message repeated 1844 times
May 30 17:40:01 localhost last message repeated 817 times
May 30 17:40:02 localhost kernel: kblockd_schedule_work failed
May 30 17:40:02 localhost last message repeated 39 times
May 30 17:40:02 localhost kernel: kblockd_schedule_work failed
May 30 17:40:02 localhost last message repeated 12 times
May 30 17:40:03 localhost kernel: kblockd_schedule_work failed
May 30 17:40:34 localhost last message repeated 1051 times
May 30 17:41:35 localhost last message repeated 1576 times
May 30 17:42:36 localhost last message repeated 2000 times
May 30 17:43:37 localhost last message repeated 2058 times
May 30 17:44:15 localhost last message repeated 1337 times
May 30 17:44:15 localhost kernel: kblockd_schedule_work failed
May 30 17:44:46 localhost last message repeated 1016 times
May 30 17:45:01 localhost last message repeated 432 times
May 30 17:45:02 localhost kernel: kblockd_schedule_work failed
May 30 17:45:02 localhost kernel: kblockd_schedule_work failed
May 30 17:45:33 localhost last message repeated 1229 times
May 30 17:46:34 localhost last message repeated 2552 times
May 30 17:47:36 localhost last message repeated 

Re: raid5 hang on get_active_stripe

2006-05-29 Thread dean gaudet
On Sun, 28 May 2006, Neil Brown wrote:

 The following patch adds some more tracing to raid5, and might fix a
 subtle bug in ll_rw_blk, though it is an incredibly long shot that
 this could be affecting raid5 (if it is, I'll have to assume there is
 another bug somewhere).   It certainly doesn't break ll_rw_blk.
 Whether it actually fixes something I'm not sure.
 
 If you could try with these on top of the previous patches I'd really
 appreciate it.
 
 When you read from /stripe_cache_active, it should trigger a
 (cryptic) kernel message within the next 15 seconds.  If I could get
 the contents of that file and the kernel messages, that should help.

got the hang again... attached is the dmesg with the cryptic messages.  i 
didn't think to grab the task dump this time though.

hope there's a clue in this one :)  but send me another patch if you need 
more data.

-dean

neemlark:/sys/block/md4/md# cat stripe_cache_size 
256
neemlark:/sys/block/md4/md# cat stripe_cache_active 
251
0 preread
plugged
bitlist=0 delaylist=251
neemlark:/sys/block/md4/md# cat stripe_cache_active 
251
0 preread
plugged
bitlist=0 delaylist=251
neemlark:/sys/block/md4/md# echo 512 > stripe_cache_size
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
292 preread
not plugged
bitlist=0 delaylist=32
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
292 preread
not plugged
bitlist=0 delaylist=32
neemlark:/sys/block/md4/md# cat stripe_cache_active
445
0 preread
not plugged
bitlist=0 delaylist=73
neemlark:/sys/block/md4/md# cat stripe_cache_active
480
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
413
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
13
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
493
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
487
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
405
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
1 preread
not plugged
bitlist=0 delaylist=28
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
84 preread
not plugged
bitlist=0 delaylist=69
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
69 preread
not plugged
bitlist=0 delaylist=56
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
41 preread
not plugged
bitlist=0 delaylist=38
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
10 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
453
3 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
14 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
477
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
476
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
486
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
384
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
387
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
462
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
480
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
448
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
501
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
476
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
416
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
386
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
434
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
406
0 preread
not plugged
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
447
0 preread
not plugged
bitlist=0 

Re: raid5 hang on get_active_stripe

2006-05-28 Thread Neil Brown
On Saturday May 27, [EMAIL PROTECTED] wrote:
 On Sat, 27 May 2006, Neil Brown wrote:
 
  Thanks.  This narrows it down quite a bit... too much in fact:  I can
  now say for sure that this cannot possibly happen :-)
  
2/ The message.gz you sent earlier with the
  echo t > /proc/sysrq-trigger
   trace in it didn't contain information about md4_raid5 - the 
 
 got another hang again this morning... full dmesg output attached.
 

Thanks.  Nothing surprising there, which maybe is a surprise itself...

I'm still somewhat stumped by this.  But given that it is nicely
repeatable, I'm sure we can get there...

The following patch adds some more tracing to raid5, and might fix a
subtle bug in ll_rw_blk, though it is an incredibly long shot that
this could be affecting raid5 (if it is, I'll have to assume there is
another bug somewhere).   It certainly doesn't break ll_rw_blk.
Whether it actually fixes something I'm not sure.

If you could try with these on top of the previous patches I'd really
appreciate it.

When you read from /stripe_cache_active, it should trigger a
(cryptic) kernel message within the next 15 seconds.  If I could get
the contents of that file and the kernel messages, that should help.

Thanks heaps,

NeilBrown


Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./block/ll_rw_blk.c  |4 ++--
 ./drivers/md/raid5.c |   18 ++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff ./block/ll_rw_blk.c~current~ ./block/ll_rw_blk.c
--- ./block/ll_rw_blk.c~current~2006-05-28 21:54:23.0 +1000
+++ ./block/ll_rw_blk.c 2006-05-28 21:55:17.0 +1000
@@ -874,7 +874,7 @@ static void __blk_queue_free_tags(reques
}
 
 	q->queue_tags = NULL;
-	q->queue_flags &= ~(1 << QUEUE_FLAG_QUEUED);
+	clear_bit(QUEUE_FLAG_QUEUED, &q->queue_flags);
 }
 
 /**
@@ -963,7 +963,7 @@ int blk_queue_init_tags(request_queue_t 
 * assign it, all done
 */
 	q->queue_tags = tags;
-	q->queue_flags |= (1 << QUEUE_FLAG_QUEUED);
+	set_bit(QUEUE_FLAG_QUEUED, &q->queue_flags);
return 0;
 fail:
kfree(tags);

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~   2006-05-27 09:17:10.0 +1000
+++ ./drivers/md/raid5.c2006-05-28 21:56:56.0 +1000
@@ -1701,13 +1701,20 @@ static sector_t sync_request(mddev_t *md
  * During the scan, completed stripes are saved for us by the interrupt
  * handler, so that they will not have to wait for our next wakeup.
  */
+static unsigned long trigger;
+
 static void raid5d (mddev_t *mddev)
 {
struct stripe_head *sh;
raid5_conf_t *conf = mddev_to_conf(mddev);
int handled;
+   int trace = 0;
 
 	PRINTK("+++ raid5d active\n");
+	if (test_and_clear_bit(0, &trigger))
+		trace = 1;
+	if (trace)
+		printk("raid5d runs\n");
 
md_check_recovery(mddev);
 
@@ -1725,6 +1732,13 @@ static void raid5d (mddev_t *mddev)
activate_bit_delay(conf);
}
 
+	if (trace)
+		printk(" le=%d, pas=%d, bqp=%d le=%d\n",
+		       list_empty(&conf->handle_list),
+		       atomic_read(&conf->preread_active_stripes),
+		       blk_queue_plugged(mddev->queue),
+		       list_empty(&conf->delayed_list));
+
 	if (list_empty(&conf->handle_list) &&
 	    atomic_read(&conf->preread_active_stripes) < IO_THRESHOLD &&
 	    !blk_queue_plugged(mddev->queue) &&
@@ -1756,6 +1770,8 @@ static void raid5d (mddev_t *mddev)
unplug_slaves(mddev);
 
 	PRINTK("--- raid5d inactive\n");
+	if (trace)
+		printk("raid5d done\n");
 }
 
 static ssize_t
@@ -1813,6 +1829,7 @@ stripe_cache_active_show(mddev_t *mddev,
struct list_head *l;
 	n = sprintf(page, "%d\n", atomic_read(&conf->active_stripes));
 	n += sprintf(page+n, "%d preread\n",
 		     atomic_read(&conf->preread_active_stripes));
+	n += sprintf(page+n, "%splugged\n",
+		     blk_queue_plugged(mddev->queue)?"":"not ");
 	spin_lock_irq(&conf->device_lock);
 	c1=0;
 	list_for_each(l, &conf->bitmap_list)
@@ -1822,6 +1839,7 @@ stripe_cache_active_show(mddev_t *mddev,
 		c2++;
 	spin_unlock_irq(&conf->device_lock);
 	n += sprintf(page+n, "bitlist=%d delaylist=%d\n", c1, c2);
+	trigger = 0x;
return n;
} else
return 0;


Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Tue, 23 May 2006, Neil Brown wrote:

 I've spent all morning looking at this and while I cannot see what is
 happening I did find a couple of small bugs, so that is good...
 
 I've attached three patches.  The first fix two small bugs (I think).
 The last adds some extra information to
   /sys/block/mdX/md/stripe_cache_active
 
 They are against 2.6.16.11.
 
 If you could apply them and if the problem recurs, report the content
 of stripe_cache_active several times before and after changing it,
 just like you did last time, that might help throw some light on the
 situation.

i applied them against 2.6.16.18 and two days later i got my first hang... 
below is the stripe_cache foo.

thanks
-dean

neemlark:~# cd /sys/block/md4/md/
neemlark:/sys/block/md4/md# cat stripe_cache_active 
255
0 preread
bitlist=0 delaylist=255
neemlark:/sys/block/md4/md# cat stripe_cache_active 
255
0 preread
bitlist=0 delaylist=255
neemlark:/sys/block/md4/md# cat stripe_cache_active 
255
0 preread
bitlist=0 delaylist=255
neemlark:/sys/block/md4/md# cat stripe_cache_active 
255
0 preread
bitlist=0 delaylist=255
neemlark:/sys/block/md4/md# cat stripe_cache_active 
255
0 preread
bitlist=0 delaylist=255
neemlark:/sys/block/md4/md# cat stripe_cache_size 
256
neemlark:/sys/block/md4/md# echo 512 > stripe_cache_size
neemlark:/sys/block/md4/md# cat stripe_cache_active
474
187 preread
bitlist=0 delaylist=222
neemlark:/sys/block/md4/md# cat stripe_cache_active
438
222 preread
bitlist=0 delaylist=72
neemlark:/sys/block/md4/md# cat stripe_cache_active
438
222 preread
bitlist=0 delaylist=72
neemlark:/sys/block/md4/md# cat stripe_cache_active
469
222 preread
bitlist=0 delaylist=72
neemlark:/sys/block/md4/md# cat stripe_cache_active
512
72 preread
bitlist=160 delaylist=103
neemlark:/sys/block/md4/md# cat stripe_cache_active
1
0 preread
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
2
0 preread
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
0
0 preread
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# cat stripe_cache_active
2
0 preread
bitlist=0 delaylist=0
neemlark:/sys/block/md4/md# 

md4 : active raid5 sdd1[0] sde1[5](S) sdh1[4] sdg1[3] sdf1[2] sdc1[1]
  1562834944 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
  bitmap: 10/187 pages [40KB], 1024KB chunk


Re: raid5 hang on get_active_stripe

2006-05-26 Thread Neil Brown
On Friday May 26, [EMAIL PROTECTED] wrote:
 On Tue, 23 May 2006, Neil Brown wrote:
 
 i applied them against 2.6.16.18 and two days later i got my first hang... 
 below is the stripe_cache foo.
 
 thanks
 -dean
 
 neemlark:~# cd /sys/block/md4/md/
 neemlark:/sys/block/md4/md# cat stripe_cache_active 
 255
 0 preread
 bitlist=0 delaylist=255
 neemlark:/sys/block/md4/md# cat stripe_cache_active 
 255
 0 preread
 bitlist=0 delaylist=255
 neemlark:/sys/block/md4/md# cat stripe_cache_active 
 255
 0 preread
 bitlist=0 delaylist=255

Thanks.  This narrows it down quite a bit... too much in fact:  I can
now say for sure that this cannot possibly happen :-)

Two things that might be helpful:
  1/ Do you have any other patches on 2.6.16.18 other than the 3 I
sent you?  If you do I'd like to see them, just in case.
  2/ The message.gz you sent earlier with the
  echo t > /proc/sysrq-trigger
 trace in it didn't contain information about md4_raid5 - the 
 controlling thread for that array.  It must have missed out
 due to a buffer overflowing.  Next time it happens, could you
 try to get this trace again and see if you can find out
 what md4_raid5 is doing.  Maybe do the 'echo t' several times.
 I think that you need a kernel recompile to make the dmesg
 buffer larger.

Thanks for your patience - this must be very frustrating for you.

NeilBrown


Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Sat, 27 May 2006, Neil Brown wrote:

 On Friday May 26, [EMAIL PROTECTED] wrote:
  On Tue, 23 May 2006, Neil Brown wrote:
  
  i applied them against 2.6.16.18 and two days later i got my first hang... 
  below is the stripe_cache foo.
  
  thanks
  -dean
  
  neemlark:~# cd /sys/block/md4/md/
  neemlark:/sys/block/md4/md# cat stripe_cache_active 
  255
  0 preread
  bitlist=0 delaylist=255
  neemlark:/sys/block/md4/md# cat stripe_cache_active 
  255
  0 preread
  bitlist=0 delaylist=255
  neemlark:/sys/block/md4/md# cat stripe_cache_active 
  255
  0 preread
  bitlist=0 delaylist=255
 
 Thanks.  This narrows it down quite a bit... too much in fact:  I can
 now say for sure that this cannot possibly happen :-)

heheh.  fwiw the box has traditionally been rock solid.. it's ancient 
though... dual p3 750 w/440bx chipset and pc100 ecc memory... 3ware 7508 
w/seagate 400GB disks... i really don't suspect the hardware all that much 
because the freeze seems to be rather consistent as to time of day 
(overnight while i've got 3x rdiff-backup, plus bittorrent, plus updatedb 
going).  unfortunately it doesn't happen every time... but every time i've 
unstuck the box i've noticed those processes going.

other tidbits... md4 is a lvm2 PV ... there are two LVs, one with ext3
and one with xfs.


 Two things that might be helpful:
   1/ Do you have any other patches on 2.6.16.18 other than the 3 I
 sent you?  If you do I'd like to see them, just in case.

it was just 2.6.16.18 plus the 3 you sent... i attached the .config
(it's rather full -- based off debian kernel .config).

maybe there's a compiler bug:

gcc version 4.0.4 20060507 (prerelease) (Debian 4.0.3-3)


   2/ The message.gz you sent earlier with the
   echo t > /proc/sysrq-trigger
  trace in it didn't contain information about md4_raid5 - the 
  controlling thread for that array.  It must have missed out
  due to a buffer overflowing.  Next time it happens, could you
  try to get this trace again and see if you can find out
  what md4_raid5 is doing.  Maybe do the 'echo t' several times.
  I think that you need a kernel recompile to make the dmesg
  buffer larger.

ok i'll set CONFIG_LOG_BUF_SHIFT=18 and rebuild ...

note that i'm going to include two more patches in this next kernel:

http://lkml.org/lkml/2006/5/23/42
http://arctic.org/~dean/patches/linux-2.6.16.5-no-treason.patch

the first was the Jens Axboe patch you mentioned here recently (for
accounting with i/o barriers)... and the second gets rid of the tcp
treason uncloaked messages.


 Thanks for your patience - this must be very frustrating for you.

fortunately i'm the primary user of this box... and the bug doesn't
corrupt anything... and i can unstick it easily :)  so it's not all that
frustrating actually.

-dean

config.gz
Description: Binary data


Re: raid5 hang on get_active_stripe

2006-05-22 Thread Neil Brown
On Wednesday May 17, [EMAIL PROTECTED] wrote:
 On Thu, 11 May 2006, dean gaudet wrote:
 
  On Tue, 14 Mar 2006, Neil Brown wrote:
  
   On Monday March 13, [EMAIL PROTECTED] wrote:
I just experienced some kind of lockup accessing my 8-drive raid5
(2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
now processes that try to read the md device hang. ps tells me they are
all sleeping in get_active_stripe. There is nothing in the syslog, and I
can read from the individual drives fine with dd. mdadm says the state
is active.
 ...
  
  i seem to be running into this as well... it has happened several times
  in the past three weeks.  i attached the kernel log output...
 
 it happened again...  same system as before...
 

I've spent all morning looking at this and while I cannot see what is
happening I did find a couple of small bugs, so that is good...

I've attached three patches.  The first fix two small bugs (I think).
The last adds some extra information to
  /sys/block/mdX/md/stripe_cache_active

They are against 2.6.16.11.

If you could apply them and if the problem recurs, report the content
of stripe_cache_active several times before and after changing it,
just like you did last time, that might help throw some light on the
situation.

Thanks,
NeilBrown

Status: ok

Fix a plug/unplug race in raid5

When a device is unplugged, requests are moved from one or two
(depending on whether a bitmap is in use) queues to the main
request queue.

So whenever requests are put on either of those queues, we should make
sure the raid5 array is 'plugged'.
However we don't.  We currently plug the raid5 queue just before
putting requests on queues, so there is room for a race.  If something
unplugs the queue at just the wrong time, requests will be left on
the queue and nothing will want to unplug them.
Normally something else will plug and unplug the queue fairly
soon, but there is a risk that nothing will.
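
An illustrative interleaving of that race (the call sites are from the
patch below; the timing is hypothetical):

	raid5 make_request()              unplug path
	--------------------              ---------------------------
	raid5_plug_device(conf)
	                                  raid5_unplug_device(q)
	                                    clears QUEUE_FLAG_PLUGGED and
	                                    moves delayed/bitmap stripes
	                                    to handle_list
	__release_stripe(sh)
	  list_add_tail(&sh->lru,
	                &conf->delayed_list)

The stripe now sits on delayed_list with the queue unplugged; unless a
later request plugs the queue again, nothing moves it to handle_list.
Hence the fix below plugs inside __release_stripe(), under device_lock,
at the moment the stripe is queued.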

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid5.c |   18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~   2006-05-23 12:27:58.0 +1000
+++ ./drivers/md/raid5.c2006-05-23 12:28:26.0 +1000
@@ -77,12 +77,14 @@ static void __release_stripe(raid5_conf_
 	if (atomic_read(&conf->active_stripes)==0)
 		BUG();
 	if (test_bit(STRIPE_HANDLE, &sh->state)) {
-		if (test_bit(STRIPE_DELAYED, &sh->state))
+		if (test_bit(STRIPE_DELAYED, &sh->state)) {
 			list_add_tail(&sh->lru, &conf->delayed_list);
-		else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
-			 conf->seq_write == sh->bm_seq)
+			blk_plug_device(conf->mddev->queue);
+		} else if (test_bit(STRIPE_BIT_DELAY, &sh->state) &&
+			   conf->seq_write == sh->bm_seq) {
 			list_add_tail(&sh->lru, &conf->bitmap_list);
-		else {
+			blk_plug_device(conf->mddev->queue);
+		} else {
 			clear_bit(STRIPE_BIT_DELAY, &sh->state);
 			list_add_tail(&sh->lru, &conf->handle_list);
 		}
@@ -1519,13 +1521,6 @@ static int raid5_issue_flush(request_que
return ret;
 }
 
-static inline void raid5_plug_device(raid5_conf_t *conf)
-{
-	spin_lock_irq(&conf->device_lock);
-	blk_plug_device(conf->mddev->queue);
-	spin_unlock_irq(&conf->device_lock);
-}
-
 static int make_request (request_queue_t *q, struct bio * bi)
 {
 	mddev_t *mddev = q->queuedata;
@@ -1577,7 +1572,6 @@ static int make_request (request_queue_t
 			goto retry;
 		}
 		finish_wait(&conf->wait_for_overlap, &w);
-		raid5_plug_device(conf);
 		handle_stripe(sh);
 		release_stripe(sh);
 
Status: ok

Fix some small races in bitmap plugging in raid5.

The comment gives more details, but I didn't quite have the
sequencing right, so there was room for races to leave bits
unset in the on-disk bitmap for short periods of time.

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/raid5.c |   30 +++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~   2006-05-23 12:28:26.0 +1000
+++ ./drivers/md/raid5.c2006-05-23 12:28:53.0 +1000
@@ -15,6 +15,30 @@
  * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+/*
+ * BITMAP UNPLUGGING:
+ *
+ * The sequencing for updating the bitmap reliably is a little
+ * subtle 

Re: raid5 hang on get_active_stripe

2006-05-18 Thread Neil Brown
On Wednesday May 17, [EMAIL PROTECTED] wrote:
 
 let me know if you want the task dump output from this one too.
 

No thanks - I doubt it will containing anything helpful.

I'll try to put some serious time into this next week - as soon as I
get mdadm 2.5 out.

NeilBrown


Re: raid5 hang on get_active_stripe

2006-05-17 Thread dean gaudet
On Thu, 11 May 2006, dean gaudet wrote:

 On Tue, 14 Mar 2006, Neil Brown wrote:
 
  On Monday March 13, [EMAIL PROTECTED] wrote:
   I just experienced some kind of lockup accessing my 8-drive raid5
   (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
   now processes that try to read the md device hang. ps tells me they are
   all sleeping in get_active_stripe. There is nothing in the syslog, and I
   can read from the individual drives fine with dd. mdadm says the state
   is active.
...
 
 i seem to be running into this as well... it has happened several times
 in the past three weeks.  i attached the kernel log output...

it happened again...  same system as before...


  You could try increasing the size of the stripe cache
    echo 512 > /sys/block/mdX/md/stripe_cache_size
  (choose an appropriate 'X').
 
 yeah that got things going again -- it took a minute or so maybe, i
 wasn't paying attention as to how fast things cleared up.

i tried 768 this time and it wasn't enough... 1024 did it again...

 
  Maybe check the content of
   /sys/block/mdX/md/stripe_cache_active
  as well.
 
 next time i'll check this before i increase stripe_cache_size... it's
 0 now, but the raid5 is working again...

here's a sequence of things i did... not sure if it helps:

# cat /sys/block/md4/md/stripe_cache_active
435
# cat /sys/block/md4/md/stripe_cache_size
512
# echo 768 > /sys/block/md4/md/stripe_cache_size
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# echo 1024 > /sys/block/md4/md/stripe_cache_size
# cat /sys/block/md4/md/stripe_cache_active
927
# cat /sys/block/md4/md/stripe_cache_active
151
# cat /sys/block/md4/md/stripe_cache_active
66
# cat /sys/block/md4/md/stripe_cache_active
2
# cat /sys/block/md4/md/stripe_cache_active
1
# cat /sys/block/md4/md/stripe_cache_active
0
# cat /sys/block/md4/md/stripe_cache_active
3

and it's OK again... except i'm going to lower the stripe_cache_size to
256 again because i'm not sure i want to keep having to double it each
freeze :)

let me know if you want the task dump output from this one too.

-dean


Re: raid5 hang on get_active_stripe

2006-03-13 Thread Neil Brown
On Monday March 13, [EMAIL PROTECTED] wrote:
 Hi all,
 
 I just experienced some kind of lockup accessing my 8-drive raid5
 (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
 now processes that try to read the md device hang. ps tells me they are
 all sleeping in get_active_stripe. There is nothing in the syslog, and I
 can read from the individual drives fine with dd. mdadm says the state
 is active.

Hmmm... That's sad. That's going to be very hard to track down.

If you could
  echo t > /proc/sysrq-trigger

and send me the dump that appears in the kernel log, I would
appreciate it.  I doubt it will be very helpful, but it is the best
bet I can come up with.

 
 I'm not sure what to do now. Is it safe to try to reboot the system or
 could that cause the device to get corrupted if it's hung in the middle
 of some important operation?

You could try increasing the size of the stripe cache
  echo 512 > /sys/block/mdX/md/stripe_cache_size
(choose an appropriate 'X').
Maybe check the content of
 /sys/block/mdX/md/stripe_cache_active
as well.

Other than that, just reboot.  The raid5 will do a resync, but the
data should be fine.

NeilBrown