Re: raid5 hang on get_active_stripe

2006-11-15 Thread dean gaudet
and i haven't seen it either... neil do you think your latest patch was hiding the bug? 'cause there was an iteration of an earlier patch which didn't produce much spam in dmesg but the bug was still there, then there is the version below which spams dmesg a fair amount but i didn't see the

Re: raid5 hang on get_active_stripe

2006-10-10 Thread Bas van Schaik
Hi all, Neil Brown wrote: On Tuesday October 10, [EMAIL PROTECTED] wrote: Very happy to. Let me know what you'd like me to do. Cool thanks. (snip) I don't know if it's useful information, but I'm encountering the same problem here, in a totally different situation. I'm using

Re: raid5 hang on get_active_stripe

2006-10-09 Thread Chris Allen
Ok, after more testing, this lockup happens consistently when bitmaps are switched on and never when they are switched off. Ideas anybody? On Sun, Oct 08, 2006 at 12:25:46AM +0100, Chris Allen wrote: Neil Brown wrote: On Tuesday June 13, [EMAIL PROTECTED] wrote: Will that fix be in

Re: raid5 hang on get_active_stripe

2006-10-09 Thread Neil Brown
On Monday October 9, [EMAIL PROTECTED] wrote: Ok, after more testing, this lockup happens consistently when bitmaps are switched on and never when they are switched off. Ideas anybody? No. I'm completely stumped. Which means it is probably something very obvious, but I keep looking in the

Re: raid5 hang on get_active_stripe

2006-10-09 Thread Chris Allen
Neil Brown wrote: On Monday October 9, [EMAIL PROTECTED] wrote: Ok, after more testing, this lockup happens consistently when bitmaps are switched on and never when they are switched off. Are you happy to try a kernel.org kernel with a few patches and a little shell script running?

Re: raid5 hang on get_active_stripe

2006-10-09 Thread Neil Brown
On Tuesday October 10, [EMAIL PROTECTED] wrote: Very happy to. Let me know what you'd like me to do. Cool thanks. At the end is a patch against 2.6.17.11, though it should apply against any later 2.6.17 kernel. Apply this and reboot. Then run while true do cat

Re: raid5 hang on get_active_stripe

2006-10-07 Thread Chris Allen
Neil Brown wrote: On Tuesday June 13, [EMAIL PROTECTED] wrote: Will that fix be in 2.6.17? Probably not. We have had the last 'rc' twice and so I don't think it is appropriate to submit the patch at this stage. I probably will submit it for an early 2.6.17.x. and for 2.6.16.y.

Re: raid5 hang on get_active_stripe

2006-06-02 Thread Neil Brown
On Friday June 2, [EMAIL PROTECTED] wrote: On Thu, 1 Jun 2006, Neil Brown wrote: I've got one more long-shot I would like to try first. If you could backout that change to ll_rw_block, and apply this patch instead. Then when it hangs, just cat the stripe_cache_active file and see if

Re: raid5 hang on get_active_stripe

2006-05-30 Thread Neil Brown
On Tuesday May 30, [EMAIL PROTECTED] wrote: On Tue, 30 May 2006, Neil Brown wrote: Could you try this patch please? On top of the rest. And if it doesn't fail in a couple of days, tell me how regularly the message kblockd_schedule_work failed gets printed. i'm running this

Re: raid5 hang on get_active_stripe

2006-05-30 Thread Neil Brown
On Tuesday May 30, [EMAIL PROTECTED] wrote: actually i think the rate is higher... i'm not sure why, but klogd doesn't seem to keep up with it: [EMAIL PROTECTED]:~# grep -c kblockd_schedule_work /var/log/messages 31 [EMAIL PROTECTED]:~# dmesg | grep -c kblockd_schedule_work 8192 # grep

Re: raid5 hang on get_active_stripe

2006-05-30 Thread dean gaudet
On Wed, 31 May 2006, Neil Brown wrote: On Tuesday May 30, [EMAIL PROTECTED] wrote: actually i think the rate is higher... i'm not sure why, but klogd doesn't seem to keep up with it: [EMAIL PROTECTED]:~# grep -c kblockd_schedule_work /var/log/messages 31 [EMAIL PROTECTED]:~#

Re: raid5 hang on get_active_stripe

2006-05-29 Thread dean gaudet
On Sun, 28 May 2006, Neil Brown wrote: The following patch adds some more tracing to raid5, and might fix a subtle bug in ll_rw_blk, though it is an incredibly long shot that this could be affecting raid5 (if it is, I'll have to assume there is another bug somewhere). It certainly doesn't

Re: raid5 hang on get_active_stripe

2006-05-28 Thread Neil Brown
On Saturday May 27, [EMAIL PROTECTED] wrote: On Sat, 27 May 2006, Neil Brown wrote: Thanks. This narrows it down quite a bit... too much in fact: I can now say for sure that this cannot possibly happen :-) 2/ The message.gz you sent earlier with the echo t

Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Tue, 23 May 2006, Neil Brown wrote: I've spent all morning looking at this and while I cannot see what is happening I did find a couple of small bugs, so that is good... I've attached three patches. The first two fix two small bugs (I think). The last adds some extra information to

Re: raid5 hang on get_active_stripe

2006-05-26 Thread Neil Brown
On Friday May 26, [EMAIL PROTECTED] wrote: On Tue, 23 May 2006, Neil Brown wrote: i applied them against 2.6.16.18 and two days later i got my first hang... below is the stripe_cache foo. thanks -dean neemlark:~# cd /sys/block/md4/md/ neemlark:/sys/block/md4/md# cat

Re: raid5 hang on get_active_stripe

2006-05-26 Thread dean gaudet
On Sat, 27 May 2006, Neil Brown wrote: On Friday May 26, [EMAIL PROTECTED] wrote: On Tue, 23 May 2006, Neil Brown wrote: i applied them against 2.6.16.18 and two days later i got my first hang... below is the stripe_cache foo. thanks -dean neemlark:~# cd /sys/block/md4/md/

Re: raid5 hang on get_active_stripe

2006-05-22 Thread Neil Brown
On Wednesday May 17, [EMAIL PROTECTED] wrote: On Thu, 11 May 2006, dean gaudet wrote: On Tue, 14 Mar 2006, Neil Brown wrote: On Monday March 13, [EMAIL PROTECTED] wrote: I just experienced some kind of lockup accessing my 8-drive raid5 (2.6.16-rc4-mm2). The system has been up

Re: raid5 hang on get_active_stripe

2006-05-18 Thread Neil Brown
On Wednesday May 17, [EMAIL PROTECTED] wrote: let me know if you want the task dump output from this one too. No thanks - I doubt it will contain anything helpful. I'll try to put some serious time into this next week - as soon as I get mdadm 2.5 out. NeilBrown

Re: raid5 hang on get_active_stripe

2006-05-17 Thread dean gaudet
On Thu, 11 May 2006, dean gaudet wrote: On Tue, 14 Mar 2006, Neil Brown wrote: On Monday March 13, [EMAIL PROTECTED] wrote: I just experienced some kind of lockup accessing my 8-drive raid5 (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but now processes that try

Re: raid5 hang on get_active_stripe

2006-03-13 Thread Neil Brown
On Monday March 13, [EMAIL PROTECTED] wrote: Hi all, I just experienced some kind of lockup accessing my 8-drive raid5 (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but now processes that try to read the md device hang. ps tells me they are all sleeping in