Re: PROBLEM: raid5 hangs

2007-11-14 Thread Justin Piszcz
This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5 
bio* patches are applied.


Justin.

On Wed, 14 Nov 2007, Peter Magnusson wrote:


Hey.

[1.] One line summary of the problem:

raid5 hangs and uses 100% cpu

[2.] Full description of the problem/report:

I had used 2.6.18 for 284 days or so until my power supply died; no problems
whatsoever during that time. After that forced reboot I made these changes: I
put in 2 GB more memory so I have 3 GB instead of 1 GB, and because two disks
in the raid5 had developed bad blocks and I didn't trust them anymore, I bought
new disks (I managed to save the raid5). I have 6x300 GB in a raid5. Two of the
disks are now 320 GB, so I created a small raid1 as well. The raid5 is encrypted
with aes-cbc-plain. The raid1 is encrypted with aes-cbc-essiv:sha256.


I compiled linux-2.6.22.3 and started using that. I used the same .config as
the default FC5 kernel; I think I just selected the P4 CPU and the preemptible
kernel type.

After 11 or 12 days the computer froze. I wasn't home when it happened and
couldn't fix it for about 3 days. All I could do was reboot it, as it wasn't
possible to log in remotely or on the console. It did respond to ping, however.

After the reboot it rebuilt the raid5.

Then it happened again after approximately the same time, 11 or 12 days. I
noticed that the md1_raid5 process used 100% CPU the whole time. After the
reboot it rebuilt the raid5.

I compiled linux-2.6.23.

And then... it happened again, after about the same time as before.
md1_raid5 used 100% CPU. I also noticed that I wasn't able to save
anything in my home directory; writes froze, although I could still read from
it. My home directory isn't on the raid5, but it is encrypted, and it's not on
any disk that has anything to do with RAID. This problem didn't happen when I
used 2.6.18. Currently I'm running 2.6.18 again, as I kind of need the computer
to be stable.

After the reboot it rebuilt the raid5.

top looked like this:

top - 02:37:32 up 11 days,  2:00, 29 users,  load average: 21.06, 17.45, 9.38
Tasks: 284 total,   2 running, 282 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.1%us, 51.2%sy,  0.0%ni,  0.0%id, 46.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3114928k total,  2981720k used,   133208k free,     8244k buffers
Swap:  2096472k total,      252k used,  2096220k free,  1690196k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2147 root      15  -5     0    0    0 R  100  0.0  80:25.80 md1_raid5
11328 iocc      20   0  536m 374m  28m S    3 12.3 249:32.38 firefox-bin

After some time, just before I rebooted I had this load:

02:48:36 up 11 days,  2:11, 29 users,  load average: 86.10, 70.80, 40.07
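
Next time, while I still have a working shell, I will try to grab some state
before rebooting; roughly this (just a sketch, and it assumes magic SysRq is
enabled in the kernel):

 # dump all task backtraces to the kernel log, then save it
 echo 1 > /proc/sys/kernel/sysrq
 echo t > /proc/sysrq-trigger
 dmesg > /tmp/hang-dmesg.txt
 # snapshot md state while md1_raid5 is spinning
 cat /proc/mdstat > /tmp/hang-mdstat.txt
 cat /sys/block/md1/md/stripe_cache_active >> /tmp/hang-mdstat.txt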

[3.] Keywords (i.e., modules, networking, kernel):

raid5, possibly dm_mod

[4.] Kernel version (from /proc/version):

Not using 2.6.23 now but anyway...
Linux version 2.6.18 ([EMAIL PROTECTED]) (gcc version 4.1.1 
20060525 (Red Hat 4.1.1-1)) #1 SMP Sun Sep 24 12:58:16 CEST 2006


[5.] Output of Oops.. message (if applicable) with symbolic information
resolved (see Documentation/oops-tracing.txt)

No oopses; it doesn't log anything.

[6.] A small shell script or example program which triggers the
problem (if possible)

-

[7.] Environment

Hmm..

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             7.8G  7.0G  761M  91% /        <- unencrypted fs
tmpfs                 1.5G     0  1.5G   0% /dev/shm
/dev/mapper/home       24G   23G  1.6G  94% /home    <- encrypted fs
/dev/mapper/temp      1.4T  822G  555G  60% /temp    <- encrypted fs, raid5
/dev/mapper/jb         18G   17G  1.2G  94% /mnt/jb  <- encrypted fs, raid1

[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status home
/dev/mapper/home is active:
 cipher:  aes-cbc-plain
 keysize: 256 bits
 device:  /dev/sda3
 offset:  0 sectors
 size:    50861790 sectors
 mode:    read/write
[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status temp
/dev/mapper/temp is active:
 cipher:  aes-cbc-plain
 keysize: 256 bits
 device:  /dev/md1
 offset:  0 sectors
 size:    2930496000 sectors
 mode:    read/write
[EMAIL PROTECTED] linux-2.6.23]# cryptsetup status jb
/dev/mapper/jb is active:
 cipher:  aes-cbc-essiv:sha256
 keysize: 256 bits
 device:  /dev/md0
 offset:  0 sectors
 size:    37238528 sectors
 mode:    read/write
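
For completeness: the mappings are plain dm-crypt (offset 0, no LUKS header),
and they get set up at boot roughly like this (from memory, so treat it as a
sketch; key handling left out):

 cryptsetup -c aes-cbc-plain -s 256 create home /dev/sda3
 cryptsetup -c aes-cbc-plain -s 256 create temp /dev/md1
 cryptsetup -c aes-cbc-essiv:sha256 -s 256 create jb /dev/md0
 mount /dev/mapper/home /home
 mount /dev/mapper/temp /temp
 mount /dev/mapper/jb /mnt/jb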

[7.1.] Software (add the output of the ver_linux script here)

If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux flashdance.cx 2.6.18 #1 SMP Sun Sep 24 12:58:16 CEST 2006 i686 i686 
i386 GNU/Linux


Gnu C  4.1.1
Gnu make   3.80
binutils   2.16.91.0.6
util-linux 2.13-pre7
mount  2.13-pre7
module-init-tools  3.2.2
e2fsprogs  1.38
reiserfsprogs  3.6.19
quota-tools    3.13
PPP2.4.3
Linux C Library2.4
Dynamic linker (ldd)   2.4
Procps 3.2.7
Net-tools  1.60
Kbd1.12
oprofile   0.9.1
Sh-utils   5.97
udev

Re: PROBLEM: raid5 hangs

2007-11-14 Thread Peter Magnusson

On Wed, 14 Nov 2007, Justin Piszcz wrote:

This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5 
bio* patches are applied.


Ok, good to know.
Do you know when it first appeared? It existed in linux-2.6.22.3 as well...



Re: PROBLEM: raid5 hangs

2007-11-14 Thread Justin Piszcz



On Wed, 14 Nov 2007, Peter Magnusson wrote:


On Wed, 14 Nov 2007, Justin Piszcz wrote:

This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5 
bio* patches are applied.


Ok, good to know.
Do you know when it first appeared? It existed in linux-2.6.22.3 as well...




I am unsure; I and others started noticing it mainly in 2.6.23. Again, not
sure, I will let others answer this one.


Justin.


Re: RAID5 Recovery

2007-11-14 Thread David Greaves
Neil Cavan wrote:
> Hello,
Hi Neil

What kernel version?
What mdadm version?

> This morning, I woke up to find the array had kicked two disks. This
> time, though, /proc/mdstat showed one of the failed disks (U_U_U, one
> of the "_"s) had been marked as a spare - weird, since there are no
> spare drives in this array. I rebooted, and the array came back in the
> same state: one failed, one spare. I hot-removed and hot-added the
> spare drive, which put the array back to where I thought it should be
> ( still U_U_U, but with both "_"s marked as failed). Then I rebooted,
> and the array began rebuilding on its own. Usually I have to hot-add
> manually, so that struck me as a little odd, but I gave it no mind and
> went to work. Without checking the contents of the filesystem. Which
> turned out not to have been mounted on reboot.
OK

> Because apparently things went horribly wrong.
Yep :(

> Do I have any hope of recovering this data? Could rebuilding the
> reiserfs superblock help if the rebuild managed to corrupt the
> superblock but not the data?
See below



> Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
> status=0x51 { DriveReady SeekComplete Error }

> Nov 13 02:01:06 localhost kernel: [17805775.156000] lost page write
> due to I/O error on md0
hdc1 fails


> Nov 13 02:01:06 localhost kernel: [17805775.196000] RAID5 conf printout:
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  --- rd:5 wd:3 fd:2
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 0, o:1, dev:hda1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 1, o:0, dev:hdc1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 2, o:1, dev:hde1
> Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 4, o:1, dev:hdi1

hdg1 is already missing?

> Nov 13 02:01:06 localhost kernel: [17805775.212000] RAID5 conf printout:
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  --- rd:5 wd:3 fd:2
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 0, o:1, dev:hda1
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 2, o:1, dev:hde1
> Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 4, o:1, dev:hdi1

so now the array is bad.

a reboot happens and:
> Nov 13 07:21:07 localhost kernel: [17179584.712000] md: md0 stopped.
> Nov 13 07:21:07 localhost kernel: [17179584.876000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: bind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: kicking
> non-fresh hdg1 from array!
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: unbind
> Nov 13 07:21:07 localhost kernel: [17179584.892000] md: export_rdev(hdg1)
> Nov 13 07:21:07 localhost kernel: [17179584.896000] raid5: allocated
> 5245kB for md0
... apparently hdc1 is OK? Hmmm.

> Nov 13 07:21:07 localhost kernel: [17179665.524000] ReiserFS: md0:
> found reiserfs format "3.6" with standard journal
> Nov 13 07:21:07 localhost kernel: [17179676.136000] ReiserFS: md0:
> using ordered data mode
> Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> journal params: device md0, size 8192, journal first block 18, max
> trans len 1024, max batch 900, max commit age 30, max trans age 30
> Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> checking transaction log (md0)
> Nov 13 07:21:07 localhost kernel: [17179676.828000] ReiserFS: md0:
> replayed 7 transactions in 1 seconds
> Nov 13 07:21:07 localhost kernel: [17179677.012000] ReiserFS: md0:
> Using r5 hash to sort names
> Nov 13 07:21:09 localhost kernel: [17179682.064000] lost page write
> due to I/O error on md0
Reiser tries to mount/replay itself relying on hdc1 (which is partly bad)

> Nov 13 07:25:39 localhost kernel: [17179584.828000] md: raid5
> personality registered as nr 4
> Nov 13 07:25:39 localhost kernel: [17179585.708000] md: kicking
> non-fresh hdg1 from array!
Another reboot...

> Nov 13 07:25:40 localhost kernel: [17179666.064000] ReiserFS: md0:
> found reiserfs format "3.6" with standard journal
> Nov 13 07:25:40 localhost kernel: [17179676.904000] ReiserFS: md0:
> using ordered data mode
> Nov 13 07:25:40 localhost kernel: [17179676.928000] ReiserFS: md0:
> journal params: device md0, size 8192, journal first block 18, max
> trans len 1024, max batch 900, max commit age 30, max trans age 30
> Nov 13 07:25:40 localhost kernel: [17179676.932000] ReiserFS: md0:
> checking transaction log (md0)
> Nov 13 07:25:40 localhost kernel: [17179677.08] ReiserFS: md0:
> Using r5 hash to sort names
> Nov 13 07:25:42 localhost kernel: [17179683.128000] lost page write
> due to I/O error on md0
Reiser tries again...

> Nov 13 07:26:57 localhost kernel: [17179757.524000] md: unbind
> Nov 13 07:26:57 localhost kernel: [17179757.524000] md: export_rdev(hdc1)
> Nov 13 07:27:03 localhost kernel: [17

Fwd: RAID5 Recovery

2007-11-14 Thread Neil Cavan
Thanks for taking a look, David.

Kernel:
2.6.15-27-k7, stock for Ubuntu 6.06 LTS

mdadm:
mdadm - v1.12.0 - 14 June 2005

You're right, earlier in /var/log/messages there's a notice that hdg
dropped, I missed it before. I use mdadm --monitor, but I recently
changed the target email address - I guess it didn't take properly.

As for replacing hdc, thanks for the diagnosis but it won't help: the
drive is actually fine, as is hdg. I've replaced hdc before, only to
have the brand new hdc show the same behaviour, and SMART says the
drive is A-OK. There's something flaky about these PCI IDE
controllers. I think it's time for a new system.

Reiserfs recovery-wise: any suggestions? A simple fsck doesn't find a
file system superblock. Is --rebuild-sb the way to go here?
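
Unless you advise otherwise, my rough plan is something like this (just a
sketch - I'd image the array first so nothing I try is irreversible; the paths
are made up):

 # keep a raw copy of the array before experimenting
 dd if=/dev/md0 of=/backup/md0.img bs=1M
 # read-only pass first, to see what reiserfsck makes of it
 reiserfsck --check /dev/md0
 # only if the superblock really is unreadable
 reiserfsck --rebuild-sb /dev/md0
 # absolute last resort
 reiserfsck --rebuild-tree /dev/md0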

Thanks,
Neil


On Nov 14, 2007 5:58 AM, David Greaves <[EMAIL PROTECTED]> wrote:
> Neil Cavan wrote:
> > Hello,
> Hi Neil
>
> What kernel version?
> What mdadm version?
>
> > This morning, I woke up to find the array had kicked two disks. This
> > time, though, /proc/mdstat showed one of the failed disks (U_U_U, one
> > of the "_"s) had been marked as a spare - weird, since there are no
> > spare drives in this array. I rebooted, and the array came back in the
> > same state: one failed, one spare. I hot-removed and hot-added the
> > spare drive, which put the array back to where I thought it should be
> > ( still U_U_U, but with both "_"s marked as failed). Then I rebooted,
> > and the array began rebuilding on its own. Usually I have to hot-add
> > manually, so that struck me as a little odd, but I gave it no mind and
> > went to work. Without checking the contents of the filesystem. Which
> > turned out not to have been mounted on reboot.
> OK
>
> > Because apparently things went horribly wrong.
> Yep :(
>
> > Do I have any hope of recovering this data? Could rebuilding the
> > reiserfs superblock help if the rebuild managed to corrupt the
> > superblock but not the data?
> See below
>
>
>
> > Nov 13 02:01:03 localhost kernel: [17805772.424000] hdc: dma_intr:
> > status=0x51 { DriveReady SeekComplete Error }
> 
> > Nov 13 02:01:06 localhost kernel: [17805775.156000] lost page write
> > due to I/O error on md0
> hdc1 fails
>
>
> > Nov 13 02:01:06 localhost kernel: [17805775.196000] RAID5 conf printout:
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  --- rd:5 wd:3 fd:2
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 0, o:1, dev:hda1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 1, o:0, dev:hdc1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 2, o:1, dev:hde1
> > Nov 13 02:01:06 localhost kernel: [17805775.196000]  disk 4, o:1, dev:hdi1
>
> hdg1 is already missing?
>
> > Nov 13 02:01:06 localhost kernel: [17805775.212000] RAID5 conf printout:
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  --- rd:5 wd:3 fd:2
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 0, o:1, dev:hda1
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 2, o:1, dev:hde1
> > Nov 13 02:01:06 localhost kernel: [17805775.212000]  disk 4, o:1, dev:hdi1
>
> so now the array is bad.
>
> a reboot happens and:
> > Nov 13 07:21:07 localhost kernel: [17179584.712000] md: md0 stopped.
> > Nov 13 07:21:07 localhost kernel: [17179584.876000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.884000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: bind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: kicking
> > non-fresh hdg1 from array!
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: unbind
> > Nov 13 07:21:07 localhost kernel: [17179584.892000] md: export_rdev(hdg1)
> > Nov 13 07:21:07 localhost kernel: [17179584.896000] raid5: allocated
> > 5245kB for md0
> ... apparently hdc1 is OK? Hmmm.
>
> > Nov 13 07:21:07 localhost kernel: [17179665.524000] ReiserFS: md0:
> > found reiserfs format "3.6" with standard journal
> > Nov 13 07:21:07 localhost kernel: [17179676.136000] ReiserFS: md0:
> > using ordered data mode
> > Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> > journal params: device md0, size 8192, journal first block 18, max
> > trans len 1024, max batch 900, max commit age 30, max trans age 30
> > Nov 13 07:21:07 localhost kernel: [17179676.164000] ReiserFS: md0:
> > checking transaction log (md0)
> > Nov 13 07:21:07 localhost kernel: [17179676.828000] ReiserFS: md0:
> > replayed 7 transactions in 1 seconds
> > Nov 13 07:21:07 localhost kernel: [17179677.012000] ReiserFS: md0:
> > Using r5 hash to sort names
> > Nov 13 07:21:09 localhost kernel: [17179682.064000] lost page write
> > due to I/O error on md0
> Reiser tries to mount/replay itself relying on hdc1 (which is partly bad)
>
> > Nov 13 07:25:39 localhost kernel: [17179584.828000] md: raid5
> > personality registered as nr 4
> > Nov 13 07:25:39 localh

Re: Fwd: RAID5 Recovery

2007-11-14 Thread David Greaves
Neil Cavan wrote:
> Thanks for taking a look, David.
No problem.

> Kernel:
> 2.6.15-27-k7, stock for Ubuntu 6.06 LTS
> 
> mdadm:
> mdadm - v1.12.0 - 14 June 2005
OK - fairly old then. Not really worth trying to figure out why hdc got re-added
when things had gone wrong.

> You're right, earlier in /var/log/messages there's a notice that hdg
> dropped, I missed it before. I use mdadm --monitor, but I recently
> changed the target email address - I guess it didn't take properly.
> 
> As for replacing hdc, thanks for the diagnosis but it won't help: the
> drive is actually fine, as is hdg. I've replaced hdc before, only to
> have the brand new hdc show the same behaviour, and SMART says the
> drive is A-OK. There's something flaky about these PCI IDE
> controllers. I think it's new system time.
Any excuse eh? :)


> Reiserfs recovery-wise: any suggestions? A simple fsck doesn't find a
> file system superblock. Is --rebuild-sb the way to go here?
No idea, sorry. I only ever tried Reiser once and it failed. It was very hard
to recover, so I swapped back to XFS.

Good luck on the fscking

David


Re: Building a new raid6 with bitmap does not clear bits during resync

2007-11-14 Thread Bill Davidsen

Neil Brown wrote:

On Monday November 12, [EMAIL PROTECTED] wrote:
  

Neil Brown wrote:


However there is value in regularly updating the bitmap, so add code
to periodically pause while all pending sync requests complete, then
update the bitmap.  Doing this only every few seconds (the same as the
bitmap update time) does not noticeably affect resync performance.
  
  
I wonder if a minimum time and minimum number of stripes would be 
better. If a resync is going slowly because it's going over a slow link 
to iSCSI, nbd, or a box of cheap drives fed off a single USB port, just 
writing the updated bitmap may represent as much data as has been 
resynced in the time slice.


Not a suggestion, but a request for your thoughts on that.



Thanks for your thoughts.
Choosing how often to update the bitmap during a sync is certainly not
trivial.   In different situations, different requirements might rule.

I chose to base it on time, and particularly on the time we already
have for "how soon to write back clean bits to the bitmap" because it
is fairly easy for users to understand the implications (if I set the
time to 30 seconds, then I might have to repeat 30 seconds of resync)
and it is already configurable (via the "--delay" option to --create
--bitmap).
  


Sounds right, that part of it is pretty user friendly.

Presumably if someone has a very slow system and wanted to use
bitmaps, they would set --delay relatively large to reduce the cost
and still provide significant benefits.  This would affect both normal
clean-bit writeback and during-resync clean-bit writeback.

Hope that clarifies my approach.
  


Easy to implement and understand is always a strong point, and a user 
can make an informed decision. Thanks for the discussion.
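
For the archives: that knob can also be changed on an existing array, roughly
like this (untested here; the device name is only an example):

 # drop and re-add the internal bitmap with a 30-second update delay
 mdadm --grow /dev/md0 --bitmap=none
 mdadm --grow /dev/md0 --bitmap=internal --delay=30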


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: Proposal: non-striping RAID4

2007-11-14 Thread Bill Davidsen

James Lee wrote:

From a quick search through this mailing list, it looks like I can
answer my own question regarding RAID1 --> RAID5 conversion.  Instead
of creating a RAID1 array for the partitions on the two biggest
drives, it should just create a 2-drive RAID5 (which is identical, but
can be expanded as with any other RAID5 array).

So it looks like this should work I guess.


I believe what you want to create might be a three drive raid-5 with one 
failed drive. That way you can just add a drive when you want.


 mdadm -C -c32 -l5 -n3 -amd /dev/md7 /dev/loop[12] missing

Then you can add another drive:

 mdadm --add /dev/md7 /dev/loop3

The output is at the end of this message.
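
You can watch the degraded array accept the new drive and start rebuilding
with the usual:

 cat /proc/mdstat
 mdadm --detail /dev/md7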

But in general I think it would be really great to be able to have a
format which would do raid-5 or raid-6 over all the available parts of
multiple drives, and since there is similar logic for raid-10 over a
selection of drives it is clearly possible. But in terms of the benefit
to be gained, unless it falls out of the code and someone feels the
desire to do it, I can't see much joy in ever having such a thing.


The feature I would really like to have is raid5e, distributed spare so 
head motion is spread over all drives. Don't have time to look at that 
one, either, but it really helps performance under load with small arrays.


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23

2007-11-14 Thread Greg KH
On Tue, Nov 13, 2007 at 10:36:30PM -0700, Dan Williams wrote:
> On Nov 13, 2007 8:43 PM, Greg KH <[EMAIL PROTECTED]> wrote:
> > >
> > > Careful, it looks like you cherry picked commit 4ae3f847 "md: raid5:
> > > fix clearing of biofill operations" which ended up misapplied in
> > > Linus' tree,  You should either also pick up def6ae26 "md: fix
> > > misapplied patch in raid5.c" or I can resend the original "raid5: fix
> > > clearing of biofill operations."
> > >
> > > The other patch for -stable "raid5: fix unending write sequence" is
> > > currently in -mm.
> >
> > Hm, I've attached the two patches that I have right now in the -stable
> > tree so far (still have over 100 patches to go, so I might not have
> > gotten to them yet if you have sent them).  These were sent to me by
> > Andrew on their way to Linus.  if I should drop either one, or add
> > another one, please let me know.
> >
> 
> Drop md-raid5-fix-clearing-of-biofill-operations.patch and replace it
> with the attached
> md-raid5-not-raid6-fix-clearing-of-biofill-operations.patch (the
> original sent to Neil).
> 
> The critical difference is that the replacement patch touches
> handle_stripe5, not handle_stripe6.  Diffing the patches shows the
> changes for hunk #3:
> 
> -@@ -2903,6 +2907,13 @@ static void handle_stripe6(struct stripe
> +@@ -2630,6 +2634,13 @@ static void handle_stripe5(struct stripe_head *sh)

Ah, ok, thanks, will do that.
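
(For anyone double-checking which variant a tree ended up with, something
like this is enough - commit id as quoted above:)

 git show 4ae3f847 -- drivers/md/raid5.c | grep handle_stripe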

> raid5-fix-unending-write-sequence.patch is in -mm and I believe is
> waiting on an Acked-by from Neil?

I don't see it in Linus's tree yet, so I can't apply it to -stable...

thanks,

greg k-h


Re: PROBLEM: raid5 hangs

2007-11-14 Thread Bill Davidsen

Justin Piszcz wrote:
This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the 
RAID5 bio* patches are applied.


Note below that he's running 2.6.22.3, which doesn't have the bug unless
-stable added it, so it should not really be in 2.6.22.anything. I assume
you're talking about the endless-write or bio issue?


Justin.

On Wed, 14 Nov 2007, Peter Magnusson wrote:


Hey.

[1.] One line summary of the problem:

raid5 hangs and uses 100% cpu

[2.] Full description of the problem/report:

I had used 2.6.18 for 284 days or so until my power supply died; no
problems whatsoever during that time. After that forced reboot I made
these changes: I put in 2 GB more memory so I have 3 GB instead of 1 GB,
and because two disks in the raid5 had developed bad blocks and I didn't
trust them anymore, I bought new disks (I managed to save the raid5). I
have 6x300 GB in a raid5. Two of the disks are now 320 GB, so I created
a small raid1 as well. The raid5 is encrypted with aes-cbc-plain. The
raid1 is encrypted with aes-cbc-essiv:sha256.


I compiled linux-2.6.22.3 and started using that. I used the same
.config as the default FC5 kernel; I think I just selected the P4 CPU
and the preemptible kernel type.


After 11 or 12 days the computer froze. I wasn't home when it happened
and couldn't fix it for about 3 days. All I could do was reboot it, as
it wasn't possible to log in remotely or on the console. It did respond
to ping, however.

After the reboot it rebuilt the raid5.

Then it happened again after approximately the same time, 11 or 12
days. I noticed that the md1_raid5 process used 100% CPU the whole
time. After the reboot it rebuilt the raid5.

I compiled linux-2.6.23.

And then... it happened again, after about the same time as before.
md1_raid5 used 100% CPU. I also noticed that I wasn't able to save
anything in my home directory; writes froze, although I could still
read from it. My home directory isn't on the raid5, but it is
encrypted, and it's not on any disk that has anything to do with RAID.
This problem didn't happen when I used 2.6.18. Currently I'm running
2.6.18 again, as I kind of need the computer to be stable.

After the reboot it rebuilt the raid5.


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: PROBLEM: raid5 hangs

2007-11-14 Thread Justin Piszcz



On Wed, 14 Nov 2007, Bill Davidsen wrote:


Justin Piszcz wrote:
This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5 
bio* patches are applied.


Note below that he's running 2.6.22.3, which doesn't have the bug unless
-stable added it, so it should not really be in 2.6.22.anything. I assume
you're talking about the endless-write or bio issue?

The bio issue is the root cause of the bug, yes?
--

I am uncertain, but I remember this happening in the past; I thought it
was something I was doing (possibly < 2.6.23), so it may have been
happening earlier than that, but I am not positive.




Justin.

On Wed, 14 Nov 2007, Peter Magnusson wrote:


Hey.

[1.] One line summary of the problem:

raid5 hangs and uses 100% cpu

[2.] Full description of the problem/report:

I had used 2.6.18 for 284 days or so until my power supply died; no problems
whatsoever during that time. After that forced reboot I made these changes: I
put in 2 GB more memory so I have 3 GB instead of 1 GB, and because two disks
in the raid5 had developed bad blocks and I didn't trust them anymore, I bought
new disks (I managed to save the raid5). I have 6x300 GB in a raid5. Two of
the disks are now 320 GB, so I created a small raid1 as well. The raid5 is
encrypted with aes-cbc-plain. The raid1 is encrypted with aes-cbc-essiv:sha256.


I compiled linux-2.6.22.3 and started using that. I used the same .config
as the default FC5 kernel; I think I just selected the P4 CPU and the
preemptible kernel type.


After 11 or 12 days the computer froze. I wasn't home when it happened and
couldn't fix it for about 3 days. All I could do was reboot it, as it wasn't
possible to log in remotely or on the console. It did respond to ping, however.

After the reboot it rebuilt the raid5.

Then it happened again after approximately the same time, 11 or 12 days.
I noticed that the md1_raid5 process used 100% CPU the whole time. After
the reboot it rebuilt the raid5.

I compiled linux-2.6.23.

And then... it happened again, after about the same time as before.
md1_raid5 used 100% CPU. I also noticed that I wasn't able to save
anything in my home directory; writes froze, although I could still
read from it. My home directory isn't on the raid5, but it is encrypted,
and it's not on any disk that has anything to do with RAID. This problem
didn't happen when I used 2.6.18. Currently I'm running 2.6.18 again, as
I kind of need the computer to be stable.

After the reboot it rebuilt the raid5.


--
bill davidsen <[EMAIL PROTECTED]>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979






Re: PROBLEM: raid5 hangs

2007-11-14 Thread Dan Williams
On Nov 14, 2007 5:05 PM, Justin Piszcz <[EMAIL PROTECTED]> wrote:
> On Wed, 14 Nov 2007, Bill Davidsen wrote:
> > Justin Piszcz wrote:
> >> This is a known bug in 2.6.23 and should be fixed in 2.6.23.2 if the RAID5
> >> bio* patches are applied.
> >
> > Note below he's running 2.6.22.3 which doesn't have the bug unless -STABLE
> > added it. So should not really be in 2.6.22.anything. I assume you're 
> > talking
> > the endless write or bio issue?
> The bio issue is the root cause of the bug yes?

Not if this is a 2.6.22 issue.  Neither the bug fixed by "raid5: fix
clearing of biofill operations" nor the one fixed by "raid5: fix unending
write sequence" existed prior to 2.6.23.


Re: Proposal: non-striping RAID4

2007-11-14 Thread James Lee
But creating a 3-drive RAID5 with a missing device for the final two
drives wouldn't give me what I'm looking for, as that array would no
longer be fault-tolerant.  So I think what we'd have on an array of n
differently-sized drives is:
- One n drive RAID5 array.
- One (n-1) drive RAID5 array.
...
- One 2 drive RAID5 array.
- One non-RAIDed single partition.

All of these except for the non-RAIDed partition would then be used as
elements in a linear array (which would tolerate the failure of any
single drive, as each of its constituent arrays does).  This would
leave a single non-RAIDed partition which can be used for anything
else.
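
To make that concrete, with (say) three drives of 200, 300 and 400 GB the
stack would be assembled roughly like this (invented device names and
partition layout, purely to illustrate the layering):

 # layer 1: 3-drive RAID5 over the space common to all three drives
 mdadm --create /dev/md1 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
 # layer 2: 2-drive RAID5 over the space the two bigger drives still share
 mdadm --create /dev/md2 --level=5 --raid-devices=2 /dev/sdb2 /dev/sdc2
 # concatenate the redundant layers into one device for the filesystem
 mdadm --create /dev/md3 --level=linear --raid-devices=2 /dev/md1 /dev/md2
 # the leftover space on the biggest drive stays non-RAIDed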

Thinking back over it, I think one potential issue might be how resync
works.  If all of the RAID5 arrays become in need of resync at the
same time (which is perfectly likely - e.g. if the system is powered
down abruptly, a drive is replaced, ...) will the md driver attempt to
resync each of the arrays sequentially or in parallel?  If the latter,
this is likely to be extremely slow, as it'll be trying to resync
multiple arrays on the same drives (and therefore doing huge amounts
of seeking, etc.).

The other issue is that it looks like (correct me if I'm wrong here)
mdadm doesn't support growing a linear array by increasing the size of
its constituent parts (which is what would be required here to be
able to expand the entire array when adding a new drive).  I don't
know how hard this would be to implement (I don't know how data gets
arranged in a linear array - does it start with all of the first
drive, then the second, and so on, or does it write bits to each?).

Neil: any comments on whether this would be desirable / useful / feasible?

James

PS: and as you say, all of the above could also be done with RAID6
arrays instead of RAID5.

On 14/11/2007, Bill Davidsen <[EMAIL PROTECTED]> wrote:
> James Lee wrote:
> > From a quick search through this mailing list, it looks like I can
> > answer my own question regarding RAID1 --> RAID5 conversion.  Instead
> > of creating a RAID1 array for the partitions on the two biggest
> > drives, it should just create a 2-drive RAID5 (which is identical, but
> > can be expanded as with any other RAID5 array).
> >
> > So it looks like this should work I guess.
>
> I believe what you want to create might be a three drive raid-5 with one
> failed drive. That way you can just add a drive when you want.
>
>   mdadm -C -c32 -l5 -n3 -amd /dev/md7 /dev/loop[12] missing
>
> Then you can add another drive:
>
>   mdadm --add /dev/md7 /dev/loop3
>
> The output are at the end of this message.
>
> But in general think it would be really great to be able to have a
> format which would do raid-5 or raid-6 over all the available parts of
> multiple drives, and since there's some similar logic for raid-10 over a
> selection of drives it is clearly possible. But in terms of the benefit
> to be gained, unless it fails out of the code and someone feels the
> desire to do it, I can't see much joy to ever having such a thing.
>
> The feature I would really like to have is raid5e, distributed spare so
> head motion is spread over all drives. Don't have time to look at that
> one, either, but it really helps performance under load with small arrays.
>
> --
> bill davidsen <[EMAIL PROTECTED]>
>   CTO TMR Associates, Inc
>   Doing interesting things with small computers since 1979
>
>


Re: [stable] [PATCH 000 of 2] md: Fixes for md in 2.6.23

2007-11-14 Thread Neil Brown
On Tuesday November 13, [EMAIL PROTECTED] wrote:
> 
> raid5-fix-unending-write-sequence.patch is in -mm and I believe is
> waiting on an Acked-by from Neil?
> 

It seems to have just been sent on to Linus, so it probably will go in
without:

   Acked-By: NeilBrown <[EMAIL PROTECTED]>

I'm beginning to think that I really should sit down and make sure I
understand exactly how those STRIPE_OP_ flags are used.  They
generally make sense, but there seem to be a number of corner cases
where they aren't quite handled properly.  Maybe they are all found
now, or maybe...

NeilBrown


Re: Proposal: non-striping RAID4

2007-11-14 Thread Neil Brown
On Thursday November 15, [EMAIL PROTECTED] wrote:
> 
> Neil: any comments on whether this would be desirable / useful / feasible?

1/ Having a raid4 variant which arranges the data like 'linear' is
   something I am planning to do eventually.  If your filesystem knows
   about the geometry of the array, then it can distribute the data
   across the drives and make up for a lot of the benefits of
   striping.  The big advantage of such an arrangement is that it is
   trivial to add a drive - just zero it and make it part of the
   array.  No need to re-arrange what is currently there.
   However, I was not thinking of supporting different-sized devices
   in such a configuration.

2/ Having an array with redundancy where the drives are of different
   sizes is awkward, primarily because if there were a spare that was
   not as large as the largest device, you may or may not be able to
   rebuild in that situation.  Certainly I could code up those
   decisions, but I'm not sure the scenario is worth the complexity.
   If you have drives of different sizes, use raid0 to combine pairs
   of smaller ones to match the larger ones, and do raid5 across
   devices that look like they are the same size (quick sketch below).

3/ If you really want to use exactly what you have, you can partition
   them into bits and make a variety of raid5 arrays as you suggest.
   md will notice and will resync in series so that you don't kill
   performance.
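
For 2/, e.g. with two 200GB drives standing in for one 400GB drive (made-up
device names):

   mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/sdc1 /dev/sdd1
   mdadm --create /dev/md11 --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/md10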

NeilBrown