Raid5 reshape

2006-06-15 Thread Tim
Hello all,

I'm sorry if this is a silly question, but I've been digging around for
a few days now and have not found a clear answer, so I'm tossing it out
to those who know it best.

I see that as of a few rc's ago, 2.6.17 has had the capability of adding
additional drives to an active raid 5 array (w/ the proper ver of mdadm,
of course). I cannot, however, for the life of me find out exactly how
one goes about doing it! I would love if someone could give a
step-by-step on what needs to be changed in, say, mdadm.conf (if
anything), and what args you need to throw at mdadm to start the reshape
process.

As a point of reference, here's my current mdadm.conf:


DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1 level=5 num-devices=3


I will be adding the devices /dev/sde1 and /dev/sdf1 (when I can find
out how :)

Thanks in advance,
-Tim



Re: Raid5 reshape

2006-06-15 Thread Neil Brown
On Thursday June 15, [EMAIL PROTECTED] wrote:
> Hello all,
> 
> I'm sorry if this is a silly question, but I've been digging around for
> a few days now and have not found a clear answer, so I'm tossing it out
> to those who know it best.
> 
> I see that as of a few rc's ago, 2.6.17 has had the capability of adding
> additional drives to an active raid 5 array (w/ the proper ver of mdadm,
> of course). I cannot, however, for the life of me find out exactly how
> one goes about doing it! I would love if someone could give a
> step-by-step on what needs to be changed in, say, mdadm.conf (if
> anything), and what args you need to throw at mdadm to start the reshape
> process.
> 
> As a point of reference, here's my current mdadm.conf:
> 
> 
> DEVICE /dev/sda1
> DEVICE /dev/sdb1
> DEVICE /dev/sdc1
> ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1 level=5 num-devices=3
> 

May I suggest:

   DEVICE /dev/sd?1
   ARRAY /dev/md0 UUID=whatever

it would be a lot safer.
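
The UUID line can be taken straight from the running array, e.g.:

   mdadm --detail --scan              # prints an ARRAY line including the UUID
   mdadm --detail /dev/md0 | grep -i uuid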

> 
> I will be adding the devices /dev/sde1 and /dev/sdf1 (when I can find
> out how :)

 mdadm /dev/md0 --add /dev/sde1 /dev/sdf1
 mdadm --grow /dev/md0 --raid-disks=5
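
Once the grow has started, the reshape runs in the background and can be
watched with the usual tools, e.g.:

   cat /proc/mdstat
   mdadm --detail /dev/md0
   watch -n 60 cat /proc/mdstat       # optional: refresh once a minute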

NeilBrown


Re: Raid5 reshape

2006-06-16 Thread Nigel J. Terry

Neil Brown wrote:

On Thursday June 15, [EMAIL PROTECTED] wrote:
  

Hello all,

I'm sorry if this is a silly question, but I've been digging around for
a few days now and have not found a clear answer, so I'm tossing it out
to those who know it best.

I see that as of a few rc's ago, 2.6.17 has had the capability of adding
additional drives to an active raid 5 array (w/ the proper ver of mdadm,
of course). I cannot, however, for the life of me find out exactly how
one goes about doing it! I would love if someone could give a
step-by-step on what needs to be changed in, say, mdadm.conf (if
anything), and what args you need to throw at mdadm to start the reshape
process.

As a point of reference, here's my current mdadm.conf:


DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1 level=5 num-devices=3




May I suggest:

   DEVICE /dev/sd?1
   ARRAY /dev/md0 UUID=whatever

it would be a lot safer.

  

I will be adding the devices /dev/sde1 and /dev/sdf1 (when I can find
out how :)



 mdadm /dev/md0 --add /dev/sde1 /dev/sdf1
 mdadm --grow /dev/md0 --raid-disks=5

NeilBrown

  

This might be an even sillier question, but I'll ask it anyway...

If I add a drive to my RAID5 array, what happens to the ext3 filesystem 
on top of it? Does it grow automatically? Do I have to take some action 
to use the extra space?


Thanks

Nigel


Re: Raid5 reshape

2006-06-16 Thread Tim T
You have to grow the ext3 fs separately. ext2resize /dev/mdX. Keep in 
mind this can only be done off-line.
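
For example, a minimal off-line sequence, assuming the filesystem is mounted
at /mnt/raid and using resize2fs from e2fsprogs (which does the same job as
ext2resize and grows to fill the device when no size is given):

   umount /mnt/raid
   e2fsck -f /dev/md0         # the resize tools insist on a clean filesystem
   resize2fs /dev/md0         # grow to the new size of the array
   mount /dev/md0 /mnt/raid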


-Tim

Nigel J. Terry wrote:

Neil Brown wrote:

On Thursday June 15, [EMAIL PROTECTED] wrote:
 

Hello all,

I'm sorry if this is a silly question, but I've been digging around for
a few days now and have not found a clear answer, so I'm tossing it out
to those who know it best.

I see that as of a few rc's ago, 2.6.17 has had the capability of 
adding
additional drives to an active raid 5 array (w/ the proper ver of 
mdadm,

of course). I cannot, however, for the life of me find out exactly how
one goes about doing it! I would love if someone could give a
step-by-step on what needs to be changed in, say, mdadm.conf (if
anything), and what args you need to throw at mdadm to start the 
reshape

process.

As a point of reference, here's my current mdadm.conf:


DEVICE /dev/sda1
DEVICE /dev/sdb1
DEVICE /dev/sdc1
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1 level=5 
num-devices=3





May I suggest:

   DEVICE /dev/sd?1
   ARRAY /dev/md0 UUID=whatever

it would be a lot safer.

 

I will be adding the devices /dev/sde1 and /dev/sdf1 (when I can find
out how :)



 mdadm /dev/md0 --add /dev/sde1 /dev/sdf1
 mdadm --grow /dev/md0 --raid-disks=5

NeilBrown

  

This might be an even sillier question, but I'll ask it anyway...

If I add a drive to my RAID5 array, what happens to the ext3 
filesystem on top of it? Does it grow automatically? Do I have to take 
some action to use the extra space?


Thanks

Nigel




Re: Raid5 reshape

2006-06-16 Thread Neil Brown
On Friday June 16, [EMAIL PROTECTED] wrote:
> You have to grow the ext3 fs separately. ext2resize /dev/mdX. Keep in 
> mind this can only be done off-line.
> 

ext3 can be resized online. I think ext2resize in the latest release
will "do the right thing" whether it is online or not.

There is a limit to the amount of expansion that can be achieved
on-line.  This limit is set when making the filesystem.  Depending on
which version of ext2-utils you used to make the filesystem, it may or
may not already be prepared for substantial expansion.

So if you want to do it on-line, give it a try or ask on the
ext3-users list for particular details on what versions you need and
how to see if your fs can be expanded.
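
With a sufficiently recent e2fsprogs the on-line attempt is just a single
command run against the mounted filesystem; it grows the fs to the size of
the device when no size argument is given:

   resize2fs /dev/md0

If that refuses because the filesystem wasn't prepared for on-line growth,
the fallback is the off-line umount / e2fsck -f / resize sequence.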

NeilBrown


Re: Raid5 reshape

2006-06-16 Thread Nigel J. Terry

Neil Brown wrote:

On Friday June 16, [EMAIL PROTECTED] wrote:
  
You have to grow the ext3 fs separately. ext2resize /dev/mdX. Keep in 
mind this can only be done off-line.





ext3 can be resized online. I think ext2resize in the latest release
will "do the right thing" whether it is online or not.

There is a limit to the amount of expansion that can be achieved
on-line.  This limit is set when making the filesystem.  Depending on
which version of ext2-utils you used to make the filesystem, it may or
may not already be prepared for substantial expansion.

So if you want to do it on-line, give it a try or ask on the
ext3-users list for particular details on what versions you need and
how to see if your fs can be expanded.

NeilBrown

  
Thanks for all the advice. One final question, what kernel and mdadm 
versions do I need?


Nigel


Re: Raid5 reshape

2006-06-16 Thread Neil Brown
On Friday June 16, [EMAIL PROTECTED] wrote:
> Thanks for all the advice. One final question, what kernel and mdadm 
> versions do I need?

For resizing raid5:

mdadm-2.4 or later
linux-2.6.17-rc2 or later
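
Both are easy to check on a running system, e.g.:

   mdadm --version
   uname -r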

NeilBrown


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Neil Brown wrote:

On Friday June 16, [EMAIL PROTECTED] wrote:
  
Thanks for all the advice. One final question, what kernel and mdadm 
versions do I need?



For resizing raid5:

mdadm-2.4 or later
linux-2.6.17-rc2 or later

NeilBrown

  

Ok, I tried and screwed up!

I upgraded my kernel and mdadm.
I set the grow going and all looked well, so as it said it was going to 
take 430 minutes, I went to Starbucks. When I came home there had been a 
power cut, but my UPS had shut the system down. When power returned I 
rebooted. Now I think I had failed to set the new partition on /dev/hdc1 
to Raid Autodetect, so it didn't find it at reboot. I tried to hot add 
it, but now I seem to have a deadlock situation. Although --detail shows 
that it is degraded and recovering, /proc/mdstat shows it is reshaping. 
In truth there is no disk activity and the count in /proc/mdstat is not 
changing. I guess the only good news is that I can still mount the 
device and my data is fine. Please see below...
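
(For reference: the in-kernel autodetection at boot only picks up partitions
whose type is set to fd, "Linux raid autodetect". On an MBR disk that can be
set interactively with fdisk, roughly:

   fdisk /dev/hdc
     t        # change a partition's system id
     1        # partition number
     fd       # Linux raid autodetect
     w        # write the table and exit

The device name is just the one from this report; arrays assembled from
mdadm.conf by an init script or initrd don't need the autodetect type at all.)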


Any ideas what I should do next? Thanks

Nigel

[EMAIL PROTECTED] ~]# uname -a
Linux homepc.nigelterry.net 2.6.17-rc6 #1 SMP Sat Jun 17 11:05:52 EDT 
2006 x86_64 x86_64 x86_64 GNU/Linux

[EMAIL PROTECTED] ~]# mdadm --version
mdadm - v2.5.1 -  16 June 2006
[EMAIL PROTECTED] ~]# mdadm --detail /dev/md0
/dev/md0:
   Version : 00.91.03
 Creation Time : Tue Apr 18 17:44:34 2006
Raid Level : raid5
Array Size : 490223104 (467.51 GiB 501.99 GB)
   Device Size : 245111552 (233.76 GiB 250.99 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 0
   Persistence : Superblock is persistent

   Update Time : Sat Jun 17 15:15:05 2006
 State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
 Spare Devices : 1

Layout : left-symmetric
Chunk Size : 128K

Reshape Status : 6% complete
 Delta Devices : 1, (3->4)

  UUID : 50e3173e:b5d2bdb6:7db3576b:644409bb
Events : 0.3211829

   Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       3       65        2      active sync   /dev/hdb1
       3       0        0        3      removed

       4      22        1        -      spare   /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=>...]  reshape =  6.9% (17073280/245111552) 
finish=86.3min speed=44003K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#



Re: Raid5 reshape

2006-06-17 Thread Neil Brown
On Saturday June 17, [EMAIL PROTECTED] wrote:
> 
> Any ideas what I should do next? Thanks
> 

Looks like you've probably hit a bug.  I'll need a bit more info
though.

First:

> [EMAIL PROTECTED] ~]# cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
>   490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
> [UUU_]
>   [=>...]  reshape =  6.9% (17073280/245111552) 
> finish=86.3min speed=44003K/sec
> 
> unused devices: 

This really makes it look like the reshape is progressing.  How
long after the reboot was this taken?  How long after hdc1 has hot
added (roughly)?  What does it show now?

What happens if you remove hdc1 again?  Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes
into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry


Neil Brown wrote:

On Saturday June 17, [EMAIL PROTECTED] wrote:
  

Any ideas what I should do next? Thanks




Looks like you've probably hit a bug.  I'll need a bit more info
though.

First:

  

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
  [=>...]  reshape =  6.9% (17073280/245111552) 
finish=86.3min speed=44003K/sec


unused devices: 



This really makes it look like the reshape is progressing.  How
long after the reboot was this taken?  How long after hdc1 has hot
added (roughly)?  What does it show now?

What happens if you remove hdc1 again?  Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes
into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown

  
I don't know how long the system was reshaping before the power went 
off, and then I had to restart when the power came back. It claimed it 
was going to take 430 minutes, so 6% would be about 25 minutes, which 
could make good sense, certainly it looked like it was working fine when 
I went out.


Now nothing is happening, it shows:

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=>...]  reshape =  6.9% (17073280/245111552) 
finish=2281.2min speed=1665K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#

so the only thing changing is the time till finish.

I'll try removing and adding /dev/hdc1 again. Will it make any 
difference if the device is mounted or not?


Nigel


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Nigel J. Terry wrote:


Neil Brown wrote:

On Saturday June 17, [EMAIL PROTECTED] wrote:
 

Any ideas what I should do next? Thanks




Looks like you've probably hit a bug.  I'll need a bit more info
though.

First:

 

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 
[4/3] [UUU_]
  [=>...]  reshape =  6.9% (17073280/245111552) 
finish=86.3min speed=44003K/sec


unused devices: 



This really makes it look like the reshape is progressing.  How
long after the reboot was this taken?  How long after hdc1 has hot
added (roughly)?  What does it show now?

What happens if you remove hdc1 again?  Does the reshape keep going?

What I would expect to happen in this case is that the array reshapes
into a degraded array, then the missing disk is recovered onto hdc1.

NeilBrown

  
I don't know how long the system was reshaping before the power went 
off, and then I had to restart when the power came back. It claimed it 
was going to take 430 minutes, so 6% would be about 25 minutes, which 
could make good sense, certainly it looked like it was working fine 
when I went out.


Now nothing is happening, it shows:

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 
[4/3] [UUU_]
 [=>...]  reshape =  6.9% (17073280/245111552) 
finish=2281.2min speed=1665K/sec


unused devices: 
[EMAIL PROTECTED] ~]#

so the only thing changing is the time till finish.

I'll try removing and adding /dev/hdc1 again. Will it make any 
difference if the device is mounted or not?


Nigel

Tried remove and add, made no difference:
[EMAIL PROTECTED] ~]# mdadm /dev/md0 --remove /dev/hdc1
mdadm: hot removed /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=>...]  reshape =  6.9% (17073280/245111552) 
finish=2321.5min speed=1636K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]# mdadm /dev/md0 --add /dev/hdc1
mdadm: re-added /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 hdc1[4](S) sdb1[1] sda1[0] hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=>...]  reshape =  6.9% (17073280/245111552) 
finish=2329.3min speed=1630K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]#



Re: Raid5 reshape

2006-06-17 Thread Neil Brown

OK, thanks for the extra details.  I'll have a look and see what I can
find, but it'll probably be a couple of days before I have anything
useful for you.

NeilBrown


Re: Raid5 reshape

2006-06-17 Thread Nigel J. Terry

Neil Brown wrote:

OK, thanks for the extra details.  I'll have a look and see what I can
find, but it'll probably be a couple of days before I have anything
useful for you.

NeilBrown

  

OK, I'll try and be patient :-) At least everything else is working.

Let me know if you need to ssh to my machine.

Nigel


Re: Raid5 reshape

2006-06-18 Thread Nigel J. Terry

Nigel J. Terry wrote:

Neil Brown wrote:

OK, thanks for the extra details.  I'll have a look and see what I can
find, but it'll probably be a couple of days before I have anything
useful for you.

NeilBrown

  



This from dmesg might help diagnose the problem:

md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb1 ...
md:  adding sdb1 ...
md:  adding sda1 ...
md:  adding hdc1 ...
md:  adding hdb1 ...
md: created md0
md: bind
md: bind
md: bind
md: bind
md: running: 
raid5: automatically using best checksumming function: generic_sse
  generic_sse:  6795.000 MB/sec
raid5: using function: generic_sse (6795.000 MB/sec)
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: reshape will continue
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: device hdb1 operational as raid disk 2
raid5: allocated 4268kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, o:1, dev:sda1
disk 1, o:1, dev:sdb1
disk 2, o:1, dev:hdb1
...ok start reshape thread
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 20 
KB/sec) for reconstruction.

md: using 128k window, over a total of 245111552 blocks.
Unable to handle kernel NULL pointer dereference at  RIP:
<>{stext+2145382632}
PGD 7c3f9067 PUD 7cb9e067 PMD 0
Oops: 0010 [1] SMP
CPU 0
Modules linked in: raid5 xor usb_storage video button battery ac lp 
parport_pc parport floppy nvram snd_intel8x0 snd_ac97_codec snd_ac97_bus 
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss ehci_hcd ohci1394 ieee1394 sg snd_pcm uhci_hcd 
i2c_nforce2 i2c_core forcedeth ohci_hcd snd_timer snd soundcore 
snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd sata_nv 
libata sd_mod scsi_mod

Pid: 1432, comm: md0_reshape Not tainted 2.6.17-rc6 #1
RIP: 0010:[<>] <>{stext+2145382632}
RSP: :81007aa43d60  EFLAGS: 00010246
RAX: 81007cf72f20 RBX: 81007c682000 RCX: 0006
RDX:  RSI:  RDI: 81007cf72f20
RBP: 02090900 R08:  R09: 810037f497b0
R10: 000b44ffd564 R11: 8022c92a R12: 
R13: 0100 R14:  R15: 
FS:  0066d870() GS:80611000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2:  CR3: 7bebc000 CR4: 06e0
Process md0_reshape (pid: 1432, threadinfo 81007aa42000, task 
810037f497b0)

Stack: 803dce42  1d383600 
     
   
Call Trace: {md_do_sync+1307} 
{thread_return+0}
  {thread_return+94} 
{keventd_create_kthread+0}
  {md_thread+248} 
{keventd_create_kthread+0}

  {md_thread+0} {kthread+254}
  {child_rip+8} 
{keventd_create_kthread+0}

  {thread_return+0} {kthread+0}
  {child_rip+0}

Code:  Bad RIP value.
RIP <>{stext+2145382632} RSP 
CR2: 
<6>md: ... autorun DONE.


Re: Raid5 reshape

2006-06-18 Thread Neil Brown
On Sunday June 18, [EMAIL PROTECTED] wrote:
> This from dmesg might help diagnose the problem:
> 

Yes, that helps a lot, thanks.

The problem is that the reshape thread is restarting before the array
is fully set-up, so it ends up dereferencing a NULL pointer.

This patch should fix it.
In fact, there is a small chance that next time you boot it will work
without this patch, but the patch makes it more reliable.

There definitely should be no data-loss due to this bug.

Thanks,
NeilBrown



### Diffstat output
 ./drivers/md/md.c|6 --
 ./drivers/md/raid5.c |3 ---
 2 files changed, 4 insertions(+), 5 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2006-05-30 15:07:14.0 +1000
+++ ./drivers/md/md.c   2006-06-19 12:01:47.0 +1000
@@ -2719,8 +2719,6 @@ static int do_md_run(mddev_t * mddev)
}

set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-   md_wakeup_thread(mddev->thread);
-   
if (mddev->sb_dirty)
md_update_sb(mddev);
 
@@ -2738,6 +2736,10 @@ static int do_md_run(mddev_t * mddev)
 
mddev->changed = 1;
md_new_event(mddev);
+
+   md_wakeup_thread(mddev->thread);
+   md_wakeup_thread(mddev->sync_thread);
+
return 0;
 }
 

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c2006-06-19 11:56:41.0 +1000
+++ ./drivers/md/raid5.c2006-06-19 11:56:44.0 +1000
@@ -2373,9 +2373,6 @@ static int run(mddev_t *mddev)
set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
mddev->sync_thread = md_register_thread(md_do_sync, mddev,
"%s_reshape");
-   /* FIXME if md_register_thread fails?? */
-   md_wakeup_thread(mddev->sync_thread);
-
}
 
/* read-ahead size must cover two whole stripes, which is
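
For anyone wanting to try this: a typical way to apply such a patch against
the matching kernel source tree (the patch file name and source path below
are just placeholders):

   cd /usr/src/linux-2.6.17-rc6
   patch -p1 --dry-run < ~/md-reshape-fix.patch   # check that it applies cleanly
   patch -p1 < ~/md-reshape-fix.patch
   # then rebuild and install the kernel (or at least the md/raid5 modules)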


Re: Raid5 reshape

2006-06-19 Thread Nigel J. Terry

Neil Brown wrote:

On Sunday June 18, [EMAIL PROTECTED] wrote:
  

This from dmesg might help diagnose the problem:




Yes, that helps a lot, thanks.

The problem is that the reshape thread is restarting before the array
is fully set-up, so it ends up dereferencing a NULL pointer.

This patch should fix it.
In fact, there is a small chance that next time you boot it will work
without this patch, but the patch makes it more reliable.

There definitely should be no data-loss due to this bug.

Thanks,
NeilBrown

  

Neil

That seems to have fixed it. The reshape is now progressing and there are no 
apparent errors in dmesg. Details below.

I'll send another confirmation tomorrow when hopefully it has finished :-)

Many thanks for a great product and great support.

Nigel

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [=>...]  reshape =  7.9% (19588744/245111552) 
finish=6.4min speed=578718K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]# mdadm --detail /dev/md0
/dev/md0:
   Version : 00.91.03
 Creation Time : Tue Apr 18 17:44:34 2006
Raid Level : raid5
Array Size : 490223104 (467.51 GiB 501.99 GB)
   Device Size : 245111552 (233.76 GiB 250.99 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 0
   Persistence : Superblock is persistent

   Update Time : Mon Jun 19 17:38:42 2006
 State : clean, degraded, recovering
Active Devices : 3
Working Devices : 4
Failed Devices : 0
 Spare Devices : 1

Layout : left-symmetric
Chunk Size : 128K

Reshape Status : 8% complete
 Delta Devices : 1, (3->4)

  UUID : 50e3173e:b5d2bdb6:7db3576b:644409bb
Events : 0.3287189

   Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       3       65        2      active sync   /dev/hdb1
       3       0        0        3      removed

       4      22        1        -      spare   /dev/hdc1
[EMAIL PROTECTED] ~]#



Re: Raid5 reshape

2006-06-19 Thread Neil Brown
On Monday June 19, [EMAIL PROTECTED] wrote:
> 
> That seems to have fixed it. The reshape is now progressing and
> there are no apparent errors in dmesg. Details below. 

Great!

> 
> I'll send another confirmation tomorrow when hopefully it has finished :-)
> 
> Many thanks for a great product and great support.

And thank you for being a patient beta-tester!

NeilBrown


Re: Raid5 reshape

2006-06-19 Thread Nigel J. Terry

Neil Brown wrote:

On Monday June 19, [EMAIL PROTECTED] wrote:
  

That seems to have fixed it. The reshape is now progressing and
there are no apparent errors in dmesg. Details below. 



Great!

  

I'll send another confirmation tomorrow when hopefully it has finished :-)

Many thanks for a great product and great support.



And thank you for being a patient beta-tester!

NeilBrown

  
Neil - I see myself more as being an "idiot-proof" tester than a 
beta-tester...


One comment - As I look at the rebuild, which is now over 20%, the time 
till finish makes no sense. It did make sense when the first reshape 
started. I guess your estimating / averaging algorithm doesn't work for 
a restarted reshape. A minor cosmetic issue - see below


Nigel
[EMAIL PROTECTED] ~]$ cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
 490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
 [>]  reshape = 22.7% (55742816/245111552) 
finish=5.8min speed=542211K/sec


unused devices: <none>
[EMAIL PROTECTED] ~]$




Re: Raid5 reshape

2006-06-19 Thread Mike Hardy


Nigel J. Terry wrote:

> One comment - As I look at the rebuild, which is now over 20%, the time
> till finish makes no sense. It did make sense when the first reshape
> started. I guess your estimating / averaging algorithm doesn't work for
> a restarted reshape. A minor cosmetic issue - see below
> 
> Nigel
> [EMAIL PROTECTED] ~]$ cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
>  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3]
> [UUU_]
>  [>]  reshape = 22.7% (55742816/245111552)
> finish=5.8min speed=542211K/sec


Unless something has changed recently the parity-rebuild-interrupted /
restarted-parity-rebuild case shows the same behavior.

It's probably the same chunk of code (I haven't looked, bad hacker!
bad!), but I thought I'd mention it in case Neil goes looking

The "speed" is truly impressive though. I'll almost be sorry to see it
fixed :-)

-Mike


Re: Raid5 reshape

2006-06-19 Thread Nigel J. Terry

Mike Hardy wrote:

Unless something has changed recently the parity-rebuild-interrupted /
restarted-parity-rebuild case shows the same behavior.

It's probably the same chunk of code (I haven't looked, bad hacker!
bad!), but I thought I'd mention it in case Neil goes looking

The "speed" is truly impressive though. I'll almost be sorry to see it
fixed :-)

-Mike

  
I'd love to agree about the speed, but this has been the longest 5.8 
minutes of my life... :-)



Re: Raid5 reshape

2006-06-19 Thread Neil Brown
On Monday June 19, [EMAIL PROTECTED] wrote:
> 
> One comment - As I look at the rebuild, which is now over 20%, the time 
> till finish makes no sense. It did make sense when the first reshape 
> started. I guess your estimating / averaging algorithm doesn't work for 
> a restarted reshape. A minor cosmetic issue - see below
> 
> Nigel
> [EMAIL PROTECTED] ~]$ cat /proc/mdstat
> Personalities : [raid5] [raid4]
> md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
>   490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
> [UUU_]
>   [>]  reshape = 22.7% (55742816/245111552) 
> finish=5.8min speed=542211K/sec

Hmmm. I see.
This should fix that, but I don't expect you to interrupt your reshape
to try it.

Thanks,
NeilBrown


### Diffstat output
 ./drivers/md/md.c   |8 +---
 ./include/linux/raid/md_k.h |3 ++-
 2 files changed, 7 insertions(+), 4 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2006-06-19 11:52:55.0 +1000
+++ ./drivers/md/md.c   2006-06-20 09:30:57.0 +1000
@@ -2717,7 +2717,7 @@ static ssize_t
 sync_speed_show(mddev_t *mddev, char *page)
 {
unsigned long resync, dt, db;
-   resync = (mddev->curr_resync - atomic_read(&mddev->recovery_active));
+   resync = (mddev->curr_mark_cnt - atomic_read(&mddev->recovery_active));
dt = ((jiffies - mddev->resync_mark) / HZ);
if (!dt) dt++;
db = resync - (mddev->resync_mark_cnt);
@@ -4688,8 +4688,9 @@ static void status_resync(struct seq_fil
 */
dt = ((jiffies - mddev->resync_mark) / HZ);
if (!dt) dt++;
-   db = resync - (mddev->resync_mark_cnt/2);
-   rt = (dt * ((unsigned long)(max_blocks-resync) / (db/100+1)))/100;
+   db = (mddev->curr_mark_cnt - atomic_read(&mddev->recovery_active))
+   - mddev->resync_mark_cnt;
+   rt = (dt/2 * ((unsigned long)(max_blocks-resync) / (db/100+1)))/100;
 
seq_printf(seq, " finish=%lu.%lumin", rt / 60, (rt % 60)/6);
 
@@ -5204,6 +5205,7 @@ void md_do_sync(mddev_t *mddev)
 
j += sectors;
if (j>1) mddev->curr_resync = j;
+   mddev->curr_mark_cnt = io_sectors;
if (last_check == 0)
/* this is the earliers that rebuilt will be
 * visible in /proc/mdstat

diff .prev/include/linux/raid/md_k.h ./include/linux/raid/md_k.h
--- .prev/include/linux/raid/md_k.h 2006-06-20 09:31:22.0 +1000
+++ ./include/linux/raid/md_k.h 2006-06-20 09:31:58.0 +1000
@@ -148,9 +148,10 @@ struct mddev_s
 
struct mdk_thread_s *thread;/* management thread */
struct mdk_thread_s *sync_thread;   /* doing resync or 
reconstruct */
-   sector_tcurr_resync;/* blocks scheduled */
+   sector_tcurr_resync;/* last block scheduled 
*/
unsigned long   resync_mark;/* a recent timestamp */
sector_tresync_mark_cnt;/* blocks written at 
resync_mark */
+   sector_tcurr_mark_cnt; /* blocks scheduled now 
*/
 
sector_tresync_max_sectors; /* may be set by 
personality */
 


Re: Raid5 reshape

2006-06-19 Thread Nigel J. Terry

Neil Brown wrote:

On Monday June 19, [EMAIL PROTECTED] wrote:
  
One comment - As I look at the rebuild, which is now over 20%, the time 
till finish makes no sense. It did make sense when the first reshape 
started. I guess your estimating / averaging algorithm doesn't work for 
a restarted reshape. A minor cosmetic issue - see below


Nigel
[EMAIL PROTECTED] ~]$ cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[4](S) hdb1[2]
  490223104 blocks super 0.91 level 5, 128k chunk, algorithm 2 [4/3] 
[UUU_]
  [>]  reshape = 22.7% (55742816/245111552) 
finish=5.8min speed=542211K/sec



Hmmm. I see.
This should fix that, but I don't expect you to interrupt your reshape
to try it.

Thanks,
NeilBrown

  

I have nothing better to do, I'll give it a go and let you know...


Re: Raid5 reshape

2006-06-20 Thread Nigel J. Terry

Nigel J. Terry wrote:

Well good news and bad news I'm afraid...

Well I would like to be able to tell you that the time calculation now 
works, but I can't. Here's why: when I rebooted with the newly built 
kernel, it hit the magic 21 reboots and hence decided to check the array 
for clean. This normally takes about 5-10 mins, but this time it took 
several hours, so I went to bed! I suspect that it was doing the full 
reshape or something similar at boot time.


Now I am not sure that this makes good sense in a normal environment. 
This could keep a server down for hours or days. I might suggest that if 
such work was required, the clean check is postponed till next boot and 
the reshape allowed to continue in the background.


Anyway the good news is that this morning, all is well, the array is 
clean and grown as can be seen below. However, if you look further below 
you will see the section from dmesg which still shows RIP errors, so I 
guess there is still something wrong, even though it looks like it is 
working. Let me know if I can provide any more information.


Once again, many thanks. All I need to do now is grow the ext3 filesystem...

Nigel

[EMAIL PROTECTED] ~]# mdadm --detail /dev/md0
/dev/md0:
   Version : 00.90.03
 Creation Time : Tue Apr 18 17:44:34 2006
Raid Level : raid5
Array Size : 735334656 (701.27 GiB 752.98 GB)
   Device Size : 245111552 (233.76 GiB 250.99 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 0
   Persistence : Superblock is persistent

   Update Time : Tue Jun 20 06:27:49 2006
 State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
 Spare Devices : 0

Layout : left-symmetric
Chunk Size : 128K

  UUID : 50e3173e:b5d2bdb6:7db3576b:644409bb
Events : 0.3366644

   Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       3       65        2      active sync   /dev/hdb1
       3      22        1        3      active sync   /dev/hdc1
[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid5] [raid4]
md0 : active raid5 sdb1[1] sda1[0] hdc1[3] hdb1[2]
 735334656 blocks level 5, 128k chunk, algorithm 2 [4/4] []

unused devices: <none>
[EMAIL PROTECTED] ~]#

But from dmesg:

md: Autodetecting RAID arrays.
md: autorun ...
md: considering sdb1 ...
md:  adding sdb1 ...
md:  adding sda1 ...
md:  adding hdc1 ...
md:  adding hdb1 ...
md: created md0
md: bind
md: bind
md: bind
md: bind
md: running: 
raid5: automatically using best checksumming function: generic_sse
  generic_sse:  6795.000 MB/sec
raid5: using function: generic_sse (6795.000 MB/sec)
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: reshape will continue
raid5: device sdb1 operational as raid disk 1
raid5: device sda1 operational as raid disk 0
raid5: device hdb1 operational as raid disk 2
raid5: allocated 4268kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:3 fd:1
disk 0, o:1, dev:sda1
disk 1, o:1, dev:sdb1
disk 2, o:1, dev:hdb1
...ok start reshape thread
md: syncing RAID array md0
md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
md: using maximum available idle IO bandwidth (but not more than 20 
KB/sec) for reconstruction.

md: using 128k window, over a total of 245111552 blocks.
Unable to handle kernel NULL pointer dereference at  RIP:
<>{stext+2145382632}
PGD 7c3f9067 PUD 7cb9e067 PMD 0
Oops: 0010 [1] SMP
CPU 0
Modules linked in: raid5 xor usb_storage video button battery ac lp 
parport_pc parport floppy nvram snd_intel8x0 snd_ac97_codec snd_ac97_bus 
snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device 
snd_pcm_oss snd_mixer_oss ehci_hcd ohci1394 ieee1394 sg snd_pcm uhci_hcd 
i2c_nforce2 i2c_core forcedeth ohci_hcd snd_timer snd soundcore 
snd_page_alloc dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd sata_nv 
libata sd_mod scsi_mod

Pid: 1432, comm: md0_reshape Not tainted 2.6.17-rc6 #1
RIP: 0010:[<>] <>{stext+2145382632}
RSP: :81007aa43d60  EFLAGS: 00010246
RAX: 81007cf72f20 RBX: 81007c682000 RCX: 0006
RDX:  RSI:  RDI: 81007cf72f20
RBP: 02090900 R08:  R09: 810037f497b0
R10: 000b44ffd564 R11: 8022c92a R12: 
R13: 0100 R14:  R15: 
FS:  0066d870() GS:80611000() knlGS:
CS:  0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2:  CR3: 7bebc000 CR4: 06e0
Process md0_reshape (pid: 1432, threadinfo 81007aa42000, task 
810037f497b0)

Stack: 803dce42  1d383600 
   

Re: Raid5 reshape

2006-06-20 Thread Neil Brown
On Tuesday June 20, [EMAIL PROTECTED] wrote:
> Nigel J. Terry wrote:
> 
> Well good news and bad news I'm afraid...
> 
> Well I would like to be able to tell you that the time calculation now 
> works, but I can't. Here's why: Why I rebooted with the newly built 
> kernel, it decided to hit the magic 21 reboots and hence decided to 
> check the array for clean. The normally takes about 5-10 mins, but this 
> time took several hours, so I went to bed! I suspect that it was doing 
> the full reshape or something similar at boot time.
> 

What "magic 21 reboots"??  md has no mechanism to automatically check
the array after N reboots or anything like that.  Or are you thinking
of the 'fsck' that does a full check every so-often?
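
That per-mount-count check is an ext2/ext3 property set at mkfs time,
nothing to do with md; assuming e2fsprogs is installed it can be inspected
and tuned with tune2fs, for example:

   tune2fs -l /dev/md0 | grep -i 'mount count'   # current and maximum mount counts
   tune2fs -c 50 /dev/md0                        # only force a full check every 50 mounts
   tune2fs -i 6m /dev/md0                        # or switch to a six-monthly interval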


> Now I am not sure that this makes good sense in a normal environment. 
> This could keep a server down for hours or days. I might suggest that if 
> such work was required, the clean check is postponed till next boot and 
> the reshape allowed to continue in the background.

An fsck cannot tell if there is a reshape happening, but the reshape
should notice the fsck and slow down to a crawl so the fsck can complete...

> 
> Anyway the good news is that this morning, all is well, the array is 
> clean and grown as can be seen below. However, if you look further below 
> you will see the section from dmesg which still shows RIP errors, so I 
> guess there is still something wrong, even though it looks like it is 
> working. Let me know if i can provide any more information.
> 
> Once again, many thanks. All I need to do now is grow the ext3 filesystem...
.

> ...ok start reshape thread
> md: syncing RAID array md0
> md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
> md: using maximum available idle IO bandwidth (but not more than 20 
> KB/sec) for reconstruction.
> md: using 128k window, over a total of 245111552 blocks.
> Unable to handle kernel NULL pointer dereference at  RIP:
> <>{stext+2145382632}
> PGD 7c3f9067 PUD 7cb9e067 PMD 0

> Process md0_reshape (pid: 1432, threadinfo 81007aa42000, task 
> 810037f497b0)
> Stack: 803dce42  1d383600 
>   
> 
> Call Trace: {md_do_sync+1307} 
> {thread_return+0}
>{thread_return+94} 
> {keventd_create_kthread+0}
>{md_thread+248} 

That looks very much like the bug that I already sent you a patch for!
Are you sure that the new kernel still had this patch?

I'm a bit confused by this

NeilBrown


raid5 reshape/resync

2007-11-24 Thread Nagilum

Hi,
I'm running 2.6.23.8 x86_64 using mdadm v2.6.4.
I was adding a disk (/dev/sdf) to an existing raid5 (/dev/sd[a-e] -> md0)
During that reshape (at around 4%) /dev/sdd reported read errors and  
went offline.
I replaced /dev/sdd with a new drive and tried to reassemble the array  
(/dev/sdd was shown as removed and now as spare).

Assembly worked but it would not run unless I use --force.
Since I'm always reluctant to use force I put the bad disk back in,  
this time as /dev/sdg . I re-added the drive and could run the array.  
The array started to resync (since the disk can be read until 4%) and  
then I marked the disk as failed. Now the array is "active, degraded,  
recovering":


nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.91.03
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
 Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Nov 24 10:10:46 2007
  State : active, degraded, recovering
 Active Devices : 5
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 1

 Layout : left-symmetric
 Chunk Size : 16K

 Reshape Status : 19% complete
  Delta Devices : 1, (5->6)

   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
 Events : 0.726347

Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       6       8       96        3      faulty spare rebuilding   /dev/sdg
       4       8       64        4      active sync   /dev/sde
       5       8       80        5      active sync   /dev/sdf

       7       8       48        -      spare   /dev/sdd

iostat:
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda 129.48  1498.01  1201.59   7520   6032
sdb 134.86  1498.01  1201.59   7520   6032
sdc 127.69  1498.01  1201.59   7520   6032
sdd   0.40 0.00 3.19  0 16
sde 111.55  1498.01  1201.59   7520   6032
sdf 117.73 0.00  1201.59  0   6032
sdg   0.00 0.00 0.00  0  0

What I find somewhat confusing/disturbing is that it does not appear to  
utilize /dev/sdd. What I see here could be explained by md doing a  
RAID5 resync from the 4 drives sd[a-c,e] to sd[a-c,e,f] but I would  
have expected it to use the new spare sdd for that. Also the speed is  
unusually low which seems to indicate a lot of seeking as if two  
operations are happening at the same time.
Also when I look at the data rates it looks more like the reshape is  
continuing even though one drive is missing (possible but risky).

Can someone relieve my doubts as to whether md does the right thing here?
Thanks,


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Raid5 reshape (Solved)

2006-06-24 Thread Nigel J. Terry

Neil

Well I did warn you that I was an idiot... :-) I have been attempting to
work out exactly what I did and what happened. All I have learned is
that I need to keep better notes.

Yes, the 21 mounts is a fsck, nothing to do with raid. However it is
still noteworthy that this took several hours to complete with the raid
also reshaping rather than the few minutes I have seen in the past. Some
kind of interaction there.

I think that the kernel I was using had both the fixes you had sent me
in it, but I honestly can't be sure - Sorry. In the past, that bug
caused it to fail immediately and the reshape to freeze. This appeared
to occur after the reshape, maybe a problem at the end of the reshape
process. Probably however I screwed up, and I have no way to retest.

Finally, just a note to say that the system continues to work just fine
and I am really impressed. Thanks again

Nigel



Re: raid5 reshape/resync

2007-11-25 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Sat, 24 Nov 2007 12:02:09 +0100
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: raid5 reshape/resync
  To: linux-raid@vger.kernel.org


Hi,
I'm running 2.6.23.8 x86_64 using mdadm v2.6.4.
I was adding a disk (/dev/sdf) to an existing raid5 (/dev/sd[a-e] -> md0)
During that reshape (at around 4%) /dev/sdd reported read errors and
went offline.
I replaced /dev/sdd with a new drive and tried to reassemble the array
(/dev/sdd was shown as removed and now as spare).
Assembly worked but it would not run unless I use --force.
Since I'm always reluctant to use force I put the bad disk back in,
this time as /dev/sdg . I re-added the drive and could run the array.
The array started to resync (since the disk can be read until 4%) and
then I marked the disk as failed. Now the array is "active, degraded,
recovering":

nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.91.03
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
 Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Nov 24 10:10:46 2007
  State : active, degraded, recovering
 Active Devices : 5
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 1

 Layout : left-symmetric
 Chunk Size : 16K

 Reshape Status : 19% complete
  Delta Devices : 1, (5->6)

   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
 Events : 0.726347

Number   Major   Minor   RaidDevice State
   0   800  active sync   /dev/sda
   1   8   161  active sync   /dev/sdb
   2   8   322  active sync   /dev/sdc
   6   8   963  faulty spare rebuilding   /dev/sdg
   4   8   644  active sync   /dev/sde
   5   8   805  active sync   /dev/sdf

   7   8   48-  spare   /dev/sdd

iostat:
Device:tpskB_read/skB_wrtn/skB_readkB_wrtn
sda 129.48  1498.01  1201.59   7520   6032
sdb 134.86  1498.01  1201.59   7520   6032
sdc 127.69  1498.01  1201.59   7520   6032
sdd   0.40 0.00 3.19  0 16
sde 111.55  1498.01  1201.59   7520   6032
sdf 117.73 0.00  1201.59  0   6032
sdg   0.00 0.00 0.00  0  0

What I find somewhat confusing/disturbing is that does not appear to
utilize /dev/sdd. What I see here could be explained by md doing a
RAID5 resync from the 4 drives sd[a-c,e] to sd[a-c,e,f] but I would
have expected it to use the new spare sdd for that. Also the speed is
unusually low which seems to indicate a lot of seeking as if two
operations are happening at the same time.
Also when I look at the data rates it looks more like the reshape is
continuing even though one drive is missing (possible but risky).
Can someone relief my doubts as to whether md does the right thing here?
Thanks,


- End message from [EMAIL PROTECTED] -

Ok, so the reshape tried to continue without the failed drive and  
after that resynced to the new spare.
Unfortunately the result is a mess. On top of the Raid5 I have  
dm-crypt and LVM.
Although dm-crypt and LVM don't appear to have a problem, the filesystems  
on top are a mess now.
I still have the failed drive, I can read the superblock from that  
drive and up to 4% from the beginning and probably backwards from the  
end towards that point.
So in theory it could be possible to reorder the stripe blocks which  
appear to have been messed up(?).
Unfortunately I'm not sure what exactly went wrong or what I did  
wrong. Can someone please give me a hint?

Thanks,
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: raid5 reshape/resync

2007-11-28 Thread Neil Brown
On Sunday November 25, [EMAIL PROTECTED] wrote:
> - Message from [EMAIL PROTECTED] -
>  Date: Sat, 24 Nov 2007 12:02:09 +0100
>  From: Nagilum <[EMAIL PROTECTED]>
> Reply-To: Nagilum <[EMAIL PROTECTED]>
>   Subject: raid5 reshape/resync
>To: linux-raid@vger.kernel.org
> 
> > Hi,
> > I'm running 2.6.23.8 x86_64 using mdadm v2.6.4.
> > I was adding a disk (/dev/sdf) to an existing raid5 (/dev/sd[a-e] -> md0)
> > During that reshape (at around 4%) /dev/sdd reported read errors and
> > went offline.

Sad.

> > I replaced /dev/sdd with a new drive and tried to reassemble the array
> > (/dev/sdd was shown as removed and now as spare).

There must be a step missing here.
Just because one drive goes offline, that doesn't mean that you need
to reassemble the array.  It should just continue with the reshape
until that is finished.  Did you shut the machine down, or did it crash,
or what?

> > Assembly worked but it would not run unless I use --force.

That suggests an unclean shutdown.  Maybe it did crash?


> > Since I'm always reluctant to use force I put the bad disk back in,
> > this time as /dev/sdg . I re-added the drive and could run the array.
> > The array started to resync (since the disk can be read until 4%) and
> > then I marked the disk as failed. Now the array is "active, degraded,
> > recovering":

It should have restarted the reshape from wherever it was up to, so
it should have hit the read error almost immediately.  Do you remember
where it started the reshape from?  If it restarted from the beginning
that would be bad.

Did you just "--assemble" all the drives or did you do something else?
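
(i.e. something along the lines of

   mdadm --assemble /dev/md0 /dev/sd[a-g]

possibly with --force, and with whatever device names were current at the time?)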

> >
> > What I find somewhat confusing/disturbing is that does not appear to
> > utilize /dev/sdd. What I see here could be explained by md doing a
> > RAID5 resync from the 4 drives sd[a-c,e] to sd[a-c,e,f] but I would
> > have expected it to use the new spare sdd for that. Also the speed is

md cannot recover to a spare while a reshape is happening.  It
completes the reshape, then does the recovery (as you discovered).

> > unusually low which seems to indicate a lot of seeking as if two
> > operations are happening at the same time.

Well reshape is always slow as it has to read from one part of the
drive and write to another part of the drive.

> > Also when I look at the data rates it looks more like the reshape is
> > continuing even though one drive is missing (possible but risky).

Yes, that is happening.

> > Can someone relief my doubts as to whether md does the right thing here?
> > Thanks,

I believe it is doing "the right thing".

> >
> - End message from [EMAIL PROTECTED] -
> 
> Ok, so the reshape tried to continue without the failed drive and  
> after that resynced to the new spare.

As I would expect.

> Unfortunately the result is a mess. On top of the Raid5 I have  

Hmm.  This I would not expect.

> dm-crypt and LVM.
> Although dmcrypt and LVM dont appear to have a problem the filesystems  
> on top are a mess now.

Can you be more specific about what sort of "mess" they are in?

NeilBrown


> I still have the failed drive, I can read the superblock from that  
> drive and up to 4% from the beginning and probably backwards from the  
> end towards that point.
> So in theory it could be possible to reorder the stripe blocks which  
> appears to have been messed up.(?)
> Unfortunately I'm not sure what exactly went wrong or what I did  
> wrong. Can someone please give me hint?
> Thanks,
> Alex.
> 
> 
> #_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
> #   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
> #  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
> # /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
> #   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #
> 
> 
> 
> 
> cakebox.homeunix.net - all the machine one needs..
> 


Re: raid5 reshape/resync

2007-12-01 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Thu, 29 Nov 2007 16:48:47 +1100
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: raid5 reshape/resync
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org


> Hi,
> I'm running 2.6.23.8 x86_64 using mdadm v2.6.4.
> I was adding a disk (/dev/sdf) to an existing raid5 (/dev/sd[a-e] -> md0)
> During that reshape (at around 4%) /dev/sdd reported read errors and
> went offline.


Sad.


> I replaced /dev/sdd with a new drive and tried to reassemble the array
> (/dev/sdd was shown as removed and now as spare).


There must be a step missing here.
Just because one drive goes offline, that  doesn't mean that you need
to reassemble the array.  It should just continue with the reshape
until that is finished.  Did you shut the machine down or did it crash
or what

> Assembly worked but it would not run unless I use --force.


That suggests an unclean shutdown.  Maybe it did crash?


I started the reshape and went out. When I came back the controller  
was beeping (indicating the erroneous disk). I tried to log on but I  
could not get in. The machine was responding to pings but that was  
about it (no ssh or xdm login worked). So I hard rebooted.
I booted into a rescue root; the /etc/mdadm/mdadm.conf didn't yet  
include the new disk, so the raid was missing one disk and was not started.
Since I didn't know exactly what was going on I --re-added sdf  
(the new disk) and tried to resume reshaping. A second into that the  
read failure on /dev/sdd was reported. So I stopped md0 and shut down  
to verify the read error with another controller.
After I had verified that I replaced /dev/sdd with a new drive and put  
in the broken drive as /dev/sdg, just in case.



> Since I'm always reluctant to use force I put the bad disk back in,
> this time as /dev/sdg . I re-added the drive and could run the array.
> The array started to resync (since the disk can be read until 4%) and
> then I marked the disk as failed. Now the array is "active, degraded,
> recovering":


It should have restarted the reshape from whereever it was up to, so
it should have hit the read error almost immediately.  Do you remember
where it started the reshape from?  If it restarted from the beginning
that would be bad.


It must have continued where it left off since the reshape position in  
all superblocks was at about 4%.



Did you just "--assemble" all the drives or did you do something else?


Sorry for being a bit inexact here, I didn't actually have to use
--assemble, when booting into the rescue root the raid came up with  
/dev/sdd and /dev/sdf removed. I just had to --re-add /dev/sdf
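
That is, just:

   mdadm /dev/md0 --re-add /dev/sdf

and nothing more.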



> unusually low which seems to indicate a lot of seeking as if two
> operations are happening at the same time.


Well reshape is always slow as it has to read from one part of the
drive and write to another part of the drive.


Actually it was resyncing at the minimum speed; I managed to crank
up the speed to >20MB/s by adjusting /sys/block/md0/md/sync_speed_min
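
Concretely, something along these lines (the value is in KB/sec):

   echo 20000 > /sys/block/md0/md/sync_speed_min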



> Can someone relieve my doubts as to whether md does the right thing here?
> Thanks,


I believe it is doing "the right thing".


>
- End message from [EMAIL PROTECTED] -

Ok, so the reshape tried to continue without the failed drive and
after that resynced to the new spare.


As I would expect.


Unfortunately the result is a mess. On top of the Raid5 I have


Hmm.  This I would not expect.


dm-crypt and LVM.
Although dm-crypt and LVM don't appear to have a problem, the filesystems
on top are a mess now.


Can you be more specific about what sort of "mess" they are in?


Sure.
So here is the vg-layout:
nas:~# lvdisplay vg01
  --- Logical volume ---
  LV Name                /dev/vg01/lv1
  VG Name                vg01
  LV UUID                4HmzU2-VQpO-vy5R-Wdys-PmwH-AuUg-W02CKS
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                512.00 MB
  Current LE             128
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:1

  --- Logical volume ---
  LV Name                /dev/vg01/lv2
  VG Name                vg01
  LV UUID                4e2ZB9-29Rb-dy4M-EzEY-cEIG-Nm1I-CPI0kk
  LV Write Access        read/write
  LV Status              available
  # open                 0
  LV Size                7.81 GB
  Current LE             2000
  Segments               1
  Allocation             inherit
  Read ahead sectors     0
  Block device           253:2

  --- Logical volume ---
  LV Name                /dev/vg01/lv3
  VG Name                vg01
  LV UUID                YQRd0X-5hF8-2dd3-GG4v-wQLH-WGH0-ntGgug
  LV Write Access        read/write
  LV Status              available
  #

Re: raid5 reshape/resync

2007-12-11 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Sat, 01 Dec 2007 15:48:17 +0100
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Re: raid5 reshape/resync
  To: Neil Brown <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



I'm not sure how to reorder things so it will be ok again, I'll ponder
about that while I try to recreate the situation using files and
losetup.


- End message from [EMAIL PROTECTED] -

Ok, I've recreated the problem in form of a semiautomatic testcase.
All necessary files (plus the old xfs_repair output) are at:
 http://www.nagilum.de/md/

I also added a readme: http://www.nagilum.de/md/readme.txt

After running the test.sh the created xfs filesystem on the raid
device is broken and (at least in my case) cannot be mounted anymore.
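
For anyone who wants to reproduce something similar without spare disks,
the loop-device scaffolding such a testcase is built on looks roughly like
this - this is not the actual start.sh, and the file sizes, loop numbers
and /dev/md9 are arbitrary:

   # build a throwaway 4-disk raid5 from plain files
   for i in 1 2 3 4 5; do
       dd if=/dev/zero of=/tmp/md-test-$i bs=1M count=64 2>/dev/null
       losetup /dev/loop$i /tmp/md-test-$i
   done
   mdadm --create /dev/md9 --level=5 --raid-devices=4 --chunk=16 \
         /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
   # then add a member and grow, failing a member mid-reshape to reproduce
   mdadm /dev/md9 --add /dev/loop5
   mdadm --grow /dev/md9 --raid-devices=5

The real scripts at the URL above additionally put dm-crypt and xfs on top.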


I hope this will help finding the problem.

Kind regards,
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



pgppWJZ6Ayex5.pgp
Description: PGP Digital Signature


Re: raid5 reshape/resync

2007-12-16 Thread Janek Kozicki
Nagilum said: (by the date of Tue, 11 Dec 2007 22:56:13 +0100)

> Ok, I've recreated the problem in form of a semiautomatic testcase.
> All necessary files (plus the old xfs_repair output) are at:
>   http://www.nagilum.de/md/

> After running the test.sh the created xfs filesystem on the raid  
> device is broken and (at least in my case) cannot be mounted anymore.

I think that you should file a bugreport, and provide there the
explanations you have put in there. An automated test case that leads
to xfs corruption is a neat snack for bug squashers ;-)

I wonder, however, where to report this - xfs or raid? Perhaps
cross-report to both places and write in the bug report that you are
not sure on which side the bug is.

best regards
-- 
Janek Kozicki |
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid5 reshape/resync

2007-12-18 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Sun, 16 Dec 2007 14:16:45 +0100
From: Janek Kozicki <[EMAIL PROTECTED]>
Reply-To: Janek Kozicki <[EMAIL PROTECTED]>
 Subject: Re: raid5 reshape/resync
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



Nagilum said: (by the date of Tue, 11 Dec 2007 22:56:13 +0100)


Ok, I've recreated the problem in form of a semiautomatic testcase.
All necessary files (plus the old xfs_repair output) are at:
  http://www.nagilum.de/md/



After running the test.sh the created xfs filesystem on the raid
device is broken and (at least in my case) cannot be mounted anymore.


I think that you should file a bugreport, and provide there the
explanations you have put in there. An automated test case that leads
to xfs corruption is a neat snack for bug squashers ;-)

I wonder, however, where to report this - xfs or raid? Perhaps
cross-report to both places and write in the bug report that you are
not sure on which side the bug is.


- End message from [EMAIL PROTECTED] -

This is an md/mdadm problem. xfs is merely used as a vehicle to show
the problem, which is also amplified by LUKS.

Where would I file this bug report? I thought this is the place?
I could also really use a way to fix that corruption. :(
Thanks,
Alex.

PS: yesterday I verified this bug on 2.6.23.9, will do 2.6.23.11 today.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



pgpu1mUvwteaE.pgp
Description: PGP Digital Signature


raid5 reshape bug with XFS

2006-11-04 Thread Bill Cizek

Hi,

I'm setting up a raid 5 system and I ran across a bug when reshaping an 
array with a mounted XFS filesystem on it.  This is under linux 2.6.18.2 
and mdadm 2.5.5


I have a test array with 3 10 GB disks and a fourth 10 GB spare disk, 
and a mounted xfs filesystem on it:


[EMAIL PROTECTED] $ mdadm --detail /dev/md4
/dev/md4:
   Version : 00.90.03
 Creation Time : Sat Nov  4 18:58:59 2006
Raid Level : raid5
Array Size : 20964480 (19.99 GiB 21.47 GB)
   Device Size : 10482240 (10.00 GiB 10.73 GB)
  Raid Devices : 3
 Total Devices : 4
Preferred Minor : 4
   Persistence : Superblock is persistent
[snip]

...I Grow it:

[EMAIL PROTECTED] $ mdadm -G /dev/md4 -n4
mdadm: Need to backup 384K of critical section..
mdadm: ... critical section passed.
[EMAIL PROTECTED] $ mdadm --detail /dev/md4
/dev/md4:
   Version : 00.91.03
 Creation Time : Sat Nov  4 18:58:59 2006
Raid Level : raid5
Array Size : 20964480 (19.99 GiB 21.47 GB)
   Device Size : 10482240 (10.00 GiB 10.73 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 4
   Persistence : Superblock is persistent
---

It goes along and reshapes fine (from /proc/mdstat):

md4 : active raid5 dm-67[3] dm-66[2] dm-65[1] dm-64[0]
      20964480 blocks super 0.91 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      [====>................]  reshape = 22.0% (2314624/10482240) finish=16.7min speed=8128K/sec



When the reshape completes, the full array size gets corrupted:
/proc/mdstat:
md4 : active raid5 dm-67[3] dm-66[2] dm-65[1] dm-64[0]
      31446720 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

- looks good, but-

[EMAIL PROTECTED] $ mdadm --detail /dev/md4
/dev/md4:
   Version : 00.90.03
 Creation Time : Sat Nov  4 18:58:59 2006
Raid Level : raid5
>>
>>Array Size : 2086592 (2038.03 MiB 2136.67 MB)
>>
   Device Size : 10482240 (10.00 GiB 10.73 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 4
   Persistence : Superblock is persistent

(2086592 != 31446720 -- Bad, much too small)

-
xfs_growfs /dev/md4 barfs horribly - something about reading past the 
end of the device.


If I unmount the XFS filesystem, things work ok:

[EMAIL PROTECTED] $ umount /dev/md4

[EMAIL PROTECTED] $ mdadm --detail /dev/md4
/dev/md4:
   Version : 00.90.03
 Creation Time : Sat Nov  4 18:58:59 2006
Raid Level : raid5
Array Size : 31446720 (29.99 GiB 32.20 GB)
   Device Size : 10482240 (10.00 GiB 10.73 GB)
  Raid Devices : 4
 Total Devices : 4
Preferred Minor : 4
   Persistence : Superblock is persistent

(31446720 == 31446720 -- Good)

If I remount the fs, I can use xfs_growfs with no ill effects.

It's a pretty easy work-around to not have the fs mounted during the 
resize, but it doesn't seem right for the array size to get borked like 
this. If there's anything I can provide to debug this let me know.


Thanks,
Bill





-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid5 reshape/resync - BUGREPORT

2007-12-18 Thread Janek Kozicki
> - Message from [EMAIL PROTECTED] -
Nagilum said: (by the date of Tue, 18 Dec 2007 11:09:38 +0100)

> >> Ok, I've recreated the problem in form of a semiautomatic testcase.
> >> All necessary files (plus the old xfs_repair output) are at:
> >>
> >>   http://www.nagilum.de/md/
> >
> >> After running the test.sh the created xfs filesystem on the raid
> >> device is broken and (at least in my case) cannot be mounted anymore.
> >
> > I think that you should file a bugreport

> - End message from [EMAIL PROTECTED] -
> 
> Where would I file this bug report? I thought this is the place?
> I could also really use a way to fix that corruption. :(

ouch. To be honest I subscribed here just a month ago, so I'm not
sure. But I haven't seen other bugreports here so far. 

I was expecting that there is some bugzilla?

-- 
Janek Kozicki |
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


PROBLEM: RAID5 reshape data corruption

2007-12-31 Thread Nagilum

Ok, since my previous thread didn't seem to attract much attention,
let me try again.
An interrupted RAID5 reshape will cause the md device in question to
contain one corrupt chunk per stripe if resumed in the wrong manner.
A testcase can be found at http://www.nagilum.de/md/ .
The first testcase can be initialized with "start.sh" the real test
can then be run with "test.sh". The first testcase also uses dm-crypt
and xfs to show the corruption.
The second testcase uses nothing but mdadm and "testpat" - a small
program to write and verify a simple testpattern designed to find
block data corruptions. Use "v2_start.sh && v2_test.sh" to run.
At the end it will point out all the wrong bytes on the md device.
I'm not just interested in a simple behaviour fix I'm also interested
in what actually happens and if possible a repair program for that
kind of data corruption.
The bug is architecture-agnostic. I first came across it using
2.6.23.8 on amd64 but I have verified it on 2.6.23.[8-12] and
2.6.24-rc[5,6] on ppc, always using mdadm 2.6.4.

The situation the bug first showed up was as follows:
1. A RAID5 reshape from 5->6 device was started.
2. After about 4% one disk failed, the machine appeared unresponsive  
and was rebooted.

3. A spare disk was added to the array.
4. The bad drive was re-added to the array in a different bay and the  
reshape resumed.

5. The drive failed again but the reshape continued.
6. The reshaped finished and after that the resync. The data after at  
about 4% on the md device is broken as described above.


Kind regards,
Alex.



#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #






cakebox.homeunix.net - all the machine one needs..



pgp41FEJ6D5Gy.pgp
Description: PGP Digital Signature


Raid5 Reshape Status + xfs_growfs = Success! (2.6.17.3)

2006-07-11 Thread Justin Piszcz

Neil,

It worked, echoing the 600 into the stripe width setting in /sys; however, how
come /dev/md3 says it is 0 MB when I type fdisk -l?


Is this normal?

Disk /dev/md0 doesn't contain a valid partition table

Disk /dev/md3: 0 MB, 0 bytes
2 heads, 4 sectors/track, 0 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md3 doesn't contain a valid partition table

Disk /dev/md2: 71.9 GB, 71954661376 bytes
2 heads, 4 sectors/track, 17567056 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Furthermore, the xfs_growfs worked beautifully!

p34:~# df -h
/dev/md3  2.2T  487G  1.8T  22% /raid5
p34:~# xfs_growfs /raid5
meta-data=/dev/md3               isize=256    agcount=32, agsize=18314368 blks
         =                       sectsz=4096  attr=0
data     =                       bsize=4096   blocks=586059776, imaxpct=25
         =                       sunit=128    swidth=768 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal               bsize=4096   blocks=32768, version=2
         =                       sectsz=4096  sunit=1 blks
realtime =none                   extsz=3145728 blocks=0, rtextents=0
data blocks changed from 586059776 to 683740288
p34:~# df -h
FilesystemSize  Used Avail Use% Mounted on
/dev/md3  2.6T  487G  2.1T  19% /raid5
p34:~# umount /raid5
p34:~# mount /raid5
p34:~# dmesg | tail -5
[4354159.367000]  disk 7, o:1, dev:sdc1
[4360850.548000] XFS mounting filesystem md3
[4360850.803000] Ending clean XFS mount for filesystem: md3
[4360868.121000] XFS mounting filesystem md3
[4360868.189000] Ending clean XFS mount for filesystem: md3

Very nice stuff.

Thanks Neil & XFS team for the information and help!

Justin.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid5 reshape bug with XFS

2006-11-05 Thread Neil Brown
On Saturday November 4, [EMAIL PROTECTED] wrote:
> Hi,
> 
> I'm setting up a raid 5 system and I ran across a bug when reshaping an 
> array with a mounted XFS filesystem on it.  This is under linux 2.6.18.2 
> and mdadm 2.5.5
> 
...
> [EMAIL PROTECTED] $ mdadm --detail /dev/md4
> /dev/md4:
> Version : 00.90.03
>   Creation Time : Sat Nov  4 18:58:59 2006
>  Raid Level : raid5
>  >>
>  >>Array Size : 2086592 (2038.03 MiB 2136.67 MB)
>  >>
> Device Size : 10482240 (10.00 GiB 10.73 GB)
>Raid Devices : 4
>   Total Devices : 4
> Preferred Minor : 4
> Persistence : Superblock is persistent
> 
> (2086592 != 31446720 -- Bad, much too small)


You have CONFIG_LBD=n don't you?

Thanks for the report.  This should fix it.  Please let me know if it does.

NeilBrown

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid5.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c    2006-11-03 15:11:52.000000000 +1100
+++ ./drivers/md/raid5.c        2006-11-06 09:55:20.000000000 +1100
@@ -3909,7 +3909,7 @@ static void end_reshape(raid5_conf_t *co
 	bdev = bdget_disk(conf->mddev->gendisk, 0);
 	if (bdev) {
 		mutex_lock(&bdev->bd_inode->i_mutex);
-		i_size_write(bdev->bd_inode, conf->mddev->array_size << 10);
+		i_size_write(bdev->bd_inode, (loff_t)conf->mddev->array_size << 10);
 		mutex_unlock(&bdev->bd_inode->i_mutex);
 		bdput(bdev);
 	}
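
Incidentally, the bogus 2086592 KiB reported above is exactly what a 32-bit
wrap of array_size<<10 produces, which is presumably why the cast matters
with CONFIG_LBD=n.  A quick shell arithmetic check:

   echo $(( (31446720 * 1024) % 4294967296 / 1024 ))   # prints 2086592
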
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid5 reshape bug with XFS

2006-11-05 Thread Bill Cizek

Neil Brown wrote:

On Saturday November 4, [EMAIL PROTECTED] wrote:
  

Hi,

I'm setting up a raid 5 system and I ran across a bug when reshaping an 
array with a mounted XFS filesystem on it.  This is under linux 2.6.18.2 
and mdadm 2.5.5

You have CONFIG_LBD=n don't you?
  

Yes,

I have CONFIG_LBD=n

...and the patch fixed the problem.

Side Note: I just converted 2 raid0 drives into a 4 drive raid5 array 
in-place, with relative ease.
I couldn't have done it without the work you (and I'm sure others) have 
done. Thanks.


-Bill


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid5 reshape bug with XFS

2006-11-06 Thread Neil Brown
On Sunday November 5, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> > On Saturday November 4, [EMAIL PROTECTED] wrote:
> >   
> >> Hi,
> >>
> >> I'm setting up a raid 5 system and I ran across a bug when reshaping an 
> >> array with a mounted XFS filesystem on it.  This is under linux 2.6.18.2 
> >> and mdadm 2.5.5
> > You have CONFIG_LBD=n don't you?
> >   
> Yes,
> 
> I have CONFIG_LBD=n
> 
> ...and the patch fixed the problem.

Cool thanks.
> 
> Side Note: I just converted 2 raid0 drives into a 4 drive raid5 array 
> in-place, with relative ease.
> I couldn't have done it without the work you (and I'm sure others) have 
> done. Thanks.

And without bug reports like yours others would have more problems.

Thanks.
NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Raid5 Reshape gone wrong, please help

2007-08-17 Thread Greg Nicholson
I was trying to resize a Raid 5 array of 4 500G drives to 5.  Kernel
version 2.6.23-rc3 was the kernel I STARTED on this.

  I added the device to the array :
mdadm --add /dev/md0 /dev/sdb1

Then I started to grow the array :
 mdadm --grow /dev/md0 --raid-devices=5

At this point the machine locked up.  Not good.

I ended up having to hard reboot.  Now, I have the following in dmesg :

md: md0: raid array is not clean -- starting background reconstruction
raid5: reshape_position too early for auto-recovery - aborting.
md: pers->run() failed ...

/proc/mdstat is :

Personalities : [raid6] [raid5] [raid4]
md0 : inactive sdf1[0] sdb1[4] sdc1[3] sdd1[2] sde1[1]
  2441918720 blocks super 0.91

unused devices: <none>


It doesn't look like it actually DID anything besides update the raid
count to 5 from 4. (I think.)

How do I do a manual recovery on this?


Examining the disks:

 mdadm -E /dev/sdb1
/dev/sdb1:
  Magic : a92b4efc
Version : 00.91.00
   UUID : a9a472d3:9586c602:9207b56b:a5185bd3
  Creation Time : Thu Dec 21 09:42:27 2006
 Raid Level : raid5
  Used Dev Size : 488383744 (465.76 GiB 500.10 GB)
 Array Size : 1953534976 (1863.04 GiB 2000.42 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

  Reshape pos'n : 0
  Delta Devices : 1 (4->5)

Update Time : Fri Aug 17 19:49:43 2007
  State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
   Checksum : c8ebb87b - correct
 Events : 0.2795

 Layout : left-symmetric
 Chunk Size : 256K

  Number   Major   Minor   RaidDevice State
this 4   8   174  active sync   /dev/sdb1

   0 0   8   810  active sync   /dev/sdf1
   1 1   8   651  active sync   /dev/sde1
   2 2   8   492  active sync   /dev/sdd1
   3 3   8   333  active sync   /dev/sdc1
   4 4   8   174  active sync   /dev/sdb1

 mdadm -E /dev/sdc1
/dev/sdc1:
  Magic : a92b4efc
Version : 00.91.00
   UUID : a9a472d3:9586c602:9207b56b:a5185bd3
  Creation Time : Thu Dec 21 09:42:27 2006
 Raid Level : raid5
  Used Dev Size : 488383744 (465.76 GiB 500.10 GB)
 Array Size : 1953534976 (1863.04 GiB 2000.42 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

  Reshape pos'n : 0
  Delta Devices : 1 (4->5)

Update Time : Fri Aug 17 19:49:43 2007
  State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
   Checksum : c8ebb889 - correct
 Events : 0.2795

 Layout : left-symmetric
 Chunk Size : 256K

  Number   Major   Minor   RaidDevice State
this 3   8   333  active sync   /dev/sdc1

   0 0   8   810  active sync   /dev/sdf1
   1 1   8   651  active sync   /dev/sde1
   2 2   8   492  active sync   /dev/sdd1
   3 3   8   333  active sync   /dev/sdc1
   4 4   8   174  active sync   /dev/sdb1

 mdadm -E /dev/sdd1
/dev/sdd1:
  Magic : a92b4efc
Version : 00.91.00
   UUID : a9a472d3:9586c602:9207b56b:a5185bd3
  Creation Time : Thu Dec 21 09:42:27 2006
 Raid Level : raid5
  Used Dev Size : 488383744 (465.76 GiB 500.10 GB)
 Array Size : 1953534976 (1863.04 GiB 2000.42 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

  Reshape pos'n : 0
  Delta Devices : 1 (4->5)

Update Time : Fri Aug 17 19:49:43 2007
  State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
   Checksum : c8ebb897 - correct
 Events : 0.2795

 Layout : left-symmetric
 Chunk Size : 256K

  Number   Major   Minor   RaidDevice State
this 2   8   492  active sync   /dev/sdd1

   0 0   8   810  active sync   /dev/sdf1
   1 1   8   651  active sync   /dev/sde1
   2 2   8   492  active sync   /dev/sdd1
   3 3   8   333  active sync   /dev/sdc1
   4 4   8   174  active sync   /dev/sdb1

/dev/sde1:
  Magic : a92b4efc
Version : 00.91.00
   UUID : a9a472d3:9586c602:9207b56b:a5185bd3
  Creation Time : Thu Dec 21 09:42:27 2006
 Raid Level : raid5
  Used Dev Size : 488383744 (465.76 GiB 500.10 GB)
 Array Size : 1953534976 (1863.04 GiB 2000.42 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

  Reshape pos'n : 0
  Delta Devices : 1 (4->5)

Update Time : Fri Aug 17 19:49:43 2007
  State : active
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
   Checksum : c8ebb8a5 - correct
 Events : 0.2795

 Layout : left-symmetric
 Chunk Size : 256K

  Number   Major   Minor   RaidDevice St

Help RAID5 reshape Oops / backup-file

2007-10-09 Thread Nagilum

Hi,
During the process of reshaping a Raid5 from 3 (/dev/sd[a-c]) to 5
devices (/dev/sd[a-e]) the system was accidentally shut down.
I know I was stupid, I should have used a --backup-file, but stupid me didn't.
Thanks for not rubbing it in any further. :(
Ok, here is what I have:

nas:~# uname -a
Linux nas 2.6.18-5-amd64 #1 SMP Thu Aug 30 01:14:54 UTC 2007 x86_64 GNU/Linux
nas:~# mdadm --version
mdadm - v2.5.6 - 9 November 2006
nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
 Version : 00.91.03
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0
 Persistence : Superblock is persistent

 Update Time : Mon Oct  8 23:59:27 2007
   State : active, degraded, Not Started
  Active Devices : 3
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 2

  Layout : left-symmetric
  Chunk Size : 16K

   Delta Devices : 2, (3->5)

UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
  Events : 0.470134

 Number   Major   Minor   RaidDevice State
0   800  active sync   /dev/sda
1   8   161  active sync   /dev/sdb
2   8   322  active sync   /dev/sdc
3   003  removed
4   004  removed

5   8   48-  spare   /dev/sdd
6   8   64-  spare   /dev/sde


nas:~# mdadm -E /dev/sd[a-e]
/dev/sda:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

   Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
   Delta Devices : 2 (3->5)

 Update Time : Mon Oct  8 23:59:27 2007
   State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0
Checksum : f425054d - correct
  Events : 0.470134

  Layout : left-symmetric
  Chunk Size : 16K

   Number   Major   Minor   RaidDevice State
this 0   800  active sync   /dev/sda

0 0   800  active sync   /dev/sda
1 1   8   161  active sync   /dev/sdb
2 2   8   322  active sync   /dev/sdc
3 3   8   643  active sync   /dev/sde
4 4   8   484  active sync   /dev/sdd
/dev/sdb:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

   Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
   Delta Devices : 2 (3->5)

 Update Time : Mon Oct  8 23:59:27 2007
   State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0
Checksum : f425055f - correct
  Events : 0.470134

  Layout : left-symmetric
  Chunk Size : 16K

   Number   Major   Minor   RaidDevice State
this 1   8   161  active sync   /dev/sdb

0 0   800  active sync   /dev/sda
1 1   8   161  active sync   /dev/sdb
2 2   8   322  active sync   /dev/sdc
3 3   8   643  active sync   /dev/sde
4 4   8   484  active sync   /dev/sdd
/dev/sdc:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

   Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
   Delta Devices : 2 (3->5)

 Update Time : Mon Oct  8 23:59:27 2007
   State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0
Checksum : f4250571 - correct
  Events : 0.470134

  Layout : left-symmetric
  Chunk Size : 16K

   Number   Major   Minor   RaidDevice State
this 2   8   322  active sync   /dev/sdc

0 0   800  active sync   /dev/sda
1 1   8   161  active sync   /dev/sdb
2 2   8   322  active sync   /dev/sdc
3 3   8   643  active sync   /dev/sde
4 4  

Bad drive discovered during raid5 reshape

2007-10-28 Thread Kyle Stuart
Hi,
I bought two new hard drives to expand my raid array today and
unfortunately one of them appears to be bad. The problem didn't arise
until after I attempted to grow the raid array. I was trying to expand
the array from 6 to 8 drives. I added both drives using mdadm --add
/dev/md1 /dev/sdb1 which completed, then mdadm --add /dev/md1 /dev/sdc1
which also completed. I then ran mdadm --grow /dev/md1 --raid-devices=8.
It passed the critical section, then began the grow process.

After a few minutes I started to hear unusual sounds from within the
case. Fearing the worst I tried to cat /proc/mdstat which resulted in no
output so I checked dmesg which showed that /dev/sdb1 was not working
correctly. After several minutes dmesg indicated that mdadm gave up and
the grow process stopped. After googling around I tried the solutions
that seemed most likely to work, including removing the new drives with
mdadm --remove --force /dev/md1 /dev/sd[bc]1 and rebooting after which I
ran mdadm -Af /dev/md1. The grow process restarted then failed almost
immediately. Trying to mount the drive gives me a reiserfs replay
failure and suggests running fsck. I don't dare fsck the array since
I've already messed it up so badly. Is there any way to go back to the
original working 6 disc configuration with minimal data loss? Here's
where I'm at right now, please let me know if I need to include any
additional information.

# uname -a
Linux nas 2.6.22-gentoo-r5 #1 SMP Thu Aug 23 16:59:47 MDT 2007 x86_64
AMD Athlon(tm) 64 Processor 3500+ AuthenticAMD GNU/Linux

# mdadm --version
mdadm - v2.6.2 - 21st May 2007

# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 hdb1[0] sdb1[8](F) sda1[5] sdf1[4] sde1[3] sdg1[2]
sdd1[1]
      1220979520 blocks super 0.91 level 5, 64k chunk, algorithm 2 [8/6] [UUUUUU__]

unused devices: <none>

# mdadm --detail --verbose /dev/md1
/dev/md1:
Version : 00.91.03
  Creation Time : Sun Apr  8 19:48:01 2007
 Raid Level : raid5
 Array Size : 1220979520 (1164.42 GiB 1250.28 GB)
  Used Dev Size : 244195904 (232.88 GiB 250.06 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Mon Oct 29 00:53:21 2007
  State : clean, degraded
 Active Devices : 6
Working Devices : 6
 Failed Devices : 1
  Spare Devices : 0

 Layout : left-symmetric
 Chunk Size : 64K

  Delta Devices : 2, (6->8)

   UUID : 56e7724e:9a5d0949:ff52889f:ac229049
 Events : 0.487460

Number   Major   Minor   RaidDevice State
   0   3   650  active sync   /dev/hdb1
   1   8   491  active sync   /dev/sdd1
   2   8   972  active sync   /dev/sdg1
   3   8   653  active sync   /dev/sde1
   4   8   814  active sync   /dev/sdf1
   5   815  active sync   /dev/sda1
   6   006  removed
   8   8   177  faulty spare rebuilding   /dev/sdb1

#dmesg

md: md1 stopped.
md: unbind
md: export_rdev(hdb1)
md: unbind
md: export_rdev(sdc1)
md: unbind
md: export_rdev(sdb1)
md: unbind
md: export_rdev(sda1)
md: unbind
md: export_rdev(sdf1)
md: unbind
md: export_rdev(sde1)
md: unbind
md: export_rdev(sdg1)
md: unbind
md: export_rdev(sdd1)
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: md1 stopped.
md: unbind
md: export_rdev(hdb1)
md: unbind
md: export_rdev(sdc1)
md: unbind
md: export_rdev(sdb1)
md: unbind
md: export_rdev(sda1)
md: unbind
md: export_rdev(sdf1)
md: unbind
md: export_rdev(sde1)
md: unbind
md: export_rdev(sdg1)
md: unbind
md: export_rdev(sdd1)
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: kicking non-fresh sdc1 from array!
md: unbind
md: export_rdev(sdc1)
raid5: reshape will continue
raid5: device hdb1 operational as raid disk 0
raid5: device sdb1 operational as raid disk 7
raid5: device sda1 operational as raid disk 5
raid5: device sdf1 operational as raid disk 4
raid5: device sde1 operational as raid disk 3
raid5: device sdg1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: allocated 8462kB for md1
raid5: raid level 5 set md1 active with 7 out of 8 devices, algorithm 2
RAID5 conf printout:
 --- rd:8 wd:7
 disk 0, o:1, dev:hdb1
 disk 1, o:1, dev:sdd1
 disk 2, o:1, dev:sdg1
 disk 3, o:1, dev:sde1
 disk 4, o:1, dev:sdf1
 disk 5, o:1, dev:sda1
 disk 7, o:1, dev:sdb1
...ok start reshape thread
md: reshape of RAID array md1
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 20
KB/sec) for reshape.
md: using 128k window, over a total of 244195904 blocks.
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd 35/00:00:3f:42:02/00:04:00:00:00/e0 tag 0 cdb 0x0 data
524288 out
 res 40/00:00:00:00:00/00:00:00:00:00/00 Em

Re: raid5 reshape/resync - BUGREPORT/PROBLEM

2007-12-19 Thread Nagilum

- Message from [EMAIL PROTECTED] -

- Message from [EMAIL PROTECTED] -

Nagilum said: (by the date of Tue, 18 Dec 2007 11:09:38 +0100)


>> Ok, I've recreated the problem in form of a semiautomatic testcase.
>> All necessary files (plus the old xfs_repair output) are at:
>>
>>   http://www.nagilum.de/md/
>
>> After running the test.sh the created xfs filesystem on the raid
>> device is broken and (at least in my case) cannot be mounted anymore.
>
> I think that you should file a bugreport



- End message from [EMAIL PROTECTED] -

Where would I file this bug report? I thought this is the place?
I could also really use a way to fix that corruption. :(


ouch. To be honest I subscribed here just a month ago, so I'm not
sure. But I haven't seen other bugreports here so far.

I was expecting that there is some bugzilla?


Not really, I'm afraid. At least I'm not aware of anything like that for vanilla.

Anyway, I just verified the bug on 2.6.23.11 and 2.6.24-rc5-git4.
I originally came across the bug on amd64, while I'm now using a PPC750
machine to verify it, so it's an architecture-independent bug
(but that was to be expected).
I also prepared a different version of the testcase "v2_start.sh" and  
"v2_test.sh". This will print out all the wrong bytes (longs to be  
exact) + location.

It shows the data is there, but scattered. :(
Kind regards,
Alex.

- End message from [EMAIL PROTECTED] -




#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



pgptVVVnLvuof.pgp
Description: PGP Digital Signature


Re: PROBLEM: RAID5 reshape data corruption

2008-01-03 Thread Neil Brown
On Monday December 31, [EMAIL PROTECTED] wrote:
> Ok, since my previous thread didn't seem to attract much attention,
> let me try again.

Thank you for your report and your patience.

> An interrupted RAID5 reshape will cause the md device in question to
> contain one corrupt chunk per stripe if resumed in the wrong manner.
> A testcase can be found at http://www.nagilum.de/md/ .
> The first testcase can be initialized with "start.sh" the real test
> can then be run with "test.sh". The first testcase also uses dm-crypt
> and xfs to show the corruption.

It looks like this can be fixed with the patch:

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid5.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c    2008-01-04 09:20:54.000000000 +1100
+++ ./drivers/md/raid5.c        2008-01-04 09:21:05.000000000 +1100
@@ -2865,7 +2865,7 @@ static void handle_stripe5(struct stripe
 		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);
 	}
 
-	if (s.expanding && s.locked == 0)
+	if (s.expanding && s.locked == 0 && s.req_compute == 0)
 		handle_stripe_expansion(conf, sh, NULL);
 
 	if (sh->ops.count)


With this patch in place, the v2 test only reports errors after the end
of the original array, as you would expect (the new space is
initialised to 0).

> I'm not just interested in a simple behaviour fix I'm also interested
> in what actually happens and if possible a repair program for that
> kind of data corruption.

What happens is that when reshape happens while a device is missing,
the data on that device should be computed from the other data devices
and parity.  However because of the above bug, the data is copied into
the new layout before the compute is complete.  This means that the
data that was on that device is really lost beyond recovery.

I'm really sorry about that, but there is nothing that can be done to
recover the lost data.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: PROBLEM: RAID5 reshape data corruption

2008-01-05 Thread Nagilum

- Message from [EMAIL PROTECTED] -
 Date: Fri, 4 Jan 2008 09:37:24 +1100
 From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
  Subject: Re: PROBLEM: RAID5 reshape data corruption
   To: Nagilum <[EMAIL PROTECTED]>
   Cc: linux-raid@vger.kernel.org, Dan Williams
<[EMAIL PROTECTED]>, "H. Peter Anvin" <[EMAIL PROTECTED]>



I'm not just interested in a simple behaviour fix I'm also interested
in what actually happens and if possible a repair program for that
kind of data corruption.


What happens is that when reshape happens while a device is missing,
the data on that device should be computed from the other data devices
and parity.  However because of the above bug, the data is copied into
the new layout before the compute is complete.  This means that the
data that was on that device is really lost beyond recovery.

I'm really sorry about that, but there is nothing that can be done to
recover the lost data.


Thanks a lot Neil!
I can confirm your findings: the data in the chunks is the data from
the broken device. Now to my particular case:

I still have the old disk and I haven't touched the array since.
I just ran dd_rescue -r (reverse) on the old disk and, as I expected,
most of it (>99%) is still readable. So what I want to do is read the
chunks from that disk - starting at the end down to the 4% point where  
the reshape was interrupted due to the disk read error - and replace  
the chunks on md0.

That should restore most of the data.
Now in order to do so I need to know how to calculate the different  
positions of the chunks.

So for the old disk I have:
nas:~# mdadm -E /dev/sdg
/dev/sdg:
  Magic : a92b4efc
Version : 00.91.00
   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
 Array Size : 2441543360 (2328.44 GiB 2500.14 GB)
   Raid Devices : 6
  Total Devices : 7
Preferred Minor : 0

  Reshape pos'n : 118360960 (112.88 GiB 121.20 GB)
  Delta Devices : 1 (5->6)

Update Time : Fri Nov 23 20:05:50 2007
  State : active
 Active Devices : 6
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 1
   Checksum : 9a8358c4 - correct
 Events : 0.677965

 Layout : left-symmetric
 Chunk Size : 16K

  Number   Major   Minor   RaidDevice State
this 3   8   963  active sync   /dev/sdg

   0 0   800  active sync   /dev/sda
   1 1   8   161  active sync   /dev/sdb
   2 2   8   322  active sync   /dev/sdc
   3 3   8   963  active sync   /dev/sdg
   4 4   8   644  active sync   /dev/sde
   5 5   8   805  active sync   /dev/sdf
   6 6   8   486  spare   /dev/sdd

the current array is:

nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.90.03
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
 Array Size : 2441543360 (2328.44 GiB 2500.14 GB)
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
   Raid Devices : 6
  Total Devices : 6
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Sat Jan  5 17:53:54 2008
  State : clean
 Active Devices : 6
Working Devices : 6
 Failed Devices : 0
  Spare Devices : 0

 Layout : left-symmetric
 Chunk Size : 16K

   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
 Events : 0.986918

Number   Major   Minor   RaidDevice State
   0   800  active sync   /dev/sda
   1   8   161  active sync   /dev/sdb
   2   8   322  active sync   /dev/sdc
   3   8   483  active sync   /dev/sdd
   4   8   644  active sync   /dev/sde
   5   8   805  active sync   /dev/sdf

At the moment I'm thinking about writing a small perl program that  
will generate me a shell script or makefile containing dd commands  
that will copy the chunks from the drive to /dev/md0. I don't care if  
that will be dog slow as long as I get most of my data back. (I'd  
probably go forward instead of backward to take advantage of the  
readahead, after I've determined the exact start chunk.)

For that I need to know one more thing.
Used Dev Size is 488308672k for md0 as well as the disk, 16k chunk size.
488308672/16 = 30519292.00
so the first dd would look like:
 dd if=/dev/sdg of=/dev/md0 bs=16k count=1 skip=30519291 seek=X

The big question now being how to calculate X.
Since I have a working testcase I can do a lot of testing before  
touching the real thing. The formula to get X will probably contain a  
5 for the 5(+1) devices the raid spans now, a 4 for the 4

Re: PROBLEM: RAID5 reshape data corruption

2008-01-06 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Sun, 06 Jan 2008 00:31:46 +0100
From: Nagilum <[EMAIL PROTECTED]>


At the moment I'm thinking about writing a small perl program that
will generate me a shell script or makefile containing dd commands
that will copy the chunks from the drive to /dev/md0. I don't care if
that will be dog slow as long as I get most of my data back. (I'd
probably go forward instead of backward to take advantage of the
readahead, after I've determined the exact start chunk.)
For that I need to know one more thing.
Used Dev Size is 488308672k for md0 as well as the disk, 16k chunk size.
488308672/16 = 30519292.00
so the first dd would look like:
 dd if=/dev/sdg of=/dev/md0 bs=16k count=1 skip=30519291 seek=X

The big question now being how to calculate X.
Since I have a working testcase I can do a lot of testing before
touching the real thing. The formula to get X will probably contain a
5 for the 5(+1) devices the raid spans now, a 4 for the 4(+1) devices
the raid spanned before the reshape, a 3 for the device number of the
disk that failed and of course the skip/current chunk number.
Can you help me come up with it?
Thanks again for looking into the whole issue.

- End message from [EMAIL PROTECTED] -

Ok, the spare time over the weekend allowed me to make some headway.
I'm not sure if the attachment will make it through to the ML so I  
uploaded the perl script to: http://www.nagilum.de/md/rdrep.pl
First tests show already promising results although I seem to miss the  
start of the error corruption. Anyway unlike with the testcase at the  
real array I have to start after the area that is unreadable. I have  
already determined that last Friday.

Anyway I would appreciate it if someone could have a look over the script.
I'll probably change it a little bit and make every other dd run via  
exec instead of system to use some parallelism. (I guess the overhead  
for running dd will take about as much time as the transfer itself)

Thanks again,
Alex


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



rdrep.pl
Description: Perl program


pgpqtVehc384R.pgp
Description: PGP Digital Signature


Re: PROBLEM: RAID5 reshape data corruption

2008-01-11 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Sun, 06 Jan 2008 22:35:46 +0100
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Re: PROBLEM: RAID5 reshape data corruption
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: Neil Brown <[EMAIL PROTECTED]>, linux-raid@vger.kernel.org, Dan  
Williams <[EMAIL PROTECTED]>, "H. Peter Anvin" <[EMAIL PROTECTED]>




- Message from [EMAIL PROTECTED] -
Date: Sun, 06 Jan 2008 00:31:46 +0100
From: Nagilum <[EMAIL PROTECTED]>


At the moment I'm thinking about writing a small perl program that
will generate me a shell script or makefile containing dd commands
that will copy the chunks from the drive to /dev/md0. I don't care if
that will be dog slow as long as I get most of my data back. (I'd
probably go forward instead of backward to take advantage of the
readahead, after I've determined the exact start chunk.)
For that I need to know one more thing.
Used Dev Size is 488308672k for md0 as well as the disk, 16k chunk size.
488308672/16 = 30519292.00
so the first dd would look like:
dd if=/dev/sdg of=/dev/md0 bs=16k count=1 skip=30519291 seek=X

The big question now being how to calculate X.
Since I have a working testcase I can do a lot of testing before
touching the real thing. The formula to get X will probably contain a
5 for the 5(+1) devices the raid spans now, a 4 for the 4(+1) devices
the raid spanned before the reshape, a 3 for the device number of the
disk that failed and of course the skip/current chunk number.
Can you help me come up with it?
Thanks again for looking into the whole issue.

- End message from [EMAIL PROTECTED] -

Ok, the spare time over the weekend allowed me to make some headway.
I'm not sure if the attachment will make it through to the ML so I
uploaded the perl script to: http://www.nagilum.de/md/rdrep.pl
First tests show already promising results although I seem to miss the
start of the error corruption. Anyway unlike with the testcase at the
real array I have to start after the area that is unreadable. I have
already determined that last Friday.
Anyway I would appreciate it if someone could have a look over the script.
I'll probably change it a little bit and make every other dd run via
exec instead of system to use some parallelism. (I guess the overhead
for running dd will take about as much time as the transfer itself)

- End message from [EMAIL PROTECTED] -

I just want to give a quick update.
The program ran for about one and a half days and it looks good; the
directories and files appear ok. I'll do some work on it this evening
and see if I can restore some more blocks before running xfs_repair.

Kind regards,


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



pgpynIpmENFK6.pgp
Description: PGP Digital Signature


Re: Raid5 Reshape Status + xfs_growfs = Success! (2.6.17.3)

2006-07-17 Thread Neil Brown
On Tuesday July 11, [EMAIL PROTECTED] wrote:
> Neil,
> 
> It worked, echo'ing the 600 > to the stripe width in /sys, however, how 
> come /dev/md3 says it is 0 MB when I type fdisk -l?
> 
> Is this normal?

Yes.  The 'cylinders' number is limited to 16 bits.  For your 2.2TB
array, the number of 'cylinders' (given 2 heads and 4 sectors) would
be about 500 million, which doesn't fit into 16 bits.
> 
> Furthermore, the xfs_growfs worked beautifully!
> 

Excellent!

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape Status + xfs_growfs = Success! (2.6.17.3)

2006-07-21 Thread Jan Engelhardt

On Jul 11 2006 12:03, Justin Piszcz wrote:

>Subject: Raid5 Reshape Status + xfs_growfs = Success! (2.6.17.3)
 
Now we just need shrink-reshaping and xfs_shrinkfs... :)


Jan Engelhardt
-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-18 Thread Neil Brown
On Friday August 17, [EMAIL PROTECTED] wrote:
> I was trying to resize a Raid 5 array of 4 500G drives to 5.  Kernel
> version 2.6.23-rc3 was the kernel I STARTED on this.
> 
>   I added the device to the array :
> mdadm --add /dev/md0 /dev/sdb1
> 
> Then I started to grow the array :
>  mdadm --grow /dev/md0 --raid-devices=5
> 
> At this point the machine locked up.  Not good.

No, not good.  But it shouldn't be fatal.

> 
> I ended up having to hard reboot.  Now, I have the following in dmesg :
> 
> md: md0: raid array is not clean -- starting background reconstruction
> raid5: reshape_position too early for auto-recovery - aborting.
> md: pers->run() failed ...

Looks like you crashed during the 'critical' period.

> 
> /proc/mdstat is :
> 
> Personalities : [raid6] [raid5] [raid4]
> md0 : inactive sdf1[0] sdb1[4] sdc1[3] sdd1[2] sde1[1]
>   2441918720 blocks super 0.91
> 
> unused devices: <none>
> 
> 
> It doesn't look like it actually DID anything besides update the raid
> count to 5 from 4. (I think.)
> 
> How do I do a manual recovery on this?

Simply use mdadm to assemble the array:

  mdadm -A /dev/md0 /dev/sd[bcdef]1

It should notice that the kernel needs help, and will provide
that help.
Specifically, when you started the 'grow', mdadm copied the first few
stripes into unused space in the new device.  When you re-assemble, it
will copy those stripes back into the new layout, then let the kernel
do the rest.

Please let us know how it goes.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-18 Thread Greg Nicholson
On 8/18/07, Neil Brown <[EMAIL PROTECTED]> wrote:
> On Friday August 17, [EMAIL PROTECTED] wrote:
> > I was trying to resize a Raid 5 array of 4 500G drives to 5.  Kernel
> > version 2.6.23-rc3 was the kernel I STARTED on this.
> >
> >   I added the device to the array :
> > mdadm --add /dev/md0 /dev/sdb1
> >
> > Then I started to grow the array :
> >  mdadm --grow /dev/md0 --raid-devices=5
> >
> > At this point the machine locked up.  Not good.
>
> No, not good.  But it shouldn't be fatal.

Well, that was my thought as well.
>
> >
> > I ended up having to hard reboot.  Now, I have the following in dmesg :
> >
> > md: md0: raid array is not clean -- starting background reconstruction
> > raid5: reshape_position too early for auto-recovery - aborting.
> > md: pers->run() failed ...
>
> Looks like you crashed during the 'critical' period.
>
> >
> > /proc/mdstat is :
> >
> > Personalities : [raid6] [raid5] [raid4]
> > md0 : inactive sdf1[0] sdb1[4] sdc1[3] sdd1[2] sde1[1]
> >   2441918720 blocks super 0.91
> >
> > unused devices: <none>
> >
> >
> > It doesn't look like it actually DID anything besides update the raid
> > count to 5 from 4. (I think.)
> >
> > How do I do a manual recovery on this?
>
> Simply use mdadm to assemble the array:
>
>   mdadm -A /dev/md0 /dev/sd[bcdef]1
>
> It should notice that the kernel needs help, and will provide
> that help.
> Specifically, when you started the 'grow', mdadm copied the first few
> stripes into unused space in the new device.  When you re-assemble, it
> will copy those stripes back into the new layout, then let the kernel
> do the rest.
>
> Please let us know how it goes.
>
> NeilBrown
>


I had already tried to assemble it by hand, before I basically said...
WAIT.  Ask for help.  Don't screw up more. :)

But I tried again:


[EMAIL PROTECTED] {  }$ mdadm -A /dev/md0 /dev/sd[bcdef]1
mdadm: device /dev/md0 already active - cannot assemble it
[EMAIL PROTECTED] { ~ }$ mdadm -S /dev/md0
mdadm: stopped /dev/md0
[EMAIL PROTECTED] { ~ }$ mdadm -A /dev/md0 /dev/sd[bcdef]1
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument


Dmesg shows:

md: md0 stopped.
md: unbind
md: export_rdev(sdf1)
md: unbind
md: export_rdev(sdb1)
md: unbind
md: export_rdev(sdc1)
md: unbind
md: export_rdev(sdd1)
md: unbind
md: export_rdev(sde1)
md: md0 stopped.
md: bind
md: bind
md: bind
md: bind
md: bind
md: md0: raid array is not clean -- starting background reconstruction
raid5: reshape_position too early for auto-recovery - aborting.
md: pers->run() failed ...
md: md0 stopped.
md: unbind
md: export_rdev(sdf1)
md: unbind
md: export_rdev(sdb1)
md: unbind
md: export_rdev(sdc1)
md: unbind
md: export_rdev(sdd1)
md: unbind
md: export_rdev(sde1)
md: md0 stopped.
md: bind
md: bind
md: bind
md: bind
md: bind
md: md0: raid array is not clean -- starting background reconstruction
raid5: reshape_position too early for auto-recovery - aborting.
md: pers->run() failed ...

And the raid stays in an inactive state.

Using mdadm v2.6.2 and kernel 2.6.23-rc3, although I can push back to
earlier versions easily if it would help.

I know that sdb1 is the new device.  When mdadm ran, it said the
critical section was 3920k (approximately).  When I didn't get a
response for five minutes, and there wasn't ANY disk activity, I
booted the box.

Based on your message and the man page, it sounds like mdadm should
have placed something on sdb1.  So... Trying to be non-destructive,
but still gather information:

 dd if=/dev/sdb1 of=/tmp/test bs=1024k count=1000
 hexdump /tmp/test
 0000000 0000 0000 0000 0000 0000 0000 0000 0000
 *
 3e800000

dd if=/dev/sdb1 of=/tmp/test bs=1024k count=1000 skip=999
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 35.0176 seconds, 29.9 MB/s
[EMAIL PROTECTED] { ~ }$ hexdump /tmp/test
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
3e800000

That looks to me like the first 2 gig is completely empty on the
drive.  I really don't think it actually started to do anything.

Do you have further suggestions on where to go now?

Oh, and thank you very much for your help.  Most of the data on this
array I can stand to lose... It's not critical, but there are some of
my photographs on this that my backup is out of date on.  I can
destroy it all and start over, but really want to try to recover this
if it's possible.  For that matter, if it didn't actually start
rewriting the stripes, is there anyway to push it back down to 4 disks
to recover ?
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-19 Thread Neil Brown
On Saturday August 18, [EMAIL PROTECTED] wrote:
> 
> That looks to me like the first 2 gig is completely empty on the
> drive.  I really don't think it actually started to do anything.

The backup data is near the end of the device.  If you look at the
last 2 gig you should see something.
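
One way to eyeball that area (the blockdev/hexdump invocation is only a
sketch; adjust the device name to your setup):

   DEV=/dev/sdb1
   MB=$(( $(blockdev --getsize64 $DEV) / 1048576 ))
   dd if=$DEV bs=1M skip=$(( MB - 2048 )) 2>/dev/null | hexdump | head -40

On a freshly added member most of that region is zeros, which hexdump
collapses to a single '*', so any backup metadata near the end stands out.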

> 
> Do you have further suggestions on where to go now?

Maybe an 'strace' of "mdadm -A " might show me something.
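
For example (the trace file name is arbitrary):

   strace -f -o /tmp/mdadm-assemble.trace mdadm -A /dev/md0 /dev/sd[bcdef]1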

If you feel like following the code, Assemble (in Assemble.c) should
call Grow_restart.
This should look in /dev/sdb1 (which is already open in 'fdlist') by
calling 'load_super'.  It should then seek to 8 sectors before the
superblock (or close to there) and read a secondary superblock which
describes the backup data.
If this looks good, it seeks to where the backup data is (which is
towards the end of the device) and reads that.  It uses this to
restore the 'critical section', and then updates the superblock on all
devices.

As you aren't getting the messages 'restoring critical section',
something is going wrong before there.  It should fail:
  /dev/md0: Failed to restore critical section for reshape, sorry.
but I can see that there is a problem with the error return from
'Grow_restart'.  I'll get that fixed.


> 
> Oh, and thank you very much for your help.  Most of the data on this
> array I can stand to loose... It's not critical, but there are some of
> my photographs on this that my backup is out of date on.  I can
> destroy it all and start over, but really want to try to recover this
> if it's possible.  For that matter, if it didn't actually start
> rewriting the stripes, is there anyway to push it back down to 4 disks
> to recover ?

You could always just recreate the array:

 mdadm -C /dev/md0 -l5 -n4 -c256 --assume-clean /dev/sdf1 /dev/sde1  \
/dev/sdd1 /dev/sdc1

and make sure the data looks good (which it should).

I'd still like to know what the problem is, though.

Thanks,
NeilBeon
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-19 Thread Greg Nicholson
On 8/19/07, Neil Brown <[EMAIL PROTECTED]> wrote:
> On Saturday August 18, [EMAIL PROTECTED] wrote:
> >
> > That looks to me like the first 2 gig is completely empty on the
> > drive.  I really don't think it actually started to do anything.
>
> The backup data is near the end of the device.  If you look at the
> last 2 gig you should see something.
>

I figured something like that after I started thinking about it...
That device is currently offline while I do some DD's to new devices.

> >
> > Do you have further suggestions on where to go now?
>
> Maybe an 'strace' of "mdadm -A " might show me something.
>
> If you feel like following the code, Assemble (in Assemble.c) should
> call Grow_restart.
> This should look in /dev/sdb1 (which is already open in 'fdlist') by
> calling 'load_super'.  It should then seek to 8 sectors before the
> superblock (or close to there) and read a secondary superblock which
> describes the backup data.
> If this looks good, it seeks to where the backup data is (which is
> towards the end of the device) and reads that.  It uses this to
> restore the 'critical section', and then updates the superblock on all
> devices.
>
> As you aren't getting the messages 'restoring critical section',
> something is going wrong before there.  It should fail:
>   /dev/md0: Failed to restore critical section for reshape, sorry.
> but I can see that there is a problem with the error return from
> 'Grow_restart'.  I'll get that fixed.
>
>
> >
> > Oh, and thank you very much for your help.  Most of the data on this
> > array I can stand to loose... It's not critical, but there are some of
> > my photographs on this that my backup is out of date on.  I can
> > destroy it all and start over, but really want to try to recover this
> > if it's possible.  For that matter, if it didn't actually start
> > rewriting the stripes, is there anyway to push it back down to 4 disks
> > to recover ?
>
> You could always just recreate the array:
>
>  mdadm -C /dev/md0 -l5 -n4 -c256 --assume-clean /dev/sdf1 /dev/sde1  \
> /dev/sdd1 /dev/sdc1
>
> and make sure the data looks good (which it should).
>
> I'd still like to know what the problem is, though.
>
> Thanks,
> NeilBrown
>

My current plan of attack, which I've been proceeding upon for the
last 24 hours... I'm DDing the original drives to new devices.  Once I
have copies of the drives, I'm going to try to recreate the array as a
4 device array.  Hopefully, at that point, the raid will come up, LVM
will initialize, and it's time to saturate the GigE offloading
EVERYTHING.
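(For reference, imaging a member disk before experimenting can be as simple as the sketch below; device and file names are examples only:)

   dd if=/dev/sdc1 of=/dev/sdx1 bs=1M conv=noerror,sync   # clone onto a spare disk; noerror,sync pads unreadable blocks
   dd if=/dev/sdc1 of=/backup/sdc1.img bs=1M              # or image to a file if space allows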

Assuming the above goes well which will definitely take some time,
Then I'll take the original drives, run the strace and try to get some
additional data for you.  I'd love to know what's up with this as
well.  If there is additional information I can get you to help, let
me know.  I've grown several arrays before without any issue, which
frankly is why I didn't think this would have been an issue thus,
my offload of the stuff I actually cared about wasn't up to date.

At the end of day (or more likely, week)  I'll completely destroy the
existing raid, and rebuild the entire thing to make sure I'm starting
from a good base.  At least at that point, I'll have additional
drives.  Given that I have dual File-servers that will have drives
added, it seems likely that I'll be testing the code again soon.  Big
difference being that this time, I won't make the assumption that
everything will be perfect. :)

Thanks again for your help, I'll post on my results as well as try to
get you that strace.  It's been quite a while since I dove into kernel
internals, or C for that matter, so it's unlikely I'm going to find
anything myself But I'll definitely send results back if I can.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-19 Thread Greg Nicholson
On 8/19/07, Greg Nicholson <[EMAIL PROTECTED]> wrote:
> On 8/19/07, Neil Brown <[EMAIL PROTECTED]> wrote:
> > On Saturday August 18, [EMAIL PROTECTED] wrote:
> > >
> > > That looks to me like the first 2 gig is completely empty on the
> > > drive.  I really don't think it actually started to do anything.
> >
> > The backup data is near the end of the device.  If you look at the
> > last 2 gig you should see something.
> >
>
> I figured something like that after I started thinking about it...
> That device is currently offline while I do some DD's to new devices.
>
> > >
> > > Do you have further suggestions on where to go now?
> >
> > Maybe an 'strace' of "mdadm -A " might show me something.
> >
> > If you feel like following the code, Assemble (in Assemble.c) should
> > call Grow_restart.
> > This should look in /dev/sdb1 (which is already open in 'fdlist') by
> > calling 'load_super'.  It should then seek to 8 sectors before the
> > superblock (or close to there) and read a secondary superblock which
> > describes the backup data.
> > If this looks good, it seeks to where the backup data is (which is
> > towards the end of the device) and reads that.  It uses this to
> > restore the 'critical section', and then updates the superblock on all
> > devices.
> >
> > As you aren't getting the messages 'restoring critical section',
> > something is going wrong before there.  It should fail:
> >   /dev/md0: Failed to restore critical section for reshape, sorry.
> > but I can see that there is a problem with the error return from
> > 'Grow_restart'.  I'll get that fixed.
> >
> >
> > >
> > > Oh, and thank you very much for your help.  Most of the data on this
> > > array I can stand to lose... It's not critical, but there are some of
> > > my photographs on this that my backup is out of date on.  I can
> > > destroy it all and start over, but really want to try to recover this
> > > if it's possible.  For that matter, if it didn't actually start
> > > rewriting the stripes, is there any way to push it back down to 4 disks
> > > to recover ?
> >
> > You could always just recreate the array:
> >
> >  mdadm -C /dev/md0 -l5 -n4 -c256 --assume-clean /dev/sdf1 /dev/sde1  \
> > /dev/sdd1 /dev/sdc1
> >
> > and make sure the data looks good (which it should).
> >
> > I'd still like to know what the problem is, though.
> >
> > Thanks,
> > NeilBrown
> >
>
> My current plan of attack, which I've been proceeding upon for the
> last 24 hours... I'm DDing the original drives to new devices.  Once I
> have copies of the drives, I'm going to try to recreate the array as a
> 4 device array.  Hopefully, at that point, the raid will come up, LVM
> will initialize, and it's time to saturate the GigE offloading
> EVERYTHING.
>
> Assuming the above goes well which will definitely take some time,
> Then I'll take the original drives, run the strace and try to get some
> additional data for you.  I'd love to know what's up with this as
> well.  If there is additional information I can get you to help, let
> me know.  I've grown several arrays before without any issue, which
> frankly is why I didn't think this would have been an issue thus,
> my offload of the stuff I actually cared about wasn't up to date.
>
> At the end of day (or more likely, week)  I'll completely destroy the
> existing raid, and rebuild the entire thing to make sure I'm starting
> from a good base.  At least at that point, I'll have additional
> drives.  Given that I have dual File-servers that will have drives
> added, it seems likely that I'll be testing the code again soon.  Big
> difference being that this time, I won't make the assumption that
> everything will be perfect. :)
>
> Thanks again for your help, I'll post on my results as well as try to
> get you that strace.  It's been quite a while since I dove into kernel
> internals, or C for that matter, so it's unlikely I'm going to find
> anything myself But I'll definitely send results back if I can.
>


Ok, as an update.  ORDER MATTERS.  :)

The above command didn't work.  It built, but LVM didn't recognize.
So, after despair, I thought, that's not the way I built it.  So, I
redid it in Alphabetical order... and it worked.

I'm in the process of tarring and pulling everything off.

Once that is done, I'll put the original drives back in, and try to
understand what went wrong with the original grow/build.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-20 Thread Greg Nicholson
On 8/19/07, Greg Nicholson <[EMAIL PROTECTED]> wrote:
> On 8/19/07, Greg Nicholson <[EMAIL PROTECTED]> wrote:
> > On 8/19/07, Neil Brown <[EMAIL PROTECTED]> wrote:
> > > On Saturday August 18, [EMAIL PROTECTED] wrote:
> > > >
> > > > That looks to me like the first 2 gig is completely empty on the
> > > > drive.  I really don't think it actually started to do anything.
> > >
> > > The backup data is near the end of the device.  If you look at the
> > > last 2 gig you should see something.
> > >
> >
> > I figured something like that after I started thinking about it...
> > That device is currently offline while I do some DD's to new devices.
> >
> > > >
> > > > Do you have further suggestions on where to go now?
> > >
> > > Maybe an 'strace' of "mdadm -A " might show me something.
> > >
> > > If you feel like following the code, Assemble (in Assemble.c) should
> > > call Grow_restart.
> > > This should look in /dev/sdb1 (which is already open in 'fdlist') by
> > > calling 'load_super'.  It should then seek to 8 sectors before the
> > > superblock (or close to there) and read a secondary superblock which
> > > describes the backup data.
> > > If this looks good, it seeks to where the backup data is (which is
> > > towards the end of the device) and reads that.  It uses this to
> > > restore the 'critical section', and then updates the superblock on all
> > > devices.
> > >
> > > As you aren't getting the messages 'restoring critical section',
> > > something is going wrong before there.  It should fail:
> > >   /dev/md0: Failed to restore critical section for reshape, sorry.
> > > but I can see that there is a problem with the error return from
> > > 'Grow_restart'.  I'll get that fixed.
> > >
> > >
> > > >
> > > > Oh, and thank you very much for your help.  Most of the data on this
> > > > array I can stand to lose... It's not critical, but there are some of
> > > > my photographs on this that my backup is out of date on.  I can
> > > > destroy it all and start over, but really want to try to recover this
> > > > if it's possible.  For that matter, if it didn't actually start
> > > > rewriting the stripes, is there any way to push it back down to 4 disks
> > > > to recover ?
> > >
> > > You could always just recreate the array:
> > >
> > >  mdadm -C /dev/md0 -l5 -n4 -c256 --assume-clean /dev/sdf1 /dev/sde1  \
> > > /dev/sdd1 /dev/sdc1
> > >
> > > and make sure the data looks good (which it should).
> > >
> > > I'd still like to know what the problem is, though.
> > >
> > > Thanks,
> > > NeilBrown
> > >
> >
> > My current plan of attack, which I've been proceeding upon for the
> > last 24 hours... I'm DDing the original drives to new devices.  Once I
> > have copies of the drives, I'm going to try to recreate the array as a
> > 4 device array.  Hopefully, at that point, the raid will come up, LVM
> > will initialize, and it's time to saturate the GigE offloading
> > EVERYTHING.
> >
> > Assuming the above goes well which will definitely take some time,
> > Then I'll take the original drives, run the strace and try to get some
> > additional data for you.  I'd love to know what's up with this as
> > well.  If there is additional information I can get you to help, let
> > me know.  I've grown several arrays before without any issue, which
> > frankly is why I didn't think this would have been an issue thus,
> > my offload of the stuff I actually cared about wasn't up to date.
> >
> > At the end of day (or more likely, week)  I'll completely destroy the
> > existing raid, and rebuild the entire thing to make sure I'm starting
> > from a good base.  At least at that point, I'll have additional
> > drives.  Given that I have dual File-servers that will have drives
> > added, it seems likely that I'll be testing the code again soon.  Big
> > difference being that this time, I won't make the assumption that
> > everything will be perfect. :)
> >
> > Thanks again for your help, I'll post on my results as well as try to
> > get you that strace.  It's been quite a while since I dove into kernel
> > internals, or C for that matter, so it's unlikely I'm going to find
> > anything myself But I'll definitely send results back if I can.
> >
>
>
> Ok, as an update.  ORDER MATTERS.  :)
>
> The above command didn't work.  It built, but LVM didn't recognize.
> So, after despair, I thought, that's not the way I built it.  So, I
> redid it in Alphabetical order... and it worked.
>
> I'm in the process of tarring and pulling everything off.
>
> Once that is done, I'll put the original drives back in, and try to
> understand what went wrong with the original grow/build.
>

And as a final update... I pulled all the data from the 4 disk array I
built from the copied Disks.  Everything looks to be intact.  That is
definitely a better feeling for me.

I then put the original disks back in, and compiled 2.6.3 to see if it
did any better on the assemble.  It appears that your update about the
critical section missing 

Re: Raid5 Reshape gone wrong, please help

2007-08-23 Thread Greg Nicholson


OK I've reproduced the original issue on a separate box.
2.6.23-rc3 does not like to grow Raid 5 arrays.  MDadm 2.6.3

mdadm --add /dev/md0 /dev/sda1
mdadm -G --backup-file=/root/backup.raid.file /dev/md0

(Yes, I added the backup-file this time... just to be sure.)
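(For reference, the grow step normally also names the target device count; a sketch of the usual form, with the count here purely illustrative:)

   mdadm --grow /dev/md0 --raid-devices=6 --backup-file=/root/backup.raid.file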

Mdadm began the grow, and stopped in the critical section, or right
after creating the backup... Not sure which.  Reboot.

Refused to start the array.  So...

 mdadm -A /dev/md0 /dev/sd[abdefg]1

and we have in /proc/mdstat:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdg1[0] sda1[5] sdf1[4] sdd1[3] sdb1[2] sde1[1]
  1953535488 blocks super 0.91 level 5, 128k chunk, algorithm 2
[6/6] [UUUUUU]
  [>....................]  reshape =  0.0% (512/488383872)
finish=378469.4min speed=0K/sec

unused devices: <none>

And it's sat there without change for the past 2 hours.  Now, I have a
backup, so frankly, I'm about to blow away the array and just recreate
it, but I thought you should know.

I've got the stripe_cache_size at 8192... 256 and 1024 don't change anything.
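(The knob referred to above lives in sysfs; md0 assumed:)

   cat /sys/block/md0/md/stripe_cache_size
   echo 8192 > /sys/block/md0/md/stripe_cache_size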
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-23 Thread Greg Nicholson
On 8/23/07, Greg Nicholson <[EMAIL PROTECTED]> wrote:
> 
>
> OK I've reproduced the original issue on a separate box.
> 2.6.23-rc3 does not like to grow Raid 5 arrays.  MDadm 2.6.3
>
> mdadm --add /dev/md0 /dev/sda1
> mdadm -G --backup-file=/root/backup.raid.file /dev/md0
>
> (Yes, I added the backup-file this time... just to be sure.)
>
> Mdadm began the grow, and stopped in the critical section, or right
> after creating the backup... Not sure which.  Reboot.
>
> Refused to start the array.  So...
>
>  mdadm -A /dev/md0 /dev/sd[abdefg]1
>
> and we have in /proc/mdstat:
>
> Personalities : [raid6] [raid5] [raid4]
> md0 : active raid5 sdg1[0] sda1[5] sdf1[4] sdd1[3] sdb1[2] sde1[1]
>   1953535488 blocks super 0.91 level 5, 128k chunk, algorithm 2
> [6/6] [UUUUUU]
>   [>....................]  reshape =  0.0% (512/488383872)
> finish=378469.4min speed=0K/sec
>
> unused devices: <none>
>
> And it's sat there without change for the past 2 hours.  Now, I have a
> backup, so frankly, I'm about to blow away the array and just recreate
> it, but I thought you should know.
>
> I've got the stripe_cache_size at 8192... 256 and 1024 don't change anything.
>

Forgot the DMESG output:

md: bind
md: bind
md: bind
md: bind
md: bind
md: bind
md: md0: raid array is not clean -- starting background reconstruction
raid5: reshape will continue
raid5: device sdg1 operational as raid disk 0
raid5: device sda1 operational as raid disk 5
raid5: device sdf1 operational as raid disk 4
raid5: device sdd1 operational as raid disk 3
raid5: device sdb1 operational as raid disk 2
raid5: device sde1 operational as raid disk 1
raid5: allocated 6293kB for md0
raid5: raid level 5 set md0 active with 6 out of 6 devices, algorithm 2
RAID5 conf printout:
 --- rd:6 wd:6
 disk 0, o:1, dev:sdg1
 disk 1, o:1, dev:sde1
 disk 2, o:1, dev:sdb1
 disk 3, o:1, dev:sdd1
 disk 4, o:1, dev:sdf1
 disk 5, o:1, dev:sda1
...ok start reshape thread
md: reshape of RAID array md0
md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than
20 KB/sec) for reshape.
md: using 128k window, over a total of 488383872 blocks.

Looks good, but it doesn't actually do anything.
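(A couple of things worth watching to confirm a stall like this, md0 assumed:)

   cat /proc/mdstat
   cat /sys/block/md0/md/sync_action   # should read "reshape" while one is running
   watch -n 5 cat /proc/mdstat         # the (done/total) figure should creep up; here it never moved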
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-27 Thread Neil Brown
On Thursday August 23, [EMAIL PROTECTED] wrote:
> 
> 
> OK I've reproduced the original issue on a separate box.
> 2.6.23-rc3 does not like to grow Raid 5 arrays.  MDadm 2.6.3

No, you are right. It doesn't.

Obviously insufficient testing and review - thanks for finding it for us.

This patch seems to make it work - raid5 and raid6.

Dan: Could you check it for me, particularly the moving of
+   async_tx_ack(tx);
+   dma_wait_for_async_tx(tx);
outside of the loop.

Greg: could you please check it works for you too - it works for me,
but double-testing never hurts.

Thanks again,

NeilBrown



-
Fix some bugs with growing raid5/raid6 arrays.



### Diffstat output
 ./drivers/md/raid5.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c2007-08-24 16:36:22.0 +1000
+++ ./drivers/md/raid5.c2007-08-27 20:50:57.0 +1000
@@ -2541,7 +2541,7 @@ static void handle_stripe_expansion(raid
struct dma_async_tx_descriptor *tx = NULL;
clear_bit(STRIPE_EXPAND_SOURCE, &sh->state);
for (i = 0; i < sh->disks; i++)
-   if (i != sh->pd_idx && (r6s && i != r6s->qd_idx)) {
+   if (i != sh->pd_idx && (!r6s || i != r6s->qd_idx)) {
int dd_idx, pd_idx, j;
struct stripe_head *sh2;
 
@@ -2574,7 +2574,8 @@ static void handle_stripe_expansion(raid
set_bit(R5_UPTODATE, &sh2->dev[dd_idx].flags);
for (j = 0; j < conf->raid_disks; j++)
if (j != sh2->pd_idx &&
-   (r6s && j != r6s->qd_idx) &&
+   (!r6s || j != raid6_next_disk(sh2->pd_idx,
+sh2->disks)) &&
!test_bit(R5_Expanded, &sh2->dev[j].flags))
break;
if (j == conf->raid_disks) {
@@ -2583,12 +2584,12 @@ static void handle_stripe_expansion(raid
}
release_stripe(sh2);
 
-   /* done submitting copies, wait for them to complete */
-   if (i + 1 >= sh->disks) {
-   async_tx_ack(tx);
-   dma_wait_for_async_tx(tx);
-   }
}
+   /* done submitting copies, wait for them to complete */
+   if (tx) {
+   async_tx_ack(tx);
+   dma_wait_for_async_tx(tx);
+   }
 }
 
 /*
@@ -2855,7 +2856,7 @@ static void handle_stripe5(struct stripe
sh->disks = conf->raid_disks;
sh->pd_idx = stripe_to_pdidx(sh->sector, conf,
conf->raid_disks);
-   s.locked += handle_write_operations5(sh, 0, 1);
+   s.locked += handle_write_operations5(sh, 1, 1);
} else if (s.expanded &&
!test_bit(STRIPE_OP_POSTXOR, &sh->ops.pending)) {
clear_bit(STRIPE_EXPAND_READY, &sh->state);
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Raid5 Reshape gone wrong, please help

2007-08-27 Thread Williams, Dan J
> From: Neil Brown [mailto:[EMAIL PROTECTED]
> On Thursday August 23, [EMAIL PROTECTED] wrote:
> > 
> >
> > OK I've reproduced the original issue on a separate box.
> > 2.6.23-rc3 does not like to grow Raid 5 arrays.  MDadm 2.6.3
> 
> No, you are right. It doesn't.
> 
> Obviously insufficient testing and review - thanks for finding it for us.
> 
Agreed - seconded.

> This patch seems to make it work - raid5 and raid6.
> 
> Dan: Could you check it for me, particularly the moving of
> + async_tx_ack(tx);
> + dma_wait_for_async_tx(tx);
> outside of the loop.
> 
Yes, this definitely needs to be outside the loop.

> Greg: could you please check it works for you too - it works for me,
> but double-testing never hurts.
> 
> Thanks again,
> 
> NeilBrown
> 
> 
> 
> -
> Fix some bugs with growing raid5/raid6 arrays.
> 
> 
> 
> ### Diffstat output
>  ./drivers/md/raid5.c |   17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
> --- .prev/drivers/md/raid5.c  2007-08-24 16:36:22.0 +1000
> +++ ./drivers/md/raid5.c  2007-08-27 20:50:57.0 +1000
> @@ -2541,7 +2541,7 @@ static void handle_stripe_expansion(raid
>   struct dma_async_tx_descriptor *tx = NULL;
>   clear_bit(STRIPE_EXPAND_SOURCE, &sh->state);
>   for (i = 0; i < sh->disks; i++)
> - if (i != sh->pd_idx && (r6s && i != r6s->qd_idx)) {
> + if (i != sh->pd_idx && (!r6s || i != r6s->qd_idx)) {
>   int dd_idx, pd_idx, j;
>   struct stripe_head *sh2;
> 
> @@ -2574,7 +2574,8 @@ static void handle_stripe_expansion(raid
>   set_bit(R5_UPTODATE, &sh2->dev[dd_idx].flags);
>   for (j = 0; j < conf->raid_disks; j++)
>   if (j != sh2->pd_idx &&
> - (r6s && j != r6s->qd_idx) &&
> + (!r6s || j != raid6_next_disk(sh2->pd_idx,
> +  sh2->disks)) &&
>   !test_bit(R5_Expanded, &sh2->dev[j].flags))
>   break;
>   if (j == conf->raid_disks) {
> @@ -2583,12 +2584,12 @@ static void handle_stripe_expansion(raid
>   }
>   release_stripe(sh2);
> 
> - /* done submitting copies, wait for them to complete */
> - if (i + 1 >= sh->disks) {
> - async_tx_ack(tx);
> - dma_wait_for_async_tx(tx);
> - }
>   }
> + /* done submitting copies, wait for them to complete */
> + if (tx) {
> + async_tx_ack(tx);
> + dma_wait_for_async_tx(tx);
> + }
>  }
> 
>  /*
> @@ -2855,7 +2856,7 @@ static void handle_stripe5(struct stripe
>   sh->disks = conf->raid_disks;
>   sh->pd_idx = stripe_to_pdidx(sh->sector, conf,
>   conf->raid_disks);
> - s.locked += handle_write_operations5(sh, 0, 1);
> + s.locked += handle_write_operations5(sh, 1, 1);
How about for clarity:
s.locked += handle_write_operations5(sh, RECONSTRUCT_WRITE, 1);

>   } else if (s.expanded &&
>   !test_bit(STRIPE_OP_POSTXOR, &sh->ops.pending)) {
>   clear_bit(STRIPE_EXPAND_READY, &sh->state);

Signed-off-by: Dan Williams <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid5 Reshape gone wrong, please help

2007-08-29 Thread Bill Davidsen

Greg Nicholson wrote:



OK I've reproduced the original issue on a separate box.
2.6.23-rc3 does not like to grow Raid 5 arrays.  MDadm 2.6.3
  


I have to say that trying something as critical as a reshape of live 
data on an -rc kernel is a great way to have a learning experience. Good 
that you found the problem, but also good that *you* found the problem, 
not me.


Thanks for testing. ;-)

mdadm --add /dev/md0 /dev/sda1
mdadm -G --backup-file=/root/backup.raid.file /dev/md0

(Yes, I added the backup-file this time... just to be sure.)

Mdadm began the grow, and stopped in the critical section, or right
after creating the backup... Not sure which.  Reboot.

Refused to start the array.  So...

 mdadm -A /dev/md0 /dev/sd[abdefg]1

and we have in /proc/mdstat:

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdg1[0] sda1[5] sdf1[4] sdd1[3] sdb1[2] sde1[1]
  1953535488 blocks super 0.91 level 5, 128k chunk, algorithm 2
[6/6] [UUUUUU]
  [>....................]  reshape =  0.0% (512/488383872)
finish=378469.4min speed=0K/sec

unused devices: <none>

And it's sat there without change for the past 2 hours.  Now, I have a
backup, so frankly, I'm about to blow away the array and just recreate
it, but I thought you should know.

I've got the stripe_cache_size at 8192... 256 and 1024 don't change anything.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  



--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Raid5 Reshape gone wrong, please help

2007-08-29 Thread Neil Brown
On Monday August 27, [EMAIL PROTECTED] wrote:
> > -   s.locked += handle_write_operations5(sh, 0, 1);
> > +   s.locked += handle_write_operations5(sh, 1, 1);
> How about for clarity:
>   s.locked += handle_write_operations5(sh, RECONSTRUCT_WRITE, 1);
> 

Nope.  That second argument is a boolean, not an enum.
If it was changed to 'writemode' (or similar) and the code in
handle_write_operations5 were changed to

  switch(writemode) {
  case RECONSTRUCT_WRITE:
 
  case READ_MODIFY_WRITE:
 
  }

Then it would make sense to use RECONSTRUCT_WRITE in the call - and
the code would probably be more readable on the whole.
But as it is, either 'true' or '1' should go there.

NeilBrown

  
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Help RAID5 reshape Oops / backup-file

2007-10-11 Thread Nagilum
Ok, after looking in "Grow.c" I can see that the backup file is  
removed once the critical section has passed:


if (backup_file)
unlink(backup_file);

printf(Name ": ... critical section passed.\n");

Since I had passed that point I'll try to find out where  
Grow_restart() stumbles. By looking at it I'm not even sure it's able  
to "resume" and not just restart. :-/



- Message from [EMAIL PROTECTED] -
Date: Tue, 09 Oct 2007 20:58:47 +0200
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Help RAID5 reshape Oops / backup-file
  To: linux-raid@vger.kernel.org



Hi,
During the process of reshaping a Raid5 from 3 (/dev/sd[a-c]) to 5
devices (/dev/sd[a-e]) the system was accidentally shut down.
I know I was stupid, I should have used a --backup-file, but stupid me didn't.
Thanks for not rubbing it in any further. :(
Ok, here is what I have:

nas:~# uname -a
Linux nas 2.6.18-5-amd64 #1 SMP Thu Aug 30 01:14:54 UTC 2007 x86_64 GNU/Linux
nas:~# mdadm --version
mdadm - v2.5.6 - 9 November 2006
nas:~# mdadm -Q --detail /dev/md0
/dev/md0:
 Version : 00.91.03
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0
 Persistence : Superblock is persistent

 Update Time : Mon Oct  8 23:59:27 2007
   State : active, degraded, Not Started
  Active Devices : 3
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 2

  Layout : left-symmetric
  Chunk Size : 16K

   Delta Devices : 2, (3->5)

UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
  Events : 0.470134

 Number   Major   Minor   RaidDevice State
   0       8        0        0      active sync   /dev/sda
   1       8       16        1      active sync   /dev/sdb
   2       8       32        2      active sync   /dev/sdc
   3       0        0        3      removed
   4       0        0        4      removed

   5       8       48        -      spare   /dev/sdd
   6       8       64        -      spare   /dev/sde


nas:~# mdadm -E /dev/sd[a-e]
/dev/sda:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

   Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
   Delta Devices : 2 (3->5)

 Update Time : Mon Oct  8 23:59:27 2007
   State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0
Checksum : f425054d - correct
  Events : 0.470134

  Layout : left-symmetric
  Chunk Size : 16K

   Number   Major   Minor   RaidDevice State
this     0       8        0        0      active sync   /dev/sda

   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       64        3      active sync   /dev/sde
   4     4       8       48        4      active sync   /dev/sdd
/dev/sdb:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5
Preferred Minor : 0

   Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
   Delta Devices : 2 (3->5)

 Update Time : Mon Oct  8 23:59:27 2007
   State : clean
  Active Devices : 5
Working Devices : 5
  Failed Devices : 0
   Spare Devices : 0
Checksum : f425055f - correct
  Events : 0.470134

  Layout : left-symmetric
  Chunk Size : 16K

   Number   Major   Minor   RaidDevice State
this     1       8       16        1      active sync   /dev/sdb

   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       64        3      active sync   /dev/sde
   4     4       8       48        4      active sync   /dev/sdd
/dev/sdc:
   Magic : a92b4efc
 Version : 00.91.00
UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
   Creation Time : Sat Sep 15 21:11:41 2007
  Raid Level : raid5
 Device Size : 488308672 (465.69 GiB 500.03 GB)
  Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
Raid Devices : 5
   Total Devices : 5

Re: Help RAID5 reshape Oops / backup-file

2007-10-11 Thread Neil Brown
On Thursday October 11, [EMAIL PROTECTED] wrote:
> Ok, after looking in "Grow.c" I can see that the backup file is  
> removed once the critical section has passed:
> 
>   if (backup_file)
>   unlink(backup_file);
> 
>   printf(Name ": ... critical section passed.\n");
> 
> Since I had passed that point I'll try to find out where  
> Grow_restart() stumbles. By looking at it I'm not even sure it's able  
> to "resume" and not just restart. :-/
> 

It isn't a problem that you didn't specify a backup-file.
If you don't, mdadm uses some spare space on one of the new drives.
After the critical section has passed, the backup file isn't needed
any longer.
The problem is that mdadm still wants to find and recover from it.

I thoroughly tested mdadm restarting from a crash during the critical
section, but it looks like I didn't properly test restarting from a
later crash.

I think if you just change the 'return 1' at the end of Grow_restart
to 'return 0' it should work for you.

I'll try to get this fixed properly (and tested) and release a 2.6.4.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Help RAID5 reshape Oops / backup-file

2007-10-11 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Fri, 12 Oct 2007 09:51:08 +1000
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



On Thursday October 11, [EMAIL PROTECTED] wrote:

Ok, after looking in "Grow.c" I can see that the backup file is
removed once the critical section has passed:

if (backup_file)
unlink(backup_file);

printf(Name ": ... critical section passed.\n");

Since I had passed that point I'll try to find out where
Grow_restart() stumbles. By looking at it I'm not even sure it's able
to "resume" and not just restart. :-/



It isn't a problem that you didn't specify a backup-file.
If you don't, mdadm uses some spare space on one of the new drives.
After the critical section has passed, the backup file isn't needed
any longer.
The problem is that mdadm still wants to find and recover from it.

I thoroughly tested mdadm restarting from a crash during the critical
section, but it looks like I didn't properly test restarting from a
later crash.

I think if you just change the 'return 1' at the end of Grow_restart
to 'return 0' it should work for you.

I'll try to get this fixed properly (and tested) and release a 2.6.4.

NeilBrown




- End message from [EMAIL PROTECTED] -

Thanks, I changed Grow_restart as suggested, now I get:
nas:~/mdadm-2.6.3# ./mdadm -A /dev/md0 /dev/sd[a-e]
mdadm: /dev/md0 assembled from 3 drives and 2 spares - not enough to  
start the array.


nas:~/mdadm-2.6.3# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sda[0] sde[6] sdd[5] sdc[2] sdb[1]
  2441543360 blocks

unused devices: <none>

which is similar to what the old mdadm is telling me.
I'll try to find out where it gets the idea these are spares..
Would it be a good idea to update to vanilla 2.6.23 instead of running  
Debian Etch's 2.6.18-5?

If there is anything I can do to help with v2.6.4 let me know!
Thanks,
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Help RAID5 reshape Oops / backup-file

2007-10-14 Thread Nagilum

Can someone tell me if I'm on the right track?
I've now noticed the following:
# ~/mdadm-2.6.3/mdadm -v -A /dev/md0 /dev/sd[d-e]
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot -1.
mdadm: No suitable drives found for /dev/md0

This "slot -1" is also visible in the examine output:

# mdadm -E /dev/sdd
/dev/sdd:
  Magic : a92b4efc
Version : 00.91.00
   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
Device Size : 488308672 (465.69 GiB 500.03 GB)
 Array Size : 1953234688 (1862.75 GiB 2000.11 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0

  Reshape pos'n : 872095808 (831.70 GiB 893.03 GB)
  Delta Devices : 2 (3->5)

Update Time : Mon Oct  8 23:59:27 2007
  State : clean
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
   Checksum : f42505b9 - correct
 Events : 0.470134

 Layout : left-symmetric
 Chunk Size : 16K

  Number   Major   Minor   RaidDevice State
this 5   8   48   -1  spare   /dev/sdd

   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
   2     2       8       32        2      active sync   /dev/sdc
   3     3       8       64        3      active sync   /dev/sde
   4     4       8       48        4      active sync   /dev/sdd


So can someone confirm that this is the likely source of my problem?
And hopefully - if that is indeed the problem - someone can tell me  
how to update that slot number?

Thanks,
Alex.

- Message from [EMAIL PROTECTED] -
Date: Fri, 12 Oct 2007 08:43:28 +0200
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Neil Brown <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



- Message from [EMAIL PROTECTED] -
Date: Fri, 12 Oct 2007 09:51:08 +1000
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



It isn't a problem that you didn't specify a backup-file.
If you don't, mdadm uses some spare space on one of the new drives.
After the critical section has passed, the backup file isn't needed
any longer.
The problem is that mdadm still wants to find and recover from it.

I thoroughly tested mdadm restarting from a crash during the critical
section, but it looks like I didn't properly test restarting from a
later crash.

I think if you just change the 'return 1' at the end of Grow_restart
to 'return 0' it should work for you.

I'll try to get this fixed properly (and tested) and release a 2.6.4.

NeilBrown




- End message from [EMAIL PROTECTED] -

Thanks, I changed Grow_restart as suggested, now I get:
nas:~/mdadm-2.6.3# ./mdadm -A /dev/md0 /dev/sd[a-e]
mdadm: /dev/md0 assembled from 3 drives and 2 spares - not enough to
start the array.

nas:~/mdadm-2.6.3# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : inactive sda[0] sde[6] sdd[5] sdc[2] sdb[1]
  2441543360 blocks

unused devices: <none>

which is similar to what the old mdadm is telling me.
I'll try to find out where it gets the idea these are spares..
Would it be a good idea to update to vanilla 2.6.23 instead of running
Debian Etch's 2.6.18-5?
If there is anything I can do to help with v2.6.4 let me know!
Thanks,
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..



- End message from [EMAIL PROTECTED] -




#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #



---

Re: Help RAID5 reshape Oops / backup-file

2007-10-14 Thread Neil Brown
On Sunday October 14, [EMAIL PROTECTED] wrote:
> Can someone tell me if I'm on the right track?
> I've now noticed the following:
> # ~/mdadm-2.6.3/mdadm -v -A /dev/md0 /dev/sd[d-e]
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot -1.
> mdadm: No suitable drives found for /dev/md0

Hmm... that might be useful..

I just found your earlier email where you said:

> After the machine came back up (on a rescue disk) I thought I'd
> simply have to go through the process again. So I use add add the
> new disk again. 
> Although that worked, I am now unable to resume the growing
> process. 

Using "add add" again was not correct, and should not have been
possible.
You should have simply assembled the array with the full new set of
devices.  Then reshape would have automatically restarted properly.

Can you remember *exactly* what you did?  If I can reproduce the
situation, I can find the best way to fix it and send you something to
try.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Help RAID5 reshape Oops / backup-file

2007-10-15 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Mon, 15 Oct 2007 09:31:23 +1000
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



On Sunday October 14, [EMAIL PROTECTED] wrote:

Can someone tell me if I'm on the right track?
I've now noticed the following:
# ~/mdadm-2.6.3/mdadm -v -A /dev/md0 /dev/sd[d-e]
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdd is identified as a member of /dev/md0, slot -1.
mdadm: /dev/sde is identified as a member of /dev/md0, slot -1.
mdadm: No suitable drives found for /dev/md0


Hmm... that might be useful..

I just found your earlier email where you said:


After the machine came back up (on a rescue disk) I thought I'd
simply have to go through the process again. So I use add add the
new disk again.
Although that worked, I am now unable to resume the growing
process.


Using "add add" again was not correct, and should not have been
possible.
You should have simply assembled the array with the full new set of
devices.  Then reshape would have automatically restarted properly.

Can you remember *exactly* what you did?  If I can reproduce the
situation, I can find the best way to fix it and send you something to
try.

NeilBrown



- End message from [EMAIL PROTECTED] -

Sure, here it goes:
The system is running Debian Etch ia64, kernel 2.6.18,
(since the exact versions might be important in this case I made  
copies of what I deemed to be relevant available online)
a copy of the "linux/drivers/md" folder of that particular kernel can  
be found at:

  http://www.nagilum.de/md/md

Etch comes with mdadm-2.5.6 + Debian patches.
See http://www.nagilum.de/md/mdadm-2.5.6/debian/changelog
I made the whole Debian Package available here:
 http://www.nagilum.de/md/
 - "mdadm-2.5.6" the extracted source with Debian patches applied
 -  mdadm_2.5.6-9.diff.gz the diff to mdadm_2.5.6.orig.tar.gz
 -  mdadm_2.5.6-9_i386.deb the i385 version of the package, however I  
was/am using  mdadm_2.5.6-9_ia64.deb

 - "mdadm_2.5.6-9.dsc" description file for building the .deb

The Raid was being reshaped from three to five drives when the  
shutdown was issued. I assume the shutdown went normally since the  
machine was off and there was no power interruption.

Upon booting the system it became apparent that the RAID was non-functional.
The system boots off of a USB stick and then mounts its root  
filesystem from the RAID. Assembling the RAID happens within the  
initrd. The relevant scripts can be found here:  
http://www.nagilum.de/md/local-top/

I booted a rescue disk which is based on the identical Linux version.
I looked at the "mdadm -Q --detail /dev/md0" output and saw only 3 of  
the 5 disks in the RAID. Then I did (what I should not have done) the  
add of the two new disks, assuming that mdadm will touch these in a  
harmful way (without using --force) and refuse to do so if that's not  
the way to add an active disk.

The disks were added but the reshape did not continue.
Up until now I can't think of anything else I did that could have  
changed something. (and "mdadm -Q --detail /dev/md0" looks the same  
ever since)
I think, what I should have done instead of adding those disks would  
have been to either use --re-add and/or update /etc/mdadm/mdadm.conf.  
But then again I never expected this to become so problematic. :(
By now I can also boot with 2.6.23 (I'll update to 2.6.23.1 shortly)  
and I have the latest mdadm tools (in parallel to the old ones).
I also built the test_stripe utility and very briefly tried the  
"test" argument, but it wanted me to specify an existing file so I  
chickened out. ;)

Thanks a lot for looking into this!
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Help RAID5 reshape Oops / backup-file

2007-10-15 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Mon, 15 Oct 2007 13:55:22 +0200
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Neil Brown <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org


 -  mdadm_2.5.6-9_i386.deb the i385 version of the package, however I


 i386 of course


add of the two new disks, assuming that mdadm will touch these in a


 will _not_ touch these in a harmful way

- End message from [EMAIL PROTECTED] -

..stupid typos ;)


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Help RAID5 reshape Oops / backup-file

2007-10-15 Thread Neil Brown

Thanks for the extra details.
I still cannot manage to reproduce it which is frustrating, but I
think I can fix your array for you.

Get the source for mdadm 2.6.3, apply the following patch, then use

   mdadm -A /dev/md0 --update=this /dev/sd[abcde]

that should re-write the part of the superblocks that is wrong, then
assemble the array.

Please let me know how it goes.

Also, if you could show me "mdadm.conf" and "mdrun.conf" from the
initrd, that might help.

Thanks,
NeilBrown


diff --git a/Grow.c b/Grow.c
index 825747e..8ad1537 100644
--- a/Grow.c
+++ b/Grow.c
@@ -978,5 +978,5 @@ int Grow_restart(struct supertype *st, struct mdinfo *info, int *fdlist, int cnt
/* And we are done! */
return 0;
}
-   return 1;
+   return 0;
 }
diff --git a/mdadm.c b/mdadm.c
index 40fdccf..7e7e803 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -584,6 +584,8 @@ int main(int argc, char *argv[])
exit(2);
}
update = optarg;
+   if (strcmp(update, "this")==0) 
+   continue;
if (strcmp(update, "sparc2.2")==0) 
continue;
if (strcmp(update, "super-minor") == 0)
diff --git a/super0.c b/super0.c
index 0396c2c..e33e623 100644
--- a/super0.c
+++ b/super0.c
@@ -394,6 +394,21 @@ static int update_super0(struct mdinfo *info, void *sbv, char *update,
fprintf (stderr, Name ": adjusting superblock of %s for 2.2/sparc compatability.\n",
 devname);
}
+   if (strcmp(update, "this") == 0) {
+   /* to fix a particular corrupt superblock.
+   */
+   int i;
+   for (i=0; i<10; i++)
+   if (sb->disks[i].major == sb->this_disk.major &&
+   sb->disks[i].minor == sb->this_disk.minor) {
+   if (sb->this_disk.number == sb->disks[i].number)
+   break;
+   fprintf(stderr, Name ": Setting this disk from %d to %d\n",
+   sb->this_disk.number, sb->disks[i].number);
+   sb->this_disk = sb->disks[i];
+   break;
+   }
+   }
if (strcmp(update, "super-minor") ==0) {
sb->md_minor = info->array.md_minor;
if (verbose > 0)
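(A sketch of applying and using the patch above against a local mdadm 2.6.3 tree; the patch file name is an example:)

   cd mdadm-2.6.3
   patch -p1 < ../update-this.patch
   make
   ./mdadm -A /dev/md0 --update=this /dev/sd[abcde]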
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Help RAID5 reshape Oops / backup-file

2007-10-16 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Tue, 16 Oct 2007 11:16:19 +1000
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org




Thanks for the extra details.
I still cannot manage to reproduce it which is frustrating, but I
think I can fix your array for you.

Get the source for mdadm 2.6.3, apply the following patch, then use

   mdadm -A /dev/md0 --update=this /dev/sd[abcde]

that should re-write the part of the superblocks that is wrong, then
assemble the array.

Please let me know how it goes.

Also, if you could show me "mdadm.conf" and "mdrun.conf" from the
initrd, that might help.

Thanks,
NeilBrown




- End message from [EMAIL PROTECTED] -

Thanks a bunch mate!

So far it looks very good:

nas:~/mdadm-2.6.3# ./mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.91.03
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Oct  8 23:59:27 2007
  State : active, degraded, Not Started
 Active Devices : 3
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 2

 Layout : left-symmetric
 Chunk Size : 16K

  Delta Devices : 2, (3->5)

   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
 Events : 0.470134

Number   Major   Minor   RaidDevice State
   0       8        0        0      active sync   /dev/sda
   1       8       16        1      active sync   /dev/sdb
   2       8       32        2      active sync   /dev/sdc
   3       0        0        3      removed
   4       0        0        4      removed

   5       8       48        -      spare   /dev/sdd
   6       8       64        -      spare   /dev/sde
nas:~/mdadm-2.6.3# mdadm -S /dev/md0
mdadm: stopped /dev/md0
nas:~/mdadm-2.6.3# ./mdadm -A /dev/md0 --update=this /dev/sd[abcde]
mdadm: Setting this disk from 5 to 4
mdadm: Setting this disk from 6 to 3
mdadm: /dev/md0 assembled from 3 drives and 2 spares - not enough to  
start the array.

nas:~/mdadm-2.6.3# mdadm -S /dev/md0
mdadm: stopped /dev/md0
nas:~/mdadm-2.6.3# ./mdadm -A /dev/md0 /dev/sd[abcde]
mdadm: /dev/md0 has been started with 5 drives.
nas:~/mdadm-2.6.3# ./mdadm -Q --detail /dev/md0
/dev/md0:
Version : 00.91.03
  Creation Time : Sat Sep 15 21:11:41 2007
 Raid Level : raid5
 Array Size : 976617344 (931.37 GiB 1000.06 GB)
  Used Dev Size : 488308672 (465.69 GiB 500.03 GB)
   Raid Devices : 5
  Total Devices : 5
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Tue Oct 16 13:42:03 2007
  State : clean, recovering
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0

 Layout : left-symmetric
 Chunk Size : 16K

 Reshape Status : 44% complete
  Delta Devices : 2, (3->5)

   UUID : 25da80a6:d56eb9d6:0d7656f3:2f233380
 Events : 0.470212

Number   Major   Minor   RaidDevice State
   0       8        0        0      active sync   /dev/sda
   1       8       16        1      active sync   /dev/sdb
   2       8       32        2      active sync   /dev/sdc
   3       8       64        3      active sync   /dev/sde
   4       8       48        4      active sync   /dev/sdd

nas:~# cat /proc/mdstat
md0 : active raid5 sda[0] sdd[4] sde[3] sdc[2] sdb[1]
  976617344 blocks super 0.91 level 5, 16k chunk, algorithm 2  
[5/5] [UUUUU]
  [=============>.......]  reshape = 67.5% (329927392/488308672)
finish=48.6min speed=54235K/sec


unused devices: <none>

I'll send an update (including the configs) when it's done and I've  
verified everything is healthy. (and my heart has stopped racing ;)
At first I was a bit scared because of the order (sde before sdd) but  
that's consistent with the mdadm -E output from the devices earlier,  
so it looks like I'll soon have my data back. *yay* :)



#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Help RAID5 reshape Oops / backup-file

2007-10-17 Thread Nagilum

- Message from [EMAIL PROTECTED] -
Date: Tue, 16 Oct 2007 14:50:09 +0200
From: Nagilum <[EMAIL PROTECTED]>
Reply-To: Nagilum <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Neil Brown <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org



- Message from [EMAIL PROTECTED] -
Date: Tue, 16 Oct 2007 11:16:19 +1000
From: Neil Brown <[EMAIL PROTECTED]>
Reply-To: Neil Brown <[EMAIL PROTECTED]>
 Subject: Re: Help RAID5 reshape Oops / backup-file
  To: Nagilum <[EMAIL PROTECTED]>
  Cc: linux-raid@vger.kernel.org


Please let me know how it goes.

Also, if you could show me "mdadm.conf" and "mdrun.conf" from the
initrd, that might help.

Thanks,
NeilBrown




- End message from [EMAIL PROTECTED] -


Ok, the array reshaped successfully and is back in production. :)
Here the content of the (old) initrd /etc/mdadm/mdadm.conf:

DEVICE partitions
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=25da80a6:d56eb9d6:c7780c0e:bc15422d


That needed updating of course. I don't have a "mdrun.conf" on my  
system or on the initrd.
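(A sketch of one common way to refresh the initrd copy of mdadm.conf after a grow like this, assuming Debian-style tooling as used here:)

   mdadm --detail --scan    # prints an ARRAY line with the current UUID and num-devices;
                            # merge that into /etc/mdadm/mdadm.conf
   update-initramfs -u      # rebuild the initrd so its embedded mdadm.conf matches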
I'll try to replicate the issue using a different machine and plain  
files (let's see if that works) at the weekend and let you know if I  
succeed.

Again, thank you so much for the patch!
Alex.


#_  __  _ __ http://www.nagilum.org/ \n icq://69646724 #
#   / |/ /__  _(_) /_  _  [EMAIL PROTECTED] \n +491776461165 #
#  // _ `/ _ `/ / / // /  ' \  Amiga (68k/PPC): AOS/NetBSD/Linux   #
# /_/|_/\_,_/\_, /_/_/\_,_/_/_/_/   Mac (PPC): MacOS-X / NetBSD /Linux #
#   /___/ x86: FreeBSD/Linux/Solaris/Win2k  ARM9: EPOC EV6 #




cakebox.homeunix.net - all the machine one needs..





Re: Bad drive discovered during raid5 reshape

2007-10-29 Thread Neil Brown
On Monday October 29, [EMAIL PROTECTED] wrote:
> Hi,
> I bought two new hard drives to expand my raid array today and
> unfortunately one of them appears to be bad. The problem didn't arise
> until after I attempted to grow the raid array. I was trying to expand
> the array from 6 to 8 drives. I added both drives using mdadm --add
> /dev/md1 /dev/sdb1 which completed, then mdadm --add /dev/md1 /dev/sdc1
> which also completed. I then ran mdadm --grow /dev/md1 --raid-devices=8.
> It passed the critical section, then began the grow process.
> 
> After a few minutes I started to hear unusual sounds from within the
> case. Fearing the worst I tried to cat /proc/mdstat which resulted in no
> output so I checked dmesg which showed that /dev/sdb1 was not working
> correctly. After several minutes dmesg indicated that mdadm gave up and
> the grow process stopped. After googling around I tried the solutions
> that seemed most likely to work, including removing the new drives with
> mdadm --remove --force /dev/md1 /dev/sd[bc]1 and rebooting after which I
> ran mdadm -Af /dev/md1. The grow process restarted then failed almost
> immediately. Trying to mount the drive gives me a reiserfs replay
> failure and suggests running fsck. I don't dare fsck the array since
> I've already messed it up so badly. Is there any way to go back to the
> original working 6 disc configuration with minimal data loss? Here's
> where I'm at right now, please let me know if I need to include any
> additional information.

Looks like you are in real trouble.  Both the drives seem bad in some
way.  If it was just sdc that was failing it would have picked up
after the "-Af", but when it tried, sdb gave errors.

Having two failed devices in a RAID5 is not good!

Your best bet goes like this:

  The reshape has started and got up to some point.  The data
  before that point is spread over 8 drives.  The data after is over
  6.
  We need to restripe the 8-drive data back to 6 drives.  This can be
  done with the test_stripe tool that can be built from the mdadm
  source. 

  1/ Find out how far the reshape progressed, by using "mdadm -E" on
 one of the devices.
  2/ use something like
test_stripe save /some/file 8 $chunksize 5 2 0 $length  /dev/..

 If you get all the args right, this should copy the data from
 the array into /some/file.
 You could possibly do the same thing by assembling the array 
 read-only (set /sys/module/md_mod/parameters/start_ro to 1)
 and 'dd' from the array.  It might be worth doing both and
 checking you get the same result.

  3/ use something like
test_stripe restore /some/file 6 ..
 to restore the data to just 6 devices.

  4/ use "mdadm -C" to create the array a-new on the 6 devices.  Make
 sure the order and the chunksize etc is preserved.

 Once you have done this, the start of the array should (again)
 look like the content of /some/file.  It wouldn't hurt to check.

   Then your data would be as much back together as possible.
   You will probably still need to do an fsck, but I think you did the
   right thing in holding off.  Don't do an fsck until you are sure
   the array is writable.

You can probably do the above without using test_stripe by using dd to
copy data off the array before you recreate it, then using dd to put the
same data back.  Using test_stripe as well might give you extra
confidence. 
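(For concreteness, a very rough sketch of that dd variant; every device, size and chunk value below is a placeholder, not a recipe:)

   mdadm -E /dev/sdd1 | grep "Reshape pos"                  # step 1: how far the reshape got (example member)
   echo 1 > /sys/module/md_mod/parameters/start_ro          # arrays assembled after this start read-only
   mdadm -Af /dev/md1 $SURVIVING_MEMBERS                    # bring the partly-reshaped array up as best you can
   dd if=/dev/md1 of=/some/file bs=1M count=$REGION_MB      # save at least the region already restriped to 8 drives
   mdadm -S /dev/md1
   mdadm -C /dev/md1 -l5 -n6 -c$CHUNK_KB $SIX_ORIG_MEMBERS  # recreate on the original six, same order and chunk
   dd if=/some/file of=/dev/md1 bs=1M                       # lay the saved data back onto the 6-drive layout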

Feel free to ask questions

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bad drive discovered during raid5 reshape

2007-10-30 Thread David Greaves
Neil Brown wrote:
> On Monday October 29, [EMAIL PROTECTED] wrote:
>> Hi,
>> I bought two new hard drives to expand my raid array today and
>> unfortunately one of them appears to be bad. The problem didn't arise

> Looks like you are in real trouble.  Both the drives seem bad in some
> way.  If it was just sdc that was failing it would have picked up
> after the "-Af", but when it tried, sdb gave errors.

Humble enquiry :)

I'm not sure that's right?
He *removed* sdb and sdc when the failure occurred so sdc would indeed be 
non-fresh.

The key question I think is: will md continue to grow an array even if it enters
degraded mode during the grow?
ie grow from a 6 drive array to a 7-of-8 degraded array?

Technically I guess it should be able to.

In which case should he be able to re-add /dev/sdc and allow md to retry the
grow? (possibly losing some data due to the sdc staleness)

David


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bad drive discovered during raid5 reshape

2007-10-30 Thread Neil Brown
On Tuesday October 30, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> > On Monday October 29, [EMAIL PROTECTED] wrote:
> >> Hi,
> >> I bought two new hard drives to expand my raid array today and
> >> unfortunately one of them appears to be bad. The problem didn't arise
> 
> > Looks like you are in real trouble.  Both the drives seem bad in some
> > way.  If it was just sdc that was failing it would have picked up
> > after the "-Af", but when it tried, sdb gave errors.
> 
> Humble enquiry :)
> 
> I'm not sure that's right?
> He *removed* sdb and sdc when the failure occurred so sdc would indeed be 
> non-fresh.

I'm not sure what point you are making here.
In any case, removing two drives from a raid5 is always a bad thing.
Part of the array was striped over 8 drives by this time.  With only
six still in the array, some data will be missing.

> 
> The key question I think is: will md continue to grow an array even if it 
> enters
> degraded mode during the grow?
> ie grow from a 6 drive array to a 7-of-8 degraded array?
> 
> Technically I guess it should be able to.

Yes, md can grow to a degraded array.  If you get a single failure I
would expect it to abort the growth process, then restart where it
left off (after checking that that made sense).

> 
> In which case should he be able to re-add /dev/sdc and allow md to retry the
> grow? (possibly losing some data due to the sdc staleness)

He only needs one of the two drives in there.  I got the impression
that both sdc and sdb had reported errors.  If not, and sdc really
seems OK, then "--assemble --force" listing all drives except sdb
should make it all work again.
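
Purely as an illustration (the member names below are invented; the point
is to list every remaining device except the one that errored):

   mdadm --assemble --force /dev/md1 /dev/sda1 /dev/sdc1 /dev/sdd1 \
         /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1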

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bad drive discovered during raid5 reshape

2007-10-30 Thread David Greaves
Neil Brown wrote:
> On Tuesday October 30, [EMAIL PROTECTED] wrote:
>> The key question I think is: will md continue to grow an array even if it 
>> enters
>> degraded mode during the grow?
>> ie grow from a 6 drive array to a 7-of-8 degraded array?
> 
> Yes, md can grow to a degraded array.  If you get a single failure I
> would expect it to abort the growth process, then restart where it
> left off (after checking that that made sense).

I read that he aborted it, then removed both drives before giving md a chance to
restart.

He said:
After several minutes dmesg indicated that mdadm gave up and
the grow process stopped. After googling around I tried the solutions
that seemed most likely to work, including removing the new drives with
mdadm --remove --force /dev/md1 /dev/sd[bc]1 and rebooting

and *then* he: "ran mdadm -Af /dev/md1."

>> In which case should he be able to re-add /dev/sdc and allow md to retry the
>> grow? (possibly losing some data due to the sdc staleness)
> 
> He only needs one of the two drives in there.  I got the impression
> that both sdc and sdb had reported errors.  If not, and sdc really
> seems OK, then "--assemble --force" listing all drives except sdb
> should make it all work again.

Kyle - I think you need to clarify this as it may not be too bad. Apologies if I
misread something and sdc is bad too :)

It may be an idea to let us (Neil) know what you've done and if you've done any
writes to any devices before trying this assemble.

David

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bad drive discovered during raid5 reshape

2007-10-30 Thread Kyle Stuart
David Greaves wrote:
> I read that he aborted it, then removed both drives before giving md a chance 
> to
> restart.
>
> He said:
> After several minutes dmesg indicated that mdadm gave up and
> the grow process stopped. After googling around I tried the solutions
> that seemed most likely to work, including removing the new drives with
> mdadm --remove --force /dev/md1 /dev/sd[bc]1 and rebooting
>
> and *then* he: "ran mdadm -Af /dev/md1."
>   
This is correct. I first removed sdb and sdc then rebooted and ran mdadm
-Af /dev/md1.
>
> Kyle - I think you need to clarify this as it may not be too bad. Apologies 
> if I
> misread something and sdc is bad too :)
>
> It may be an idea to let us (Neil) know what you've done and if you've done 
> any
> writes to any devices before trying this assemble.
>
> David
When I sent the first email I thought only sdb had failed. After digging
into the log files it appears sdc also reported several bad blocks
during the grow. This is what I get for not testing cheap refurbed
drives before trusting them with my data, but hindsight is 20/20.
Fortunately all of the important data is backed up so if I can't recover
anything using Neil's suggestions it's not a total loss.
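
(Not something raised in this thread, but a common pre-flight check for a
new or refurbished drive is a long SMART self-test plus a read-only
badblocks pass before it ever joins an array, e.g.

   smartctl -t long /dev/sdX
   badblocks -sv /dev/sdX

where /dev/sdX is a placeholder for the raw disk being tested.)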

Thank you both for the help.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-01-23 Thread NeilBrown
Here is the second release of my patches to support online reshaping
of a raid5 array, i.e. adding 1 or more devices and restriping the 
whole thing.

This release fixes an assortment of bugs and adds checkpoint/restart
to the process (the last two patches).
This means that if your machine crashes, or if you have to stop an
array before the reshape is complete, md will notice and will restart
the reshape at an appropriate place.
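
As a quick illustration, the progress of a running (or restarted) reshape
is visible from userspace in /proc/mdstat, e.g.

   cat /proc/mdstat
   watch -n 60 cat /proc/mdstat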

There is still a small window ( < 1 second) at the start of the reshape
during which a crash will cause unrecoverable corruption.  My plan is
to resolve this in mdadm rather than md. The critical data will be copied
into the new drive(s) prior to commencing the reshape.  If there is a crash
the kernel will refuse to reassemble the array.  mdadm will be able to
re-assemble it by first restoring the critical data and then letting
the remainder of the reshape run its course.
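
A closely related knob in later mdadm releases (not part of this patch
set, mentioned only for orientation) is the --backup-file option to
--grow, used when there are no new devices with room to hold the
critical-section copy.  With placeholder names:

   mdadm --grow /dev/md2 --raid-devices=6 --backup-file=/root/md2-grow.bak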

I will be changing the interface for starting a reshape slightly before
these patches become final.  This will mean that current 'mdadm' will not
be able to start a raid5 reshape.
This is partly to save people from risking the above mentioned tiny hole,
but also to prepare for reshaping which changes other aspects of the
shape, e.g. layout, chunksize, level.

I am expecting that I will ultimately support online conversion of
raid5 to raid6 with only one extra device.  This process is not
(efficiently) checkpointable and so will be at-your-risk.
Checkpointing such a process with anything like reasonable efficiency
requires a largish (multi-megabytes) temporary store, and doing so
will at best halve the speed.  I will make sure the possibility of
adding this later will be left open.

My thanks to those who have tested the first release, who have
provided feedback, who will test this release, and who contribute to
the discussion in any way.

NeilBrown



 [PATCH 001 of 7] md: Split disks array out of raid5 conf structure so it is 
easier to grow.
 [PATCH 002 of 7] md: Allow stripes to be expanded in preparation for expanding 
an array.
 [PATCH 003 of 7] md: Infrastructure to allow normal IO to continue while array 
is expanding.
 [PATCH 004 of 7] md: Core of raid5 resize process
 [PATCH 005 of 7] md: Final stages of raid5 expand code.
 [PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape
 [PATCH 007 of 7] md: Only checkpoint expansion progress occasionally.
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 003 of 6] md: Remove 'experimental' classification from raid5 reshape.

2006-09-28 Thread NeilBrown

I have had enough success reports not to believe that this 
is safe for 2.6.19.


Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/Kconfig |   10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff .prev/drivers/md/Kconfig ./drivers/md/Kconfig
--- .prev/drivers/md/Kconfig    2006-09-29 11:38:03.0 +1000
+++ ./drivers/md/Kconfig    2006-09-29 11:49:16.0 +1000
@@ -138,16 +138,16 @@ config MD_RAID456
  If unsure, say Y.
 
 config MD_RAID5_RESHAPE
-   bool "Support adding drives to a raid-5 array (experimental)"
-   depends on MD_RAID456 && EXPERIMENTAL
+   bool "Support adding drives to a raid-5 array"
+   depends on MD_RAID456
+   default y
---help---
  A RAID-5 set can be expanded by adding extra drives. This
  requires "restriping" the array which means (almost) every
  block must be written to a different place.
 
   This option allows such restriping to be done while the array
- is online.  However it is still EXPERIMENTAL code.  It should
- work, but please be sure that you have backups.
+ is online.
 
  You will need mdadm version 2.4.1 or later to use this
  feature safely.  During the early stage of reshape there is
@@ -164,6 +164,8 @@ config MD_RAID5_RESHAPE
  There should be enough spares already present to make the new
  array workable.
 
+ In unsure, say Y.
+
 config MD_MULTIPATH
tristate "Multipath I/O support"
depends on BLK_DEV_MD
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-01-24 Thread Lars Marowsky-Bree
On 2006-01-24T11:40:47, NeilBrown <[EMAIL PROTECTED]> wrote:

> I am expecting that I will ultimately support online conversion of
> raid5 to raid6 with only one extra device.  This process is not
> (efficiently) checkpointable and so will be at-your-risk.

So the best way to go about that, if one wants to keep that option open
w/o that risk, would be to not create a raid5 in the first place, but a
raid6 with one disk missing?

Maybe even have mdadm default to that - as long as just one parity disk
is missing, no slowdown should happen, right?
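
For concreteness (device names invented), the suggestion would amount to
something like:

   mdadm --create /dev/md4 --level=6 --raid-devices=4 \
         /dev/sdw1 /dev/sdx1 /dev/sdy1 missing

i.e. a 4-device raid6 created with one member listed as "missing".  See
Neil's reply below for why this does not behave like a plain raid5.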


Sincerely,
Lars Marowsky-Brée

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business
"Ignorance more frequently begets confidence than does knowledge"
  -- Charles Darwin

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-01-24 Thread Neil Brown
On Tuesday January 24, [EMAIL PROTECTED] wrote:
> On 2006-01-24T11:40:47, NeilBrown <[EMAIL PROTECTED]> wrote:
> 
> > I am expecting that I will ultimately support online conversion of
> > raid5 to raid6 with only one extra device.  This process is not
> > (efficiently) checkpointable and so will be at-your-risk.
> 
> So the best way to go about that, if one wants to keep that option open
> w/o that risk, would be to not create a raid5 in the first place, but a
> raid6 with one disk missing?
> 
> Maybe even have mdadm default to that - as long as just one parity disk
> is missing, no slowdown should happen, right?

Not exactly

raid6 has rotating parity drives, for both P and Q (the two different
'parity' blocks).
With one missing device, some Ps, some Qs, and some data would be
missing, and you would definitely get a slowdown trying to generate
some of it.

We could define a raid6 layout that didn't rotate Q.  Then you would
be able to do what you suggest.
However it would then be no different from creating a normal raid5 and
supporting online conversion from raid5 to raid6-with-non-rotating-Q.
This conversion doesn't need a reshaping pass, just a recovery of the
now-missing device.

raid6-with-non-rotating-Q would have similar issues to raid4 - one
drive becomes a hot-spot for writes.  I don't know how much of an
issue this really is though.
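
(Looking well beyond this thread: md did later gain raid6 layouts with a
fixed Q, the "*-6" layouts such as left-symmetric-6, and later mdadm
releases can drive an online raid5 to raid6 conversion roughly along the
lines of the sketch below.  None of this existed when this mail was
written, and the names and counts are placeholders.)

   mdadm --grow /dev/md2 --level=6 --raid-devices=6 --backup-file=/root/md2-raid6.bak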

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-02-07 Thread Henrik Holst
Hello linux world!

Excuse me for being so ignorant but /exactly how/ do I go about finding
out which files to download from kernel.org that these patches will
apply against?

[snip from N. Browns initial post]

>
> [PATCH 001 of 7] md: Split disks array out of raid5 conf structure so it is 
> easier to grow.
> [PATCH 002 of 7] md: Allow stripes to be expanded in preparation for 
> expanding an array.
> [PATCH 003 of 7] md: Infrastructure to allow normal IO to continue while 
> array is expanding.
> [PATCH 004 of 7] md: Core of raid5 resize process
> [PATCH 005 of 7] md: Final stages of raid5 expand code.
> [PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape
> [PATCH 007 of 7] md: Only checkpoint expansion progress occasionally.
>

I only get lots of "chunk failed" when running the patch command on my
src.tar.gz kernels. :-(

Thanks for advice,

Henrik Holst. Certified kernel patch noob.


-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 000 of 7] md: Introduction - raid5 reshape mark-2

2006-02-08 Thread Neil Brown
On Tuesday February 7, [EMAIL PROTECTED] wrote:
> Hello linux world!
> 
> Excuse me for being so ignorant but /exactly how/ do I go about finding
> out which files to download from kernel.org that these patches will
> apply against?

I always make them against the latest -mm kernel, so that would be a
good place to start.  However things change quickly and I can't
promise it will apply against whatever is the 'latest' today.

If you would like to nominate a particular recent kernel, I'll create
a patch set that is guaranteed to apply against that. (Testing is
always appreciated, and well worth that small effort on my part).
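
By way of example (the tree and file names below are invented):

   cd linux-2.6.16-rc1-mm4
   patch -p1 --dry-run < 001-md-split-disks-array.patch   # check it applies cleanly
   patch -p1 < 001-md-split-disks-array.patch

applied in order, 001 through 007.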

NeilBrown

> 
> [snip from N. Browns initial post]
> 
> >
> > [PATCH 001 of 7] md: Split disks array out of raid5 conf structure so it is 
> > easier to grow.
> > [PATCH 002 of 7] md: Allow stripes to be expanded in preparation for 
> > expanding an array.
> > [PATCH 003 of 7] md: Infrastructure to allow normal IO to continue while 
> > array is expanding.
> > [PATCH 004 of 7] md: Core of raid5 resize process
> > [PATCH 005 of 7] md: Final stages of raid5 expand code.
> > [PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape
> > [PATCH 007 of 7] md: Only checkpoint expansion progress occasionally.
> >
> 
> I only get lots of "chunk failed" when running the patch command on my
> src.tar.gz kernels. :-(
> 
> Thanks for advice,
> 
> Henrik Holst. Certified kernel patch noob.
> 
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Kernels and MD versions (was: md: Introduction - raid5 reshape mark-2)

2006-02-08 Thread Patrik Jonsson
Neil Brown wrote:
> I always make them against the latest -mm kernel, so that would be a
> good place to start.  However things change quickly and I can't
> promise it will apply against whatever is the 'latest' today.
> 
> If you would like to nominate a particular recent kernel, I'll create
> a patch set that is guaranteed to apply against that. (Testing is
> always appreciated, and well worth that small effort on my part).

I find this is a major problem for me, too. Even though I try to stay up
to date with the md developments, I have a hard time piecing together
which past patches went with which version, so if I get a recent version
I don't know which patches need to be applied and which are already in
there.

My suggestion is that Neil, should he be willing, keep a log somewhere
which details kernel versions and what major updates in md functionality
go along with them. Something like
2.6.14 raid5 read error correction
2.6.15     md /sys interface
2.6.16.rc1 raid5 reshape
2.6.16.rc2-mm4 something else cool

it would include the released kernels and the previews of the current
kernel. That way, say I see that the latest FC4 kernel is 2.6.14, I
could look and see that since the raid5 read error correction was
included, I don't have to go looking for the patches.

Maybe this is too much hassle (or maybe it's already out there
somewhere) but I'm thinking simple, and I think it would give a
high-level overview of the development for us who are not intimately
involved in every kernel version.

Regards,

/Patrik


signature.asc
Description: OpenPGP digital signature


Re: [PATCH 003 of 6] md: Remove 'experimental' classification from raid5 reshape.

2006-09-28 Thread Jeff Breidenbach

Typo in last line of this patch.


+ In unsure, say Y.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 003 of 6] md: Remove 'experimental' classification from raid5 reshape.

2006-10-02 Thread David Greaves
Typo in first line of this patch :)

> I have had enough success reports not^H^H^H to believe that this 
> is safe for 2.6.19.

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape

2006-01-23 Thread NeilBrown
v)
return -EIO;
}
 
+   if (mddev->reshape_position != MaxSector) {
+   /* Check that we can continue the reshape.
+* Currently only disks can change, it must
+* increase, and we must be passed the point where
+* a stripe over-writes itself
+*/
+   sector_t here_new, here_old, there_new;
+   int old_disks;
+
+   if (mddev->new_level != mddev->level ||
+   mddev->new_layout != mddev->layout ||
+   mddev->new_chunk != mddev->chunk_size) {
+   printk(KERN_ERR "raid5: %s: unsupported reshape required - aborting.\n",
+  mdname(mddev));
+   return -EINVAL;
+   }
+   if (mddev->delta_disks <= 0) {
+   printk(KERN_ERR "raid5: %s: unsupported reshape (reduce disks) required - aborting.\n",
+  mdname(mddev));
+   return -EINVAL;
+   }
+   old_disks = mddev->raid_disks - mddev->delta_disks;
+   /* reshape_position must be on a new-stripe boundary, and one
+* further up in new geometry must map before here in old geometry.
+*/
+   here_new = mddev->reshape_position;
+   if (sector_div(here_new, (mddev->chunk_size>>9)*(mddev->raid_disks-1))) {
+   printk(KERN_ERR "raid5: reshape_position not on a stripe boundary\n");
+   return -EINVAL;
+   }
+   here_old = mddev->reshape_position;
+   sector_div(here_old, (mddev->chunk_size>>9)*(old_disks-1));
+   /* here_old is the first sector that we might need to read from
+* for the next movement
+*/
+   there_new = here_new + (mddev->chunk_size>>9);
+   /* there_new is the last sector that the next movement will be
+* written to.
+*/
+   if (there_new >= here_old) {
+   printk(KERN_ERR "raid5: reshape_position too early for auto-recovery - aborting.\n");
+   return -EINVAL;
+   }
+   printk("raid5: reshape will continue\n");
+   /* OK, we should be able to continue; */
+   }
+
+
mddev->private = kzalloc(sizeof (raid5_conf_t), GFP_KERNEL);
if ((conf = mddev->private) == NULL)
goto abort;
-   conf->disks = kzalloc(mddev->raid_disks * sizeof(struct disk_info),
+   if (mddev->reshape_position == MaxSector) {
+   conf->previous_raid_disks = conf->raid_disks = mddev->raid_disks;
+   } else {
+   conf->raid_disks = mddev->raid_disks;
+   conf->previous_raid_disks = mddev->raid_disks - mddev->delta_disks;
+   }
+
+   conf->disks = kzalloc(conf->raid_disks * sizeof(struct disk_info),
  GFP_KERNEL);
if (!conf->disks)
goto abort;
@@ -2134,7 +2212,7 @@ static int run(mddev_t *mddev)
 
ITERATE_RDEV(mddev,rdev,tmp) {
raid_disk = rdev->raid_disk;
-   if (raid_disk >= mddev->raid_disks
+   if (raid_disk >= conf->raid_disks
|| raid_disk < 0)
continue;
disk = conf->disks + raid_disk;
@@ -2150,7 +2228,6 @@ static int run(mddev_t *mddev)
}
}
 
-   conf->raid_disks = mddev->raid_disks;
/*
 * 0 for a fully functional array, 1 for a degraded array.
 */
@@ -2160,7 +2237,7 @@ static int run(mddev_t *mddev)
conf->level = mddev->level;
conf->algorithm = mddev->layout;
conf->max_nr_stripes = NR_STRIPES;
-   conf->expand_progress = MaxSector;
+   conf->expand_progress = mddev->reshape_position;
 
/* device size must be a multiple of chunk size */
mddev->size &= ~(mddev->chunk_size/1024 -1);
@@ -2233,6 +2310,20 @@ static int run(mddev_t *mddev)
 
print_raid5_conf(conf);
 
+   if (conf->expand_progress != MaxSector) {
+   printk("...ok start reshape thread\n");
+   atomic_set(&conf->reshape_stripes, 0);
+   clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+   clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
+   set_bit(MD_RECOVERY_RESHAPE, &mddev->recovery);
+   set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
+   mddev->sync_thread = md_register_thread(md_do_sync, mddev,
+

Re: Kernels and MD versions (was: md: Introduction - raid5 reshape mark-2)

2006-02-09 Thread Mr. James W. Laferriere

Hello Patrik ,

On Wed, 8 Feb 2006, Patrik Jonsson wrote:

Neil Brown wrote:

I always make them against the latest -mm kernel, so that would be a
good place to start.  However things change quickly and I can't
promise it will apply against whatever is the 'latest' today.

If you would like to nominate a particular recent kernel, I'll create
a patch set that is guaranteed to apply against that. (Testing is
always appreciated, and well worth that small effort on my part).


I find this is a major problem for me, too. Even though I try to stay up
to date with the md developments, I have a hard time piecing together
which past patches went with which version, so if I get a recent version
I don't know which patches need to be applied and which are already in
there.

My suggestion is that Neil, should he be willing, keep a log somewhere
which details kernel versions and what major updates in md functionality
go along with them. Something like
2.6.14 raid5 read error correction
2.6.15 md /sys interface
2.6.16.rc1 raid5 reshape
2.6.16.rc2-mm4 something else cool

it would include the released kernels and the previews of the current
kernel. That way, say I see that the latest FC4 kernel is 2.6.14, I
could look and see that since the raid5 read error correction was
included, I don't have to go looking for the patches.

Maybe this is too much hassle (or maybe it's already out there
somewhere) but I'm thinking simple, and I think it would give a
high-level overview of the development for us who are not intimately
involved in every kernel version.

	Iirc, the git/svn/cvs repositories 'can' contain this
	information.  Neil (as you say, if willing) can pull the
	kernel versions & notes for his submissions from the
	repository.  Hth, JimL
--
+--+
| James   W.   Laferriere | SystemTechniques | Give me VMS |
| NetworkEngineer | 3542 Broken Yoke Dr. |  Give me Linux  |
| [EMAIL PROTECTED] | Billings , MT. 59105 |   only  on  AXP |
|  http://www.asteriskhelpdesk.com/cgi-bin/astlance/r.cgi?babydr   |
+--+
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 009 of 13] md: Checkpoint and allow restart of raid5 reshape

2006-03-16 Thread NeilBrown
event(mddev->sb_wait, mddev->sb_dirty == 0 ||
+   kthread_should_stop());
+
for (i=0; i < conf->chunk_size/512; i+= STRIPE_SECTORS) {
int j;
int skipped = 0;
@@ -1867,6 +1889,7 @@ static sector_t sync_request(mddev_t *md
sh = get_active_stripe(conf, sector_nr+i,
   conf->raid_disks, pd_idx, 0);
set_bit(STRIPE_EXPANDING, &sh->state);
+   atomic_inc(&conf->reshape_stripes);
/* If any of this stripe is beyond the end of the old
 * array, then we need to zero those blocks
 */
@@ -2106,10 +2129,61 @@ static int run(mddev_t *mddev)
return -EIO;
}
 
+   if (mddev->reshape_position != MaxSector) {
+   /* Check that we can continue the reshape.
+* Currently only disks can change, it must
+* increase, and we must be past the point where
+* a stripe over-writes itself
+*/
+   sector_t here_new, here_old, there_new;
+   int old_disks;
+
+   if (mddev->new_level != mddev->level ||
+   mddev->new_layout != mddev->layout ||
+   mddev->new_chunk != mddev->chunk_size) {
+   printk(KERN_ERR "raid5: %s: unsupported reshape required - aborting.\n",
+  mdname(mddev));
+   return -EINVAL;
+   }
+   if (mddev->delta_disks <= 0) {
+   printk(KERN_ERR "raid5: %s: unsupported reshape (reduce disks) required - aborting.\n",
+  mdname(mddev));
+   return -EINVAL;
+   }
+   old_disks = mddev->raid_disks - mddev->delta_disks;
+   /* reshape_position must be on a new-stripe boundary, and one
+* further up in new geometry must map after here in old geometry.
+*/
+   here_new = mddev->reshape_position;
+   if (sector_div(here_new, (mddev->chunk_size>>9)*(mddev->raid_disks-1))) {
+   printk(KERN_ERR "raid5: reshape_position not on a stripe boundary\n");
+   return -EINVAL;
+   }
+   /* here_new is the stripe we will write to */
+   here_old = mddev->reshape_position;
+   sector_div(here_old, (mddev->chunk_size>>9)*(old_disks-1));
+   /* here_old is the first stripe that we might need to read from */
+   if (here_new >= here_old) {
+   /* Reading from the same stripe as writing to - bad */
+   printk(KERN_ERR "raid5: reshape_position too early for auto-recovery - aborting.\n");
+   return -EINVAL;
+   }
+   printk(KERN_INFO "raid5: reshape will continue\n");
+   /* OK, we should be able to continue; */
+   }
+
+
mddev->private = kzalloc(sizeof (raid5_conf_t), GFP_KERNEL);
if ((conf = mddev->private) == NULL)
goto abort;
-   conf->disks = kzalloc(mddev->raid_disks * sizeof(struct disk_info),
+   if (mddev->reshape_position == MaxSector) {
+   conf->previous_raid_disks = conf->raid_disks = mddev->raid_disks;
+   } else {
+   conf->raid_disks = mddev->raid_disks;
+   conf->previous_raid_disks = mddev->raid_disks - mddev->delta_disks;
+   }
+
+   conf->disks = kzalloc(conf->raid_disks * sizeof(struct disk_info),
  GFP_KERNEL);
if (!conf->disks)
goto abort;
@@ -2133,7 +2207,7 @@ static int run(mddev_t *mddev)
 
ITERATE_RDEV(mddev,rdev,tmp) {
raid_disk = rdev->raid_disk;
-   if (raid_disk >= mddev->raid_disks
+   if (raid_disk >= conf->raid_disks
|| raid_disk < 0)
continue;
disk = conf->disks + raid_disk;
@@ -2149,7 +2223,6 @@ static int run(mddev_t *mddev)
}
}
 
-   conf->raid_disks = mddev->raid_disks;
/*
 * 0 for a fully functional array, 1 for a degraded array.
 */
@@ -2159,7 +2232,7 @@ static int run(mddev_t *mddev)
conf->level = mddev->level;
conf->algorithm = mddev->layout;
conf->max_nr_stripes = NR_STRIPES;
-   conf->expand_progress = MaxSector;
+   conf->expand_progress = mddev->reshape_position;
 
/* device size must be a multiple of chunk size */
mddev->size &= ~(mddev

Re: [PATCH 006 of 7] md: Checkpoint and allow restart of raid5 reshape

2006-01-27 Thread Molle Bestefich
NeilBrown wrote:
> We allow the superblock to record an 'old' and a 'new'
> geometry, and a position where any conversion is up to.

> When starting an array we check for an incomplete reshape
> and restart the reshape process if needed.

*Super* cool!
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 002 of 9] md: Fix sizing problem with raid5-reshape and CONFIG_LBD=n

2006-11-07 Thread NeilBrown

I forgot to cast the size-in-blocks to (loff_t) before shifting up to a
size-in-bytes.


Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/raid5.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c    2006-11-06 11:21:24.0 +1100
+++ ./drivers/md/raid5.c    2006-11-06 11:28:51.0 +1100
@@ -3659,7 +3659,7 @@ static void end_reshape(raid5_conf_t *co
bdev = bdget_disk(conf->mddev->gendisk, 0);
if (bdev) {
mutex_lock(&bdev->bd_inode->i_mutex);
-   i_size_write(bdev->bd_inode, conf->mddev->array_size << 10);
+   i_size_write(bdev->bd_inode, (loff_t)conf->mddev->array_size << 10);
mutex_unlock(&bdev->bd_inode->i_mutex);
bdput(bdev);
}
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 005 of 6] md: Restart a (raid5) reshape that has been aborted due to a read/write error.

2007-02-19 Thread NeilBrown

An error always aborts any resync/recovery/reshape on the understanding
that it will immediately be restarted if that still makes sense.
However a reshape currently doesn't get restarted.  With this patch
it does.
To avoid restarting when it is not possible to do work, we call 
into the personality to check that a reshape is ok, and strengthen
raid5_check_reshape to fail if there are too many failed devices.

We also break some code out into a separate function: remove_and_add_spares
as the indent level for that code was getting crazy.

Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

### Diffstat output
 ./drivers/md/md.c|   74 +++
 ./drivers/md/raid5.c |2 +
 2 files changed, 47 insertions(+), 29 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2007-02-20 17:13:08.0 +1100
+++ ./drivers/md/md.c   2007-02-20 17:14:35.0 +1100
@@ -5357,6 +5357,44 @@ void md_do_sync(mddev_t *mddev)
 EXPORT_SYMBOL_GPL(md_do_sync);
 
 
+static int remove_and_add_spares(mddev_t *mddev)
+{
+   mdk_rdev_t *rdev;
+   struct list_head *rtmp;
+   int spares = 0;
+
+   ITERATE_RDEV(mddev,rdev,rtmp)
+   if (rdev->raid_disk >= 0 &&
+   (test_bit(Faulty, &rdev->flags) ||
+! test_bit(In_sync, &rdev->flags)) &&
+   atomic_read(&rdev->nr_pending)==0) {
+   if (mddev->pers->hot_remove_disk(
+   mddev, rdev->raid_disk)==0) {
+   char nm[20];
+   sprintf(nm,"rd%d", rdev->raid_disk);
+   sysfs_remove_link(&mddev->kobj, nm);
+   rdev->raid_disk = -1;
+   }
+   }
+
+   if (mddev->degraded) {
+   ITERATE_RDEV(mddev,rdev,rtmp)
+   if (rdev->raid_disk < 0
+   && !test_bit(Faulty, &rdev->flags)) {
+   rdev->recovery_offset = 0;
+   if (mddev->pers->hot_add_disk(mddev,rdev)) {
+   char nm[20];
+   sprintf(nm, "rd%d", rdev->raid_disk);
+   sysfs_create_link(&mddev->kobj,
+ &rdev->kobj, nm);
+   spares++;
+   md_new_event(mddev);
+   } else
+   break;
+   }
+   }
+   return spares;
+}
 /*
  * This routine is regularly called by all per-raid-array threads to
  * deal with generic issues like resync and super-block update.
@@ -5411,7 +5449,7 @@ void md_check_recovery(mddev_t *mddev)
return;
 
if (mddev_trylock(mddev)) {
-   int spares =0;
+   int spares = 0;
 
spin_lock_irq(&mddev->write_lock);
if (mddev->safemode && !atomic_read(&mddev->writes_pending) &&
@@ -5474,35 +5512,13 @@ void md_check_recovery(mddev_t *mddev)
 * Spare are also removed and re-added, to allow
 * the personality to fail the re-add.
 */
-   ITERATE_RDEV(mddev,rdev,rtmp)
-   if (rdev->raid_disk >= 0 &&
-   (test_bit(Faulty, &rdev->flags) || ! test_bit(In_sync, &rdev->flags)) &&
-   atomic_read(&rdev->nr_pending)==0) {
-   if (mddev->pers->hot_remove_disk(mddev, rdev->raid_disk)==0) {
-   char nm[20];
-   sprintf(nm,"rd%d", rdev->raid_disk);
-   sysfs_remove_link(&mddev->kobj, nm);
-   rdev->raid_disk = -1;
-   }
-   }
-
-   if (mddev->degraded) {
-   ITERATE_RDEV(mddev,rdev,rtmp)
-   if (rdev->raid_disk < 0
-   && !test_bit(Faulty, &rdev->flags)) {
-   rdev->recovery_offset = 0;
-   if (mddev->pers->hot_add_disk(mddev,rdev)) {
-   char nm[20];
-   sprintf(nm, "rd%d", rdev->raid_disk);
-   sysfs_create_link(&mddev->kobj, &rdev->kobj, nm);
-   spares++;
-   md_new_event(mddev);
-   } else
-   break;
-   }
-   }
 
-   if (spares) {
+