Re: raid5 stuck in degraded, inactive and dirty mode

2008-01-10 Thread CaT
On Wed, Jan 09, 2008 at 07:16:34PM +1100, CaT wrote:
  But I suspect that --assemble --force would do the right thing.
  Without more details, it is hard to say for sure.
 
 I suspect so as well, but throwing caution to the wind irks me w.r.t. this
 raid array. :)

Sorry. Not to be a pain but considering the previous email with all the
examine dumps, etc would the above be the way to go? I just don't want
to have missed something and bugger the array up totally.

-- 
To the extent that we overreact, we proffer the terrorists the
greatest tribute.
- High Court Judge Michael Kirby


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow
I'm sorry- is this an inappropriate list to ask for help?  There seemed 
to be a fair amount of that when I searched the archives, but I don't 
want to bug developers with my problems!


Please let me know if I should find another place to ask for help (and 
please let me know where that might be!).


Thanks!

Jed Davidow wrote:
I have a RAID5 (5+1spare) setup that works perfectly well until I 
reboot.  I have 6 drives (two different models) partitioned to give me 
2 arrays, md0 and md1, that I use for /home and /var respectively.


When I reboot, the system assembles each array, but swaps out what was 
the spare with one of the member drives.  It then immediately detects 
a degraded array and rebuilds.  After that, all is fine and testing 
has shown things to be working like they should.  Until I reboot.


Example:
Built two arrays:  /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - 
/dev/sd[abcef]2

Added /dev/sdg1 and /dev/sdg2 as spares, and this works.

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

Any ideas?  I have asked about this on various boards (some said UDEV 
rules would help, some thought the issue had to do with the /dev/sdX 
names changing, etc).  I don't think those are applicable since dmesg 
reports the arrays assemble as soon as the disks are detected.



Thanks in advance!


INFO:
(currently the boot drive (non raid) is sdd, otherwise all sd devices 
are part of the raid)


fdisk:

   $ sudo fdisk -l

   Disk /dev/sda: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sda1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdb: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdc: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/md0: 50.0 GB, 50041978880 bytes
   2 heads, 4 sectors/track, 12217280 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md0 doesn't contain a valid partition table

   Disk /dev/md1: 950.1 GB, 950183919616 bytes
   2 heads, 4 sectors/track, 231978496 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md1 doesn't contain a valid partition table

   Disk /dev/sdd: 120.0 GB, 120034123776 bytes
   255 heads, 63 sectors/track, 14593 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x535bfd7a

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdd1   *           1       14219   114214086   83  Linux
   /dev/sdd2           14220       14593     3004155    5  Extended
   /dev/sdd5           14220       14593    3004123+   82  Linux swap / Solaris

   Disk /dev/sde: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sde1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sde2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdf: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdf1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdf2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdg: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdg1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdg2            1522       30401   231978600   fd  Linux raid autodetect


$ sudo mdadm --detail /dev/md0 

Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread dean gaudet
On Thu, 10 Jan 2008, Neil Brown wrote:

 On Wednesday January 9, [EMAIL PROTECTED] wrote:
  On Sun, 2007-12-30 at 10:58 -0700, dean gaudet wrote:
   i have evidence pointing to d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
   
   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1
   
   which was Neil's change in 2.6.22 for deferring generic_make_request 
   until there's enough stack space for it.
   
  
  Commit d89d87965dcbe6fe4f96a2a7e8421b3a75f634d1 reduced stack utilization
  by preventing recursive calls to generic_make_request.  However the
  following conditions can cause raid5 to hang until 'stripe_cache_size' is
  increased:
  
 
 Thanks for pursuing this guys.  That explanation certainly sounds very
 credible.
 
 The generic_make_request_immed is a good way to confirm that we have
 found the bug, but I don't like it as a long-term solution, as it
 just reintroduces the problem that we were trying to solve with the
 problematic commit.
 
 As you say, we could arrange that all request submission happens in
 raid5d and I think this is the right way to proceed.  However we can
 still take some of the work into the thread that is submitting the
 IO by calling raid5d() at the end of make_request, like this.
 
 Can you test it please?  Does it seem reasonable?
 
 Thanks,
 NeilBrown
 
 
 Signed-off-by: Neil Brown [EMAIL PROTECTED]

it has passed 11h of the untar/diff/rm linux.tar.gz workload... that's 
pretty good evidence it works for me.  thanks!

Tested-by: dean gaudet [EMAIL PROTECTED]

 
 ### Diffstat output
  ./drivers/md/md.c    |    2 +-
  ./drivers/md/raid5.c |    4 +++-
  2 files changed, 4 insertions(+), 2 deletions(-)
 
 diff .prev/drivers/md/md.c ./drivers/md/md.c
 --- .prev/drivers/md/md.c 2008-01-07 13:32:10.0 +1100
 +++ ./drivers/md/md.c 2008-01-10 11:08:02.0 +1100
 @@ -5774,7 +5774,7 @@ void md_check_recovery(mddev_t *mddev)
  	if (mddev->ro)
  		return;
 
 -	if (signal_pending(current)) {
 +	if (current == mddev->thread->tsk && signal_pending(current)) {
  		if (mddev->pers->sync_request) {
  			printk(KERN_INFO "md: %s in immediate safe mode\n",
  			       mdname(mddev));
 
 diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
 --- .prev/drivers/md/raid5.c  2008-01-07 13:32:10.0 +1100
 +++ ./drivers/md/raid5.c  2008-01-10 11:06:54.0 +1100
 @@ -3432,6 +3432,7 @@ static int chunk_aligned_read(struct req
  	}
  }
 
 +static void raid5d (mddev_t *mddev);
 
  static int make_request(struct request_queue *q, struct bio * bi)
  {
 @@ -3547,7 +3548,7 @@ static int make_request(struct request_q
  			goto retry;
  		}
  		finish_wait(&conf->wait_for_overlap, &w);
 -		handle_stripe(sh, NULL);
 +		set_bit(STRIPE_HANDLE, &sh->state);
  		release_stripe(sh);
  	} else {
  		/* cannot get stripe for read-ahead, just give-up */
 @@ -3569,6 +3570,7 @@ static int make_request(struct request_q
  			      test_bit(BIO_UPTODATE, &bi->bi_flags)
  				? 0 : -EIO);
  	}
 +	raid5d(mddev);
  	return 0;
  }
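
For anyone hitting this hang on an unpatched kernel, the interim workaround named in the
commit-message excerpt above is simply to enlarge the per-array stripe cache via sysfs.
A minimal sketch, assuming the affected array is md0 (adjust the name and size as needed):

   # stripe_cache_size is in units of stripes; the default is 256
   cat /sys/block/md0/md/stripe_cache_size
   echo 8192 > /sys/block/md0/md/stripe_cache_size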
  


Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread Dan Williams
On Jan 10, 2008 12:13 AM, dean gaudet [EMAIL PROTECTED] wrote:
 w.r.t. dan's cfq comments -- i really don't know the details, but does
 this mean cfq will misattribute the IO to the wrong user/process?  or is
 it just a concern that CPU time will be spent on someone's IO?  the latter
 is fine to me... the former seems sucky because with today's multicore
 systems CPU time seems cheap compared to IO.


I do not see this affecting the time slicing feature of cfq, because
as Neil says the work has to get done at some point.   If I give up
some of my slice working on someone else's I/O chances are the favor
will be returned in kind since the code does not discriminate.  The
io-priority capability of cfq currently does not work as advertised
with current MD since the priority is tied to the current thread and
the thread that actually submits the i/o on a stripe is
non-deterministic.  So I do not see this change making the situation
any worse.  In fact, it may make it a bit better since there is a
higher chance for the thread submitting i/o to MD to do its own i/o to
the backing disks.
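
For readers unfamiliar with the io-priority feature being discussed: cfq priorities are
attached to the submitting process (e.g. via ionice), which is exactly why they get lost
once a different thread such as raid5d issues the member-disk writes. A rough illustration
only; the paths and device names below are placeholders, not tied to any particular MD setup:

   ionice -c2 -n7 tar czf /tmp/home-backup.tgz /home   # best-effort class, lowest priority
   ionice -c1 -n0 dd if=/dev/md0 of=/dev/null bs=1M    # realtime class (needs root)
   # On raid5 the write-out to the member disks is often performed by raid5d,
   # so the member-disk I/O does not inherit these priorities.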

Reviewed-by: Dan Williams [EMAIL PROTECTED]


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Bill Davidsen

Jed Davidow wrote:
I have a RAID5 (5+1spare) setup that works perfectly well until I 
reboot.  I have 6 drives (two different models) partitioned to give me 
2 arrays, md0 and md1, that I use for /home and /var respectively.


When I reboot, the system assembles each array, but swaps out what was 
the spare with one of the member drives.  It then immediately detects 
a degraded array and rebuilds.  After that, all is fine and testing 
has shown things to be working like they should.  Until I reboot.


Example:
Built two arrays:  /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - 
/dev/sd[abcef]2

Added /dev/sdg1 and /dev/sdg2 as spares, and this works.

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

I'm looking at the dmesg which follows and seeing md1 reconstructing. 
This seems to be at variance with "assembles correctly" here. That's the 
only thing which has struck me as worth mentioning so far.
Any ideas?  I have asked about this on various boards (some said UDEV 
rules would help, some thought the issue had to do with the /dev/sdX 
names changing, etc).  I don't think those are applicable since dmesg 
reports the arrays assemble as soon as the disks are detected.



Thanks in advance!


INFO:
(currently the boot drive (non raid) is sdd, otherwise all sd devices 
are part of the raid)


fdisk:

   $ sudo fdisk -l

   Disk /dev/sda: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sda1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdb: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdc: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/md0: 50.0 GB, 50041978880 bytes
   2 heads, 4 sectors/track, 12217280 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md0 doesn't contain a valid partition table

   Disk /dev/md1: 950.1 GB, 950183919616 bytes
   2 heads, 4 sectors/track, 231978496 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md1 doesn't contain a valid partition table

   Disk /dev/sdd: 120.0 GB, 120034123776 bytes
   255 heads, 63 sectors/track, 14593 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x535bfd7a

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdd1   *           1       14219   114214086   83  Linux
   /dev/sdd2           14220       14593     3004155    5  Extended
   /dev/sdd5           14220       14593    3004123+   82  Linux swap / Solaris

   Disk /dev/sde: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sde1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sde2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdf: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdf1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdf2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdg: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdg1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdg2            1522       30401   231978600   fd  Linux raid autodetect


$ sudo mdadm --detail /dev/md0 (md1 shows similar info)

   /dev/md0:
   Version : 00.90.03
 Creation Time : Sat Apr  7 23:32:58 

Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

Hi Bill,

Maybe I'm using the wrong words...
In this instance, on the previous boot, md1 was assembled from 
sd[efbac]2 and sdg2 was the spare.  When I rebooted it assembled from 
sd[efbgc]2 and had no spare (appears that sdg was swapped in for sda).  
Since sdg2 had been the spare, the array is degraded and it rebuilds.  I 
suppose this would be the case if sda2 had been compromised during the 
shutdown (although I see nothing marking sda2 as faulty - I can 
manually add it back immediately).  But this happens just about every time I 
reboot, sometimes to only one of the two arrays, sometimes with the 
corresponding partitions on both arrays and sometimes with different 
partitions on each array.


If something was physically wrong with one of the drives, I would expect 
it to swap in the spare for that drive each time.  But it seems to swap 
in the spare randomly.


Note: last night I shut down completely, restarted after 30 seconds, and for 
the first time in a while did not have an issue.  This time the drives 
were recognized and assigned device nodes in the 'correct' order (MB 
controller first, PCI controller next).  Would device node assignment 
have any effect on how the array is assembled?


It looks to me like md inspects and attempts to assemble after each 
drive controller is scanned (from dmesg, there appears to be a failed 
bind on the first three devices after they are scanned, and then again 
when the second controller is scanned).  Would the scan order cause a 
spare to be swapped in?



Bill Davidsen wrote:

Jed Davidow wrote:
I have a RAID5 (5+1spare) setup that works perfectly well until I 
reboot.  I have 6 drives (two different models) partitioned to give 
me 2 arrays, md0 and md1, that I use for /home and /var respectively.


When I reboot, the system assembles each array, but swaps out what 
was the spare with one of the member drives.  It then immediately 
detects a degraded array and rebuilds.  After that, all is fine and 
testing has shown things to be working like they should.  Until I 
reboot.


Example:
Built two arrays:  /dev/md0 - /dev/sd[abcef]1 and /dev/md1 - 
/dev/sd[abcef]2

Added /dev/sdg1 and /dev/sdg2 as spares, and this works.

One scenario when I reboot:
   md0 is assembled from sd[abceg]1; it's degraded and reports a 
spares missing event.

   md1 assembles correctly, spare is not missing

I'm looking at the dmesg which follows and seeing md1 reconstructing. 
This seems to be at variance with "assembles correctly" here. That's 
the only thing which has struck me as worth mentioning so far.
Any ideas?  I have asked about this on various boards (some said UDEV 
rules would help, some thought the issue had to do with the /dev/sdX 
names changing, etc).  I don't think those are applicable since dmesg 
reports the arrays assemble as soon as the disks are detected.



Thanks in advance!


INFO:
(currently the boot drive (non raid) is sdd, otherwise all sd devices 
are part of the raid)


fdisk:

   $ sudo fdisk -l

   Disk /dev/sda: 250.0 GB, 250059350016 bytes
   255 heads, 63 sectors/track, 30401 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sda1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sda2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdb: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdb1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdb2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/sdc: 251.0 GB, 251000193024 bytes
   255 heads, 63 sectors/track, 30515 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdc1               1        1521    12217401   fd  Linux raid autodetect
   /dev/sdc2            1522       30401   231978600   fd  Linux raid autodetect

   Disk /dev/md0: 50.0 GB, 50041978880 bytes
   2 heads, 4 sectors/track, 12217280 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md0 doesn't contain a valid partition table

   Disk /dev/md1: 950.1 GB, 950183919616 bytes
   2 heads, 4 sectors/track, 231978496 cylinders
   Units = cylinders of 8 * 512 = 4096 bytes
   Disk identifier: 0x

   Disk /dev/md1 doesn't contain a valid partition table

   Disk /dev/sdd: 120.0 GB, 120034123776 bytes
   255 heads, 63 sectors/track, 14593 cylinders
   Units = cylinders of 16065 * 512 = 8225280 bytes
   Disk identifier: 0x535bfd7a

      Device Boot      Start         End      Blocks   Id  System
   /dev/sdd1   *           1       14219   

this goes for my megaraid probs too (was: Re: md rotates RAID5 spare at boot)

2008-01-10 Thread Eric S. Johansson

Jed Davidow wrote:
I'm sorry- is this an inappropriate list to ask for help?  There seemed 
to be a fair amount of that when I searched the archives, but I don't 
want to bug developers with my problems!


Please let me know if I should find another place to ask for help (and 
please let me know where that might be!).


I could also use help with my mega-raid 150 question.  Don't know if I asked it 
wrong or it was the color shirt I was wearing.  I am unfortunately running with 
such a dearth of knowledge on the topic that I don't really know the right 
questions to ask when diagnosing a performance problem.  All I know is that 
there's very little documentation on this card, even less documentation on 
the command-line tool to access/control the card, and, if it makes the most sense, 
I am perfectly willing to deep-six the card on eBay and pick up a couple of 
reasonable-speed serial ATA controller cards in its stead.


the only reason I want to try and learn more about the hardware raid is because 
the problems I'm experiencing with my virtual machines on this platform mimic 
problems a customer of mine is experiencing and if I can fix them just by 
changing how the raid controller uses the discs, then that is a huge win. 
Personally, I think it's something a little deeper because VMware server seems 
to go out to lunch whenever there is a backup in the disk I/O queue.  I'm 
seriously thinking about picking up esx as soon as the budget allows.


I just need some good solid advice on what path I should take.

---eric


--
Speech-recognition in use.  It makes mistakes, I correct some.


Re: raid5 stuck in degraded, inactive and dirty mode

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 On Wed, Jan 09, 2008 at 07:16:34PM +1100, CaT wrote:
   But I suspect that --assemble --force would do the right thing.
   Without more details, it is hard to say for sure.
  
  I suspect so as well, but throwing caution to the wind irks me w.r.t. this
  raid array. :)
 
 Sorry. Not to be a pain but considering the previous email with all the
 examine dumps, etc would the above be the way to go? I just don't want
 to have missed something and bugger the array up totally.

Yes, definitely.

The superblocks look perfectly normal for a single drive failure
followed by a crash.  So --assemble --force is the way to go.

Technically you could have some data corruption if a write was under
way at the time of the crash.  In that case the parity block of that
stripe could be wrong, so the recovered data for the missing device
could be wrong.
This is why you are required to use --force - to confirm that you
are aware that there could be a problem.

It would be worth running fsck just to be sure that nothing critical
has been corrupted.  Also if you have a recent backup, I wouldn't
recycle it until I was fairly sure that all your data was really safe.
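
For concreteness, the recovery being recommended amounts to something like the following;
device names are placeholders, so substitute the array and member partitions from the
earlier --examine dumps:

   mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
   cat /proc/mdstat                  # should come up active but degraded (one member missing)
   fsck -n /dev/md0                  # read-only check first (assumes the fs sits directly on md0)
   mdadm /dev/md0 --add /dev/sde1    # then re-add the dropped disk to start the rebuild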

But in my experience the chance of actual data corruption in this
situation is fairly low.

NeilBrown


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 
 It looks to me like md inspects and attempts to assemble after each 
 drive controller is scanned (from dmesg, there appears to be a failed 
 bind on the first three devices after they are scanned, and then again 
 when the second controller is scanned).  Would the scan order cause a 
 spare to be swapped in?
 

This suggests that "mdadm --incremental" is being used to assemble the
arrays.  Every time udev finds a new device, it gets added to
whichever array it should be in.
If it is called as "mdadm --incremental --run", then the array will get
started as soon as possible, even if it is degraded.  Without the
--run, it will wait until all devices are available.

Even with "mdadm --incremental --run", you shouldn't get a resync if
the last device is added before the array is written to.
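
For reference, a sketch of what udev-driven incremental assembly boils down to; the
device name is illustrative:

   # run by udev as each component device appears
   mdadm --incremental /dev/sdb1          # add the device to whichever array it belongs to
   mdadm --incremental --run /dev/sdb1    # with --run, start the array as soon as possible, even degraded
   grep -R mdadm /etc/udev                # see how a particular system wires this up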

What distro are you running?
What does
   grep -R mdadm /etc/udev

show?

NeilBrown


Re: raid5 stuck in degraded, inactive and dirty mode

2008-01-10 Thread CaT
On Fri, Jan 11, 2008 at 07:21:42AM +1100, Neil Brown wrote:
 On Thursday January 10, [EMAIL PROTECTED] wrote:
  On Wed, Jan 09, 2008 at 07:16:34PM +1100, CaT wrote:
But I suspect that --assemble --force would do the right thing.
Without more details, it is hard to say for sure.
   
   I suspect so as well, but throwing caution to the wind irks me w.r.t. this
   raid array. :)
  
  Sorry. Not to be a pain but considering the previous email with all the
  examine dumps, etc would the above be the way to go? I just don't want
  to have missed something and bugger the array up totally.
 
 Yes, definitely.

Cool.

 The superblocks look perfectly normal for a single drive failure
 followed by a crash.  So --assemble --force is the way to go.
 
 Technically you could have some data corruption if a write was under
 way at the time of the crash.  In that case the parity block of that

I'd expect so as I think the crash situation is one of rather severe
abruptness.

 stripe could be wrong, so the recovered data for the missing device
 could be wrong.
 This is why you are required to use --force - to confirm that you
 are aware that there could be a problem.

Right.

 It would be worth running fsck just to be sure that nothing critical
 has been corrupted.  Also if you have a recent backup, I wouldn't
 recycle it until I was fairly sure that all your data was really safe.

I'll be doing a fsck and checking what data I can over the weekend to
see what was fragged. I suspect it'll just be something rsynced due to
the time of the crash.

 But in my experience the chance of actual data corruption in this
 situation is fairly low.

Yaay. :)

Thanks. I'll now go and put humpty together again. For some reason
Johnny Cash's 'Ring of Fire' is playing in my head.

-- 
To the extent that we overreact, we proffer the terrorists the
greatest tribute.
- High Court Judge Michael Kirby


The effects of multiple layers of block drivers

2008-01-10 Thread Dennison Williams
Hello,

I am starting to dig into the Block subsystem to try and uncover the
reason for some data I lost recently.  My situation is that I have
multiple block drivers on top of each other and am wondering how a
raid 5 rebuild would affect the block devices above it.

The layers are raid 5 -> lvm -> cryptoloop.  It seems that after the
raid 5 device was rebuilt by adding in a new disk, the cryptoloop
no longer has a valid ext3 filesystem on it.

As a raid device rebuilds, is there any rearranging of sectors or
corresponding blocks that would affect another block device on top of it?

Sincerely,
Dennison Williams


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

distro: Ubuntu 7.10

Two files show up...

85-mdadm.rules:
# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
   RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"



65-mdadm.vol_id.rules:
# This file causes Linux RAID (mdadm) block devices to be checked for
# further filesystems if the array is active.
# See udev(8) for syntax

SUBSYSTEM!="block", GOTO="mdadm_end"
KERNEL!="md[0-9]*", GOTO="mdadm_end"
ACTION!="add|change", GOTO="mdadm_end"

# Check array status
ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end"

# Obtain array information
IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}"
ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"

# by-uuid and by-label symlinks
IMPORT{program}="vol_id --export $tempnode"
OPTIONS="link_priority=-100"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \
   SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \
   SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"


I see.  So udev is invoking the assemble command as soon as it detects 
the devices.  So is it possible that the spare is not the last drive to 
be detected and mdadm assembles too soon?



Neil Brown wrote:

On Thursday January 10, [EMAIL PROTECTED] wrote:
  
It looks to me like md inspects and attempts to assemble after each 
drive controller is scanned (from dmesg, there appears to be a failed 
bind on the first three devices after they are scanned, and then again 
when the second controller is scanned).  Would the scan order cause a 
spare to be swapped in?





This suggests that mdadm --incremental is being used to assemble the
arrays.  Every time udev finds a new device, it gets added to
whichever array is should be in.
If it is called as mdadm --incremental --run, then it will get
started as soon as possible, even if it is degraded.  With the
--run, it will wait until all devices are available.

Even with mdadm --incremental --run, you shouldn't get a resync if
the last device is added before the array is written to.

What distro are you running?
What does
   grep -R mdadm /etc/udev

show?

NeilBrown


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow
One quick question about those rules.  The 65-mdadm rule looks like it 
checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays.  
Shouldn't they run in the other order?





distro: Ubuntu 7.10

Two files show up...

85-mdadm.rules:
# This file causes block devices with Linux RAID (mdadm) signatures to
# automatically cause mdadm to be run.
# See udev(8) for syntax

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
   RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"



65-mdadm.vol_id.rules:
# This file causes Linux RAID (mdadm) block devices to be checked for
# further filesystems if the array is active.
# See udev(8) for syntax

SUBSYSTEM!="block", GOTO="mdadm_end"
KERNEL!="md[0-9]*", GOTO="mdadm_end"
ACTION!="add|change", GOTO="mdadm_end"

# Check array status
ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end"

# Obtain array information
IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}"
ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"

# by-uuid and by-label symlinks
IMPORT{program}="vol_id --export $tempnode"
OPTIONS="link_priority=-100"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \
   SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \
   SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"


I see.  So udev is invoking the assemble command as soon as it detects 
the devices.  So is it possible that the spare is not the last drive to 
be detected and mdadm assembles too soon?




Neil Brown wrote:

On Thursday January 10, [EMAIL PROTECTED] wrote:
  
It looks to me like md inspects and attempts to assemble after each 
drive controller is scanned (from dmesg, there appears to be a failed 
bind on the first three devices after they are scanned, and then again 
when the second controller is scanned).  Would the scan order cause a 
spare to be swapped in?





This suggests that "mdadm --incremental" is being used to assemble the
arrays.  Every time udev finds a new device, it gets added to
whichever array it should be in.
If it is called as "mdadm --incremental --run", then the array will get
started as soon as possible, even if it is degraded.  Without the
--run, it will wait until all devices are available.

Even with "mdadm --incremental --run", you shouldn't get a resync if
the last device is added before the array is written to.

What distro are you running?
What does
   grep -R mdadm /etc/udev

show?

NeilBrown

  



Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 distro: Ubuntu 7.10
 
 Two files show up...
 
 85-mdadm.rules:
 # This file causes block devices with Linux RAID (mdadm) signatures to
 # automatically cause mdadm to be run.
 # See udev(8) for syntax
 
 SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
 RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"

 
 I see.  So udev is invoking the assemble command as soon as it detects 
 the devices.  So is it possible that the spare is not the last drive to 
 be detected and mdadm assembles too soon?

The '--no-degraded' should stop it from assembling until all expected
devices have been found.  It could assemble before the spare is found,
but should not assemble before all the data devices have been found.

The dmesg trace you included in your first mail doesn't actually
show anything wrong - it never starts an incomplete array.
Can you try again and get a trace where there definitely is a rebuild
happening?
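
If it does happen again, capturing the array and member state straight away should show
which device was dropped; a quick sketch, with example device names:

   cat /proc/mdstat                                      # which arrays are degraded or rebuilding
   mdadm --detail /dev/md0 /dev/md1                      # role of each member, failed/spare state
   mdadm --examine /dev/sdg1 | grep -i -E 'event|state'  # compare event counts across members
   dmesg | grep -E 'md|raid'                             # the assembly/bind sequence at boot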

And please don't drop linux-raid from the 'cc' list.

NeilBrown


Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 One quick question about those rules.  The 65-mdadm rule looks like it 
 checks ACTIVE arrays for filesystems, and the 85 rule assembles arrays.  
 Shouldn't they run in the other order?
 

They are fine.  The '65' rule applies to arrays.  I.e. it fires on an
array device once it has been started.
The '85' rule applies to component devices.

They are quite independent.

NeilBrown


 
 
 
 distro: Ubuntu 7.10
 
 Two files show up...
 
 85-mdadm.rules:
 # This file causes block devices with Linux RAID (mdadm) signatures to
 # automatically cause mdadm to be run.
 # See udev(8) for syntax
 
 SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
 RUN+="watershed /sbin/mdadm --assemble --scan --no-degraded"
 
 
 
 65-mdadm.vol_id.rules:
 # This file causes Linux RAID (mdadm) block devices to be checked for
 # further filesystems if the array is active.
 # See udev(8) for syntax
 
 SUBSYSTEM!="block", GOTO="mdadm_end"
 KERNEL!="md[0-9]*", GOTO="mdadm_end"
 ACTION!="add|change", GOTO="mdadm_end"
 
 # Check array status
 ATTR{md/array_state}=="|clear|inactive", GOTO="mdadm_end"
 
 # Obtain array information
 IMPORT{program}="/sbin/mdadm --detail --export $tempnode"
 ENV{MD_NAME}=="?*", SYMLINK+="disk/by-id/md-name-$env{MD_NAME}"
 ENV{MD_UUID}=="?*", SYMLINK+="disk/by-id/md-uuid-$env{MD_UUID}"
 
 # by-uuid and by-label symlinks
 IMPORT{program}="vol_id --export $tempnode"
 OPTIONS="link_priority=-100"
 ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", \
 SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
 ENV{ID_FS_USAGE}=="filesystem|other", ENV{ID_FS_LABEL_ENC}=="?*", \
 SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"
 
 
 I see.  So udev is invoking the assemble command as soon as it detects 
 the devices.  So is it possible that the spare is not the last drive to 
 be detected and mdadm assembles too soon?
 
 
 
 Neil Brown wrote:
  On Thursday January 10, [EMAIL PROTECTED] wrote:

  It looks to me like md inspects and attempts to assemble after each 
  drive controller is scanned (from dmesg, there appears to be a failed 
  bind on the first three devices after they are scanned, and then again 
  when the second controller is scanned).  Would the scan order cause a 
  spare to be swapped in?
 
  
 
  This suggests that "mdadm --incremental" is being used to assemble the
  arrays.  Every time udev finds a new device, it gets added to
  whichever array it should be in.
  If it is called as "mdadm --incremental --run", then the array will get
  started as soon as possible, even if it is degraded.  Without the
  --run, it will wait until all devices are available.
 
  Even with "mdadm --incremental --run", you shouldn't get a resync if
  the last device is added before the array is written to.
 
  What distro are you running?
  What does
 grep -R mdadm /etc/udev
 
  show?
 
  NeilBrown
 



Re: md rotates RAID5 spare at boot

2008-01-10 Thread Jed Davidow

(Sorry- yes it looks like I posted an incorrect dmesg extract)

$ egrep 'sd|md|raid|scsi' /var/log/dmesg.0
[   36.112449] md: linear personality registered for level -1
[   36.117197] md: multipath personality registered for level -4
[   36.121795] md: raid0 personality registered for level 0
[   36.126950] md: raid1 personality registered for level 1
[   36.131424] raid5: automatically using best checksumming function: 
pIII_sse

[   36.150020] raid5: using function: pIII_sse (4564.000 MB/sec)
[   36.218015] raid6: int32x1    780 MB/s
[   36.285943] raid6: int32x2    902 MB/s
[   36.353961] raid6: int32x4    667 MB/s
[   36.421869] raid6: int32x8    528 MB/s
[   36.489811] raid6: mmxx1     1813 MB/s
[   36.557775] raid6: mmxx2     2123 MB/s
[   36.625763] raid6: sse1x1    1101 MB/s
[   36.693717] raid6: sse1x2    1898 MB/s
[   36.761688] raid6: sse2x1    2227 MB/s
[   36.829647] raid6: sse2x2    3178 MB/s
[   36.829695] raid6: using algorithm sse2x2 (3178 MB/s)
[   36.829744] md: raid6 personality registered for level 6
[   36.829793] md: raid5 personality registered for level 5
[   36.829842] md: raid4 personality registered for level 4
[   36.853475] md: raid10 personality registered for level 10
[   37.781513] scsi0 : sata_sil
[   37.781628] scsi1 : sata_sil
[   37.781724] scsi2 : sata_sil
[   37.781820] scsi3 : sata_sil
[   37.781922] ata1: SATA max UDMA/100 cmd 0xf88c0080 ctl 0xf88c008a 
bmdma 0xf88c irq 20
[   37.781997] ata2: SATA max UDMA/100 cmd 0xf88c00c0 ctl 0xf88c00ca 
bmdma 0xf88c0008 irq 20
[   37.782069] ata3: SATA max UDMA/100 cmd 0xf88c0280 ctl 0xf88c028a 
bmdma 0xf88c0200 irq 20
[   37.782142] ata4: SATA max UDMA/100 cmd 0xf88c02c0 ctl 0xf88c02ca 
bmdma 0xf88c0208 irq 20
[   39.577812] scsi 0:0:0:0: Direct-Access ATA  WDC WD2500JD-00H 
08.0 PQ: 0 ANSI: 5
[   39.578027] scsi 1:0:0:0: Direct-Access ATA  Maxtor 7L250S0   
BACE PQ: 0 ANSI: 5
[   39.578234] scsi 3:0:0:0: Direct-Access ATA  Maxtor 7L250S0   
BACE PQ: 0 ANSI: 5

[   39.632483] scsi4 : ata_piix
[   39.632591] scsi5 : ata_piix
[   39.632812] ata5: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 
bmdma 0x0001f000 irq 14
[   39.634522] ata6: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 
bmdma 0x0001f008 irq 15
[   39.634924] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors 
(250059 MB)

[   39.634995] sd 0:0:0:0: [sda] Write Protect is off
[   39.635048] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   39.635076] sd 0:0:0:0: [sda] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.635218] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors 
(250059 MB)

[   39.635292] sd 0:0:0:0: [sda] Write Protect is off
[   39.635350] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   39.635380] sd 0:0:0:0: [sda] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.635462]  sda: sda1 sda2
[   39.650092] sd 0:0:0:0: [sda] Attached SCSI disk
[   39.650226] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.650296] sd 1:0:0:0: [sdb] Write Protect is off
[   39.650348] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   39.650379] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.650505] sd 1:0:0:0: [sdb] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.650573] sd 1:0:0:0: [sdb] Write Protect is off
[   39.650625] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[   39.650657] sd 1:0:0:0: [sdb] Write cache: disabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.650727]  sdb: sdb1 sdb2
[   39.667599] sd 1:0:0:0: [sdb] Attached SCSI disk
[   39.667719] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.667788] sd 3:0:0:0: [sdc] Write Protect is off
[   39.667840] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   39.667871] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA
[   39.667997] sd 3:0:0:0: [sdc] 490234752 512-byte hardware sectors 
(251000 MB)

[   39.668064] sd 3:0:0:0: [sdc] Write Protect is off
[   39.668116] sd 3:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[   39.668146] sd 3:0:0:0: [sdc] Write cache: enabled, read cache: 
enabled, doesn't support DPO or FUA

[   39.668213]  sdc: sdc1 sdc2
[   39.692703] sd 3:0:0:0: [sdc] Attached SCSI disk
[   39.699348] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   39.699570] sd 1:0:0:0: Attached scsi generic sg1 type 0
[   39.699786] sd 3:0:0:0: Attached scsi generic sg2 type 0
[   39.834560] md: md0 stopped.
[   39.870361] md: bind<sdc1>
[   39.870527] md: md1 stopped.
[   39.910999] md: md0 stopped.
[   39.911064] md: unbind<sdc1>
[   39.911120] md: export_rdev(sdc1)
[   39.929760] md: bind<sda1>
[   39.929953] md: bind<sdc1>
[   39.930139] md: bind<sdb1>
[   39.930231] md: md1 stopped.
[   39.932468] md: bind<sdc2>
[   39.932674] md: bind<sda2>
[   39.932860] md: bind<sdb2>
[   40.411001] scsi 4:0:1:0: CD-ROM            LITE-ON  DVDRW SOHW-1213S 
TS09 PQ: 0 ANSI: 5

[   40.411152] scsi 4:0:1:0: Attached 

Re: md rotates RAID5 spare at boot

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 (Sorry- yes it looks like I posted an incorrect dmesg extract)

This still doesn't seem to match your description.
I see:

 [   41.247389] md: bind<sdf1>
 [   41.247584] md: bind<sdb1>
 [   41.247787] md: bind<sda1>
 [   41.247971] md: bind<sdc1>
 [   41.248151] md: bind<sdg1>
 [   41.248325] md: bind<sde1>
 [   41.256718] raid5: device sde1 operational as raid disk 0
 [   41.256771] raid5: device sdc1 operational as raid disk 4
 [   41.256821] raid5: device sda1 operational as raid disk 3
 [   41.256870] raid5: device sdb1 operational as raid disk 2
 [   41.256919] raid5: device sdf1 operational as raid disk 1
 [   41.257426] raid5: allocated 5245kB for md0
 [   41.257476] raid5: raid level 5 set md0 active with 5 out of 5 
 devices, algorithm 2

which looks like 'md0' started with 5 of 5 drives, plus g1 is there as
a spare.  And

 [   41.312250] md: bind<sdf2>
 [   41.312476] md: bind<sdb2>
 [   41.312711] md: bind<sdg2>
 [   41.312922] md: bind<sdc2>
 [   41.313138] md: bind<sda2>
 [   41.313343] md: bind<sde2>
 [   41.313452] md: md1: raid array is not clean -- starting background 
 reconstruction
 [   41.322189] raid5: device sde2 operational as raid disk 0
 [   41.322243] raid5: device sdc2 operational as raid disk 4
 [   41.322292] raid5: device sdg2 operational as raid disk 3
 [   41.322342] raid5: device sdb2 operational as raid disk 2
 [   41.322391] raid5: device sdf2 operational as raid disk 1
 [   41.322823] raid5: allocated 5245kB for md1
 [   41.322872] raid5: raid level 5 set md1 active with 5 out of 5 
 devices, algorithm 2

md1 also assembled with 5/5 drives and sda2 as a spare.
This one was not shut down cleanly, so it started a resync.  But there
is no evidence of anything starting degraded.
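
As an aside on reading these traces: /proc/mdstat labels the two cases differently, which
is a quick way to tell an unclean-shutdown resync from a rebuild onto a spare (md1 here
is just the example from above):

   grep -A 2 '^md1' /proc/mdstat
   #  ... resync = ...     <- parity check after an unclean shutdown; all members present
   #  ... recovery = ...   <- rebuilding onto a spare after a member was dropped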



NeilBrown


Re: The effects of multiple layers of block drivers

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 Hello,
 
 I am starting to dig into the Block subsystem to try and uncover the
 reason for some data I lost recently.  My situation is that I have
 multiple block drivers on top of each other and am wondering how a
 raid 5 rebuild would affect the block devices above it.

It should just work - no surprises.  raid5 is just a block device
like any other.  When doing a rebuild it might be a bit slower, but
that is all.

 
 The layers are raid 5 -> lvm -> cryptoloop.  It seems that after the
 raid 5 device was rebuilt by adding in a new disk, the cryptoloop
 no longer has a valid ext3 filesystem on it.

There was a difference of opinion between raid5 and dm-crypt which
could cause some corruption.
What kernel version are you using, and are you using dm-crypt or loop
(e.g. losetup) with encryption?


 
 As a raid device rebuilds, is there any rearranging of sectors or
 corresponding blocks that would affect another block device on top of it?

No.
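
A few read-only checks can help narrow down which layer lost the data; a sketch, with
device names assumed from the raid 5 -> lvm -> cryptoloop stack described above:

   cat /proc/mdstat                # raid5 state and rebuild history
   mdadm --detail /dev/md0         # placeholder array name
   pvdisplay && lvdisplay          # does LVM still see its physical volume and logical volumes?
   losetup -a                      # which loop devices exist, and over which LV
   dumpe2fs -h /dev/loop0 | head   # placeholder loop device: is an ext3 superblock still visible?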

NeilBrown


Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread Neil Brown
On Thursday January 10, [EMAIL PROTECTED] wrote:
 On Jan 10, 2008 12:13 AM, dean gaudet [EMAIL PROTECTED] wrote:
  w.r.t. dan's cfq comments -- i really don't know the details, but does
  this mean cfq will misattribute the IO to the wrong user/process?  or is
  it just a concern that CPU time will be spent on someone's IO?  the latter
  is fine to me... the former seems sucky because with today's multicore
  systems CPU time seems cheap compared to IO.
 
 
 I do not see this affecting the time slicing feature of cfq, because
 as Neil says the work has to get done at some point.   If I give up
 some of my slice working on someone else's I/O chances are the favor
 will be returned in kind since the code does not discriminate.  The
 io-priority capability of cfq currently does not work as advertised
 with current MD since the priority is tied to the current thread and
 the thread that actually submits the i/o on a stripe is
 non-deterministic.  So I do not see this change making the situation
 any worse.  In fact, it may make it a bit better since there is a
 higher chance for the thread submitting i/o to MD to do its own i/o to
 the backing disks.
 
 Reviewed-by: Dan Williams [EMAIL PROTECTED]

Thanks.
But I suspect you didn't test it with a bitmap :-)
I ran the mdadm test suite and it hit a problem - easy enough to fix.

I'll look out for any other possible related problem (due to raid5d
running in different processes) and then submit it.

Thanks,
NeilBrown


Re: 2.6.24-rc6 reproducible raid5 hang

2008-01-10 Thread dean gaudet
On Fri, 11 Jan 2008, Neil Brown wrote:

 Thanks.
 But I suspect you didn't test it with a bitmap :-)
 I ran the mdadm test suite and it hit a problem - easy enough to fix.

damn -- i lost my bitmap 'cause it was external and i didn't have things 
set up properly to pick it up after a reboot :)

if you send an updated patch i'll give it another spin...

-dean