Re: RAID 1 vs RAID 0

2006-01-18 Thread Max Waterman

Mark Hahn wrote:
>> They seem to suggest RAID 0 is faster for reading than RAID 1, and I
>> can't figure out why.
>
> with R0, streaming from two disks involves no seeks;
> with R1, a single stream will have to read, say 0-64K from the first disk,
> and 64-128K from the second.  these could happen at the same time, and
> would indeed match R0 bandwidth.  but with R1, each disk has to seek past
> the blocks being read from the other disk.  seeking tends to be slow...

Ah, a good way of putting it...I think I was pretty much there with my
followup message.

Still, it seems like it should be a solvable problem...if you order the
data differently on each disk; for example, in the two disk case,
putting odd and even numbered 'stripes' on different platters [or sides
of platters].

>> Clearly, the write performance is worse for RAID 1 than RAID 0 since
>> with RAID 1 the data you are writing at the same time is the same for
>> both drives;
>
> the cost for doing the double writes in R1 is not high, unless you've
> already got a bottleneck somewhere that limits you to talking to one disk
> at a time.  for instance, R1 to a pair of disks at 50 MB/s apiece is
> basically trivial for a decent server, since it's about 1% of memory
> bandwidth, and a smallish fraction of even plain old 64x66 PCI.

>> [...] array has more than two disks, that would make RAID 1 *faster* than RAID 0.
>
> R1 is not going to be faster than R0 on the same number of disks.

Yeah, I think I see that now.

Thanks.

Max.

> regards, mark hahn.
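
To make the seek argument concrete, here is a toy C sketch (not taken from any driver code) of where consecutive 64K chunks of one sequential stream land on a 2-disk array; the alternating-chunk split for R1 is just an assumed balancing policy for illustration:

  /* Toy sketch (not driver code): where consecutive 64K chunks of one
   * sequential stream land on a 2-disk array.  With R0 the offsets on
   * each disk are contiguous; with R1 balanced by alternating chunks,
   * each disk skips the chunks its mirror serves, hence the seeks. */
  #include <stdio.h>

  int main(void)
  {
      const int chunk_kb = 64;

      for (int chunk = 0; chunk < 8; chunk++) {
          int disk      = chunk % 2;                /* disk serving this chunk */
          int raid0_off = (chunk / 2) * chunk_kb;   /* contiguous per disk     */
          int raid1_off = chunk * chunk_kb;         /* 64K gaps per disk       */

          printf("chunk %d -> disk %d  R0 offset %3dK  R1 offset %3dK\n",
                 chunk, disk, raid0_off, raid1_off);
      }
      return 0;
  }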



Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Sander
Michael Tokarev wrote (ao):
 Most problematic case so far, which I described numerous times (like,
 why linux raid isn't Raid really, why it can be worse than plain
 disk) is when, after single sector read failure, md kicks the whole
 disk off the array, and when you start resync (after replacing the
 bad drive or just remapping that bad sector or even doing nothing,
 as it will be remapped in almost all cases during write, on real
 drives anyway),

If the (harddisk internal) remap succeeded, the OS doesn't see the bad
sector at all I believe.

If you (the OS) do see a bad sector, the disk couldn't remap, and goes
downhill from there, right?

Sander

-- 
Humilis IT Services and Solutions
http://www.humilis.net


Re: RAID 1 vs RAID 0

2006-01-18 Thread Brad Campbell

Max Waterman wrote:


Still, it seems like it should be a solvable problem...if you order the 
data differently on each disk; for example, in the two disk case, 
putting odd and even numbered 'stripes' on different platters [or sides 
of platters].




The only problem there is determining the internal geometry of the disk, and knowing that each disk
is probably different. How do you know which logical sector number correlates to which surface, and
whereabouts on the surface? Just thinking about it makes my brain hurt.


Not like the good old days of the old stepper disks.

Brad
--
Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so. -- Douglas Adams


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Alan Cox
On Mer, 2006-01-18 at 09:14 +0100, Sander wrote:
 If the (harddisk internal) remap succeeded, the OS doesn't see the bad
 sector at all I believe.

True for ATA; in the SCSI case you may be told about the remap having
occurred, but it's a "by the way" type of message, not an error proper.

 If you (the OS) do see a bad sector, the disk couldn't remap, and goes
 downhill from there, right?

If a hot spare is configured it will be dropped into the configuration
at that point.



Re: paralellism of device use in md

2006-01-18 Thread Andy Smith
On Tue, Jan 17, 2006 at 12:09:27PM +, Andy Smith wrote:
 I'm wondering: how well does md currently make use of the fact there
 are multiple devices in the different (non-parity) RAID levels for
 optimising reading and writing?

Thanks all for your answers.




Re: RAID 1 vs RAID 0

2006-01-18 Thread Mario 'BitKoenig' Holbe
Max Waterman [EMAIL PROTECTED] wrote:
 Still, it seems like it should be a solvable problem...if you order the 
 data differently on each disk; for example, in the two disk case, 
 putting odd and even numbered 'stripes' on different platters [or sides 

Well, unfortunately, with today's hard disks the OS no longer knows anything
about platters, cylinders, sectors, zones, etc.
Furthermore, such an attempt would break the (really nice!) ability to
use each single RAID1 mirror as a plain blockdevice with a plain
filesystem on it.


regards
   Mario
-- 
It is a capital mistake to theorize before one has data.
Insensibly one begins to twist facts to suit theories instead of theories
to suit facts.   -- Sherlock Holmes by Arthur Conan Doyle



Re: RAID 1 vs RAID 0

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 Mark Hahn wrote:
  They seem to suggest RAID 0 is faster for reading than RAID 1, and I 
  can't figure out why.
  
  with R0, streaming from two disks involves no seeks;
  with R1, a single stream will have to read, say 0-64K from the first disk,
  and 64-128K from the second.  these could happen at the same time, and 
  would indeed match R0 bandwidth.  but with R1, each disk has to seek past
  the blocks being read from the other disk.  seeking tends to be slow...
 
 Ah, a good way of putting it...I think I was pretty much there with my 
 followup message.
 
 Still, it seems like it should be a solvable problem...if you order the 
 data differently on each disk; for example, in the two disk case, 
 putting odd and even numbered 'stripes' on different platters [or sides 
 of platters].

raid10 'far' mode is exactly designed to address this issue.
If you create a raid10 with 2 drives and a layout of 'f2':

  mdadm -C /dev/mdX --level=10 --layout=f2 --raid-disks=2 /dev/XX /dev/YY

then reads should be comparable to a raid0 of 2 drives, but you still
get raid1 protection.
Writes may be substantially slower, though I haven't measured to be
sure.

It doesn't do different sides of platters as that is not possible
with modern drives (you have no knowledge and no control).  It does
different 'ends' of the drive.
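
For illustration, a simplified model of where a 'far 2' layout places the two copies of each chunk (a sketch only, not the md driver's actual mapping code):

  #include <stdio.h>

  /* Simplified model (not the md driver code) of a raid10 'far 2' layout:
   * each disk is split into two halves; the first copy of the data is
   * striped raid0-style across the first halves, and the second copy is
   * the same stripe rotated by one device, stored in the far halves. */
  struct copy { int disk; long offset; };           /* offset in chunks */

  static void far2_map(long chunk, int ndisks, long half, struct copy out[2])
  {
      out[0].disk   = (int)(chunk % ndisks);        /* raid0-like first copy */
      out[0].offset = chunk / ndisks;

      out[1].disk   = (int)((chunk + 1) % ndisks);  /* rotated mirror copy   */
      out[1].offset = half + chunk / ndisks;        /* lives in the far half */
  }

  int main(void)
  {
      struct copy c[2];
      for (long chunk = 0; chunk < 6; chunk++) {
          far2_map(chunk, 2, 1000, c);
          printf("chunk %ld: copy0 on disk %d@%ld, copy1 on disk %d@%ld\n",
                 chunk, c[0].disk, c[0].offset, c[1].disk, c[1].offset);
      }
      return 0;
  }

In this model a sequential read can be served raid0-style from the first halves, while every write also has to touch the far half, which is consistent with the read/write behaviour described above.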

NeilBrown



Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread John Hendrikx

Sander wrote:

Michael Tokarev wrote (ao):
  

Most problematic case so far, which I described numerous times (like,
why linux raid isn't Raid really, why it can be worse than plain
disk) is when, after single sector read failure, md kicks the whole
disk off the array, and when you start resync (after replacing the
bad drive or just remapping that bad sector or even doing nothing,
as it will be remapped in almost all cases during write, on real
drives anyway),



If the (harddisk internal) remap succeeded, the OS doesn't see the bad
sector at all I believe.
  
Most hard disks will not remap sectors when reading fails, because then 
the contents would be lost permanently. 

Instead, they will report a failure to the OS, hoping that the sector 
might be readable at some later time.


What Linux RAID could do is reconstruct the sector that failed from
the other drives and then write it back to disk.  Because the original
contents of the sector will be overwritten anyway, the hard disk can
safely remap the sector (and it will -- I have often repaired bad sectors
by writing to them).
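
A self-contained toy model of that suggestion for the raid5 case (illustrative only, not md code): rebuild the unreadable block by XOR-ing the corresponding blocks of the surviving members, then write the result back so the drive gets the chance to remap the sector.

  #include <stdio.h>
  #include <string.h>

  #define NDISKS 4
  #define BLK    8

  static unsigned char disk[NDISKS][BLK];   /* 3 data members + 1 parity */

  /* XOR the blocks of every member except 'failed' into 'out'. */
  static void rebuild_block(int failed, unsigned char *out)
  {
      memset(out, 0, BLK);
      for (int d = 0; d < NDISKS; d++)
          if (d != failed)
              for (int i = 0; i < BLK; i++)
                  out[i] ^= disk[d][i];
  }

  int main(void)
  {
      /* fill the data members, then compute parity as their XOR */
      for (int d = 0; d < 3; d++)
          for (int i = 0; i < BLK; i++)
              disk[d][i] = (unsigned char)(d * 16 + i);
      rebuild_block(3, disk[3]);

      unsigned char lost[BLK], repaired[BLK];
      memcpy(lost, disk[1], BLK);           /* remember what disk 1 held   */
      memset(disk[1], 0xff, BLK);           /* pretend the read failed     */

      rebuild_block(1, repaired);           /* reconstruct from the others */
      memcpy(disk[1], repaired, BLK);       /* "write back" -> drive remaps */

      printf("repair %s\n", memcmp(lost, repaired, BLK) ? "failed" : "ok");
      return 0;
  }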



If you (the OS) do see a bad sector, the disk couldn't remap, and goes
downhill from there, right?
  
Not necessarily. If you see a bad sector after *writing* to it (several
times), then your hard disk will probably go bad soon.  Most hard disks
only remap sectors on write, so a simple full format can fix sectors
that failed on read.


I agree with the original poster though, I'd really love to see Linux 
Raid take special action on sector read failures.  It happens about 5-6 
times a year here that a disk gets kicked out of the array for a simple 
read failure.  A rebuild of the array will fix it without a trace, but a 
rebuild takes about 3 hours :)


--John



Re: RAID 1 vs RAID 0

2006-01-18 Thread John Hendrikx

Max Waterman wrote:

Mark Hahn wrote:
They seem to suggest RAID 0 is faster for reading than RAID 1, and I 
can't figure out why.


with R0, streaming from two disks involves no seeks;
with R1, a single stream will have to read, say 0-64K from the first 
disk,
and 64-128K from the second.  these could happen at the same time, 
and would indeed match R0 bandwidth.  but with R1, each disk has to 
seek past

the blocks being read from the other disk.  seeking tends to be slow...


Ah, a good way of putting it...I think I was pretty much there with my 
followup message.


Still, it seems like it should be a solvable problem...if you order 
the data differently on each disk; for example, in the two disk case, 
putting odd and even numbered 'stripes' on different platters [or 
sides of platters].

I don't think the example above is really that much of an issue.  AFAIK,
most hard disks will read the current track (all platters) at once as 
soon as the heads are positioned.  It doesn't even wait for the start of 
the track, it just starts reading as soon as possible and stores all of 
it in the internal buffer (it will determine the real start of the track 
by looking for markers in the buffer).  It will then return the data 
from the buffer.


Anyway, the track buffer is quite large because it needs to be able to 
hold the data from an entire track, which is usually quite a bit larger 
than the stripe size (I'd say around 1 to 2 MB).  It's highly unlikely 
that your hard disk will need to seek to read 0-64k, then 128-192k, then 
256-320k, and so on.  There's a good chance that all of that data is 
stored on the same track and can be returned directly from the buffer.  
Even if a seek is required, it would only be a seek of one track, which is
relatively fast compared to a random seek.


The only reason I could think of why a mirror would be slower than a 
stripe is the fact that about twice as many single track seeks are 
needed when reading huge files.  That can be avoided if you increase the 
size of the reads significantly though (for example, reading the 1st 
half of the file from one disk, and the 2nd half of the file from the 
other).


--John



RE: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Jan Engelhardt

personally, I think this is useful functionality, but my personal
preference is that this would be in DM/LVM2 rather than MD.  but given
Neil is the MD author/maintainer, I can see why he'd prefer to do it in
MD. :)

Why don't MD and DM merge some bits?



Jan Engelhardt
-- 


Re: paralellism of device use in md

2006-01-18 Thread Francois Barre
2006/1/18, Mario 'BitKoenig' Holbe [EMAIL PROTECTED]:
 Mario 'BitKoenig' Holbe [EMAIL PROTECTED] wrote:
  scheduled read-requests. Would it probably make sense to split one
  single read over all mirrors that are currently idle?

 Ah, I got it from the other thread - seek times :)
 Perhaps using some big (virtual) chunk size could do the trick? What
 about using chunks so big that seeking is faster than the data transfer...
 assuming a data rate of 50MB/s and 9ms average seek time would result in
 at least 500kB chunks, 14ms average seek time would result in at least
 750kB chunks.
 However, since the blocks being read are most likely somewhat close
 together, it's not a typical average seek, so probably smaller chunks
 would also be possible.


 regards
   Mario

Stop me if I'm wrong, but this is called... huge readahead. Instead of
reading 32k on drive0 then 32k on drive1, you read a continuous 512k
from drive0 (16*32k) and 512k from drive1, resulting in a 1M read.
Maybe for a single 4k page...

So my additional question would be: how well does md fit with
Linux's/the filesystem's readahead policies?
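
A back-of-envelope check of the chunk-size figures quoted above, using the mail's assumed 50 MB/s transfer rate and 9/14 ms seek times rather than any measurement:

  /* A chunk only pays for itself if transferring it takes at least as long
   * as the seek it replaces, i.e. chunk >= rate * seek_time.  The numbers
   * are the ones assumed in the mail, not measurements. */
  #include <stdio.h>

  int main(void)
  {
      double rate_mb_s  = 50.0;            /* assumed sequential transfer rate */
      double seeks_ms[] = { 9.0, 14.0 };   /* assumed average seek times       */

      for (int i = 0; i < 2; i++) {
          double chunk_kb = rate_mb_s * 1024.0 * (seeks_ms[i] / 1000.0);
          printf("%4.0f ms seek -> chunk >= %.0f kB\n", seeks_ms[i], chunk_kb);
      }
      return 0;
  }

This gives roughly 460 kB and 720 kB, in line with the "at least 500kB / 750kB" figures quoted.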


RE: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 
 personally, I think this this useful functionality, but my personal
 preference is that this would be in DM/LVM2 rather than MD.  but given
 Neil is the MD author/maintainer, I can see why he'd prefer to do it in
 MD. :)
 
 Why don't MD and DM merge some bits?
 

Which bits?
Why?

My current opinion is that you should:

 Use md for raid1, raid5, raid6 - anything with redundancy.
 Use dm for multipath, crypto, linear, LVM, snapshot
 Use either for raid0 (I don't think dm has particular advantages
 over md, or md over dm).

These can be mixed together quite effectively:
  You can have dm/lvm over md/raid1 over dm/multipath
with no problems.

If there is functionality missing from any of these recommended
components, then make a noise about it, preferably but not necessarily
with code, and it will quite possibly be fixed.

NeilBrown


Re: Adding a device to an active RAID1 array

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 
 Hi,
 
 Are there any known issues with changing the number of active devices in 
 a RAID1 array?

There is now, thanks.

 
 I'm trying to add a third mirror to an existing RAID1 array of two disks.
 
 I have /dev/md5 as a mirrored pair of two 40 GB disks.  Directly on top 
 of it is a 40 GB XFS filesystem.
 
 When I do "mdadm --grow /dev/md5 -n 3" the count of devices changes from
 2 to 3, as expected.
 If the XFS filesystem is mounted, the array size changes to 3.0 GB.
 If it is not mounted everything works fine.
 
 Is this supposed to work, is it required that the array be inactive to 
 add a disk, or am I just doing
 something stupid?

It is supposed to work, it doesn't, but it is the code doing something
stupid, not you.
Try the patch below.

 
 If there is a better forum for this question, please direct me there.
 In any case TIA for any help.

This is exactly the correct forum.

NeilBrown

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~  2006-01-17 17:42:31.0 +1100
+++ ./drivers/md/md.c   2006-01-19 10:29:46.0 +1100
@@ -3488,7 +3488,7 @@ static int update_raid_disks(mddev_t *md
 	bdev = bdget_disk(mddev->gendisk, 0);
 	if (bdev) {
 		mutex_lock(&bdev->bd_inode->i_mutex);
-		i_size_write(bdev->bd_inode, mddev->array_size << 10);
+		i_size_write(bdev->bd_inode, (loff_t)mddev->array_size << 10);
 		mutex_unlock(&bdev->bd_inode->i_mutex);
 		bdput(bdev);
 	}
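
The cast matters because the shift is otherwise performed in the narrower type of array_size before the result is widened to loff_t, so the upper bits of the byte count are lost. A user-space illustration (not kernel code, and assuming array_size is a 32-bit count of KiB):

  #include <stdio.h>

  int main(void)
  {
      unsigned int array_size = 40u * 1024 * 1024;     /* 40 GiB in KiB */

      long long wrong = array_size << 10;              /* shifted as 32 bits, wraps */
      long long right = (long long)array_size << 10;   /* widened first, as in the patch */

      printf("without cast: %lld bytes\n", wrong);
      printf("with cast:    %lld bytes\n", right);
      return 0;
  }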


Re: paralellism of device use in md

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 2006/1/18, Mario 'BitKoenig' Holbe [EMAIL PROTECTED]:
  Mario 'BitKoenig' Holbe [EMAIL PROTECTED] wrote:
   scheduled read-requests. Would it probably make sense to split one
   single read over all mirrors that are currently idle?
 
  A I got it from the other thread - seek times :)
  Perhaps using some big (virtual) chunk size could do the trick? What
  about using chunks that big that seeking is faster than data-transfer...
  assuming a data rate of 50MB/s and 9ms average seek time would result in
  at least 500kB chunks, 14ms average seek time would result in at least
  750kB chunks.
  However, since the blocks being read are most likely somewhat close
  together, it's not a typical average seek, so probably smaller chunks
  would also be possible.
 
 
  regards
Mario
 
 Stop me if I'm wrong, but this is called... huge readahead. Instead of
 reading 32k on drive0 then 32k on drive1, you read continuous 512k
 from drive0 (16*32k) and 512k from drive1, resulting in a 1M read.
 Maybe for a single 4k page...
 
 So my additionnal question to this would be : how well does md fit
 with linux's/fs readahead policies ?

The read balancing in raid1 is clunky at best.  I've often thought
there must be a better way.  I've never thought what the better way
might be (though I haven't tried very hard).

If anyone would like to experiment with the read-balancing code,
suggest and test changes, it would be most welcome.
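
For anyone who wants to experiment, here is a toy user-space model of one obvious heuristic, nearest last-known head position; it is purely illustrative and not the raid1 read_balance code:

  #include <stdio.h>
  #include <stdlib.h>

  /* Toy model (not the raid1 code): send each read to the mirror whose
   * last serviced sector is closest, which keeps sequential streams on
   * one disk automatically. */
  struct mirror { long long head; };      /* last sector each disk serviced */

  static int pick_mirror(struct mirror *m, int n, long long sector)
  {
      int best = 0;
      for (int i = 1; i < n; i++)
          if (llabs(m[i].head - sector) < llabs(m[best].head - sector))
              best = i;
      m[best].head = sector;              /* remember for the next request */
      return best;
  }

  int main(void)
  {
      struct mirror m[2] = { { 0 }, { 500000 } };
      long long reads[] = { 8, 16, 24, 499000, 499008, 32 };

      for (int i = 0; i < 6; i++)
          printf("sector %lld -> mirror %d\n",
                 reads[i], pick_mirror(m, 2, reads[i]));
      return 0;
  }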

NeilBrown


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 On Wed, 18 Jan 2006, John Hendrikx wrote:
 
  I agree with the original poster though, I'd really love to see Linux
  Raid take special action on sector read failures.  It happens about 5-6
  times a year here that a disk gets kicked out of the array for a simple
  read failure.  A rebuild of the array will fix it without a trace, but a
  rebuild takes about 3 hours :)
 
 One thing that's well worth doing before simply fail/remove/add the drive
 with the bad sector, is to do a read-only test on the other
 drives/partitions in the rest of the set. That way you won't find out
 half-way through the resync that other drives have failures, and then lose
 the lot. It adds time to the whole operation, but it's worth it IMO.

But what do you do if the read-only test fails... I guess you try to
reconstruct using the nearly-failed drive...

What might be good and practical is to not remove a failed drive
completely, but to hold on to it and only read from it in desperation
while reconstructing a spare.  That might be worth the effort...

NeilBrown


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 
 I agree with the original poster though, I'd really love to see Linux 
 Raid take special action on sector read failures.  It happens about 5-6 
 times a year here that a disk gets kicked out of the array for a simple 
 read failure.  A rebuild of the array will fix it without a trace, but a 
 rebuild takes about 3 hours :)

See 2.6.15 (for raid5) or 2.6.16-rc1 (for raid1).  You'll love it!

NeilBrown


Re: why md request buffers will not across devices

2006-01-18 Thread Neil Brown
On Wednesday January 18, [EMAIL PROTECTED] wrote:
 hi,
   I have a silly question: why will md request buffers not cross
 devices? That is, why will a bh only be located on a single
 storage device? I guess maybe the file system has aligned the bh? Who
 can tell me the exact reason? Thanks a lot!
 

If you are talking 'bh' then you are talking '2.4'.
In 2.4, all requests match the 'blocksize' of the device (typically 1K
or 4K) and are aligned to that size.  As the chunksize is always a
multiple of the block size a block will never cross a chunk boundary
and so never cross a device boundary.
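
In other words, if the block size B divides the chunk size C and every block starts at a multiple of B, a block can never straddle a chunk boundary. A quick self-contained check of that claim (illustrative only):

  #include <assert.h>
  #include <stdio.h>

  /* A block of size B starting at a multiple of B cannot straddle a chunk
   * boundary when the chunk size C is a multiple of B. */
  static int crosses_chunk(unsigned long start, unsigned long bsize,
                           unsigned long csize)
  {
      return (start / csize) != ((start + bsize - 1) / csize);
  }

  int main(void)
  {
      unsigned long bsize = 4096;          /* 4K blocks, as in 2.4        */
      unsigned long csize = 64 * 1024;     /* 64K chunks, multiple of 4K  */

      for (unsigned long start = 0; start < 8 * csize; start += bsize)
          assert(!crosses_chunk(start, bsize, csize));

      printf("no block crosses a chunk boundary\n");
      return 0;
  }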

2.6 is quite different and md sometimes needs to split a 'bio' request
to feed part to one device and part to another.

NeilBrown


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Neil Brown
On Tuesday January 17, [EMAIL PROTECTED] wrote:
   Hello Neil ,
 
 On Tue, 17 Jan 2006, NeilBrown wrote:
  Greetings.
 
  In line with the principle of release early, following are 5 patches
  against md in 2.6.latest which implement reshaping of a raid5 array.
  By this I mean adding 1 or more drives to the array and then re-laying
  out all of the data.
   Please inform me of which of the 2.6.latest to use ?  Tia ,  JimL
 
 The latest stable version of the Linux kernel is:         2.6.15.1     (2006-01-15 06:14 UTC)
 The latest prepatch for the stable Linux kernel tree is:  2.6.16-rc1   (2006-01-17 08:09 UTC)
 The latest snapshot for the stable Linux kernel tree is:  2.6.15-git12 (2006-01-16 08:04 UTC)

Yes, any of those would be fine.

NeilBrown


Re: [PATCH 001 of 5] md: Split disks array out of raid5 conf structure so it is easier to grow.

2006-01-18 Thread Neil Brown
On Tuesday January 17, [EMAIL PROTECTED] wrote:
  NeilBrown == NeilBrown  [EMAIL PROTECTED] writes:
 
 NeilBrown Previously the array of disk information was included in
 NeilBrown the raid5 'conf' structure which was allocated to an
 NeilBrown appropriate size.  This makes it awkward to change the size
 NeilBrown of that array.  So we split it off into a separate
 NeilBrown kmalloced array which will require a little extra indexing,
 NeilBrown but is much easier to grow.
 
 Neil,
 
 Instead of setting mddev->private = NULL, should you be doing a kfree
 on it as well when you are in an abort state?

The only times I set
  mddev->private = NULL
it is immediately after
   kfree(conf)
and as conf is the thing that is assigned to mddev->private, this
should be doing exactly what you suggest.

Does that make sense?

NeilBrown


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread Neil Brown
On Tuesday January 17, [EMAIL PROTECTED] wrote:
 On Jan 17, 2006, at 06:26, Michael Tokarev wrote:
 This is about code complexity/bloat.  It's already complex enough.
  I rely on the stability of the linux softraid subsystem, and want  
  it to be reliable. Adding more features, especially non-trivial  
  ones, does not buy you bugfree raid subsystem, just the opposite:  
  it will have more chances to crash, to eat your data etc, and will  
  be harder in finding/fixing bugs.
 
 What part of: "You will need to enable the experimental
 MD_RAID5_RESHAPE config option for this to work." isn't obvious?  If
 you don't want this feature, either don't turn on
 CONFIG_MD_RAID5_RESHAPE, or don't use the raid5 mdadm reshaping
 command.

This isn't really a fair comment.  CONFIG_MD_RAID5_RESHAPE just
enables the code.  All the code is included whether this config option
is set or not.  So if code-bloat were an issue, the config option
wouldn't answer it.

NeilBrown


Re: [PATCH 005 of 5] md: Final stages of raid5 expand code.

2006-01-18 Thread Neil Brown
On Tuesday January 17, [EMAIL PROTECTED] wrote:
 NeilBrown wrote (ao):
  +config MD_RAID5_RESHAPE
 
 Would this also be possible for raid6?


Yes.  That will follow once raid5 is reasonably reliable.  It is
essentially the same change to a different file.
(One day we will merge raid5 and raid6 together into the one module,
but not today).

  +  This option allows this restiping to be done while the array
  ^
  restriping
  + Please to NOT use it on valuable data with good, tested, backups.
  ^^ 
  do without

Thanks * 3.

 
 Thanks a lot for this feature. I'll try to find a spare computer to test
 this on. Thanks!

That would be great!

NeilBrown


Re: [PATCH 000 of 5] md: Introduction

2006-01-18 Thread PFC


While we're at it, here's a little issue I had with RAID5; not really
the fault of md, but you might want to know...


I have a 5x250GB RAID5 array for home storage (digital photos, my
losslessly ripped CDs, etc.): 1 IDE drive and 4 SATA drives.
Now, it turns out one of the SATA drives is a Maxtor 6V250F0, and these
have problems; it died, then was RMA'd, then died again. Finally, it turned
out this drive series is incompatible with nvidia SATA chipsets. A third
drive seems to work after setting the jumper to SATA 150.

Back to the point.

Failure mode of these drives is an IDE command timeout. This takes a long
time! So, when the drive has failed, each command to it takes forever. md
will eventually reject said drive, but it takes hours; and meanwhile, the
computer is unusable and the data is offline...


In this case, the really tempting solution is to hit the windows key (er,
the hard reset button); but doing this makes the array dirty and
degraded, and it won't mount, and all data is seemingly lost. (Well,
recoverable with a bit of hacking /* goto error; */, but that's not very
clean...)


This isn't really an md issue, but it's really annoying only when using
RAID, because it makes a normal process (kicking a dead drive out) so slow
it's almost non-functional. Is there a way to modify the timeout in
question?


Note that, re-reading the log below, it writes "Disk failure on sdd1,
disabling device. Operation continuing on 4 devices", but errors continue
to come, and the array is still unreachable (i.e. cat /proc/mdstat hangs,
etc). Hmm...


Thanks for the time.


Jan  8 21:38:41 apollo13 ReiserFS: md2: checking transaction log (md2)
Jan  8 21:39:11 apollo13 ata4: command 0xca timeout, stat 0xd0 host_stat  
0x21
Jan  8 21:39:11 apollo13 ata4: translated ATA stat/err 0xca/00 to SCSI  
SK/ASC/ASCQ 0xb/47/00

Jan  8 21:39:11 apollo13 ata4: status=0xca { Busy }
Jan  8 21:39:11 apollo13 sd 3:0:0:0: SCSI error: return code = 0x802
Jan  8 21:39:11 apollo13 sdd: Current: sense key=0xb
Jan  8 21:39:11 apollo13 ASC=0x47 ASCQ=0x0
Jan  8 21:39:11 apollo13 Info fld=0x3f
Jan  8 21:39:11 apollo13 end_request: I/O error, dev sdd, sector 63
Jan  8 21:39:11 apollo13 raid5: Disk failure on sdd1, disabling device.  
Operation continuing on 4 devices

Jan  8 21:39:11 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:39:11 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:39:11 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:39:41 apollo13 ata4: command 0xca timeout, stat 0xd0 host_stat  
0x21
Jan  8 21:39:41 apollo13 ata4: translated ATA stat/err 0xca/00 to SCSI  
SK/ASC/ASCQ 0xb/47/00

Jan  8 21:39:41 apollo13 ata4: status=0xca { Busy }
Jan  8 21:39:41 apollo13 sd 3:0:0:0: SCSI error: return code = 0x802
Jan  8 21:39:41 apollo13 sdd: Current: sense key=0xb
Jan  8 21:39:41 apollo13 ASC=0x47 ASCQ=0x0
Jan  8 21:39:41 apollo13 Info fld=0x9840097
Jan  8 21:39:41 apollo13 end_request: I/O error, dev sdd, sector 159645847
Jan  8 21:39:41 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:39:41 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:39:41 apollo13 ATA: abnormal status 0xD0 on port 0x977
Jan  8 21:40:01 apollo13 cron[17973]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Jan  8 21:40:11 apollo13 ata4: command 0x35 timeout, stat 0xd0 host_stat  
0x21
Jan  8 21:40:11 apollo13 ata4: translated ATA stat/err 0x35/00 to SCSI  
SK/ASC/ASCQ 0x4/00/00
Jan  8 21:40:11 apollo13 ata4: status=0x35 { DeviceFault SeekComplete  
CorrectedError Error }

Jan  8 21:40:11 apollo13 sd 3:0:0:0: SCSI error: return code = 0x802
Jan  8 21:40:11 apollo13 sdd: Current: sense key=0x4
Jan  8 21:40:11 apollo13 ASC=0x0 ASCQ=0x0
Jan  8 21:40:11 apollo13 end_request: I/O error, dev sdd, sector 465232831