On Fri, 22 Jun 2007, Bill Davidsen wrote:
By delaying parity computation until the first write to a stripe, only the
growth of a filesystem is slowed, and all data are protected without waiting
for the lengthy check. The rebuild speed can be set very low, because
on-demand rebuild will do most of the work.
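For illustration, here is a toy sketch in python of the bookkeeping Bill is describing (the per-stripe flag, names and numbers are made up; this is not how md is actually implemented): parity for a stripe is only computed on the first write to it, and a low-priority background sweep picks up whatever is left.

  # Toy model of "delay parity computation until the first write to a stripe".
  NUM_STRIPES = 1024
  parity_valid = [False] * NUM_STRIPES     # hypothetical per-stripe flag

  def write_block(stripe_no):
      if not parity_valid[stripe_no]:
          # first touch: do a reconstruct-write (parity computed from the
          # whole new stripe), then mark the stripe initialised
          parity_valid[stripe_no] = True
          return "reconstruct-write"
      # stripe already has valid parity: normal read-modify-write
      return "read-modify-write"

  def background_init(budget=16):
      # low-priority sweep; a tiny budget keeps the "rebuild" cheap because
      # on-demand writes do most of the work as the filesystem grows
      for s in range(NUM_STRIPES):
          if budget == 0:
              return
          if not parity_valid[s]:
              parity_valid[s] = True       # real code would write parity here
              budget -= 1

  print(write_block(7))   # -> reconstruct-write
  print(write_block(7))   # -> read-modify-write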
Bill Davidsen wrote:
David Greaves wrote:
[EMAIL PROTECTED] wrote:
On Fri, 22 Jun 2007, David Greaves wrote:
If you end up 'fiddling' in md because someone specified
--assume-clean on a raid5 [in this case just to save a few minutes
*testing time* on a system with a heavily choked bus!] then th
David Greaves wrote:
[EMAIL PROTECTED] wrote:
On Fri, 22 Jun 2007, David Greaves wrote:
That's not a bad thing - until you look at the complexity it brings
- and then consider the impact and exceptions when you do, eg
hardware acceleration? md information fed up to the fs layer for
xfs? simple long term maintenance?
[EMAIL PROTECTED] wrote:
On Fri, 22 Jun 2007, David Greaves wrote:
That's not a bad thing - until you look at the complexity it brings -
and then consider the impact and exceptions when you do, eg hardware
acceleration? md information fed up to the fs layer for xfs? simple
long term maintenance?
On Fri, 22 Jun 2007, David Greaves wrote:
That's not a bad thing - until you look at the complexity it brings - and
then consider the impact and exceptions when you do, eg hardware
acceleration? md information fed up to the fs layer for xfs? simple long term
maintenance?
Often these problems
Neil Brown wrote:
On Thursday June 21, [EMAIL PROTECTED] wrote:
I didn't get a comment on my suggestion for a quick and dirty fix for
--assume-clean issues...
Bill Davidsen wrote:
How about a simple solution which would get an array online and still
be safe? All it would take is a flag which
On Thursday June 21, [EMAIL PROTECTED] wrote:
> I didn't get a comment on my suggestion for a quick and dirty fix for
> --assume-clean issues...
>
> Bill Davidsen wrote:
> > How about a simple solution which would get an array online and still
> > be safe? All it would take is a flag which force
I didn't get a comment on my suggestion for a quick and dirty fix for
--assume-clean issues...
Bill Davidsen wrote:
Neil Brown wrote:
On Thursday June 14, [EMAIL PROTECTED] wrote:
it's now churning away 'rebuilding' the brand new array.
a few questions/thoughts.
why does it need to do a rebuild when making a new array?
On 21 Jun 2007, Neil Brown stated:
> I have that - apparently naive - idea that drives use strong checksums,
> and will never return bad data, only good data or an error. If this
> isn't right, then it would really help to understand what the causes of
> other failures are before working out how to
> "Mattias" == Mattias Wadenstein <[EMAIL PROTECTED]> writes:
Mattias> In theory, that's how storage should work. In practice,
Mattias> silent data corruption does happen. If not from the disks
Mattias> themselves, somewhere along the path of cables, controllers,
Mattias> drivers, buses, etc.
[EMAIL PROTECTED] wrote:
On Thu, 21 Jun 2007, David Chinner wrote:
On Thu, Jun 21, 2007 at 12:56:44PM +1000, Neil Brown wrote:
I have that - apparently naive - idea that drives use strong checksums,
and will never return bad data, only good data or an error. If this
isn't right, then it would
On Thu, 21 Jun 2007, Mattias Wadenstein wrote:
On Thu, 21 Jun 2007, Neil Brown wrote:
I have that - apparently naive - idea that drives use strong checksums,
and will never return bad data, only good data or an error. If this
isn't right, then it would really help to understand what the cau
On Thu, 21 Jun 2007, Mattias Wadenstein wrote:
On Thu, 21 Jun 2007, Neil Brown wrote:
I have that - apparently naive - idea that drives use strong checksums,
and will never return bad data, only good data or an error. If this
isn't right, then it would really help to understand what the caus
On Thu, 21 Jun 2007, Neil Brown wrote:
I have that - apparently naive - idea that drives use strong checksums,
and will never return bad data, only good data or an error. If this
isn't right, then it would really help to understand what the causes of
other failures are before working out how to h
On Thu, Jun 21, 2007 at 04:39:36PM +1000, David Chinner wrote:
> FWIW, I don't think this really removes the need for a filesystem to
> be able to keep multiple copies of stuff about. If the copy(s) on a
> device are gone, you've still got to have another copy somewhere
> else to get it back...
Sp
[EMAIL PROTECTED] wrote:
On Thu, 21 Jun 2007, David Chinner wrote:
one of the 'killer features' of zfs is that it does checksums of every
file on disk. so many people don't consider the disk infallible.
several other filesystems also do checksums.
both bitkeeper and git do checksums of files t
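As a minimal illustration of the kind of per-block/per-file checksumming being described (the 4 KiB block size and the SHA-1 choice here are just assumptions for the example, not what zfs or git actually store on disk):

  import hashlib, os

  BLOCK_SIZE = 4096                      # assumed block size for the example

  def checksum(block: bytes) -> str:
      return hashlib.sha1(block).hexdigest()

  # when writing, store the checksum alongside (or separately from) the data
  data = os.urandom(BLOCK_SIZE)
  stored_sum = checksum(data)

  # when reading, recompute and compare: a mismatch means the device (or the
  # path to it) silently returned bad data even though it reported success
  def verify(block: bytes, expected: str) -> bool:
      return checksum(block) == expected

  assert verify(data, stored_sum)
  corrupted = bytearray(data); corrupted[0] ^= 0xFF
  print(verify(bytes(corrupted), stored_sum))   # False -> silent corruption caught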
Neil Brown wrote:
This isn't quite right.
Thanks :)
Firstly, it is mdadm which decided to make one drive a 'spare' for
raid5, not the kernel.
Secondly, it only applies to raid5, not raid6 or raid1 or raid10.
For raid6, the initial resync (just like the resync after an unclean
shutdown) reads
On Thu, 21 Jun 2007, David Chinner wrote:
On Thu, Jun 21, 2007 at 12:56:44PM +1000, Neil Brown wrote:
I have that - apparently naive - idea that drives use strong checksums,
and will never return bad data, only good data or an error. If this
isn't right, then it would really help to understand
On Thu, Jun 21, 2007 at 12:56:44PM +1000, Neil Brown wrote:
> On Monday June 18, [EMAIL PROTECTED] wrote:
> > On Sat, Jun 16, 2007 at 07:59:29AM +1000, Neil Brown wrote:
> > > Combining these thoughts, it would make a lot of sense for the
> > > filesystem to be able to say to the block device "That
On Saturday June 16, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> > On Friday June 15, [EMAIL PROTECTED] wrote:
> >
> >> As I understand the way
> >> raid works, when you write a block to the array, it will have to read all
> >> the other blocks
On Monday June 18, [EMAIL PROTECTED] wrote:
> On Sat, Jun 16, 2007 at 07:59:29AM +1000, Neil Brown wrote:
> > Combining these thoughts, it would make a lot of sense for the
> > filesystem to be able to say to the block device "That block looks
> > wrong - can you find me another copy to try?". Th
On Tue, 19 Jun 2007, Lennart Sorensen wrote:
On Mon, Jun 18, 2007 at 02:56:10PM -0700, [EMAIL PROTECTED] wrote:
yes, I'm using Promise drive shelves, I have them configured to export
the 15 drives as 15 LUNs on a single ID.
I'm going to be using this as a huge circular buffer that will just
On Mon, Jun 18, 2007 at 02:56:10PM -0700, [EMAIL PROTECTED] wrote:
> yes, I'm using Promise drive shelves, I have them configured to export
> the 15 drives as 15 LUNs on a single ID.
>
> I'm going to be using this as a huge circular buffer that will just be
> overwritten eventually 99% of the
On Tue, 19 Jun 2007, Phillip Susi wrote:
[EMAIL PROTECTED] wrote:
one channel, 2 OS drives plus the 45 drives in the array.
Huh? You can only have 16 devices on a SCSI bus, counting the host adapter.
And I don't think you can even manage that many reliably with the newer
higher-speed versions
[EMAIL PROTECTED] wrote:
one channel, 2 OS drives plus the 45 drives in the array.
Huh? You can only have 16 devices on a SCSI bus, counting the host
adapter. And I don't think you can even manage that many reliably with
the newer higher-speed versions, at least not without some very specia
[EMAIL PROTECTED] wrote:
yes, I'm using Promise drive shelves, I have them configured to export
the 15 drives as 15 LUNs on a single ID.
Well, that would account for it. Your bus is very, very saturated. If
all your drives are active, you can't get more than ~7MB/s per disk
under perfect c
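The arithmetic behind that ~7MB/s figure is just the shared bus bandwidth split across the active drives (assuming the nominal 320 MB/s Ultra320 figure and all 45 array drives streaming at once, before any protocol overhead):

  BUS_MB_S = 320          # nominal Ultra320 shared-bus bandwidth
  ARRAY_DRIVES = 45       # drives on the single channel (ignoring the 2 OS disks)

  per_drive = BUS_MB_S / ARRAY_DRIVES
  print(round(per_drive, 1))   # ~7.1 MB/s per disk in the best case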
On Mon, 18 Jun 2007, Wakko Warner wrote:
Subject: Re: limits on raid
[EMAIL PROTECTED] wrote:
On Mon, 18 Jun 2007, Brendan Conoboy wrote:
[EMAIL PROTECTED] wrote:
yes, sorry, ultra 320 wide.
Exactly how many channels and drives?
one channel, 2 OS drives plus the 45 drives in the array
[EMAIL PROTECTED] wrote:
> On Mon, 18 Jun 2007, Brendan Conoboy wrote:
>
> >[EMAIL PROTECTED] wrote:
> >> yes, sorry, ultra 320 wide.
> >
> >Exactly how many channels and drives?
>
> one channel, 2 OS drives plus the 45 drives in the array.
Given that the drives only have 4 ID bits, how can you
On Mon, 18 Jun 2007, Brendan Conoboy wrote:
[EMAIL PROTECTED] wrote:
yes, sorry, ultra 320 wide.
Exactly how many channels and drives?
one channel, 2 OS drives plus the 45 drives in the array.
yes, I realize that there will be bottlenecks with this; the large capacity
is to handle longer
[EMAIL PROTECTED] wrote:
yes, sorry, ultra 320 wide.
Exactly how many channels and drives?
--
Brendan Conoboy / Red Hat, Inc. / [EMAIL PROTECTED]
On Mon, 18 Jun 2007, Lennart Sorensen wrote:
On Mon, Jun 18, 2007 at 11:12:45AM -0700, [EMAIL PROTECTED] wrote:
simple ultra-wide SCSI to a single controller.
Hmm, isn't ultra-wide limited to 40MB/s? Is it Ultra320 wide? That
could do a lot more, and 220MB/s sounds plausible for 320 SCSI.
On Mon, Jun 18, 2007 at 11:12:45AM -0700, [EMAIL PROTECTED] wrote:
> simple ultra-wide SCSI to a single controller.
Hmm, isn't ultra-wide limited to 40MB/s? Is it Ultra320 wide? That
could do a lot more, and 220MB/s sounds plausible for 320 SCSI.
> I didn't realize that the rate reported by /pr
On Mon, 18 Jun 2007, Brendan Conoboy wrote:
[EMAIL PROTECTED] wrote:
I plan to test the different configurations.
however, if I was saturating the bus with the reconstruct, how can I fire
off a dd if=/dev/zero of=/mnt/test and get ~45M/sec while only slowing the
reconstruct to ~4M/sec?
I'
On Mon, 18 Jun 2007, Lennart Sorensen wrote:
On Mon, Jun 18, 2007 at 10:28:38AM -0700, [EMAIL PROTECTED] wrote:
I plan to test the different configurations.
however, if I was saturating the bus with the reconstruct, how can I fire
off a dd if=/dev/zero of=/mnt/test and get ~45M/sec while only s
[EMAIL PROTECTED] wrote:
I plan to test the different configurations.
however, if I was saturating the bus with the reconstruct, how can I fire
off a dd if=/dev/zero of=/mnt/test and get ~45M/sec while only slowing
the reconstruct to ~4M/sec?
I'm putting 10x as much data through the bus at th
On Mon, Jun 18, 2007 at 10:28:38AM -0700, [EMAIL PROTECTED] wrote:
> I plan to test the different configurations.
>
> however, if I was saturating the bus with the reconstruct, how can I fire
> off a dd if=/dev/zero of=/mnt/test and get ~45M/sec while only slowing the
> reconstruct to ~4M/sec?
>
On Mon, 18 Jun 2007, Brendan Conoboy wrote:
[EMAIL PROTECTED] wrote:
in my case it takes 2+ days to resync the array before I can do any
performance testing with it. for some reason it's only doing the rebuild
at ~5M/sec (even though I've increased the min and max rebuild speeds and
a dd to
[EMAIL PROTECTED] wrote:
in my case it takes 2+ days to resync the array before I can do any
performance testing with it. for some reason it's only doing the rebuild
at ~5M/sec (even though I've increased the min and max rebuild speeds
and a dd to the array seems to be ~44M/sec, even during the
On Sat, Jun 16, 2007 at 07:59:29AM +1000, Neil Brown wrote:
> Combining these thoughts, it would make a lot of sense for the
> filesystem to be able to say to the block device "That block looks
> wrong - can you find me another copy to try?". That is an example of
> the sort of closer integration
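A rough sketch of the interaction being proposed, with an entirely hypothetical retry interface (no such call exists in the block layer; the class and function names below are invented for illustration): the filesystem notices a bad checksum and asks the RAID layer for another copy, e.g. the other mirror on raid1 or a parity-reconstructed block on raid5/6.

  import hashlib

  class HypotheticalRaidDevice:
      """Invented stand-in for an array that can return alternate copies:
      copy 0 is the normal read, copy 1 the other mirror or a block rebuilt
      from parity."""
      def __init__(self, copies):
          self.copies = copies
      def read(self, block_no, which_copy=0):
          return self.copies[which_copy]

  def fs_read(dev, block_no, expected_sha1):
      for which in range(len(dev.copies)):      # "can you find me another copy to try?"
          data = dev.read(block_no, which)
          if hashlib.sha1(data).hexdigest() == expected_sha1:
              return data
      raise IOError("all copies of block %d look wrong" % block_no)

  good = b"A" * 4096
  bad = b"B" * 4096
  dev = HypotheticalRaidDevice([bad, good])     # first copy is silently corrupt
  assert fs_read(dev, 0, hashlib.sha1(good).hexdigest()) == good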
On Sun, 17 Jun 2007, dean gaudet wrote:
On Sun, 17 Jun 2007, Wakko Warner wrote:
What benefit would I gain by using an external journal and how big would it
need to be?
i don't know how big the journal needs to be... i'm limited by xfs'
maximum journal size of 128MiB.
i don't have much benchmark data
On Sun, 17 Jun 2007, Wakko Warner wrote:
you can also easily move an ext3 journal to an external journal with
tune2fs (see man page).
I only have 2 ext3 file systems (One of which is mounted R/O since it's
full), all my others are reiserfs (v3).
What benefit would I gain by using an external
On Sun, 17 Jun 2007, Wakko Warner wrote:
> What benefit would I gain by using an external journal and how big would it
> need to be?
i don't know how big the journal needs to be... i'm limited by xfs'
maximum journal size of 128MiB.
i don't have much benchmark data -- but here are some rough not
dean gaudet wrote:
> On Sun, 17 Jun 2007, Wakko Warner wrote:
>
> > > i use an external write-intent bitmap on a raid1 to avoid this... you
> > > could use internal bitmap but that slows down i/o too much for my tastes.
> > >
> > > i also use an external xfs journal for the same reason. 2 dis
On Sun, 17 Jun 2007, Wakko Warner wrote:
> dean gaudet wrote:
> > On Sat, 16 Jun 2007, Wakko Warner wrote:
> >
> > > When I've had an unclean shutdown on one of my systems (10x 50gb raid5)
> > > it's
> > > always slowed the system down when booting up. Quite significantly I must
> > > say. I w
[EMAIL PROTECTED] wrote:
On Sat, 16 Jun 2007, Neil Brown wrote:
It would be possible to have a 'this is not initialised' flag on the
array, and if that is not set, always do a reconstruct-write rather
than a read-modify-write. But the first time you have an unclean
shutdown you are going to re
Neil Brown wrote:
On Thursday June 14, [EMAIL PROTECTED] wrote:
On Fri, 15 Jun 2007, Neil Brown wrote:
On Thursday June 14, [EMAIL PROTECTED] wrote:
what is the limit for the number of devices that can be in a single array?
I'm trying to build a 45x750G array and want to exper
dean gaudet wrote:
> On Sat, 16 Jun 2007, Wakko Warner wrote:
>
> > When I've had an unclean shutdown on one of my systems (10x 50gb raid5) it's
> > always slowed the system down when booting up. Quite significantly I must
> > say. I wait until I can login and change the rebuild max speed to slo
Neil Brown <[EMAIL PROTECTED]> writes:
>
> Having the filesystem duplicate data, store checksums, and be able to
> find a different copy if the first one it chose was bad is very
> sensible and cannot be done by just putting the filesystem on RAID.
Apropos checksums: since RAID5 copies/xors anyway
On Sat, 16 Jun 2007, Wakko Warner wrote:
> When I've had an unclean shutdown on one of my systems (10x 50gb raid5) it's
> always slowed the system down when booting up. Quite significantly I must
> say. I wait until I can login and change the rebuild max speed to slow it
> down while I'm using i
On Sat, 16 Jun 2007, David Greaves wrote:
> Neil Brown wrote:
> > On Friday June 15, [EMAIL PROTECTED] wrote:
> >
> > > As I understand the way
> > > raid works, when you write a block to the array, it will have to read all
> > > the other blocks
Neil Brown wrote:
>>>
>>>
>> Some things are not achievable with block-level raid. For example, with
>> redundancy integrated into the filesystem, you can have three copies for
>> metadata, two copies for small files, and parity blocks for large files,
>> effectively using different raid
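A sketch of the policy being described, written as data rather than as an implementation (the size threshold and class names are made up): a filesystem with integrated redundancy can pick a different scheme per object, which a block-level array sitting underneath cannot do.

  SMALL_FILE_LIMIT = 64 * 1024      # assumed threshold for "small"

  def redundancy_for(kind, size):
      # per-object redundancy choice; block-level raid cannot tell these apart
      if kind == "metadata":
          return {"scheme": "mirror", "copies": 3}
      if kind == "file" and size <= SMALL_FILE_LIMIT:
          return {"scheme": "mirror", "copies": 2}
      return {"scheme": "parity", "data_disks": 8, "parity_disks": 1}

  print(redundancy_for("metadata", 512))
  print(redundancy_for("file", 16 * 1024))
  print(redundancy_for("file", 10 * 1024**3))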
On Sat, 16 Jun 2007, David Greaves wrote:
[EMAIL PROTECTED] wrote:
On Sat, 16 Jun 2007, Neil Brown wrote:
I want to test several configurations, from a 45 disk raid6 to a 45 disk
raid0. at 2-3 days per test (or longer, depending on the tests) this
becomes a very slow process.
Are you sugge
Neil Brown wrote:
> On Friday June 15, [EMAIL PROTECTED] wrote:
>
> > As I understand the way
> > raid works, when you write a block to the array, it will have to read all
> > the other blocks in the stripe and recalculate the parity and write it
[EMAIL PROTECTED] wrote:
On Sat, 16 Jun 2007, Neil Brown wrote:
I want to test several configurations, from a 45 disk raid6 to a 45 disk
raid0. at 2-3 days per test (or longer, depending on the tests) this
becomes a very slow process.
Are you suggesting the code that is written to enhance data
Neil Brown wrote:
On Friday June 15, [EMAIL PROTECTED] wrote:
As I understand the way
raid works, when you write a block to the array, it will have to read all
the other blocks in the stripe and recalculate the parity and write it out.
Your u
On Sat, 16 Jun 2007, Neil Brown wrote:
It would be possible to have a 'this is not initialised' flag on the
array, and if that is not set, always do a reconstruct-write rather
than a read-modify-write. But the first time you have an unclean
shutdown you are going to resync all the parity anyway
For raid5 on an array with more than 3 drives, if you attempt to write
a single block, it will:
- read the current value of the block, and the parity block.
- "subtract" the old value of the block from the parity, and "add"
the new value.
- write out the new data and the new parity.
If the
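In terms of the xor arithmetic, the two update paths being contrasted look roughly like this (a sketch with toy byte strings, not md code; "subtract" and "add" are both xor, and the reconstruct-write path is what an uninitialised-stripe flag would force):

  def xor(a, b):
      return bytes(x ^ y for x, y in zip(a, b))

  # a 4-disk raid5 stripe: three data blocks plus parity = d0 ^ d1 ^ d2
  d0, d1, d2 = b"\x01" * 8, b"\x02" * 8, b"\x04" * 8
  parity = xor(xor(d0, d1), d2)

  new_d1 = b"\xff" * 8

  # read-modify-write: read the old data block and the old parity,
  # "subtract" the old value and "add" the new one (both are xor)
  rmw_parity = xor(xor(parity, d1), new_d1)

  # reconstruct-write: read the *other* data blocks and rebuild parity
  # from scratch, never trusting the old parity
  recon_parity = xor(xor(d0, new_d1), d2)

  assert rmw_parity == recon_parity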
On Friday June 15, [EMAIL PROTECTED] wrote:
> As I understand the way
> raid works, when you write a block to the array, it will have to read all
> the other blocks in the stripe and recalculate the parity and write it out.
Your understanding is
Neil Brown wrote:
> On Thursday June 14, [EMAIL PROTECTED] wrote:
> > why does it need to do a rebuild when making a new array? couldn't it
> > just zero all the drives instead? (or better still just record most of the
> > space as 'unused' and initialize it as it starts using it?)
>
> Yes, it
On Friday June 15, [EMAIL PROTECTED] wrote:
> On Fri, Jun 15, 2007 at 01:58:20PM +1000, Neil Brown wrote:
> > Certainly. But the raid doesn't need to be tightly integrated
> > into the filesystem to achieve this. The filesystem need only know
> > the geometry of the RAID and when it comes to writ
On Friday June 15, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> >
> >> while I consider zfs to be ~80% hype, one advantage it could have (but I
> >> don't know if it has) is that since the filesystem and raid are integrated
> >> into one layer they can optimize the case where files are being
Jan Engelhardt wrote:
> On Jun 15 2007 14:10, Avi Kivity wrote:
>
>> Some things are not achievable with block-level raid. For example, with
>> redundancy integrated into the filesystem, you can have three copies for
>> metadata, two copies for small files, and parity blocks for large files,
>>
On Jun 15 2007 14:10, Avi Kivity wrote:
>
>Some things are not achievable with block-level raid. For example, with
>redundancy integrated into the filesystem, you can have three copies for
>metadata, two copies for small files, and parity blocks for large files,
>effectively using different raid
Neil Brown wrote:
>
>> while I consider zfs to be ~80% hype, one advantage it could have (but I
>> don't know if it has) is that since the filesystem and raid are integrated
>> into one layer they can optimize the case where files are being written
>> onto unallocated space and instead of read
On Fri, Jun 15, 2007 at 01:58:20PM +1000, Neil Brown wrote:
> Certainly. But the raid doesn't need to be tightly integrated
> into the filesystem to achieve this. The filesystem need only know
> the geometry of the RAID and when it comes to write, it tries to write
> full stripes at a time.
XFS
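The geometry involved is simple to show (the chunk size and member count below are made-up values; the point is only that a write sized and aligned to a full stripe needs no read of old data or old parity):

  CHUNK_KB = 64            # assumed md chunk size
  NDISKS = 15              # assumed raid5 member count
  DATA_DISKS = NDISKS - 1  # raid5: one chunk of parity per stripe

  stripe_width_kb = CHUNK_KB * DATA_DISKS
  print(stripe_width_kb)   # 896 KiB: writes of this size, aligned to a stripe
                           # boundary, let parity be computed from the new
                           # data alone (no read-modify-write)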
On Thursday June 14, [EMAIL PROTECTED] wrote:
> On Fri, 15 Jun 2007, Neil Brown wrote:
>
> > On Thursday June 14, [EMAIL PROTECTED] wrote:
> >> what is the limit for the number of devices that can be in a single array?
> >>
> >> I'm trying to build a 45x750G array and want to experiment with the
>
On Fri, 15 Jun 2007, Neil Brown wrote:
On Thursday June 14, [EMAIL PROTECTED] wrote:
what is the limit for the number of devices that can be in a single array?
I'm trying to build a 45x750G array and want to experiment with the
different configurations. I'm trying to start with raid6, but mdad
On Thursday June 14, [EMAIL PROTECTED] wrote:
> what is the limit for the number of devices that can be in a single array?
>
> I'm trying to build a 45x750G array and want to experiment with the
> different configurations. I'm trying to start with raid6, but mdadm is
> complaining about an inval