Re: Adding disks -the pain. Also vinum

1999-08-05 Thread Bernd Walter

On Fri, Aug 06, 1999 at 10:53:54AM +0930, Greg Lehey wrote:
 On Tuesday,  3 August 1999 at 23:20:45 +0200, Bernd Walter wrote:
  On Tue, Aug 03, 1999 at 03:59:46PM +0930, Greg Lehey wrote:
  On Tuesday,  3 August 1999 at  8:12:17 +0200, Bernd Walter wrote:
 
  For UFS/FFS there is no point in setting the stripe size too low.
  It is generally slower to access 32k each on two different HDDs than to
  access 64k on one HDD.
 
  It is always slower where the positioning time is greater than the
  transfer time for 32 kB.  On modern disks, 32 kB transfer in about 300
  µs.  The average rotational latency of a disk running at 10,800 rpm is
  2.8 ms, and even with spindle synchronization there's no way to avoid
  rotational latency under these circumstances.
 
  It shouldn't be the latency, because with spindle sync it is the same
  on both disks if the transfer is requested at exactly the same time, which
  is of course idealized.
 
 Spindle sync ensures that the same sectors on different disks are
 under the heads at the same time.  When you perform a stripe transfer,
 you're not accessing the same sectors, you're accessing different
 sectors.  There's no way to avoid rotational latency under these
 circumstances.

We are talking about the same point with the same results.
I agree you will only access the same sectors in some special cases.
Let's say 2 striped disks with 512-byte stripes and FFS with 1k frags.

 
  The point is that you have more than a single transfer.  With small
  transfers spindle sync is able to win back some of the performance
  you have lost with a too small stripe size.
 
 No, this isn't correct, unless you're running 512 byte stripes.  In
That's what I meant by a 'too small stripe size'.

 this case, a single-stripe transfer of, say, 8 kB with the disks above
 would take about 7 ms total latency (same as with a single disk), but
 the transfer would take less time--5 µs instead of 80 µs.  You'd need
 16 disks, and you'd tie them all up for 7 ms.  And this doesn't
 consider the times of SCSI command setup and such.
In the rare case where you need maximum bandwidth for only one application and
one stream, I'm glad to hear that all drives are tied up in the job.

 
 Basically, this is not the way to go if you have multiple clients for
 your storage.  Look at http://www.lemis.com/vinum/problems.html and
 http://www.lemis.com/vinum/Performance-issues.html for more
 details.
 
  Spindle synchronisation won't bring you that much on modern HDDs - I tried
  it using 5 Seagate Elite 2.9G (5.25" Full-Height).
 
  It should be useful for RAID-3 and streaming video.
 
  In case of large transfers it will make sense - but FFS is unable to set
  up big enough requests.
 
 No, this is a case where you're only using one client, so my
 argumentation above doesn't apply (since you're reading sequentially,
 so latency is no longer an issue).
I don't know what bandwidth streaming video needs, but if you need the
additional bandwidth of all used disks, the first thing to do is to linearise
access to the disks.
Multi-file access often breaks linearisation.

All I tried to say is that it is hopeless to expect much more
bandwidth than a single disk can deliver with single-process access.
As an example: yesterday I was asked if 6 old striped disks would be faster
for cvsup than one of his modern disks, because it sometimes needs more than
one telephone unit.
The answer is no. cvsupd (if run regularly) spends most of its time sending
the directory content of the destination.
Usually there are no other programs accessing any disks at the same time,
so you can benefit only a very small bit from additional drives.
Maybe from the additional block cache on the drives and for updating atime.

Believe it or not, multiple files are accessed simultaneously on servers and
maybe under some window managers, but on many home and desktop machines it
happens only rarely.
As an example, I personally use 7 200M IBM disks striped to one volume (they
all have LEDs :).
The only way to utilize nearly all of them in a sensible way is writing with
softupdates enabled.


-- 
B.Walter  COSMO-Project  http://www.cosmo-project.de
[EMAIL PROTECTED]  Usergroup  [EMAIL PROTECTED]






Re: Adding disks -the pain. Also vinum

1999-08-05 Thread Greg Lehey
On Tuesday,  3 August 1999 at 23:20:45 +0200, Bernd Walter wrote:
 On Tue, Aug 03, 1999 at 03:59:46PM +0930, Greg Lehey wrote:
 On Tuesday,  3 August 1999 at  8:12:17 +0200, Bernd Walter wrote:

 For UFS/FFS there is no point in setting the stripe size too low.
 It is generally slower to access 32k each on two different HDDs than to
 access 64k on one HDD.

 It is always slower where the positioning time is greater than the
 transfer time for 32 kB.  On modern disks, 32 kB transfer in about 300
 µs.  The average rotational latency of a disk running at 10,800 rpm is
 2.8 ms, and even with spindle synchronization there's no way to avoid
 rotational latency under these circumstances.

 It shouldn't be the latency, because with spindle sync it is the same
 on both disks if the transfer is requested at exactly the same time, which
 is of course idealized.

Spindle sync ensures that the same sectors on different disks are
under the heads at the same time.  When you perform a stripe transfer,
you're not accessing the same sectors, you're accessing different
sectors.  There's no way to avoid rotational latency under these
circumstances.

 The point is that you have more than a single transfer.  With small
 transfers spindle sync is able to win back some of the performance
 you have lost with a too small stripe size.

No, this isn't correct, unless you're running 512 byte stripes.  In
this case, a single-stripe transfer of, say, 8 kB with the disks above
would take about 7 ms total latency (same as with a single disk), but
the transfer would take less time--5 µs instead of 80 µs.  You'd need
16 disks, and you'd tie them all up for 7 ms.  And this doesn't
consider the times of SCSI command setup and such.
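
To put rough numbers on that, here's a quick sketch in Python using the
figures above (the ~300 µs per 32 kB transfer rate and ~7 ms combined seek
plus rotational latency are the assumed inputs; real drives will vary):

# Rough model of an 8 kB read over 512-byte stripes vs. one disk.
TRANSFER_US_PER_32K = 300.0   # ~300 us to transfer 32 kB, as above
LATENCY_US = 7000.0           # ~7 ms seek + rotational latency per request

request = 8 * 1024
stripe = 512

transfer_single = request / (32 * 1024) * TRANSFER_US_PER_32K   # ~75 us
ndisks = request // stripe                                      # 16 disks
transfer_striped = stripe / (32 * 1024) * TRANSFER_US_PER_32K   # ~5 us per disk

print("one disk:  %.0f us latency + %.0f us transfer" % (LATENCY_US, transfer_single))
print("%d disks:  %.0f us latency + %.0f us transfer each, all busy for the"
      " whole request" % (ndisks, LATENCY_US, transfer_striped))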

Basically, this is not the way to go if you have multiple clients for
your storage.  Look at http://www.lemis.com/vinum/problems.html and
http://www.lemis.com/vinum/Performance-issues.html for more
details.

 Spindle synchronisation won't bring you that much on modern HDDs - I tried
 it using 5 Seagate Elite 2.9G (5.25" Full-Height).

 It should be useful for RAID-3 and streaming video.

 In case of large transfers it will make sense - but FFS is unable to set
 up big enough requests.

No, this is a case where you're only using one client, so my
argumentation above doesn't apply (since you're reading sequentially,
so latency is no longer an issue).

Greg
--
See complete headers for address, home page and phone numbers
finger g...@lemis.com for PGP public key





Re: Adding disks -the pain. Also vinum

1999-08-03 Thread Bernd Walter

On Tue, Aug 03, 1999 at 03:59:46PM +0930, Greg Lehey wrote:
 On Tuesday,  3 August 1999 at  8:12:17 +0200, Bernd Walter wrote:
 
  For UFS/FFS there is no point in setting the stripe size too low.
  It is generally slower to access 32k each on two different HDDs than to
  access 64k on one HDD.
 
 It is always slower where the positioning time is greater than the
 transfer time for 32 kB.  On modern disks, 32 kB transfer in about 300
 µs.  The average rotational latency of a disk running at 10,800 rpm is
 2.8 ms, and even with spindle synchronization there's no way to avoid
 rotational latency under these circumstances.
It shouldn't be the latency, because with spindle sync it is the same
on both disks if the transfer is requested at exactly the same time, which
is of course idealized.
The point is that you have more than a single transfer.
With small transfers spindle sync is able to win back some of the performance
you have lost with a too small stripe size.
 
  Spindle synchronisation won't bring you that much on modern HDDs - I tried
  it using 5 Seagate Elite 2.9G (5.25" Full-Height).
 
 It should be useful for RAID-3 and streaming video.
In case of large transfers it will make sense - but FFS is unable to set
up big enough requests.

-- 
B.Walter  COSMO-Project  http://www.cosmo-project.de
[EMAIL PROTECTED]  Usergroup  [EMAIL PROTECTED]






Re: Adding disks -the pain. Also vinum

1999-08-03 Thread Bernd Walter
On Tue, Aug 03, 1999 at 01:35:54PM +0930, Greg Lehey wrote:
 On Tuesday,  3 August 1999 at 11:11:39 +0800, Stephen Hocking-Senior 
 Programmer PGS Tensor Perth wrote:
 
 No, it would cause a higher I/O load.  Vinum doesn't transfer entire
 stripes, it transfers what you ask for.  With a large stripe size, the
 chances are higher that you can perform the transfer with only a
 single I/O.
 
If you use n*64K stripes, UFS/FFS should never access 2 disks at once.
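
A small illustration of that mapping (a simplified striping model, not
vinum's actual code; the disk count and offsets are made up for the example):

def disks_touched(offset, length, stripe, ndisks):
    # Which subdisks a single contiguous request hits under plain striping.
    first = offset // stripe
    last = (offset + length - 1) // stripe
    return sorted({s % ndisks for s in range(first, last + 1)})

# A stripe-aligned 64 kB cluster read on a 4-disk stripe set:
for stripe in (32 * 1024, 64 * 1024, 256 * 1024):
    print(stripe // 1024, "kB stripes ->", disks_touched(128 * 1024, 64 * 1024, stripe, 4))
# 32 kB stripes hit two disks; 64 kB (or any n*64K) stripes keep the request
# on a single disk, as long as it stays stripe-aligned.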

  Looking at the systat display, the 8k fs blocks do seem to be
  clustered into larger requests, so I'm not too worried about the FS
  block size. What have people observed with trying larger FS block
  sizes?
 
 I don't know if anybody has tried larger FS blocks than 8 kB.  I once
 created a file system with 256 kB blocks (just to see if it could be
 done).  I also tried 512 kB blocks, but newfs died of an overflow.
 I'd expect that you would see a marked drop in performance, assuming
 that it would work at all.

AFAIK the limit is 64k, because clustering is limited to 64k and the fs
doesn't seem to handle anything larger well.
I'm using 64k very often, because with this blocksize my growfs tool is
already able to grow an FFS over 1TB.
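
Very roughly why a larger block size raises that ceiling, assuming 32-bit
fragment addresses and the default 8 fragments per block (the real limit
also depends on other on-disk fields, so treat this as a back-of-the-envelope
sketch):

BLOCK = 64 * 1024             # 64 kB blocks
FRAG = BLOCK // 8             # assumed 8 fragments per block
max_bytes = (2 ** 31) * FRAG  # 32-bit fragment addresses
print("max ~%d TB" % (max_bytes // 2 ** 40))   # ~16 TB, comfortably over 1 TB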

-- 
B.Walter  COSMO-Project  http://www.cosmo-project.de
ti...@cicely.de  Usergroup  i...@cosmo-project.de






Re: Adding disks -the pain. Also vinum

1999-08-03 Thread Bernd Walter
On Tue, Aug 03, 1999 at 12:16:06PM +0800, Stephen Hocking-Senior Programmer PGS 
Tensor Perth wrote:
  
  No, it would cause a higher I/O load.  Vinum doesn't transfer entire
  stripes, it transfers what you ask for.  With a large stripe size, the
  chances are higher that you can perform the transfer with only a
  single I/O.
 
 Even if I'm using really large reads?
Several months ago I believed the same, but there are several points here:
 - UFS/FFS doesn't handle clustering over 64k.
 - Modern hard disks do preread, simply by having a reversed sector layout.
 - Without spindle synchronisation you will have additional latency.
 - vinum doesn't aggregate access to subdisks, so the transfer to each subdisk
   is limited by the stripe size.

For UFS/FFS there is no point in setting the stripe size too low.
It is generally slower to access 32k each on two different HDDs than to
access 64k on one HDD.
Spindle synchronisation won't bring you that much on modern HDDs - I tried
it using 5 Seagate Elite 2.9G (5.25" Full-Height).
There was no win using FFS.

If you need performance, try softupdates.
At least for writing it should benefit a lot from striped partitions.
I never really measured it, but I was astonished that you can get over 800
transactions/sec on a ccd with 6 striped disks.

-- 
B.Walter  COSMO-Project  http://www.cosmo-project.de
ti...@cicely.de  Usergroup  i...@cosmo-project.de






Re: Adding disks -the pain. Also vinum

1999-08-03 Thread Greg Lehey
On Tuesday,  3 August 1999 at  8:12:17 +0200, Bernd Walter wrote:
 On Tue, Aug 03, 1999 at 12:16:06PM +0800, Stephen Hocking-Senior Programmer 
 PGS Tensor Perth wrote:

 No, it would cause a higher I/O load.  Vinum doesn't transfer entire
 stripes, it transfers what you ask for.  With a large stripe size, the
 chances are higher that you can perform the transfer with only a
 single I/O.

 Even if I'm using really large reads?
 Several months ago I believed the same, but there are several points here:
  - UFS/FFS doesn't handle clustering over 64k.
  - Modern hard disks do preread, simply by having a reversed sector layout.
  - Without spindle synchronisation you will have additional latency.
  - vinum doesn't aggregate access to subdisks, so the transfer to each subdisk
    is limited by the stripe size.

Note, BTW, that this wouldn't make much sense.  To aggregate access to
consecutive stripes, your transfer would have to involve *all* the
disks in the stripe set, which would be a ridiculous performance hit.
Read http://www.lemis.com/vinum/Performance-issues.html for more
details.

 For UFS/FFS there is no point in setting the stripe size too low.
 It is generally slower to access 32k each on two different HDDs than to
 access 64k on one HDD.

It is always slower where the positioning time is greater than the
transfer time for 32 kB.  On modern disks, 32 kB transfer in about 300
µs.  The average rotational latency of a disk running at 10,800 rpm is
2.8 ms, and even with spindle synchronization there's no way to avoid
rotational latency under these circumstances.
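
Making the arithmetic explicit (a rough sketch; the 300 µs figure is the one
assumed above, and seek time and command overhead are ignored):

RPM = 10800
rotational_ms = 0.5 * 60.0 / RPM * 1000   # average rotational latency, ~2.8 ms
transfer_32k_ms = 0.3                     # ~300 us per 32 kB, as above

one_disk = rotational_ms + transfer_32k_ms        # ~3.1 ms for 32 kB on one disk
two_disks = rotational_ms + transfer_32k_ms / 2   # ~2.9 ms per disk, but two I/Os
print("%.2f ms vs %.2f ms" % (one_disk, two_disks))
# Splitting the transfer saves ~0.15 ms but doubles the number of requests.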

 Spindle synchronisation won't bring you that much on modern HDDs - I tried
 it using 5 Seagate Elite 2.9G (5.25" Full-Height).

It should be useful for RAID-3 and streaming video.

Greg
--
See complete headers for address, home page and phone numbers
finger g...@lemis.com for PGP public key



Re: Adding disks -the pain. Also vinum

1999-08-02 Thread Greg Lehey
On Tuesday,  3 August 1999 at 11:11:39 +0800, Stephen Hocking-Senior Programmer 
PGS Tensor Perth wrote:
 The people who I work for were about to junk a bunch of 6 year old disks when
 I snaffled them. Among them were 4 DEC DSP5400S (3.8GB each), with a nice
 external case. These disks had been doing duty on a boat carrying out seismic
 surveys, attached to misc. Sun workstations. These are typical of their
 vintage - full height 5 1/4 drives, fast narrow SCSI2, and noisy as all
 blazes. I have them hooked up to a NCR810, as one striped FS (it's just for
 experiments, not valuable data). fdisking them was easy, but disklabelling
 them was a royal pain. I ended up editing the /etc/disktab file to add an
 appropriate label and running disklabel -w -B /dev/rda0c DSP5400S which
 still gives an error message, but appears to install the label. I only found
 out that it installed the label by accident, wasting a bunch of time in the
 process.

Did you try 'disklabel -w da0 auto'?

 I created a striped volume across the 4 drives with the default stripe size of
 256K. I read the rather interesting discussion within the man pages about the
 optimal stripe size and have a couple of queries. Firstly, the type of traffic
 that this 13.9GB filesystem will see will be mainly sequential reading and
 writing of large files. There will only be a few files (~2-30), each several
 gigs. (I'm fooling around with the seismic software at home, and typical
 surveys can result in files many gigs in size). Given that FreeBSD breaks
 I/Os down into 64k chunks, would having a 64k stripe size give more
 parallelism?

No, it would cause a higher I/O load.  Vinum doesn't transfer entire
stripes, it transfers what you ask for.  With a large stripe size, the
chances are higher that you can perform the transfer with only a
single I/O.
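
A rough way to see why (a simplified model assuming sector-aligned requests
at uniformly random offsets, not a measurement):

def split_probability(request, stripe, sector=512):
    # Chance that a request starting at a random sector offset crosses a
    # stripe boundary and therefore needs more than one I/O.
    return min(1.0, (request - sector) / stripe)

for stripe_kb in (64, 256, 1024):
    p = split_probability(8 * 1024, stripe_kb * 1024)
    print("%4d kB stripes: %2.0f%% of random 8 kB requests split" % (stripe_kb, p * 100))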

 I'm seeing 4.4MB/s if I read from an individual disk, but only about
 5.6MB/s when reading from the striped volume. 

How many concurrent processes?  Remember that striping doesn't buy you
anything with a single process.  You might like to try rawio
(ftp://ftp.lemis.com/pub/rawio.tar.gz) and see what that tells you.

 Looking at the systat display, the 8k fs blocks do seem to be
 clustered into larger requests, so I'm not too worried about the FS
 block size. What have people observed with trying larger FS block
 sizes?

I don't know if anybody has tried larger FS blocks than 8 kB.  I once
created a file system with 256 kB blocks (just to see if it could be
done).  I also tried 512 kB blocks, but newfs died of an overflow.
I'd expect that you would see a marked drop in performance, assuming
that it would work at all.

Greg
--
See complete headers for address, home page and phone numbers
finger g...@lemis.com for PGP public key





Re: Adding disks -the pain. Also vinum

1999-08-02 Thread Stephen Hocking-Senior Programmer PGS Tensor Perth
 
 Did you try 'disklabel -w da0 auto'?

Yup - it also complained.

 
 No, it would cause a higher I/O load.  Vinum doesn't transfer entire
 stripes, it transfers what you ask for.  With a large stripe size, the
 chances are higher that you can perform the transfer with only a
 single I/O.

Even if I'm using really large reads?
 
  I'm seeing 4.4MB/s if I read from an individual disk, but only about
  5.6MB/s when reading from the striped volume. 
 
 How many concurrent processes?  Remember that striping doesn't buy you
 anything with a single process.  You might like to try rawio
 (ftp://ftp.lemis.com/pub/rawio.tar.gz) and see what that tells you.

OK, I was just using good ol' dd, with dd if=/cfs/foo of=/dev/null bs=2m

 
  Looking at the systat display, the 8k fs blocks do seem to be
  clustered into larger requests, so I'm not too worried about the FS
  block size. What have people observed with trying larger FS block
  sizes?
 
 I don't know if anybody has tried larger FS blocks than 8 kB.  I once
 created a file system with 256 kB blocks (just to see if it could be
 done).  I also tried 512 kB blocks, but newfs died of an overflow.
 I'd expect that you would see a marked drop in performance, assuming
 that it would work at all.
 

OK. The minimum data size read from these files tends to be about 10k. I'll 
have to try this all with a real app.


Stephen
-- 
  The views expressed above are not those of PGS Tensor.

We've heard that a million monkeys at a million keyboards could produce
 the Complete Works of Shakespeare; now, thanks to the Internet, we know
 this is not true.    -- Robert Wilensky, University of California



