Re: Degrade mode shifting error

1999-08-18 Thread Daniel Seiler

Hello,

On Tue, 17 Aug 1999 [EMAIL PROTECTED] wrote:

 On Tue, Aug 17, 1999 at 04:18:27AM -0600, Ziber wrote:
  Hello,
  I am facing a problem with RAID.
  I have created RAID-1 on 500MB partitions on sda5 and sdb5. For testing purposes
  I disconnected power from sdb while copying data to the RAID. But the RAID is still
  trying to access sdb, giving errors in an infinite loop.
 

I have a similar problem to the one Ziber is describing, but I've gone further and 
put my two disks on two different SCSI channels (one on an Adaptec 2940 UW
and one on an Adaptec 3940 UW). If I pull the power plug from one of the 
disks while doing a large cp from one partition to another, the system 
stops accessing both disks and I get the timeout and reset messages. 

 It's a problem with your SCSI driver or hardware. It's not directly related to
 RAID, except for the fact that the RAID layer suffers from the shortcomings in
 those lower layers...
 
 After a boot (still with the power disconnected) your RAID should run just
 fine, but in degraded mode of course.
 
 The AIC7xxx and NCR8xx drivers should handle failing devices fairly well, but
 there could still be problems... Your mail indicates that someone should probably
 take a looksee on the AIC7xxx error handling.
 
 Sometimes you will see that the entire SCSI bus locks up, so your entire RAID
 dies when one disk dies.  This is unfortunate, but we're limited by the hardware
 or - at best - the SCSI drivers here.
 I've personally never seen a disk hang the bus, except when I did nasty stuff
 like you tried.  Usually a disk gives a hard read/write error, the RAID layer
 detects the error, and handles the situation just fine.
 
 If you have electrical failures on the bus, or if your SCSI drivers fail, there's
 little the RAID code can do about this.
 
 If you're really paranoid, you can put each disk on its own SCSI channel to
 minimize the chance of one disk taking down the others with it.

didn't work for me 


daniel
 
 
 : [EMAIL PROTECTED]  : And I see the elder races, :
 :.: putrid forms of man:
 :   Jakob Østergaard  : See him rise and claim the earth,  :
 :OZ9ABN   : his downfall is at hand.   :
 :.:{Konkhra}...:
 



RE: Problem setting up RAID-1

1999-08-18 Thread Bruno Prior

Like Luca says, don't use HTML in mails to this list.

 Hi. I'm setting up RAID-1 in a server with two IDE 4.3GB hard
 disks. I'm using RedHat 5.2 with kernel 2.2.9.

First problem. raidtools-0.90 doesn't work with 2.2.8 and 2.2.9. You don't want
to use these kernels anyway, because of the filesystem corruption problems.
Either step back to 2.2.6/7 or step up to 2.2.10 or 2.2.12pre4 (although wasn't
2.2.10 also supposed to suffer from the same problem?).

 I've downloaded and installed raidtools-0.90.
 I've customized the /etc/raidtab file as follows:

No mention of patching the kernel. Did you try to apply the raid-patch to the
kernel? You get it from ftp.country.kernel.org/pub/linux/daemons/raid/alpha/
(substitute your country code for country, e.g. de for Germany, es for Spain).
This is also the place to get the latest raidtools - you should get tools and
patch to match.

The reason you may be confused is that the stock RedHat 6.0 kernels (e.g.
2.2.5-15) come with this patch already applied unlike other kernels which you
may download. I assume this is not a stock RedHat kernel. I don't think they put
out a 2.2.9 RPM.

 raiddev /dev/md0
   raid-level  1
   nr-raid-disks   2
   nr-spare-disks  0
   chunk-size 4
   persistent-superblock 1
   device  /dev/hda5
   raid-disk   0
   device  /dev/hdc5
   raid-disk   1

Looks OK to me.

 The problem is that when I run the command "./mkraid /dev/md0"
 the following error is displayed:
cannot determine md version: No such file or directory

Patching and rebuilding the kernel should solve this. You may also need to
create the md devices as Luca says. If you download and build raidtools, the
"make install" part of the process should do this for you. If not, follow Luca's
instructions. The added bonus of downloading the raidtools tarball is that it
includes the new HOWTO. I'm not sure if RedHat's RPM includes this, or where it
puts it.
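
For reference, a minimal sketch of the sequence described above (file names and
mirror are examples only - pick the matching patch/tools pair for your kernel
from the alpha directory):

    # fetch a matching pair, e.g. from ftp.de.kernel.org
    #   /pub/linux/daemons/raid/alpha/raid0145-19990724-2.2.10.gz
    #   /pub/linux/daemons/raid/alpha/raidtools-19990724-0.90.tar.gz

    cd /usr/src/linux
    zcat /tmp/raid0145-19990724-2.2.10.gz | patch -p1
    make menuconfig && make dep bzImage modules modules_install

    cd /tmp && tar xzf raidtools-19990724-0.90.tar.gz && cd raidtools-0.90
    ./configure && make && make install   # or just "make install", depending on the snapshot

    # or create the device nodes by hand (md is block major 9)
    for i in 0 1 2 3; do mknod /dev/md$i b 9 $i; done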

Cheers,

Bruno Prior [EMAIL PROTECTED]



UK Suppliers who will preconfigure RAID for Linux?

1999-08-18 Thread Joe McFadden

Hi,

Can anyone recommend UK hardware suppliers who will 
preconfigure RAID for Linux?

We are looking to purchase a Linux system for use as a 
web-server for a lightly loaded internet / intranet site, 
which is being redeveloped as a database-backed site 
(apache/mod_perl/MySQL).

We initially approached Dell UK, but at present they only 
offer Linux pre-installed on their Precision Workstations, 
and they also don't support RAID for Linux. So I'm looking 
for alternative suppliers. So far I've come up with
Digital Networks (http://www.dnuk.com). Does anyone have 
any other recommendations?

Also, it would be useful to have a "sanity check" on the 
specs I've come up with:

450MHz, 256MB, 2x9GB SCSI 7200rpm (RAID 1)

My reason for wanting to use RAID 1 over RAID 5 is to gain 
complete data redundancy (data set is small, so disk usage 
is not a big problem). RAID 1 seems to be a simpler 
design than RAID 5, so hopefully there is less to go wrong. 
Does this sound reasonable?  Finally, should I ask for 
separate SCSI controllers for the two disks, or is this 
overkill?

Regards


--
Joe McFadden   email: [EMAIL PROTECTED]
Web Development Managerweb:   http://www.icr.ac.uk
Institute of Cancer Research   Tel:   0171 352 8133 x5363
123 Old Brompton Road  Fax:   0171 225 2574
LONDON SW7 3RP, UK

Support the everyman campaign against male cancers by 
visiting http://www.icr.ac.uk/everyman/
--



Re: raid0 and raw io

1999-08-18 Thread Stephen C. Tweedie

Hi,

On Thu, 29 Jul 1999 09:38:20 -0700, Carlos Hwa [EMAIL PROTECTED]
said:

 I have a 2-disk raid0 with 32k chunk size using raidtools 0.90 beta10
 right now, and have applied Stephen Tweedie's raw I/O patch. The raw I/O
 patch works fine with a single disk, but if I try to use raw I/O on
 /dev/md0, for some reason transfer sizes are only 512 bytes according to
 the SCSI analyzer, no matter what I specify (I am using lmdd from
 lmbench to test: lmdd if=/dev/zero of=/dev/raw1 bs=65536 count=2048;
 /dev/raw1 is the raw device for /dev/md0). Mr. Tweedie says it should
 work correctly, so could this be a limitation of the Linux RAID
 software? Thanks.

I'm back from holiday, so...

Ingo, any thoughts on this?  The raw IO code is basically just stringing
together temporary buffer_heads and then submitting them all, as a
single call, to ll_rw_block (up to a limit of 128 sectors per call).
The IOs are ordered, so attempt_merge() should be happy enough about
merging.  The only thing I can think of which is somewhat unusual about
the IOs is that the device's blocksize is unconditionally set to 512
bytes beforehand: will that confuse md's block merging?

--Stephen



Re: Newbie question

1999-08-18 Thread James Manning

  I have a running system that I would like to put into raid1. However,
  what I have read is that the mkraid command would erase everything. Is
  this true? Will I lose the data that I have, or is it only the second
  disk that bites it?
 
 There is a way to preserve the data on the existing
 disk. Go, and fetch the latest Software-RAID HowTo.
 http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/

I'm probably losing it, but where in this HOWTO is the trick for 
raid1'ing a currently working drive covered?

Speaking of which, something I didn't quite get is why a "chunk-size"
was defined for raid1... Since they're mirrors, what's a chunk size
mean in raid1?

Thanks,

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Is the latest RAID stuff in 2.2.11-ac3 ?

1999-08-18 Thread rich


Is the latest (ie. 19990724) RAID stuff in 2.2.11-ac3 ?

If not, what version of the RAID software does this
kernel correspond to?

On a related issue, when will all the good stuff
like RAID and the large fdset patch make it into
the real kernel - I really need these, and they are
surely stable enough by now.

Rich.

-- 
[EMAIL PROTECTED] | Free email for life at: http://www.postmaster.co.uk/
BiblioTech Ltd, Unit 2 Piper Centre, 50 Carnwath Road, London, SW6 3EG.
+44 171 384 6917 | Click here to play XRacer: http://xracer.annexia.org/
--- Original message content Copyright © 1999 Richard Jones ---



Re: Is the latest RAID stuff in 2.2.11-ac3 ?

1999-08-18 Thread Fred Reimer


The new RAID is in 2.2.12pre series.  Last I heard Alan was going to
send them to Linus to make the decision on whether to keep the changes
in or not.  I think most ppl are for keeping the changes in.  I don't
know about the fdset patch...

fwr


On Wed, 18 Aug 1999, [EMAIL PROTECTED] wrote:
 Is the latest (ie. 19990724) RAID stuff in 2.2.11-ac3 ?
 
 If not, what version of the RAID software does this
 kernel correspond to?
 
 On a related issue, when will all the good stuff
 like RAID and the large fdset patch make it into
 the real kernel - I really need these, and they are
 surely stable enough by now.
 
 Rich.
 
 -- 
 [EMAIL PROTECTED] | Free email for life at: http://www.postmaster.co.uk/
 BiblioTech Ltd, Unit 2 Piper Centre, 50 Carnwath Road, London, SW6 3EG.
 +44 171 384 6917 | Click here to play XRacer: http://xracer.annexia.org/
 --- Original message content Copyright © 1999 Richard Jones ---



Re: harmless (?) error

1999-08-18 Thread James Manning

   On an Intel architecture machine you'll never get more than about 80MB/s
   regardless of the number of SCSI busses or the speed of the disks.  The
   PCI bus becomes a bottleneck at this point.
 
 Another consideration of course. But I think his problem was that he
 couldn't get any higher than 30MB/s, let alone 80. :)
   
 What about 64bit PCI? A lot of Intel, Compaq, Dell and Alpha boards
 have those slots, and intraserver for one has 64bit PCI scsi
 controllers. Then there's the even rarer 66MHz PCI. Wonder how they
 would affect the benchmarks.

FWIW, the eXtremeRAID 1100 cards are 64-bit PCI only (as are the ServeRAID
cards in my previous testing).  Other testing I've done has shown many
situations where even our quad P6/200 machines (PC Server 704) could
sustain 40MB/sec over the 32-bit 33-MHz PCI bus, so I'm really hoping
that I can do better with the 64-bit 33-MHz bus :)

   I missed the start of this thread, so I don't know what RAID level you're
   using.  I did some RAID-0 tests with the new Linux RAID code back in March
   on a dual 450MHz Xeon box.  Throughput on a single LVD bus appears to peak
   at about 55MB/s - you can get 90% of this with four 7,200RPM Barracudas.  
   With two LVD busses, write performance peaks at just over 70MB/s
   (diminishing returns after six disks)

Could you describe the set-up?  When I switched to s/w raid0 over
h/w raid0 just for testing, my block write rate in bonnie only went
up to 43 MB/sec.  All the best performance has come with the smallest
chunk-sizes that make sense (4k), where my improvement was significant
over 64k chunk-sizes.

 which ties in with why James only sees 30MB/s - 10 drives per
 channel.

I'll try pulling 5 drives out of each drive enclosure, which will
leave 5 10krpm Cheetah-3 LVD drives on each of the two LVD channels.
So, the theory is too much bus contention, and my numbers should improve
over 10 of the same drives on each of the two channels?

Out of sheer curiosity, I made this one on the raw drive, not
using the partition...

On the raw drives, s/w raid0 over 2 h/w raid0's (each channel separate)
 ---Sequential Output ---Sequential Input-- --Random--
 -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
2047  8658 43.3 12563 16.5  3235 10.2  4598 18.6  4729  8.4 141.7  1.8

Afterwards, I umount, raidstop, confirm dead with /proc/mdstat and get:
Re-read table failed with error 16: Device or resource busy.

Since I didn't wanna go through a reboot to clear this, I simply added
the "p1" extensions to the raidtab entries, even though fdisk said:

Disk /dev/rd/c0d0p1 doesn't contain a valid partition table
Disk /dev/rd/c0d1p1 doesn't contain a valid partition table

And it actually let me mkraid that and mdstat showed it active!
mke2fs and bonnie later, I get these results:

On partitions(?), s/w raid0 over 2 h/w raid0's (each channel separate)
 ---Sequential Output ---Sequential Input-- --Random--
 -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
2047 22127 98.9 58031 46.4 21058 47.0 23962 89.2 43068 62.1 648.0  7.8

This represents the same block-write rate I was getting with 10 drives
on each channel, so I'd have to agree that the channels are definitely
in the way.

Any ideas on the bizarre results with raid on the raw drive block devs?
For that matter, mkraid letting me raid partitions that fdisk said
weren't even there?

Thanks!

James Manning
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: Status of RAID in 2.2.11

1999-08-18 Thread James Manning

 I think so too. We have in the past been prepared to do this sort of 
 stuff. 2.2.11 did it with ISDN and I heard few moans, 2.2 at some point
 needs to do this with the knfsd update.

Side question... I noticed that the KNI stuff was stripped going into
2.2.11-ac3 (rightly so), so is 2.2.12 going to be a target to get all
the KNI stuff working? or perhaps somewhere in .12-acX?

Thanks!

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: Status of RAID in 2.2.11

1999-08-18 Thread Alan Cox

 Side question... I noticed that the KNI stuff was stripped going into
 2.2.11-ac3 (rightly so), so is 2.2.12 going to be a target to get all
 the KNI stuff working? or perhaps somewhere in .12-acX?

Probably 2.3

Alan



Re: Status of RAID in 2.2.11

1999-08-18 Thread James Manning

  Side question... I noticed that the KNI stuff was stripped going into
  2.2.11-ac3 (rightly so), so is 2.2.12 going to be a target to get all
  the KNI stuff working? or perhaps somewhere in .12-acX?
 
 Probably 2.3

So the current s/w raid stuff will end up migrating into 2.3.x
as well?  That'd be fantastic... are there 2.3.x versions already
with the s/w raid integration present in 2.2.11-ac3 and/or .12pre?

Thanks!

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: Newbie question

1999-08-18 Thread Christian Ordig


On 18-Aug-99 James Manning wrote:
 Speaking of which, something I didn't quite get is why a "chunk-size"
 was defined for raid1... Since they're mirrors, what's a chunk size
 mean in raid1?
Yes, RAID1 is a mirror, so a chunk size cannot increase write speed, but when
reading, this can be done "in parallel" from the two (or three or whatever
number of) disks if the data is contiguous...


---
Christian Ordig | Homepage: http://thor.prohosting.com/~chrordig/ 
Germany |eMail: Christian Ordig [EMAIL PROTECTED]
   __   _   | 
  / /  (_)__  __   __   | Why Linux? Because it is free, stable, and  
 / /__/ / _ \/ // /\ \/ /   | bugs can be fixed in source opposed to waiting  
//_/_//_/\_,_/ /_/\_\   | for a stable WinTendo from Micro$oft.   



Re: harmless (?) error

1999-08-18 Thread Jan Edler

I've been following these threads on sw raid over hw raid, etc., with
some curiosity.  I also did testing with a Mylex DAC1164P, in my case
using 8 IBM Ultrastar 18ZX drives (10,000 rpm).

I get the following bonnie results on that system, just using hw raid,
for sequential input, sequential output, and random seeks, using the
default 8k/64k stripe/segment size, and write-back cache (32MB):

  input  output random
MB/s  %cpu MB/s  %cpu   /s   %cpu

1drive-jbod 19.45 16.3 17.99 16.4 153.90 4.0
raid0   48.49 42.1 25.48 23.1 431.00 7.4
raid01  53.23 41.4 21.22 19.0 313.10 9.5
raid5   52.47 39.3 21.35 19.8 365.60 11.2
raid5-degraded  20.23 15.5 21.86 20.3 277.90 7.8

That was using 3 busses, but the numbers for raid0 and raid5 weren't
very different when I first tried them with all 8 drives on 1 bus.

For comparison, here are my sw raid numbers using 8 much cheaper
ATA drives (Seagate ST317242A) and Ultra-33 controllers
(1 drive per controller, 16k chunksize):

  input  output random
MB/s  %cpu MB/s  %cpu   /s   %cpu

1drive  15.40 12.6 14.03 11.9 101.50 2.9
raid0-low   48.73 49.7 36.77 33.8 242.60 7.0
raid0-high  61.57 64.4 37.18 33.7 227.70 7.5
raid1-8drives   20.63 20.1 4.26 4.0 180.90 6.3
raid1-2drives   15.31 12.7 14.36 13.1 103.80 3.3
raid10  27.11 30.4 18.42 17.2 191.60 6.8
raid5   40.43 38.6 30.32 26.8 209.10 6.3
raid5-degraded  33.46 31.0 31.40 28.1 164.20 4.9

On all platforms, I see quite a bit of variation from run to run,
so my numbers are optimistic: I run several times (usually 10),
and take the best results.  I could (and sometimes do) compute
mean and standard deviation, but I find taking the best results to be
most "useful" for analysis and comparison.

I also skewed the numbers towards best results in another way: I generally
report results only for the outer cylinders (low numbered cylinders).
Well, I also test the inner cylinders for comparison.  The difference
is much less pronounced for the IBM SCSI drives than for the Seagate
ATA drives.  I do this for the simple reason that I want to know what the
maximum performance is.  I also want the minimum, so I check that too.
There's an anomaly in the software raid0 numbers, where the low cylinders
seem to perform worse than the high cylinders, which is the opposite
of what happens in all other cases.   Any ideas on this one?  It's very
repeatable for me, and disturbing since it involves the very highest
throughput numbers I've measured.  If there's enough interest, I can
present more low vs. high cylinder results, but this message is long
enough as it is.

I hacked my copy of bonnie in two ways: to skip the per-char tests,
which I'm not interested in, and to call fdatasync() after each write
test (but that change didn't really make much difference).  I use
an awk script to pick the best results from multiple runs, and to
convert KB/s to MB/s.
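
For what it's worth, the "several runs, keep the best" step is easy to script;
a minimal sketch (my own illustration, not the script mentioned above - the
column numbers assume the standard one-line bonnie summary with a -m label,
and -s/-m/-d are stock bonnie options):

    for i in 1 2 3 4 5; do
        bonnie -s 2047 -m md0 -d /mnt/md0 2>/dev/null | tail -1
    done | awk '{ if ($5 > wr) wr = $5; if ($11 > rd) rd = $11; if ($13 > sk) sk = $13 }
                END { printf "best: write %.1f MB/s  read %.1f MB/s  seeks %.1f/s\n",
                      wr/1024, rd/1024, sk }'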

The sw raid measurements here were taken on a P3/450 system, and the
hw raid was done on a P2/400 system.  I compared both processors
on sw raid5, and found maybe 1% difference in performance.
The machines were otherwise identical, with 512MB ram.
I used uniprocessor mode, rather than SMP mode, to improve repeatability
(and because the uniform-ide patches cause hangups for me in SMP mode).

I don't know why your sw over hw raid numbers are so poor by comparison.
Did you try plain hw raid, as above?

My hypothesis is that the mylex itself, or the kernel + driver,
is limited in the number of requests/s that can be handled.

I speculate that sequential throughput is not very important to the
companies developing and buying raid systems.  Rather, some sort
of database transactions are what drives this industry, where the
workload is much more driven by random I/O.  In that case, what you
want more than anything else is a whole lot of disks, because the
random I/O rate will go up.  The fact that my random results in bonnie
don't scale up anywhere near linear with the number of drives is
a weakness in this theory.  Perhaps the current Linux kernel/driver
structure is not well suited to tons of random I/Os.

Still, I consider the lack of higher rates to be an unsolved mystery.

Jan Edler
NEC Research Institute



Re: harmless (?) error

1999-08-18 Thread James Manning

   input  output random
 MB/s  %cpu MB/s  %cpu   /s   %cpu
 
 1drive-jbod 19.45 16.3 17.99 16.4 153.90 4.0
 raid0   48.49 42.1 25.48 23.1 431.00 7.4
 raid01  53.23 41.4 21.22 19.0 313.10 9.5
 raid5   52.47 39.3 21.35 19.8 365.60 11.2
 raid5-degraded  20.23 15.5 21.86 20.3 277.90 7.8

So in most cases you wrote data much faster than writing it?
Or am I misinterpreting your table?

 I hacked my copy of bonnie in two ways: to skip the per-char tests,
 which I'm not interested in, and to call fdatasync() after each write
 test (but that change didn't really make much difference).  I use
 an awk script to pick the best results from multiple runs, and to
 convert KB/s to MB/s.

Sounds quite useful :) willing to put it somewhere?

 I don't know why your sw over hw raid numbers are so poor by comparison.
 Did you try plain hw raid, as above?

My current 40MB/sec (s/w 5 over h/w 0) and 43MB/sec (s/w 0 over h/w 0)
numbers are at least getting closer and I hope to keep digging
into the scsi drive config stuff for improved performance.

If my DAC1164P didn't have a bad channel on it, I'd be testing
over 3 channels which should help performance immensely based
on my previous results.

 My hypothesis is that the mylex itself, or the kernel + driver,
 is limited in the number of requests/s that can be handled.

I don't see it settable in the DAC960 driver ... any ideas?
I've flashed with all the latest s/w from mylex.com

DAC960: * DAC960 RAID Driver Version 2.2.2 of 3 July 1999 *
DAC960: Copyright 1998-1999 by Leonard N. Zubkoff [EMAIL PROTECTED]
DAC960#0: Configuring Mylex DAC1164P PCI RAID Controller
DAC960#0:   Firmware Version: 5.07-0-79, Channels: 3, Memory Size: 32MB
DAC960#0:   PCI Bus: 11, Device: 8, Function: 0, I/O Address: Unassigned
DAC960#0:   PCI Address: 0xFE01 mapped at 0xC080, IRQ Channel: 9
DAC960#0:   Controller Queue Depth: 128, Maximum Blocks per Command: 128
DAC960#0:   Driver Queue Depth: 127, Maximum Scatter/Gather Segments: 33
DAC960#0:   Stripe Size: 8KB, Segment Size: 8KB, BIOS Geometry: 128/32

Stripe and Segment size hasn't made any difference in performance for me.

As soon as my shipment of more 1164P's come in, I'll be spreading the
drives across many more channels.  My 20 drives over 2 channels numbers
equalling my 10 drives over 2 channels sure indicate to do so :)

I'm still trying to figure out the *HUGE* difference in performance
I saw between s/w raid on partitions vs. s/w raid on whole drives :)

Thanks,

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: harmless (?) error

1999-08-18 Thread James Manning

 So in most cases you wrote data much faster than writing it?

Ummm... s/writing/reading/;
:)

James



Re: harmless (?) error

1999-08-18 Thread Jan Edler

On Wed, Aug 18, 1999 at 12:18:05PM -0400, James Manning wrote:
input  output random
  MB/s  %cpu MB/s  %cpu   /s   %cpu
  
  1drive-jbod 19.45 16.3 17.99 16.4 153.90 4.0
  raid0   48.49 42.1 25.48 23.1 431.00 7.4
  raid01  53.23 41.4 21.22 19.0 313.10 9.5
  raid5   52.47 39.3 21.35 19.8 365.60 11.2
  raid5-degraded  20.23 15.5 21.86 20.3 277.90 7.8
 
 So in most cases you wrote data much faster than writing it?
 Or am I misinterpreting your table?

In most cases I can read data much faster than I can write it.

  I hacked my copy of bonnie in two ways: to skip the per-char tests,
  which I'm not interested in, and to call fdatasync() after each write
  test (but that change didn't really make much difference).  I use
  an awk script to pick the best results from multiple runs, and to
  convert KB/s to MB/s.
 
 Sounds quite useful :) willing to put it somewhere?

Yeah, but the changes are really quite trivial.
I'll attach the patch to Bonnie and the awk script
for picking the best results to this message
(just run bonnie several times with the same params and pipe
into the awk script).

 Stripe and Segment size hasn't made any difference in performance for me.

I saw very small changes when I tried a few settings other than the default.
 
 As soon as my shipment of more 1164P's come in, I'll be spreading the
 drives across many more channels.  My 20 drives over 2 channels numbers
 equalling my 10 drives over 2 channels sure indicate to do so :)

Or, as I say, it might indicate the mylex or driver is maxed out.

Jan Edler


--- Bonnie.c.unhacked   Wed Aug 28 12:23:49 1996
+++ Bonnie.c    Fri Aug  6 13:41:27 1999
@@ -148,6 +148,7 @@
   size = Chunk * (size / Chunk);
   fprintf(stderr, "File '%s', size: %ld\n", name, size);
 
+#if UNHACKED_VERSION
   /* Fill up a file, writing it a char at a time with the stdio putc() call */
   fprintf(stderr, "Writing with putc()...");
   newfile(name, fd, stream, 1);
@@ -160,6 +161,8 @@
* note that we always close the file before measuring time, in an
*  effort to force as much of the I/O out as we can
*/
+  if (fdatasync(fd) < 0)
+    io_error("fdatasync after putc");
   if (fclose(stream) == -1)
 io_error("fclose after putc");
   get_delta_t(Putc);
@@ -186,10 +189,13 @@
 if ((words = read(fd, (char *) buf, Chunk)) == -1)
   io_error("rwrite read");
   } /* while we can read a block */
+  if (fdatasync(fd) < 0)
+    io_error("fdatasync after rewrite");
   if (close(fd) == -1)
 io_error("close after rewrite");
   get_delta_t(ReWrite);
   fprintf(stderr, "done\n");
+#endif /* UNHACKED_VERSION */
 
   /* Write the whole file from scratch, again, with block I/O */
   newfile(name, fd, stream, 1);
@@ -205,11 +211,14 @@
 if (write(fd, (char *) buf, Chunk) == -1)
   io_error("write(2)");
   } /* for each word */
+  if (fdatasync(fd) < 0)
+    io_error("fdatasync after fast write");
   if (close(fd) == -1)
 io_error("close after fast write");
   get_delta_t(FastWrite);
   fprintf(stderr, "done\n");
 
+#if UNHACKED_VERSION
   /* read them all back with getc() */
   newfile(name, fd, stream, 0);
  for (words = 0; words < 256; words++)
@@ -232,6 +241,7 @@
   /* use the frequency count */
  for (words = 0; words < 256; words++)
 sprintf((char *) buf, "%d", chars[words]);
+#endif /* UNHACKED_VERSION */
 
   /* Now suck it in, Chunk at a time, as fast as we can */
   newfile(name, fd, stream, 0);
@@ -308,6 +318,8 @@
if (read(seek_control[0], seek_tickets, 1) != 1)
  io_error("read ticket");
   } /* until Mom says stop */
+  if (fdatasync(fd) < 0)
+    io_error("fdatasync after seek");
   if (close(fd) == -1)
 io_error("close after seek");
 
@@ -382,15 +394,27 @@
 
   printf("TRTD%s/TDTD%d/TD", machine, size / (1024 * 1024));
   
printf("TD%d/TDTD%4.1f/TDTD%d/TDTD%4.1f/TDTD%d/TDTD%4.1f/TD",
+#if UNHACKED_VERSION
 (int) (((double) size) / (delta[(int) Putc][Elapsed] * 1024.0)),
 delta[(int) Putc][CPU] / delta[(int) Putc][Elapsed] * 100.0,
+#else
+0, 0.0,
+#endif
 (int) (((double) size) / (delta[(int) FastWrite][Elapsed] * 1024.0)),
 delta[(int) FastWrite][CPU] / delta[(int) FastWrite][Elapsed] * 100.0,
+#if UNHACKED_VERSION
 (int) (((double) size) / (delta[(int) ReWrite][Elapsed] * 1024.0)),
 delta[(int) ReWrite][CPU] / delta[(int) ReWrite][Elapsed] * 100.0);
+#else
+0, 0.0);
+#endif
   printf("TD%d/TDTD%4.1f/TDTD%d/TDTD%4.1f/TD",
+#if UNHACKED_VERSION
 (int) (((double) size) / (delta[(int) Getc][Elapsed] * 1024.0)),
 delta[(int) Getc][CPU] / delta[(int) Getc][Elapsed] * 100.0,
+#else
+0, 0.0,
+#endif
 (int) (((double) size) / (delta[(int) FastRead][Elapsed] * 1024.0)),
 delta[(int) FastRead][CPU] / delta[(int) FastRead][Elapsed] * 100.0);
   printf("TD%5.1f/TDTD%4.1f/TD/TR\n",
@@ -415,15 +439,27 @@
 
   printf("%-8.8s %4d ", machine, size / (1024 * 1024));
   printf("%5d %4.1f %5d 

Re: harmless (?) error

1999-08-18 Thread James Manning

 input  output random
   MB/s  %cpu MB/s  %cpu   /s   %cpu
   
   1drive-jbod 19.45 16.3 17.99 16.4 153.90 4.0
   raid0   48.49 42.1 25.48 23.1 431.00 7.4
   raid01  53.23 41.4 21.22 19.0 313.10 9.5
   raid5   52.47 39.3 21.35 19.8 365.60 11.2
   raid5-degraded  20.23 15.5 21.86 20.3 277.90 7.8
  
  So in most cases you wrote data much faster than writing it?
  Or am I misinterpreting your table?
 
 In most cases I can read data much faster than I can write it.

Whoa... I think I've had "input" and "output" switched in their
correlation to file reading and file writing...  What worries me
about that is this result from a previous post:

- On partitions(?), s/w raid0 over 2 h/w raid0's (each channel separate)
-  ---Sequential Output ---Sequential Input-- --Random--
-  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
-   MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
- 2047 22127 98.9 58031 46.4 21058 47.0 23962 89.2 43068 62.1 648.0  7.8

So I wrote blocks at ~58MB/sec and read them at ~43MB/sec?

FWIW, the s/w 5 over h/w 0 has write at 27MB/sec (99.5% CPU util) and read
at 40MB/sec (63.6% util).  Now I know the processors (4 500 MHz Xeons)
have much more to do than just XOR calcs (and they can only use MMX
until the KNI code works), but combined with my s/w 0 over the same h/w
0's from above, doesn't this mean that my s/w 5 is bottlenecking on the
4 500 MHz processors?  The find_fastest picked out p5_mmx at ~1070MB/sec

Thanks,

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: Is the latest RAID stuff in 2.2.11-ac3 ?

1999-08-18 Thread Mark Ferrell

Was playing w/ 2.2.12-final last night in alan's 2.2.12pre releases and it
appears to fully support the newer raid source.

--
 Mark Ferrell  : [EMAIL PROTECTED]

[EMAIL PROTECTED] wrote:

 Is the latest (ie. 19990724) RAID stuff in 2.2.11-ac3 ?

 If not, what version of the RAID software does this
 kernel correspond to?

 On a related issue, when will all the good stuff
 like RAID and the large fdset patch make it into
 the real kernel - I really need these, and they are
 surely stable enough by now.

 Rich.

 --
 [EMAIL PROTECTED] | Free email for life at: http://www.postmaster.co.uk/
 BiblioTech Ltd, Unit 2 Piper Centre, 50 Carnwath Road, London, SW6 3EG.
 +44 171 384 6917 | Click here to play XRacer: http://xracer.annexia.org/
 --- Original message content Copyright © 1999 Richard Jones ---



Re: Is the latest RAID stuff in 2.2.11-ac3 ?

1999-08-18 Thread Mark Ferrell

Oh yah,

Alan, the tulip drivers stopped functioning correctly for me in 2.2.12.  I
have a gut feeling it's not a kernel issue, but I haven't had a chance to beat
on it.
Basic description:
Patched a freshly extracted 2.2.11 with the 2.2.12-final patch, copied my
/boot/config-2.2.11-smp from my previous kernel as .config in the source, and
did a make oldconfig.
Installed the kernel.  Everything came up correctly, including the tulip driver,
but the card wasn't capable of sending data to the network.  ifconfig showed
that it was receiving and sending packets, yet it wasn't.  Wish I could be more
descriptive, but I am at work now and don't have any of my logs ..

Will get home and do a make mrproper and try again and see if it still has
issues .. will drop you a log if all is not well.

--
 Mark Ferrell  : [EMAIL PROTECTED]

"Ferrell, Mark (EXCHANGE:RICH2:2K25)" wrote:

 Was playing w/ 2.2.12-final last night in alan's 2.2.12pre releases and it
 appears to fully support the newer raid source.

 --
  Mark Ferrell  : [EMAIL PROTECTED]

 [EMAIL PROTECTED] wrote:

  Is the latest (ie. 19990724) RAID stuff in 2.2.11-ac3 ?
 
  If not, what version of the RAID software does this
  kernel correspond to?
 
  On a related issue, when will all the good stuff
  like RAID and the large fdset patch make it into
  the real kernel - I really need these, and they are
  surely stable enough by now.
 
  Rich.
 
  --
  [EMAIL PROTECTED] | Free email for life at: http://www.postmaster.co.uk/
  BiblioTech Ltd, Unit 2 Piper Centre, 50 Carnwath Road, London, SW6 3EG.
  +44 171 384 6917 | Click here to play XRacer: http://xracer.annexia.org/
  --- Original message content Copyright © 1999 Richard Jones ---



raid1 on 2.0.36

1999-08-18 Thread Olaf Ihmig



Hi,

I want to add a new metadevice, /dev/md3. In /var/log/messages I find:

.. kernel: md: 08.21: invalid raid superblock magic (0) on block 4233088

What does it mean? The problem is that I can't mdrun /dev/md3, because there is an
error message: invalid argument. /dev/md3 was built with mdcreate.
My mdtab is:

/dev/md3  raid1,8k,0,9c02e979 /dev/sdc1  /dev/sdd1


How can I solve this problem?

thanx.olaf.

[EMAIL PROTECTED]




Re: Is the latest RAID stuff in 2.2.11-ac3 ?

1999-08-18 Thread Alex Buell

On Wed, 18 Aug 1999 [EMAIL PROTECTED] wrote:

 On a related issue, when will all the good stuff like RAID and the
 large fdset patch make it into the real kernel - I really need these,
 and they are surely stable enough by now.

The large fdset patch won't go in until Linus is happy with it, however,
Alan Cox is working on addressing these issues, so watch this space!

PS: No news from Omar yet, but probably should hear soon. 

Cheers, 
Alex 
-- 

Legalise cannabis today!

http://www.tahallah.demon.co.uk



Re: harmless (?) error

1999-08-18 Thread James Manning

 Again your %cpu is high compared to what I've seen.  I've never seen
 anything at 99%.  Anyone else?

My s/w raid5 CPU util has always been between 99 and 100% for writes.
If some kind soul could help me figure out kernel profiling, I'll
profile 2.2.12 doing block s/w raid5 writes.
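
Not RAID-specific, but since the question comes up: a minimal sketch of the
stock kernel profiler, assuming readprofile from util-linux and a System.map
that matches the running kernel (paths are assumptions):

    # boot with "profile=2" appended to the kernel command line, then:
    readprofile -r                     # zero the /proc/profile counters
    bonnie -s 2047 -d /mnt/md0         # run the workload of interest
    readprofile -m /boot/System.map | sort -nr | head -20   # hottest kernel symbols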

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: [Re: Raid isnt shifting to degrading mode while copying data to it.]

1999-08-18 Thread Gadi Oxman

On 18 Aug 1999, Ziber wrote:

 But the problem is that RAID isn't working in this manner. It tries hard to access
 the faulty device, and after giving some errors the system just hangs. In my case I am
 using Samba to access the RAID, and if I am copying data to the RAID and pull the
 power cable, after some period my Samba connection is lost and the Linux box halts.
 RAID is supposed to isolate the drive after its old buffer is finished, but in my case
 it isn't.
 This is very basic RAID functionality, isn't it?
 
 Is there any other procedure to test that raid is working fine?

Even if we do not survive the failure, the redundancy that RAID provides
is still valuable. On the next boot after the failure, we will be able
to continue working in degraded mode with the working drives.

To be able to survive the failure and continue working is not so
simple since in addition to the RAID layer, the low level disk drivers,
the bus controller and the devices have to survive the failure.

Gadi

 
 
 
 
 
 Gadi Oxman [EMAIL PROTECTED] wrote:
 This behavior can be improved, but most probably no bug there.
 
 On the first detected error, we have switched to degraded mode and
 *new requests* to the RAID device will not be redirected to the failed
 disk drive.
 
 However, in the current architecture *old requests* which have
 been submitted to the failed disk prior to the failure can't be
 aborted while they are already in the low level drivers' queues.
 
 That period in which the failed disk queue is flushed can take quite
 a bit of time, as the low level drivers attempt to do everything they
 possibly can to service the queued requests, including slow bus resets,
 etc (and rightly so, as they don't currently know that this device is
 actually a part of a redundant disk setup and that nothing bad will
 happen if they won't try as hard to service the requests). After this
 period, though, the failed disk should be idle.
 
 Gadi
 
 On Tue, 17 Aug 1999, James Manning wrote:
 
   I am facing a problem with RAID.
   I have created RAID-1 on 500MB partitions on sda5 and sdb5. For testing purposes
   I disconnected power from sdb while copying data to the RAID. But the RAID is
   still trying to access sdb and then giving errors in an infinite loop.
  
  Since it looks like each new chunk is generating the raid1 attempt at
  recovery, it does seem like sdb5 should be removed from md0 automatically
  and the recovery thread should only get woken up again when you get the
  drive back into a workable state and "raidhotadd" the partition again.
  
  Worst case, it seems like you should be able to "raidhotremove" the
  partition in the mean time.
  
  Is this a raid/raid1 bug? Or desired behavior?
  
  James
  
   
   Aug 17 17:34:58 client7 kernel: scsi0 channel 0 : resetting for second half of retries. 
   Aug 17 17:34:58 client7 kernel: SCSI bus is being reset for host 0 channel 0.
   Aug 17 17:35:01 client7 kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, offset 8. 
   Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 2603 
   Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557856 
   Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ... 
   Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode 
   Aug 17 17:35:01 client7 kernel: md: recovery thread finished ... 
   Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 2603 
   Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557730 
   Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ... 
   Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode 
   Aug 17 17:35:01 client7 kernel: md: recovery thread finished ... 
   Aug 17 17:35:01 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 2603 
   Aug 17 17:35:01 client7 kernel: scsidisk I/O error: dev 08:15, sector 557602 
   Aug 17 17:35:01 client7 kernel: md: recovery thread got woken up ... 
   Aug 17 17:35:01 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode 
   Aug 17 17:35:01 client7 kernel: md: recovery thread finished ... 
   Aug 17 17:35:02 client7 kernel: scsi0 channel 0 : resetting for second half of retries. 
   Aug 17 17:35:02 client7 kernel: SCSI bus is being reset for host 0 channel 0.
   Aug 17 17:35:05 client7 kernel: (scsi0:0:0:0) Synchronous at 20.0 Mbyte/sec, offset 8. 
   Aug 17 17:35:05 client7 kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 2603 
   Aug 17 17:35:05 client7 kernel: scsidisk I/O error: dev 08:15, sector 557858 
   Aug 17 17:35:05 client7 kernel: md: recovery thread got woken up ... 
   Aug 17 17:35:05 client7 kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode 
   Aug 17 

Re: harmless (?) error

1999-08-18 Thread Gadi Oxman

On Wed, 18 Aug 1999, James Manning wrote:

  input  output random
MB/s  %cpu MB/s  %cpu   /s   %cpu

1drive-jbod 19.45 16.3 17.99 16.4 153.90 4.0
raid0   48.49 42.1 25.48 23.1 431.00 7.4
raid01  53.23 41.4 21.22 19.0 313.10 9.5
raid5   52.47 39.3 21.35 19.8 365.60 11.2
raid5-degraded  20.23 15.5 21.86 20.3 277.90 7.8
   
   So in most cases you wrote data much faster than writing it?
   Or am I misinterpreting your table?
  
  In most cases I can read data much faster than I can write it.
 
 Whoa... I think I've had "input" and "output" switched in their
 correlation to file reading and file writing...  What worries me
 about that is this result from a previous post:
 
 - On partitions(?), s/w raid0 over 2 h/w raid0's (each channel separate)
 -  ---Sequential Output ---Sequential Input-- --Random--
 -  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
 -   MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
 - 2047 22127 98.9 58031 46.4 21058 47.0 23962 89.2 43068 62.1 648.0  7.8
 
 So I wrote blocks at ~58MB/sec and read them at ~43MB/sec?
 
 FWIW, the s/w 5 over h/w 0 has write at 27MB/sec (99.5% CPU util) and read
 at 40MB/sec (63.6% util).  Now I know the processors (4 500 MHz Xeons)
 have much more to do than just XOR calcs (and they can only use MMX
 until the KNI code works), but combined with my s/w 0 over the same h/w
 0's from above, doesn't this mean that my s/w 5 is bottlenecking on the
 4 500 MHz processors?  The find_fastest picked out p5_mmx at ~1070MB/sec
 
 Thanks,
 
 James
 -- 
 Miscellaneous Engineer --- IBM Netfinity Performance Development

I'd recommend verifying if the following changes affect the s/w
raid-5 performance:

1.  A kernel compiled with HZ=1024 instead of HZ=100 -- this
will decrease the latency between "i/o submitted to the raid
layer" and "i/o submitted to the low level drivers" by allowing
the raid-5 kernel thread to run more often.

2.  Increased NR_STRIPES constant in drivers/block/raid5.c from 128
to 256 or 512; this will potentially queue a larger amount of data
to the low level drivers simultaneously.

Gadi



Re: harmless (?) error

1999-08-18 Thread Chance Reschke

On Wed, 18 Aug 1999, James Manning wrote:

I missed the start of this thread, so I don't know what RAID level you're
using.  I did some RAID-0 tests with the new Linux RAID code back in March
on a dual 450MHz Xeon box.  Throughput on a single LVD bus appears to peak
at about 55MB/s - you can get 90% of this with four 7,200RPM Barracudas.  
With two LVD busses, write performance peaks at just over 70MB/s
(diminishing returns after six disks)
 
 Could you describe the set-up?  When I switched to s/w raid0 over
 h/w raid0 just for testing, my block write rate in bonnie only went
 up to 43 MB/sec.  All the best performance has come with the smallest
 chunk-sizes that make sense (4k), where my improvement was significant
 over 64k chunk-sizes.

RAID0-8Disks-2Bus-MDChunk32_Stride256_QDepth10

Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
  ---Sequential Output ---Sequential Input-- --Random--
  -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MachineMB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
 1000  6538 98.9 72882 76.6 24213 54.7  5963 98.9 78587 76.3 387.2  5.0

I found the SCSI queue depth, RAID chunk size and stride size made
only small differences in bonnie results.  The largest improvement in
performance comes from using a 4K filesystem block size.  
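
The filesystem block size is chosen at mke2fs time, for example (assuming ext2
on /dev/md0):

    mke2fs -b 4096 /dev/md0    # 4K blocks instead of the 1K default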

Check out:

   http://www-hpcc.astro.washington.edu/reschke/Linux_MD_Benchmarks/

I've included the bonnie output and a couple of graphs - one showing
throughput as a function of the # of disks and SCSI busses, and
another showing Bonnie's sensitivity to the test-size/RAM-size
ratio.


 - C



Re: harmless (?) error

1999-08-18 Thread Gadi Oxman

On Wed, 18 Aug 1999, Gadi Oxman wrote:

 I'd recommend verifying if the following changes affect the s/w
 raid-5 performance:
 
 1.  A kernel compiled with HZ=1024 instead of HZ=100 -- this
   will decrease the latency between "i/o submitted to the raid
   layer" and "i/o submitted to the low level drivers" by allowing
   the raid-5 kernel thread to run more often.
 
 2.  Increased NR_STRIPES constant in drivers/block/raid5.c from 128
   to 256 or 512; this will potentially queue a larger amount of data
   to the low level drivers simultaneously.

Another thing which might hurt performance is the hash table scanning
order in the raid-5 kernel thread.

In the default setup, the hash table can contain up to 1024 entries,
and the hash function is:

#define stripe_hash(conf, sect, size)   ((conf)->stripe_hashtbl[((sect) / (size >> 9)) & HASH_MASK])

So that sectors 0 - 1023, 1024 - 2047, etc, will fill the slots 0 - 1023
in that order (for 512 bytes block size).

Only NR_STRIPES active stripes might be in the hash table at a time,
and in addition to using the hash table to find a stripe quickly, we
are also queueing the stripes to the low level drivers by scanning the
table in increasing order, starting from slot 0.

This means that if, for example, we currently have 128 pending
write stripes which wrap around the table, for example for sectors
950 - 1077, we will actually queue sectors 1024 - 1077 first, and
only then queue sectors 950 - 1023, which might be one of the
causes for sub-optimal performance.

The following patch tries to find the current minimum sector,
and start running on the table from there in a circular manner,
so that in the above example, we will queue the sectors in
increasing order.

Gadi

--- drivers/block/raid5.c~  Fri Jun 18 10:18:07 1999
+++ drivers/block/raid5.c   Wed Aug 18 22:39:06 1999
@@ -1322,7 +1322,8 @@
struct stripe_head *sh;
raid5_conf_t *conf = data;
	mddev_t *mddev = conf->mddev;
-   int i, handled = 0, unplug = 0;
+   int i, handled = 0, unplug = 0, min_index = 0;
+   unsigned long min_sector = 0;
unsigned long flags;
 
PRINTK(("+++ raid5d active\n"));
@@ -1332,8 +1333,22 @@
md_update_sb(mddev);
}
 	for (i = 0; i < NR_HASH; i++) {
-repeat:
 		sh = conf->stripe_hashtbl[i];
+		if (!sh || sh->phase == PHASE_COMPLETE || sh->nr_pending)
+			continue;
+		if (!min_sector) {
+			min_sector = sh->sector;
+			min_index = i;
+			continue;
+		}
+		if (sh->sector < min_sector) {
+			min_sector = sh->sector;
+			min_index = i;
+		}
+	}
+	for (i = 0; i < NR_HASH; i++) {
+repeat:
+		sh = conf->stripe_hashtbl[(i + min_index) & HASH_MASK];
 		for (; sh; sh = sh->hash_next) {
 			if (sh->raid_conf != conf)
 				continue;




Re: harmless (?) error

1999-08-18 Thread Leonard N. Zubkoff

  Date:   Wed, 18 Aug 1999 09:18:00 -0400 (EDT)
  From: James Manning [EMAIL PROTECTED]

  FWIW, the eXtremeRAID 1100 cards are 64-bit PCI only (as are the ServeRAID
  cards in my previous testing).  Other testing I've done has shown many
  situations where even our quad P6/200 machines (PC Server 704) could
  sustain 40MB/sec over the 32-bit 33-MHz PCI bus, so I'm really hoping
  that I can do better with the 64-bit 33-MHz bus :)

The eXtremeRAID 1100 cards work just fine in 32 bit slots as well.

Leonard



Off Topic

1999-08-18 Thread Andreas Gietl

Sorry that I ask this off-topic question, but I thought that because there
are a lot of people on this list who really know Linux, you could perhaps
help me:

The problem is with the PTYs:

on using autopasswd [which i really need] i get the following error:
/usr/bin/autopasswd gietl xxx
spawn passwd gietl
open(slave pty): bad file number
parent: sync byte write: broken pipe

Perhaps you can tell me what might cause these errors.

The kernel in use has UNIX98 pty support, and there are a lot of
/dev/pty devices.
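
One thing that may be worth ruling out (an assumption on my part, not something
established in this thread): with UNIX98 ptys the slave side lives on the
devpts filesystem, which has to be mounted, e.g.:

    grep devpts /proc/mounts || mount -t devpts devpts /dev/pts
    # and in /etc/fstab, something like:
    #   none  /dev/pts  devpts  gid=5,mode=620  0 0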

Thank you

andreas
-- 
andreas gietl
dedicated server systems
fon +49 9402 2551
fax +49 9402 2604
mobile +49 171 60 70 008
[EMAIL PROTECTED]


# The manual says the program requires     #
#  Windows 95 or better. So I installed    #
#  Linux!  # 




RAID1 over RAID0 on 2.2.10

1999-08-18 Thread Alan Meadows


Hello, 

From past messages I've gotten the feeling that some people consider 
2.2.10 unstable just as 2.2.9, with the corrupt filesystem issue.  Does
anyone here have enough experience with 2.2.10 to know it's stable, or
is anyone firm on the idea that it's unstable?  None of the previous
e-mails seem to be conclusive enough for me =)

Thanks,

Alan Meadows
[EMAIL PROTECTED]



RE: How to setup RAID-1 on / partition

1999-08-18 Thread Christian Ordig


On 18-Aug-99 Gerardo Muñoz Martin wrote:
 How can I set up RAID-1 under root (/) partition if it's mounted?
 

Go and fetch the latest Software-RAID HowTo. I did it as the author describes, and
everything worked well.

URL:http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/


---
Christian Ordig | Homepage: http://thor.prohosting.com/~chrordig/ 
Germany |eMail: Christian Ordig [EMAIL PROTECTED]
   __   _   | 
  / /  (_)__  __   __   | Why Linux? Because it is free, stable, and  
 / /__/ / _ \/ // /\ \/ /   | bugs can be fixed in source opposed to waiting  
//_/_//_/\_,_/ /_/\_\   | for a stable WinTendo from Micro$oft.   



Re: Newbie question

1999-08-18 Thread James Manning

  There is a way to preserve the data on the existing disk. Go, and fetch the
  latest Software-RAID HowTo.
  http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/
 
 I couldn't find that part on the RAID-HOWTO. Could you send it to me?

Since it's not really referred to in the RAID-1 section (a navigation
link there might help :) you'll need to read the somewhat misnamed section
"Booting on RAID", which starts with " ... kernel cannot be loaded at
boot-time from a RAID device ... " (perhaps "Root RAID" would be a better title?)

http://ostenfeld.dk/~jakob/Software-RAID.HOWTO/Software-RAID.HOWTO-4.html#ss4.11
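
Roughly, the trick described there is to build the mirror degraded around the
disk that already holds the data; a sketch from memory (treat device names and
details as assumptions and follow the HOWTO itself):

    # /etc/raidtab -- data currently lives on /dev/hda5, /dev/hdc5 is empty
    raiddev /dev/md0
        raid-level            1
        nr-raid-disks         2
        persistent-superblock 1
        chunk-size            4
        device                /dev/hdc5
        raid-disk             0
        device                /dev/hda5
        failed-disk           1

    mkraid /dev/md0               # md0 comes up degraded, on hdc5 only
    mke2fs /dev/md0
    mount /dev/md0 /mnt && cp -a /data/. /mnt     # copy everything across
    # once the system has been switched over to /dev/md0:
    raidhotadd /dev/md0 /dev/hda5   # reconstruction mirrors onto hda5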

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



4 cheetah-3's pretty much saturate 80MB/sec channel

1999-08-18 Thread James Manning

Dropping each of the 2 channels down to 4 drives started dropping
the performance... barely.   The fact that I'm still getting 99.6% CPU util
on s/w raid0 over 2 h/w raid0's scares me, but I'll try the HZ and NR_STRIPES
settings later on.  I'm getting worried that I'm not bottlenecking on anything
scsi-related at all, and it's something else in the kernel *shrug*
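
For reference, a minimal sketch of where those two knobs live, assuming a 2.2.x
tree with the RAID patch applied (exact locations may differ):

    cd /usr/src/linux
    grep -n 'define HZ'         include/asm-i386/param.h   # 100 -> 1024
    grep -n 'define NR_STRIPES' drivers/block/raid5.c      # 128 -> 256 or 512
    make dep bzImage modules modules_install               # rebuild, reboot, re-test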

raiddev /dev/md0
   raid-level      0
   nr-raid-disks   2
   nr-spare-disks  0
   chunk-size      4

Partitions:
 ---Sequential Output ---Sequential Input-- --Random--
 -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
2047 22015 99.6 54881 44.3 20914 46.9 23882 88.9 42410 62.0 609.7  5.6

Since last time I started with whole drives, got bad performance, then
went to partitions and got good performance, I decided to do them
backwards this time. dd if=/dev/zero of=/dev/rd/c0d{0,1} bs=512 count=100
to make sure the partition tables were clear.

Whole drives:
 ---Sequential Output ---Sequential Input-- --Random--
 -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
  MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
2047 22238 99.0 54198 43.2 20813 47.0 24282 90.4 42598 60.5 623.5  7.3

So I have no idea what was causing my previous performance problems
using whole drives and not partitions :)  Of course, this does mean 1)
my CPU util is still insanely high for a 4-way Xeon and 2) I'm still
writing much faster than reading :)

James
-- 
Miscellaneous Engineer --- IBM Netfinity Performance Development



Re: Raid isnt shifting to degrading mode while copying data to it.

1999-08-18 Thread Ziber

Gadi Oxman [EMAIL PROTECTED] wrote:
Even if we do not survive the failure, the redundancy that RAID provides
is still valuable. On the next boot after the failure, we will be able
to continue working in degraded mode with the working drives.

No, I can't afford the system hanging. I am using RAID for a file server, and in my
case, while a user is copying data to the server and a disk goes down, the server
does nothing but hang. At that point the administrator has to restart the server
manually, and since the system is hung he can't do it remotely and has to operate
from the console (in my case I have to power off the system). 


To be able to survive the failure and continue working is not so
simple since in addition to the RAID layer, the low level disk drivers,
the bus controller and the devices have to survive the failure.

Gadi



Get free email and a permanent address at http://www.netaddress.com/?N=1



Re: 4 cheetah-3's pretty much saturate 80MB/sec channel

1999-08-18 Thread Jan Edler

What is the rationale for running sw raid0 over hw raid0,
using a single hw raid controller?  I don't quite see why
it should be superior to the all-hw solution.
Now, if you have multiple hw raid controllers, or if you have
anemic controllers and want to do sw raid5 over hw raid0,
or something like that, then I can begin to understand.

On Wed, Aug 18, 1999 at 08:37:31PM -0400, James Manning wrote:
 Dropping each of the 2 channels down to 4 drives started dropping
 the performance...barely.   I'm still getting 99.6% CPU util on s/w
 raid0 over 2 h/w raid0's scares me, but I'll try the HZ and NR_STRIPES
 settings later on.  I'm getting worried I'm not bottlenecking on anything
 scsi-related at all, and it's something else in the kernel *shrug*

I agree, something is wrong to produce 99% utilization in such a situation.

Jan Edler
NEC Research Institute