Re: spare group

2007-06-12 Thread Neil Brown
On Tuesday June 12, [EMAIL PROTECTED] wrote:
  According to the source:
 
  * If an array has active < raid && spare == 0 && spare_group != NULL
  * Look for another array with spare > 0 and active == raid and same spare_group
  * if found, choose a device and hotremove/hotadd
 
  This is not happening. What is my mistake?
  
  Is mdadm --monitor running?  That is required to perform
  spare-migration.
 
 Yes, of course.

Good - I need to get the obvious things out of the way first :-)

(reads code).

Ahhh. You are using version-1 superblocks aren't you?  That code only
works for version-0.90 superblocks.  That was careless of me.  It
shouldn't be hard to make it work more generally, but it looks like it
will be slightly more than trivial.  I'll try to get you a patch in
the next day or so (feel free to remind me if I seem to have
forgotten).
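
For reference, here is a minimal, self-contained sketch of the decision that
quoted Monitor.c comment describes; the struct and helper names are
hypothetical, not mdadm's real code.

/*
 * Sketch of the spare-migration check: an array that is degraded and has
 * no spare looks for a donor in the same spare-group that is healthy and
 * has a spare to give.  Hypothetical types and names only.
 */
#include <stdio.h>
#include <string.h>

struct array_state {
        const char *name;
        const char *spare_group;        /* NULL if none */
        int raid;                       /* raid_disks   */
        int active;                     /* active disks */
        int spares;                     /* spare disks  */
};

/* Return an array that could donate one of its spares to "needy". */
static struct array_state *find_donor(struct array_state *needy,
                                      struct array_state *all, int n)
{
        int i;

        if (needy->spare_group == NULL ||
            needy->active >= needy->raid || needy->spares != 0)
                return NULL;            /* nothing to do */

        for (i = 0; i < n; i++) {
                struct array_state *a = &all[i];
                if (a == needy || a->spare_group == NULL)
                        continue;
                if (a->spares > 0 && a->active == a->raid &&
                    strcmp(a->spare_group, needy->spare_group) == 0)
                        return a;       /* hot-remove a spare here, hot-add it to needy */
        }
        return NULL;
}

int main(void)
{
        struct array_state arrays[] = {
                { "/dev/md0", "ubul", 3, 3, 1 },        /* healthy, one spare */
                { "/dev/md1", "ubul", 3, 2, 0 },        /* degraded, no spare */
        };
        struct array_state *donor = find_donor(&arrays[1], arrays, 2);

        printf("donor for %s: %s\n", arrays[1].name,
               donor ? donor->name : "(none)");
        return 0;
}

With mdadm --monitor running and both arrays tagged with the same spare-group
in mdadm.conf, this is the decision that should nominate the healthy array as
the donor for the degraded one.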

Thanks for testing and reporting this problem.

NeilBrown


SLES 9 SP3 and mdadm 2.6.1 (via rpm)

2007-06-12 Thread Thorsten Wolf
Hello everyone.

I've got a SLES9 SP3 running and I've been quite happy with it so far.

Recently, I've created a 4 disk spanning RAID-5 on our company server. Runs 
quite nice and we're happy with that too. I created that RAID using the SLES 
mdadm (1.4 I believe) package.

After discovering that there is a much newer mdadm out here (2.6.1), I decided 
to upgrade. It went just fine. Raid still running at 120 MB/sec.

I then added a disk to the raid, which went fine as well.. BUT:

The added disk /dev/sda1 shows up in /proc/mdstat, but does not have the spare 
(s) flag.

Plus... the --grow doesn't work...

I get the "mdadm: /dev/md0: Cannot get array details from sysfs" error which has 
been discussed before. Can it be that this is caused by the 2.6.5-7.2xx kernel? 
Any ideas?

regards,

Thorsten
-- 
Contact me on ICQ: 7656468
skype://sysfried



Re: SLES 9 SP3 and mdadm 2.6.1 (via rpm)

2007-06-12 Thread Neil Brown
On Tuesday June 12, [EMAIL PROTECTED] wrote:
 Hello everyone.
 
 I've got a SLES9 SP3 running and I've been quite happy with it so far.
 
 Recently, I've created a 4 disk spanning RAID-5 on our company
 server. Runs quite nice and we're happy with that too. I created
 that RAID using the SLES mdadm (1.4 I believe) package. 
 
 After discovering that there is a much newer mdadm out here (2.6.1),
 I decided to upgrade. It went just fine. Raid still running at 120
 MB/sec. 
 
 After adding a disk to the raid, which went fine as well.. BUT:
 
 The added disk /dev/sda1 shows up in /proc/mdstat, but does not have
 the spare (s) flag. 
 
 Plus... the --grow doesn't work...
 
 I get the: mdadm: /dev/md0: Cannot get array details from sysfs
 error which has been discussed before. Can it be that this is caused
 by the 2.6.5-7.2xx Kernel? Any ideas? 

Yes.  All of your issues are caused by using a 2.6.5 based kernel.
However even upgrading to SLES10 would not get you raid5-grow.  That
came a little later.  You would need to compile a mainline kernel or
wait for SLES11.

NeilBrown


Re: spare group

2007-06-12 Thread Tomka Gergely

Neil Brown wrote:

(reads code).

Ahhh. You are using version-1 superblocks aren't you?  That code only
works for version-0.90 superblocks.  That was careless of me.  It
shouldn't be hard to make it work more generally, but it looks like it
will be slightly more than trivial.  I'll try to get you a patch in
the next day or so (feel free to remind me if I seem to have
forgotten).

Thanks for testing and reporting this problem.

NeilBrown


# mdadm26 --detail /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Jun 12 10:31:08 2007
     Raid Level : raid5
     Array Size : 19534848 (18.63 GiB 20.00 GB)
  Used Dev Size : 9767424 (9.31 GiB 10.00 GB)
   Raid Devices : 3
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Jun 12 10:33:35 2007
          State : clean
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 5fd83926:01739a55:36458d87:119f8994 (local to host ursula)
         Events : 0.4

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1

       3       8       49        -      spare   /dev/sdd1

/dev/md1:
        Version : 00.90.03
  Creation Time : Tue Jun 12 10:31:29 2007
     Raid Level : raid5
     Array Size : 19534848 (18.63 GiB 20.00 GB)
  Used Dev Size : 9767424 (9.31 GiB 10.00 GB)
   Raid Devices : 3
  Total Devices : 3
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Tue Jun 12 10:36:18 2007
          State : clean, degraded
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 64K

           UUID : 815d6fc4:a55c2602:36458d87:119f8994 (local to host ursula)
         Events : 0.6

    Number   Major   Minor   RaidDevice State
       0       8       65        0      active sync   /dev/sde1
       1       8       81        1      active sync   /dev/sdf1
       2       0        0        2      removed

       3       8       97        -      faulty spare   /dev/sdg1

ARRAY /dev/md1 level=raid5 num-devices=3 spare-group=ubul UUID=815d6fc4:a55c2602:36458d87:119f8994
ARRAY /dev/md0 level=raid5 num-devices=3 spares=1 spare-group=ubul UUID=5fd83926:01739a55:36458d87:119f8994


I am very sorry, but it doesn't work with 0.90 superblocks either :( We 
are missing something small but important here. One thing before you start 
to code: mdadm was running in monitor mode and reported a Fail event. 
mdadm is the latest version, 2.6.2.


tg


Re: spare group

2007-06-12 Thread Neil Brown
On Tuesday June 12, [EMAIL PROTECTED] wrote:
 
 I am very sorry, but it doesn't work with 0.90 superblocks either :( We 
 are missing something small but important here. One thing before you start 
 to code: mdadm was running in monitor mode and reported a Fail event. 
 mdadm is the latest version, 2.6.2.
 
 tg

Hmmm. 
[tests code]

Yes, you are right.  It looks like a bug was introduced in 2.6 which
broke various aspects of --monitor.  I guess I need to add some
--monitor tests to my regression test suite.

This patch should fix it.

Thanks again,
NeilBrown


---
Fix spare migration and other problems with --monitor.

2.6 broke --monitor in various ways, including spare migration
stopped working.  This fixes it.


### Diffstat output
 ./Monitor.c |1 +
 1 file changed, 1 insertion(+)

diff .prev/Monitor.c ./Monitor.c
--- .prev/Monitor.c 2007-02-22 14:59:11.0 +1100
+++ ./Monitor.c 2007-06-12 19:48:34.0 +1000
@@ -328,6 +328,7 @@ int Monitor(mddev_dev_t devlist,
 		for (i=0; i<MaxDisks && i <= array.raid_disks + array.nr_disks;
 		     i++) {
 			mdu_disk_info_t disc;
+			disc.number = i;
 			if (ioctl(fd, GET_DISK_INFO, &disc) >= 0) {
 				info[i].state = disc.state;
 				info[i].major = disc.major;
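
For context on the one-line change: GET_DISK_INFO takes disc.number as its
input (the slot to query) and fills in the remaining fields, so with the
structure left uninitialised the ioctl was being asked about whatever happened
to be on the stack. A minimal stand-alone sketch of the intended usage; the
device path and slot count are just examples.

/*
 * Query the first few disk slots of an md array the way --monitor does,
 * setting disc.number before each GET_DISK_INFO call.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/major.h>        /* MD_MAJOR */
#include <linux/raid/md_u.h>    /* mdu_disk_info_t, GET_DISK_INFO */

int main(void)
{
        int fd = open("/dev/md0", O_RDONLY);
        int i;

        if (fd < 0) {
                perror("open /dev/md0");
                return 1;
        }
        for (i = 0; i < 8; i++) {
                mdu_disk_info_t disc;

                disc.number = i;        /* the input: which slot to ask about */
                if (ioctl(fd, GET_DISK_INFO, &disc) >= 0)
                        printf("slot %d: major %d minor %d state 0x%x\n",
                               i, disc.major, disc.minor, disc.state);
        }
        close(fd);
        return 0;
}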


Re: Some RAID levels do not support bitmap

2007-06-12 Thread Bill Davidsen

Neil Brown wrote:

On Monday June 11, [EMAIL PROTECTED] wrote:
  

Jan Engelhardt wrote:


Hi,


RAID levels 0 and 4 do not seem to like the -b internal. Is this 
intentional? Runs 2.6.20.2 on i586.

(BTW, do you already have a PAGE_SIZE=8K fix?)

14:47 ichi:/dev # mdadm -C /dev/md0 -l 4 -e 1.0 -b internal -n 2 /dev/ram[01]
mdadm: RUN_ARRAY failed: Input/output error
mdadm: stopped /dev/md0
14:47 ichi:/dev # mdadm -C /dev/md0 -l 0 -e 1.0 -b internal -n 2 /dev/ram[01]
mdadm: RUN_ARRAY failed: Cannot allocate memory
mdadm: stopped /dev/md0

Right... md: bitmaps not supported for this level.
  
  
Bitmaps show what data has been modified but not written. For RAID-0 
there is no copy, therefore there can be no bitmap to show what still 
needs to be updated. I would have thought that RAID-4 would support 
bitmaps, but maybe it was just never added because use of RAID-4 is 
pretty uncommon.



added late rather than never added.  2.6.21 supports bitmaps on
RAID-4.  The patch is about 2 lines and would apply to 2.6.20 with no
trouble.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3d37890baa4ca962f8a6b77525b8f3d0698eee09


  
BTW: RAID-4 seems to work fine with an external bitmap. Were you trying 
to do internal?



I suspect you were using 2.6.21-rc6 or later?


No, the machine I had for trial was 2.6.15 with a few patches, none in 
RAID. Seemed to work just fine if I put the bitmap on an external 
device. Was that not as expected?


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: SLES 9 SP3 and mdadm 2.6.1 (via rpm)

2007-06-12 Thread Bill Davidsen

Neil Brown wrote:

On Tuesday June 12, [EMAIL PROTECTED] wrote:
  

Hello everyone.

I've got a SLES9 SP3 running and I've been quite happy with it so far.

Recently, I've created a 4 disk spanning RAID-5 on our company
server. Runs quite nice and we're happy with that too. I created
that RAID using the SLES mdadm (1.4 I believe) package. 


After discovering that there is a much newer mdadm out here (2.6.1),
I decided to upgrade. It went just fine. Raid still running at 120
MB/sec. 


After adding a disk to the raid, which went fine as well.. BUT:

The added disk /dev/sda1 shows up in /proc/mdstat, but does not have
the spare (s) flag. 


Plus... the --grow doesn't work...

I get the: mdadm: /dev/md0: Cannot get array details from sysfs
error which has been discussed before. Can it be that this is caused
by the 2.6.5-7.2xx Kernel? Any ideas? 



Yes.  All of your issues are caused by using a 2.6.5 based kernel.
However even upgrading to SLES10 would not get you raid5-grow.  That
came a little later.  You would need to compile a mainline kernel or
wait for SLES11.


I have to think that if features require a later kernel version, a 
warning message would be appropriate. I'm always leery about trying a 
new mdadm version with a vendor kernel unless it's a minor bugfix release.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: [PATCH 000 of 2] md: Introduction - bugfixes for md/raid{1,10}

2007-06-12 Thread Bill Davidsen

NeilBrown wrote:

Following are a couple of bugfixes for raid10 and raid1.  They only
affect fairly uncommon configurations (more than 2 mirrors) and can
cause data corruption.  They are suitable for 2.6.22 and 21-stable.

Thanks,
NeilBrown


 [PATCH 001 of 2] md: Fix two raid10 bugs.
 [PATCH 002 of 2] md: Fix bug in error handling during raid1 repair.


I don't know about uncommon, given that I have six machines in this 
building with three-way RAID-1 for the boot partition, to be sure I can 
get off the ground far enough to bring the other partitions up.


And since you added write-mostly for remote mirrors, I do have a few 
systems doing >2 mirrors as well. This set of patches will definitely be 
in my kernel by this afternoon.


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



Re: below 10MB/s write on raid5

2007-06-12 Thread Bill Davidsen

Dexter Filmore wrote:
I recently upgraded my file server, yet I'm still unsatisfied with the write 
speed.

Machine now is a Athlon64 3400+ (Socket 754) equipped with 1GB of RAM.
The four RAID disks are attached to the board's onboard SATA controller 
(Sil3114, attached via PCI).

Kernel is 2.6.21.1, custom on Slackware 11.0.
The RAID is on four Samsung SpinPoint disks, with LVM on top, 3 volumes, and XFS on each.

The machine does some other work too, but I still would have expected to get 
into the 20-30MB/s range. Is that too much to ask?
  


Increase your stripe cache size in /sys/block/mdX/md/stripe_cache_size. 
If you have a chunk size of 256, try setting the cache size to 8192 and 
see if your write performance ends up ~100MB/s or so.


  echo 8192 > /sys/block/mdX/md/stripe_cache_size

Where X is your array name, of course.

Note, larger values will help more, but it's definitely diminishing 
returns, so don't get carried away. There was a report of problems with 
sizes > 32768; I don't remember the details, so I would avoid that as well.
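
As a rough sanity check before raising the value: each cached stripe holds
roughly one page per member device, so the cache costs approximately
stripe_cache_size * PAGE_SIZE * (number of member disks) of RAM, ignoring
bookkeeping overhead. A small back-of-the-envelope calculator, using the 8192
value and the four disks from this thread as example inputs:

/*
 * Estimate stripe-cache memory use, assuming one page per member device
 * per cached stripe (an approximation).
 */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        long page = sysconf(_SC_PAGESIZE);      /* usually 4096 bytes */
        long cache_size = 8192;                 /* value written to stripe_cache_size */
        long ndisks = 4;                        /* member devices in the array */
        long bytes = cache_size * page * ndisks;

        printf("stripe_cache_size=%ld on %ld disks: about %ld MiB of RAM\n",
               cache_size, ndisks, bytes >> 20);
        return 0;
}

At 8192 and four disks that works out to about 128 MiB, which is why getting
carried away is a real concern on a 1GB machine.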


--
bill davidsen [EMAIL PROTECTED]
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979



raid1 with nbd member hangs MD on SLES10 and RHEL5

2007-06-12 Thread Mike Snitzer

When using raid1 with one local member and one nbd member (marked as
write-mostly), MD hangs when trying to format /dev/md0 with ext3.  Both
'cat /proc/mdstat' and 'mdadm --detail /dev/md0' hang indefinitely.
I've not tried to reproduce on 2.6.18 or 2.6.19-ish kernel.org kernels
yet, but this issue affects both SLES10 and RHEL5.

sysrq traces for RHEL5 follow; I don't have immediate access to a
SLES10 system at the moment but I've seen this same hang with SLES10
SP1 RC4:

cat /proc/mdstat

cat   S 8100048e7de8  6208 11428  11391 (NOTLB)
8100048e7de8 076eb000 80098ea6 0008
81001ff170c0 810037e17100 00045f8d13924085 0006b89f
81001ff17290 0001 0005 
Call Trace:
[80098ea6] seq_printf+0x67/0x8f
[80233df5] __mutex_lock_interruptible_slowpath+0x7f/0xbc
[801be644] md_seq_show+0x123/0x6aa
[8009939f] seq_read+0x1b8/0x28d
[8007b7a8] vfs_read+0xcb/0x171
[8007bb87] sys_read+0x45/0x6e
[800097e1] tracesys+0xd1/0xdc

/sbin/mdadm --detail /dev/md0

mdadm S 810035a1dd78  6384  3829   3828 (NOTLB)
810035a1dd78 81003f4570c0 80094e4d 0001
81000617c870 810037e17100 00043e667c800afe 0005ae94
81000617ca40 0001 0021 
Call Trace:
[80094e4d] mntput_no_expire+0x19/0x89
[80233df5] __mutex_lock_interruptible_slowpath+0x7f/0xbc
[801be4e7] md_open+0x2e/0x68
[80082560] do_open+0x216/0x316
[8008280b] blkdev_open+0x0/0x4f
[8008282e] blkdev_open+0x23/0x4f
[80079889] __dentry_open+0xd9/0x1dc
[80079a40] do_filp_open+0x2d/0x3d
[80079a94] do_sys_open+0x44/0xbe
[800097e1] tracesys+0xd1/0xdc

I can provide more detailed information; please just ask.

thanks,
Mike


Re: raid1 with nbd member hangs MD on SLES10 and RHEL5

2007-06-12 Thread Neil Brown
On Tuesday June 12, [EMAIL PROTECTED] wrote:
 
 I can provide more detailed information; please just ask.
 

A complete sysrq trace (all processes) might help.

NeilBrown


Re: raid1 with nbd member hangs MD on SLES10 and RHEL5

2007-06-12 Thread Mike Snitzer

On 6/12/07, Neil Brown [EMAIL PROTECTED] wrote:

On Tuesday June 12, [EMAIL PROTECTED] wrote:

 I can provide more detailed information; please just ask.


A complete sysrq trace (all processes) might help.


I'll send it to you off list.

thanks,
Mike