Re: Software RAID on SPARC64

2005-07-19 Thread David Johnson
On Tuesday 19 July 2005 09:17, Turbo Fredriksson wrote:
>
> Yes. For MD devices on SPARC, the (first) partition _MUST_ start on '1',
> not '0' as is the default.
>

Thanks for that everyone, that was indeed the problem.

Regards,
David.

-- 
David Johnson
www.david-web.co.uk - My Personal Website
www.ethereye.org.uk - EtherEye Network Host Checker
www.penguincomputing.co.uk - Need a Web Developer?


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: Software RAID on SPARC64

2005-07-19 Thread Turbo Fredriksson
Quoting Martin <[EMAIL PROTECTED]>:

> On Mon, 2005-07-18 at 14:06 -0400, Sam Creasey wrote:
>> Not sure this is actually it (I'm just guessing here), but are you
>> starting the raid partition on cylinder 0?   Or cyl 1?   I *think* (though
>> I'm not sure) that starting on cylinder 0 actually puts the disklabel into
>> the beginning of the partition itself.  This isn't a problem for UFS, as
>> UFS leaves 8k unused at the beginning of the FS, but it might be a problem
>> for md.
>> 
>> Again, that's just a shot in the dark, take it for what it's worth.
>
> This sounds familiar.  I certainly remember having to go back and
> re-partition starting on 1 not 0 to avoid eating the disklabel and
> partition table.  Can't remember what the symptoms were but I'm pretty
> sure it was on the machine that runs a RAID array...

Yes. For MD devices on SPARC, the (first) partition _MUST_ start on '1', not
'0' as is the default.

- s n i p -
[EMAIL PROTECTED]:~# mdadm -D /dev/md/0 | grep /dev/scsi
   0       8       34        0      active sync   /dev/scsi/host3/bus0/target2/lun0/part2
   1       8      130        1      active sync   /dev/scsi/host3/bus0/target10/lun0/part2
   2       8      226        2      active sync   /dev/scsi/host4/bus0/target2/lun0/part2
   3      65       50        3      active sync   /dev/scsi/host4/bus0/target10/lun0/part2
   4       8      242       -1      spare   /dev/scsi/host4/bus0/target3/lun0/part2
   5       8       50       -1      spare   /dev/scsi/host3/bus0/target3/lun0/part2
[EMAIL PROTECTED]:~# fdisk -l /dev/scsi/host4/bus0/target2/lun0/disc

Disk /dev/scsi/host4/bus0/target2/lun0/disc (Sun disk label): 64 heads, 32 sectors, 8635 cylinders
Units = cylinders of 2048 * 512 bytes

                                 Device Flag    Start       End    Blocks   Id  System
/dev/scsi/host4/bus0/target2/lun0/disc1             1       257    262144   83  Linux native
/dev/scsi/host4/bus0/target2/lun0/disc2           257      8635   8579072   fd  Linux raid autodetect
/dev/scsi/host4/bus0/target2/lun0/disc3             0      8635   8842240    5  Whole disk
- s n i p -
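
For reference, the numbers in the fdisk listing above can be checked with
a little arithmetic. This is only a sketch in Python: the function names
are made up for illustration, and it assumes that the Blocks column counts
1 KiB blocks and that the End cylinder is exclusive, which is what makes
the listed sizes add up.

```python
# Geometry as reported by fdisk in the listing above.
HEADS = 64
SECTORS_PER_TRACK = 32
SECTOR_BYTES = 512
BLOCK_BYTES = 1024  # assumption: the "Blocks" column counts 1 KiB blocks


def cylinder_bytes(heads=HEADS, sectors=SECTORS_PER_TRACK,
                   sector_bytes=SECTOR_BYTES):
    """Size of one cylinder in bytes for the given geometry."""
    return heads * sectors * sector_bytes


def partition_blocks(start_cyl, end_cyl):
    """Size in 1 KiB blocks of a partition covering [start_cyl, end_cyl)."""
    return (end_cyl - start_cyl) * cylinder_bytes() // BLOCK_BYTES


def start_offset_bytes(start_cyl):
    """Byte offset on the disk at which a partition begins."""
    return start_cyl * cylinder_bytes()


if __name__ == "__main__":
    print("cylinder size:", cylinder_bytes(), "bytes")
    for name, start, end in [("disc1", 1, 257),
                             ("disc2", 257, 8635),
                             ("disc3", 0, 8635)]:
        print(name, "starts at byte", start_offset_bytes(start),
              "and spans", partition_blocks(start, end), "blocks")
```

With this geometry one cylinder is exactly 1 MiB, so starting the first
partition on cylinder 1 leaves the first 1 MiB of the disk, label included,
outside every partition except the whole-disk slice.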
-- 
[Hello to all my fans in domestic surveillance] Semtex kibo CIA killed
PLO Nazi NORAD explosion Ft. Bragg Clinton 767 SEAL Team 6 subway
congress
[See http://www.aclu.org/echelonwatch/index.html for more about this]





Re: Software RAID on SPARC64

2005-07-18 Thread Simon Heywood
On Mon, 18 Jul 2005 17:57:22 +0100, David Johnson wrote:
> I've been having problems with software RAID (using mdadm) on Sarge on
> an Ultra Enterprise 450. There seems to be a bug somewhere causing
> corruption of Sun disk labels.
> 
> I start with 8 SCSI disks with valid Sun disk labels and one partition
> filling each disk. The partition types are set to "Linux RAID
> autodetect".

What do the partition tables look like?

You may know this already, but you need to start the first partition on
each disk at cylinder 1 instead of cylinder 0 to leave room for the disk
label.  Apparently, Ext2/3 leaves some free space at the start of
partitions for this sort of thing, but RAID doesn't.
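
A quick way to express that rule as a check, sketched in Python. The tuple
layout and the function name are hypothetical, and the partition data would
have to be copied by hand from `fdisk -l`. Type 5 is skipped because on a
Sun disk label that is the conventional whole-disk slice, which is expected
to start at cylinder 0.

```python
WHOLE_DISK_ID = 0x05  # "Whole disk" slice on a Sun disk label


def partitions_covering_label(partitions):
    """Return the names of partitions that start at cylinder 0.

    `partitions` is an iterable of (name, start_cylinder, type_id) tuples.
    The whole-disk slice is ignored, since it always starts at 0.
    """
    return [name for name, start, type_id in partitions
            if start == 0 and type_id != WHOLE_DISK_ID]


# Hypothetical example data: a layout whose first real partition starts on
# cylinder 1, and an assumed layout with one RAID partition filling the disk.
SAFE_LAYOUT = [("disc1", 1, 0x83), ("disc2", 257, 0xFD), ("disc3", 0, 0x05)]
ASSUMED_FULL_DISK_LAYOUT = [("sdd1", 0, 0xFD), ("sdd3", 0, 0x05)]

if __name__ == "__main__":
    print(partitions_covering_label(SAFE_LAYOUT))
    print(partitions_covering_label(ASSUMED_FULL_DISK_LAYOUT))
```

Any name it returns is a partition whose first sector is also the sector
holding the disk label.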

S.





Re: Software RAID on SPARC64

2005-07-18 Thread Sam Creasey

Not sure this is actually it (I'm just guessing here), but are you
starting the raid partition on cylinder 0?   Or cyl 1?   I *think* (though
I'm not sure) that starting on cylinder 0 actually puts the disklabel into
the beginning of the partition itself.  This isn't a problem for UFS, as
UFS leaves 8k unused at the beginning of the FS, but it might be a problem
for md.

Again, that's just a shot in the dark, take it for what it's worth.
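
To make the guess a little more concrete, here is a rough Python sketch of
which part of the array ends up at offset 0 of each member disk. It assumes
things this thread does not state: the md defaults of the time
(left-symmetric RAID 5 layout, 64 KiB chunks, member data starting at
offset 0 of each partition) and a filesystem that leaves its first 1024
bytes unwritten, as ext2 does. The function names are invented for the
example.

```python
def first_chunk_map(n_disks, chunk_bytes):
    """Map each member disk to the array offset stored at its offset 0.

    Models stripe 0 of a left-symmetric RAID 5 layout (an assumption):
    the last disk holds parity, and data chunks 0..n-2 sit on disks 0..n-2.
    The parity disk is reported as None.
    """
    parity_disk = n_disks - 1
    return [None if disk == parity_disk else disk * chunk_bytes
            for disk in range(n_disks)]


def members_shielded_by_gap(n_disks, chunk_bytes, gap_bytes, label_bytes=512):
    """Members whose first `label_bytes` fall inside the filesystem's
    unwritten leading gap of `gap_bytes` on the array."""
    shielded = []
    for disk, array_offset in enumerate(first_chunk_map(n_disks, chunk_bytes)):
        if array_offset is not None and array_offset + label_bytes <= gap_bytes:
            shielded.append(disk)
    return shielded


if __name__ == "__main__":
    # 7 active members as in the mdadm command quoted below; 64 KiB chunk
    # and a 1024-byte leading gap are assumptions.
    print(first_chunk_map(7, 64 * 1024))
    print(members_shielded_by_gap(7, 64 * 1024, 1024))
```

Under those assumptions, only the first member's sector 0 lines up with the
untouched start of the filesystem; sector 0 of every other member holds
ordinary array data or parity. The sketch does not try to predict which
members a given mkfs run will actually overwrite.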

-- Sam

On Mon, 18 Jul 2005, David Johnson wrote:

> Hi all,
>
> I've been having problems with software RAID (using mdadm) on Sarge on an
> Ultra Enterprise 450. There seems to be a bug somewhere causing corruption of
> Sun disk labels.
>
> I start with 8 SCSI disks with valid Sun disk labels and one partition filling
> each disk. The partition types are set to "Linux RAID autodetect".
>
> I can create a RAID 5 array without problems. This is the command I'm using:
> mdadm -C -l5 -n7 -x1 /dev/md0 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 \
>     /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1
>
> If I now reboot, everything is OK and the array is detected and configured.
>
> The problem occurs when I format the array. E.g.:
> mkfs.ext2 /dev/md0
>
> This works OK and no errors are given. However when the system is rebooted,
> around half the disks now have corrupt disk labels. If I re-create the disk
> labels and restore the partition tables, everything is fine when I reboot.
> The array is started correctly and the disk labels are maintained on
> subsequent reboots. Only formatting the array causes the problem.
>
> I have tried with other filing systems and the same happens. Not all the disks
> are affected, only around half of them (the last time I did it, sd[efghi]
> were affected).
>
> I'd be interested to know if anyone else has seen this issue.
>
> Regards,
> David.





Re: Software RAID on SPARC64

2005-07-18 Thread Martin
On Mon, 2005-07-18 at 14:06 -0400, Sam Creasey wrote:
> Not sure this is actually it (I'm just guessing here), but are you
> starting the raid partition on cylinder 0?   Or cyl 1?   I *think* (though
> I'm not sure) that starting on cylinder 0 actually puts the disklabel into
> the beginning of the partition itself.  This isn't a problem for UFS, as
> UFS leaves 8k unused at the beginning of the FS, but it might be a problem
> for md.
> 
> Again, that's just a shot in the dark, take it for what it's worth.

This sounds familiar.  I certainly remember having to go back and
re-partition starting on 1 not 0 to avoid eating the disklabel and
partition table.  Can't remember what the symptoms were but I'm pretty
sure it was on the machine that runs a RAID array...

Cheers,
 - Martin






Software RAID on SPARC64

2005-07-18 Thread David Johnson
Hi all,

I've been having problems with software RAID (using mdadm) on Sarge on an 
Ultra Enterprise 450. There seems to be a bug somewhere causing corruption of 
Sun disk labels.

I start with 8 SCSI disks with valid Sun disk labels and one partition filling 
each disk. The partition types are set to "Linux RAID autodetect".

I can create a RAID 5 array without problems. This is the command I'm using:
mdadm -C -l5 -n7 -x1 /dev/md0 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 \
    /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1

If I now reboot, everything is OK and the array is detected and configured.

The problem occurs when I format the array. E.g.:
mkfs.ext2 /dev/md0

This works OK and no errors are given. However when the system is rebooted, 
around half the disks now have corrupt disk labels. If I re-create the disk 
labels and restore the partition tables, everything is fine when I reboot. 
The array is started correctly and the disk labels are maintained on 
subsequent reboots. Only formatting the array causes the problem.

I have tried with other filing systems and the same happens. Not all the disks 
are affected, only around half of them (the last time I did it, sd[efghi] 
were affected).

I'd be interested to know if anyone else has seen this issue.

Regards,
David.


