On Wed, 26 Mar 2008, Jan Schulze wrote:

Hi all,

I have a disk array with about 4.5 TB and would like to use it as one large logical volume with an ext3 file system. When mounting the logical volume, I get an "Input/Output error: can't read superblock".

Do you get any interesting kernel messages in the output of dmesg (or /var/log/messages etc)? Which exact kernel is this (uname -r) and what arch (i386/x86_64 etc; uname -m)?
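Something along these lines should capture the relevant details (I'm guessing at the sda device name from your commands below):

  uname -r ; uname -m
  dmesg | tail -100
  grep -i sda /var/log/messages   # sda is a guess based on your later commands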

I'm using SL 4.2 with a 2.6 kernel, and this is what I have done so far:

- used parted to create a gpt disk label (mklabel gpt) and one large
  partition (mkpart primary ext3 0s -1s)

- used parted to enable LVM flag on device (set 1 LVM on)

I know it would be slow, but can you test that you can read from and write to all of /dev/sda1? Something like the following, for example.
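(sda1 is assumed from your commands above; badblocks in non-destructive read-write mode is safer than raw writes if there is already data you care about):

  # read every block of the device (slow, read-only)
  dd if=/dev/sda1 of=/dev/null bs=1M
  # or a non-destructive read-write pass, with progress
  badblocks -svn /dev/sda1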


- created one physical volume, one volume group and one logical volume
  (pvcreate /dev/sda1, vgcreate raid6 /dev/sda1, lvcreate -l 1189706 -n
  vol1 raid6)

- created an ext3 filesystem and explicitly specified a 4K blocksize, as
  this should allow a filesystem size of up to 16 TB (mkfs.ext3 -m 0 -b
  4096 /dev/raid6/vol1)

For some reason my EL4 notes tell me that we also specify -N (number of inodes), as well as -E stride= (set RAID stride), -J size= (set journal size) and -O sparse_super,dir_index,filetype, though most of that is probably the default these days...
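Just to illustrate the kind of invocation I mean (the stride, journal-size and inode-count values below are placeholders and would need matching to your RAID chunk size and expected file sizes):

  # stride/journal/inode values are placeholders, not recommendations
  mkfs.ext3 -m 0 -b 4096 -N <inode-count> -E stride=16 -J size=128 \
    -O sparse_super,dir_index,filetype /dev/raid6/vol1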

However, mounting (mount /dev/raid6/vol1 /raid) gives the superblock error mentioned above.

Everything works as expected when using an ext2 filesystem (with LVM) or an ext3 filesystem (without LVM). Using a smaller volume (< 2 TB) with ext3+LVM works as well. Only the combination of > 2 TB + ext3 + LVM gives me trouble.

Any ideas or suggestions?

We found that, in at least some combinations of kernel/hardware (drivers really, I expect), support for >2TB block devices was still rather flaky (at least in the early versions of EL4).

We ended up getting our RAID boxes to present multiple LUNs, each under 2TB, which we then set up as PVs and joined back together into a single VG, so we could still end up with an LV bigger than 2TB (roughly as sketched below). I'm rather conservative in such things, so we still avoid big block devices at the moment.
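Roughly like this, assuming the array exports three sub-2TB LUNs as sdb/sdc/sdd, each partitioned the same way as your sda (the names are just placeholders; the extent count comes from vgdisplay):

  # sdb/sdc/sdd are placeholder LUN names
  pvcreate /dev/sdb1 /dev/sdc1 /dev/sdd1
  vgcreate raid6 /dev/sdb1 /dev/sdc1 /dev/sdd1
  vgdisplay raid6              # note the "Free PE" count
  lvcreate -l <free-extents> -n vol1 raid6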

[ obviously, with single disk sizes growing at the rate they are, the >2TB block-device code is going to get a LOT more testing! ]

However, some of the tools (e.g. ext2/3 fsck) still seemed to fail at about 3.5TB, so we ended up needing to build the 'very latest' tools to be able to run fsck properly (the ones included in EL4 - and EL5 I think - get into an infinite loop at some point while scanning the inode tables).
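If it comes to that, building e2fsprogs from source and running the freshly built fsck straight out of the build tree (so the distro copy stays untouched) is fairly painless; roughly:

  tar xzf e2fsprogs-<version>.tar.gz
  cd e2fsprogs-<version>
  ./configure && make
  # run the newly built e2fsck from the build tree rather than installing
  # over the packaged one
  ./e2fsck/e2fsck -f /dev/raid6/vol1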

Currently we try to avoid 'big' ext3 LVs; the one where we discovered the fsck problems was originally ~6.8TB, but we ended up splitting it into several smaller LVs since, even with working tools, it still took ~2 days to fsck... (and longer to dump/copy/restore it all!)

Some of my co-workers swear by XFS for 'big' volumes, but then we do have SGI boxes where XFS (well, CXFS) is the expected default fs. I've not done much testing with XFS on SL, mainly because TUV don't like XFS much...

--
Jon Peatfield,  Computer Officer,  DAMTP,  University of Cambridge
Mail:  [EMAIL PROTECTED]     Web:  http://www.damtp.cam.ac.uk/
