Cc'ing supervisor to loop him in as well. On Wed, Dec 6, 2023, 9:18 AM Edward Zuniga <eazun...@vcu.edu> wrote:
> Thanks everyone for the feedback! I've learned so much from reading the
> discussions.
>
> For our application, we will have a LAN with a single server (1TB RAID1
> array for the OS, 200TB RAID5 array for data) and up to 16 workstations
> (1TB RAID1 array for the OS). Our IT department is more familiar with
> Rocky Linux 8, which I assume will perform the same as AlmaLinux 8. Some
> of our MRI processing can take weeks to finish, so we need a system that
> is very reliable. We also work with individual files in the hundreds of
> gigabytes.
>
> While reading the Red Hat 8 manual
> <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_file_systems/overview-of-available-file-systems_managing-file-systems>,
> I found a few possible issues regarding XFS. I'm curious to see if anyone
> has experienced these as well.
>
> 1. Metadata error behavior
>
> In ext4, you can configure the behavior when the file system encounters
> metadata errors. The default behavior is to simply continue the
> operation. When XFS encounters an unrecoverable metadata error, it shuts
> down the file system and returns the EFSCORRUPTED error. *This could be
> problematic for processing that takes several weeks.*
>
> 2. Inode numbers
>
> The ext4 file system does not support more than 2^32 inodes.
>
> XFS dynamically allocates inodes. An XFS file system cannot run out of
> inodes as long as there is free space on the file system.
>
> Certain applications cannot properly handle inode numbers larger than
> 2^32 on an XFS file system. These applications might cause the failure
> of 32-bit stat calls with the EOVERFLOW return value.
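(For anyone who wants to check whether an existing XFS volume is already affected: below is a rough sketch in Python that walks a tree and reports files whose inode numbers don't fit in 32 bits. The "/data" path at the bottom is just a placeholder, not anything from this thread — adjust it to your own mount point.)

```python
import os

def find_large_inodes(root):
    """Walk a directory tree and collect (path, inode) pairs for files
    whose inode number does not fit in 32 bits -- these are the files
    that can make 32-bit stat() calls fail with EOVERFLOW on XFS."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                ino = os.lstat(path).st_ino
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if ino >= 2**32:
                hits.append((path, ino))
    return hits

if __name__ == "__main__":
    # "/data" is a placeholder for the XFS mount point to scan.
    for path, ino in find_large_inodes("/data"):
        print(f"{ino}\t{path}")
```

If this prints nothing, no file scanned so far has a 64-bit inode number — though on a large XFS volume new allocations can still land above 2^32 later unless it is mounted with -o inode32.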
> Inode numbers exceed 2^32 under the following conditions:
>
> - The file system is larger than 1 TiB with 256-byte inodes.
> - The file system is larger than 2 TiB with 512-byte inodes.
>
> If your application fails with large inode numbers, mount the XFS file
> system with the -o inode32 option to enforce inode numbers below 2^32.
> Note that using inode32 does not affect inodes that are already
> allocated with 64-bit numbers.
>
> *Has anyone encountered this issue?*
>
> 3. The Red Hat 8 manual also warns that using xfs_repair -L might cause
> significant file system damage and data loss, and should only be used as
> a last resort. The manual does not mention a similar warning about using
> e2fsck to repair an ext4 file system. Has anyone experienced issues
> repairing a corrupt XFS file system?
>
> Thanks,
> Eddie
>
> On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski <olcha...@triumf.ca>
> wrote:
>
>> On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
>> >
>> > We are upgrading our MRI Lab servers and workstations to AlmaLinux 8.
>> > We have used ext4 for the past 10 years, however we are considering
>> > using XFS for its better performance with larger files. Which file
>> > system do you use for your lab?
>> >
>>
>> Historical background.
>>
>> The XFS filesystem, with its companion XLV logical volume manager (aka
>> "partitioning tool"), came to Linux from SGI IRIX, where it was
>> developed circa the late 1990s. XFS was copied to Linux verbatim
>> (initially with shims and kludges; later, fully integrated). XLV was
>> reimplemented as LVM.
>>
>> The ext series of filesystems was developed together with the Linux
>> kernel (the first ext filesystem may have originated with MINIX, look
>> it up). As improvements were made (journaling, no need to fsck after a
>> crash, online grow/shrink, etc.), they were renamed ext2/ext3/ext4, and
>> they are still largely compatible between themselves.
>>
>> For many purposes, both filesystems are obsoleted by ZFS, which added:
>>
>> - metadata and data checksums - to detect silent bit rot on
>>   current-generation HDDs and SSDs
>> - an online filesystem check - for broken data, it gives you a list of
>>   filenames instead of inode numbers
>> - "built-in" mirroring - together with checksums, online fsck (zfs
>>   scrub) and a monthly zfs scrub cron job, this allows automatic
>>   healing of bit rot
>> - "built-in" raid-5 and raid-6 - again, together with checksums and
>>   online fsck, this allows automatic healing and robust operation in
>>   the presence of disk bad sectors, I/O errors, corruption and
>>   single-disk failure
>> - other goodies like snapshots, a large RAM cache, dedup, online
>>   compression, etc., which are taken for granted in current-generation
>>   filesystems
>>
>> On current-generation HDDs and SSDs, use of bare XFS and ext4 is
>> dangerous: an SSD failure or an HDD growing bad sectors will destroy
>> your data completely.
>>
>> On current-generation HDDs, use of mirrored XFS and ext4 (using mdadm
>> or LVM mirroring) is dangerous: (a) bit rot inevitably causes
>> differences between the data on the two disks, and lacking checksums,
>> mdadm and LVM mirroring cannot decide which of the two copies is the
>> correct one; (b) after a crash, the mirror rebuild will fail if both
>> disks happen to have bad sectors (or throw random I/O errors).
>>
>> Ditto for RAID5 and RAID6: the probability of a RAID rebuild failing
>> because multiple disks have bad sectors and I/O errors goes up with the
>> number of disks.
>>
>> ZFS was invented to resolve all these problems. (BTRFS was invented as
>> a NIH ersatz ZFS, and is still incomplete wrt RAID5/RAID6.)
>>
>> Bottom line: if you can, use ZFS. The current Ubuntu installer has an
>> "install on ZFS" button - use it!
>>
>> --
>> Konstantin Olchanski
>> Data Acquisition Systems: The Bytes Must Flow!
>> Email: olchansk-at-triumf-dot-ca
>> Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada
>
> --
> Edward A. Zuniga
> Senior Research Analyst/CARI MRI Lab Manager
> Virginia Commonwealth University
> C. Kenneth and Dianne Wright Center for Clinical and Translational Research
> Institute for Drug and Alcohol Studies
> Collaborative Advanced Research Imaging (CARI) Facility
> 203 East Cary Street, Suite 202
> Richmond, Virginia 23219
> Phone: (804) 828-4184
> Fax: (804) 827-2565
> eazun...@vcu.edu