Re: XFS vs Ext4

2023-12-06 Thread Konstantin Olchanski
For your system, I would not agonize over the choice of ext4 or XFS;
in practice you will see little difference.

Some practical advice for your system:

- bump the OS + home dirs from 1TB to 2TB (the incremental cost of two 2TB 
SSDs over two 1TB SSDs is small)
- run this system on UPS power. Your 200 TB data array will be RAID-6 with 
12-16 20 TB HDDs, so the probability of at least one HDD failure is high, and 
a RAID rebuild will take about 2 days. If power goes out during a rebuild, Bad 
Things Will Happen. (See the monitoring sketch after this list.)
- we have been using XFS for large data arrays since the late 1990s (on SGI 
machines). It is very reliable and will not corrupt itself unless you have 
defective hardware (ZFS was developed to deal with that: checksums, 
self-healing, etc.).
- your server should have ECC DRAM. This is a standard feature on all 
server-class machines; use it. All our server machines have 64 GB of ECC memory.
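
If you go the Linux software-RAID route, here is a quick way to keep an eye on
rebuilds and ECC health (a sketch assuming mdadm-managed arrays and the
edac-utils package; device and array names are hypothetical):

  # watch RAID rebuild progress (md0 is hypothetical)
  cat /proc/mdstat
  mdadm --detail /dev/md0 | grep -E 'State|Rebuild'

  # confirm ECC is active and check corrected/uncorrected error counts
  edac-util --report=simple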

If I were building this system, I would make both the 2TB SSD array and the 
200TB data array ZFS.
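
As a rough sketch (device names are hypothetical; raidz2 is ZFS's RAID-6
analogue, with two-disk parity):

  # mirrored SSD pool for OS + home, raidz2 pool for data
  zpool create ssdpool mirror /dev/sda /dev/sdb
  zpool create datapool raidz2 /dev/sdc /dev/sdd /dev/sde /dev/sdf \
      /dev/sdg /dev/sdh /dev/sdi /dev/sdj
  zfs set compression=lz4 datapool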

Also, you do not say how you will back up your data. You must have both backups 
and archives. Backups protect you against "oops, I deleted the wrong file"; 
archives protect you against "oops, I deleted the wrong file two years ago".

Without backups and archives, a fire, a flood, a crypto-ransomware attack, or 
a stolen server means you lose everything.
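
If the arrays are ZFS, snapshots plus send/receive cover both needs (a sketch;
pool, dataset, and host names are hypothetical):

  # cheap on-line protection against "oops"
  zfs snapshot datapool/mri@2023-12-06
  # replicate to a separate backup host for disaster protection
  zfs send datapool/mri@2023-12-06 | ssh backuphost zfs recv backuppool/mri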


K.O.


On Wed, Dec 06, 2023 at 11:37:54AM -0500, Edward Zuniga wrote:
> Cc'ing supervisor to loop him in as well.
> 
> On Wed, Dec 6, 2023, 9:18 AM Edward Zuniga  wrote:
> 
> > Thanks everyone for the feedback! I've learned so much from reading the
> > discussions.
> >
> > For our application, we will have a LAN with a single server (1TB RAID1
> > array for OS, 200TB RAID5 array for data) and up to 16 workstations (1TB
> > RAID1 array for OS). Our IT department is more familiar with Rocky Linux 8,
> > which I assume will perform the same as AlmaLinux 8. Some of our MRI
> > processing can take weeks to finish, so we need a system that is very
> > reliable. We also work with individual files in the hundreds of gigabytes.
> >
> > While reading the Red Hat 8 manual,
> > I found a few possible issues regarding XFS. I'm curious to see if anyone
> > has experienced these as well.
> >
> > 1. Metadata error behavior
> > In ext4, you can configure the behavior when the file system encounters
> > metadata errors. The default behavior is to simply continue the operation.
> > When XFS encounters an unrecoverable metadata error, it shuts down the
> > file system and returns the EFSCORRUPTED error. *This could be
> > problematic for processing that takes several weeks.*
> >
> > 2. Inode numbers
> >
> > The ext4 file system does not support more than 2^32 inodes.
> >
> > XFS dynamically allocates inodes. An XFS file system cannot run out of
> > inodes as long as there is free space on the file system.
> >
> > Certain applications cannot properly handle inode numbers larger than
> > 2^32 on an XFS file system. These applications might cause the failure of
> > 32-bit stat calls with the EOVERFLOW return value. Inode numbers exceed
> > 2^32 under the following conditions:
> >
> >- The file system is larger than 1 TiB with 256-byte inodes.
> >- The file system is larger than 2 TiB with 512-byte inodes.
> >
> > If your application fails with large inode numbers, mount the XFS file
> > system with the -o inode32 option to enforce inode numbers below 2^32.
> > Note that using inode32 does not affect inodes that are already allocated
> > with 64-bit numbers.
> > *Has anyone encountered this issue?*
> >
> > 3. The Red Hat 8 manual also warns that using xfs_repair -L might cause
> > significant file system damage and data loss and should only be used as a
> > last resort. The manual does not mention a similar warning about using
> > e2fsck to repair an ext4 file system. Has anyone experienced issues
> > repairing a corrupt XFS file system?
> >
> > Thanks,
> > Eddie
> >
> > On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski 
> > wrote:
> >
> >> On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
> >> >
> >> > We are upgrading our MRI Lab servers and workstations to AlmaLinux 8. We
> >> > have used ext4 for the past 10 years, however we are considering using
> >> XFS
> >> > for its better performance with larger files. Which file system do you
> >> use
> >> > for your lab?
> >> >
> >>
> >> Historical background.
> >>
> >> XFS filesystem with the companion XLV logical volume manager (aka
> >> "partitioning tool")
> >> came to Linux from SGI IRIX, where it was developed circa the late 1990s.

Re: [SCIENTIFIC-LINUX-USERS] XFS vs Ext4

2023-12-06 Thread Patrick Riehecky
On Wed, 2023-12-06 at 09:18 -0500, Edward Zuniga wrote:
Thanks everyone for the feedback! I've learned so much from reading the 
discussions.

For our application, we will have a LAN with a single server (1TB RAID1 array 
for OS, 200TB RAID5 array for data) and up to 16 workstations (1TB RAID1 array 
for OS). Our IT department is more familiar with Rocky Linux 8, which I assume 
will perform the same as AlmaLinux 8. Some of our MRI processing can take weeks 
to finish, so we need a system that is very reliable. We also work with 
individual files in the hundreds of gigabytes.

While reading the Red Hat 8 manual, I found a few possible issues regarding 
XFS. I'm curious to see if anyone has 
experienced these as well.

1. Metadata error behavior
In ext4, you can configure the behavior when the file system encounters 
metadata errors. The default behavior is to simply continue the operation. When 
XFS encounters an unrecoverable metadata error, it shuts down the file system 
and returns the EFSCORRUPTED error.
This could be problematic for processing that takes several weeks.

For the rare issues I've hit with xfs metadata, xfs_repair has always been 
able to save me.  I've needed it maybe a dozen times in the last 20 years.  The 
repairs have been very fast in my experience - about 1 minute on an 8 TB 
volume.  Repair time seems to scale at O(ln(n)) based on my research.
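
For reference, ext4's metadata-error behavior is configured per filesystem
(a sketch; the device name is hypothetical):

  # make ext4 remount read-only on metadata errors instead of continuing
  tune2fs -e remount-ro /dev/md0
  # or per mount, e.g. in /etc/fstab:
  #   /dev/md0  /data  ext4  defaults,errors=remount-ro  0 2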

2. Inode numbers

The ext4 file system does not support more than 2^32 inodes.

XFS dynamically allocates inodes. An XFS file system cannot run out of inodes 
as long as there is free space on the file system.

Certain applications cannot properly handle inode numbers larger than 2^32 on 
an XFS file system. These applications might cause the failure of 32-bit stat 
calls with the EOVERFLOW return value. Inode numbers exceed 2^32 under the 
following conditions:

  *   The file system is larger than 1 TiB with 256-byte inodes.
  *   The file system is larger than 2 TiB with 512-byte inodes.

If your application fails with large inode numbers, mount the XFS file system 
with the -o inode32 option to enforce inode numbers below 2^32. Note that using 
inode32 does not affect inodes that are already allocated with 64-bit numbers.

To be honest, I've only ever hit issues with 64-bit inodes on 32-bit kernels.  
I'm not sure I've truly stressed the space, but I've got some pretty decent 
volumes (30 TB) with a whole lot of files and not hit any issues.  Your mileage 
may vary.
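
If an application does fail this way, the workaround is a mount option (a
sketch; the device and mount point are hypothetical):

  # restrict new inode allocation to numbers below 2^32
  mount -o remount,inode32 /data
  # persistent form in /etc/fstab:
  #   /dev/md0  /data  xfs  defaults,inode32  0 0
  # check whether any existing inodes already exceed 2^32
  find /data -inum +4294967295 -print | head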

Has anyone encountered this issue?
3. The Red Hat 8 manual also warns that using xfs_repair -L might cause 
significant file system damage and data loss and should only be used as a last 
resort. The manual does not mention a similar warning about using e2fsck to 
repair an ext4 file system. Has anyone experienced issues repairing a corrupt 
XFS file system?


xfs_repair -L is fairly scary as it zeros out the transaction log.  I'd reach 
out to the XFS folks before running it.  I've only needed a normal xfs_repair 
in the past, and that pretty infrequently.
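
A cautious repair sequence looks something like this (a sketch; the device
name is hypothetical):

  umount /data
  xfs_repair -n /dev/md0   # dry run: report problems, modify nothing
  xfs_repair /dev/md0      # normal repair; replays the log first
  # last resort only, if the log itself is unreadable:
  # xfs_repair -L /dev/md0  # zeroes the log; recent metadata updates are lost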


On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski 
<olcha...@triumf.ca> wrote:
On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
>
> We are upgrading our MRI Lab servers and workstations to AlmaLinux 8. We
> have used ext4 for the past 10 years, however we are considering using XFS
> for its better performance with larger files. Which file system do you use
> for your lab?
>

Historical background.

XFS filesystem with the companion XLV logical volume manager (aka "partitioning 
tool")
came to Linux from SGI IRIX, where it was developed circa the late 1990s. XFS 
was copied
to Linux verbatim (initially with shims and kludges, later, fully integrated).
XLV was reimplemented as LVM.

The EXT series of filesystems were developed together with the Linux kernel 
(the first ext filesystem may have originated with MINIX, look it up). As 
improvements were made (journaling, no need to fsck after a crash, online 
grow/shrink, etc.), they were renamed ext2/ext3/ext4, and they remain largely 
compatible with one another.

For many purposes, both filesystems are obsoleted by ZFS, which added:

- added metadata and data checksums - to detect silent bit rot on 
current-generation HDDs and SSDs
- added online filesystem check - for broken data, gives you list of filenames 
instead of inode numbers
- added "built-in" mirroring - together with checksums, online fsck (zfs scrub) 
and monthly zfs scrub cron job, allows automatic healing of bit rot.
- added "built-in" raid-5 and raid-6 - again, together 

Re: XFS vs Ext4

2023-12-06 Thread Miles ONeal
We've never had any of these problems, nor any of the others mentioned. We've used 
XFS on single HDD and SSD physical workstations, but have since migrated those 
to VMs, so they're on hardware RAID now, as are most of our systems - whether 
VM or bare metal. Since we've not encountered a corrupt XFS, we haven't had to 
repair one.

The servers I run stay up 24/7 and are critical to engineering. When we moved 
the application from Solaris to Linux 8-9 years ago, I picked XFS because ZFS 
wasn't fully developed and supported on Linux yet. To date, I've seen no reason 
to switch. Performance has been good. File sizes are all over the map. We also 
have apps that run for days. I'm told by people I trust that ZFS consumes a lot 
of RAM; we already use a lot of RAM between applications and buffering.
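
For what it's worth, most of ZFS's memory appetite is its ARC cache, which can
be capped (a sketch assuming OpenZFS on Linux; the 16 GiB figure is arbitrary):

  # /etc/modprobe.d/zfs.conf - cap the ARC at 16 GiB (value in bytes)
  options zfs zfs_arc_max=17179869184
  # or adjust at runtime:
  echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max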

I'm not seeing any recent benchmarks of ZFS vs XFS. I wish I had the budget and 
time to do that myself.

Had ZFS been in great shape 8-9 years ago, I'd likely have gone with it since 
we'd been using it on Solaris. At this point, we have no compelling reason to 
change. AFAICT, ZFS is not currently supported by Red Hat.

From: owner-scientific-linux-us...@listserv.fnal.gov 
 on behalf of Edward Zuniga 

Sent: Wednesday, December 6, 2023 08:18
To: Edward Zuniga ; SCIENTIFIC-LINUX-USERS@fnal.gov 

Cc: Konstantin Olchanski 
Subject: Re: XFS vs Ext4

Thanks everyone for the feedback! I've learned so much from reading the 
discussions.

For our application, we will have a LAN with a single server (1TB RAID1 array 
for OS, 200TB RAID5 array for data) and up to 16 workstations (1TB RAID1 array 
for OS). Our IT department is more familiar with Rocky Linux 8, which I assume 
will perform the same as AlmaLinux 8. Some of our MRI processing can take weeks 
to finish, so we need a system that is very reliable. We also work with 
individual files in the hundreds of gigabytes.

While reading the Red Hat 8 manual, I found a few possible issues regarding 
XFS. I'm curious to see if anyone has 
experienced these as well.

1. Metadata error behavior
In ext4, you can configure the behavior when the file system encounters 
metadata errors. The default behavior is to simply continue the operation. When 
XFS encounters an unrecoverable metadata error, it shuts down the file system 
and returns the EFSCORRUPTED error.
This could be problematic for processing that takes several weeks.
2. Inode numbers

The ext4 file system does not support more than 2^32 inodes.

XFS dynamically allocates inodes. An XFS file system cannot run out of inodes 
as long as there is free space on the file system.

Certain applications cannot properly handle inode numbers larger than 2^32 on 
an XFS file system. These applications might cause the failure of 32-bit stat 
calls with the EOVERFLOW return value. Inode numbers exceed 2^32 under the 
following conditions:

  *   The file system is larger than 1 TiB with 256-byte inodes.
  *   The file system is larger than 2 TiB with 512-byte inodes.

If your application fails with large inode numbers, mount the XFS file system 
with the -o inode32 option to enforce inode numbers below 2^32. Note that using 
inode32 does not affect inodes that are already allocated with 64-bit numbers.

Has anyone encountered this issue?
3. The Red Hat 8 manual also warns that using xfs_repair -L might cause 
significant file system damage and data loss and should only be used as a last 
resort. The manual does not mention a similar warning about using e2fsck to 
repair an ext4 file system. Has anyone experienced issues repairing a corrupt 
XFS file system?
Thanks,
Eddie


On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski 
<olcha...@triumf.ca> wrote:
On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
>
> We are upgrading our MRI Lab servers and workstations to AlmaLinux 8. We
> have used ext4 for the past 10 years, however we are considering using XFS
> for its better performance with 

Re: XFS vs Ext4

2023-12-06 Thread Edward Zuniga
Cc'ing supervisor to loop him in as well.

On Wed, Dec 6, 2023, 9:18 AM Edward Zuniga  wrote:

> Thanks everyone for the feedback! I've learned so much from reading the
> discussions.
>
> For our application, we will have a LAN with a single server (1TB RAID1
> array for OS, 200TB RAID5 array for data) and up to 16 workstations (1TB
> RAID1 array for OS). Our IT department is more familiar with Rocky Linux 8,
> which I assume will perform the same as AlmaLinux 8. Some of our MRI
> processing can take weeks to finish, so we need a system that is very
> reliable. We also work with individual files in the hundreds of gigabytes.
>
> While reading the Red Hat 8 manual,
> I found a few possible issues regarding XFS. I'm curious to see if anyone
> has experienced these as well.
>
> 1. Metadata error behavior
> In ext4, you can configure the behavior when the file system encounters
> metadata errors. The default behavior is to simply continue the operation.
> When XFS encounters an unrecoverable metadata error, it shuts down the file
> system and returns the EFSCORRUPTED error. *This could be problematic for
> processing that takes several weeks.*
>
> 2. Inode numbers
>
> The ext4 file system does not support more than 2^32 inodes.
>
> XFS dynamically allocates inodes. An XFS file system cannot run out of
> inodes as long as there is free space on the file system.
>
> Certain applications cannot properly handle inode numbers larger than 2^32
> on an XFS file system. These applications might cause the failure of 32-bit
> stat calls with the EOVERFLOW return value. Inode numbers exceed 2^32 under
> the following conditions:
>
>- The file system is larger than 1 TiB with 256-byte inodes.
>- The file system is larger than 2 TiB with 512-byte inodes.
>
> If your application fails with large inode numbers, mount the XFS file
> system with the -o inode32 option to enforce inode numbers below 2^32.
> Note that using inode32 does not affect inodes that are already allocated
> with 64-bit numbers.
> *Has anyone encountered this issue?*
>
> 3. The Red Hat 8 manual also warns that using xfs_repair -L might cause
> significant file system damage and data loss and should only be used as a
> last resort. The manual does not mention a similar warning about using
> e2fsck to repair an ext4 file system. Has anyone experienced issues
> repairing a corrupt XFS file system?
>
> Thanks,
> Eddie
>
> On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski 
> wrote:
>
>> On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
>> >
>> > We are upgrading our MRI Lab servers and workstations to AlmaLinux 8. We
>> > have used ext4 for the past 10 years, however we are considering using
>> XFS
>> > for its better performance with larger files. Which file system do you
>> use
>> > for your lab?
>> >
>>
>> Historical background.
>>
>> XFS filesystem with the companion XLV logical volume manager (aka
>> "partitioning tool")
>> came to Linux from SGI IRIX, where it was developed circa the late 1990s.
>> XFS was copied
>> to Linux verbatim (initially with shims and kludges, later, fully
>> integrated).
>> XLV was reimplemented as LVM.
>>
>> The EXT series of filesystems were developed together with the Linux
>> kernel (the first ext filesystem may have originated with MINIX, look it
>> up). As improvements were made (journaling, no need to fsck after a crash,
>> online grow/shrink, etc.), they were renamed ext2/ext3/ext4, and they
>> remain largely compatible with one another.
>>
>> For many purposes, both filesystems are obsoleted by ZFS, which added:
>>
>> - added metadata and data checksums - to detect silent bit rot on
>> current-generation HDDs and SSDs
>> - added online filesystem check - for broken data, gives you list of
>> filenames instead of inode numbers
>> - added "built-in" mirroring - together with checksums, online fsck (zfs
>> scrub) and monthly zfs scrub cron job, allows automatic healing of bit rot.
>> - added "built-in" raid-5 and raid-6 - again, together with checksums and
>> online fsck, allows automatic healing and robust operation in presence of
>> disk bad sectors, I/O errors, corruption and single-disk failure.
>> - other goodies like snapshots, large ram cache, dedup, online
>> compression, etc are taken for granted for current generation filesystems.
>>
>> On current generation HDDs and SSDs, use of bare XFS and ext4 is
>> dangerous; SSD failure or "HDD grows bad sectors" will destroy your data
>> completely.
>>
>> On current generation HDDs, use of mirrored XFS and ext4 is dangerous
>> 

Re: XFS vs Ext4

2023-12-06 Thread Edward Zuniga
Thanks everyone for the feedback! I've learned so much from reading the
discussions.

For our application, we will have a LAN with a single server (1TB RAID1
array for OS, 200TB RAID5 array for data) and up to 16 workstations (1TB
RAID1 array for OS). Our IT department is more familiar with Rocky Linux 8,
which I assume will perform the same as AlmaLinux 8. Some of our MRI
processing can take weeks to finish, so we need a system that is very
reliable. We also work with individual files in the hundreds of gigabytes.

While reading the Red Hat 8 manual,
I found a few possible issues regarding XFS. I'm curious to see if anyone
has experienced these as well.

1. Metadata error behavior
In ext4, you can configure the behavior when the file system encounters
metadata errors. The default behavior is to simply continue the operation.
When XFS encounters an unrecoverable metadata error, it shuts down the file
system and returns the EFSCORRUPTED error. *This could be problematic for
processing that takes several weeks.*

2. Inode numbers

The ext4 file system does not support more than 2^32 inodes.

XFS dynamically allocates inodes. An XFS file system cannot run out of
inodes as long as there is free space on the file system.

Certain applications cannot properly handle inode numbers larger than 2^32 on
an XFS file system. These applications might cause the failure of 32-bit
stat calls with the EOVERFLOW return value. Inode numbers exceed 2^32 under
the following conditions:

   - The file system is larger than 1 TiB with 256-byte inodes.
   - The file system is larger than 2 TiB with 512-byte inodes.

If your application fails with large inode numbers, mount the XFS file
system with the -o inode32 option to enforce inode numbers below 2^32. Note
that using inode32 does not affect inodes that are already allocated with
64-bit numbers.
*Has anyone encountered this issue?*

3. The Red Hat 8 manual also warns that using xfs_repair -L might cause
significant file system damage and data loss and should only be used as a
last resort. The manual does not mention a similar warning about using
e2fsck to repair an ext4 file system. Has anyone experienced issues
repairing a corrupt XFS file system?

Thanks,
Eddie

On Tue, Dec 5, 2023 at 8:46 PM Konstantin Olchanski 
wrote:

> On Mon, Dec 04, 2023 at 03:03:46PM -0500, Edward Zuniga wrote:
> >
> > We are upgrading our MRI Lab servers and workstations to AlmaLinux 8. We
> > have used ext4 for the past 10 years, however we are considering using
> XFS
> > for its better performance with larger files. Which file system do you
> use
> > for your lab?
> >
>
> Historical background.
>
> XFS filesystem with the companion XLV logical volume manager (aka
> "partitioning tool")
> came to Linux from SGI IRIX, where it was developed circa the late 1990s.
> XFS was copied
> to Linux verbatim (initially with shims and kludges, later, fully
> integrated).
> XLV was reimplemented as LVM.
>
> The EXT series of filesystems were developed together with the Linux
> kernel (the first ext filesystem may have originated with MINIX, look it
> up). As improvements were made (journaling, no need to fsck after a crash,
> online grow/shrink, etc.), they were renamed ext2/ext3/ext4, and they
> remain largely compatible with one another.
>
> For many purposes, both filesystems are obsoleted by ZFS, which added:
>
> - added metadata and data checksums - to detect silent bit rot on
> current-generation HDDs and SSDs
> - added online filesystem check - for broken data, gives you list of
> filenames instead of inode numbers
> - added "built-in" mirroring - together with checksums, online fsck (zfs
> scrub) and monthly zfs scrub cron job, allows automatic healing of bit rot.
> - added "built-in" raid-5 and raid-6 - again, together with checksums and
> online fsck, allows automatic healing and robust operation in presence of
> disk bad sectors, I/O errors, corruption and single-disk failure.
> - other goodies like snapshots, large ram cache, dedup, online
> compression, etc are taken for granted for current generation filesystems.
>
> On current generation HDDs and SSDs, use of bare XFS and ext4 is
> dangerous; SSD failure or "HDD grows bad sectors" will destroy your data
> completely.
>
> On current generation HDDs, use of mirrored XFS and ext4 is dangerous
> (using mdadm or LVM mirroring): (a) bit rot inevitably causes differences
> between the data on the two disks. Lacking checksums, mdadm and LVM
> mirroring cannot decide which of the two copies is the correct one. (b)
> after a crash, mirror