Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Jose Marques
> On 10 Apr 2017, at 18:23, David Sommerseth  
> wrote:
> 
> But I'll give you that Oracle is probably a very different beast on the
> legal side and doesn't have too good an "open source karma".

ZFS on Linux is based on OpenZFS (). Oracle 
has no input into its development as far as I can tell.



Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread David Sommerseth
On 11/04/17 10:30, Jose Marques wrote:
>> On 10 Apr 2017, at 18:23, David Sommerseth  
>> wrote:
>>
>> But I'll give you that Oracle is probably a very different beast on the
>> legal side and doesn't have too good an "open source karma".
> 
> ZFS on Linux is based on OpenZFS (). 
> Oracle has no input into its development as far as I can tell.

Yeah, that /sounds/ like a clean implementation.  IANAL, and I also cannot
predict whether the Oracle lawyers would agree.

I have few doubts about the F/L/OSS "pureness" of many of the developers
hired by Oracle who work with various upstream communities.  But I do
fear the intentions and attitudes of the Oracle business side people,
including the CEO.  And also consider that the ZFS name is a registered
trademark.

But that aside, according to [1], ZFS on Linux was considered stable in
2013.  That is still fairly fresh, and my concerns regarding the time it
takes to truly stabilize file systems for production [2] still stand.

[1] 
[2]



-- 
kind regards,

David Sommerseth


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Tom H
On Tue, Apr 11, 2017 at 4:30 AM, Jose Marques  wrote:
>> On 10 Apr 2017, at 18:23, David Sommerseth  
>> wrote:
>>
>> But I'll give you that Oracle is probably a very different beast on
>> the legal side and doesn't have too good an "open source karma".
>
> ZFS on Linux is based on OpenZFS
> (). Oracle has no input into its
> development as far as I can tell.

I'm not sure what David S is referring to. Sun open-sourced ZFS and it
was at zpool version 28 when Oracle closed-sourced it. So the cat's
out of the bag up to v28. But the Solaris version is currently 36
(IIRC) and, in OpenZFS, you can enable extra, post-v28 features on a
case-by-case basis.
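
For what it's worth, those per-feature switches are exposed as zpool feature
flags; a rough sketch (the pool name "tank" is made up):

  # list the post-v28 feature flags and their current state
  zpool get all tank | grep feature@
  # enable one specific feature without turning everything on
  zpool set feature@lz4_compress=enabled tank
  # or enable every feature the installed OpenZFS supports
  zpool upgrade tank

Once a feature is active, the pool may no longer import on implementations
that lack it, so enabling them one at a time is the cautious route.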

[Someone said up-thread that you couldn't expand a pool (or add disks
to a pool). That's incorrect. You can add a same-type vdev to a pool.]


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Kraus, Dave (GE Healthcare)
On 4/11/17, 10:08 AM, "owner-scientific-linux-us...@listserv.fnal.gov on behalf 
of Tom H"  wrote:

> On Tue, Apr 11, 2017 at 4:30 AM, Jose Marques  wrote:
>>> On 10 Apr 2017, at 18:23, David Sommerseth  wrote:
>>>
>>> But I'll give you that Oracle is probably a very different beast on
>>> the legal side and doesn't have too good an "open source karma".
>>
>> ZFS on Linux is based on OpenZFS
>> (). Oracle has no input into its
>> development as far as I can tell.
>
> I'm not sure what David S is referring to. Sun open-sourced ZFS and it
> was at zpool version 28 when Oracle closed-sourced it. So the cat's
> out of the bag up to v28. But the Solaris version is currently 36
> (IIRC) and, in OpenZFS, you can enable extra, post-v28 features on a
> case-by-case basis.
>
> [Someone said up-thread that you couldn't expand a pool (or add disks
> to a pool). That's incorrect. You can add a same-type vdev to a pool.]


FWIW, we’ve been shipping a current version of ZFS with our SL spin for a few 
years, and use it for several of our internal storage servers. It’s been quite 
stable in our environment with the proper care and feeding. But it’s still a 
storage-server environment (I don’t call it a filesystem), not a generic 
substitute for EXT4 or XFS. Tools for the job and all that.

Not going to comment much on the licensing aspects other than we’re fairly 
paranoid and have deemed the licenses acceptable for our business and 
redistribution purposes. Most if not all of the Oracle CDDL(?)-licensed code 
has been excised as far as I know. Others here track that, so I personally 
don’t think about it much.




Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Konstantin Olchanski
On Tue, Apr 11, 2017 at 11:13:25AM +0200, David Sommerseth wrote:
> 
> But that aside, according to [1], ZFS on Linux was considered stable in
> 2013.  That is still fairly fresh, and my concerns regarding the time it
> takes to truly stabilize file systems for production [2] still stand.
> 

Why do you worry about filesystem stability?

So what if it eats your data every 100 years due to a rare bug? You do have
backups (filesystem stability does not protect you against
fat-fingering the "rm" command), and you do archive your data, yes?
You do have a hot-spare server (filesystem stability does not protect
you against power supply fire) and you do have disaster mitigation plans
(filesystem stability does not protect you against "server is down,
you are fired!").

So what if it eats your data every week due to a frequent bug? How is that
different from faulty hardware eating your data? (like the cheap intel pcie ssd
eating all data on xfs and ext4 within 10 seconds of booting). You build
a system, you burn it in, you test it, if it works, it works, if it does not,
you throw zfs (or the hardware) into the dumpster, start again with yfs, qfs, 
whatever.
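
A crude sketch of that kind of burn-in, just to illustrate (pool name and
paths are invented):

  # fill the new filesystem with known data and record checksums
  dd if=/dev/urandom of=/tank/burnin/blob.bin bs=1M count=10240
  sha256sum /tank/burnin/blob.bin > /tank/burnin/blob.sha256
  # run the real workload for a few days, then verify nothing rotted
  sha256sum -c /tank/burnin/blob.sha256
  # and have ZFS verify its own checksums end to end
  zpool scrub tank
  zpool status -v tank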

How else can you build something that works reliably? Using only components
anointed by the correct penguin can only take you so far.

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread David Sommerseth
On 11/04/17 18:44, Konstantin Olchanski wrote:
> On Tue, Apr 11, 2017 at 11:13:25AM +0200, David Sommerseth wrote:
>>
>> But that aside, according to [1], ZFS on Linux was considered stable in
>> 2013.  That is still fairly fresh, and my concerns regarding the time it
>> takes to truly stabilize file systems for production [2] still stand.
>>
> 
> Why do you worry about filesystem stability?
> 
> So what if it eats your data every 100 years due to a rare bug? You do have
> backups (filesystem stability does not protect you against
> fat-fingering the "rm" command), and you do archive your data, yes?
> You do have a hot-spare server (filesystem stability does not protect
> you against power supply fire) and you do have disaster mitigation plans
> (filesystem stability does not protect you against "server is down,
> you are fired!").
> 
> So what if it eats your data every week due to a frequent bug? How is that
> different from faulty hardware eating your data? (like the cheap intel pcie 
> ssd
> eating all data on xfs and ext4 within 10 seconds of booting). You build
> a system, you burn it in, you test it, if it works, it works, if it does not,
> you throw zfs (or the hardware) into the dumpster, start again with yfs, qfs, 
> whatever.
> 
> How else can you build something that works reliably? Using only components
> anointed by the correct penguin can only take you so far.

It's called risk assessment and management.  You consider the risks, and
newer file systems have more bugs than older file systems.  Consider the
amount of users using ext4 and XFS, vs btrfs/zfs ... and in which of
these groups do you hear more about users experiencing data loss?

And I do care about data loss.  Because if that happens, I need to start
running restore jobs from backups and recover files and file systems.
So having a stable and rock solid file system reduces the chances of
extra work for me.  And my users are much more happy too.

Why do you think I'm running RAID 6?  To reduce the risk of data loss
even if 2 drives decide to bail out.  And if all that should explode, then
I have backups to restore.  And if the local backup fails, I have an
offsite backup as well.  Because data loss is really annoying for me and
my users.
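
For reference, the md flavour of such an array is roughly this (device names
are purely illustrative):

  # six-disk RAID 6: any two members can die without data loss
  mdadm --create /dev/md0 --level=6 --raid-devices=6 /dev/sd[b-g]
  mdadm --detail /dev/md0    # watch state, rebuilds and failed members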

But even with all that ... RAID 6 won't save me from a file system going
crazy and sending data into the void.  So the stability and maturity
of the file system are equally important.

If you don't need to care about your data in 1-2 months ... then I
understand your willingness to use less stable and mature file systems.


-- 
kind regards,

David Sommerseth


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Konstantin Olchanski
On Tue, Apr 11, 2017 at 07:47:46PM +0200, David Sommerseth wrote:
> 
> ... newer file systems have more bugs than older file systems. ...
>

I do not think that is necessarily true.

Newer filesystems have consistently added functionality to detect
and prevent data corruption, and to preserve and ensure data integrity.
It is hard to quantify how this is offset by the (unmeasurable)
increase in the number of bugs.

The 1st generation filesystems are quite well debugged (msdos/fat/vfat,
ext2/3/4) but do not have any features to preserve data integrity.

The 2nd generation filesystems (like SGI XFS) added some built-in data
integrity checks (the first release of XFS did not even have an fsck
because "it will never corrupt"; the second release added an fsck,
because bad hardware can corrupt anything).

But they lack data checksums and "online fsck" features - you have to take
the server offline to confirm filesystem consistency.
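
In practice that offline check looks something like this (device and mount
point are made up):

  umount /srv/data
  xfs_repair -n /dev/md0    # -n: check only, report problems, change nothing
  mount /dev/md0 /srv/data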

They also tend to have bad interaction with the RAID layers (all the stripes
and stuff have to line up correctly or performance is very bad).
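
Lining the stripes up is mostly a matter of handing the RAID geometry to
mkfs; an illustrative example assuming an md RAID 6 with a 512 KiB chunk and
6 data disks:

  # su = RAID chunk size, sw = number of data disks per stripe
  mkfs.xfs -d su=512k,sw=6 /dev/md0
  # check what the filesystem actually recorded
  xfs_info /srv/data | grep -E 'sunit|swidth'

Recent mkfs.xfs versions usually detect this automatically on md devices,
but it is worth verifying.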

The 3rd generation filesystems (zfs, btrfs) add data checksums, online-fsck,
"integrated raid".

So the choice is between you saying "I trust XFS, I never fsck it",
and me saying "I do not trust ZFS, I do not trust the hardware, I run
ZFS scrub daily, I have backups, I have archives".
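
The "scrub daily" part is just a cron job; for illustration (pool name made
up, and weekly or monthly is more common):

  # /etc/cron.d/zfs-scrub
  0 3 * * * root /usr/sbin/zpool scrub tank

and afterwards "zpool status -v tank" shows any checksum errors it found and
repaired.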

>
> Consider the
> amount of users using ext4 and XFS, vs btrfs/zfs ... and in which of
> these groups do you hear more about users experiencing data loss?
> 

Ah, the science of it: "which do I hear more about...". There is a difference
between the actual number of faults, the reported number of faults,
and the "I read about it on Slashdot" number of faults.

>
> And I do care about data loss.  Because if that happens, I need to start
> running restore jobs from backups and recover files and file systems.
> So having a stable and rock solid file system reduces the chances of
> extra work for me.  And my users are much more happy too.
> 

I think it is a good thing if you have to restore files from backup and archive
at least once a year. How else do you know your backups actually work?
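
A yearly restore drill can be as simple as pulling a sample back out of the
backup and diffing it against the live copy; a hypothetical sketch (host and
paths invented):

  rsync -a backuphost:/backups/latest/projects/ /srv/restore-test/
  diff -r /srv/restore-test /srv/projects && echo "restore sample OK"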

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread jdow

On 2017-04-11 09:44, Konstantin Olchanski wrote:

> On Tue, Apr 11, 2017 at 11:13:25AM +0200, David Sommerseth wrote:
>>
>> But that aside, according to [1], ZFS on Linux was considered stable in
>> 2013.  That is still fairly fresh, and my concerns regarding the time it
>> takes to truly stabilize file systems for production [2] still stand.
>>
>
> Why do you worry about filesystem stability?

So I suppose the extended downtime while several terabytes of data are restored
after its loss due to filesystem malfunction is of no consequence to you.
Others find extended downtime both extremely frustrating and expensive. And that
ignores the last few {interval between backups} worth of data loss, which
can also be expensive.


{o.o}   Joanne


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Michael Tiernan

On 4/11/17 3:41 PM, jdow wrote:
> So I suppose the extended downtime while several terabytes of data are
> restored after its loss due to filesystem malfunction is of no
> consequence to you.

While not trying to douse the fire with gasoline, I'd like to remind
folks that data loss isn't the only problem; data corruption is also a
problem.


Oh, and while one's trying to manipulate several terabytes of data, the
system slowing to a crawl because of excessive thrashing or fs overhead
is also bad.


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread David Sommerseth
On 11/04/17 20:48, Konstantin Olchanski wrote:
> 
> So the choice is between you saying "I trust XFS, I never fsck it",
> and me saying "I do not trust ZFS, I do not trust the hardware, I run
> ZFS scrub daily, I have backups, I have archives".

So let's flip this around ... Why isn't btrfs enabled by default in RHEL,
instead of still being a tech-preview explicitly labelled "unsuitable
for production"?  And why hasn't RHEL seen any active involvement from
RH and/or the Fedora community to add ZFS to the distro?  And why is
Fedora still not using btrfs as the default file system, despite it
having been proposed a few times already?

  * Fedora 16
  * Fedora 17
  * Fedora 23


Part of the answer is definitely that the btrfs file system is not
considered ready for prime time production.  For ZFS, the licensing is
probably quite an issue as well.

As I already said in an earlier mail:

   "Once Red Hat enables ZFS on a kernel where it is native in the
upstream kernel and is not labelled tech-preview - that day I will
consider ZFS for production.  If btrfs reaches this stage earlier
[than ZFS], then I will consider btrfs instead."

I trust the expertise RH has in the file system area.  In fact, I have
discussed this topic face-to-face with a few of their FS developers
several times, in addition to some of their storage driver maintainers.
So when RH is not pushing customers onto these new file systems, it is
for a good reason.  And I choose to take RH's advice.  What you do is
your decision - and I don't have a need to convince you to change your
opinion.


As I've now repeated things I've already said a few times ... I'm
letting this be my last reply to this mail thread.


-- 
kind regards,

David Sommerseth


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread Konstantin Olchanski
On Tue, Apr 11, 2017 at 10:47:09PM +0200, David Sommerseth wrote:
> 
> So let's flip this around ... Why isn't btrfs enabled by default in RHEL...
>

I already wrote about BTRFS in this thread. I did some extensive
tests of BTRFS, and the performance is quite acceptable (close to hardware
capability, about the same as raid6+xfs), the interesting features
all work, and raid5/raid6 is more flexible than in zfs.

But for production use, it is missing one small feature:

there is no handling whatsoever of disk failures. If you unplug a disk
(or if a disk unplugs itself by giving up the ghost), BTRFS does
absolutely nothing other than fill the syslog with error messages.

Then there are small bugs: for example, if BTRFS is in degraded
mode (one or more dead disks), el7 will not boot (systemd will wait
forever for the missing disk). The suggested workarounds do not work,
which means that if one disk dies, you have to manually replace
it and do the rebuild before rebooting. If you do reboot, you will have
to go into single-user mode, install the replacement disk, and
run the btrfs rebuild; only then can you boot normally. The server
is down all this time.
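
For reference, the manual recovery path described above is roughly this
(devices and mount point are invented):

  mount -o degraded /dev/sdb1 /mnt/data   # mount with the dead member missing
  btrfs filesystem show /mnt/data         # note the devid of the missing disk
  btrfs replace start 2 /dev/sdnew /mnt/data   # 2 = devid of the dead disk
  btrfs replace status /mnt/data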

My conclusion is that BTRFS is in the "this is a toy" department,
not in the "this is unstable" or "preview" department.

-- 
Konstantin Olchanski
Data Acquisition Systems: The Bytes Must Flow!
Email: olchansk-at-triumf-dot-ca
Snail mail: 4004 Wesbrook Mall, TRIUMF, Vancouver, B.C., V6T 2A3, Canada


Re: [SCIENTIFIC-LINUX-USERS] RAID 6 array and failing harddrives

2017-04-11 Thread jdow

On 2017-04-11 12:57, Michael Tiernan wrote:

> On 4/11/17 3:41 PM, jdow wrote:
>> So I suppose the extended downtime while several terabytes of data are
>> restored after its loss due to filesystem malfunction is of no consequence to
>> you.
>
> While not trying to douse the fire with gasoline, I'd like to remind folks that
> data loss isn't the only problem; data corruption is also a problem.
>
> Oh, and while one's trying to manipulate several terabytes of data, the system
> slowing to a crawl because of excessive thrashing or fs overhead is also bad.


The reduced utility and accessibility of the data being tested and restored is
the sticky point as I see it. This can be a serious cost. Having dozens of
backups is fine, but if none of the backups are online on different disks,
available to take over the load while the main server's data is restored, you
find yourself dead in the water for however long it takes to test and restore a
two- or three-digit number of terabytes. (I wonder how the NSA handles this in
its new data center.)
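
The back-of-envelope arithmetic is sobering; numbers purely illustrative:

  # 100 TB restored over a link that sustains ~1 GB/s
  echo "$(( 100 * 1024 / 3600 )) hours"   # ~28 hours, before any verification pass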


{^_^}