Re: ZFS with Linux: An Open Plea

2007-04-17 Thread Ricardo Correia
Ricardo Correia wrote:
> That FAQ entry is outdated; ZFS has been able to recover from metadata
> corruption on non-replicated pools for a long time already.

Just a clarification: ZFS not only detects metadata corruption through
the use of checksums but, since it keeps 2-3 copies of each metadata
block on disk (even on non-replicated pools), it also rewrites the
corrupted blocks from a good copy, effectively repairing the corruption.

All of this is done transparently to the user but, of course, it's
possible to see a report of checksum failures.
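
To make the idea concrete, here is a toy sketch in Python of reading a
block that is stored in several copies and repairing the bad ones. It is
not ZFS code: read_with_repair() and checksum() are made-up names, CRC32
stands in for the real checksum, and the "disk" is just a Python list.
The expected checksum is passed in separately because it is kept apart
from the block it protects.

```python
import zlib


def checksum(data):
    # Stand-in checksum for this sketch only.
    return zlib.crc32(data)


def read_with_repair(copies, expected):
    """Return (data, repaired_count) for a block stored in several copies.

    copies   -- list of bytes objects, one per on-disk copy (updated in place)
    expected -- checksum recorded elsewhere for this block
    Raises OSError if no copy matches the expected checksum.
    """
    # Detection: find the first copy whose checksum matches.
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise OSError("all copies failed their checksum")

    # Repair: overwrite every copy that does not match with the good one.
    repaired = 0
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good
            repaired += 1
    return good, repaired
```

With three copies of which one is damaged, the caller gets the correct
data back, one copy is rewritten, and the repair count is what would end
up in a checksum failure report.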

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: ZFS with Linux: An Open Plea

2007-04-17 Thread Ricardo Correia
Florian Weimer wrote:
> 
> 
> I keep hoping that this FAQ entry is outdated, but the date on that
> page is rather current. 8-/

That FAQ entry is outdated; ZFS has been able to recover from metadata
corruption on non-replicated pools for a long time already.

Background scrubbing and the ability to see a list of corrupted files
have also been available for a long time (even longer than the above).


Re: ZFS with Linux: An Open Plea

2007-04-17 Thread Ricardo Correia
Xavier Bestel wrote:
>> That is not quite true. They made ZFS available under the CDDL, which is
>> an OSI-approved open-source license that is *less* restrictive than the
>> GPL. The CDDL doesn't prevent anyone from using the ZFS code in
>> combination with code under other licenses.
> 
> You are wrong. Please read e.g.
> 
> (maybe there are better analyses somewhere, but I don't know where).

What I meant in saying the CDDL is less restrictive than the GPL is that
the CDDL can be freely used in conjunction with code under other
licenses, as long as the files licensed under CDDL keep the license
notice, whereas the GPL requires derived works to be licensed under the
GPL as well, which is not possible in many cases (such as this one).


Re: ZFS with Linux: An Open Plea

2007-04-17 Thread Ricardo Correia
Alan Cox wrote:
> The real test of whether Sun were serious about ZFS being anywhere but
> Solaris is what they do to license it - they've patented everything they
> can, and made the code available only under licenses incompatible with
> other OS products. Their intent is quite clear, and quite sad.

That is not quite true. They made ZFS available under the CDDL, which is
an OSI-approved open-source license that is *less* restrictive than the
GPL. The CDDL doesn't prevent anyone from using the ZFS code in
combination with code under other licenses.

The proof of that is that ZFS has already been ported to FreeBSD and
it's being ported to the Mac OS X kernel.

The main problem with ZFS on Linux is that most people believe the GPL
doesn't allow non-GPL kernel modules.


Re: How to flush the disk write cache from userspace

2007-01-17 Thread Ricardo Correia
On Tuesday 16 January 2007 00:38, you wrote:
> As always with these things, the devil is in the details. It requires
> the device to support a ->prepare_flush() queue hook, and not all
> devices do that. It will work for IDE/SATA/SCSI, though. In some devices
> you don't want/need to do a real disk flush, it depends on the write
> cache settings, battery backing, etc.

Is there any chance that someone could implement this (I don't have the 
skills, unfortunately)? Maybe add a new ioctl() to block devices, so that it 
doesn't break any existing code?

I believe it's a very useful (and relatively simple) feature that increases 
data integrity and reliability for applications that need this functionality.

I think it must be considered that most people have disk write caches enabled 
and are using IDE, SATA or SCSI disks.

I also think there's no point in disabling disks' write caches, since it slows 
writes and decreases disks' lifetime, and because there's a better solution.

Personally, I'm not really interested in specific filesystem behaviour, since
my application uses block devices directly (it's a filesystem itself). That
said, I think all filesystems should guarantee data integrity in the face
of fsync() or metadata modifications, even if it costs a little performance.

Thank you.

How to flush the disk write cache from userspace

2007-01-13 Thread Ricardo Correia
Hi, (please CC: to my email address, I'm not subscribed)

Quick question: how can I flush the disk write cache from userspace?

Long question:

I'm porting the Solaris ZFS filesystem to the FUSE/Linux filesystem framework.
This is a copy-on-write, transactional filesystem and so it needs to ensure 
correct ordering of writes when transactions are written to disk.

At the moment, when transactions end, I'm using an fsync() on the block device
followed by an ioctl(BLKFLSBUF).

This is because, according to the fsync manpage, even after fsync() returns, 
data might still be in the disk write cache, so fsync by itself doesn't 
guarantee data safety on power failure.
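
For reference, the sequence above looks roughly like this. It is a sketch
in Python for brevity (os.fsync() and fcntl.ioctl() call straight into the
same C functions); end_transaction() is a made-up name, and the BLKFLSBUF
value is written out by hand from its definition in <linux/fs.h> because
Python's fcntl module does not export it. Whether the ioctl reaches the
disk's own write cache is exactly what I'm unsure about.

```python
import fcntl
import os
import stat

# BLKFLSBUF is defined as _IO(0x12, 97) in <linux/fs.h>.
BLKFLSBUF = 0x1261


def end_transaction(fd):
    """fsync() the descriptor, then issue BLKFLSBUF if it is a block device.

    Returns True if the ioctl was issued, False if fd is not a block
    device and only fsync() was performed.
    """
    # Step 1: ask the kernel to write out dirty data for this descriptor.
    os.fsync(fd)

    # Step 2: BLKFLSBUF is a block-device ioctl, so skip it otherwise.
    if not stat.S_ISBLK(os.fstat(fd).st_mode):
        return False

    # Raises OSError on failure, for example when run without the
    # required privileges.
    fcntl.ioctl(fd, BLKFLSBUF)
    return True
```

Calling it on a regular file only performs the fsync() and returns False;
on something like an open /dev/sdX descriptor it also issues the ioctl.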

I was looking for something like the Solaris ioctl(DKIOCFLUSHWRITECACHE), 
which does exactly what I need.

The most similar thing I could find was ioctl(BLKFLSBUF); however, a search
for BLKFLSBUF in the Linux 2.6.15 source doesn't seem to return anything
related to IDE or SCSI disks.

Can I trust ioctl(BLKFLSBUF) to flush disks' write caches (for disks that 
follow the specs)?

What about block devices for disk partitions, LVM logical volumes and EVMS
volumes: do they propagate flush commands to the respective disks?

What about loop devices?

Thanks in advance.