[zfs-discuss] In-place upgrades of Solaris Express on ZFS root (updated)

2007-12-06 Thread Albert Lee
Hi folks,

I've updated zfs_ttinstall so that it finally works correctly (=P) and
has slightly better error handling.

This allows you to perform an in-place upgrade on a Solaris Express
system that is using ZFS root. Sorry, no Live Upgrade for now. You will
have to boot from DVD media for the upgrade (CDs are untested, and may
fail because of the way it handles /sbin/umount).

The script is here:
http://trisk.acm.jhu.edu/zfs_ttinstall

To use it:
1)
Edit ROOTDEV, ROOTPOOL, ROOTFS, and DATAPOOLs accordingly (see the example below).
Copy zfs_ttinstall to your root pool (for this example, it will be
"rootpool").
# cp zfs_ttinstall /rootpool
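
For reference, the top of the script would then look something like this
(the values below are only placeholders - substitute your own device and
dataset names):

ROOTDEV=c0t0d0s0              # slice the root pool lives on
ROOTPOOL=rootpool             # name of the root pool
ROOTFS=rootpool/rootfs        # dataset mounted at /
DATAPOOLS="tank"              # any other pools to handle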

2)
Boot from SXCE or SXDE DVD.
Select the "Solaris Express" (not DE) boot option.
Choose one of the graphical install options (having X is much nicer).

3)
After the installer starts, cancel it, and open a new terminal window.

4)
Type:
# mount -o remount,rw /
# zpool import -f rootpool
# sh /rootpool/zfs_ttinstall

5)
Check /var/sadm/system/logs/zfs_upgrade_log for problems.
# reboot
*Important* Don't zpool export your pools after the upgrade.
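
After the reboot, a quick sanity check can't hurt ("zpool status -x"
prints "all pools are healthy" when everything came back cleanly):

# zpool status -x
# df -h /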


Please let me know if this works for you. I last tested it by upgrading
snv_73->snv_78, and it works for earlier builds, too.


Good luck,

-Albert

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Separate ZIL

2007-12-06 Thread Brian Hechinger
On Wed, Dec 05, 2007 at 06:12:18PM -0600, Al Hopper wrote:
> 
> I don't think you'll see any worthwhile improvement.  For a ZIL 
> device, you really need something like a (small) SAS 15k RPM 3.5" 
> drive - which will sustain 700 to 900 IOPS (my number - open to 
> argument) - or a RAM disk or one of these [1].
> 
> 10K RPM SCSI disks will get (best case) 350 to 400 IOPS.  Remember, 
> the main issue with legacy SCSI is that (SCSI) commands are sent 
> 8-bits wide at 5Mbits/Sec - for backwards compatibility.  You simply 
> can't send enough commands over a SCSI bus to busy out a modern 10k 
> RPM SCSI drive.

Ah, ok then.  Thanks for the detailed explanation Al, that was very
helpful.

> PS: LsiLogic just updated their SAS HBAs and have a couple of products 
> very reasonably priced IMHO.  Combine that with a (single ?) Fujitsu 
> MAX3xxxRC (where xxx represents the size) and you'll be wearing a big 
> smile every time you work on a system so equipped.
> 
> Tell Santa that you want an LsiLogic SAS HBA and some SAS disks for 
> Xmas! :)

Santa came early this year. :)

The whole reason I'm able to get these SATA disks into my Ultra80 is because
I bought an LSI SAS3080X and a 4-lane SAS to 4x SATA cable. ;)  ($60 including
shipping, not a bad deal at all!!)

I wasn't sure what I was going to do with the other 4 channels; for the
time being, nothing, as I'm now completely broke. :)

A nice SAS enclosure would be a good addition to this setup, I think.  Maybe
with a couple of those Fujitsu disks.

> [1] Finally, someone built a flash SSD that rocks (and they know how 
> fast it is judging by the pricetag):
> http://www.tomshardware.com/2007/11/21/mtron_ssd_32_gb/
> http://www.anandtech.com/storage/showdoc.aspx?i=3167

Uhm, holy crap.  I want one of those for my laptop now.  I only have a 40G
in there, so a 32G mtron would be perfect. ;)

Now, the price might be an issue however. ;)

-brian
-- 
"Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly."  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Separate ZIL

2007-12-06 Thread Scott Laird
On 12/6/07, Brian Hechinger <[EMAIL PROTECTED]> wrote:
> On Wed, Dec 05, 2007 at 06:12:18PM -0600, Al Hopper wrote:
> >
> > PS: LsiLogic just updated their SAS HBAs and have a couple of products
> > very reasonably priced IMHO.  Combine that with a (single ?) Fujitsu
> > MAX3xxxRC (where xxx represents the size) and you'll be wearing a big
> > smile every time you work on a system so equipped.
>
> Hmmm, on second glance, 36G versions of that seem to be going for $40.

Do you mean $140, or am I missing a really good deal somewhere?


Scott
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Separate ZIL

2007-12-06 Thread Brian Hechinger
On Wed, Dec 05, 2007 at 06:12:18PM -0600, Al Hopper wrote:
> 
> PS: LsiLogic just updated their SAS HBAs and have a couple of products 
> very reasonably priced IMHO.  Combine that with a (single ?) Fujitsu 
> MAX3xxxRC (where xxx represents the size) and you'll be wearing a big 
> smile every time you work on a system so equipped.

Hmmm, on second glance, 36G versions of that seem to be going for $40.

Hmmm...

-brian
-- 
"Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly."  -- Jonathan 
Patschke
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] why are these three ZFS caches using so much kmem?

2007-12-06 Thread James C. McPherson
Artem Kachitchkine wrote:
> James McPherson wrote:
>> Following suggestions from Andre and Rich that this was
>> probably the ARC, I've implemented a 256Mb limit for my
>> system's ARC, per the Solaris Internals wiki:
>>
>> * 
>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#ARCSIZE
>> * set arc max to 256Mb
>> set zfs:zfs_arc_max=0x10000000
>>
>> And my system now seems to be chugging along quite happily.
> 
> Is there something very special about your system that's incompatible 
> with ZFS's default policy? I'm asking because the link above says that 
> "ZFS is not designed to steal memory from applications" and yet your 
> out-of-the-box experience was:
> 
>  > my system constantly swapping and
>  > interactive response for my desktop is horrendous.

Hi Artem,
yeah, my system is a little special :-)

In my global zone I run
gnome
firefox
thunderbird
XEmacs
XChat
pidgin

In my webserver zone I run
exim
apache 2.2.3
tomcat 5.5.17
JRoller 3.mumble
PostgreSQL 8.1
(all of which provide www.jmcp.homeunix.com)

In my punchin zone I run
thunderbird
firefox
bugster
cscope
XChat



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread can you guess?
> can you guess? wrote:
> >> There aren't free alternatives in linux or freebsd
> >> that do what zfs does, period.
> >> 
> >
> > No one said that there were:  the real issue is that there's not much
> > reason to care, since the available solutions don't need to be
> > *identical* to offer *comparable* value (i.e., they each have different
> > strengths and weaknesses and the net result yields no clear winner -
> > much as some of you would like to believe otherwise).
> Ok. So according to you, most of what ZFS does is available elsewhere,
> and the features it has that nothing else has aren't really a value add,
> at least not enough to produce a 'clear winner'. Ok, assume for a second
> that I believe that.

Unlike so many here I don't assume things lightly - and this one seems 
particularly shaky.

> Can you list one other software raid/filesystem
> that has any feature (small or large) that ZFS lacks?

Well, duh.

> 
> Because if all else is really equal, and ZFS is the
> only one with any 
> advantages then, whether those advantages are small
> or not (and I don't 
> agree with how small you think they are - see my
> other post that you've 
> ignored so far.)

Sorry - I do need to sleep sometimes.  But I'll get right to it, I assure you 
(or at worst soon:  time has gotten away from me again and I've got an 
appointment to keep this afternoon).

> I think there is a 'clear winner' - at least at the
> moment - Things can change at any time.

You don't get out much, do you?

How does ZFS fall short of other open-source competitors (I'll limit myself to 
them, because once you get into proprietary systems - and away from the quaint 
limitations of Unix file systems - the list becomes utterly unmanageable)?  Let 
us count the ways (well, at least the ones that someone as uninformed as I am 
about open-source features can come up with off the top of his head), starting 
in the volume-management arena:

1.  RAID-Z, as I've explained elsewhere, is brain-damaged when it comes to 
effective disk utilization for small accesses (especially reads):  RAID-5 
offers the same space efficiency with N times the throughput for such 
workloads (RAID-5 is provided by mdadm on Linux, and the Linux LVM may 
support it by now as well).
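
(To put rough numbers on that: five 100-IOPS disks in a 4+1 RAID-5 can 
serve small random reads from four-plus spindles concurrently - call it 
400+ IOPS - while a 4+1 RAID-Z group stripes every block across all its 
data disks, so the whole group delivers roughly the 100 IOPS of a single 
disk.)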

2.  DRBD on Linux supports remote replication (IIRC there was an earlier, 
simpler mechanism that also did).

3.  Can you yet shuffle data off a disk such that it can be removed from a 
zpool?  LVM on Linux supports this.

4.  Last I knew, you couldn't change the number of disks in a RAID-Z stripe at 
all, let alone reorganize existing stripe layout on the fly.  Typical hardware 
RAIDs can do this and I thought that Linux RAID support could as well, but 
can't find verification now - so I may have been remembering a proposed 
enhancement.

And in the file system arena:

5.  No user/group quotas?  What *were* they thinking?  The discussions about 
quotas here make it clear that per-filesystem quotas are not an adequate 
alternative:  leaving aside the difficulty of simulating both user *and* group 
quotas using that mechanism, using it raises mount problems when very large 
numbers of users are involved, plus hard-link and NFS issues crossing mount 
points.

6.  ZFS's total disregard of on-disk file contiguity can torpedo 
sequential-access performance by well over a decimal order of magnitude in 
situations where files either start out severely fragmented (due to heavily 
parallel write activity during their population) or become so due to 
fine-grained random updates.

7.  ZFS's full-path COW approach increases the space overhead of snapshots 
compared with conventional file systems.

8.  Not available on Linux.

Damn - I've got to run.  Perhaps others more familiar with open-source 
alternatives will add to this list while I'm out.

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS write time performance question

2007-12-06 Thread eric kustarz

On Dec 5, 2007, at 8:38 PM, Anton B. Rang wrote:

> This might have been affected by the cache flush issue -- if the  
> 3310 flushes its NVRAM cache to disk on SYNCHRONIZE CACHE commands,  
> then ZFS is penalizing itself.  I don't know whether the 3310  
> firmware has been updated to support the SYNC_NV bit.  It wasn't  
> obvious on Sun's site where to download the latest firmware.

Yeah, that would be my guess on the huge disparity.  I actually don't
know of any storage device that supports SYNC_NV.  If someone knows of
any, I'd love to know.

przemol, did you set the recordsize to 8KB?
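
(If not, something like

# zfs set recordsize=8K <dataset>

before creating the datafiles - recordsize only affects files written
after it's set.)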

What are the server's specs?  (memory, CPU)

Which version of FileBench and which version of the oltp.f workload  
did you use?

>
> A quick glance through the OpenSolaris code indicates that ZFS &  
> the sd driver have been updated to support this bit, but I didn't  
> track down which release first introduced this functionality.

Yep, that would be:
6462690 sd driver should set SYNC_NV bit when issuing SYNCHRONIZE  
CACHE to SBC-2 devices
http://bugs.opensolaris.org/view_bug.do?bug_id=6462690

It was putback in snv_74.

The PSARC case is 2007/053 (though I see it's not open, which doesn't
do much good for externals...).  In any event, if the 3310 doesn't
support SYNC_NV (which I would guess it doesn't) then it may require
manually editing sd.conf to treat the flush commands as no-ops.
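
For reference, I believe the sd.conf knob that came in with that putback
looks something like this (the vendor/product string here is a made-up
example - it has to match your array's SCSI inquiry data, vendor padded
to 8 characters - so treat the whole line as an assumption and check
sd(7D) on your build first):

sd-config-list = "SUN     StorEdge", "cache-nonvolatile:true";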

eric
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Moving ZFS file system to a different system

2007-12-06 Thread Walter Faleiro
Hi All,
We currently have a hardware issue with our ZFS file server, hence the file
system is unusable.
We are planning to move it to a different system.

The setup on the file server when it was running was

bash-3.00# zpool status
  pool: store1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        backup      ONLINE       0     0     0
          c1t2d1    ONLINE       0     0     0
          c1t2d2    ONLINE       0     0     0
          c1t2d3    ONLINE       0     0     0
          c1t2d4    ONLINE       0     0     0
          c1t2d5    ONLINE       0     0     0

errors: No known data errors

  pool: store2
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        store       ONLINE       0     0     1
          c1t3d0    ONLINE       0     0     0
          c1t3d1    ONLINE       0     0     0
          c1t3d2    ONLINE       0     0     1
          c1t3d3    ONLINE       0     0     0
          c1t3d4    ONLINE       0     0     0

errors: No known data errors

store1 was an external RAID device with a slice configured for booting the
system plus swap, and the remaining disk space configured for use with ZFS.

store2 was a similar external RAID device which had all slices
configured for use with ZFS.

Since both are SCSI RAID devices, we are thinking of booting up the former
using a different Sun box.

Are there some precautions to be taken to avoid any data loss?
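
(From the docs, it looks like the clean sequence would be zpool export on
the old host and zpool import on the new one, e.g.:

# zpool export store1     (on the old box, if it will still boot)
# zpool import            (on the new box, lists the pools it can see)
# zpool import -f store1  (-f if the pool was never cleanly exported)

but confirmation would be appreciated.)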

Thanks,
--W
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] why are these three ZFS caches using so much kmem?

2007-12-06 Thread Artem Kachitchkine

> Following suggestions from Andre and Rich that this was
> probably the ARC, I've implemented a 256Mb limit for my
> system's ARC, per the Solaris Internals wiki:
> 
> * http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#ARCSIZE
> * set arc max to 256Mb
> set zfs:zfs_arc_max=0x10000000
> 
> And my system now seems to be chugging along quite happily.

Is there something very special about your system that's incompatible 
with ZFS's default policy? I'm asking because the link above says that 
"ZFS is not designed to steal memory from applications" and yet your 
out-of-the-box experience was:

 > my system constantly swapping and
 > interactive response for my desktop is horrendous.

-Artem
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Tim Cook
STILL haven't given us a list of these filesystems you say match what zfs does. 
STILL coming back with long-winded responses with no content whatsoever to try 
to divert the topic at hand.  And STILL making incorrect assumptions.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Bakul Shah
> The 45 byte score is the checksum of the top of the tree, isn't that
> right?

Yes. Plus an optional label.

> ZFS snapshots and clones save a lot of space, but the
> 'content-hash == address' trick means you could potentially save
> much more.

Especially if you carry around large files (disk images,
databases) that change.

> Though I'm still not sure how well it scales up -
> Bigger working set means you need longer (more expensive) hashes
> to avoid a collision, and even then it's not guaranteed.

> When I last looked they were still using SHA-160
> and I ran away screaming at that point :)

You need 2^80 blocks for a 50%+ probability that a pair will
have the same SHA-160 hash (by the birthday paradox).  Crypto
attacks are not relevant.  For my personal use I am willing
to live with these odds until my backups cross 2^40 distinct
blocks (greater than 8 Petabytes)!
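
(Back of the envelope: with n random blocks, the probability of any
collision is roughly n^2 / 2^161, so at n = 2^40 that is
2^80 / 2^161 = 2^-81 - about one chance in 10^24.)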
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread can you guess?
> (Can we declare this thread dead already?)

Many have already tried, but it seems to have a great deal of staying power.  
You, for example, have just contributed to its continued vitality.

> 
> Others seem to care.
> 
> > *identical* to offer *comparable* value (i.e., they each have
> > different strengths and weaknesses and the net result yields no clear
> > winner - much as some of you would like to believe otherwise).
> 
> Interoperability counts for a lot for some people.

Then you'd better work harder on resolving the licensing issues with Linux.

> Fewer filesystems to learn about can count too.

And since ZFS differs significantly from its more conventional competitors, 
that's something of an impediment to acceptance.

> 
> ZFS provides peace of mind that you tell us doesn't
> matter.

Sure it matters, if it gives that to you:  just don't pretend that it's of any 
*objective* significance, because *that* requires actual quantification.

> And it's actively developed and you and everyone else can see
> that this is so,

Sort of like ext2/3/4, and XFS/JFS (though the latter have the advantage of 
already being very mature, hence need somewhat less 'active' development).

> and that recent ZFS improvements and others that are
> in the pipe (and
> discussed publicly) are very good improvements, which
> all portends an
> even better future for ZFS down the line.

Hey, it could even become a leadership product someday.  Or not - time will 
tell.

> 
> Whatever you do not like about ZFS today may be fixed
> tomorrow,

There'd be more hope for that if its developers and users seemed less obtuse.

> except for the parts about it being ZFS, opensource, Sun-developed,
> ..., the parts that really seem to bother you.

Specific citations of material that I've posted that gave you that impression 
would be useful:  otherwise, you just look like another self-professed psychic 
(is this a general characteristic of Sun worshipers, or just of ZFS fanboys?).

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Help replacing dual identity disk in ZFS raidz and SVM mirror

2007-12-06 Thread Matt B
Anyone? Really need some help here
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread can you guess?
> can you guess? wrote:
> >   
> >> There aren't free alternatives in linux or freebsd
> >> that do what zfs does, period.
> >> 
> >
> > No one said that there were:  the real issue is that there's not much
> > reason to care, since the available solutions don't need to be
> > *identical* to offer *comparable* value (i.e., they each have different
> > strengths and weaknesses and the net result yields no clear winner -
> > much as some of you would like to believe otherwise).
> >
> >   
> I see you carefully snipped "You would think the fact
> zfs was ported to
> freebsd so quickly would've been a good first
> indicator that the
> functionality wasn't already there."  A point you
> appear keen to avoid
> discussing.

Hmmm - do I detect yet another psychic-in-training here?  Simply ignoring 
something that one considers irrelevant does not necessarily imply any active 
desire to *avoid* discussing it.

I suspect that whoever ported ZFS to FreeBSD was a fairly uncritical enthusiast 
just as so many here appear to be (and I'll observe once again that it's very 
easy to be one, because ZFS does sound impressive until you really begin 
looking at it closely).  Not to mention the fact that open-source operating 
systems often gather optional features more just because they can than because 
they necessarily offer significant value:  all it takes is one individual who 
(for whatever reason) feels like doing the work.

Linux, for example, is up to its ears in file systems, all of which someone 
presumably felt it worthwhile to introduce there.  Perhaps FreeBSD proponents 
saw an opportunity to narrow the gap in this area (especially since 
incorporating ZFS into Linux appears to have licensing obstacles).

In any event, the subject under discussion here is not popularity but utility - 
*quantifiable* utility - and hence the porting of ZFS to FreeBSD is not 
directly relevant.

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS with Memory Sticks

2007-12-06 Thread Constantin Gonzalez
Hi,

> # /usr/sbin/zpool import
>   pool: Radical-Vol
> id: 3051993120652382125
>  state: FAULTED
> status: One or more devices contains corrupted data.
> action: The pool cannot be imported due to damaged devices or data.
>    see: http://www.sun.com/msg/ZFS-8000-5E
> config:
> 
> Radical-Vol  UNAVAIL   insufficient replicas
>   c7t0d0s0  UNAVAIL   corrupted data

OK, ZFS did recognize the disk, but the pool is corrupted. Did you remove
it without exporting the pool first?

> Following your command:
> 
> $ /opt/sfw/bin/sudo /usr/sbin/zpool status
>   pool: Rad_Disk_1
>  state: ONLINE
> status: The pool is formatted using an older on-disk format.  The pool can
> still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
> pool will no longer be accessible on older software versions.
>  scrub: none requested
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         Rad_Disk_1  ONLINE       0     0     0
>           c0t1d0    ONLINE       0     0     0
> 
> errors: No known data errors

But this pool should be accessible, since you can zpool status it. Have
you checked "zfs get all Rad_Disk_1"? Does it show mount points and whether
it should be mounted?
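
For example (mountpoint and mounted are standard properties, so this
should work on your pool version):

# /usr/sbin/zfs get mountpoint,mounted Rad_Disk_1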

> But this device works currently on my Solaris PC's, the W2100z and a 
> laptop of mine.

Strange. Maybe it's a USB issue. Have you checked:

   http://www.sun.com/io_technologies/usb/USB-Faq.html#Storage

Especially #19?

Best regards,
Constantin

-- 
Constantin GonzalezSun Microsystems GmbH, Germany
Platform Technology Group, Global Systems Engineering  http://www.sun.de/
Tel.: +49 89/4 60 08-25 91   http://blogs.sun.com/constantin/

Sitz d. Ges.: Sun Microsystems GmbH, Sonnenallee 1, 85551 Kirchheim-Heimstetten
Amtsgericht Muenchen: HRB 161028
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
Vorsitzender des Aufsichtsrates: Martin Haering
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread can you guess?
> apologies in advance for prolonging this thread ..

Why do you feel any need to?  If you were contributing posts as completely 
devoid of technical content as some of the morons here have recently been 
submitting I could understand it, but my impression is that the purpose of this 
forum is to explore the kind of questions that you're interested in discussing.

> i had considered taking this completely offline, but thought of a few
> people at least who might find this discussion somewhat interesting

And any who don't are free to ignore it, so no harm done there either.

> .. at the least i  
> haven't seen any mention of Merkle trees yet as the
> nerd in me yearns  
> for

I'd never heard of them myself until recently, despite having independently 
come up with the idea of using a checksumming mechanism very similar to 
ZFS's.  Merkle seems to be an interesting guy - his home page is worth a visit.

> 
> On Dec 5, 2007, at 19:42, bill todd - aka can you
> guess? wrote:
> 
> >> what are you terming as "ZFS' incremental risk
> reduction"? ..  
> >> (seems like a leading statement toward a
> particular assumption)
> >
> > Primarily its checksumming features, since other
> open source  
> > solutions support simple disk scrubbing (which
> given its ability to  
> > catch most deteriorating disk sectors before they
> become unreadable  
> > probably has a greater effect on reliability than
> checksums in any  
> > environment where the hardware hasn't been slapped
> together so  
> > sloppily that connections are flaky).
> 
> ah .. okay - at first reading "incremental risk
> reduction" seems to  
> imply an incomplete approach to risk

The intent was to suggest a step-wise approach to risk, where some steps are 
far more significant than others (though of course some degree of overlap 
between steps is also possible).

*All* approaches to risk are incomplete.

 ...

> i do believe that an interesting use of the merkle tree with a sha256
> hash is somewhat of an improvement over conventional volume based data
> scrubbing techniques

Of course it is:  that's why I described it as 'incremental' rather than as 
'redundant'.  The question is just how *significant* an improvement it offers.

> since there can be a unique integration between the hash tree for the
> filesystem block layout and a hierarchical data validation method.  In
> addition to finding unknown areas with the scrub, you're also doing
> relatively inexpensive data validation checks on every read.

Yup.

...
 
> sure - we've seen many transport errors,

I'm curious what you mean by that, since CRCs on the transports usually 
virtually eliminate them as problems.  Unless you mean that you've seen many 
*corrected* transport errors (indicating that the CRC and retry mechanisms are 
doing their job and that additional ZFS protection in this area is probably 
redundant).

> as well as firmware implementation errors

Quantitative and specific examples are always good for this kind of thing; the 
specific hardware involved is especially significant to discussions of the sort 
that we're having (given ZFS's emphasis on eliminating the need for much 
special-purpose hardware).

> .. in fact with many arrays we've seen data corruption issues with the
> scrub

I'm not sure exactly what you're saying here:  is it that the scrub has 
*uncovered* many apparent instances of data corruption (as distinct from, e.g., 
merely unreadable disk sectors)?

> (particularly if the checksum is singly stored along with the data
> block)

Since (with the possible exception of the superblock) ZFS never stores a 
checksum 'along with the data block', I'm not sure what you're saying there 
either.

> - just like spam you really want to eliminate false positives that
> could indicate corruption where there isn't any.

The only risk that ZFS's checksums run is the infinitesimal possibility that 
corruption won't be detected, not that they'll return a false positive.

> if you take some time to read the on disk format for ZFS you'll see
> that there's a tradeoff that's done in favor of storing more checksums
> in many different areas instead of making more room for direct block
> pointers.

While I haven't read that yet, I'm familiar with the trade-off between using 
extremely wide checksums (as ZFS does - I'm not really sure why, since 
cryptographic-level security doesn't seem necessary in this application) and 
limiting the depth of the indirect block tree.  But (yet again) I'm not sure 
what you're trying to get at here.

...

> on this list we've seen a number of consumer level products including
> sata controllers, and raid cards (which are also becoming more
> commonplace in the consumer realm) that can be confirmed to throw data
> errors.

Your phrasing here is a bit unusual ('throwing errors' - or exceptions - is not 
commonly related to corrupting data).  If you're referring to some kin

Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Wade . Stuart
[EMAIL PROTECTED] wrote on 12/06/2007 09:58:00 AM:

> On Dec 6, 2007 1:13 AM, Bakul Shah <[EMAIL PROTECTED]> wrote:
>
> > Note that I don't wish to argue for/against zfs/billtodd but
> > the comment above about "no *real* opensource software
> > alternative zfs automating checksumming and simple
> > snapshotting" caught my eye.
> >
> > There is an open source alternative for archiving that works
> > quite well.  venti has been available for a few years now.
> > It runs on *BSD, linux, macOS & plan9 (its native os).  It
> > uses strong crypto checksums, stored separately from the data
> > (stored in the pointer blocks) so you get a similar guarantee
> > against silent data corruption as ZFS.
>
> Last time I looked into  Venti, it used content hashing to
> locate storage blocks. Which was really cool, because (as
> you say) it magically consolidates blocks with the same checksum
> together.
>
> The 45 byte score is the checksum of the top of the tree, isn't that
> right?
>
> Good to hear it's still alive and been revamped somewhat.
>
> ZFS snapshots and clones save a lot of space, but the
> 'content-hash == address' trick means you could potentially save
> much more.
>
> Though I'm still not sure how well it scales up -
> Bigger working set means you need longer (more expensive) hashes
> to avoid a collision, and even then it's not guaranteed.
>
> When I last looked they were still using SHA-160
> and I ran away screaming at that point :)

The hash chosen is close to inconsequential as long as you perform
collision checks and the collision rate is "low".  Hash key collision
branching is pretty easy and has been used for decades (see perl's
collision forking for hash var key collisions for an example).  The process
is: look up a key; verify the data matches; if it does, increment the ref
count, store, and go; if there is no match, split out a sub key, store, and
go.  There are "cost" curves for both the hashing and data matching
portions.  As the number of hash matches goes up, so does the cost of data
verifying -- but no matter what hash you use (assuming at least one bit
less information than the original data) there _will_ be collisions
possible, so the verify must exist.
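
A toy sketch of the verify step in shell (everything here is
hypothetical - a flat store/ directory keyed by hash, Solaris digest(1)
for the hashing; a real implementation would also keep ref counts and
sub keys):

store_block() {
    blk="$1"
    key=`digest -a sha1 "$blk"`
    if [ -f "store/$key" ]; then
        if cmp -s "$blk" "store/$key"; then
            :   # same bits: bump the ref count, store and go
        else
            :   # true collision: split out a sub key, store and go
        fi
    else
        cp "$blk" "store/$key"   # new content, first occurrence
    fi
}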

-Wade


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Dick Davies
On Dec 6, 2007 1:13 AM, Bakul Shah <[EMAIL PROTECTED]> wrote:

> Note that I don't wish to argue for/against zfs/billtodd but
> the comment above about "no *real* opensource software
> alternative zfs automating checksumming and simple
> snapshotting" caught my eye.
>
> There is an open source alternative for archiving that works
> quite well.  venti has been available for a few years now.
> It runs on *BSD, linux, macOS & plan9 (its native os).  It
> uses strong crypto checksums, stored separately from the data
> (stored in the pointer blocks) so you get a similar guarantee
> against silent data corruption as ZFS.

Last time I looked into Venti, it used content hashing to
locate storage blocks, which was really cool, because (as
you say) it magically consolidates blocks with the same checksum
together.

The 45 byte score is the checksum of the top of the tree, isn't that
right?

Good to hear it's still alive and been revamped somewhat.

ZFS snapshots and clones save a lot of space, but the
'content-hash == address' trick means you could potentially save
much more.

Though I'm still not sure how well it scales up -
Bigger working set means you need longer (more expensive) hashes
to avoid a collision, and even then it's not guaranteed.

When I last looked they were still using SHA-160
and I ran away screaming at that point :)

> Google for "venti sean dorward".  If interested, go to
> http://swtch.com/plan9port/ and pick up plan9port (a
> collection of programs from plan9, not just venti).  See
> http://swtch.com/plan9port/man/man8/index.html for how to use
> venti.




-- 
Rasputnik :: Jack of All Trades - Master of Nuns
http://number9.hellooperator.net/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] mixing raidz1 and raidz2 in same pool

2007-12-06 Thread Kam
Does anyone know if there are any issues mixing one 5+2 raidz2 in the same pool 
with 6 5+1 raidz1 vdevs? Would there be any performance hit?
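
(For concreteness, something like the following is what I have in mind -
disk names invented for the example:

# zpool create tank \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 \
    raidz c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 \
    raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0

plus the remaining raidz1 vdevs. IIRC zpool warns about a mismatched
replication level when raidz1 and raidz2 are mixed and wants -f.)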
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Tim Cook
> As I explained, there are eminently acceptable
> alternatives to ZFS from any objective standpoint.
> 

So name these mystery alternatives that come anywhere close to the protection, 
functionality, and ease of use that zfs provides.  You keep talking about how 
they exist, yet can't seem to come up with any real names.

Really, a five page dissertation isn't required.  A simple numbered list will 
be more than acceptable.  Although, I think we all know that won't happen since 
you haven't a list to provide.

Oh and "I'm sure there's something out there, I'm just not sure what" 
DEFINITELY isn't an acceptable answer.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Tim Cook
For the same reason he won't respond to Jone, and can't answer the original 
question.  He's not trying to help this list out at all, or come up with any 
real answers.  He's just here to troll.
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Tim Cook
Whoever coined that phrase must've been wrong; it should definitely be "By 
billtodd, you've got it".
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Kyle McDonald
can you guess? wrote:
>> There aren't free alternatives in linux or freebsd
>> that do what zfs does, period.
>> 
>
> No one said that there were:  the real issue is that there's not much reason 
> to care, since the available solutions don't need to be *identical* to offer 
> *comparable* value (i.e., they each have different strengths and weaknesses 
> and the net result yields no clear winner - much as some of you would like to 
> believe otherwise).
Ok. So according to you, most of what ZFS does is available elsewhere, 
and the features it has that nothing else has aren't really a value add, 
at least not enough to produce a 'clear winner'. Ok, assume for a second 
that I believe that. Can you list one other software raid/filesystem 
that has any feature (small or large) that ZFS lacks?

Because if all else is really equal, and ZFS is the only one with any 
advantages then, whether those advantages are small or not (and I don't 
agree with how small you think they are - see my other post that you've 
ignored so far.) I think there is a 'clear winner' - at least at the 
moment - Things can change at any time.

 -Kyle
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS write time performance question

2007-12-06 Thread przemolicc
On Wed, Dec 05, 2007 at 09:02:43PM -0800, Tim Cook wrote:
> what firmware revision are you at?

Revision: 415G


Regards
przemol

-- 
http://przemol.blogspot.com/

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Nicolas Williams
On Wed, Dec 05, 2007 at 09:45:55PM -0800, can you guess? wrote:
> > There aren't free alternatives in linux or freebsd
> > that do what zfs does, period.
> 
> No one said that there were:  the real issue is that there's not much
> reason to care, since the available solutions don't need to be

If you don't care, then go off not caring.  (Can we declare this thread
dead already?)

Others seem to care.

> *identical* to offer *comparable* value (i.e., they each have
> different strengths and weaknesses and the net result yields no clear
> winner - much as some of you would like to believe otherwise).

Interoperability counts for a lot for some people.  Fewer filesystems to
learn about can count too.

ZFS provides peace of mind that you tell us doesn't matter.  And it's
actively developed and you and everyone else can see that this is so,
and that recent ZFS improvements and others that are in the pipe (and
discussed publicly) are very good improvements, which all portends an
even better future for ZFS down the line.

Whatever you do not like about ZFS today may be fixed tomorrow, except
for the parts about it being ZFS, opensource, Sun-developed, ..., the
parts that really seem to bother you.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-12-06 Thread Ian Collins
can you guess? wrote:
>   
>> There aren't free alternatives in linux or freebsd
>> that do what zfs does, period.
>> 
>
> No one said that there were:  the real issue is that there's not much reason 
> to care, since the available solutions don't need to be *identical* to offer 
> *comparable* value (i.e., they each have different strengths and weaknesses 
> and the net result yields no clear winner - much as some of you would like to 
> believe otherwise).
>
>   
I see you carefully snipped "You would think the fact zfs was ported to
freebsd so quickly would've been a good first indicator that the
functionality wasn't already there."  A point you appear keen to avoid
discussing.

Ian

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss