Re: [zfs-discuss] Re: Re: Re: Re: Proposal: multiple copies of user data

2006-09-16 Thread Wee Yeh Tan

On 9/15/06, can you guess? [EMAIL PROTECTED] wrote:

Implementing it at the directory and file levels would be even more flexible:  
redundancy strategy would no longer be tightly tied to path location, but 
directories and files could themselves still inherit defaults from the 
filesystem and pool when appropriate (but could be individually handled when 
desirable).


Ideally so.  FS (or dataset) level is sufficiently fine-grained for my
use.  If I take the trouble to specify copies for a directory, I
really do not mind the trouble of creating a new dataset for it at the
same time.  File-level, however, is really pushing it.  You might end
up with an administrative nightmare deciphering which files have how
many copies.  I just do not see it being useful in my environment.
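
For the dataset-level case, that would be roughly the following (the copies
syntax is just what I gather from the proposal, and tank/photos is a made-up
name):

   # create a dedicated dataset for the data that needs extra protection,
   # then set the proposed copies property on it
   zfs create tank/photos
   zfs set copies=2 tank/photos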


It would be interesting to know whether that would still be your experience in 
environments that regularly scrub active data as ZFS does (assuming that said 
experience was accumulated in environments that don't).  The theory behind 
scrubbing is that all data areas will be hit often enough that they won't have 
time to deteriorate (gradually) to the point where they can't be read at all; 
early deterioration encountered during a scrub pass (or other access), while 
the sectors have only begun to become difficult to read, results in immediate 
revectoring (by the disk or, failing that, by the file system) to healthier 
locations.


Scrubbing exercises the disk surface to guard against bit-rot.  I do not think
ZFS's scrubbing changes the failure mode of the raw devices.  OTOH, I
really have no such experience to speak of *fingers crossed*.  I
failed to locate the code where the relocation of files happens, but I
assume that copies would make this process more reliable.


Since ZFS-style scrubbing detects even otherwise-undetectable 'silent 
corruption' missed by the disk's own ECC mechanisms, that lower-probability 
event is also covered (though my impression is that the probability of 
encountering even a single such sector may be significantly lower than that of 
whole-disk failure, especially in laptop environments).


I do not have any data to support or dismiss that.  Matt was right that
the probability of the various failure modes is a huge can of worms that
could drag on forever.


--
Just me,
Wire ...
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Proposal: multiple copies of user data

2006-09-16 Thread can you guess?
 On 9/15/06, can you guess? [EMAIL PROTECTED] wrote:

...

  File-level, however, is really pushing it.  You might end up with an
  administrative nightmare deciphering which files have how many copies.

I'm not sure what you mean:  the level of redundancy would be a per-file 
attribute that could be examined, and would normally just be defaulted to a 
common value.
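
Purely to illustrate (this is entirely hypothetical syntax, since the whole
thing is only a proposal; nothing like the command below exists today):

   # hypothetical per-file query; most files would simply report the
   # value defaulted/inherited from their dataset or pool
   $ fileattr -p copies /tank/docs/thesis.tex
   copies=3    (set explicitly on this file)
   $ fileattr -p copies /tank/docs/notes.txt
   copies=2    (inherited from tank/docs)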

...

  It would be interesting to know whether that would still be your
  experience in environments that regularly scrub active data as ZFS
  does (assuming that said experience was accumulated in environments
  that don't).  The theory behind scrubbing is that all data areas will
  be hit often enough that they won't have time to deteriorate
  (gradually) to the point where they can't be read at all; early
  deterioration encountered during a scrub pass (or other access), while
  the sectors have only begun to become difficult to read, results in
  immediate revectoring (by the disk or, failing that, by the file
  system) to healthier locations.
 
 Scrubbing exercises the disk surface to guard against bit-rot.  I do
 not think ZFS's scrubbing changes the failure mode of the raw devices.

It doesn't change the failure rate (if anything, it might accelerate it 
marginally due to the extra disk activity), but it *does* change, potentially 
radically, the frequency with which sectors containing user data become 
unreadable - because it allows them to be detected *before* that happens, so 
that the data can be moved to a good sector (often by the disk itself, 
otherwise by higher-level software) and the failing sector marked bad.
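
In ZFS terms that's just a periodic scrub; for illustration ('tank' is a
made-up pool name):

   # read and verify every allocated block in the pool
   zpool scrub tank
   # watch progress and see whether any errors were found and repaired
   zpool status -v tank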

 OTOH, I really have no such experience to speak of *fingers crossed*.
 I failed to locate the code where the relocation of files happens, but
 I assume that copies would make this process more reliable.

Sort of:  copies make no difference when you catch a failing sector while it's 
still readable, but they certainly help if you only catch it after it has 
become unreadable (or has been 'silently' corrupted).

 
  Since ZFS-style scrubbing detects even otherwise-undetectable 'silent
  corruption' missed by the disk's own ECC mechanisms, that
  lower-probability event is also covered (though my impression is that
  the probability of encountering even a single such sector may be
  significantly lower than that of whole-disk failure, especially in
  laptop environments).
 
 I do not have any data to support or dismiss that.

Quite a few years ago Seagate still published such data, but of course I didn't 
copy it down (because it was 'always available' when I wanted it - as I said, 
it was quite a while ago and I was not nearly as well-acquainted with the 
volatility of Internet data as I would subsequently become).  But to the best 
of my recollection their enterprise disks at that time were specced to have no 
worse than 1 uncorrectable error for every petabit read and no worse than 1 
undetected error for every exabit read.

A fairly recent paper by people who still have access to such data suggests 
that the frequency of uncorrectable errors in enterprise drives is still about 
the same, but that the frequency of undetected errors may have increased 
markedly (to perhaps once in every 10 petabits read) - possibly a result of 
ever-increasing on-disk bit densities and the more aggressive error correction 
required to handle them (perhaps this is part of the reason they don't make 
error rates public any more...).  They claim that SATA drives have error rates 
around 10x that of enterprise drives (or an undetected error rate of around 
once per petabit).

Figure out a laptop drive's average data rate and that gives you a mean time to 
encountering undetected corruption.  Compare that to the drive's in-use MTBF 
rating and there you go!  If I haven't dropped a decimal place or three doing 
this in my head, then even if laptop drives have nominal MTBFs equal to desktop 
SATA drives it looks as if it would take an average data rate of 60 - 70 KB/sec 
(24/7, year-in, year-out) for the likelihood of an undetected error to become 
comparable to that of a whole-disk failure:  that's certainly nothing much for 
a fairly well-loaded server in constant (or even just 40-hour/week) use, but 
for a laptop?
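
Showing my work, with both numbers assumed (the once-per-petabit undetected 
error rate mentioned above and a nominal MTBF of about 600,000 hours):

   # 1 undetected error per petabit ~= 1.25e14 bytes read per error;
   # spread that over a ~600,000-hour nominal MTBF:
   echo "(10^15 / 8) / (600000 * 3600)" | bc
   # ~57870 bytes/sec, i.e. roughly 60 KB/sec sustained, 24/7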

- bill
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: Re: resilvering, how long will it take?

2006-09-16 Thread Tim Cook
Yes sir:

[EMAIL PROTECTED]:/
# zpool status -v fserv
  pool: fserv
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 5.90% done, 27h13m to go
config:

        NAME            STATE     READ WRITE CKSUM
        fserv           DEGRADED     0     0     0
          raidz1        DEGRADED     0     0     0
            replacing   DEGRADED     0     0     0
              c5d0s0/o  UNAVAIL      0     0     0  cannot open
              c5d0      ONLINE       0     0     0
            c3d0        ONLINE       0     0     0
            c3d1        ONLINE       0     0     0
            c4d0        ONLINE       0     0     0

errors: No known data errors


-Original Message-
From: Bill Moore [mailto:Bill dot Moore at sun dot com]
Sent: Friday, September 15, 2006 4:45 PM
To: Tim Cook
Cc: zfs-discuss at opensolaris dot org
Subject: Re: [zfs-discuss] Re: resilvering, how long will it take?

On Fri, Sep 15, 2006 at 01:26:21PM -0700, Tim Cook wrote:
 says it's online now so I can only assume it's working. Doesn't seem
 to be reading from any of the other disks in the array though. Can it
 resilver without traffic to any other disks? /noob

Can you send the output of zpool status -v pool?


--Bill
___ 
zfs-discuss mailing list
zfs-discuss at opensolaris dot org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
 
 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] mounting during boot

2006-09-16 Thread Krzys
Hello everyone, I just wanted to play with ZFS a bit before I start using it at 
my workplace on servers, so I set it up on my Solaris 10 U2 box.
I used to have all my disks mounted as UFS and everything was fine.  I had my 
/etc/vfstab set up as follows:

#
fd                  -                   /dev/fd                     fd     -  no   -
/proc               -                   /proc                       proc   -  no   -
/dev/dsk/c1t0d0s1   -                   -                           swap   -  no   -
/dev/dsk/c1t0d0s0   /dev/rdsk/c1t0d0s0  /                           ufs    1  no   logging
/dev/dsk/c1t0d0s6   /dev/rdsk/c1t0d0s6  /usr                        ufs    1  no   logging
/dev/dsk/c1t0d0s5   /dev/rdsk/c1t0d0s5  /var                        ufs    1  no   logging
/dev/dsk/c1t0d0s7   /dev/rdsk/c1t0d0s7  /d/d1                       ufs    2  yes  logging
/devices            -                   /devices                    devfs  -  no   -
ctfs                -                   /system/contract            ctfs   -  no   -
objfs               -                   /system/object              objfs  -  no   -
swap                -                   /tmp                        tmpfs  -  yes  -
/dev/dsk/c1t1d0s7   /dev/rdsk/c1t1d0s7  /d/d2                       ufs    2  yes  logging
/d/d2/downloads     -                   /d/d2/web/htdocs/downloads  lofs   2  yes  -
/d/d1/home/cw/pics  -                   /d/d2/web/htdocs/pics       lofs   2  yes  -

So I decided to put the /d/d2 drive on ZFS: I created my pool, then created a 
ZFS filesystem and mounted it under /d/d2 while I copied the contents of /d/d2 
to the new filesystem, and then removed that entry from the vfstab file.


OK, so now the line that says:
/dev/dsk/c1t1d0s7   /dev/rdsk/c1t1d0s7  /d/d2  ufs  2  yes  logging
is commented out in my vfstab file.  I rebooted my system just to get everything 
started the way I wanted (I had brought all the web servers and everything else 
down for the duration of the copy so that nothing was accessing the /d/d2 drive).


So my system is booting up and I cannot log in.  Apparently my service 
svc:/system/filesystem/local:default went into maintenance mode... somehow the 
system could not mount these two items from vfstab:

/d/d2/downloads     -  /d/d2/web/htdocs/downloads  lofs  2  yes  -
/d/d1/home/cw/pics  -  /d/d2/web/htdocs/pics       lofs  2  yes  -

I could not log in and do anything; I had to log in through the console, take my 
service svc:/system/filesystem/local:default out of maintenance mode and clear 
the maintenance state, and then all my services started coming up and the system 
was no longer in single-user mode...


That sucks a bit: how can I mount both UFS drives first, then mount ZFS, and 
only then get the lofs mount points mounted?


Also, if certain disks did not mount, I used to be able to go to /etc/vfstab and 
see what was going on.  Now, since ZFS does not use vfstab, how can I know what 
was or was not mounted before the system went down?  Sometimes drives go bad, 
and sometimes certain disks (such as backup disks) are commented out in vfstab; 
with ZFS it is all controlled through the command line, so what if I do not want 
to mount something at boot time?  How can I tell from zfs list what is supposed 
to be mounted at boot and what is not?  Is there a config file where I can just 
comment out a few lines and still be able to mount those filesystems at times 
other than boot?
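
One thing I was thinking of trying, though I have not tested it and mypool/d2 is 
just a placeholder for whatever the dataset is actually called, is switching the 
dataset to a legacy mountpoint so that it shows up in vfstab again:

# put the dataset back under vfstab control so it can be commented out like UFS
zfs set mountpoint=legacy mypool/d2
# then add (or comment out) a line like this in /etc/vfstab:
#   mypool/d2  -  /d/d2  zfs  -  yes  -
# and to see which datasets are and are not currently mounted:
zfs list -o name,mountpoint,mounted

Would that be a reasonable way to handle it?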


Thanks for any suggestions... and sorry if this is the wrong group to post such 
a question to, since this is not a question about OpenSolaris but about ZFS on 
Solaris 10 Update 2.


Chris

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss