Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2009-02-07 Thread Gino
> FYI, I'm working on a workaround for broken devices. As you note, some disks flat-out lie: you issue the synchronize-cache command, they say "got it, boss", yet the data is still not on stable storage. Why do they do this? Because "it performs better". Well, duh -- you can make s

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-11-30 Thread Ray Clark
It would be extremely helpful to know what brands/models of disks lie and which don't. This information could be provided diplomatically simply as threads documenting problems you are working on, stating the facts. Use of a specific string of words would make searching for it easy. There shou

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-11-29 Thread Gino
> About 99% of the problems reported as "I need ZFS fsck" can be summed up by two ZFS bugs: > > 1. If a toplevel vdev fails to open, we should be able to pull information from necessary ditto blocks to open the pool and make what progress we can. Right now, the root vdev code assu

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-13 Thread Mike Gerdts
On Thu, Oct 9, 2008 at 10:33 PM, Mike Gerdts <[EMAIL PROTECTED]> wrote: > On Thu, Oct 9, 2008 at 10:18 AM, Mike Gerdts <[EMAIL PROTECTED]> wrote: >> On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw <[EMAIL PROTECTED]> wrote: >>> Nevada isn't production code. For real ZFS testing, you must use a >>> prod

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-13 Thread Wade . Stuart
[EMAIL PROTECTED] wrote on 10/11/2008 09:36:02 PM: > > On Oct 10, 2008, at 7:55 PM 10/10/, David Magda wrote: > > > > > If someone finds themselves in this position, what advice can be > > followed to minimize risks? > > Can you ask for two LUNs on different physical SAN devices and have > an

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-11 Thread Keith Bierman
On Oct 10, 2008, at 7:55 PM 10/10/, David Magda wrote: > > If someone finds themselves in this position, what advice can be > followed to minimize risks? Can you ask for two LUNs on different physical SAN devices and have an expectation of getting it? > -- Keith H. Bierman [EMAIL PROTEC

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Mike Gerdts
On Fri, Oct 10, 2008 at 9:14 PM, Jeff Bonwick <[EMAIL PROTECTED]> wrote: > Note: even in a single-device pool, ZFS metadata is replicated via > ditto blocks at two or three different places on the device, so that > a localized media failure can be both detected and corrected. > If you have two or m

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Jeff Bonwick
> Or is there a way to mitigate a checksum error on non-redundant zpool? It's just like the difference between non-parity, parity, and ECC memory. Most filesystems don't have checksums (non-parity), so they don't even know when they're returning corrupt data. ZFS without any replication can detec
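The distinction drawn above (no checksum, checksum only, checksum plus a redundant copy) can be illustrated with a toy model. This is only a sketch under stated assumptions: `checksum` and `read_block` are hypothetical helper names, SHA-256 stands in for whatever checksum the pool uses, and none of this is ZFS code.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Stand-in block checksum (assumption: SHA-256 for illustration)."""
    return hashlib.sha256(data).digest()


def read_block(copies, expected):
    """Return the first copy whose checksum matches the expected value.

    With a single copy, corruption can only be detected (an error is raised).
    With a second, intact copy, the read can also be corrected.
    """
    for data in copies:
        if checksum(data) == expected:
            return data
    raise IOError("checksum error: no valid copy available")


good = b"file contents"
corrupt = b"file c0ntents"
expected = checksum(good)

# One copy, corrupted: the error is detected but cannot be repaired.
try:
    read_block([corrupt], expected)
    single_copy_result = "returned data"
except IOError:
    single_copy_result = "detected"

# Two copies, one corrupted: the intact copy satisfies the read.
recovered = read_block([corrupt, good], expected)
```

A filesystem without checksums corresponds to returning `corrupt` unconditionally, which is the "non-parity memory" case in the analogy.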

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread David Magda
On Oct 10, 2008, at 15:48, Victor Latushkin wrote: > I've mostly seen (2), because despite all the best practices out > there, > single vdev pools are quite common. In all such cases that I had my > hands on it was possible to recover pool by going back by one or two > txgs. For better or wor

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Richard Elling
Timh Bergström wrote: > 2008/10/10 Richard Elling <[EMAIL PROTECTED]>: > >> Timh Bergström wrote: >> >>> 2008/10/9 Bob Friesenhahn <[EMAIL PROTECTED]>: >>> >>> On Thu, 9 Oct 2008, Miles Nordin wrote: > catastrophically. If this is really the situation, th

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Ricardo M. Correia
On Fri, 2008-10-10 at 11:23 -0700, Eric Schrock wrote: > But I haven't actually heard a reasonable proposal for what a > fsck-like tool (i.e. one that could "repair" things automatically) would > actually *do*, let alone how it would work in the variety of situations > it needs to (compressed RAID-

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Marcelo Leal
> On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote: > > - "ZFS does not need fsck". > > Ok, that's a great statement, but I think ZFS needs one. Really does. > > And in my opinion an enhanced zdb would be the solution. Flexibility. > > Options. > > About 99% of the problems re

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Timh Bergström
2008/10/10 Richard Elling <[EMAIL PROTECTED]>: > Timh Bergström wrote: >> >> 2008/10/9 Bob Friesenhahn <[EMAIL PROTECTED]>: >> >>> >>> On Thu, 9 Oct 2008, Miles Nordin wrote: >>> catastrophically. If this is really the situation, then ZFS needs to give the sysadmin a way to isolate

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Victor Latushkin
Eric Schrock wrote: > On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote: >> - "ZFS does not need fsck". >> Ok, that's a great statement, but I think ZFS needs one. Really does. >> And in my opinion an enhanced zdb would be the solution. Flexibility. >> Options. > > About 99% of the p

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Eric Schrock
On Fri, Oct 10, 2008 at 06:15:16AM -0700, Marcelo Leal wrote: > - "ZFS does not need fsck". > Ok, that's a great statement, but I think ZFS needs one. Really does. > And in my opinion an enhanced zdb would be the solution. Flexibility. > Options. About 99% of the problems reported as "I need ZF

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Miles Nordin
> "jb" == Jeff Bonwick <[EMAIL PROTECTED]> writes: > "rmc" == Ricardo M Correia <[EMAIL PROTECTED]> writes: jb> We need a little more Code of Hammurabi in the storage jb> industry. It seems like most of the work people have to do now is cleaning up after the sloppiness of others.

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Marcelo Leal
Hello all, I think the problem here is ZFS's capacity for recovery from a failure. Forgive me, but in thinking about creating code "without failures", maybe the hackers forgot that other people can make mistakes (even if they can't). - "ZFS does not need fsck". Ok, that's a great statement,

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Ricardo M. Correia
Hi Jeff, On Fri, 2008-10-10 at 01:26 -0700, Jeff Bonwick wrote: > > The circumstances where I have lost data have been when ZFS has not > > handled a layer of redundancy. However, I am not terribly optimistic > > of the prospects of ZFS on any device that hasn't committed writes > > that ZFS thin

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Ross
That sounds like a great idea for a tool, Jeff. Would it be possible to build that in as a "zpool recover" command? Being able to run a tool like that and see just how bad the corruption is, while knowing it's possible to recover an older version, would be great. Is there any chance of outputting de

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Jeff Bonwick
> The circumstances where I have lost data have been when ZFS has not > handled a layer of redundancy. However, I am not terribly optimistic > of the prospects of ZFS on any device that hasn't committed writes > that ZFS thinks are committed. FYI, I'm working on a workaround for broken devices.

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-10 Thread Timh Bergström
2008/10/9 Bob Friesenhahn <[EMAIL PROTECTED]>: > On Thu, 9 Oct 2008, Miles Nordin wrote: >> >> catastrophically. If this is really the situation, then ZFS needs to >> give the sysadmin a way to isolate and fix the problems >> deterministically before filling the pool with data, not just blame >> t

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Mike Gerdts
On Thu, Oct 9, 2008 at 10:18 AM, Mike Gerdts <[EMAIL PROTECTED]> wrote: > On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw <[EMAIL PROTECTED]> wrote: >> Nevada isn't production code. For real ZFS testing, you must use a >> production release, currently Solaris 10 (update 5, soon to be update 6). > > I m

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Bob Friesenhahn
On Thu, 9 Oct 2008, Miles Nordin wrote: > > catastrophically. If this is really the situation, then ZFS needs to > give the sysadmin a way to isolate and fix the problems > deterministically before filling the pool with data, not just blame > the sysadmin based on nebulous speculatory hindsight gr

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Miles Nordin
> "gs" == Greg Shaw <[EMAIL PROTECTED]> writes: gs> Nevada isn't production code. For real ZFS testing, you must gs> use a production release, currently Solaris 10 (update 5, soon gs> to be update 6). based on list feedback, my impression is that the results of a ``test'' confine

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Mike Gerdts
On Thu, Oct 9, 2008 at 10:10 AM, Greg Shaw <[EMAIL PROTECTED]> wrote: > Nevada isn't production code. For real ZFS testing, you must use a > production release, currently Solaris 10 (update 5, soon to be update 6). I misstated before in my LDoms case. The corrupted pool was on Solaris 10, with L

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Greg Shaw
Perhaps I misunderstand, but the issues below are all based on Nevada, not Solaris 10. Nevada isn't production code. For real ZFS testing, you must use a production release, currently Solaris 10 (update 5, soon to be update 6). In the last 2 years, I've stored everything in my environment (

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Timh Bergström
Unfortunately I can only agree with the doubts about running ZFS in production environments. I've lost ditto blocks, I've gotten corrupted pools and a bunch of other failures, even in mirror/raidz/raidz2 setups with or without hardware mirrors/raid5/6. Plus the insecurity of a sudden crash/reboot will

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Mike Gerdts
On Thu, Oct 9, 2008 at 7:44 AM, Ahmed Kamal <[EMAIL PROTECTED]> wrote: > >> >>In the past year I've lost more ZFS file systems than I have any other >>type of file system in the past 5 years. With other file systems I >>can almost always get some data back. With ZFS I can't get an

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Ahmed Kamal
> >In the past year I've lost more ZFS file systems than I have any other >type of file system in the past 5 years. With other file systems I >can almost always get some data back. With ZFS I can't get any back. That's scary to hear! > > I am really scared now! I was the one trying to

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Wilkinson, Alex
On Thu, Oct 09, 2008 at 06:37:23AM -0500, Mike Gerdts wrote: >FWIW, I believe that I have hit the same type of bug as the OP in the >following combinations: > >- T2000, LDoms 1.0, various builds of Nevada in control and guest > domains. >- Laptop, VirtualBox 1.6.2, Wi

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread Mike Gerdts
On Thu, Oct 9, 2008 at 4:53 AM, . <[EMAIL PROTECTED]> wrote: > While it's clearly my own fault for taking the risks I did, it's > still pretty frustrating knowing that all my data is likely still > intact and nicely checksummed on the disk but that none of it is > accessible due to some tiny filesy

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-09 Thread .
> His explanation: he invalidated the incorrect > uberblocks and forced zfs to revert to an earlier > state that was consistent. Would someone be willing to document the steps required in order to do this please? I have a disk in a similar state: # zpool import pool: tank id: 132344393378
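The quoted explanation (invalidate the bad uberblocks so the pool falls back to an earlier consistent state) can be pictured with a toy model. This is a conceptual sketch only, not the recovery procedure asked for above and not a ZFS tool: the `Uberblock` class, its `valid` flag (standing in for "the checksum verifies") and `pick_active` are hypothetical names, and the model assumes the pool activates the valid uberblock with the highest transaction group (txg) number.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Uberblock:
    txg: int     # transaction group this uberblock was written in
    valid: bool  # hypothetical flag standing in for "checksum verifies"


def pick_active(ring: Sequence[Uberblock]) -> Optional[Uberblock]:
    """Pick the valid uberblock with the highest txg, or None if none is valid."""
    candidates = [ub for ub in ring if ub.valid]
    return max(candidates, key=lambda ub: ub.txg, default=None)


ring = [Uberblock(100, True), Uberblock(101, True), Uberblock(102, True)]

# Normally the newest state (txg 102) is chosen.
before = pick_active(ring)

# If the newest uberblock is invalidated, selection falls back to txg 101.
ring[2].valid = False
after = pick_active(ring)
```

In this model, "going back by one or two txgs" amounts to marking the newest one or two entries invalid and selecting again.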

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-06 Thread Darren J Moffat
Fajar A. Nugraha wrote: > On Fri, Oct 3, 2008 at 10:37 PM, Vasile Dumitrescu > <[EMAIL PROTECTED]> wrote: > >> VMWare 6.0.4 running on Debian unstable, >> Linux bigsrv 2.6.26-1-amd64 #1 SMP Wed Sep 24 13:59:41 UTC 2008 x86_64 >> GNU/Linux >> >> Solaris is vanilla snv_90 installed with no GUI. >

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-04 Thread Fajar A. Nugraha
On Fri, Oct 3, 2008 at 10:37 PM, Vasile Dumitrescu <[EMAIL PROTECTED]> wrote: > VMWare 6.0.4 running on Debian unstable, > Linux bigsrv 2.6.26-1-amd64 #1 SMP Wed Sep 24 13:59:41 UTC 2008 x86_64 > GNU/Linux > > Solaris is vanilla snv_90 installed with no GUI. > > in summary: physical disks, assi

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-03 Thread Vasile Dumitrescu
> > Which VM solution was this? VMware, VirtualBox, Xen, other? How were the "disks" presented to the guest? What are the "disks" in the host, real disks, files, something else? > > > -- > Darren J Moffat > ___ > zfs-discuss mailing li

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-03 Thread Darren J Moffat
Vasile Dumitrescu wrote: > Hi folks, > > I just wanted to share the end of my "adventure" here and especially take the > time to thank Victor for helping me out of this mess. > > I will let him explain the technical details (I am out of my depth here) but > bottom line he spent a couple of hour

Re: [zfs-discuss] Solved - a big THANKS to Victor Latushkin @ Sun / Moscow

2008-10-03 Thread Vasile Dumitrescu
Hi folks, I just wanted to share the end of my "adventure" here and especially take the time to thank Victor for helping me out of this mess. I will let him explain the technical details (I am out of my depth here) but bottom line he spent a couple of hours with me on the machine and sorted me