Re: [zfs-discuss] ZFS fsck?
On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:
> what I'm saying is that there are several posts in here where the only
> solution is to boot onto a live cd and then do an import, due to
> metadata corruption. This should be doable from the installed system

Ah, I understand now. A couple of things worth noting:

- If the root filesystem in a boot pool cannot be mounted, it's problematic to access the tools necessary to repair it, so going to a live CD (or a network boot, for that matter) is the best way forward.

- If the tools available in failsafe mode are insufficient to repair a pool, then booting off a live CD or the network is the only way forward.

It is also worth pointing out here that the 134a build has the pool recovery code built in. The "-F" option to zpool import only became available after build 128 or 129.
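For what it's worth, a minimal sketch of what that recovery looks like once you are booted from a live CD, on a build recent enough to have the import recovery support. The pool name "tank" is only a placeholder, and the -n dry-run modifier is assumed to be present on your build:

# zpool import
(with no arguments: lists the pools visible on the attached devices)
# zpool import -F -n tank
(dry run: reports whether discarding the last few transactions would make the pool importable, without modifying anything)
# zpool import -F tank
(performs the actual rewind-based recovery and imports the pool)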
Re: [zfs-discuss] ZFS fsck?
> You can do this with "zpool scrub". It visits every allocated block
> and verifies that everything is correct. It's not the same as fsck in
> that scrub can detect and repair problems with the pool still online
> and all datasets mounted, whereas fsck cannot handle mounted
> filesystems.
>
> If you really want to use it on an exported pool, you can use zdb,
> although it might take some time. Here's an example on a small empty
> pool:
>
> # zpool create -f mypool raidz c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0 c4t5d0s0
> # zpool list mypool
> NAME     SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
> mypool   484M  280K   484M  0%   1.00x  ONLINE  -
> # zpool export mypool
> # zdb -ebcc mypool
> ...

What I'm saying is that there are several posts in here where the only solution is to boot onto a live CD and then do an import, due to metadata corruption. This should be doable from the installed system.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] ZFS fsck?
On Tue, 6 Jul 2010, Roy Sigurd Karlsbakk wrote:
> Hi all
>
> With several messages in here about troublesome zpools, would there be
> a good reason to be able to fsck a pool? As in, check the whole thing
> instead of having to boot into live CDs and whatnot?

You can do this with "zpool scrub". It visits every allocated block and verifies that everything is correct. It's not the same as fsck in that scrub can detect and repair problems with the pool still online and all datasets mounted, whereas fsck cannot handle mounted filesystems.

If you really want to use it on an exported pool, you can use zdb, although it might take some time. Here's an example on a small empty pool:

# zpool create -f mypool raidz c4t1d0s0 c4t2d0s0 c4t3d0s0 c4t4d0s0 c4t5d0s0
# zpool list mypool
NAME     SIZE  ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
mypool   484M  280K   484M  0%   1.00x  ONLINE  -
# zpool export mypool
# zdb -ebcc mypool

Traversing all blocks to verify checksums and verify nothing leaked ...

No leaks (block sum matches space maps exactly)

        bp count:               48
        bp logical:         378368    avg:  7882
        bp physical:         39424    avg:   821    compression:  9.60
        bp allocated:       185344    avg:  3861    compression:  2.04
        bp deduped:              0    ref>1:    0    deduplication: 1.00
        SPA allocated:      185344    used: 0.04%

#
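For completeness, here is roughly what the online equivalent looks like; "mypool" is carried over from the example above, and the exact status output varies between builds:

# zpool scrub mypool
(starts a scrub in the background; the pool stays online and all datasets stay mounted)
# zpool status -v mypool
(shows scrub progress or completion plus the per-device READ/WRITE/CKSUM error counters; with -v it also lists any files found to have unrecoverable errors)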
Re: [zfs-discuss] ZFS fsck?
----- Original Message -----
> From: "Roy Sigurd Karlsbakk"
> To: "OpenSolaris ZFS discuss"
> Sent: Tuesday, 6 July, 2010 6:35:51 PM
> Subject: [zfs-discuss] ZFS fsck?
>
> Hi all
>
> With several messages in here about troublesome zpools, would there be
> a good reason to be able to fsck a pool? As in, check the whole thing
> instead of having to boot into live CDs and whatnot?
>
> Vennlige hilsener / Best regards
>
> roy

Scrub? :)
[zfs-discuss] ZFS fsck?
Hi all

With several messages in here about troublesome zpools, would there be a good reason to be able to fsck a pool? As in, check the whole thing instead of having to boot into live CDs and whatnot?

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] ZFS + fsck
Hi,

>> *everybody* is interested in the flag days page. Including me.
>> Asking me to "raise the priority" is not helpful.
>
> From my perspective, it's a surprise that 'everybody' is interested, as I'm
> not seeing a lot of people complaining that the flag day page is not updating.
> Only a couple of people on this list, and one of those is me!
> Perhaps I'm looking in the wrong places.

I used this page frequently, too. But now I'm just using the Twitter account fed by onnv-notify. You can have a look at it at http://twitter.com/codenews

Regards
Joerg
Re: [zfs-discuss] ZFS + fsck
Hi James

James C. McPherson wrote:
> *everybody* is interested in the flag days page. Including me.
> Asking me to "raise the priority" is not helpful.

From my perspective, it's a surprise that 'everybody' is interested, as I'm not seeing a lot of people complaining that the flag day page is not updating. Only a couple of people on this list, and one of those is me! Perhaps I'm looking in the wrong places.

I'm prepared to admit that I may well have misjudged the situation, due to my lack of a full overview. I'm sorry if my forum posts regarding this have not been helpful, as my only intention was to try to be helpful.

Best Regards
Nigel Smith
Re: [zfs-discuss] ZFS + fsck
Nigel Smith wrote:
> On Thu Nov 5 14:38:13 PST 2009, Gary Mills wrote:
>> It would be nice to see this information at:
>> http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
>> but it hasn't changed since 23 October.
>
> Well it seems we have an answer:
> http://mail.opensolaris.org/pipermail/zfs-discuss/2009-November/033672.html
>
> On Mon Nov 9 14:26:54 PST 2009, James C. McPherson wrote:
>> The flag days page has not been updated since the switch
>> to XWiki, it's on my todo list but I don't have an ETA
>> for when it'll be done.
>
> Perhaps anyone interested in seeing the flag days page resurrected
> can petition James to raise the priority on his todo list.

Nigel, *everybody* is interested in the flag days page. Including me. Asking me to "raise the priority" is not helpful.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp  http://www.jmcp.homeunix.com/blog
Re: [zfs-discuss] ZFS + fsck
On Thu Nov 5 14:38:13 PST 2009, Gary Mills wrote:
> It would be nice to see this information at:
> http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
> but it hasn't changed since 23 October.

Well it seems we have an answer:
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-November/033672.html

On Mon Nov 9 14:26:54 PST 2009, James C. McPherson wrote:
> The flag days page has not been updated since the switch
> to XWiki, it's on my todo list but I don't have an ETA
> for when it'll be done.

Perhaps anyone interested in seeing the flag days page resurrected can petition James to raise the priority on his todo list.

Thanks
Nigel Smith
Re: [zfs-discuss] ZFS + fsck
On Sun, Nov 8, 2009 at 7:55 AM, Robert Milkowski wrote:
>
> fyi
>
> Robert Milkowski wrote:
>>
>> XXX wrote:
>>>
>>> | Have you actually tried to roll-back to previous uberblocks when you
>>> | hit the issue? I'm asking as I haven't yet heard about any case
>>> | of the issue which was not solved by rolling back to a previous
>>> | uberblock. The problem though was that the way to do it was "hackish".
>>>
>>> Until recently I didn't even know that this was possible or a likely
>>> solution to 'pool panics system on import' and similar pool destruction,
>>> and I don't have any tools to do it. (Since we run Solaris 10, we won't
>>> have official support for it for quite some time.)
>>>
>> I wouldn't be that surprised if this particular feature were actually
>> backported to S10 soon. At least you may raise a CR asking for it - maybe
>> you will get access to an IDR first (I'm not saying there is or isn't
>> already one).
>>
>>> If there are (public) tools for doing this, I will give them a try
>>> the next time I get a test pool into this situation.
>>>
>> IIRC someone sent one to the zfs-discuss list some time ago.
>> Then usually you will also need to poke with zdb.
>> A sketchy and unsupported procedure was discussed on the list as well.
>> Look at the archives.
>>
>>> | The bugs which prevented importing a pool in some circumstances were
>>> | really "annoying" but let's face it - it was bound to happen and they
>>> | are just bugs which are getting fixed. ZFS is still young after all.
>>> | And when you google for data loss on other filesystems I'm sure you
>>> | will find lots of user testimonies - be it ufs, ext3, reiserfs or your
>>> | favourite one.
>>>
>>> The difference between ZFS and those other filesystems is that, with
>>> a few exceptions (XFS, ReiserFS), which sysadmins in the field didn't
>>> like either, those filesystems didn't generally lose *all* your data
>>> when something went wrong. Their official repair tools could usually
>>> put things back together to at least some extent.
>>>
>> Generally they didn't, although I've seen situations where entire ext2
>> and ufs filesystems were lost and fsck was not able to get them even
>> mounted (kernel panics right after mounting them). On another occasion
>> fsck was crashing the box; in yet another one fsck claimed everything
>> was ok, but then the system was crashing during backups (fsck can't
>> really properly fix filesystem state - it is more guessing, and
>> sometimes it goes terribly wrong).
>>
>> But I agree that generally with other file systems you can recover most
>> or all data just fine. And generally that is the case with zfs - there
>> were probably more bugs in ZFS as it is a much younger filesystem, but
>> most of them were very quickly fixed. And the uberblock one - I 100%
>> agree that when you hit the issue and didn't know about the manual
>> method to recover, it was very bad - but it has finally been fixed.
>>
>>> (Just as importantly, when they couldn't put things back together you
>>> could honestly tell management and the users 'we ran the recovery tools
>>> and this is all they could get back'. At the moment, we would have
>>> to tell users and management 'well, there are no (official) recovery
>>> tools...', unless Sun Support came through for once.)
>>>
>> But these tools are built into zfs and run automatically, with
>> virtually 100% confidence that if something can be fixed it is fixed
>> correctly, and if something is wrong it will be detected - thanks to
>> end-to-end checksumming of data and meta-data. The problem *was* that
>> the one scenario in which rolling back to a previous uberblock is
>> required was not implemented and required a complicated and
>> undocumented procedure to follow. It wasn't high priority for Sun as it
>> was very rare, wasn't affecting many enterprise customers, and although
>> the procedure is complicated there is one, and it was successfully used
>> on many occasions even for non-paying customers, thanks to guys like
>> Victor on the zfs mailing list who helped some people in such
>> situations.
>>
>> But you didn't know about it, and it seems like Sun's support service
>> was no use for you - which is really a shame. In your case I would
>> probably point that out to them and at least get some good deal as
>> compensation or something...
>>
>> But what is most important is that finally a fully supported, built-in
>> and easy to use procedure is available to recover from such situations.
>> As time progresses and more bugs are fixed, ZFS will behave much better
>> in many corner cases, as it does already in Open Solaris - the last 6
>> months or so were really very productive in fixing many bugs like that.
>>
>>> | However the whole point of the discussion is that zfs really doesn't
>>> | need a fsck tool.
>>> | All the problems encountered so far were bugs and most of them are
>>> | already fixed. One missing feature was a built-in support for
>>> | rolling-back uberblock which just has been integrated. But I'm sure
>>> | there are more bugs to be found..
Re: [zfs-discuss] ZFS + fsck
fyi

Robert Milkowski wrote:
>
> XXX wrote:
>>
>> | Have you actually tried to roll-back to previous uberblocks when you
>> | hit the issue? I'm asking as I haven't yet heard about any case
>> | of the issue which was not solved by rolling back to a previous
>> | uberblock. The problem though was that the way to do it was "hackish".
>>
>> Until recently I didn't even know that this was possible or a likely
>> solution to 'pool panics system on import' and similar pool destruction,
>> and I don't have any tools to do it. (Since we run Solaris 10, we won't
>> have official support for it for quite some time.)
>>
> I wouldn't be that surprised if this particular feature were actually
> backported to S10 soon. At least you may raise a CR asking for it - maybe
> you will get access to an IDR first (I'm not saying there is or isn't
> already one).
>
>> If there are (public) tools for doing this, I will give them a try
>> the next time I get a test pool into this situation.
>>
> IIRC someone sent one to the zfs-discuss list some time ago.
> Then usually you will also need to poke with zdb.
> A sketchy and unsupported procedure was discussed on the list as well.
> Look at the archives.
>
>> | The bugs which prevented importing a pool in some circumstances were
>> | really "annoying" but let's face it - it was bound to happen and they
>> | are just bugs which are getting fixed. ZFS is still young after all.
>> | And when you google for data loss on other filesystems I'm sure you
>> | will find lots of user testimonies - be it ufs, ext3, reiserfs or your
>> | favourite one.
>>
>> The difference between ZFS and those other filesystems is that, with
>> a few exceptions (XFS, ReiserFS), which sysadmins in the field didn't
>> like either, those filesystems didn't generally lose *all* your data
>> when something went wrong. Their official repair tools could usually
>> put things back together to at least some extent.
>>
> Generally they didn't, although I've seen situations where entire ext2
> and ufs filesystems were lost and fsck was not able to get them even
> mounted (kernel panics right after mounting them). On another occasion
> fsck was crashing the box; in yet another one fsck claimed everything
> was ok, but then the system was crashing during backups (fsck can't
> really properly fix filesystem state - it is more guessing, and
> sometimes it goes terribly wrong).
>
> But I agree that generally with other file systems you can recover most
> or all data just fine. And generally that is the case with zfs - there
> were probably more bugs in ZFS as it is a much younger filesystem, but
> most of them were very quickly fixed. And the uberblock one - I 100%
> agree that when you hit the issue and didn't know about the manual
> method to recover, it was very bad - but it has finally been fixed.
>
>> (Just as importantly, when they couldn't put things back together you
>> could honestly tell management and the users 'we ran the recovery tools
>> and this is all they could get back'. At the moment, we would have
>> to tell users and management 'well, there are no (official) recovery
>> tools...', unless Sun Support came through for once.)
>>
> But these tools are built into zfs and run automatically, with
> virtually 100% confidence that if something can be fixed it is fixed
> correctly, and if something is wrong it will be detected - thanks to
> end-to-end checksumming of data and meta-data. The problem *was* that
> the one scenario in which rolling back to a previous uberblock is
> required was not implemented and required a complicated and
> undocumented procedure to follow. It wasn't high priority for Sun as it
> was very rare, wasn't affecting many enterprise customers, and although
> the procedure is complicated there is one, and it was successfully used
> on many occasions even for non-paying customers, thanks to guys like
> Victor on the zfs mailing list who helped some people in such
> situations.
>
> But you didn't know about it, and it seems like Sun's support service
> was no use for you - which is really a shame. In your case I would
> probably point that out to them and at least get some good deal as
> compensation or something...
>
> But what is most important is that finally a fully supported, built-in
> and easy to use procedure is available to recover from such situations.
> As time progresses and more bugs are fixed, ZFS will behave much better
> in many corner cases, as it does already in Open Solaris - the last 6
> months or so were really very productive in fixing many bugs like that.
>
>> | However the whole point of the discussion is that zfs really doesn't
>> | need a fsck tool.
>> | All the problems encountered so far were bugs and most of them are
>> | already fixed. One missing feature was a built-in support for
>> | rolling-back uberblock which just has been integrated. But I'm sure
>> | there are more bugs to be found..
>>
>> I disagree strongly. Fsck tools have multiple purposes; ZFS obsoletes
>> some of them but not all. One thing fsck is there for is to recover as
>> much as possible after things happen that are supposed to be
>> impossible, like opera
Re: [zfs-discuss] ZFS + fsck
Thanks for taking the time to write this - very useful info :)
Re: [zfs-discuss] ZFS + fsck
Hi Gary

I will let 'website-discuss' know about this problem. They normally fix issues like that. Those pages always seemed to just update automatically. I guess it's related to the website transition.

Thanks
Nigel Smith
Re: [zfs-discuss] ZFS + fsck
On Thu, Nov 05, 2009 at 03:04:05PM -0700, Tim Haley wrote:
> Robert Milkowski wrote:
>> I think that most people, including ZFS developers, agree with you that
>> losing access to an entire pool is not acceptable. And this has been
>> fixed in snv_126, so now in those rare circumstances you should be able
>> to import a pool. And generally you will end up in a much better
>> situation than with legacy filesystems + fsck.
>
> Just a slight correction. The current build in-process is 128 and that's
> the build into which the changes were pushed.

It would be nice to see this information at:
http://hub.opensolaris.org/bin/view/Community+Group+on/126-130
but it hasn't changed since 23 October.

--
-Gary Mills--Unix Group--Computer and Network Services-
Re: [zfs-discuss] ZFS + fsck
Robert Milkowski wrote:
> Miles Nordin wrote:
>> "csb" == Craig S Bell writes:
>>
>> csb> Two: If you lost data with another filesystem, you may have
>> csb> overlooked it and blamed the OS or the application,
>>
>> yeah, but with ZFS you often lose the whole pool in certain classes
>> of repeatable real-world failures, like hotswap disks with flakey
>> power or SAN's without NVRAM where the target reboots and the
>> initiator does not. Losing the whole pool is relevantly different
>> to corrupting the insides of a few files.
>
> I think that most people, including ZFS developers, agree with you
> that losing access to an entire pool is not acceptable. And this has
> been fixed in snv_126, so now in those rare circumstances you should
> be able to import a pool. And generally you will end up in a much
> better situation than with legacy filesystems + fsck.

Just a slight correction. The current build in-process is 128 and that's the build into which the changes were pushed.

-tim
Re: [zfs-discuss] ZFS + fsck
On Thu, 5 Nov 2009, Miles Nordin wrote:
> "rm" == Robert Milkowski writes:
>
> rm> Personally I don't blame Sun that implementing the CR took so
> rm> long as it mostly affected home users with cheap hardware from
> rm> BestBuy like sources
>
> no, many of the reports were FC SAN's.

Do you have a secret back-channel to receive these many reports? Are the reports from trolls or gnomes?

> rm> and even then it was relatively rare.
>
> no, they said they were losing way more zpools than they ever lost
> vxfs's in the same environment.

Who are 'they'? Are they the little gnomes that come out at night and lurk in your computer room?

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] ZFS + fsck
Hi Robert

I think you mean snv_128, not 126 :-)

6667683 need a way to rollback to an uberblock from a previous txg
http://bugs.opensolaris.org/view_bug.do?bug_id=6667683
http://hg.genunix.org/onnv-gate.hg/rev/8aac17999e4d

Regards
Nigel Smith
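As an aside, if you want to see which txg the currently active uberblock points at (i.e. what a rollback would be rewinding from), zdb can dump it for an imported pool. The pool name "mypool" is just a placeholder, and the exact output format may differ between builds:

# zdb -u mypool
(prints the active uberblock, including its magic, version, txg, guid_sum and timestamp)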
Re: [zfs-discuss] ZFS + fsck
Miles Nordin wrote:
> "rm" == Robert Milkowski writes:
>
> rm> Personally I don't blame Sun that implementing the CR took so
> rm> long as it mostly affected home users with cheap hardware from
> rm> BestBuy like sources
>
> no, many of the reports were FC SAN's.
>
> rm> and even then it was relatively rare.
>
> no, they said they were losing way more zpools than they ever lost
> vxfs's in the same environment.

Well, who's "they"? I've been deploying ZFS for years on many different platforms, from low-end JBODs through midrange, SAN, and high-end disk arrays, and I have yet to lose a pool (hopefully it stays that way). It doesn't mean that some other people did not have problems or did not lose their pools - in most if not all such cases almost all data could probably have been recovered by following the manual and "hackish" procedure to roll back to a previous uberblock. Now it is integrated into ZFS, and no special knowledge is required to do so in such circumstances. Then there might have been other bugs... life, no software is without them.

> rm> called enterprise customers were affected even less and then
> rm> either they had enough expertise or called Sun's support
> rm> organization to get a pool manually reverted to its previous
> rm> uberblock.
>
> which is probably why the tool exists. but, great!

The point is that you don't need the tool now, as it is built into zfs starting with snv_126.
Re: [zfs-discuss] ZFS + fsck
Miles Nordin wrote:
> "csb" == Craig S Bell writes:
>
> csb> Two: If you lost data with another filesystem, you may have
> csb> overlooked it and blamed the OS or the application,
>
> yeah, but with ZFS you often lose the whole pool in certain classes
> of repeatable real-world failures, like hotswap disks with flakey
> power or SAN's without NVRAM where the target reboots and the
> initiator does not. Losing the whole pool is relevantly different
> to corrupting the insides of a few files.

I think that most people, including ZFS developers, agree with you that losing access to an entire pool is not acceptable. And this has been fixed in snv_126, so now in those rare circumstances you should be able to import a pool. And generally you will end up in a much better situation than with legacy filesystems + fsck.

--
Robert Milkowski
http://milek.blogspot.com
Re: [zfs-discuss] ZFS + fsck
> "rm" == Robert Milkowski writes: rm> Personally I don't blame Sun that implementing the CR took so rm> long as it mostly affected home users with cheap hardware from rm> BestBuy like sources no, many of the reports were FC SAN's. rm> and even then it was relatively rare. no, they said they were losing way more zpools than they ever lost vxfs's in the same environment. rm> called enterprise customers were affected even less and then rm> either they had enough expertise or called Sun's support rm> organization to get a pool manually reverted to its previous rm> uberblock. which is probably why the tool exists. but, great! pgpFajIq35ZZW.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS + fsck
> "csb" == Craig S Bell writes: csb> Two: If you lost data with another filesystem, you may have csb> overlooked it and blamed the OS or the application, yeah, but with ZFS you often lose the whole pool in certain classes of repeatable real-world failures, like hotswap disks with flakey power or SAN's without NVRAM where the target reboots and the initiator does not. Losing the whole pool is relevantly different to corrupting the insides of a few files. Yes, I know, the red-eyed screaming ZFS rats will come out of the walls screaming ``that 1 bit could have been critical Banking Data on which millions of lives depend and nuclear reactors and spaceships too! Wouldn't you rather KNOW, even if ZFS desides to inform with zpool_self-destruct_condescending-error()?'' Maybe, sometimes, yes, but USUALLY, **NO**! I've no objection to deciding how much recovery tools are needed based on experience rather than wide-eyed kool-aid ranting or presumptions from earlier filesystems, but so far experience says the recovery work was really needed, so I can't agree with the bloggers rehashing each other's zealotry. It would be nice to isolate and fix the underlying problems, though. That is the spirit in all these ``we don't need no fsck because we are perfect'' blogs with which I do agree. Their overoptimism isn't as honest as I'd like about the way ZFS's error messages do not enough to lead us toward the real cause in the case of SAN problems because they are all designed presuming spatially-clustered, temporally-spread, disk-based failures rather than temporally-clustered interconnect failures, so rather the error detection becomes no more than ``printf("simon sez u will not blame me, blame someone else. these aren't the droids you're looking for. move along.");'' but, yeah, the blogger's point of banging on the whole stack until it works rather than concealing errors, is a good one. Unfortunately I don't think that's what will actually happen with these dropped-write SAN failures. I think people will just use the new recovery bits, which conceal errors just like earlier filesystems and fsck tools, and shrug. pgpRg4gotskPU.pgp Description: PGP signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS + fsck
Orvar Korvar wrote:
> Does this putback mean that I have to upgrade my zpool, or is it a
> zfs tool? If I missed upgrading my zpool I am smoked?

The putback did not bump zpool or zfs versions. You shouldn't have to upgrade your pool.

-tim
Re: [zfs-discuss] ZFS + fsck
Tim Haley wrote:
> Robert Milkowski wrote:
>> There is another CR (don't have its number at hand) which is about
>> implementing a delayed re-use of just-freed blocks, which should allow
>> more data to be recovered in such a case as above. Although I'm not
>> sure if it has been implemented yet.
>>
>> IMHO with the above CR implemented, in most cases ZFS currently
>> provides a *much* better solution to random data corruption than any
>> other filesystem+fsck on the market.
>
> The code for the putback of 2009/479 allows reverting to an earlier
> uberblock AND defers the re-use of blocks for a short time to make this
> "rewind" safer.

Excellent! Thank you for the information.

--
Robert Milkowski
http://milek.blogspot.com
Re: [zfs-discuss] ZFS + fsck
Robert Milkowski wrote:
> Kevin Walker wrote:
>> Hi all,
>>
>> Just subscribed to the list after a debate on our helpdesk led me to
>> the posting about ZFS corruption and the need for a fsck repair tool
>> of some kind...
>>
>> Has there been any update on this?
>
> I guess the discussion started after someone read an article on OSNEWS.
>
> The way zfs works is that basically you get an fsck equivalent while
> using a pool.
>
> ZFS checks checksums for all metadata and user data while reading it.
> Then all metadata use ditto blocks to provide two or three copies
> (totally independent of any pool redundancy), depending on the type of
> metadata. If a copy is corrupted, a second (or third) copy will be
> used, so correct data is returned and the corrupted block is
> automatically repaired. The ability to repair a block containing user
> data depends on whether you have a pool configured with or without
> redundancy. But even if a pool is non-redundant (let's say a single
> disk drive), zfs will still be able to detect corruption and tell you
> which files are affected, while metadata will be correct in most cases
> (unless the corruption is so large and non-localized that it affected
> all copies of a block in the pool). You will be able to read all other
> files and other parts of the file.
>
> So fsck actually happens while you are accessing your data, and it is
> even better than fsck on most other filesystems: thanks to checksumming
> of all data and metadata, zfs knows exactly when/if something is wrong
> and in most cases is even able to fix it on the fly.
>
> If you want to scan the entire pool, including all redundant copies,
> and get them fixed if something doesn't checksum, then you can schedule
> a pool scrub (while your applications are still using the pool!). This
> will force all blocks from all copies to be read and their checksums
> checked, and if needed the data will be corrected where possible and
> the fact reported to the user. Legacy fsck is not even close to this.
>
> I think the perceived need for fsck for ZFS probably comes from a lack
> of understanding of how ZFS works, and from some frustrated users who,
> under very unlikely and rare circumstances, were due to data corruption
> in a position of not being able to import the pool, and therefore not
> able to access any data at all, while the corruption might have
> affected only a relatively small amount of data. Most other filesystems
> will allow you to access most of the data after fsck in such a
> situation (probably with some data loss), while zfs left the user with
> no access to data at all. In such a case the problem lies with the zfs
> uberblock, and the remedy is to revert the pool to its previous
> uberblock version (or even an earlier one). In almost all cases this
> will render the pool importable, and then the mechanisms described in
> the first paragraph above will kick in.
>
> The problem is (was) that the procedure to revert a pool to one of its
> previous uberblocks is neither documented nor automatic, and is
> definitely far from being sysadmin-friendly. But thanks to some
> community members (most notably Mr. Victor, I think) some users
> affected by the issue were given a hand and were able to recover
> most/all of their data. Others were probably assisted by Sun's support
> service, I guess.
>
> Fortunately a much more user-friendly mechanism has finally been
> implemented and integrated into Open Solaris build 126, which allows a
> user to import a pool and force it back to one of the previous versions
> of its uberblock if necessary. See
> http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html
> for more details.
>
> There is another CR (don't have its number at hand) which is about
> implementing a delayed re-use of just-freed blocks, which should allow
> more data to be recovered in such a case as above. Although I'm not
> sure if it has been implemented yet.
>
> IMHO with the above CR implemented, in most cases ZFS currently
> provides a *much* better solution to random data corruption than any
> other filesystem+fsck on the market.

The code for the putback of 2009/479 allows reverting to an earlier uberblock AND defers the re-use of blocks for a short time to make this "rewind" safer.

-tim
Re: [zfs-discuss] ZFS + fsck
Kevin Walker wrote:
> Hi all,
>
> Just subscribed to the list after a debate on our helpdesk led me to
> the posting about ZFS corruption and the need for a fsck repair tool
> of some kind...
>
> Has there been any update on this?

I guess the discussion started after someone read an article on OSNEWS.

The way zfs works is that basically you get an fsck equivalent while using a pool.

ZFS checks checksums for all metadata and user data while reading it. Then all metadata use ditto blocks to provide two or three copies (totally independent of any pool redundancy), depending on the type of metadata. If a copy is corrupted, a second (or third) copy will be used, so correct data is returned and the corrupted block is automatically repaired. The ability to repair a block containing user data depends on whether you have a pool configured with or without redundancy. But even if a pool is non-redundant (let's say a single disk drive), zfs will still be able to detect corruption and tell you which files are affected, while metadata will be correct in most cases (unless the corruption is so large and non-localized that it affected all copies of a block in the pool). You will be able to read all other files and other parts of the file.

So fsck actually happens while you are accessing your data, and it is even better than fsck on most other filesystems: thanks to checksumming of all data and metadata, zfs knows exactly when/if something is wrong and in most cases is even able to fix it on the fly.

If you want to scan the entire pool, including all redundant copies, and get them fixed if something doesn't checksum, then you can schedule a pool scrub (while your applications are still using the pool!). This will force all blocks from all copies to be read and their checksums checked, and if needed the data will be corrected where possible and the fact reported to the user. Legacy fsck is not even close to this.

I think the perceived need for fsck for ZFS probably comes from a lack of understanding of how ZFS works, and from some frustrated users who, under very unlikely and rare circumstances, were due to data corruption in a position of not being able to import the pool, and therefore not able to access any data at all, while the corruption might have affected only a relatively small amount of data. Most other filesystems will allow you to access most of the data after fsck in such a situation (probably with some data loss), while zfs left the user with no access to data at all. In such a case the problem lies with the zfs uberblock, and the remedy is to revert the pool to its previous uberblock version (or even an earlier one). In almost all cases this will render the pool importable, and then the mechanisms described in the first paragraph above will kick in.

The problem is (was) that the procedure to revert a pool to one of its previous uberblocks is neither documented nor automatic, and is definitely far from being sysadmin-friendly. But thanks to some community members (most notably Mr. Victor, I think) some users affected by the issue were given a hand and were able to recover most/all of their data. Others were probably assisted by Sun's support service, I guess.

Fortunately a much more user-friendly mechanism has finally been implemented and integrated into Open Solaris build 126, which allows a user to import a pool and force it back to one of the previous versions of its uberblock if necessary. See http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html for more details.

There is another CR (don't have its number at hand) which is about implementing a delayed re-use of just-freed blocks, which should allow more data to be recovered in such a case as above. Although I'm not sure if it has been implemented yet.

IMHO with the above CR implemented, in most cases ZFS currently provides a *much* better solution to random data corruption than any other filesystem+fsck on the market.

Personally I don't blame Sun that implementing the CR took so long, as it mostly affected home users with cheap hardware from BestBuy-like sources, and even then it was relatively rare. So-called enterprise customers were affected even less, and then either they had enough expertise or they called Sun's support organization to get a pool manually reverted to its previous uberblock. So from Sun's perspective the issue was far from top-priority, and resources are limited as usual. Still, IIRC it was thanks to some vocal users here complaining about the issue that the ZFS developers were convinced to get it expedited... :)

ps. sorry for a chaotic email, but lack of time is my friend as usual :)

--
Robert Milkowski
http://milek.blogspot.com
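A small aside on the ditto-block point above: the metadata copies are automatic, but on a non-redundant pool you can ask for the same treatment for user data via the copies property. A sketch, with "mypool/data" as a placeholder dataset:

# zfs set copies=2 mypool/data
(user data written from now on is stored twice, placed on different disks where possible; already-written data is not rewritten)
# zfs get copies mypool/data
(verifies the setting)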
Re: [zfs-discuss] ZFS + fsck
Joerg just posted a lengthy answer to the fsck question:

http://www.c0t0d0s0.org/archives/6071-No,-ZFS-really-doesnt-need-a-fsck.html

Good stuff. I see two answers to "nobody complained about lying hardware before ZFS".

One: The user has never tried another filesystem that tests for end-to-end data integrity, so ZFS notices more problems, and sooner.

Two: If you lost data with another filesystem, you may have overlooked it and blamed the OS or the application, instead of the inexpensive hardware.
Re: [zfs-discuss] ZFS + fsck
Also, read this:
http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html
Re: [zfs-discuss] ZFS + fsck
Such functionality is in the ZFS code now. It will be available to us later:
http://c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-support.html
Re: [zfs-discuss] ZFS + fsck
ZFS scrub will detect many types of error in your data or the filesystem metadata. If you have sufficient redundancy in your pool and the errors were not due to dropped or misordered writes, then they can often be automatically corrected during the scrub.

If ZFS detects an error from which it cannot automatically recover, it will often instantly lock your entire pool to prevent any read or write access, informing you only that you must destroy it and "restore from backups" to get your data back. Your only recourse in such situations is to do exactly that, or enlist the help of Victor Latushkin to attempt to recover your pool using painstaking manual manipulation.

Recent putbacks seem to indicate that future releases will provide a mechanism to allow mere mortals to recover from some of the errors caused by dropped writes.

cheers,
Rob
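To make the "informing you" part concrete, this is roughly the workflow once errors have been detected; "tank" is a placeholder and the exact wording of the output differs between releases:

# zpool status -v tank
(shows the pool state, e.g. ONLINE/DEGRADED/FAULTED, the per-device error counters, a suggested action, and with -v a list of files with permanent errors)
# zpool clear tank
(resets the error counters once the underlying cause has been addressed; it does not repair the data itself)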
Re: [zfs-discuss] ZFS + fsck
Hi all,

Just subscribed to the list after a debate on our helpdesk led me to the posting about ZFS corruption and the need for a fsck repair tool of some kind...

Has there been any update on this?

Kind regards,

Kevin Walker
Coreix Limited
DDI: (+44) 0207 183 1725 ext 90
Mobile: (+44) 07960 967818
Fax: (+44) 0208 53 44 111

* This message is intended solely for the use of the individual or organisation to whom it is addressed. It may contain privileged or confidential information. If you are not the intended recipient, you should not use, copy, alter, or disclose the contents of this message *