Re: [zfs-discuss] ZFS on Dell with FreeBSD
On Thu, Oct 20, 2011 at 7:56 AM, Dave Pooser wrote:
> On 10/19/11 9:14 AM, "Albert Shih" wrote:
>
>> When we buy a MD1200 we need a RAID PERC H800 card on the server
>
> No, you need a card that includes 2 external x4 SFF8088 SAS connectors.
> I'd recommend an LSI SAS 9200-8e HBA flashed with the IT firmware -- then
> it presents the individual disks and ZFS can handle redundancy and
> recovery.

Exactly -- thanks for suggesting a specific controller model that can present the disks as JBOD.

With hardware RAID you are relying on the controller to behave nicely, which is why I suggested simply creating one big volume for ZFS to use (you then only get features like snapshots, clones, etc., but not ZFS self-healing). Again, others might (and do) disagree and suggest creating one volume per individual disk, even when you are still relying on the hardware RAID controller. But ultimately there is no question that the best possible setup is to present the disks as JBOD and let ZFS handle them directly.

--
Fajar
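A minimal sketch of the two approaches being contrasted above; pool and device names are examples only, not taken from the thread:

  # HBA in IT mode: the 12 disks appear individually, so ZFS owns the
  # redundancy and can self-heal when a checksum mismatch is found.
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11

  # Single big hardware-RAID volume: ZFS sees one device, so it can still
  # detect corruption but has no second copy to repair from.
  zpool create tank mfid0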
Re: [zfs-discuss] ZFS on Dell with FreeBSD
I also recommend the LSI 9200-8E, or the newer 9205-8E, with the IT firmware, based on past experience. LSI's original HBAs also normally get firmware releases earlier than the OEM versions do. Plus, most users in the community use LSI HBAs.

Rocky

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Dave Pooser
Sent: Wednesday, October 19, 2011 5:56 PM
To: freebsd-questi...@freebsd.org; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS on Dell with FreeBSD

On 10/19/11 9:14 AM, "Albert Shih" wrote:
> When we buy a MD1200 we need a RAID PERC H800 card on the server

No, you need a card that includes 2 external x4 SFF8088 SAS connectors. I'd recommend an LSI SAS 9200-8e HBA flashed with the IT firmware -- then it presents the individual disks and ZFS can handle redundancy and recovery.
--
Dave Pooser
Manager of Information Services
Alford Media  http://www.alfordmedia.com
Re: [zfs-discuss] ZFS on Dell with FreeBSD
On 10/19/11 9:14 AM, "Albert Shih" wrote:
> When we buy a MD1200 we need a RAID PERC H800 card on the server

No, you need a card that includes 2 external x4 SFF8088 SAS connectors. I'd recommend an LSI SAS 9200-8e HBA flashed with the IT firmware -- then it presents the individual disks and ZFS can handle redundancy and recovery.
--
Dave Pooser
Manager of Information Services
Alford Media  http://www.alfordmedia.com
Re: [zfs-discuss] commercial zfs-based storage replication software?
2011-10-19 17:54, Fajar A. Nugraha wrote:
> On Wed, Oct 19, 2011 at 7:52 PM, Jim Klimov wrote:
>> Well, just for the sake of completeness: most of our systems are using
>> the zfs-auto-snap service, including Solaris 10 systems dating from Sol10u6.
>> Installation of the relevant packages from SXCE (ranging snv_117-snv_130)
>> was trivial, but some script-patching was in order. I think, replacement
>> of the ksh interpreter with ksh93.
>
> Yes, I remembered reading about that.

Actually, I revised the systems: the scripts are kept in their original form, but those Sol10 servers where ksh93 was absent got a symlink: /usr/bin/ksh93 -> ../dt/bin/dtksh
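A sketch of the workaround described above, using the paths given in the message (it assumes dtksh from the CDE packages is present on the Solaris 10 box):

  # Point the interpreter the zfs-auto-snap scripts expect at dtksh,
  # which is ksh93-based, on systems that lack a native /usr/bin/ksh93.
  ln -s ../dt/bin/dtksh /usr/bin/ksh93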
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 10:13:56AM -0400, David Magda wrote:
> On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:
>
>> Fsck can only fix known file system inconsistencies in file system
>> structures. Because there is no atomicity of operations in UFS and other
>> file systems, it is possible that when you remove a file, your system can
>> crash between removing the directory entry and freeing the inode or blocks.
>> This is expected with UFS; that's why there is fsck, to verify that no
>> such thing happened.
>
> Slightly OT, but this non-atomic delay between meta-data updates and
> writes to the disk is exploited by "soft updates" with FreeBSD's UFS:
>
> http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES
>
> It may be of some interest to the file system geeks on the list.

Well, soft updates, thanks to careful ordering of operations, allow the file system to be mounted even in an inconsistent state and fsck to be run in the background, as the only possible inconsistencies are resource leaks -- a directory entry will never point at an unallocated inode and an inode will never point at an unallocated block, etc. This is still not atomic. In recent versions of FreeBSD, soft updates were extended to journal those resource leaks, so background fsck is not needed anymore.

--
Pawel Jakub Dawidek       http://www.wheelsystems.com
FreeBSD committer         http://www.FreeBSD.org
Am I Evil? Yes, I Am!     http://yomoli.com
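A minimal sketch of enabling soft updates, and the journaling extension mentioned above, on a FreeBSD UFS file system (the device name is an example, and the file system must be unmounted or read-only when tunefs runs):

  # Enable soft updates on a UFS file system.
  tunefs -n enable /dev/ada0p2

  # On recent FreeBSD, additionally enable soft-updates journaling (SUJ),
  # which removes the need for background fsck after a crash.
  tunefs -j enable /dev/ada0p2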
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 7:24 AM, Garrett D'Amore wrote:
> I'd argue that from a *developer* point of view, an fsck tool for ZFS might
> well be useful. Isn't that what zdb is for? :-)
>
> But ordinary administrative users should never need something like this,
> unless they have encountered a bug in ZFS itself. (And bugs are as likely to
> exist in the checker tool as in the filesystem. ;-)

zdb can be useful for admins too -- say, to gather stats not reported by the system, to explore the fs/vol layout, for educational purposes, and so on.

Nico
--
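A few illustrative zdb invocations along the lines Nico describes; the pool name and device path are examples, and output formats vary between releases:

  # Traverse the pool and report block statistics not shown by zpool(1M).
  zdb -b tank

  # Show dataset and object layout details for a pool.
  zdb -d tank

  # Print the on-disk label and pool configuration from one device.
  zdb -l /dev/dsk/c0t0d0s0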
Re: [zfs-discuss] ZFS on Dell with FreeBSD
On Wed, Oct 19, 2011 at 10:14 AM, Albert Shih wrote:
> When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
> two options :
>
> 1/ create a LV on the PERC H800 so the server sees one volume and put
>    the zpool on this unique volume and let the hardware manage the
>    raid.
>
> 2/ create 12 LV on the PERC H800 (so without raid) and let FreeBSD
>    and ZFS manage the raid.
>
> which one is the best solution ?
>
> Any advice about the RAM I need on the server (currently one MD1200, so
> 12x2TB disks)

I know the PERC H200 can be flashed with IT firmware, making it in effect a "dumb" HBA perfect for ZFS usage. Perhaps the H800 can be as well? (If not, can you get the machine configured with an H200?)

If that's not an option, I think option 2 will work. My first ZFS server ran on a PERC 5/i, and I was forced to make 8 single-drive RAID 0s in the PERC Option ROM, but Solaris did not seem to mind that.

--khd
Re: [zfs-discuss] ZFS on Dell with FreeBSD
On Wed, Oct 19, 2011 at 11:14 AM, Albert Shih wrote:
> Hi
>
> Sorry for cross-posting. I don't know which mailing list I should post this
> message to.
>
> I would like to use FreeBSD with ZFS on some Dell servers with some
> MD1200 (classic DAS).
>
> When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
> two options :
>
> 1/ create a LV on the PERC H800 so the server sees one volume and put
>    the zpool on this unique volume and let the hardware manage the
>    raid.
>
> 2/ create 12 LV on the PERC H800 (so without raid) and let FreeBSD
>    and ZFS manage the raid.
>
> which one is the best solution ?
>
> Any advice about the RAM I need on the server (currently one MD1200, so
> 12x2TB disks)
>
> Regards.

For a ZFS approach, the second option is in my opinion better.

> JAS
> --
> Albert SHIH
> DIO batiment 15
> Observatoire de Paris
> 5 Place Jules Janssen
> 92195 Meudon Cedex
> Téléphone : 01 45 07 76 26/06 86 69 95 71
> Heure local/Local time:
> mer 19 oct 2011 16:11:40 CEST

--
Jorge Andrés Medina Oliva.
Computer engineer.
IT consultant
http://www.bsdchile.cl
Re: [zfs-discuss] repair [was: about btrfs and zfs]
On Oct 19, 2011, at 1:52 PM, Richard Elling wrote:
> On Oct 18, 2011, at 5:21 PM, Edward Ned Harvey wrote:
>
>>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>>> boun...@opensolaris.org] On Behalf Of Tim Cook
>>>
>>> I had and have redundant storage, it has *NEVER* automatically fixed
>>> it. You're the first person I've heard that has had it automatically fix it.
>>
>> That's probably just because it's normal and expected behavior to
>> automatically fix it - I always have redundancy, and every cksum error I
>> ever find is always automatically fixed. I never tell anyone here because
>> it's normal and expected.
>
> Yes, and in fact the automated tests for ZFS developers intentionally corrupt
> data so that the repair code can be tested. Also, the same checksum code is
> used to calculate the checksum when writing and reading.
>
>> If you have redundancy, and cksum errors, and it's not automatically fixed,
>> then you should report the bug.
>
> For modern Solaris-based implementations, each checksum mismatch that is
> repaired reports the bitmap of the corrupted vs. expected data. Obviously, if
> the data cannot be repaired, you cannot know the expected data, so the error
> is reported without identification of the broken bits.
>
> In the archives, you can find reports of recoverable and unrecoverable errors
> attributed to:
>       1. ZFS software (rare, but a bug a few years ago mishandled a raidz case)
>       2. SAN switch firmware
>       3. "Hardware" RAID array firmware
>       4. Power supplies
>       5. RAM
>       6. HBA
>       7. PCI-X bus
>       8. BIOS settings
>       9. CPU and chipset errata
>
> Personally, I've seen all of the above except #7, because PCI-X hardware is
> hard to find now.

I've seen #7. I have some PCI-X hardware that is flaky in my home lab. ;-)

There was a case of #1 not very long ago, but it was a difficult-to-trigger race and is fixed in illumos and, I presume, other derivatives (including NexentaStor).

	- Garrett

> If you consistently see unrecoverable data from a system that has protected
> data, then there may be an issue with a part of the system that is a single
> point of failure. Very, very, very few x86 systems are designed with no SPOF.
> -- richard
>
> --
> ZFS and performance consulting
> http://www.RichardElling.com
> VMworld Copenhagen, October 17-20
> OpenStorage Summit, San Jose, CA, October 24-27
> LISA '11, Boston, MA, December 4-9
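A sketch of how an administrator might observe repaired checksum events on a Solaris-derived system; the pool name is an example and the exact ereport fields vary by release:

  # Per-vdev READ/WRITE/CKSUM counters, plus any files with unrecoverable errors.
  zpool status -v tank

  # Dump the FMA error reports; repaired checksum mismatches show up as
  # ereport.fs.zfs.checksum events naming the affected vdev.
  fmdump -eV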
Re: [zfs-discuss] about btrfs and zfs
I'd argue that from a *developer* point of view, an fsck tool for ZFS might well be useful. Isn't that what zdb is for? :-)

But ordinary administrative users should never need something like this, unless they have encountered a bug in ZFS itself. (And bugs are as likely to exist in the checker tool as in the filesystem. ;-)

	- Garrett

On Oct 19, 2011, at 2:15 PM, Pawel Jakub Dawidek wrote:

> On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
>> fsck verifies the logical consistency of a filesystem. For UFS, this
>> includes: used data blocks are allocated to exactly one file,
>> directory entries point to valid inodes, allocated inodes have at
>> least one link, the number of links in an inode exactly matches the
>> number of directory entries pointing to that inode, directories form a
>> single tree without loops, file sizes are consistent with the number
>> of allocated blocks, unallocated data/inode blocks are in the
>> relevant free bitmaps, and redundant superblock data is consistent. It
>> can't verify data.
>
> Well said. I'd add that people who insist on ZFS having an fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
>
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.
>
> In ZFS, on the other hand, there are no inconsistencies like that. If all
> blocks match their checksums and you find a directory loop or something
> like that, it is a bug in ZFS, not an expected inconsistency. It should be
> fixed in ZFS and not worked around with some fsck for ZFS.
>
> --
> Pawel Jakub Dawidek       http://www.wheelsystems.com
> FreeBSD committer         http://www.FreeBSD.org
> Am I Evil? Yes, I Am!     http://yomoli.com
Re: [zfs-discuss] ZFS on Dell with FreeBSD
On 10/19/11 15:30, Fajar A. Nugraha wrote:
> On Wed, Oct 19, 2011 at 9:14 PM, Albert Shih wrote:
>> Hi
>>
>> Sorry for cross-posting. I don't know which mailing list I should post this
>> message to.
>>
>> I would like to use FreeBSD with ZFS on some Dell servers with some
>> MD1200 (classic DAS).
>>
>> When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
>> two options :
>>
>> 1/ create a LV on the PERC H800 so the server sees one volume and put
>>    the zpool on this unique volume and let the hardware manage the
>>    raid.
>>
>> 2/ create 12 LV on the PERC H800 (so without raid) and let FreeBSD
>>    and ZFS manage the raid.
>>
>> which one is the best solution ?
>
> Neither. The best solution is to find a controller which can pass the
> disks through as JBOD (not encapsulated as virtual disks).
>
> Failing that, I'd go with (1) (though others might disagree).

No, go with 2. ALWAYS let ZFS manage the redundancy, otherwise it can't self-heal.

--
Darren J Moffat
Re: [zfs-discuss] about btrfs and zfs
On 10/18/11 03:31 PM, Tim Cook wrote:
> On Tue, Oct 18, 2011 at 3:27 PM, Peter Tribble wrote:
>> On Tue, Oct 18, 2011 at 9:12 PM, Tim Cook wrote:
>>> On Tue, Oct 18, 2011 at 3:06 PM, Peter Tribble wrote:
>>>> On Tue, Oct 18, 2011 at 8:52 PM, Tim Cook wrote:
>>>>> Every scrub I've ever done that has found an error required manual fixing.
>>>>> Every pool I've ever created has been raid-z or raid-z2, so the silent
>>>>> healing, while a great story, has never actually happened in practice in
>>>>> any environment I've used ZFS in.
>>>>
>>>> You have, of course, reported each such failure, because if that
>>>> was indeed the case then it's a clear and obvious bug?
>>>>
>>>> For what it's worth, I've had ZFS repair data corruption on
>>>> several occasions - both during normal operation and as a
>>>> result of a scrub, and I've never had to intervene manually.
>>>>
>>>> --
>>>> -Peter Tribble
>>>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>>>
>>> Given that there are guides on how to manually fix the corruption, I don't
>>> see any need to report it. It's considered acceptable and expected behavior
>>> from everyone I've talked to at Sun...
>>> http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html
>>
>> If you have adequate redundancy, ZFS will - and does - repair errors.
>> The document you quote is for the case where you don't actually have
>> adequate redundancy: ZFS will refuse to make up data for you, and will
>> report back where the problem was. Exactly as designed.
>>
>> (And yes, I've come across systems without redundant storage, or with
>> multiple simultaneous failures. The original statement was that if you
>> have redundant copies of the data or, in the case of raidz, enough
>> information to reconstruct it, then ZFS will repair it for you. Which
>> has been exactly in accord with my experience.)
>>
>> --
>> -Peter Tribble
>> http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
>
> I had and have redundant storage, and it has *NEVER* automatically fixed
> it. You're the first person I've heard from who has had it automatically
> fix it. Per the page, "or an unlikely series of events conspired to corrupt
> multiple copies of a piece of data." That unlikely series of events, which
> goes unnamed, is not that unlikely in my experience.
>
> --Tim

Just another 2 cents towards a euro/dollar/yen. I've only had data redundancy in ZFS via mirrors (not that it should matter, as long as there's redundancy), and in every case I've had it repair data automatically via a scrub. The one case where it didn't was when the disk controller that both drives happened to share (bad design, yes) started erroring and corrupting writes to both disks in parallel, so there was no good data to fix it with. I was still happy to be using ZFS, as a filesystem without a scrub/scan of some sort wouldn't even have noticed, in my experience - I suspect btrfs would have, if its scan works similarly.

cheers,
Brian

--
---
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S    608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
---
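A short sketch of the scrub-and-check cycle being discussed; the pool name is an example:

  # Kick off a scrub and then see whether ZFS repaired anything on its own.
  zpool scrub tank
  zpool status -v tank    # check the CKSUM column and the "scrub repaired" line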
Re: [zfs-discuss] ZFS on Dell with FreeBSD
On Wed, Oct 19, 2011 at 9:14 PM, Albert Shih wrote:
> Hi
>
> Sorry for cross-posting. I don't know which mailing list I should post this
> message to.
>
> I would like to use FreeBSD with ZFS on some Dell servers with some
> MD1200 (classic DAS).
>
> When we buy a MD1200 we need a RAID PERC H800 card on the server so we have
> two options :
>
> 1/ create a LV on the PERC H800 so the server sees one volume and put
>    the zpool on this unique volume and let the hardware manage the
>    raid.
>
> 2/ create 12 LV on the PERC H800 (so without raid) and let FreeBSD
>    and ZFS manage the raid.
>
> which one is the best solution ?

Neither. The best solution is to find a controller which can pass the disks through as JBOD (not encapsulated as virtual disks).

Failing that, I'd go with (1) (though others might disagree).

> Any advice about the RAM I need on the server (currently one MD1200, so
> 12x2TB disks)

The more the better :)

Just make sure you do NOT use dedup until you REALLY know what you're doing (which usually means buying lots of RAM and SSD for L2ARC).

--
Fajar
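A sketch of how one might gauge the dedup cost before enabling it, which is one way to "know what you're doing" here; the pool name is an example:

  # Simulate dedup on the existing pool data and print a DDT histogram and
  # the expected dedup ratio, without actually enabling dedup.
  zdb -S tank

  # Roughly, every unique block needs an in-core dedup-table entry (on the
  # order of a few hundred bytes), so pools with hundreds of millions of
  # unique blocks want a lot of RAM and/or L2ARC.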
[zfs-discuss] ZFS on Dell with FreeBSD
Hi

Sorry for cross-posting. I don't know which mailing list I should post this message to.

I would like to use FreeBSD with ZFS on some Dell servers with some MD1200 (classic DAS).

When we buy a MD1200 we need a RAID PERC H800 card on the server, so we have two options :

1/ create a LV on the PERC H800 so the server sees one volume, put
   the zpool on this unique volume and let the hardware manage the
   raid.

2/ create 12 LV on the PERC H800 (so without raid) and let FreeBSD
   and ZFS manage the raid.

Which one is the best solution ?

Any advice about the RAM I need on the server (currently one MD1200, so 12x2TB disks)?

Regards.

JAS
--
Albert SHIH
DIO batiment 15
Observatoire de Paris
5 Place Jules Janssen
92195 Meudon Cedex
Téléphone : 01 45 07 76 26/06 86 69 95 71
Heure local/Local time:
mer 19 oct 2011 16:11:40 CEST
Re: [zfs-discuss] about btrfs and zfs
On Wed, October 19, 2011 08:15, Pawel Jakub Dawidek wrote:

> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.

Slightly OT, but this non-atomic delay between meta-data updates and writes to the disk is exploited by "soft updates" with FreeBSD's UFS:

http://www.freebsd.org/doc/en/books/handbook/configtuning-disk.html#SOFT-UPDATES

It may be of some interest to the file system geeks on the list.
Re: [zfs-discuss] commercial zfs-based storage replication software?
On Wed, Oct 19, 2011 at 7:52 PM, Jim Klimov wrote:
> 2011-10-13 13:27, Darren J Moffat wrote:
>> On 10/13/11 09:27, Fajar A. Nugraha wrote:
>>> On Tue, Oct 11, 2011 at 5:26 PM, Darren J Moffat wrote:
>>>> Have you looked at the time-slider functionality that is already in
>>>> Solaris ?
>>>
>>> Hi Darren. Is it available for Solaris 10? I just installed Solaris 10
>>> u10 and couldn't find it.
>>
>> No, it is not.
>>
>>> Is there a reference on how to get/install this functionality on Solaris 10?
>>
>> No, because it doesn't exist on Solaris 10.
>
> Well, just for the sake of completeness: most of our systems are using
> the zfs-auto-snap service, including Solaris 10 systems dating from Sol10u6.
> Installation of the relevant packages from SXCE (ranging snv_117-snv_130)
> was trivial, but some script-patching was in order. I think, replacement
> of the ksh interpreter with ksh93.

Yes, I remembered reading about that.

> I haven't used the GUI part and I guess my experience relates to the
> script-based zfs-auto-snap (before it was remade into its current binary
> form, or so I read). We kind of got stuck with SXCE systems which
> still "just work" fine ;)
>
> The point is, even if unsupported (which may be a problem in the OP's case),
> it is likely that one or another version of zfs-auto-snap or TimeSlider
> can be made to work on Sol10 with little effort.

To be honest, if it were just a matter of getting it to work, I'd just make my own. Or run Solaris Express with a Solaris 10 zone inside it, with SE managing time-slider/replication and the S10 zone running the application.

But for this particular case support is essential. That's why I mentioned earlier that if I can't get a supported solution for this setup (at a reasonable price), storage-based replication would have to do.

--
Fajar
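A minimal sketch of the "make my own" approach mentioned above: periodic snapshot plus incremental zfs send to a standby host. The dataset, host, and snapshot names are hypothetical, and it assumes an initial full send/receive has already seeded the remote copy:

  #!/bin/sh
  # Hypothetical roll-your-own replication: snapshot, then send the delta
  # since the previous snapshot to a standby box over ssh.
  DS=tank/data
  REMOTE=backuphost
  PREV=$(zfs list -H -t snapshot -o name -s creation -r $DS | grep "^$DS@" | tail -1)
  NOW=$DS@repl-$(date +%Y%m%d%H%M)
  zfs snapshot "$NOW"
  zfs send -i "$PREV" "$NOW" | ssh $REMOTE zfs receive -F tank/data-copy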
Re: [zfs-discuss] bootadm hang WAS tuning zfs_arc_min
2011-10-12 11:56, Frank Van Damme wrote:
> The root of the problem seems to be that that process never completes.
>
>   9 /lib/svc/bin/svc.startd
>     332 /sbin/sh /lib/svc/method/boot-archive-update
>       347 /sbin/bootadm update-archive
>
> Can't kill it and run it from the cmdline either, it simply ignores SIGKILL.
> (Which shouldn't even be possible).

I guess it is possible when things lock up in kernel calls, waiting for them to complete. It has happened to me a number of times, usually related to a ZFS pool being too busy working or repairing to do anything else, and this per se often led to the system crashing (see e.g. my adventures this spring reported on the forums). I have hit a number of problems generally leading to the whole ZFS subsystem "running away to a happy place".

As an indication of this, you can try running something as simple as "zpool list" in the background (otherwise your shell locks up too) and see if it ever completes:

# zpool list &

Earlier there were bugs related to inaccessible snapshots (marked for deletion, but not actually deletable until you mount and unmount the parent dataset) - these mostly fired in zfs-auto-snap auto-deletions, but also happened to affect bootadm. I am not sure in what way bootadm relies on zfs/zpool, but empirically - it does.

You might work around the problem by:

* exporting the "data" ZFS pools before updating the boot archive (bootadm update-archive); if you're rebooting the system anyway, stop the zones and services manually and give this a try.

* booting from other media like a Failsafe Boot (SXCE, Sol10) or LiveCD (Indiana), importing your root pool at "/a", then running
  # bootadm update-archive -R /a

* booting into single-user mode, making the root RW if needed, and updating the archive.
** You're likely to go this way anyway if your boot is interrupted due to an outdated boot archive (SMF failure - requires a repair-shell interaction). When the archive is updated, you need to clear the service (svcadm clear boot-archive) and exit the repair shell in order to continue booting the OS.

* brute force - updating the boot archive (/platform/i86pc/boot_archive and /platform/i86pc/amd64/boot_archive) manually as an FS image, with the files listed in /boot/solaris/filelist.ramdisk. Usually failure on boot is related to updating of some config files in /etc...

//Jim
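A sketch of the boot-from-media workaround described above; the root pool name "rpool" is the usual default but is an assumption here, and a failsafe boot may already have the pool mounted for you:

  # From a LiveCD/failsafe environment: import the root pool under /a,
  # rebuild the boot archive against it, then export and reboot.
  zpool import -f -R /a rpool
  bootadm update-archive -R /a
  zpool export rpool
  init 6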
Re: [zfs-discuss] about btrfs and zfs
Thank you. The following is the best "layman's" explanation as to _why_ ZFS does not have (or even needs) an fsck equivalent.

On the other hand, there are situations where you really do need to force ZFS to do something that may not be a "good idea", but is the best of a bad set of choices. Hence zpool import -F (and other such tools available via zdb). While the ZFS data may not be corrupt, it is possible to corrupt the ZFS metadata, uberblock, and labels in such a way that force is necessary.

On Wed, Oct 19, 2011 at 8:15 AM, Pawel Jakub Dawidek wrote:
> Well said. I'd add that people who insist on ZFS having an fsck are
> missing the whole point of the ZFS transactional model and copy-on-write
> design.
>
> Fsck can only fix known file system inconsistencies in file system
> structures. Because there is no atomicity of operations in UFS and other
> file systems, it is possible that when you remove a file, your system can
> crash between removing the directory entry and freeing the inode or blocks.
> This is expected with UFS; that's why there is fsck, to verify that no
> such thing happened.
>
> In ZFS, on the other hand, there are no inconsistencies like that. If all
> blocks match their checksums and you find a directory loop or something
> like that, it is a bug in ZFS, not an expected inconsistency. It should be
> fixed in ZFS and not worked around with some fsck for ZFS.
>
> --
> Pawel Jakub Dawidek       http://www.wheelsystems.com
> FreeBSD committer         http://www.FreeBSD.org
> Am I Evil? Yes, I Am!     http://yomoli.com

--
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
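A sketch of the recovery import Paul is referring to; the pool name is an example, and -F rewinds the pool to an earlier transaction group, which may discard the last few seconds of writes:

  # Try a normal import first.
  zpool import tank

  # If the pool is damaged, ask ZFS to roll back to the last consistent txg.
  zpool import -F tank

  # -n reports whether the rewind would succeed without actually doing it.
  zpool import -Fn tank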
Re: [zfs-discuss] commercial zfs-based storage replication software?
2011-10-13 13:27, Darren J Moffat wrote:
> On 10/13/11 09:27, Fajar A. Nugraha wrote:
>> On Tue, Oct 11, 2011 at 5:26 PM, Darren J Moffat wrote:
>>> Have you looked at the time-slider functionality that is already in
>>> Solaris ?
>>
>> Hi Darren. Is it available for Solaris 10? I just installed Solaris 10
>> u10 and couldn't find it.
>
> No, it is not.
>
>> Is there a reference on how to get/install this functionality on Solaris 10?
>
> No, because it doesn't exist on Solaris 10.

Well, just for the sake of completeness: most of our systems are using the zfs-auto-snap service, including Solaris 10 systems dating from Sol10u6. Installation of the relevant packages from SXCE (ranging snv_117-snv_130) was trivial, but some script-patching was in order. I think, replacement of the ksh interpreter with ksh93.

I haven't used the GUI part, and I guess my experience relates to the script-based zfs-auto-snap (before it was remade into its current binary form, or so I read). We kind of got stuck with SXCE systems which still "just work" fine ;)

The point is, even if unsupported (which may be a problem in the OP's case), it is likely that one or another version of zfs-auto-snap or TimeSlider can be made to work on Sol10 with little effort.

HTH,
//Jim
[zfs-discuss] Growing CKSUM errors with no READ/WRITE errors
2011-10-19 16:01, Richard Elling wrote:
> On Oct 18, 2011, at 6:35 PM, David Magda wrote:
>> If we've found one bad disk, what are our options?
>
> Live with it or replace it :-)
> -- richard

Similar question: an HDD went awry last week in an snv_117 box (the controller no longer sees the drive, so I guess there is either a dead drive, or dead power/data ports on the backplane), and a hot spare replaced it okay.

However, there are a number of CKSUM errors on the replacement disk, growing by about 100 daily (according to "zpool status"). I tried scrubbing the pool and zeroing the counter with "zpool clear", but new CKSUM errors are being found. There are zero READ or WRITE error counts, though.

Should we be worried about replacing the ex-hot-spare drive ASAP as well?

There are no errors in dmesg regarding the ex-hot-spare drive, only those regarding the dead one, occasionally:

=== dmesg:
Oct 19 16:28:23 thumper scsi: [ID 107833 kern.warning] WARNING: /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@6,0 (sd40):
Oct 19 16:28:23 thumper     Command failed to complete...Device is gone
Oct 19 16:28:23 thumper scsi: [ID 107833 kern.warning] WARNING: /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@6,0 (sd40):
Oct 19 16:28:23 thumper     SYNCHRONIZE CACHE command failed (5)

=== format:
30. c5t6d0 /pci@1,0/pci1022,7458@4/pci11ab,11ab@1/disk@6,0

--
Jim Klimov (Klimov Evgeniy), CTO, JSC COS&HT
+7-903-7705859 (cellular)    mailto:jimkli...@cos.ru
CC: ad...@cos.ru, jimkli...@mail.ru
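A sketch of how the suspect ex-hot-spare could be checked and, if it keeps accumulating errors, replaced; the pool name and the device names below are hypothetical, not taken from the message:

  # Per-device error counters kept by the sd driver (soft/hard/transport).
  iostat -En c5t9d0

  # FMA error reports; a steady stream of checksum ereports against one vdev
  # points at that disk or its path (cabling, backplane port).
  fmdump -eV | grep -i checksum

  # Replace the suspect disk with a new one and let the pool resilver.
  zpool replace tank c5t9d0 c5t10d0
  zpool status -x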
Re: [zfs-discuss] repair [was: about btrfs and zfs]
2011-10-19 15:52, Richard Elling wrote:
> In the archives, you can find reports of recoverable and unrecoverable
> errors attributed to: ...

Ah, yes, and:

11. Faulty disk cabling (i.e. plastic connectors that soften with heat and fall off) - that has happened to cause strange behavior as well ;)

Even if the connectors don't fall off, an unreliable physical connection (including oxidization of the metal plugs) leads to all sorts of noise on the wire, which may be misinterpreted as random bits. These can often be fixed (and diagnosed) by pulling the connectors and plugging them back in - the oxide film is scratched off, and the cable works again, for a few months more...

//Jim
Re: [zfs-discuss] repair [was: about btrfs and zfs]
2011-10-19 15:52, Richard Elling wrote:
> In the archives, you can find reports of recoverable and unrecoverable
> errors attributed to:
>       1. ZFS software (rare, but a bug a few years ago mishandled a raidz case)
>       2. SAN switch firmware
>       3. "Hardware" RAID array firmware
>       4. Power supplies
>       5. RAM
>       6. HBA
>       7. PCI-X bus
>       8. BIOS settings
>       9. CPU and chipset errata

10. Broken HDDs ;)

For weird, inexplicable bugs, insufficient or faulty power supplies and cooling are often the core cause, at least in "enthusiast PCs". Perhaps the PSU is okay under normal load but fails under some peak loads, and that leads to random bits being generated in RAM or on the connection buses...

Also, some interference can be caused by motors, etc. in the HDDs and cooling fans - with older audio cards you could actually hear your HDD or CD-ROM spin up, by a characteristic buzz in the headphones or on the loudspeakers. Whether other components would fail or not under such EMI - that depends.

//Jim
Re: [zfs-discuss] about btrfs and zfs
On Wed, Oct 19, 2011 at 08:40:59AM +1100, Peter Jeremy wrote:
> fsck verifies the logical consistency of a filesystem. For UFS, this
> includes: used data blocks are allocated to exactly one file,
> directory entries point to valid inodes, allocated inodes have at
> least one link, the number of links in an inode exactly matches the
> number of directory entries pointing to that inode, directories form a
> single tree without loops, file sizes are consistent with the number
> of allocated blocks, unallocated data/inode blocks are in the
> relevant free bitmaps, and redundant superblock data is consistent. It
> can't verify data.

Well said. I'd add that people who insist on ZFS having an fsck are missing the whole point of the ZFS transactional model and copy-on-write design.

Fsck can only fix known file system inconsistencies in file system structures. Because there is no atomicity of operations in UFS and other file systems, it is possible that when you remove a file, your system can crash between removing the directory entry and freeing the inode or blocks. This is expected with UFS; that's why there is fsck, to verify that no such thing happened.

In ZFS, on the other hand, there are no inconsistencies like that. If all blocks match their checksums and you find a directory loop or something like that, it is a bug in ZFS, not an expected inconsistency. It should be fixed in ZFS and not worked around with some fsck for ZFS.

--
Pawel Jakub Dawidek       http://www.wheelsystems.com
FreeBSD committer         http://www.FreeBSD.org
Am I Evil? Yes, I Am!     http://yomoli.com
Re: [zfs-discuss] about btrfs and zfs
On Oct 18, 2011, at 6:35 PM, David Magda wrote:
> If we've found one bad disk, what are our options?

Live with it or replace it :-)
 -- richard

--
ZFS and performance consulting
http://www.RichardElling.com
VMworld Copenhagen, October 17-20
OpenStorage Summit, San Jose, CA, October 24-27
LISA '11, Boston, MA, December 4-9
[zfs-discuss] repair [was: about btrfs and zfs]
On Oct 18, 2011, at 5:21 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Tim Cook
>>
>> I had and have redundant storage, it has *NEVER* automatically fixed
>> it. You're the first person I've heard that has had it automatically fix it.
>
> That's probably just because it's normal and expected behavior to
> automatically fix it - I always have redundancy, and every cksum error I
> ever find is always automatically fixed. I never tell anyone here because
> it's normal and expected.

Yes, and in fact the automated tests for ZFS developers intentionally corrupt data so that the repair code can be tested. Also, the same checksum code is used to calculate the checksum when writing and reading.

> If you have redundancy, and cksum errors, and it's not automatically fixed,
> then you should report the bug.

For modern Solaris-based implementations, each checksum mismatch that is repaired reports the bitmap of the corrupted vs. expected data. Obviously, if the data cannot be repaired, you cannot know the expected data, so the error is reported without identification of the broken bits.

In the archives, you can find reports of recoverable and unrecoverable errors attributed to:
	1. ZFS software (rare, but a bug a few years ago mishandled a raidz case)
	2. SAN switch firmware
	3. "Hardware" RAID array firmware
	4. Power supplies
	5. RAM
	6. HBA
	7. PCI-X bus
	8. BIOS settings
	9. CPU and chipset errata

Personally, I've seen all of the above except #7, because PCI-X hardware is hard to find now.

If you consistently see unrecoverable data from a system that has protected data, then there may be an issue with a part of the system that is a single point of failure. Very, very, very few x86 systems are designed with no SPOF.
-- richard

--
ZFS and performance consulting
http://www.RichardElling.com
VMworld Copenhagen, October 17-20
OpenStorage Summit, San Jose, CA, October 24-27
LISA '11, Boston, MA, December 4-9
[zfs-discuss] Stream versions in Solaris 10.
I just tried sending from an oi151a system to a Solaris 10 backup server, and the server barfed with:

zfs_receive: stream is unsupported version 17

I can't find any documentation linking stream version to release, so does anyone know the Update 10 stream version?

--
Ian.
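One way to inspect the stream version on the sending side is a sketch like the following; zstreamdump ships with OpenSolaris-derived releases such as oi151a, and the dataset/snapshot names are examples:

  # Dump the send-stream header; the "version" field in the BEGIN record is
  # the stream format the receiver must understand.
  zfs send tank/fs@snap | zstreamdump | head -20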