Re: [zfs-discuss] zfs snapshot revert
On 07/ 9/10 01:29 PM, zfsnoob4 wrote:
> Hi, I have a question about snapshots. If I restore a file system based on
> some snapshot I took in the past, is it possible to revert back to before
> I restored? ie:
>
> zfs snapshot t...@yesterday
> mkdir /test/newfolder
> zfs rollback t...@yesterday
>
> So now newfolder is gone. But is there a way to take a snapshot so I can
> revert back to before I restored? ie, get newfolder back after the rollback?

You could clone the newer snapshot and promote the clone before rolling back. What particular problem are you trying to solve?

> Also, from what I understand, if I have two snapshots and revert to the
> older one, the newer one will be deleted. Is that correct?

Yes.

-- Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
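[Editor's sketch of the clone-and-promote approach Ian describes. The pool and dataset names (tank/test, tank/test-keep, tank/test-old) and snapshot names are hypothetical; the key subtlety is that `zfs promote` moves the origin snapshot and all older snapshots to the clone, so both states end up preserved:]

```shell
# Capture the current state (including newfolder) before touching anything:
zfs snapshot tank/test@now
zfs clone tank/test@now tank/test-keep

# Promote the clone: @yesterday and @now now belong to tank/test-keep,
# and tank/test becomes a clone of tank/test-keep@now.
zfs promote tank/test-keep

# Yesterday's state is now available without destroying the newer data:
zfs clone tank/test-keep@yesterday tank/test-old
```

Either filesystem can then be renamed into place with `zfs rename`, so nothing is irreversibly discarded the way a plain `zfs rollback` would.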
Re: [zfs-discuss] Hashing files rapidly on ZFS
On Fri, 2010-07-09 at 10:23 +1000, Peter Jeremy wrote:
> On 2010-Jul-09 06:46:54 +0800, Edward Ned Harvey wrote:
> > md5 is significantly slower (but surprisingly not much slower) and it's a
> > cryptographic hash. Probably not necessary for your needs.
>
> As someone else has pointed out, MD5 is no longer considered secure
> (neither is SHA-1). If you want cryptographic hashing, you should
> probably use SHA-256 for now and be prepared to migrate to SHA-3 once
> it is announced. Unfortunately, SHA-256 is significantly slower than
> MD5 (about 4 times on a P-4, about 3 times on a SPARC-IV) and no
> cryptographic hash is amenable to multi-threading. The new crypto
> instructions on some of Intel's recent offerings may help performance
> (and it's likely that they will help more with SHA-3).
>
> > And one more thing. No matter how strong your hash is, unless your hash
> > is just as big as your file, collisions happen. Don't assume data is the
> > same just because hash is the same, if you care about your data. Always
> > byte-level verify every block or file whose hash matches some other hash.
>
> In theory, collisions happen. In practice, given a cryptographic hash,
> if you can find two different blocks or files that produce the same
> output, please publicise it widely as you have broken that hash function.

Not necessarily. While you *should* publicize it widely, given all the possible data out there, it is theoretically possible for a collision to simply occur by chance -- like winning a lottery where everyone else holds a million tickets and you hold only one. Such an occurrence -- if isolated -- would not, IMO, constitute a "breaking" of the hash function.

- Garrett
Re: [zfs-discuss] Lost ZIL Device - FIXED
Greetings All,

I can't believe I didn't figure this out sooner. First of all, a big thank you to everyone who gave me advice and suggestions, especially Richard.

The problem was with the -d switch. When importing a pool, if you specify -d and a path, it ONLY looks there. So if I run:

# zpool import -d /var/zfs-log/ tank

it won't look for devices in /dev/dsk. Conversely, running without -d /var/zfs-log/ it won't find the log device. Here is the command that worked:

# zpool import -d /var/zfs-log -d /dev/dsk tank

And to make sure that this doesn't happen again (I have learned my lesson this time), I have ordered two small SSD drives to put in a mirrored config for the log device.

Thanks again to everyone, and now I will get some worry-free sleep :)

Andrew
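[Editor's note, for the archives: the behavior Andrew hit can be sketched as follows. Pool name and paths are his; the point is that any -d option replaces the default /dev/dsk search path rather than adding to it, and -d may be repeated:]

```shell
# Searches ONLY /var/zfs-log -- the pool's data disks under /dev/dsk
# are never examined, so the import cannot assemble the full pool:
zpool import -d /var/zfs-log tank

# Searches both directories; this is the form that worked:
zpool import -d /var/zfs-log -d /dev/dsk tank
```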
[zfs-discuss] zfs snapshot revert
Hi, I have a question about snapshots. If I restore a file system based on some snapshot I took in the past, is it possible to revert back to before I restored? ie:

zfs snapshot t...@yesterday
mkdir /test/newfolder
zfs rollback t...@yesterday

So now newfolder is gone. But is there a way to take a snapshot so I can revert back to before I restored? ie, get newfolder back after the rollback?

Also, from what I understand, if I have two snapshots and revert to the older one, the newer one will be deleted. Is that correct?

Thanks.
-- This message posted from opensolaris.org
Re: [zfs-discuss] Hashing files rapidly on ZFS
On 2010-Jul-09 06:46:54 +0800, Edward Ned Harvey wrote:
> md5 is significantly slower (but surprisingly not much slower) and it's a
> cryptographic hash. Probably not necessary for your needs.

As someone else has pointed out, MD5 is no longer considered secure (neither is SHA-1). If you want cryptographic hashing, you should probably use SHA-256 for now and be prepared to migrate to SHA-3 once it is announced. Unfortunately, SHA-256 is significantly slower than MD5 (about 4 times on a P-4, about 3 times on a SPARC-IV) and no cryptographic hash is amenable to multi-threading. The new crypto instructions on some of Intel's recent offerings may help performance (and it's likely that they will help more with SHA-3).

> And one more thing. No matter how strong your hash is, unless your hash is
> just as big as your file, collisions happen. Don't assume data is the same
> just because hash is the same, if you care about your data. Always
> byte-level verify every block or file whose hash matches some other hash.

In theory, collisions happen. In practice, given a cryptographic hash, if you can find two different blocks or files that produce the same output, please publicise it widely as you have broken that hash function.

-- Peter Jeremy
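[Editor's note: the MD5-vs-SHA-256 cost difference is easy to check on any system with coreutils. A rough sketch -- absolute times will vary by CPU, and userland md5sum/sha256sum may differ from an in-kernel implementation:]

```shell
# Create a 64 MB test file and time both digests over it.
dd if=/dev/zero of=/tmp/hashtest bs=1024 count=65536 2>/dev/null
time md5sum /tmp/hashtest
time sha256sum /tmp/hashtest
rm -f /tmp/hashtest
```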
[zfs-discuss] snapshot out of space
I am getting the following error message when trying to do a zfs snapshot:

r...@pluto# zfs snapshot datapool/m...@backup1
cannot create snapshot 'datapool/m...@backup1': out of space

r...@pluto# zpool list
NAME       SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
datapool   556G   110G    446G  19%  ONLINE  -
rpool      278G  12.5G    265G   4%  ONLINE  -

Any ideas?
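[Editor's note: the pool clearly has free space, so the failure is most likely a dataset-level limit -- a quota on the dataset (or a reservation on a sibling) can make a snapshot fail even though zpool list shows plenty of room, because snapshot space is charged against the dataset's quota. A hedged sketch of what to check; the dataset name datapool/mydata is hypothetical, standing in for the elided one above:]

```shell
# Per-dataset space accounting -- look for a quota that is nearly full:
zfs list -o name,used,avail,refer,quota,reservation -r datapool

# If the dataset has a tight quota, raise or clear it, then retry:
zfs set quota=none datapool/mydata
zfs snapshot datapool/mydata@backup1
```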
Re: [zfs-discuss] Kernel Panic on zpool clean
> I think it is quite likely to be possible to get readonly access to your
> data, but this requires modified ZFS binaries. What is your pool version?
> What build do you have installed on your system disk or available as
> LiveCD?

[Prompted by an off-list e-mail from Victor asking if I was still having problems]

Thanks for your reply, and apologies for not having replied here sooner - I was going to try something myself (which I'll explain shortly) but have been hampered by a flaky cdrom drive - something I won't have a chance to sort until the weekend.

In answer to your question, the installed system is running 2009.06 (b111b) and the LiveCD I've been using is b134.

The problem with the installed system crashing when I tried to run "zpool clean" I believe is being caused by http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6794136 which makes me think that the same command run from a later version should work fine. I haven't had any success doing this though, and I believe the reason is that several of the ZFS commands won't work if the hostid of the machine that last accessed the pool is different from the current system (and the pool is exported/faulted), as happens when using a LiveCD. Where I was getting errors about "storage2 does not exist", I found it was writing errors to the syslog that the pool "could not be loaded as it was last accessed by another system".

I tried to get round this using the DTrace hostid-changing script I mentioned in one of my earlier messages, but this seemed not to be able to fool system processes. I also tried exporting the pool from the installed system to see if that would help, but unfortunately it didn't. After having exported the pool, "zpool import" run on the installed system reported "The pool can be imported despite missing or damaged devices." However, when trying to import it (with or without -f), it refused to import it as "one or more devices is currently unavailable".

When booting the LiveCD after having exported the pool, it still gave errors about having been last accessed by another system. I couldn't spot any method of modifying the LiveCD image to have a particular hostid, so my plan has been to try installing b134 onto the system, setting the hostid under /etc, and seeing if things then behave in a more straightforward fashion - which I haven't managed yet due to the cdrom problems.

I also mentioned in one of my earlier e-mails that I was confused that the installed system mentioned an unreadable intent log but the LiveCD said the problem was corrupted metadata. This seems to be caused by the functions print_import_config and print_status_config having slightly different case statements, and not a difference in the pool itself.

Hopefully I'll be able to complete the reinstall soon and see if that fixes things or there's a deeper problem.

Thanks again for your help,

George
Re: [zfs-discuss] Legality and the future of zfs...
On Thu, 8 Jul 2010, Edward Ned Harvey wrote:
> apple servers contribute negative value to an infrastructure, I do know a
> lot of people who buy / have bought them. And I think that number would be
> higher, if Apple were shipping ZFS.

Yep. Provided it supported ZFS, a Mac Mini makes for a compelling SOHO server. The lack of ZFS is the main thing holding me back here...

-- Rich Teer, Publisher
Vinylphile Magazine
www.vinylphilemag.com
Re: [zfs-discuss] Should i enable Write-Cache ?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Philippe Schwarz
>
> 3Ware cards
>
> Any drawback (except that without BBU, i've got a pb in case of power
> loss) in enabling the WC with ZFS ?

If you don't have a BBU, and you care about your data, don't enable WriteBack. If you enable WriteBack without a BBU, you might as well just disable the ZIL instead - it's more effective, and just as dangerous. Actually, disabling the ZIL is probably faster *and* safer than running WriteBack without a BBU.

But if you're impressed with the performance of enabling WriteBack, you can still do better... The most effective thing you could possibly do is to disable the WriteBack and add an SSD log device. ZFS performs better in this configuration than with WriteBack - and in this situation, surprisingly, enabling WriteBack actually hurts performance slightly. The gain of an SSD log over WriteBack is comparable to the gain of WriteBack over a naked disk. So prepare yourself to be impressed one more time.

The performance with a disabled ZIL is, yet again, another impressive step, and that performance is unbeatable. There are situations when a disabled ZIL is actually a good configuration. If you don't know them, I'll suggest posting again to learn when it's appropriate to disable the ZIL.
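[Editor's note: the "add an SSD log device" step above is a one-liner. Pool and device names are hypothetical; mirroring the log is prudent on pool versions where losing an unmirrored log device can strand the pool:]

```shell
# Dedicated intent-log (slog) device on an SSD:
zpool add tank log c5t0d0

# Safer: a mirrored pair of log devices
zpool add tank log mirror c5t0d0 c5t1d0
```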
Re: [zfs-discuss] Legality and the future of zfs...
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Peter Taps
>
> As you may have heard, NetApp has a lawsuit against Sun in 2007 (and
> now carried over to Oracle) for patent infringement with the zfs file
>
> Given this, I am wondering what you think is the future of zfs as an
> open source project.

Others have already stated "Oracle won the case" in better detail than I could. So ZFS is safe in Solaris/OpenSolaris. But some other big names (Apple) have backed down from deploying ZFS, presumably due to threats, and some others (Coraid) are being sued anyway. This reduces the number of ZFS deployments in the world, so it's probably benefiting NetApp to keep the suit alive, even if they never collect a dollar. But surprisingly, I think it's also benefiting Oracle: the lack of ZFS competition certainly helps Oracle sell Sun hardware and Solaris support contracts.

As strongly as I feel that "apple enterprise" is an oxymoron, and that Apple servers contribute negative value to an infrastructure, I do know a lot of people who buy / have bought them. And I think that number would be higher if Apple were shipping ZFS.

So, IMHO: COW lawsuit: Good for NetApp. Good for Oracle and Solaris. Bad for ZFS like a rainy day is bad for baseball. And bad for everybody else.
Re: [zfs-discuss] zfs send, receive, compress, dedup
On 07/ 9/10 10:59 AM, Brandon High wrote:
> Personally, I've started organizing datasets in a hierarchy, setting the
> properties that I want for descendant datasets at a level where it will
> apply to everything that I want to get it. So if you have your source at
> tank/export/foo and your destination is tank2/export/foo, you'd want to
> set compression, etc at tank/export or tank2/export.

I believe that is the best way to go. I tend to set key attributes such as compress at the root and change them on filesystems where they are not appropriate. My backup server has compress and dedup on at the root, along with atime=off and readonly=on.

> If you're going to have a newer zfs version on the destination, be aware
> that you can't receive from datasets with a higher version than the
> destination.

Yes, one to watch for recovery! You can set the filesystem version explicitly, but then you lose the newer version's features.

-- Ian.
Re: [zfs-discuss] zfs send, receive, compress, dedup
On Thu, Jul 8, 2010 at 2:21 PM, Edward Ned Harvey wrote:
> Can I "zfs send" from the fileserver to the backupserver and expect it to
> be compressed and/or dedup'd upon receive? Does "zfs send" preserve the
> properties of the originating filesystem? Will the "zfs receive" clobber
> or ignore the compression / dedup options on the destination filesystem?

If you set a property on the dataset that you're sending, then it'll overwrite what's on the receiving end if you used 'zfs send -R' or 'zfs send -p'. You can get around this by sending only a snapshot without -R or -p, then using -I to get intermediate snapshots for every dataset.

If the destination dataset has the property set on it, then it'll be overwritten if you used 'zfs send -R' or 'zfs send -p'. If the source inherited the property from a parent and the destination inherits the property, then it won't be overwritten.

Personally, I've started organizing datasets in a hierarchy, setting the properties that I want for descendant datasets at a level where it will apply to everything that I want to get it. So if you have your source at tank/export/foo and your destination is tank2/export/foo, you'd want to set compression, etc at tank/export or tank2/export.

If you're going to have a newer zfs version on the destination, be aware that you can't receive from datasets with a higher version than the destination.

-B

-- Brandon High : bh...@freaks.com
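[Editor's sketch of the two send modes Brandon contrasts. Dataset names and the ssh backup host are hypothetical; set the desired compression/dedup on the receiving hierarchy beforehand, and it survives a plain snapshot send:]

```shell
# Receive-side properties (compression, dedup) are honored when sending
# a plain snapshot -- no -R, no -p:
zfs snapshot tank/export/foo@monday
zfs send tank/export/foo@monday | ssh backup zfs receive tank2/export/foo

# With -R (or -p), locally-set source properties ride along in the stream
# and overwrite whatever is configured on the destination:
zfs send -R tank/export/foo@monday | ssh backup zfs receive -d tank2
```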
Re: [zfs-discuss] SATA 6G controller for OSOL
Thanks! I just need the SATA part for the SSD serving as my L2ARC. I couldn't care less about PATA, and have no USB3 peripherals anyway. I'll let everyone know how it works!
Re: [zfs-discuss] Hashing files rapidly on ZFS
On Thu, 2010-07-08 at 18:46 -0400, Edward Ned Harvey wrote:
> > is there a way to compute very quickly some hash of a file in a zfs?
> > As I understand it, everything is signed in the filesystem, so I'm
> > wondering if I can avoid reading whole files with md5sum just to get a
> > unique hash. Seems very redundant to me :)
>
> Although zfs is calculating hashes of blocks, it doesn't correlate to
> hashes of files, for many reasons: [...]
>
> md5 is significantly slower (but surprisingly not much slower) and it's a
> cryptographic hash. Probably not necessary for your needs.
>
> And one more thing. No matter how strong your hash is, unless your hash is
> just as big as your file, collisions happen. Don't assume data is the same
> just because hash is the same, if you care about your data. Always
> byte-level verify every block or file whose hash matches some other hash.

MD5 hashing is not recommended for "cryptographically strong" hashing anymore. SHA-256 is the current recommendation I would make (the state of the art changes over time).

The caution about collisions happening is relevant, but with a suitably strong hash, the risk is close enough to zero that normal people don't care. By that, I mean that the chance of a collision within a 256-bit hash is something like 1/2^255. You're probably more likely to spontaneously combust (by an order of magnitude) than you are to have two files that "accidentally" (or even maliciously) reduce to the same hash. When the probability of the Sun going supernova in the next 30 seconds exceeds the probability of a cryptographic hash collision, I don't worry about the collision anymore. :-)

- Garrett
Re: [zfs-discuss] zfs send, receive, compress, dedup
On 07/ 9/10 09:21 AM, Edward Ned Harvey wrote:
> Suppose I have a fileserver, which may be zpool 10, 14, or 15. No
> compression, no dedup. Suppose I have a backupserver. I want to zfs send
> from the fileserver to the backupserver, and I want the backupserver to
> receive and store compressed and/or dedup'd. The backupserver can be a
> more recent version of zpool than the fileserver, but right now, whatever
> they are, they're the same as each other. And I think it's 10, and I think
> I have 15 available if I upgrade them. (Obviously I can get accurate
> details on the exact versions, if it matters.) I have created the
> destination filesystem with compression. And now my question ...
>
> Can I "zfs send" from the fileserver to the backupserver and expect it to
> be compressed and/or dedup'd upon receive? Does "zfs send" preserve the
> properties of the originating filesystem? Will the "zfs receive" clobber
> or ignore the compression / dedup options on the destination filesystem?

Yes, unless they are explicitly set on the source file system. I replicate to a build 134 box with dedup enabled from an older server.

-- Ian.
Re: [zfs-discuss] Should i enable Write-Cache ?
On Fri, 2010-07-09 at 00:23 +0200, Ragnar Sundblad wrote:
> On 8 jul 2010, at 17.23, Garrett D'Amore wrote:
> > You want the write cache enabled, for sure, with ZFS. ZFS will do the
> > right thing about ensuring write cache is flushed when needed.
>
> That is not for sure at all, it all depends on what "the right thing"
> is, which depends on the application and/or what other measures
> there are for redundancy.
>
> Many raid controllers will respond to a flush [to persistent storage]
> by doing nothing, since the data already is in its write cache buffer
> which (hopefully) is battery backed or something.
> They will typically NOT flush the data to the drives and issue
> flush commands to the disks, and wait for that to finish, before
> responding.

I consider such behavior "bad", at least if not explicitly enabled, and probably a bug.

> This means, that if your raid controller dies on you, some of your most
> recently written data will be gone. It may very well be vital metadata,
> and in the zfs world it may be data or metadata from several of the
> latest txgs, that is gone. Potentially, it could leave your file system
> corrupt, or arbitrary pieces of your data could be lost.

Yes, failure to flush data when the OS requests it is *evil*. Enabling a write cache should only be done if you believe your hardware is not buggy in this respect. I guess I was thinking more along the lines of simple JBOD controllers when I answered the question the first time - I failed to take into account RAID controller behavior.

- Garrett

> Maybe that is acceptable in the application, or maybe this is compensated
> for with other means, as multiple raid controllers with the data mirrored
> over all of them, which could reduce the risks by a lot, but you have to
> evaluate each separate case by itself.
>
> /ragge
Re: [zfs-discuss] Hashing files rapidly on ZFS
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Bertrand Augereau
>
> is there a way to compute very quickly some hash of a file in a zfs?
> As I understand it, everything is signed in the filesystem, so I'm
> wondering if I can avoid reading whole files with md5sum just to get a
> unique hash. Seems very redundant to me :)

If I understand right: although zfs is calculating hashes of blocks, they don't correlate to hashes of files, for several reasons.

Block boundaries are not well aligned with file boundaries. A single block might encapsulate several small files, or a file might start in the middle of a block, span several more, and end in the middle of another block.

Blocks also contain non-file information.

Hashing blocks will be even more irrelevant to file hashes if you have compression enabled, because I think it hashes the compressed data, not the uncompressed data.

If you want to create file hashes out of block hashes, it's even more convoluted, because you can't generally compute hash(A+B) based on hash(A) and hash(B) - although perhaps you can for some algorithms.

My advice would be: computing hashes is not very expensive, as long as you're just computing hashes for data that you were going to handle for other reasons anyway. Specifically, I benchmarked several hash algorithms a while back, and found... I forget which... either adler32 or crc is almost zero-time to compute - that is, the cpu was very lightly utilized while hashing blocks at maximum disk speed.

The weakness of adler32 and crc is that they're not cryptographic hashes. If a malicious person wants to corrupt a data stream while preserving the hash, it's not difficult to do. adler32 and crc are good as long as you can safely assume no malice.

md5 is significantly slower (but surprisingly not much slower) and it's a cryptographic hash. Probably not necessary for your needs.

And one more thing. No matter how strong your hash is, unless your hash is just as big as your file, collisions happen. Don't assume data is the same just because the hash is the same, if you care about your data. Always byte-level verify every block or file whose hash matches some other hash.
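[Editor's note: the "hash first, then byte-level verify" rule above looks like this in practice. A sketch with throwaway files; the cheap hash comparison screens candidates, and cmp does the authoritative byte-by-byte check:]

```shell
printf 'hello world\n' > /tmp/a
printf 'hello world\n' > /tmp/b

h1=$(sha256sum /tmp/a | awk '{print $1}')
h2=$(sha256sum /tmp/b | awk '{print $1}')

# Only trust equality after the byte-level comparison also passes:
if [ "$h1" = "$h2" ] && cmp -s /tmp/a /tmp/b; then
    echo "files are identical"
else
    echo "files differ"
fi
rm -f /tmp/a /tmp/b
```

Here both tests pass and it prints "files are identical".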
Re: [zfs-discuss] zpool spares listed twice, as both AVAIL and FAULTED
Hi Ryan,

What events lead up to this situation? I've seen a similar problem when a system upgrade caused the controller numbers of the spares to change. In that case, the workaround was to export the pool, correct the spare device names, and import the pool. I'm not sure if this workaround applies to your case. Do you know if the spare device names changed?

My hunch is that you could export this pool, reconnect the spare devices, and reimport the pool, but I'd rather test this on my own pool first, and I can't reproduce this problem. I don't think you can remove the spares by their GUID. At least, I couldn't.

You said you tried to remove the spares with zpool remove. Did you try this command:

# zpool remove idgsun02 c0t6d0

Or this command, which I don't think would work, but you would get a message like this:

# zpool remove idgsun02 c0t6d0s0
cannot remove c0t6d0s0: no such device in pool

Thanks,

Cindy

On 07/08/10 14:55, Ryan Schwartz wrote:
> I've got an x4500 with a zpool in a weird state. The two spares are listed
> twice each, once as AVAIL, and once as FAULTED.
>
> [IDGSUN02:/opt/src] root# zpool status
>   pool: idgsun02
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         idgsun02    ONLINE       0     0     0
>           raidz2    ONLINE       0     0     0
>             c0t1d0  ONLINE       0     0     0
>             c0t5d0  ONLINE       0     0     0
>             c1t1d0  ONLINE       0     0     0
>             c1t5d0  ONLINE       0     0     0
>             c6t1d0  ONLINE       0     0     0
>             c6t5d0  ONLINE       0     0     0
>             c7t1d0  ONLINE       0     0     0
>             c7t5d0  ONLINE       0     0     0
>             c4t1d0  ONLINE       0     0     0
>             c4t5d0  ONLINE       0     0     0
>           raidz2    ONLINE       0     0     0
>             c0t0d0  ONLINE       0     0     0
>             c0t4d0  ONLINE       0     0     0
>             c1t0d0  ONLINE       0     0     0
>             c1t4d0  ONLINE       0     0     0
>             c6t0d0  ONLINE       0     0     0
>             c6t4d0  ONLINE       0     0     0
>             c7t0d0  ONLINE       0     0     0
>             c7t4d0  ONLINE       0     0     0
>             c4t0d0  ONLINE       0     0     0
>             c4t4d0  ONLINE       0     0     0
>         spares
>           c0t6d0    AVAIL
>           c5t5d0    AVAIL
>           c0t6d0    FAULTED   corrupted data
>           c5t5d0    FAULTED   corrupted data
>
> errors: No known data errors
>
> I've been working with Sun support, but wanted to toss it out to the
> community as well.
>
> I found and compiled the zpconfig util from here:
> http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSGuids
> and found that the spares in question have different GUIDs, but the same
> vdev path:
>
> spares[0]
>     type='disk'
>     guid=7826011125406290675
>     path='/dev/dsk/c0t6d0s0'
>     devid='id1,s...@sata_hitachi_hua7210s__gtf000pahjmlxf/a'
>     phys_path='/p...@0,0/pci1022,7...@1/pci11ab,1...@1/d...@6,0:a'
>     whole_disk=1
>     is_spare=1
>     stats: state=7 aux=0 ...
> spares[1]
>     type='disk'
>     guid=870554111467930413
>     path='/dev/dsk/c5t5d0s0'
>     devid='id1,s...@sata_hitachi_hua7210s__gtf000pahj5nlf/a'
>     phys_path='/p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@5,0:a'
>     whole_disk=1
>     is_spare=1
>     stats: state=7 aux=0 ...
> spares[2]
>     type='disk'
>     guid=5486341412008712208
>     path='/dev/dsk/c0t6d0s0'
>     devid='id1,s...@sata_hitachi_hua7210s__gtf000pahjmlxf/a'
>     phys_path='/p...@0,0/pci1022,7...@1/pci11ab,1...@1/d...@6,0:a'
>     whole_disk=1
>     stats: state=4 aux=2 ...
> spares[3]
>     type='disk'
>     guid=16971039974506843020
>     path='/dev/dsk/c5t5d0s0'
>     devid='id1,s...@sata_hitachi_hua7210s__gtf000pahj5nlf/a'
>     phys_path='/p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@5,0:a'
>     whole_disk=1
>     stats: state=4 aux=2 ...
>
> I've exported/imported the pool and the spares are still listed as above.
> The regular 'zpool remove idgsun02 c0t6d0s0' (and c5t5d0s0) also do not
> work, but do not produce any error output either. This sounds remarkably
> like http://bugs.opensolaris.org/bugdatabase/view_bug.do;?bug_id=6893472
> but as I said, the export/import does not correct the issue. Any
> suggestions on how I can remove the "FAULTED" spares from the pool? Can I
> use the GUID with zpool remove somehow?
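[Editor's note: Cindy says she couldn't remove spares by GUID on her build, but on some later builds 'zpool remove' is reported to accept a vdev GUID in place of a device name, which is exactly what a phantom-spare entry needs since both entries share one path. Whether a given build accepts this is an assumption; the GUIDs below are the FAULTED entries' GUIDs from the zpconfig output above:]

```shell
# Hypothetical: pass the FAULTED spares' vdev GUIDs instead of device names.
zpool remove idgsun02 5486341412008712208
zpool remove idgsun02 16971039974506843020
```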
Re: [zfs-discuss] Should i enable Write-Cache ?
On 8 jul 2010, at 17.23, Garrett D'Amore wrote:
> You want the write cache enabled, for sure, with ZFS. ZFS will do the
> right thing about ensuring write cache is flushed when needed.

That is not for sure at all; it all depends on what "the right thing" is, which depends on the application and/or what other measures there are for redundancy.

Many raid controllers will respond to a flush [to persistent storage] by doing nothing, since the data already is in its write cache buffer, which (hopefully) is battery backed or something. They will typically NOT flush the data to the drives, issue flush commands to the disks, and wait for that to finish before responding.

This means that if your raid controller dies on you, some of your most recently written data will be gone. It may very well be vital metadata, and in the zfs world it may be data or metadata from several of the latest txgs that is gone. Potentially, it could leave your file system corrupt, or arbitrary pieces of your data could be lost.

Maybe that is acceptable in the application, or maybe this is compensated for with other means, such as multiple raid controllers with the data mirrored over all of them, which could reduce the risks by a lot - but you have to evaluate each separate case by itself.

/ragge

> For the case of a single JBOD, I don't find it surprising that UFS beats
> ZFS. ZFS is designed for more complex configurations, and provides much
> better data integrity guarantees than UFS (so critical data is written
> to the disk more than once, and in areas of the drive that are not
> adjacent, to improve the chances of recovery in the event of a localized
> media failure). That said, you could easily accelerate the write
> performance of ZFS on that single JBOD by adding a small SSD log device.
> (4GB would be enough. :-)
>
> - Garrett
>
> On Thu, 2010-07-08 at 15:10 +0200, Philippe Schwarz wrote:
> > Hi,
> >
> > With dual-Xeon, 4GB of Ram (will be 8GB in a couple of weeks), two PCI-X
> > 3Ware cards, 7 Sata disks (750G & 1T) over FreeBSD 8.0 (but I think it's
> > OS independent), I made some tests.
> >
> > The disks are exported as JBOD, but I tried enabling/disabling
> > write-cache.
> >
> > I tried with UFS and ZFS on the same disk and the difference is
> > overwhelming.
> >
> > With a 1GB file (greater than the ZFS cache?):
> >
> > With Writecache disabled:
> > UFS: time cp /mnt/ufs/rnd /mnt/ufs/rnd2  ->  real 2m58.073s
> > ZFS: time cp /zfs/rnd /zfs/rnd2          ->  real 4m33.726s
> >
> > On the same card with WCache enabled:
> > UFS: time cp /mnt/ufs/rnd /mnt/ufs/rnd2  ->  real 0m31.406s
> > ZFS: time cp /zfs/rnd /zfs/rnd2          ->  real 1m0.199s
> >
> > So, despite the fact that ZFS can be twice as slow as UFS, it is clear
> > that Write-Cache has to be enabled on the controller.
> >
> > Any drawback (except that without BBU, I've got a pb in case of power
> > loss) in enabling the WC with ZFS?
Re: [zfs-discuss] Should i enable Write-Cache ?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Le 08/07/2010 18:52, Freddie Cash a écrit : > On Thu, Jul 8, 2010 at 6:10 AM, Philippe Schwarz wrote: >> With dual-Xeon, 4GB of Ram (will be 8GB in a couple of weeks), two PCI-X >> 3Ware cards 7 Sata disks (750G & 1T) over FreeBSD 8.0 (But i think it's >> OS independant), i made some tests. OK, thanks for all the answers: - - Test if controllers/disks honor the cache-flush command - - Buy an SSD for both L2ARC & ZIL - - Use single-disk arrays instead of JBOD OK for both the SSD & arrays, but how can I know whether my controller + disks are lying about flushing their cache when asked to? It reminds me of a thread here related to this ""feature"" ;-) Thanks. Best regards. - -- Lycée Maximilien Perret, Alfortville -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkw2R5AACgkQlhqCFkbqHRbSOwCggYR29QsrWGGN1JvWuweDT4cH NPEAnjsi+nzemThEWdCvtwn8ZZ37Zq0b =86UV -----END PGP SIGNATURE----- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
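[Editorial note: one rough way to answer Philippe's question is to measure synchronous-write IOPS. A drive that really commits each flush to the platter is bounded by rotation speed (roughly 120 sync writes/sec at 7200 rpm), so seeing thousands of synchronous writes per second through a bare disk suggests a cache is acknowledging flushes early. A minimal sketch, assuming GNU dd (`oflag=dsync`); the target path F is a placeholder and should point at a file on the controller under test:]

```shell
# Count synchronous 1 KB writes per second. A number far above what one
# platter revolution per write allows (~120/s at 7200 rpm) means a cache
# between you and the disk is absorbing the flushes.
F=${F:-/tmp/syncwrite.test}   # assumed path; use a file on the suspect controller
N=100
START=$(date +%s)
i=0
while [ "$i" -lt "$N" ]; do
    # oflag=dsync forces each write to be synchronous (GNU dd)
    dd if=/dev/zero of="$F" bs=1024 count=1 oflag=dsync conv=notrunc 2>/dev/null
    i=$((i + 1))
done
END=$(date +%s)
ELAPSED=$((END - START))
[ "$ELAPSED" -eq 0 ] && ELAPSED=1   # avoid divide-by-zero on very fast runs
echo "$((N / ELAPSED)) synchronous writes/sec"
rm -f "$F"
```

On Solaris the GNU dd may be installed as gdd; any small program doing O_DSYNC writes in a loop works the same way.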
[zfs-discuss] zfs send, receive, compress, dedup
Suppose I have a fileserver, which may be zpool version 10, 14, or 15. No compression, no dedup. Suppose I have a backupserver. I want to zfs send from the fileserver to the backupserver, and I want the backupserver to receive and store the data compressed and/or dedup'd. The backupserver can be a more recent zpool version than the fileserver, but right now, whatever they are, they're the same as each other. I think it's 10, and I think I have 15 available if I upgrade them. (Obviously I can get accurate details on the exact versions, if it matters.) I have created the destination filesystem with compression. And now my question ... Can I "zfs send" from the fileserver to the backupserver and expect it to be compressed and/or dedup'd upon receive? Does "zfs send" preserve the properties of the originating filesystem? Will the "zfs receive" clobber or ignore the compression / dedup options on the destination filesystem? I'm doing it now, so I guess the question here is sort of academic. ;-) I'll know the answer tomorrow, either way. ;-) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
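[Editorial note: the behavior to expect, worth verifying on the versions in question, is that a plain `zfs send` carries data blocks rather than their on-disk encoding, so the receiving side rewrites them with whatever compression (and dedup, where the pool version supports it) is in effect on the destination. Sending with -p or -R, by contrast, transmits the source properties and can override local settings. A sketch with placeholder pool/dataset/host names:]

```shell
# On the backup server: make the destination compress whatever it receives.
zfs create -o compression=on backuppool/backups
# zfs set dedup=on backuppool/backups    # requires pool version >= 21

# On the fileserver: a plain send does NOT carry properties, so received
# blocks are written using the destination's compression/dedup settings.
zfs snapshot tank/data@monday
zfs send tank/data@monday | ssh backupserver zfs receive backuppool/backups/data

# By contrast, -R (or -p) replicates source properties and would bring
# compression=off along with the stream:
# zfs send -R tank/data@monday | ssh backupserver zfs receive -d backuppool/backups
```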
Re: [zfs-discuss] Lost ZIL Device
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Andrew Kener > > According to 'zpool upgrade' my pool versions are 22. All pools > were upgraded several months ago, including the one in question. Here > is what I get when I try to import: > > fileserver ~ # zpool import 9013303135438223804 > cannot import 'tank': pool may be in use from other system, it was last > accessed by fileserver (hostid: 0x406155) on Tue Jul 6 10:46:13 2010 > use '-f' to import anyway > > fileserver ~ # zpool import -f 9013303135438223804 > cannot import 'tank': one or more devices is currently unavailable > Destroy and re-create the pool from > a backup source. That's a major bummer. And I don't think it's caused by the log device, because as you say, zpool 22 > 19, which means your system supports log device removal. I think "zpool status" will show you which devices are "currently unavailable", right? I know "zpool status" will show the status of vdevs in a healthy pool; I just don't know if the same is true for faulted pools. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
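[Editorial note: if the unavailable device does turn out to be the log, the import options that later zpool builds added for exactly this case may help. A sketch, assuming a build whose zpool supports -m and -F; check the zpool(1M) man page on the system first, since availability varies by build:]

```shell
# List importable pools with per-vdev status; this works on exported
# pools and shows which device is reported unavailable:
zpool import

# -m (where supported) allows the import to proceed with a missing log
# device; -F attempts recovery by discarding the last few transactions.
zpool import -m 9013303135438223804
zpool import -F 9013303135438223804
# zpool import -fFm 9013303135438223804   # last resort, forced
```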
[zfs-discuss] zpool spares listed twice, as both AVAIL and FAULTED
I've got an x4500 with a zpool in a weird state. The two spares are listed twice each, once as AVAIL, and once as FAULTED.

[IDGSUN02:/opt/src] root# zpool status
  pool: idgsun02
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        idgsun02    ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c0t1d0  ONLINE       0     0     0
            c0t5d0  ONLINE       0     0     0
            c1t1d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
          raidz2    ONLINE       0     0     0
            c0t0d0  ONLINE       0     0     0
            c0t4d0  ONLINE       0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t0d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
        spares
          c0t6d0    AVAIL
          c5t5d0    AVAIL
          c0t6d0    FAULTED   corrupted data
          c5t5d0    FAULTED   corrupted data

errors: No known data errors

I've been working with Sun support, but wanted to toss it out to the community as well. I found and compiled the zpconfig util from here: http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSGuids and found that the spares in question have different GUIDs, but the same vdev path:

spares[0]
    type='disk'
    guid=7826011125406290675
    path='/dev/dsk/c0t6d0s0'
    devid='id1,s...@sata_hitachi_hua7210s__gtf000pahjmlxf/a'
    phys_path='/p...@0,0/pci1022,7...@1/pci11ab,1...@1/d...@6,0:a'
    whole_disk=1
    is_spare=1
    stats: state=7 aux=0 ...
spares[1]
    type='disk'
    guid=870554111467930413
    path='/dev/dsk/c5t5d0s0'
    devid='id1,s...@sata_hitachi_hua7210s__gtf000pahj5nlf/a'
    phys_path='/p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@5,0:a'
    whole_disk=1
    is_spare=1
    stats: state=7 aux=0 ...
spares[2]
    type='disk'
    guid=5486341412008712208
    path='/dev/dsk/c0t6d0s0'
    devid='id1,s...@sata_hitachi_hua7210s__gtf000pahjmlxf/a'
    phys_path='/p...@0,0/pci1022,7...@1/pci11ab,1...@1/d...@6,0:a'
    whole_disk=1
    stats: state=4 aux=2 ...
spares[3]
    type='disk'
    guid=16971039974506843020
    path='/dev/dsk/c5t5d0s0'
    devid='id1,s...@sata_hitachi_hua7210s__gtf000pahj5nlf/a'
    phys_path='/p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@5,0:a'
    whole_disk=1
    stats: state=4 aux=2 ...

I've exported/imported the pool and the spares are still listed as above. The regular 'zpool remove idgsun02 c0t6d0s0' (and c5t5d0s0) commands also do not work, but do not produce any error output either. This sounds remarkably like http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6893472 but as I said, the export/import does not correct the issue. Any suggestions on how I can remove the "FAULTED" spares from the pool? Can I use the GUID with zpool remove somehow? -- Ryan Schwartz, UNIX Systems Administrator, VitalSource Technologies, Inc. - An Ingram Digital Company Mob: (608) 886-3513 ▪ ryan.schwa...@ingramdigital.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
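[Editorial note: on the last question, some zpool builds do accept a vdev GUID anywhere a device name is expected, and that is the only way to disambiguate two entries that share a path. A sketch using the GUIDs of the FAULTED entries from the zpconfig output above; untested, and behavior varies by build:]

```shell
# Try removing the faulted spare entries by GUID rather than by path:
zpool remove idgsun02 5486341412008712208    # faulted c0t6d0s0 entry
zpool remove idgsun02 16971039974506843020   # faulted c5t5d0s0 entry
zpool status idgsun02                        # verify only the AVAIL spares remain
```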
Re: [zfs-discuss] SATA 6G controller for OSOL
On Wed, Jul 7, 2010 at 3:52 PM, valrh...@gmail.com wrote: > Does anyone have an opinion, or some experience? Thanks in advance! Both of them support AHCI, so they should work for SATA 6G. The USB 3.0 and PATA ports may not work, however. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Help with Faulted Zpool Call for Help(Cross post)
On Jul 7, 2010, at 3:27 AM, Richard Elling wrote: > > On Jul 6, 2010, at 10:02 AM, Sam Fourman Jr. wrote: > >> Hello list, >> >> I posted this a few days ago on the opensolaris-discuss@ list. >> I am posting here, because there may be too much noise on the other lists. >> >> I have been without this zfs set for a week now. >> My main concern at this point is: is it even possible to recover this zpool? >> >> How does the metadata work? What tool could I use to rebuild the >> corrupted parts, >> or even find out which parts are corrupted? >> >> >> Most, but not all, of these disks were Hitachi Retail 1TB disks. >> >> >> I have a fileserver that runs FreeBSD 8.1 (zfs v14). >> After a power outage, I am unable to import my zpool named Network. >> My pool is made up of 6 1TB disks configured in raidz; >> there is ~1.9TB of actual data on this pool. >> >> I have loaded OpenSolaris snv_134 on a separate boot disk, >> in hopes of recovering my zpool. >> >> On OpenSolaris 134, I am not able to import my zpool; >> almost everything I try gives me cannot import 'Network': I/O error >> >> I have done quite a bit of searching, and I found that import -fFX >> Network should work; >> however, after ~20 hours this hard-locks OpenSolaris (though it does >> still return a ping). >> >> Here is a list of commands that I have run on OpenSolaris: >> >> http://www.puffybsd.com/zfsv14.txt > > You ran "zdb -l /dev/dsk/c7t5d0s2" which is not the same as > "zdb -l /dev/dsk/c7t5d0p0" because of the default partitioning. > In Solaris c*t*d*p* are fdisk partitions and c*t*d*s* are SMI or > EFI slices. This is why labels 2 & 3 could not be found, and it can be > part of the problem to start with. This is unlikely, as the raidz vdev is reported as ONLINE, though you can use the attached script to verify this. (attachment: raidz_open2.d) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
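[Editorial note: Richard's point can be checked directly by dumping the labels from both device nodes and comparing. A sketch using the device name from the thread; ZFS writes four label copies, two at the start and two at the end of the vdev, which is why a view with the wrong end offset loses labels 2 and 3:]

```shell
# Slice 2 and the fdisk partition start and end at different offsets,
# so labels 2 and 3 (stored at the END of the vdev) can be missing from
# one view while present in the other.
zdb -l /dev/dsk/c7t5d0s2   # what was actually run: the SMI/EFI slice view
zdb -l /dev/dsk/c7t5d0p0   # the whole fdisk partition view
```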
Re: [zfs-discuss] Should i enable Write-Cache ?
On Thu, Jul 8, 2010 at 6:10 AM, Philippe Schwarz wrote: > With dual-Xeon, 4GB of Ram (will be 8GB in a couple of weeks), two PCI-X > 3Ware cards 7 Sata disks (750G & 1T) over FreeBSD 8.0 (But i think it's > OS independant), i made some tests. > > The disks are exported as JBOD, but i tried enabling/disabling write-cache . Don't use JBOD, as that disables a lot of the advanced features of the 3Ware controllers. Instead, create "Single Disk" arrays for each disk. That way, you get all the management features of the card and all its advanced features (StorSave policies, command queuing, separate read/write cache policies, SMART monitoring, access to the onboard cache, etc.), with only the RAID functionality disabled. You should get better performance using Single Disk over JBOD (which basically turns your expensive RAID controller into a "dumb" SATA controller). -- Freddie Cash fjwc...@gmail.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
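[Editorial note: if 3ware's tw_cli utility is installed, converting each disk to a Single Disk unit can be done from the command line. The controller and port numbers below are placeholders, and exact syntax varies by firmware version, so list your own hardware first and consult the tw_cli manual:]

```shell
tw_cli show                          # find controller IDs (c0, c1, ...)
tw_cli /c0 show                      # list ports and units on controller 0
tw_cli /c0 add type=single disk=0    # turn port 0 into a Single Disk unit
tw_cli /c0/u0 show                   # confirm the new unit and its policies
```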
Re: [zfs-discuss] ZFS recovery tools
On Jul 8, 2010, at 11:15 AM, R. Eulenberg wrote: > > pstack 'pgrep zdb'/1 > > and system answers: > > pstack: cannot examine pgrep zdb/1: no such process or core file use ` instead of ' in the above command. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
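[Editorial note: the quote character matters because the shell treats the two very differently. A minimal demonstration, using `echo 12345` as a stand-in for a live `pgrep zdb` since no zdb process exists here:]

```shell
# 'pgrep zdb' in single quotes is just a 9-character string; pstack then
# looks for a process literally named "pgrep zdb/1" and fails.
LITERAL='pgrep zdb'

# Backquotes (or the modern $(...) form) RUN the command and substitute
# its output, which is what pstack needs:
PID=$(echo 12345)          # stands in for: PID=`pgrep zdb`
ARG="$PID/1"               # i.e. pstack `pgrep zdb`/1  ->  pstack 12345/1
echo "$ARG"
```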
Re: [zfs-discuss] Should i enable Write-Cache ?
You want the write cache enabled, for sure, with ZFS. ZFS will do the right thing about ensuring write cache is flushed when needed. For the case of a single JBOD, I don't find it surprising that UFS beats ZFS. ZFS is designed for more complex configurations, and provides much better data integrity guarantees than UFS (so critical data is written to the disk more than once, and in areas of the drive that are not adjacent to improve the chances of recovery in the event of a localized media failure). That said, you could easily accelerate the write performance of ZFS on that single JBOD by adding a small SSD log device. (4GB would be enough. :-) - Garrett On Thu, 2010-07-08 at 15:10 +0200, Philippe Schwarz wrote: > Hi, > > With dual-Xeon, 4GB of Ram (will be 8GB in a couple of weeks), two PCI-X > 3Ware cards 7 Sata disks (750G & 1T) over FreeBSD 8.0 (But i think it's > OS independant), i made some tests. > > The disks are exported as JBOD, but i tried enabling/disabling write-cache . > > I tried with UFS and ZFS on the same disk and the difference is > overwhelming. > > With a 1GB file (greater than the ZFS cache ?): > > With Writecache disabled > UFS > time cp /mnt/ufs/rnd /mnt/ufs/rnd2 > real 2m58.073s > ZFS > time cp /zfs/rnd /zfs/rnd2 > real 4m33.726s > > On the same card with WCache enabled > UFS > time cp /mnt/ufs/rnd /mnt/ufs/rnd2 > real 0m31.406s > ZFS > time cp /zfs/rnd /zfs/rnd2 > real 1m0.199s > > So, despite the fact that ZFS can be twice slower than UFS, it is clear > that Write-Cache have to be enabled on the controller. > > Any drawback (except that without BBU, i've got a pb in case of power > loss) in enabling the WC with ZFS ? > > > Thanks. > Best regards. > > ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
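[Editorial note: adding the log device Garrett suggests is a one-liner. The device name below is a placeholder; note that removing a log device again requires pool version 19 or later:]

```shell
# Attach a small SSD as a separate ZIL (slog); c2t0d0 is a placeholder.
zpool add tank log c2t0d0
# Or mirror the log if two SSDs are available:
# zpool add tank log mirror c2t0d0 c3t0d0
zpool status tank    # the device appears under a "logs" section
```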
[zfs-discuss] Should i enable Write-Cache ?
Hi, With a dual-Xeon, 4GB of RAM (will be 8GB in a couple of weeks), two PCI-X 3Ware cards and 7 SATA disks (750G & 1T) over FreeBSD 8.0 (but I think it's OS independent), I made some tests. The disks are exported as JBOD, but I tried enabling/disabling the write cache. I tried with UFS and ZFS on the same disk and the difference is overwhelming. With a 1GB file (greater than the ZFS cache?): With write cache disabled: UFS time cp /mnt/ufs/rnd /mnt/ufs/rnd2 real 2m58.073s ZFS time cp /zfs/rnd /zfs/rnd2 real 4m33.726s On the same card with write cache enabled: UFS time cp /mnt/ufs/rnd /mnt/ufs/rnd2 real 0m31.406s ZFS time cp /zfs/rnd /zfs/rnd2 real 1m0.199s So, despite the fact that ZFS can be half as fast as UFS, it is clear that the write cache has to be enabled on the controller. Any drawback (except that without a BBU I've got a problem in case of power loss) in enabling the WC with ZFS? Thanks. Best regards. -- Lycée Maximilien Perret, Alfortville ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Legality and the future of zfs...
"Garrett D'Amore" wrote: > This situation is why I'm coming to believe that there is almost no case > for software patents. (I still think there may be a few exceptions -- > the RSA patent being a good example where there was significant enough > innovation to possibly justify a patent). The sad fact is that when a RSA never has been a patent in Europe as it was files after the decription was published ;-) > company feels it can't compete on the merits of innovation or cost, it > seeks to litigate the competition. What NetApp *should* be doing is > figuring out how to out-innovate us, undercut us ("us" collectively > meaning Oracle and all other ZFS users), or find other ways to compete > effectively. They can't, so they resort to litigation. Sounds like a Patent claims in this area are usually a result of missing competitive products at the side of the plaintiff. Jörg -- EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de(uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS recovery tools
Hi, today I ran zdb -e -bcsvL tank1 and zdb -eC tank1 again, and no reply or prompt comes back from the system. Then I opened a new console and ran pstack 'pgrep zdb'/1 and the system answers: pstack: cannot examine pgrep zdb/1: no such process or core file What's that? And why don't I get back a prompt and an answer from the first two commands? Regards ron -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss