Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
... I have identified the culprit: the Western Digital drive WD2002FYPS-01U1B0. It's not clear if they can fix it in firmware, but Western Digital is replacing my drives.

Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Feb 17 04:45:10 thecratewall scsi: [ID 365881 kern.info] /p...@0,0/pci15ad,7...@15/pci1000,3...@0 (mpt_sas0):
Feb 17 04:45:10 thecratewall Log info 0x31110630 received for target 13.
Feb 17 04:45:10 thecratewall scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

Hi, do you have the disks connected in SATA1 or SATA2 mode? With the WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0, these timeouts are to be expected if the disk is in SATA2 mode. We got rid of these timeouts after forcing the disks into SATA1 mode with jumpers; now they only appear when a disk is having real issues and needs to be replaced. Yours Markus Kovero ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?
Tonmaus wrote: Hi David, why not just use a couple of SAS expanders? Regards, Tonmaus

I would go this route (SAS expanders on an 8-port HBA). E.g., http://www.supermicro.com/products/accessories/mobilerack/CSE-M28E2.cfm is a great internal bay setup if you want a non-Supermicro case, at $250 or thereabouts. You can get two of these, plus a pair of nice LSI HBAs. As an added bonus, it's easy to put an SSD into one or more of the bays with no additional rewiring needed. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800)
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
Hi, do you have disks connected in sata1/2? With WD2003FYYS-01T8B0/WD20EADS-00S2B0/WD1001FALS-00J7B1/WD1002FBYS-01A6B0 these timeouts are to be expected if disk is in SATA2 mode, No, why are they to be expected with SATA2 mode? Is the defect specific to the SATA2 circuitry? I guess it could be a temporary workaround provided they would eventually fix the problem in firmware, but I'm getting new drives, so I guess I can't complain :-) -- Maurice Volaski, maurice.vola...@einstein.yu.edu Computing Support, Rose F. Kennedy Center Albert Einstein College of Medicine of Yeshiva University ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Workaround for mpt timeouts in snv_127
No, why are they to be expected with SATA2 mode? Is the defect specific to the SATA2 circuitry? I guess it could be a temporary workaround provided they would eventually fix the problem in firmware, but I'm getting new drives, so I guess I can't complain :-)

Probably your new disks will do this too. I really don't know what's wrong with the flaky SATA2 mode, but I'd be quite sure forcing SATA1 would fix your issues. The performance drop is not even noticeable, so it's worth a try. Yours Markus Kovero
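For anyone wanting to verify which link speed a drive actually negotiated before (or after) resorting to the jumpers, the smartctl identify banner reports both the supported and the current SATA speed. A rough sketch; the device path and the sample banner line below are made up for illustration:

```shell
# On a live system you would run something like:
#   smartctl -i /dev/rdsk/c8t13d0
# and look at the "SATA Version is:" line. Here we just parse a
# captured sample of that line (values are hypothetical):
banner="SATA Version is:  SATA 2.6, 3.0 Gb/s (current: 1.5 Gb/s)"

# Extract the currently negotiated speed; 1.5 Gb/s would mean the
# drive is running in SATA1 mode, i.e. the jumper took effect.
echo "$banner" | grep -o 'current: [0-9.]* Gb/s'
```

If the "current" speed still reads 3.0 Gb/s after jumpering, the drive ignored the jumper and the workaround is not in effect.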
Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?
Just as an FYI, not all drives like sas expanders. As an example, we had a lot of trouble with Indilinx MLC based SSDs. The systems had Adaptec 52445 controllers and Chenbro SAS expanders. In the end we had to remove the SAS expanders and put a 2nd 52445 in each system to get them to work properly. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about backup and mirrored pools
David Dyer-Bennet d...@dd-b.net writes: [...] Am I way wrong on this, and further I'm curious if it would make more versatile use of the space if I were to put the mirrored pairs into one big pool containing 3 mirrored pairs (6 discs)

Well, my own thinking doesn't consider that adequate for my own data; which is not identical to thinking you're actually wrong, of course. Issues I see include: flood, fire, foes, bugs, user error. rm -rf / will destroy your data just as well on the mirror as on a single disk, as will hacker break-ins. OS and driver bugs can corrupt both sides of the mirror. And burning your house down, or flooding it perhaps (depending on where your server is; mine's in the basement, so if we flood, it gets wet), will destroy your data.

Yeah, there is all that, but in my case the data is also on the other machines, in bits and pieces over several machines. Is yours only on the zfs server? An example here might be my photo/music collection. It resides on a Windows XP Pro machine where I have the tools I use to tinker with it. I back it up to the zfs server, but what is on the server is always a bit older (between backups) than the current version on the Windows machine. So if the zfs server were to be beamed up to another solar system, I'd still have the latest greatest version on the Windows machine. Anyway, losing my entire house and several machines would leave me with much bigger problems than losing my photo/music collection. The shelter I'd be living in wouldn't have room for several machines. Nor would I have money to spend on such luxuries.

I make and keep off-site backups, formerly on optical media, moving towards external disk drives.

I'd be interested to hear about that. If you think it's OT here, feel free to write me direct (reader AT newsguy DOT com). I have something like a terabyte of data on the server. Man, I'd really hate to try to back that up to optical media. Even an external hard drive would be a major time sink. Just backing up the 80 or so GB of photos/music to optical media is a nasty undertaking. I quit doing that when it grew past 15 GB or so. [...] thanks for the other (snipped) input.
Re: [zfs-discuss] about backup and mirrored pools
Bob Friesenhahn bfrie...@simple.dallas.tx.us writes: On Fri, 9 Apr 2010, Harry Putnam wrote: Am I way wrong on this, and further I'm curious if it would make more versatile use of the space if I were to put the mirrored pairs into one big pool containing 3 mirrored pairs (6 discs) Besides more versatile use of the space, you would get 3X the performance. That speed up surprises me. Can you explain briefly how that works? If that is a bit much to ask here, maybe a pointer to specific documentation? Luckily, since you are using mirrors, you can easily migrate disks from your existing extra pools to the coalesced pool. Just make sure to scrub first in order to have confidence that there won't be data loss. Would that be in pairs, or can you show a generalized outline of how that would be done. Again, a documentation pointer would be good too. Is it wise to have rpool as the migration destination, making it the only pool? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] about backup and mirrored pools
Richard Jahnel rich...@ellipseinc.com writes: [...] Perhaps mirrored sets with daily snapshots and a knowledge of how to mount snapshots as clones so that you can pull a copy of that file you deleted 3 days ago. :)

I've been doing that with the default auto-snapshot setup, but hadn't noticed a need to mount a snapshot as a clone in order to be able to pull a file. I've just sought out the most recent snapshot containing the file and copied it. I'm not sure now if I've retrieved lost files, or just got an older version that way... but I think both. Can you explain a bit why you need to mount a snapshot as a clone?
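For anyone following along: copying straight out of the hidden .zfs/snapshot directory is usually enough; a clone is only needed when you want a writable copy of the snapshot. A rough sketch of both approaches; the pool, dataset, and snapshot names are made up:

```shell
# Read-only retrieval: every snapshot is browsable under .zfs/snapshot
cp /tank/home/.zfs/snapshot/zfs-auto-snap_daily-2010-04-07/lost.txt ~/

# Writable copy: clone the snapshot into a new dataset, grab the
# file, then destroy the clone when done
zfs clone tank/home@zfs-auto-snap_daily-2010-04-07 tank/restore
cp /tank/restore/lost.txt ~/
zfs destroy tank/restore
```

The clone is cheap (copy-on-write, no data is duplicated), so this costs little even for large datasets.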
Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?
As far as I have read, that problem has been reported to be a compatibility problem between the Adaptec controller and the expander chipset, e.g. the LSI SASx line, which is also on the mentioned Chenbro expander. There is no problem with the LSI 106x chipset and SAS expanders that I know of. For people sceptical about expanders: quite a few of the Areca cards actually have expander chips on board. I don't know about the 1680 specifically, though. Cheers, Tonmaus -- This message posted from opensolaris.org
Re: [zfs-discuss] about backup and mirrored pools
On Sat, 10 Apr 2010, Harry Putnam wrote: Am I way wrong on this, and further I'm curious if it would make more versatile use of the space if I were to put the mirrored pairs into one big pool containing 3 mirrored pairs (6 discs) Besides more versatile use of the space, you would get 3X the performance. That speed up surprises me. Can you explain briefly how that works? It is quite simple. With three sets of mirrors in the pool, the data is distributed across the three mirrors. There is 3X the hardware available for each write. There is more than 3X the hardware for each read since either side of a mirror may be used to satisfy a read. If that is a bit much to ask here, maybe a pointer to specific documentation? That would be too much to ask since it is clear that you did not spend more than a few minutes reading the documentation. Spend some more minutes. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
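As a concrete illustration of the layout Bob describes, a single pool striped across three mirrored pairs would be created along these lines (the pool name and device names here are hypothetical):

```shell
# One pool, three top-level mirror vdevs. ZFS stripes writes across
# all three vdevs, so each write has roughly 3X the hardware behind
# it, and reads can be served from either side of each mirror.
zpool create tank \
    mirror c1t0d0 c1t1d0 \
    mirror c1t2d0 c1t3d0 \
    mirror c1t4d0 c1t5d0
```

Adding a fourth pair later is just `zpool add tank mirror <diskA> <diskB>`; existing data is not rebalanced, but new writes spread across all vdevs.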
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Due to recent experiences, and discussion on this list, my colleague and I performed some tests:

Using Solaris 10, fully upgraded. (zpool 15 is latest, which does not have the log device removal that was introduced in zpool 19.) In any way possible, you lose an unmirrored log device, and the OS will crash, and the whole zpool is permanently gone, even after reboots.

Using OpenSolaris, upgraded to latest, which includes zpool version 22. (Or was it 23? I forget now.) Anyway, it's >=19 so it has log device removal.

1. Created a pool, with unmirrored log device.
2. Started benchmark of sync writes, verified the log device getting heavily used.
3. Yank out the log device.

Behavior was good. The pool became "degraded," which is to say, it started using the primary storage for the ZIL; performance presumably degraded, but the system remained operational and error free. I was able to restore perfect health by "zpool remove" of the failed log device, and "zpool add" of a new log device.

Next:

1. Created a pool, with unmirrored log device.
2. Started benchmark of sync writes, verified the log device getting heavily used.
3. Yank out both power cords.
4. While the system is down, also remove the log device. (OOoohhh, that's harsh.)

I created a situation where an unmirrored log device is known to have unplayed records, there is an ungraceful shutdown, *and* the device disappears. That's the absolute worst-case scenario possible, other than the whole building burning down. Anyway, the system behaved as well as it possibly could. During boot, the faulted pool did not come up, but the OS came up fine. My "zpool status" showed this:

# zpool status
  pool: junkpool
 state: FAULTED
status: An intent log record could not be read.
        Waiting for administrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    FAULTED      0     0     0  bad intent log
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
        logs
          c8t3d0    UNAVAIL      0     0     0  cannot open

I know the unplayed log device data is lost forever. So I clear the error, remove the faulted log device, and acknowledge that I have lost the last few seconds of written data, up to the system crash:

# zpool clear junkpool
# zpool status
  pool: junkpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist
        for the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    DEGRADED     0     0     0
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
        logs
          c8t3d0    UNAVAIL      0     0     0  cannot open

# zpool remove junkpool c8t3d0
# zpool status junkpool
  pool: junkpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    ONLINE       0     0     0
          c8t4d0    ONLINE       0     0     0
          c8t5d0    ONLINE       0     0     0
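The obvious mitigation for the Solaris 10 failure mode (pool permanently gone when an unmirrored slog dies) is to mirror the log device from the start. A sketch, with hypothetical device names:

```shell
# Attach the slog as a mirrored pair, so losing one SSD leaves the
# intent log (and any unplayed sync-write records) intact
zpool add junkpool log mirror c8t3d0 c8t6d0
```

With the log mirrored, the earlier "yank one device" test should leave the pool fully healthy rather than degraded.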
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Sat, 10 Apr 2010, Edward Ned Harvey wrote: Using Solaris 10, fully upgraded. (zpool 15 is latest, which does not have log device removal that was introduced in zpool 19.) In any way possible, you lose an unmirrored log device, and the OS will crash, and the whole zpool is permanently gone, even after reboots.

Is anyone willing to share what zfs version will be included with Solaris 10 U9? Will graceful intent log removal be included? Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
[zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?
Neil or somebody? Actual ZFS developers? Taking feedback here? ;-) While I was putting my poor little server through cruel and unusual punishment as described in my post a moment ago, I noticed something unexpected:

I expected that while I'm stressing my log device with infinite sync writes, my primary storage devices would also be busy(ish). Not really busy, but not totally idle either. Since the primary storage is a stripe of spindle mirrors, obviously it can handle much more sustainable throughput than the individual log device, but the log device can respond with smaller latency. What I noticed was this: For several seconds, *only* the log device is busy. Then it stops, and for maybe 0.5 secs *only* the primary storage disks are busy. Repeat, recycle.

I expected to see the log device busy nonstop, and the spindle disks blinking lightly. As long as the spindle disks are idle, why wait for a larger TXG to be built? Why not flush out smaller TXGs as long as the disks are idle? But worse yet... during the 1-second (or 0.5-second) interval that the spindle disks are busy, why stop the log device? (Presumably also stopping my application that's doing all the writing.)

This means, if I'm doing zillions of *tiny* sync writes, I will get the best performance with the dedicated log device present. But if I'm doing large sync writes, I would actually get better performance without the log device at all. Or else... add just as many log devices as I have primary storage devices. Which seems kind of crazy.
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Sat, Apr 10, 2010 at 10:08 AM, Edward Ned Harvey solar...@nedharvey.com wrote: Due to recent experiences, and discussion on this list, my colleague and I performed some tests: [... full test report and zpool status output snipped; see Edward's original message above ...]

Awesome! Thanks for letting us know the results of your tests Ed, that's extremely helpful. I was actually interested in grabbing some of the cheaper Intel SSDs for home use, but didn't want to waste my money if they weren't going to handle the various failure modes gracefully. --Tim
Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?
On Sat, 10 Apr 2010, Edward Ned Harvey wrote: For several seconds, *only* the log device is busy. Then it stops, and for maybe 0.5 secs *only* the primary storage disks are busy. Repeat, recycle. I expected to see the log device busy nonstop. And the spindle disks blinking lightly. As long as the spindle disks are idle, why wait for a larger TXG to be built? Why not flush out smaller TXG’s as long as the disks are idle? But worse yet … During the 1-second (or 0.5 second) that the spindle disks are busy, why stop the log device? (Presumably also stopping my application that’s doing all the writing.) What you are seeing should be expected and is good. The intent log allows synchronous writes to be turned into lazy ordinary writes (like async writes) in the next TXG cycle. Since the intent log is on a SSD, the pressure is taken off of the primary disks to serve that function so you will not see so many IOPS to the primary disks. This means, if I’m doing zillions of *tiny* sync writes, I will get the best performance with the dedicated log device present. But if I’m doing large sync writes, I would actually get better performance without the log device at all. Or else … add just as many log devices as I have primary storage devices. Which seems kind of crazy. If this is really a problem for you, then you should be able to somewhat resolve it by placing a smaller cap on the maximum size of a TXG. Then the system will write more often. However, the maximum synchronous bulk write rate will still be limited by the bandwidth of your intent log devices. Huge synchronous bulk writes are pretty rare since usually the bottleneck is elsewhere, such as the ethernet. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
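If anyone wants to experiment with the smaller-TXG-cap idea Bob mentions, the txg timing was a kernel tunable on builds of that era, settable from /etc/system. The tunable name and sensible value below are my recollection, not gospel; verify against your build's source before relying on it:

```
* /etc/system -- push a txg at least every 5 seconds instead of the
* default interval (tunable name/default may differ between builds;
* treat this as an assumption to verify)
set zfs:zfs_txg_timeout = 5
```

A reboot is needed for /etc/system changes to take effect.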
Re: [zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Thanks for the testing. So FINALLY, with version 19, ZFS demonstrates production-ready status in my book. How long is it going to take Solaris to catch up?
Re: [zfs-discuss] ZFS RaidZ recommendation
On Fri, Apr 9, 2010 at 9:31 PM, Eric D. Mudama edmud...@bounceswoosh.orgwrote: On Sat, Apr 10 at 7:22, Daniel Carosone wrote: On Fri, Apr 09, 2010 at 10:21:08AM -0700, Eric Andersen wrote: If I could find a reasonable backup method that avoided external enclosures altogether, I would take that route. I'm tending to like bare drives. If you have the chassis space, there are 5-in-3 bays that don't need extra drive carriers, they just slot a bare 3.5 drive. For e.g. http://www.newegg.com/Product/Product.aspx?Item=N82E16817994077 I have a few of the 3-in-2 versions of that same enclosure from the same manufacturer, and they installed in about 2 minutes in my tower case. The 5-in-3 doesn't have grooves in the sides like their 3-in-2 does, so some cases may not accept the 5-in-3 if your case has tabs to support devices like DVD drives in the 5.25 slots. The grooves are clearly visible in this picture: http://www.newegg.com/Product/Product.aspx?Item=N82E16817994075 The doors are a bit light perhaps, but it works just fine for my needs and holds drives securely. The small fans are a bit noisy, but since the box lives in the basement I don't really care. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org At that price, for the 5-in-3 at least, I'd go with supermicro. For $20 more, you get what appears to be a far more solid enclosure. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Secure delete?
Hi all

Is it possible to securely delete a file from a zfs dataset/zpool once it's been snapshotted, meaning delete (and perhaps overwrite) all copies of this file?

Best regards roy -- Roy Sigurd Karlsbakk (+47) 97542685 r...@karlsbakk.net http://blogg.karlsbakk.net/ -- [Norwegian sig, translated:] In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] about backup and mirrored pools
Bob Friesenhahn bfrie...@simple.dallas.tx.us writes: On Sat, 10 Apr 2010, Harry Putnam wrote: Am I way wrong on this, and further I'm curious if it would make more versatile use of the space if I were to put the mirrored pairs into one big pool containing 3 mirrored pairs (6 discs) Besides more versatile use of the space, you would get 3X the performance. That speed up surprises me. Can you explain briefly how that works? It is quite simple. With three sets of mirrors in the pool, the data is distributed across the three mirrors. There is 3X the hardware available for each write. There is more than 3X the hardware for each read since either side of a mirror may be used to satisfy a read.

Thanks for your comments. It's always so much easier when someone explains in plain English.

If that is a bit much to ask here, maybe a pointer to specific documentation? That would be too much to ask since it is clear that you did not spend more than a few minutes reading the documentation. Spend some more minutes.

Well now, that would only be true if you meant very recently. In fact I have spent quite hefty amounts of time reading zfs and opensolaris documentation. Including hefty tracts of the `Bible' (that isn't even close) book. The trouble is that I understand about 1/10 of it, and that 1/10 soon departs my pea brain when I don't use it daily. You may be used to dealing with folks who have a basic understanding of OSs and programming, rc files and etc. Probably some amount of formal higher education too. I've come on that kind of info in a very haphazard, hard scrabble way. My education stopped in 9th grade; I went to work at that point, out in the west of our country. Started industrial work a few yrs later (1965) and eventually became a field construction boilermaker and worked around many of the midwest, western and west coast states on refineries, powerplants, steelmills, and other big industrial plants (now retired).
None of that was very conducive to the finer points of Operating systems and programming or admin chores. So whatever I've learned about that kind of stuff in the last 10 yrs or so has gaping holes that you could drive 18 wheelers through. I've never found that reading documentation I barely understand is a very good way of actually learning something. On the other hand, hearing about it from current practitioners and heavy experimentation is (for me) the best way to learn about something. With some of that going, then the documentation may start to be a lot more meaningful, once I have something to hang it on. Egad... sorry about the rant but sometimes it seems to just need to be said. Thanks again for what you have contributed, not just to this thread but your many hundreds, maybe thousands of messages of help to others as well. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Secure delete?
No — not until all snapshots referencing the file in question are removed. The simplest way to understand snapshots is to consider them as references: any file-system object (say, a file or block) is only removed when its reference count drops to zero. Regards, Andrey

On Sat, Apr 10, 2010 at 10:20 PM, Roy Sigurd Karlsbakk r...@karlsbakk.net wrote: Is it possible to securely delete a file from a zfs dataset/zpool once it's been snapshotted, meaning delete (and perhaps overwrite) all copies of this file? [...]
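Andrey's reference-count model can be sketched in a few lines of toy code. This models only the idea — it is not how ZFS tracks block references internally:

```python
# Toy model: each block keeps a count of how many holders (the live
# filesystem, plus each snapshot) reference it; its space is only
# reclaimable when the count reaches zero.
class Block:
    def __init__(self):
        self.refs = 0

    def hold(self):
        self.refs += 1

    def release(self):
        self.refs -= 1
        return self.refs == 0  # True -> block may actually be freed

blk = Block()
blk.hold()              # live filesystem references the file's block
blk.hold()              # a snapshot also references it

freed = blk.release()   # "rm file" in the live filesystem
print(freed)            # False: the snapshot still holds the data
freed = blk.release()   # destroy the snapshot
print(freed)            # True: now the block is reclaimable
```

This is why an rm alone can never securely delete a snapshotted file: the delete only drops one reference, and the data stays on disk as long as any snapshot holds another.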
Re: [zfs-discuss] Secure delete?
Roy Sigurd Karlsbakk r...@karlsbakk.net wrote: I guess that's the way I thought it was. Perhaps it would be nice to add such a feature? If something gets stuck in a truckload of snapshots, say a 40GB file in the root fs, it'd be nice to just rm --killemall largefile

Let us first assume the simple case where the file is not part of any snapshot. For a secure delete, a file needs to be overwritten in place, and this cannot be done on a FS that is always COW. The secure deletion of the data would be something that happens before the file is actually unlinked (e.g. by rm). This secure deletion would need to open the file in a non-COW mode. Jörg -- EMail: jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin j...@cs.tu-berlin.de (uni) joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?
On Sat, Apr 10, 2010 at 11:50:05AM -0500, Bob Friesenhahn wrote: Huge synchronous bulk writes are pretty rare since usually the bottleneck is elsewhere, such as the ethernet.

Also, large writes can go straight to the pool, and the zil only logs the intent to commit those blocks (ie, link them into the zfs data structure). I don't recall what the threshold for this is, but I think it's one of those Evil Tunables. -- Dan.
Re: [zfs-discuss] Areca ARC-1680 on OpenSolaris 2009.06?
Any hints as to where you read that? I'm working on another system design with LSI controllers and being able to use SAS expanders would be a big help. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS RaidZ recommendation
On Sat, Apr 10, 2010 at 12:56:04PM -0500, Tim Cook wrote: At that price, for the 5-in-3 at least, I'd go with Supermicro. For $20 more, you get what appears to be a far more solid enclosure.

My intent with that link was only to show an example, not make a recommendation; I picked the easiest search result, and I'm glad others have experience with it, because I don't. There are plenty of other choices. Note also, the Supermicro ones are not trayless; the example was specifically of a trayless model. Supermicro may be good for permanent drives, but the trayless option is convenient for backup bays, where you have 2 or more sets of drives that rotate through the bays in the backup cycle. Getting extra trays can be irritating, and at least some trays make handling and storing the drives outside their slots rather cumbersome (odd corners and edges and stacking difficulties). Having bays that take bare drives is also great for recovering data from disks taken from other machines. Bare drives are also most easily interchangeable between racks from different makers - say, if a better trayless model became available between the purchase times of different machines. -- Dan.
Re: [zfs-discuss] Create 1 pool from 3 exising pools in mirror configuration
On Sat, Apr 10, 2010 at 02:51:45PM -0500, Harry Putnam wrote: [Note: This discussion started in another thread, Subject: about backup and mirrored pools, but the subject has been significantly changed, so I started a new thread.] Bob Friesenhahn bfrie...@simple.dallas.tx.us writes: Luckily, since you are using mirrors, you can easily migrate disks from your existing extra pools to the coalesced pool. Just make sure to scrub first in order to have confidence that there won't be data loss.

You know, when I saw those words, I worried someone, somewhere would interpret them incorrectly. The migration he's referring to is of disks, not of contents. The contents you'd have to migrate first (say with send|recv), before destroying the emptied pool and adding the disks to the pool you want to expand, as a new vdev. There's an implicit requirement here for free space (or staging space elsewhere) to enable the move. Note, also, rpool can't have multiple vdevs, so the best you could combine currently is z2 and z3.

I'm getting a little (read: horribly) confused about how to go about doing something like creating a single zpool out of 3 sets of currently mirrored disks that, for each pair, constitute a zpool themselves.

This would be something like a zpool merge operation, which does not exist.

In the case I'm describing I guess rpool will be the only pool when it's completed. Is that a sensible thing to do?

No, as above. You might consider new disks for a new rpool (say, SSD with some zil or l2arc) and reusing the current disks for data if they're the same as the other data disks.

Or would it make more sense to leave rpool alone and make a single zpool of the other two mirrored pairs?

Yep. -- Dan.
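Dan's two-step migration (copy the contents, then re-add the freed disks) might look roughly like the following. The dataset path and device names are hypothetical, and you'd want current snapshots, a scrub, and verification before destroying anything:

```shell
# 1. Replicate everything from pool z3 into the pool being kept (z2)
zfs snapshot -r z3@migrate
zfs send -R z3@migrate | zfs recv -d z2/from-z3

# 2. After verifying the copy, destroy the now-emptied pool...
zpool destroy z3

# 3. ...and add its mirror pair to z2 as a new top-level vdev
zpool add z2 mirror c2t0d0 c2t1d0
```

Note there is no undo for step 3: once a vdev is added, it cannot be removed from the pool on these releases, which is exactly why the merge has to go through send|recv rather than any direct pool operation.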
Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?
On 04/10/10 09:28, Edward Ned Harvey wrote:
> Neil or somebody? Actual ZFS developers? Taking feedback here? ;-)
> While I was putting my poor little server through cruel and unusual punishment as described in my post a moment ago, I noticed something unexpected: I expected that while I'm stressing my log device with infinite sync writes, my primary storage devices would also be busy(ish). Not really busy, but not totally idle either. Since the primary storage is a stripe of spindle mirrors, obviously it can handle much more sustainable throughput than the individual log device, but the log device can respond with smaller latency. What I noticed was this: for several seconds, **only** the log device is busy. Then it stops, and for maybe 0.5 sec **only** the primary storage disks are busy. Repeat, recycle.

These are the txgs getting pushed out.

> I expected to see the log device busy nonstop, and the spindle disks blinking lightly. As long as the spindle disks are idle, why wait for a larger TXG to be built? Why not flush out smaller TXGs as long as the disks are idle?

Sometimes it's more efficient to batch up requests: fewer blocks are written. As you mentioned, you weren't stressing the system heavily. ZFS will perform differently when under pressure; it will shorten the time between txgs if the data arrives more quickly.

> But worse yet ... during the 1-second (or 0.5-second) window that the spindle disks are busy, why stop the log device? (Presumably also stopping my application that's doing all the writing.)

Yes, this has been observed by many people. There are two sides to this problem, related to the CPU and I/O used while pushing a txg:

6806882 need a less brutal I/O scheduler
6881015 ZFS write activity prevents other threads from running in a timely manner

The CPU side (6881015) was fixed relatively recently, in snv_129.
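For anyone who wants to watch this alternation on their own system, per-device statistics make it visible directly. A minimal sketch ('tank' is a placeholder pool name):

```shell
# Report per-vdev I/O at 1-second intervals. Under sustained sync-write
# load you should see the log device busy for several seconds, then a
# short burst on the mirror disks as the txg is pushed out.
zpool iostat -v tank 1

# iostat(1M) on the underlying devices shows the same pattern, with
# service times and %busy per disk:
iostat -xn 1
```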
> But if I'm doing large sync writes, I would actually get better performance without the log device at all. Or else ... add just as many log devices as I have primary storage devices. Which seems kind of crazy.

Yes, you're right: there are times when it's better to bypass the slog and use the pool disks, which can deliver better bandwidth. The algorithm for where and what the ZIL writes has become quite complex:

- There was another change recently to bypass the slog if 1MB had been sent to it and 2MB were waiting to be sent.
- There's a new property, logbias, which when set to "throughput" directs the ZIL to send all of its writes to the main pool devices, thus freeing the slog for more latency-sensitive work (ideal for database data files).
- If synchronous writes are large (over 32K) and block aligned, then the blocks are written directly to the pool and only a small record is written to the log. Later, when the txg commits, the blocks are just linked into the txg. However, this processing is not done if there are any slogs, because I found it didn't perform as well. That probably ought to be re-evaluated.
- There are further tweaks being suggested which might make it to a ZIL near you soon.

Neil.
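The logbias property mentioned above is set per dataset. A sketch of how it might be applied to a database layout ('tank/db' is a hypothetical dataset name):

```shell
# Send this dataset's ZIL writes to the main pool devices, keeping the
# slog free for latency-sensitive work (e.g. the database's redo log,
# which stays at the default logbias=latency).
zfs set logbias=throughput tank/db

# Verify the setting:
zfs get logbias tank/db
```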
Re: [zfs-discuss] Sync Write - ZIL log performance - Feedback for ZFS developers?
On 04/10/10 14:55, Daniel Carosone wrote:
> On Sat, Apr 10, 2010 at 11:50:05AM -0500, Bob Friesenhahn wrote:
>> Huge synchronous bulk writes are pretty rare since usually the bottleneck is elsewhere, such as the ethernet.
> Also, large writes can go straight to the pool, and the zil only logs the intent to commit those blocks (i.e., link them into the zfs data structure). I don't recall what the threshold for this is, but I think it's one of those Evil Tunables.

This is zfs_immediate_write_sz, which is 32K. However, this currently only happens if you don't have any slogs. If logbias is set to throughput, then all writes go straight to the pool regardless of zfs_immediate_write_sz.

Neil.
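For the curious, this tunable can be inspected on a live OpenSolaris system; treat changing it as an Evil Tunable, for experimentation only. A sketch (run as root; the value shown is an assumption):

```shell
# Read the current value from the running kernel:
echo zfs_immediate_write_sz/D | mdb -k

# To change it persistently, add a line like this to /etc/system
# (value in bytes; takes effect after reboot):
#   set zfs:zfs_immediate_write_sz = 65536
```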
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
On Sun, 11 Apr 2010, Daniel Carosone wrote:
> The migration he's referring to is of disks, not of contents. The contents you'd have to migrate first (say with send|recv), before destroying the emptied pool and adding the disks to the pool you want to expand, as a new vdev. There's an implicit requirement here for free space (or staging space elsewhere) to enable the move.

Since he is already using mirrors, he already has enough free space: he can move one disk from each mirror to the main pool (which, unfortunately, can't be the boot 'rpool' pool), send the data, and then move the second disks from the pools which are being removed. The main risk here is that there is only single redundancy for a while.

> No, as above. You might consider new disks for a new rpool (say, ssd with some zil or l2arc) and reusing the current disks for data if they're the same as the other data disks.

It is nice to have a tiny disk for the root pool. An alternative is to create a small partition on two of the boot disks for use as the root pool, and use the remainder ('hog') partition for the main data pool. It is usually most convenient in the long run for the root pool to be physically separate from the data storage, though.

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
On Sat, 10 Apr 2010, Bob Friesenhahn wrote:
> Since he is already using mirrors, he already has enough free space: he can move one disk from each mirror to the main pool (which, unfortunately, can't be the boot 'rpool' pool), send the data, and then move the second disks from the pools which are being removed. The main risk here is that there is only single redundancy for a while.

I should mention that since this is a bit risky, it would be wise for Harry to post the procedure and commands he plans to use so that other eyes can verify that a correct result will be obtained. There is more risk of configuring the pool incorrectly (e.g. failing to end up with mirror vdevs) than there is of losing data.

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
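One easy way for those other eyes (or Harry himself) to check the result, sketched with the pool name from this thread:

```shell
# After each step, confirm the pool layout before destroying anything;
# once the attaches complete, every top-level vdev should show as 'mirror'
# with no plain single-disk vdevs left over.
zpool status z2

# Scrub and re-check for errors before removing the old pools:
zpool scrub z2
zpool status -v z2
```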
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
On Sat, Apr 10, 2010 at 06:20:54PM -0500, Bob Friesenhahn wrote:
> Since he is already using mirrors, he already has enough free space: he can move one disk from each mirror to the main pool (which, unfortunately, can't be the boot 'rpool' pool), send the data, and then move the second disks from the pools which are being removed.

Ah, right you are. D'oh.

> The main risk here is that there is only single redundancy for a while.

You mean a single copy, no redundancy, but otherwise yes. Perhaps that's why I hadn't noticed this scheme; if it was a subconscious oversight, I'd rather consider and eliminate it consciously.

For Harry's benefit, the recipe we're talking about here is roughly as follows. Your pools z2 and z3 we will merge into z2; diskx and disky are the current members of z3.

Break the z3 mirror:
 # zpool detach z3 diskx

Add a new vdev to z2:
 # zpool add -f z2 diskx

(The -f may be necessary, since you're adding a vdev with a different redundancy profile to the existing vdev.)

Replicate the z3 data into z2:
 # zfs snapshot -r z3@move
 # zfs create z2/z3
 # zfs send -R z3@move | zfs recv -d z2/z3

Free up the second z3 disk and attach it as a mirror:
 # zpool destroy z3
 # zpool attach z2 diskx disky

Again, the commands are approximate, to illustrate the steps; in particular you might choose a different replication structure.

-- Dan.
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
Daniel Carosone d...@geek.com.au writes:

Thanks for the input.. very helpful.

[...]

> No, as above. You might consider new disks for a new rpool (say, ssd with some zil or l2arc) and reusing the current disks for data if they're the same as the other data disks.

Would you mind expanding the abbrevs: ssd, zil, l2arc?
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
On Sat, 10 Apr 2010, Harry Putnam wrote:
> Would you mind expanding the abbrevs: ssd, zil, l2arc?

SSD = Solid State Device
ZIL = ZFS Intent Log (log of pending synchronous writes)
L2ARC = Level 2 Adaptive Replacement Cache

Bob
-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
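To tie this back to Dan's suggestion: SSDs typically enter a pool as dedicated log (ZIL) or cache (L2ARC) devices. A sketch with hypothetical device names:

```shell
# Add an SSD as a dedicated ZIL device (slog) for pool 'tank':
zpool add tank log c2t0d0

# Add another SSD as an L2ARC read cache:
zpool add tank cache c2t1d0

# The pool now lists separate 'logs' and 'cache' sections:
zpool status tank
```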
Re: [zfs-discuss] Create 1 pool from 3 existing pools in mirror configuration
On 04/11/10 11:55 AM, Harry Putnam wrote:
> Would you mind expanding the abbrevs: ssd, zil, l2arc?

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

-- Ian.