Re: [zfs-discuss] ZFS Restripe
I notice you use the word "volume", which really isn't accurate or appropriate here. If all of these VDEVs are part of the same pool, which as I recall you said they are, then writes are striped across all of them (with a bias toward the emptier, i.e. less full, VDEVs).

You probably want to "zfs send" the oldest dataset (ZFS terminology for a file system) into a new dataset. That oldest dataset was most likely created when there were only 2 top-level VDEVs. If you have multiple datasets created when you had only 2 VDEVs, then send/receive them all (serially, one after the other). If you have room for the snapshots too, then send all of it, and delete the source dataset when done. I think this will achieve what you want.

You may want to get a bit more specific: choose from the oldest datasets, THEN find the smallest of those oldest datasets and send/receive it first. That way, the send/receive completes in less time, and when you delete the source dataset you've created more free space on the entire pool, but without the risk of a single dataset exceeding your 10 TiB of workspace. ZFS's copy-on-write nature really wants no less than 20% free, because you never update data in place; a new copy is always written to disk.

You might want to consider turning on compression on your new datasets too, especially if you have free CPU cycles to spare. I don't know how compressible your data is, but if it's fairly compressible, say lots of text, then you might get some added benefit when you copy the old data into the new datasets. Saving more space, then deleting the source dataset, should help your pool have more free space, and thus influence your writes toward better I/O balancing when you do the next (and the next) dataset copies.

HTH.

On Tue, Aug 3, 2010 at 22:48, Eduardo Bragatto wrote:
> On Aug 3, 2010, at 10:08 PM, Khyron wrote:
>
>> Long answer: Not without rewriting the previously written data. Data
>> is being striped over all of the top level VDEVs, or at least it should
>> be. But there is no way, at least not built into ZFS, to re-allocate the
>> storage to perform I/O balancing. You would basically have to do
>> this manually.
>>
>> Either way, I'm guessing this isn't the answer you wanted but hey, you
>> get what you get.
>
> Actually, that was the answer I was expecting, yes. The real question,
> then, is: what data should I rewrite? I want to rewrite data that's written
> on the nearly full volumes so it gets spread to the volumes with more space
> available.
>
> Should I simply do a "zfs send | zfs receive" on all ZFSes I have? (We are
> talking about 400 ZFSes with about 7 snapshots each, here.) Or is there a
> way to rearrange specifically the data from the nearly full volumes?
>
> Thanks,
> Eduardo Bragatto
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
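The per-dataset copy described above can be sketched roughly as follows. This is an illustrative sequence, not taken from the thread; "backup/olddata" is a placeholder dataset name, and you would want to verify the copy before destroying anything:

```shell
# Rewrite one old dataset so its blocks restripe across all current
# top-level VDEVs. All dataset names here are placeholders.
zfs snapshot -r backup/olddata@migrate
zfs send -R backup/olddata@migrate | zfs receive backup/olddata.new
# After verifying the new copy, reclaim the old blocks and swap names:
zfs destroy -r backup/olddata
zfs rename backup/olddata.new backup/olddata
```

The `-R` flag carries snapshots and locally-set properties along with the data, which matters here since each dataset holds about 7 snapshots of backups.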
Re: [zfs-discuss] ZFS Restripe
Short answer: No.

Long answer: Not without rewriting the previously written data. Data is being striped over all of the top level VDEVs, or at least it should be. But there is no way, at least not built into ZFS, to re-allocate the storage to perform I/O balancing. You would basically have to do this manually.

Either way, I'm guessing this isn't the answer you wanted, but hey, you get what you get.

On Tue, Aug 3, 2010 at 13:52, Eduardo Bragatto wrote:
> Hi,
>
> I have a large pool (~50TB total, ~42TB usable), composed of 4 raidz1
> volumes (of 7 x 2TB disks each):
>
> # zpool iostat -v | grep -v c4
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> backup      35.2T  15.3T    602    272  15.3M  11.1M
>   raidz1    11.6T  1.06T    138     49  2.99M  2.33M
>   raidz1    11.8T   845G    163     54  3.82M  2.57M
>   raidz1    6.00T  6.62T    161     84  4.50M  3.16M
>   raidz1    5.88T  6.75T    139     83  4.01M  3.09M
> ----------  -----  -----  -----  -----  -----  -----
>
> Originally there were only the first two raidz1 volumes, and the two at
> the bottom were added later.
>
> You can notice that by the amount of used/free space. The first two
> volumes have ~11TB used and ~1TB free, while the other two have around ~6TB
> used and ~6TB free.
>
> I have hundreds of zfs'es storing backups from several servers. Each ZFS
> has about 7 snapshots of older backups.
>
> I have the impression I'm getting degradation in performance due to the
> limited space in the first two volumes, especially the second, which has only
> 845GB free.
>
> Is there any way to re-stripe the pool, so I can take advantage of all
> spindles across the raidz1 volumes? Right now it looks like the newer
> volumes are doing the heavy lifting while the other two just hold old data.
>
> Thanks,
> Eduardo Bragatto
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive as backup tool
My inclination, based on what I've read and heard from others, is to say "no". But again, the best way to find out is to write the code. :\

On Wed, Jun 9, 2010 at 11:45, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Toyama Shunji
>>
>> Certainly I feel it is difficult, but is it logically impossible to
>> write a filter program to do that, with reasonable memory use?
>
> Good question. I don't know the answer.
>
> If somebody wanted to, would it be impossible to write a program to extract
> a single file (or subset of files) from a zfs send datastream?
>
> I don't know anything about the internal data structuring format of the zfs
> send datastream. So I couldn't begin to answer the question.
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] General help with understanding ZFS performance bottlenecks
It would be helpful if you posted more information about your configuration. Numbers *are* useful too, but minimally, describing your setup, use case, the hardware, and other such facts would give people a place to start.

There are much brighter stars on this list than myself, but if you are sharing your ZFS dataset(s) via NFS with a heavy traffic load (particularly writes), a mirrored SLOG will probably be useful. (The ZIL is a component of every ZFS pool. A SLOG is a device, usually an SSD or mirrored pair of SSDs, on which you can locate your ZIL for enhanced *synchronous* write performance.) Since NFS generates a lot of synchronous writes, that might be a win for you, but again it depends on a lot of factors.

Help us (or rather, the community) help you by providing real information and data.

On Mon, Jun 7, 2010 at 19:59, besson3c wrote:
> Hello,
>
> I'm wondering if somebody can kindly direct me to a sort of newbie way of
> assessing whether my ZFS pool performance is a bottleneck that can be
> improved upon, and/or whether I ought to invest in an SSD ZIL mirrored pair?
> I'm a little confused by what the output of iostat, fsstat, the zilstat
> script, and other diagnostic tools illuminates, and I'm definitely not
> completely confident in what I think I do understand. I'd like to sort of
> start over from square one with my understanding of all of this.
>
> So, instead of my posting a bunch of numbers, could you please help me with
> some basic tactics and techniques for making these assessments? I have some
> reason to believe that there are some performance problems, as the loads on
> the machine writing to these ZFS NFS shares can get pretty high during heavy
> writing of small files. Throw in the ZFS queue parameters in addition to all
> of these other numbers and variables and I'm a little confused as to where
> best to start. It is also a possibility that the ZFS server is not the
> bottleneck here, but I would love it if I could feel a little more confident
> in my assessments.
>
> Thanks for your help! I expect that this conversation will get pretty
> technical and that's cool (that's what I want too), but hopefully this is
> enough to get the ball rolling!
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
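For reference, adding the mirrored SLOG described above would look something like the following. This is an illustrative sketch (pool and device names are placeholders), not a recommendation for this poster's specific hardware:

```shell
# Put the ZIL on a mirrored pair of SSDs (a SLOG). "tank" and the
# device names are placeholders for this particular setup.
zpool add tank log mirror c2t0d0 c2t1d0
# The zilstat script mentioned above can show whether synchronous write
# traffic is heavy enough to justify buying the SSDs in the first place.
```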
Re: [zfs-discuss] zfs send/receive as backup tool
To answer the question you asked here... the answer is "no". There have been MANY discussions of this in the past. Here's the long thread I started back in March about backup strategies for ZFS pools and file systems:

http://mail.opensolaris.org/pipermail/zfs-discuss/2010-March/038678.html

But to do what you're talking about, no, you cannot. There are other ways to accomplish that outcome, and the above thread discusses many of them. But ZFS send/recv cannot, and isn't designed to.

On Mon, Jun 7, 2010 at 10:34, Toyama Shunji wrote:
> Can I extract one or more specific files from zfs snapshot stream?
> Without restoring full file system.
> Like ufs based 'restore' tool.
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] The next release
Ian: Of course they expected answers to those questions here. It seems many people do not read the forums or mailing list archives to see their questions previously asked (and answered) many, many times over, or the flames that erupt from them. It's scary how much people don't check historical records before asking questions. As my Trinidadian friends would say, most of these posters are "asking answers".

Autumn (if that is your real name), the short answer to the first question is "it will be released when it is released". Read the archives. The answer to the 2nd question is "we'll have to see what Oracle does related to the community, so continue watching their behavior", followed with "read the archives AGAIN". Finally, "Autumn", the answer to the question about the direction of ZFS is... wait for it... read the bloody archives.

All of your questions have been asked and answered MULTIPLE times in the past by far too many people who could have taken some time to just read for the answer instead of exhibiting poor Netiquette by asking questions such as these.

On Wed, Apr 28, 2010 at 19:20, Ian Collins wrote:
> On 04/29/10 11:02 AM, autumn Wang wrote:
>
>> One quick question: When will the next formal release be released?
>
> Of what?
>
>> Does oracle have plan to support OpenSolaris community as Sun did before?
>> What is the direction of ZFS in future?
>
> Do you really expect answers to those question here?
>
> --
> Ian.
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS deduplication ratio on Server 2008 backup VHD files
A few things come to mind...

1. A lot better than... what? Setting the recordsize to 4K got you some deduplication, but maybe the pertinent question is: what were you expecting?

2. Dedup is fairly new. I haven't seen any reports of experiments like yours, so... CONGRATULATIONS!! You're probably the first. Or at least the first willing to discuss it with the world as a matter of public record. Since dedup is new, you can't expect much in the way of previous experience with it. I also haven't seen coordinated experiments of various configurations with dedup off then on, for comparison.

In the end, the question is going to be whether that level of dedup is going to be enough for you. Is dedup even important? Is it just a "gravy" feature or a key requirement? You're in unexplored territory, it appears.

On Fri, Apr 23, 2010 at 11:41, tim Kries wrote:
> Hi,
>
> I am playing with opensolaris a while now. Today I tried to deduplicate the
> backup VHD files Windows Server 2008 generates. I made a backup before and
> after installing the AD role and copied the files to the share on opensolaris
> (build 134). First I got a straight 1.00x; then I set recordsize to 4k (to
> be like NTFS), and it jumped up to 1.29x after that. But it should be a lot
> better, right?
>
> Is there something I missed?
>
> Regards
> Tim
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
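The experiment above boils down to something like the following. These commands are an assumed reconstruction (the dataset name is a placeholder), not copied from the poster:

```shell
# ZFS dedup works on whole blocks, so the recordsize must line up with the
# guest filesystem's 4 KiB clusters for identical data to hash identically.
zfs set recordsize=4k tank/vhd
zfs set dedup=on tank/vhd
# Note: recordsize and dedup only affect newly written data, so the VHD
# files must be re-copied afterward. "zpool list" reports the dedup ratio.
```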
Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?
I have no idea who you're talking to, but presumably you mean this link: http://lists.freebsd.org/pipermail/freebsd-questions/2010-April/215269.html Worked fine for me. I didn't post it. I'm not the OP on this thread or on the FreeBSD thread. So what "broken link" are you talking about and to whom were you responding? On Tue, Apr 20, 2010 at 06:58, Tonmaus wrote: > Why don't you just fix the apparently broken link to your source, then? > > Regards, > > Tonmaus > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Oracle to no longer support ZFS on OpenSolaris?
This is how rumors get started.

From reading that thread, the OP didn't seem to know much of anything about... anything. Even less so about Solaris and OpenSolaris. I'd advise not to get your news from mailing lists, especially not mailing lists for people who don't use the product you're interested in. Nothing like this has been said anywhere by anyone who even resembles or approximates an Oracle representative.

So, yeah, ignore it, as the guy was just asking dumb questions in a very poor manner about things he has absolutely no knowledge of, and adding assumptions on top of that, in his best but not very good English. At least, that's my impression and opinion.

Finally, Michael S. made the best recommendation... talk to your sales rep if you're a paying customer.

Cheers!

On Tue, Apr 20, 2010 at 01:18, Ken Gunderson wrote:
> Greetings All:
>
> Granted there has been much fear, uncertainty, and doubt following
> Oracle's take over of Sun, but I ran across this on a FreeBSD mailing
> list post dated 4/20/2010:
>
> "...Seems that Oracle won't offer support for ZFS on opensolaris"
>
> Link to the full post here:
>
> http://lists.freebsd.org/pipermail/freebsd-questions/2010-April/215269.html
>
> It seems like such would be pretty outrageous and the OP either confused
> or spreading FUD, but then on the other hand there's lot of rumors
> flying around about hidden agendas behind the 2010.03 delay, and Oracle
> being Oracle, such could be within the realm of possibilities.
>
> Given Oracle's information policies we're not likely to know if such is
> indeed the case until it's a fait accompli, but I nonetheless thought
> this would be the best place to inquire (or perhaps the Indiana list, as I
> assume the OP is referencing the upcoming opensolaris.com release).
>
> Thank you and have a nice day.
>
> --
> Ken Gunderson
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] newbie, WAS: Re: SSD best practices
I would advise getting familiar with the basic terminology and vocabulary of ZFS first. Start with the Solaris 10 ZFS Administration Guide. It's a bit more complete for a newbie. http://docs.sun.com/app/docs/doc/819-5461?l=en You can then move on to the Best Practices Guide, Configuration Guide, Troubleshooting Guide and Evil Tuning Guide on solarisinternals.com: http://www.solarisinternals.com//wiki/index.php?title=Category:ZFS All of the features in ZFS on Solaris 10 appear in OpenSolaris; the inverse does not necessarily hold true, as active development occurs on the OpenSolaris trunk and updates take about a year to filter back down into Solaris due to integration concerns, testing, etc. A Separate Log (SLOG) device can be used for a ZIL, but they are not necessarily the same thing. The ZIL always exists, and is part of the pool if you have not defined a SLOG device. The zpool.cache file does not reside in the pool. It lives in /etc/zfs in the root file system of your OpenSolaris system. Thus, it does not reside "on the ZIL device" either, since there may not necessarily be a SLOG (what you would term a "ZIL device") anyway. (There is always a ZIL, though. See remarks above.) Hopefully that clears up some of the misconceptions and misunderstandings you have. Cheers! On Mon, Apr 19, 2010 at 06:52, Michael DeMan wrote: > Also, pardon my typos, and my lack of re-titling my subject to note that it > is a fork from the original topic. Corrections in text that I noticed after > finally sorting out getting on the mailing list are below... > > On Apr 19, 2010, at 3:26 AM, Michael DeMan wrote: > > > By the way, > > > > I would like to chip in about how informative this thread has been, at > least for me, despite (and actually because of) the strong opinions on some > of the posts about the issues involved. > > > > From what I gather, there is still an interesting failure possibility > with ZFS, although probably rare. 
In the case where a zil (aka slog) device > fails, AND the zpool.cache information is not available, basically folks are > toast? > > > > In addition, the zpool.cache itself exhibits the following behaviors (and > I could be totally wrong, this is why I ask): > > > > A. It is not written to frequently, i.e., it is not a performance impact > unless new zfs file systems (pardon me if I have the incorrect terminology) > are not being fabricated and supplied to the underlying operating system. > > > The above 'are not being fabricated' should be 'are regularly being > fabricated' > > > B. The current implementation stores that cache file on the zil device, > so if for some reason, that device is totally lost (along with said .cache > file), it is nigh impossible to recover the entire pool it correlates with. > The above, 'on the zil device', should say 'on the fundamental zfs file > system itself, or a zil device if one is provisioned' > > > > > > > possible solutions: > > > > 1. Why not have an option to mirror that darn cache file (like to the > root file system of the boot device at least as an initial implementation) > no matter what intent log devices are present? Presuming that most folks at > least want enough redundancy that their machine will boot, and if it boots - > then they have a shot at recovery of the balance of the associated (zfs) > directly attached storage, and with my other presumptions above, there is > little reason do not to offer a feature like this? > Missing final sentence: The vast amount of problems with computer and > network reliability is typically related to human error. The more '9s' that > can be intrinsically provided by the systems themselves helps mitigate this. 
> > > > > > > Respectfully, > > - mike > > > > > > On Apr 18, 2010, at 10:10 PM, Richard Elling wrote: > > > >> On Apr 18, 2010, at 7:02 PM, Don wrote: > >> > >>> If you have a pair of heads talking to shared disks with ZFS- what can > you do to ensure the second head always has a current copy of the > zpool.cache file? > >> > >> By definition, the zpool.cache file is always up to date. > >> > >>> I'd prefer not to lose the ZIL, fail over, and then suddenly find out I > can't import the pool on my second head. > >> > >> I'd rather not have multiple failures, either. But the information > needed in the > >> zpool.cache file for reconstructing a missing (as in destroyed) > top-level vdev is > >> easily recovered from a backup or snapshot. > >> -- richard > >> > >> ZFS storage and performance consulting at http://www.RichardElling.com > >> ZFS training on deduplication, NexentaStor, and NAS performance > >> Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com > >> > >> > >> > >> > >> > >> ___ > >> zfs-discuss mailing list > >> zfs-discuss@opensolaris.org > >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > > > > ___ > > zfs-discuss mailing list > > zfs-discuss@opensolaris.or
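Tying the zpool.cache points above to concrete commands (the pool name is a placeholder; this is an illustrative sketch, not a recovery procedure):

```shell
# The cache file lives in the root filesystem, not in the pool or on a SLOG:
ls -l /etc/zfs/zpool.cache
# If it is lost, an export/import rebuilds it by scanning device labels
# for the pool configuration ("tank" is a placeholder):
zpool export tank
zpool import tank
```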
Re: [zfs-discuss] snapshots taking too much space
Now is probably a good time to mention that dedup likes LOTS of RAM, based on experiences described here. 8 GiB minimum is a good start. And to avoid those obscenely long removal times due to updating the DDT, an SSD-based L2ARC device seems to be highly recommended as well. That is, of course, if the OP decides to go the dedup route.

I get the feeling there is an actual solution to, or at least an intelligent reason for, the symptoms he's experiencing. I'm just not sure what either of those might be.

On Tue, Apr 13, 2010 at 03:09, Peter Tripp wrote:
> Oops, I meant SHA256. My mind just maps SHA -> SHA1, totally forgetting that
> ZFS actually uses SHA256 (a SHA-2 variant).
>
> More on ZFS dedup, checksums and collisions:
> http://blogs.sun.com/bonwick/entry/zfs_dedup
> http://www.c0t0d0s0.org/archives/6349-Perceived-Risk.html
> --
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
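To put rough numbers on "LOTS of RAM": a common back-of-envelope figure (an assumption here, not from this thread) is a few hundred bytes of in-core DDT per unique block, so:

```shell
# Back-of-envelope DDT sizing. Assumes ~320 bytes per in-core DDT entry
# (an often-quoted rough figure, not from this thread) and 128 KiB records.
tib=10                                         # unique data, in TiB
blocks=$(( tib * 1024 * 1024 * 1024 / 128 ))   # TiB -> KiB, / 128 KiB each
ddt_mib=$(( blocks * 320 / 1024 / 1024 ))
echo "~${ddt_mib} MiB of DDT for ${tib} TiB of unique 128K blocks"
```

That works out to roughly 25 GiB of table for 10 TiB of unique data, which is why an L2ARC device helps as soon as the DDT outgrows RAM.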
Re: [zfs-discuss] Removing SSDs from pool
Response below...

2010/4/5 Andreas Höschler
> Hi Edward,
>
> thanks a lot for your detailed response!
>
>>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>>> boun...@opensolaris.org] On Behalf Of Andreas Höschler
>>>
>>> • I would like to remove the two SSDs as log devices from the pool and
>>> instead add them as a separate pool for sole use by the database to
>>> see how this enhances performance. I could certainly do
>>>
>>>     zpool detach tank c1t7d0
>>>
>>> to remove one disk from the log mirror. But how can I get back the
>>> second SSD?
>>
>> If you're running Solaris, sorry, you can't remove the log device. You
>> better keep your log mirrored until you can plan for destroying and
>> recreating the pool. Actually, in your example, you don't have a mirror of
>> logs. You have two separate logs. This is fine for OpenSolaris (zpool
>> >= 19), but not Solaris (presently up to zpool 15). If this is Solaris, and
>> *either* one of those SSDs fails, then you lose your pool.
>
> I run Solaris 10 (not OpenSolaris)!
>
> You say the log mirror
>
>   pool: tank
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0     0
>         ...
>
>         logs
>           c1t6d0    ONLINE       0     0     0
>           c1t7d0    ONLINE       0     0     0
>
> does not do me anything good (redundancy-wise)!? Shouldn't I detach the
> second drive then and try to use it for something else, maybe another
> machine?

No, he did *not* say that a mirrored SLOG has no benefit, redundancy-wise. He said that YOU do *not* have a mirrored SLOG. You have 2 SLOG devices which are striped. And if this machine is running Solaris 10, then you cannot remove a log device, because those updates have not made their way into Solaris 10 yet. You need pool version >= 19 to remove log devices, and S10 does not currently have patches to ZFS to get to a pool version >= 19.

If your SLOG above were mirrored, "zpool status" would show a "mirror" vdev nested under "logs", rather than the two devices listed independently side by side as they are now.

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
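The distinction drawn above, sketched as the commands that would have created each layout (device names follow the quoted output; treat this as illustrative):

```shell
# Two separate (striped) log devices -- what the poster actually has:
zpool add tank log c1t6d0 c1t7d0

# One mirrored log device -- what he thought he had:
zpool add tank log mirror c1t6d0 c1t7d0
# In the mirrored case, "zpool status" nests a "mirror" vdev under "logs",
# and losing one SSD no longer risks the pool.
```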
Re: [zfs-discuss] ZFS backup configuration
Yes, I think Eric is correct.

Funny, this is an adjunct to the thread I started entitled "Thoughts on ZFS Pool Backup Strategies". I was going to include this point in that thread but thought better of it.

It would be nice if there were an easy way to extract a pool configuration, with all of the dataset properties, ACLs, etc., so that you could easily reload it into a new pool. I could see this being useful in a disaster recovery sense, and I'm sure people smarter than I can think of other uses. From my reading of the documentation and man pages, I don't see that any such command currently exists. Something that would allow you to dump the config into a file and read it back from a file using typical Unix semantics like STDIN/STDOUT. I was thinking something like:

    zpool dump [-o ]
    zpool load [-f 

wrote:
> On Wed, Mar 24 at 12:20, Wolfraider wrote:
>
>> Sorry if this has been discussed before. I tried searching but I
>> couldn't find any info about it. We would like to export our ZFS
>> configurations in case we need to import the pool onto another
>> box. We do not want to backup the actual data in the zfs pool; that
>> is already handled through another program.
>
> I'm pretty sure the configuration is embedded in the pool itself.
> Just import on the new machine. You may need --force/-f if the pool
> wasn't exported on the old system properly.
>
> --eric
>
> --
> Eric D. Mudama
> edmud...@mail.bounceswoosh.org
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
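Until something like the hypothetical "zpool dump"/"zpool load" exists, one rough stand-in is to capture the configuration as text (the pool name is a placeholder; this records properties, not data):

```shell
# Capture pool-level properties and locally-set dataset properties
# so they can be replayed on a new pool. "tank" is a placeholder.
zpool get all tank > pool-props.txt
zfs get -rH -s local all tank > dataset-props.txt
# Each tab-separated line of dataset-props.txt (name, property, value,
# source) can be replayed against the new pool with zfs create / zfs set.
```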
Re: [zfs-discuss] Intel SASUC8I - worth every penny
Heh. The original definition of "I" was inexpensive. It was never meant to be "independent"; I guess that was changed by vendors. The idea all along was to take inexpensive hardware and use software to turn it into a reliable system.

http://portal.acm.org/citation.cfm?id=50214
http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf

>> Regarding the 2.5" laptop drives, do the inherent error detection properties
>> of ZFS subdue any concerns over a laptop drive's higher bit error rate or
>> rated MTBF? I've been reading about OpenSolaris and ZFS for several months
>> now and am incredibly intrigued, but have yet to implement the solution in
>> my lab.
>
> Well ... the price difference means you can have mirrors of the laptop
> drives and still save money compared to the "enterprise" ones. With a modern
> patrol-reading (scrub or hardware raid) array-setup, and with some
> redundancy, you can re-implement "I" to mean "inexpensive" not "independent"
> in RAID. ;)
>
> //Svein

-- 
"You can choose your friends, you can choose the deals." - Equity Private
"If Linux is faster, it's a Solaris bug." - Phil Harman
Blog - http://whatderass.blogspot.com/
Twitter - @khyron4eva
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Responses inline below... On Sat, Mar 20, 2010 at 00:57, Edward Ned Harvey wrote: > > 1. NDMP for putting "zfs send" streams on tape over the network. So > > Tell me if I missed something here. I don't think I did. I think this > sounds like crazy talk. > > I used NDMP up till November, when we replaced our NetApp with a Solaris > Sun > box. In NDMP, to choose the source files, we had the ability to browse the > fileserver, select files, and specify file matching patterns. My point is: > NDMP is file based. It doesn't allow you to spawn a process and backup a > data stream. > > Unless I missed something. Which I doubt. ;-) > > You clearly know more about NDMP than I do. I'm still learning. I forgot that you previously mentioned the file-based nature of NDMP. I'm still wondering about that in the longer term, but yeah, this is my mistake. I'll end up doing some deeper diving on this topic, I can see. But this was just me seeking clarity. Maybe Fishworks appliances would benefit from the presence of NDMP but if you're using a standard server running (Open)Solaris, it looks like a non-starter. > > > To Ed Harvey: > > > > Some questions about your use of NetBackup on your secondary server: > > > > 1. Do you successfully backup ZVOLs? We know NetBackup should be able > > to capture datasets (ZFS file systems) using straight POSIX semantics. > > I wonder if I'm confused by that question. "backup zvols" to me, would > imply something at a lower level than the filesystem. No, we're not doing > that. We just specify "backup the following directory and all of its > subdirectories." Just like any other typical backup tool. > > The reason we bought NetBackup is because it intelligently supports all the > permissions, ACL's, weird (non-file) file types, and so on. And it > officially supports ZFS, and you can pay for an enterprise support > contract. > > Basically, I consider the purchase cost of NetBackup to be insurance. 
> Although I never plan to actually use it for anything, because all our > bases > are covered by "zfs send" to hard disks and tapes. I actually trust the > "zfs send" solution more, but I can't claim that I, or anything I've ever > done, is 100% infallible. So I need a commercial solution too, just so I > can point my finger somewhere if needed. > Yeah, I get all the reasons you state for using NetBackup. Makes total sense. And I asked this question to be clear about support for backing up ZVOLs outside if ZFS-specific tools e.g. zfs(1M). I didn't actually think NetBackup could capture ZVOLs, for the reasons you listed, but I wanted to be absolutely clear. Asking the wrong questions is the leading cause of wrong answers, as a former boss of mine used to say. > > > > 2. What version of NetBackup are you using? > > I could look it up, but I'd have to VPN in and open up a console, etc etc. > We bought it in November, so it's whatever was current 4-5 months ago. > > Ok. Thanks. > > > 3. You simply run the NetBackup agent locally on the (Open)Solaris > > server? > > Yup. We're doing no rocket science with it. Ours is the absolute most > basic NetBackup setup you could possibly have. We're not using 90% of the > features of NetBackup. It's installed on a Solaris 10 server, with locally > attached tape library, and it does backups directly from local disk to > local > tape. > > This is an advantage of Solaris being a 1st class citizen in the NetBackup world. For a Unified Storage appliance, however, NDMP for file level backup may be a reasonable choice (as Darren postulated earlier). But if you just buy a server and install Solaris, then the NetBackup Solaris agent is the easiest route, as you've shown. Thanks again, Ed, for your time and generosity. And thank you to all contributors to this thread for indulging my curiosity. -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." 
- Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Erik, I don't think there was any confusion about the block nature of "zfs send" vs. the file nature of star. I think what this discussion is coming down to is the best ways to utilize "zfs send" as a backup, since (as Darren Moffat has noted) it supports all the ZFS objects and metadata. I see 2 things coming out of this: 1. NDMP for putting "zfs send" streams on tape over the network. So the question I have now is for anyone who has used or is using NDMP on OSol. How well does it work? Pros? Cons? If people aren't using it, why not? I think this is one area where there are some gains to be made on the OSol backup front. I still need to go back and look at the best ways to use local tape drives on OSol file servers running ZFS to capture ZFS objects and metadata (ZFS ACLs, ZVOLs, etc.). 2. A new tool is required to provide some of the functionality desired, at least as a supported backup method from Sun. While someone in the community may be interested in developing such a tool, Darren also noted that the requisite APIs are private currently and still in flux. They haven't yet stabilized and been published. To Ed Harvey: Some questions about your use of NetBackup on your secondary server: 1. Do you successfully backup ZVOLs? We know NetBackup should be able to capture datasets (ZFS file systems) using straight POSIX semantics. 2. What version of NetBackup are you using? 3. You simply run the NetBackup agent locally on the (Open)Solaris server? I thank everyone who has participated in this conversation for sharing their thoughts, experiences and realities. It has been most informational. On Fri, Mar 19, 2010 at 13:11, erik.ableson wrote: > On 19 mars 2010, at 17:11, Joerg Schilling wrote: > > >> I'm curious, why isn't a 'zfs send' stream that is stored on a tape yet > >> the implication is that a tar archive stored on a tape is considered a > >> backup ? > > > > You cannot get a single file out of the zfs send datastream. 
> > zfs send is a block-level transaction with no filesystem dependencies - it > could be transmitting a couple of blocks that represent a portion of a file, > not necessarily an entire file. And since it can also be used to host a > zvol with any filesystem format imaginable it doesn't want to know. > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
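Erik's point — that a send stream is block-level and has no notion of individual files — can be illustrated with a minimal sketch (pool and dataset names here are hypothetical):

```shell
# A send stream is all-or-nothing: to recover even a single file,
# the entire stream must first be received into a dataset.
zfs snapshot tank/home@backup
zfs send tank/home@backup > /backup/home.zfs    # store the raw stream
zfs receive tank/restored < /backup/home.zfs    # restore the WHOLE dataset
# Only now can individual files be copied out of /tank/restored;
# there is no "extract one file from the stream" operation.
```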
Re: [zfs-discuss] Usage of hot spares and hardware allocation capabilities.
Responses inline... On Tue, Mar 16, 2010 at 07:35, Robin Axelsson wrote: > I've been informed that newer versions of ZFS supports the usage of hot > spares which is denoted for drives that are not in use but available for > resynchronization/resilvering should one of the original drives fail in the > assigned storage pool. > That is the definition of a hot spare, at least informally. ZFS has supported this for some time (if not from the beginning; I'm not in a position to answer that). It is *not* new. > > I'm a little sceptical about this because even the hot spare will be > running for the same duration as the other disks in the pool and therefore > will be exposed to the same levels of hardware degradation and failures > unless it is put to sleep during the time it is not being used for storage. > So, is there a sleep/hibernation/standby mode that the hot spares operate in > or are they on all the time regardless of whether they are in use or not? > Not that I am aware of or have heard others report. No such "sleep mode" exists. Sounds like you want a Copan storage system. AFAIK, hot spares are always spinning; that's why they are hot. > > Usually the hot spare is on a not so well-performing SAS/SATA controller, > so given the scenario of a hard drive failure upon which a hot spare has > been used for resilvering of say a raidz2 cluster, can I move the resilvered > hot spare to the faster controller by letting it take the faulty hard > drive's space using the "zpool offline", "zpool online" commands? > Usually? That's not my experience with multiple vendors' hardware RAID arrays. Usually it's on a channel used by storage disks. Maybe someone else has seen otherwise. I'd be personally curious to know what system puts a spare on a lower-performance channel. That risks slowing the entire device (RAID set/group) when the hot spare kicks in. As for your questions, that doesn't make a lot of sense to me. I don't even get how that would work, but I'm not "Wile E. 
Coyote, Super Genius" either. > > To be more general; are the hard drives in the pool "hard coded" to their > SAS/SATA channels or can I swap their connections arbitrarily if I would > want to do that? Will zfs automatically identify the association of each > drive of a given pool or tank and automatically reallocate them to put the > pool/tank/filesystem back in place? > No. Each disk in the pool has a unique ID, as I understand it. Thus, you should be able to move a disk to another location (channel, slot) and it would still be a part of the same pool and VDEV. All of that said, I saw this post when it originally came in, and I notice no one has responded to it until now. I don't know about anyone else, but I was a bit put off when I read it; I wasn't sure how to take the tone. Maybe you should not assume that people on this list don't know what hot sparing is, or that ZFS just learned. Just a suggestion. -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
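The two points above — attaching a spare and relocating a disk by relying on its unique label rather than its channel — can be sketched as follows (pool and device names are hypothetical, and this is an illustration, not a tested procedure):

```shell
# Attach a hot spare to an existing pool; it stays spinning but idle
# until a data disk fails, at which point it resilvers in automatically.
zpool add tank spare c3t0d0

# Disks are tracked by the unique ID in their on-disk label, not by their
# SAS/SATA channel, so a disk can be moved to another controller or slot
# while the pool is exported:
zpool export tank
# ...physically recable the disk to the other controller...
zpool import tank    # ZFS re-identifies each disk by its label, not its slot
```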
Re: [zfs-discuss] ZFS/OSOL/Firewire...
The point I think Bob was making is that FireWire is an Apple technology, so they have a vested interest in making sure it works well on their systems and with their OS. They could even have a specific chipset that they exclusively use in their systems, although I don't see why others couldn't source it (with the exception that others may be too cheap to do so). Given these factors, it makes sense that FireWire performs brilliantly on Apple hardware/software, while everyone else makes the bare minimum (or less) investment in it, if that much. So those open drivers, while they could be useful for learning or other purposes, may not be directly usable for the systems people are running with OpenSolaris. At least, that's what I think Bob meant. On Fri, Mar 19, 2010 at 17:08, Alex Blewitt wrote: > On 19 Mar 2010, at 15:30, Bob Friesenhahn wrote: > > > On Fri, 19 Mar 2010, Khyron wrote: > >> Getting better FireWire performance on OpenSolaris would be nice though. > >> Darwin drivers are open...hmmm. > > > > OS-X is only (legally) used on Apple hardware. Has anyone considered > that since Firewire is important to Apple, they may have selected a > particular Firewire chip which performs particularly well? > > Darwin is open-source. > > http://www.opensource.apple.com/source/xnu/xnu-1486.2.11/ > > http://www.opensource.apple.com/source/IOFireWireFamily/IOFireWireFamily-417.4.0/ > > Alex -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS/OSOL/Firewire...
I'm also a Mac user. I use Mozy instead of DropBox, but it sounds like DropBox should get a place at the table. I'm about to download it in a few minutes. I'm right now re-cloning my internal HD due to some HFS+ weirdness. I have to completely agree that ZFS would be a great addition to MacOS X, and the best imaginable replacement for HFS+. The file system and associated problems are my only complaint with the entire OS. I guess my browser usage pattern is just too much for HFS+. Of course, I'm the only person I know who said that Sun should have bought Apple 10 years ago. What do I know? Getting better FireWire performance on OpenSolaris would be nice though. Darwin drivers are open...hmmm. On Thu, Mar 18, 2010 at 18:19, David Magda wrote: > On Mar 18, 2010, at 14:23, Bob Friesenhahn wrote: > > On Thu, 18 Mar 2010, erik.ableson wrote: >> >>> >>> Ditto on the Linux front. I was hoping that Solaris would be the >>> exception, but no luck. I wonder if Apple wouldn't mind lending one of the >>> driver engineers to OpenSolaris for a few months... >>> >> >> Perhaps the issue is the filesystem rather than the drivers. Apple users >> have different expectations regarding data loss than Solaris and Linux users >> do. >> > > Apple users (of which I am one) expect things to Just Work. :) > > And there are Apple users and Apple users: > > http://daringfireball.net/2010/03/ode_to_diskwarrior_superduper_dropbox > > If anyone Apple is paying attention, perhaps you could re-open discussions > with now-Oracle about getting ZFS into Mac OS. :) > > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Ahhh, this has been...interesting...some real "personalities" involved in this discussion. :p The following is long-ish but I thought a re-cap was in order. I'm sure we'll never finish this discussion, but I want to at least have a new plateau or base from which to consider these questions. I've just read through EVERY post to this thread, so I want to recap the best points in the vein of the original thread, and set a new base for continuing the conversation. Personally, I'm less interested in the archival case; rather, I'm looking for the best way to either recover from a complete system failure or recover an individual file or file set from some backup media, most likely tape. Now let's put all of this together, along with some definitions. First, the difference between archival storage (to tape or other) and backup. I think the best definition provided in this thread came from Darren Moffat as well. As Carsten Aulbert mentioned, this discussion is fairly useless until we start using the same terminology to describe a set of actions. For this discussion, I am defining archival as taking the data and placing it on some media - likely tape, but not necessarily - in the simplest format possible that could hopefully be read by another device in the future. This could exclude capturing NTFS/NFSv4/ZFS ACLs, Solaris extended attributes, or zpool properties (aka metadata for purposes of this discussion). With an archive, we may not go back and touch the data for a long time, if ever again. Backup, OTOH, is the act of making a perfect copy of the data to some media (in my interest tape, but again, not necessarily) which includes all of the metadata associated with that data. Such a copy would allow perfect re-creation of the data in a new environment, recovery from a complete system failure, or single file (or file set) recovery. With a backup, we have the expectation that we may need to return to it shortly after it is created, so we have to be able to trust it...now. 
Data restored from this backup needs to be an exact replica of the original source - ZFS pool and dataset properties, extended attributes, and ZFS ACLs included. Now that I hopefully have common definitions for this conversation (and I hope I captured Darren's meaning accurately), I'll divide this into 2 sections, starting with NDMP. NDMP: For those who are unaware (and to clarify my own understanding), I'll take a moment to describe NDMP. NDMP was invented by NetApp to allow direct backup of their Filers to tape backup servers, and eventually onto tape. It is designed to remove the need for indirect backup by backing up the NFS or CIFS shared file systems on the clients. Instead, we backup the shared file systems directly from the Filer (or other file server - say Fishworks box or OpenSolaris server) to the backup server via the network. We avoid multiple copies of the shared file systems. NDMP is a network-based delivery mechanism to get data from a storage server to a backup server, which is why the backup software must also speak NDMP. Hopefully, my description is mostly accurate, and it is clear why this might be useful for people using (Open)Solaris + ZFS for tape backup or archival purposes. Darren Moffat made the point that NDMP could be used to do the tape splitting, but I'm not sure this is accurate. If "zfs send" from a file server running (Open)Solaris to a tape drive over NDMP is viable -- which it appears to be to me -- then the tape splitting would be handled by the tape backup application. In my world, that's typically NetBackup or some similar enterprise offering. I see no reason why it couldn't be Amanda or Bacula or Arkeia or something else. THIS is why I am looking for faster progress on NDMP. Now, NDMP doesn't do you much good for a locally attached tape drive, as Darren and Svein pointed out. 
However, provided the software which is installed on this fictional server can talk to the tape in an appropriate way, then all you have to do is pipe "zfs send" into it. Right? What did I miss? ZVOLs and NTFS/NFSv4/ZFS ACLs: The answer is "zfs send" to both of my questions about ZVOLs and ACLs. At the center of all of this attention is "zfs send". As Darren Moffat pointed out, it has all the pieces to do a proper, complete and correct backup. The big remaining issue that I see is how do you place a "zfs send" stream on a tape in a reliable fashion. CR 6936195 would seem to handle one complaint from Svein, Miles Nordin and others about reliability of the send stream on the tape. Again, I think NDMP may help answer this question for file servers without attached tape devices. For those with attached tape devices, what's the equivalent answer? Who is doing this, and how? I believe we've seen Ed Harvey say "NetBackup" and Ian Collins say "NetVault". Do these products capture all the metadata required to call this copy a "backup"? That's my next question. Finally, Damon Atkins said: "But
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Ian, When you say you spool to tape for off-site archival, what software do you use? On Wed, Mar 17, 2010 at 18:53, Ian Collins wrote: > > I have been using a two stage backup process with my main client, > send/receive to a backup pool and spool to tape for off site archival. > > I use a pair (on connected, one off site) of removable drives as single > volume pools for my own backups via send/receive. > > -- > Ian. > > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to manage scrub priority or defer scrub?
For those following along, this is the e-mail I meant to send to the list but instead sent directly to Tonmaus. My mistake, and I apologize for having to re-send. === Start === My understanding, limited though it may be, is that a scrub touches ALL data that has been written, including the parity data. It confirms the validity of every bit that has been written to the array. Now, there may be an implementation detail that is responsible for the pathology that you observed. More than likely, I'd imagine. Filing a bug may be in order. Since triple parity RAIDZ exists now, you may want to test with that by grabbing a LiveCD or LiveUSB image from genunix.org. Maybe RAIDZ3 has the same (or worse) problems? As for "scrub management", I pointed out the specific responses from Richard where he noted that scrub I/O priority *can* be tuned. How you do that, I'm not sure. Richard, how does one tune scrub I/O priority? Other than that, as I said, I don't think there is a model (publicly available anyway) describing scrub behavior and how it scales with pool size (< 5 TB, 5 TB - 50 TB, > 50 TB, etc.) or data layout (mirror vs. RAIDZ vs. RAIDZ2). ZFS is really that new, that all of this needs to be reconsidered and modeled. Maybe this is something you can contribute to the community? ZFS is a new storage system, not the same old file systems whose behaviors and quirks are well known because of 20+ years of history. We're all writing a new chapter in data storage here, so it is incumbent upon us to share knowledge in order to answer these types of questions. I think the questions I raised in my longer response are also valid and need to be re-considered. There are large pools in production today. So how are people scrubbing these pools? Please post your experiences with scrubbing 100+ TB pools. Tonmaus, maybe you should repost my other questions in a new, separate thread? 
=== End === On Tue, Mar 16, 2010 at 19:41, Tonmaus wrote: > > Are you sure that you didn't also enable > > something which > > does consume lots of CPU such as enabling some sort > > of compression, > > sha256 checksums, or deduplication? > > None of them is active on that pool or in any existing file system. Maybe > the issue is particular to RAIDZ2, which is comparably recent. On that > occasion: does anybody know if ZFS reads all parities during a scrub? > Wouldn't it be sufficient for stale corruption detection to read only one > parity set unless an error occurs there? > > > The main concern that one should have is I/O > > bandwidth rather than CPU > > consumption since "software" based RAID must handle > > the work using the > > system's CPU rather than expecting it to be done by > > some other CPU. > > There are more I/Os and (in the case of mirroring) > > more data > > transferred. > > What I am trying to say is that CPU may become the bottleneck for I/O in > case of parity-secured stripe sets. Mirrors and simple stripe sets have > almost 0 impact on CPU. So far at least my observations. Moreover, x86 > processors not optimized for that kind of work as much as i.e. an Areca > controller with a dedicated XOR chip is, in its targeted field. > > Regards, > > Tonmaus > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
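For those following along, starting and observing a scrub is straightforward; the tunable question is murkier. A sketch (pool name hypothetical; the `zfs_scrub_limit` tunable is my assumption about builds of this era and should be verified against your kernel before relying on it):

```shell
# Kick off a scrub, which walks and verifies every written block,
# then watch its progress and any checksum errors it surfaces.
zpool scrub tank
zpool status -v tank

# On OpenSolaris builds of this period, the number of in-flight scrub
# I/Os per vdev was reportedly governed by the zfs_scrub_limit tunable
# (an assumption worth verifying), which can be inspected with mdb:
echo "zfs_scrub_limit/D" | pfexec mdb -k
```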
Re: [zfs-discuss] How to manage scrub priority or defer scrub?
Ugh! I meant that to go to the list, so I'll probably re-send it for the benefit of everyone involved in the discussion. There were parts of it that I wanted others to read. From a re-read of Richard's e-mail, maybe he meant that the number of I/Os queued to a device can be tuned lower and not the priority of the scrub (as I took him to mean). Hopefully Richard can clear that up. I personally stand corrected for mis-reading Richard there. Of course the performance of a given system cannot be described until it is built. Again, my interpretation of your e-mail was that you were looking for a model for the performance of concurrent scrub and I/O load of a RAIDZ2 VDEV that you could scale up from your "test" environment of 11 disks to a 200+ TB behemoth. As I mentioned several times, I doubt such a model exists, and I have not seen anything published to that effect. I don't know how useful it would be if it did exist because the performance of your disks would be a critical factor. (Although *any* model beats no model any day.) Let's just face it. You're using a new storage system that has not been modeled. To get the model you seek, you will probably have to create it yourself. (It's notable that most of the ZFS models that I have seen have been done by Richard. Of course, they were MTTDL models, not scrub vs. I/O performance models for different VDEV types.) As for your point about building large pools from lots of mirror VDEVs, my response is "meh". I've said several times, and maybe you've missed it several times, that there may be pathologies for which YOU should open bugs. RAIDZ3 may exhibit the same kind of pathologies you observed with RAIDZ2. Apparently RAIDZ does not. I've also noticed (and I'm sure I'll be corrected if I'm mistaken) that there is not a limit on the number of VDEVs in a pool but single digit RAID VDEVs are recommended. So there is nothing preventing you from building (for example) VDEVs from 1 TB disks. 
If you take 9 x 1 TB disks per VDEV, and use RAIDZ2, you get 7 TB usable. That means about 29 VDEVs to get 200 TB. Double the disk capacity and you can probably get to 15 top level VDEVs. (And you'll want that RAIDZ2 as well since I don't know if you could trust that many disks, whether enterprise or consumer.) However, that number of top level VDEVs sounds reasonable based on what others have reported. What's been proven to be "A Bad Idea(TM)" is putting lots of disks in a single VDEV. Remember that ZFS is a *new* software system. It is complex. It will have bugs. You have chosen ZFS; it didn't choose you. So I'd say you can contribute to the community by reporting back your experiences, opening bugs on things which make sense to open bugs on, testing configurations, modeling, documenting and sharing. So far, you just seem to be interested in taking w/o so much as an offer of helping the community or developers to understand what works and what doesn't. All take and no give is not cool. And if you don't like ZFS, then choose something else. I'm sure EMC or NetApp will willingly sell you all the spindles you want. However, I think it is still early to write off ZFS as a losing proposition, but that's my opinion. So far, you seem to be spending a lot of time complaining about a *new* software system that you're not paying for. That's pretty tasteless, IMO. And now I'll re-send that e-mail... P.S.: Did you remember to re-read this e-mail? Read it 2 or 3 times and be clear about what I said and what I did _not_ say. On Wed, Mar 17, 2010 at 16:12, Tonmaus wrote: > Hi, > > I got a message from you off-list that doesn't show up in the thread even > after hours. As you mentioned the aspect here as well I'd like to respond > to, I'll do it from here: > > > Third, as for ZFS scrub prioritization, Richard > > answered your question about that. He said it is > > low priority and can be tuned lower. 
However, he was > > answering within the context of an 11 disk RAIDZ2 > > with slow disks His exact words were: > > > > > > This could be tuned lower, but your storage > > is slow and *any* I/O activity will be > > noticed. > > Richard told us two times that scrub already is as low in priority as can > be. From another message: > > "Scrub is already the lowest priority. Would you like it to be lower?" > > > = > > As much as the comparison goes between "slow" and "fast" storage. I have > understood that Richard's message was that with storage providing better > random I/O zfs priority scheduling will perform significantly better, > providing less degradation of concurrent load. While I am even inclined to > buy that, nobody will be able to tell me how a certain system will behave > until it was tested, and to what degree concurrent scrubbing still will be > possible. > Another thing: people are talking a lot about narrow vdevs and mirrors. > However, when you need to build a 2
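The back-of-the-envelope vdev count from my reply above can be checked with a few lines of shell arithmetic (same assumed numbers as in the mail: 9 disks per RAIDZ2 vdev, 200 TB usable target):

```shell
# 9 x 1 TB disks in RAIDZ2 -> (9 - 2) = 7 TB usable per vdev.
usable=$(( (9 - 2) * 1 ))
echo $(( (200 + usable - 1) / usable ))     # ceiling(200/7) = 29 vdevs

# Doubling the disk capacity roughly halves the vdev count:
usable2=$(( (9 - 2) * 2 ))
echo $(( (200 + usable2 - 1) / usable2 ))   # ceiling(200/14) = 15 vdevs
```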
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
To be sure, Ed, I'm not asking: Why bother trying to backup with "zfs send" when there are fully supportable and working options available right NOW? Rather, I am asking: Why do we want to adapt "zfs send" to do something it was never intended to do, and probably won't be adapted to do (well, if at all) anytime soon instead of optimizing existing technologies for this use case? But I got it. "zfs send" is fast. Let me ask you this, Ed...where do you "zfs send" your data to? Another pool? Does it go to tape eventually? If so, what is the setup such that it goes to tape? I apologize for asking here, as I'm sure you described it in one of the other threads I mentioned, but I'm not able to go digging in those threads at the moment. I ask this because I see an opportunity to kill 2 birds with one stone. With proper NDMP support and "zfs send" performance, why can't you get the advantages of "zfs send" without trying to shoehorn "zfs send" into a use it's not designed for? Maybe NDMP support needs to be a higher focus of the ZFS team? I noticed not many people even seem to be asking for it, never mind screaming for it. However, I did say this in my original e-mail - that I see NDMP support as being a way to handle the calls for "zfs send" to tape. Maybe we can broaden the conversation at this point. For all of those who use NDMP today to backup Filers, be they NetApp, EMC, or other vendors' devices...how is your experience with NDMP? *IS* anyone using NDMP? If you have the option of using NDMP and you don't, why don't you? Backing up file servers directly to tape seems to be an obvious WIN, so if people aren't doing it, I'm curious why they aren't. That's any kind of file server, because (Open)Solaris will increasingly be applied in this role. That was pretty much the goal of the Fishworks team, IIRC. 
So this looks like an opportunity by Sun (Oracle) to take a neglected backup technology and make it a must-have backup technology, by making it integrate smoothly with ZFS and high performance. On Wed, Mar 17, 2010 at 09:37, Edward Ned Harvey wrote: > > The one thing that I keep thinking, and which I have yet to see > > discredited, is that > > ZFS file systems use POSIX semantics. So, unless you are using > > specific features > > (notably ACLs, as Paul Henson is), you should be able to backup those > > file systems > > using well known tools. > > This is correct. Many people do backup using tar, star, rsync, etc. > > > > The Best Practices Guide is also very clear about send and receive NOT > > being > > designed explicitly for backup purposes. I find it odd that so many > > people seem to > > want to force this point. ZFS appears to have been designed to allow > > the use of > > well known tools that are available today to perform backups and > > restores. I'm not > > sure how many people are actually using NFS v4 style ACLs, but those > > people have > > the most to worry about when it comes to using tar or NetBackup or > > Networker or > > Amanda or Bacula or star to backup ZFS file systems. Everyone else, > > which appears > > to be the majority of people, have many tools to choose from, tools > > they've used > > for a long time in various environments on various platforms. The > > learning curve > > doesn't appear to be as steep as most people seem to make it out to > > be. I honestly > > think many people may be making this issue more complex than it needs > > to be. > > I think what you're saying is: Why bother trying to backup with "zfs send" > when the recommended practice, fully supportable, is to use other tools for > backup, such as tar, star, Amanda, bacula, etc. Right? > > The answer to this is very simple. > #1 "zfs send" is much faster. Particularly for incrementals on large > numbers of files. 
> #2 "zfs send" will support every feature of the filesystem, including > things like filesystem properties, hard links, symlinks, and objects which > are not files, such as character special objects, fifo pipes, and so on. > Not to mention ACL's. If you're considering some other tool (rsync, star, > etc), you have to read the man pages very carefully to formulate the exact > backup command, and there's no guarantee you'll find a perfect backup > command. There is a certain amount of comfort knowing that the people who > wrote "zfs send" are the same people who wrote the filesystem. It's > simple, > and with no arguments, and no messing around with man page research, it's > guaranteed to make a perfect copy of the whole filesystem. > > Did I mention fast? ;-) Prior to zfs, I backed up my file server via > rsync. It's 1TB of mostly tiny files, and it ran for 10 hours every night, > plus 30 hours every weekend. Now, I use zfs send, and it runs for an > average 7 minutes every night, depending on how much data changed that day, > and I don't know - 20 hours I guess - every month. > > -- "You can choose your friends, you can choose the deals." - Equity Private "I
Re: [zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Exactly! This is what I meant, at least when it comes to backing up ZFS datasets. There are tools available NOW, such as Star, which will back up ZFS datasets due to the POSIX nature of those datasets. As well, Amanda, Bacula, NetBackup, Networker and probably some others I missed. Re-inventing the wheel is not required in these cases. As I said in my original e-mail, Star is probably perfect once it gets ZFS (e.g. NFS v4) ACL and NDMP support (e.g. accepting NDMP input streams and outputting onto tape). ZVOLs are the piece I'm still not sure about though. So I repeat my question: how are people backing up ZVOLs today? (If Star could do ZVOLs as well as NDMP and ZFS ACLs, then it literally *is* perfect.) On Wed, Mar 17, 2010 at 09:01, Joerg Schilling < joerg.schill...@fokus.fraunhofer.de> wrote: > Stephen Bunn wrote: > > > between our machine's pools and our backup server pool. It would be > > nice, however, if some sort of enterprise level backup solution in the > > style of ufsdump was introduced to ZFS. > > Star can do the same as ufsdump does but independent of OS and filesystem. > > Star is currently missing support for ZFS ACLs and for extended attributes > from Solaris. If you are interested, make a test. If you need support for > ZFS > ACLs or Solaris extended attributes, send me a note. > > Jörg > > -- > > EMail:jo...@schily.isdn.cs.tu-berlin.de(home) > Jörg Schilling D-13353 Berlin > j...@cs.tu-berlin.de(uni) > joerg.schill...@fokus.fraunhofer.de (work) Blog: > http://schily.blogspot.com/ > URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
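On the ZVOL question: since POSIX-level tools like Star cannot see inside a zvol, "zfs send" seems to be the practical way to capture one today. A sketch with hypothetical names:

```shell
# Snapshot the zvol and capture it as a stream; the stream carries the
# raw blocks, so it works regardless of what filesystem lives inside.
zfs snapshot tank/vol1@bk
zfs send tank/vol1@bk > /backup/vol1.zfs   # or pipe to a tape device

# Restoring recreates the zvol in its entirety:
zfs receive tank/vol1restored < /backup/vol1.zfs
```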
[zfs-discuss] Thoughts on ZFS Pool Backup Strategies
Note to readers: There are multiple topics discussed herein. Please identify which idea(s) you are responding to, should you respond. Also make sure to take in all of this before responding. Something you want to discuss may already be covered at a later point in this e-mail, including NDMP and ZFS ACLs. It's long. It seems to me that something is being overlooked (either by myself or others) in all of these discussions about backing up ZFS pools... The one thing that I keep thinking, and which I have yet to see discredited, is that ZFS file systems use POSIX semantics. So, unless you are using specific features (notably ACLs, as Paul Henson is), you should be able to back up those file systems using well known tools. The ZFS Best Practices Guide speaks to this in section 4.4 (specifically 4.4.3[1]) and there have been various posters who have spoken of using other tools. (Star comes to mind, most prominently.) The Best Practices Guide is also very clear about send and receive NOT being designed explicitly for backup purposes. I find it odd that so many people seem to want to force this point. ZFS appears to have been designed to allow the use of well known tools that are available today to perform backups and restores. I'm not sure how many people are actually using NFS v4 style ACLs, but those people have the most to worry about when it comes to using tar or NetBackup or Networker or Amanda or Bacula or star to back up ZFS file systems. Everyone else, which appears to be the majority of people, has many tools to choose from, tools they've used for a long time in various environments on various platforms. The learning curve doesn't appear to be as steep as most people seem to make it out to be. I honestly think many people may be making this issue more complex than it needs to be. 
Maybe the people having the most problems are those who are new to Solaris, but if you have any real *nix experience, Solaris shouldn't be that difficult to figure out, especially for those with System V experience. The Linux folks? Well, I sorta feel sorry for you and I sorta don't. So, am I missing something? It wouldn't surprise me if I am. What am I missing? The other things I have been thinking about are NDMP support and what tools out there support NFS v4 ACLs. Has anyone successfully used NDMP support with ZFS? If so, what did you do? How did you configure your system, including any custom coding you did? From the looks of the NDMP project on os.org, NDMP was integrated in build 102[3] but it appears to only be NDMP v4 not the latest, v5. Maybe NDMP support would placate some of those screaming for the send stream to be a tape backup format? As for ACLs[2], the list of tools supporting NFS v4 ACLs seems to be pretty small. I plan to spend some quality time with RFC 3530 to get my head around NFS v4, and ACLs in particular. star seems to be fairly adept, with the exception of the NFS v4 ACL support. Hopefully that is forthcoming? Again, I think those people who are not using ZFS ACLs can probably perform actual tape backups (should they choose to) with existing tools. If I'm mistaken or missing something, I invite someone to please point it out. Finally, there's backup of ZVOLs. I don't know what the commercial tool support for backing up ZVOLs looks like but I know this is the *perfect* place for NDMP. Backing up ZVOLs should be priority #1 for NDMP support in (Open)Solaris, I think. Looking through the symbols in libzfs.so, I don't see anything specifically related to backup of ZVOLs in the existing code. How are people handling ZVOL backups today? Not to be too flip, but star looks like it might be the perfect tape backup software if it supported NDMP, NFS v4 ACLs and ZVOLs. Just thinking out loud... 
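Since the argument above is precisely that ZFS datasets are POSIX file systems, a minimal sketch of a tool-agnostic backup may help. Assumptions not from the thread: the dataset name (tank/home), the snapshot name, and GNU tar as the stand-in POSIX tool; the runnable part uses an ordinary directory in place of the snapshot path.

```shell
# Sketch: back up a ZFS dataset with an ordinary POSIX tool, reading from
# a snapshot so the source stays consistent. Dataset and path names are
# hypothetical; the .zfs/snapshot directory may be hidden unless the
# dataset's "snapdir" property is set to "visible".
#
#   zfs snapshot tank/home@backup1
#   tar -cf /backup/home-backup1.tar -C /tank/home/.zfs/snapshot/backup1 .
#
# Runnable demonstration, with a plain directory standing in for the snapshot:
SRC=$(mktemp -d)   # stands in for /tank/home/.zfs/snapshot/backup1
OUT=$(mktemp -d)
echo "hello" > "$SRC/file.txt"
tar -cf "$OUT/home-backup1.tar" -C "$SRC" .
tar -tf "$OUT/home-backup1.tar" | grep -q 'file.txt' && echo "archive ok"
```

star, cpio, Amanda, Bacula and the rest slot into the same pattern; what they cannot preserve without explicit support is exactly the NFS v4 ACL metadata discussed above.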
[1] http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Using_ZFS_With_Enterprise_Backup_Solutions [2] http://docs.sun.com/app/docs/doc/819-5461/ftyxi?l=en&a=view [3] http://hub.opensolaris.org/bin/view/Project+ndmp/ Aside: I see so many posts to this list about backup strategy for ZFS file systems, and I continue to be amazed by how few people check the archives for previous discussions before they start a new one. So many of the conversations are repeated over and over, with good information being spread over multiple threads. I personally find it interesting that so few people read first before posting. Few even seem to bother to do so much (little?) as a Google search which would yield several previous discussions on the topic of ZFS pool backups to tape. Oh well. -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How to manage scrub priority or defer scrub?
The issue as presented by Tonmaus was that a scrub was negatively impacting his RAIDZ2 CIFS performance, but he didn't see the same impact with RAIDZ. I'm not going to say whether that is a "problem" one way or the other; it may be expected behavior under the circumstances. That's for ZFS developers to speak on. (This was one of many issues Tonmaus mentioned.) However, what was lost was the context. Tonmaus reported this behavior on a commodity server using slow disks in an 11 disk RAIDZ2 set. But he *really* wants to know if this will be an issue on a 100+ TB pool. So his examples were given on a pool that was possibly 5% of the size of the pool that he actually wants to deploy. He never said any of this in the original e-mail, so Richard assumed the context to be the smaller system. That's why I pointed out all of the discrepancies and questions he could/should have asked which might have yielded more useful answers. There's quite a difference between the 11 disk RAIDZ2 set and a 100+ TB ZFS pool, especially when the use case, VDEV layout and other design aspects of the 100+ TB pool have not been described. On Tue, Mar 16, 2010 at 13:41, David Dyer-Bennet wrote: > > On Tue, March 16, 2010 11:53, thomas wrote: > > Even if it might not be the best technical solution, I think what a lot > of > > people are looking for when this comes up is a knob they can use to say > "I > > only want X IOPS per vdev" (in addition to low prioritization) to be used > > while scrubbing. Doing so probably helps them feel more at ease that they > > have some excess capacity on cpu and vdev if production traffic should > > come along. > > > > That's probably a false sense of moderating resource usage when the > > current "full speed, but lowest prioritization" is just as good and would > > finish quicker.. but, it gives them peace of mind? 
> > I may have been reading too quickly, but I have the impression that at > least some of the people not happy with the current prioritization were > reporting severe impacts to non-scrub performance when a scrub was in > progress. If that's the case, then they have a real problem, they're not > just looking for more peace of mind in a hypothetical situation. > -- > David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ > Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ > Photos: http://dd-b.net/photography/gallery/ > Dragaera: http://dragaera.info > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
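For those who do want a knob today, some builds expose scrub throttling as kernel tunables in /etc/system. These names are from memory and vary by build (the scan code was reworked around this period), so treat the fragment as illustrative only and verify against your release before setting anything:

```
* /etc/system fragment (illustrative -- tunable names and defaults vary by build)
* cap the number of concurrent scrub I/Os issued per top-level vdev
set zfs:zfs_scrub_limit = 5
```

A reboot (or equivalent mdb tweak) is needed for /etc/system changes to take effect, which is another reason the "lowest priority, full speed" default is less painful than it sounds.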
Re: [zfs-discuss] How to manage scrub priority or defer scrub?
In following this discussion, I get the feeling that you and Richard are somewhat talking past each other. He asked you about the hardware you are currently running on, whereas you seem to be interested in a model for the impact of scrubbing on I/O throughput that you can apply to some not-yet-acquired hardware. It should be clear by now that the model you are looking for does not exist given how new ZFS is, and Richard has been focusing his comments on your existing (home) configuration since that is what you provided specs for. Since you haven't provided specs for this larger system you may be purchasing in the future, I don't think anyone can give you specific guidance on what the I/O impact of scrubs on your configuration will be. Richard seems to be giving more design guidelines and hints, and just generally good to know information to keep in mind while designing your solution. Of course, he's been giving it in the context of your 11 disk wide RAIDZ2 and not the 200 TB monster you only described in the last e-mail. Stepping back, it may be worthwhile to examine the advice Richard has given, in the context of the larger configuration. First, you won't be using commodity hardware for your enterprise-class storage system, will you? Second, I would imagine that as a matter of practice, most people schedule their pools to scrub as far away from prime hours as possible. Maybe it's possible, and maybe it's not. The question to the larger community should be "who is running a 100+ TB pool and how have you configured your scrubs?" Or even "for those running 100+ TB pools, do your scrubs interfere with your production traffic/throughput? If so, how do you compensate for this?" Third, as for ZFS scrub prioritization, Richard answered your question about that. He said it is low priority and can be tuned lower. 
However, he was answering within the context of an 11 disk RAIDZ2 with slow disks. His exact words were: "This could be tuned lower, but your storage is slow and *any* I/O activity will be noticed." If you had asked about a 200 TB enterprise-class pool, he may have had a different response. I don't know if ZFS will make different decisions regarding I/O priority on commodity hardware as opposed to enterprise hardware, but I imagine it does *not*. If I'm mistaken, someone should correct me. Richard also said "In b133, the priority scheduler will work better than on older releases." That may not be an issue since you haven't acquired your hardware YET, but again, Richard didn't know that you were talking about a 200 TB behemoth because you never said that. Fourth, Richard mentioned a wide RAIDZ2 set. Hopefully, if nothing else, we've seen that designing larger ZFS storage systems with pools composed of smaller top level VDEVs works better, preferably mirrored top level VDEVs in the case of lots of small, random reads. You didn't indicate the profile of the data to be stored on your system, so no one can realistically speak to that. I think the general guidance is sound. Multiple top level VDEVs, preferably mirrors. If you're creating RAIDZ2 top level VDEVs, then they should be smaller (narrower) in terms of the number of disks in the set. 11 would be too many, based on what I have seen and heard on this list cross-referenced with the (little) information you have provided. RAIDZ2 would appear to require more CPU power than RAIDZ, based on the report you gave, and thus RAIDZ may have less negative impact on the performance of your storage system. I'll cop to that. However, you never mentioned how your 200 TB behemoth system will be used, besides an off-hand remark about CIFS. Will it be serving CIFS? NFS? Raw ZVOLs over iSCSI? You never mentioned any of that. Asking about CIFS if you're not going to serve CIFS doesn't make much sense. 
That would appear to be another question for the ZFS gurus here -- WHY does RAIDZ2 cause so much negative performance impact on your CIFS service while RAIDZ does not? Your experience is that a scrub of a RAIDZ2 maxed CPU while a RAIDZ scrub did not, right? Fifth, the pool scrub should probably be as far away from peak usage times as possible. That may or may not be feasible, but I don't think anyone would disagree with that advice. Again, I know there are people running large pools who perform scrubs. It might be worthwhile to directly ask what these people have experienced in terms of scrub performance on RAIDZ vs. RAIDZ2, or in general. Finally, I would also note that Richard has been very responsive to your questions (in his own way) but you increasingly seem to be hostile and even disrespectful toward him. (I've noticed this in more than one of your e-mails; they sound progressively more self-centered and selfish. That's just my opinion.) If this is a community, that's not a helpful way to treat a senior member of the community, even if he's not answering the question you want answered. Keep in mind that asking the wrong questions
Re: [zfs-discuss] Posible newbie question about space between zpool and zfs file systems
Yeah, this threw me. A 3 disk RAID-Z2 doesn't make sense, because at a redundancy level, RAID-Z2 looks like RAID 6. That is, there are 2 levels of parity for the data. Out of 3 disks, the equivalent of 2 disks will be used to store redundancy (parity) data and only 1 disk equivalent will store actual data. This is what others might term a "degenerate case of 3-way mirroring", except with a lot more computational overhead since we're performing 2 parity calculations. I'm curious what the purpose of creating a 3 disk RAID-Z2 pool is/was? (For my own personal edification. Maybe there is something for me to learn from this example.) Aside: Does ZFS actually create the pool as a 3-way mirror, given that this configuration is effectively the same? This is a question for any of the ZFS team who may be reading but I'm curious now. On Mon, Mar 15, 2010 at 10:38, Michael Hassey wrote: > Sorry if this is too basic - > > So I have a single zpool in addition to the rpool, called xpool. > > NAMESIZE USED AVAILCAP HEALTH ALTROOT > rpool 136G 109G 27.5G79% ONLINE - > xpool 408G 171G 237G42% ONLINE - > > I have 408 in the pool, am using 171 leaving me 237 GB. 
> > The pool is built up as; > > pool: xpool > state: ONLINE > scrub: none requested > config: > >NAMESTATE READ WRITE CKSUM >xpool ONLINE 0 0 0 > raidz2 ONLINE 0 0 0 >c8t1d0 ONLINE 0 0 0 >c8t2d0 ONLINE 0 0 0 >c8t3d0 ONLINE 0 0 0 > > errors: No known data errors > > > But - and here is the question - > > Creating file systems on it, and the file systems in play report only 76GB > of space free > > xpool/zones/logserver/ROOT/zbe 975M 76.4G 975M legacy > xpool/zones/openxsrvr 2.22G 76.4G 21.9K > /export/zones/openxsrvr > xpool/zones/openxsrvr/ROOT 2.22G 76.4G 18.9K legacy > xpool/zones/openxsrvr/ROOT/zbe 2.22G 76.4G 2.22G legacy > xpool/zones/puggles 241M 76.4G 21.9K > /export/zones/puggles > xpool/zones/puggles/ROOT 241M 76.4G 18.9K legacy > xpool/zones/puggles/ROOT/zbe 241M 76.4G 241M legacy > xpool/zones/reposerver 299M 76.4G 21.9K > /export/zones/reposerver > > > So my question is, where is the space from xpool being used? or is it? > > > Thanks for reading. > > Mike. > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
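The short answer to Mike's question, sketched numerically: for a raidz pool, zpool list reports raw capacity with parity included, while zfs list reports usable space after parity. With a 3-disk RAID-Z2, only one disk's worth in every three holds data, which lines up with the numbers quoted above (the small remaining gap is pool metadata/reservations). This is a back-of-envelope illustration, not the exact allocator math:

```python
# Rough space accounting for the 3-disk RAID-Z2 pool in the post.
# zpool list shows RAW capacity (parity counted); zfs list shows usable.
disks = 3
parity = 2                     # RAID-Z2 keeps two parity blocks per stripe
raw_free_gb = 237              # "AVAIL" from zpool list above
data_fraction = (disks - parity) / disks
usable_free_gb = raw_free_gb * data_fraction
print(round(usable_free_gb, 1))   # 79.0 -- close to the 76.4G zfs list reports
```

So the space is not "missing"; roughly two-thirds of every write lands as parity in this layout.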
[zfs-discuss] Disk label reference info
I thought pointing out some of this information might come in handy for some of the folks who are new to the (Open)Solaris world. The following section discusses differences between SMI labels (aka VTOC) and EFI GPT labels. While it may not be everything one needs to know in order to successfully manage disks with ZFS, it contains a fair amount. I definitely recommend reading it for anyone new to ZFS and/or (Open)Solaris. SMI labels are not evil. Dated, probably, but they are not as difficult to understand or use as people seem to want to make them out to be. See the "About Disk Labels" and "About Disk Slices" sections, in particular. The "Comparison of the EFI Label and the VTOC Label" section might also be helpful. http://docs.sun.com/app/docs/doc/817-5093/disksconcepts-1?a=view For tasks and concepts around administering disks, see: http://docs.sun.com/app/docs/doc/817-5093/disksprep-31030?a=view Good adjunct information to the first link. Finally, the Wikipedia entry for the EFI GPT disk label: http://en.wikipedia.org/wiki/GUID_Partition_Table Hopefully this will be useful to someone. I apologize if others feel this is not the place for this, but I so often see questions about disk labeling, partitioning, and other associated topics. It seemed like it might be helpful to some people. I promise that my next post will deal with a very ZFS specific topic. Cheers! -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] getting drive serial number
I'm imagining that OpenSolaris isn't *too* different from Solaris 10 in this regard. I believe Richard Elling recommended "cfgadm -v". I'd also suggest "iostat -E", with and without "-n" for good measure. So that's "iostat -E" and "iostat -En". As long as you know the physical drive specification for the drive (ctd which appears to be c9t1d0 from the other e-mail you sent), "iostat -E" has never failed me. If you need to know the drive identifier, then that's an additional issue. On Sun, Mar 7, 2010 at 13:30, Ethan wrote: > I have a failing drive, and no way to correlate the device with errors in > the zpool status with an actual physical drive. > If I could get the device's serial number, I could use that as it's printed > on the drive. > I come from linux, so I tried dmesg, as that's what's familiar (I see that > the man page for dmesg on opensolaris says that I should be using syslogd > but I haven't been able to figure out how to get the same output from > syslogd). But, while I see at the top the serial numbers for some other > drives, I don't see the one I want because it seems to be scrolled off the > top. > Can anyone tell me how to get the serial number of my failing drive? Or > some other way to correlate the device with the physical drive? > > -Ethan > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
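A sketch of pulling the serial number out of "iostat -En" output with awk. The sample text below is a hand-written approximation of the Solaris output layout (field order from memory, serial number invented), so verify the exact format against a real system before scripting around it:

```shell
# On a live system you would pipe the real command, e.g.:
#   iostat -En | awk '/^c9t1d0/{f=1} f && /Serial No:/{sub(/.*Serial No: */,""); print $1; exit}'
# Here the same awk runs over a captured sample of the output format:
sample='c9t1d0 Soft Errors: 0 Hard Errors: 5 Transport Errors: 0
Vendor: ATA Product: WDC WD10EACS Revision: 1B01 Serial No: WD-WCASJ1234567
Size: 1000.20GB <1000204886016 bytes>'
serial=$(printf '%s\n' "$sample" | awk '/^c9t1d0/{f=1} f && /Serial No:/{sub(/.*Serial No: */,""); print $1; exit}')
echo "$serial"
```

The trick is the flag variable: iostat -En prints a multi-line stanza per device, so you latch on the device name line and then grab the first "Serial No:" that follows.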
Re: [zfs-discuss] Reading ZFS config for an extended period
Ugh! If you received a direct response from me instead of via the list, apologies for that. Rob: I'm just reporting the news. The RFE is out there. Just like SLOGs, I happen to think it a good idea, personally, but that's my personal opinion. If it makes dedup more usable, I don't see the harm. Taemun: The issue, as I understand it, is not "use-lots-of-cpu" or "just dies from paging". I believe it has more to do with all of the small, random reads/writes in updating the DDT. Remember, the DDT is stored within the pool, just as the ZIL is if you don't have a SLOG. (The S in SLOG standing for "separate".) So all the DDT updates are in competition for I/O with the actual data deletion. If the DDT could be stored as a separate VDEV already, I'm sure a way would have been hacked together by someone (likely someone on this list). Hence, the need for the RFE to create this functionality where it does not currently exist. The DDT is separate from the ARC or L2ARC. Here's the bug: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6913566 If I'm incorrect, someone please let me know. Markus: Yes, the issue would appear to be dataset size vs. RAM size. Sounds like an area ripe for testing, much like RAID Z3 performance. Cheers all! On Tue, Feb 16, 2010 at 00:20, taemun wrote: > The system in question has 8GB of ram. It never paged during the > import (unless I was asleep at that point, but anyway). > > It ran for 52 hours, then started doing 47% kernel cpu usage. At this > stage, dtrace stopped responding, and so iopattern died, as did > iostat. It was also increasing ram usage rapidly (15mb / minute). > After an hour of that, the cpu went up to 76%. An hour later, CPU > usage stopped. Hard drives were churning throughout all of this > (albeit at a rate that looks like each vdev is being controlled by a > single threaded operation). > > I'm guessing that if you don't have enough ram, it gets stuck on the > use-lots-of-cpu phase, and just dies from too much paging. 
Of course, > I have absolutely nothing to back that up. > > Personally, I think that if L2ARC devices were persistent, we already > have the mechanism in place for storing the DDT as a "separate vdev". > The problem is, there is nothing you can run at boot time to populate > the L2ARC, so the dedup writes are ridiculously slow until the cache > is warm. If the cache stayed warm, or there was an option to forcibly > warm up the cache, this could be somewhat alleviated. > > Cheers > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
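To put rough numbers on the RAM-vs-dataset-size point in the thread above, here is a back-of-envelope DDT estimate. The ~320 bytes per entry figure is a commonly quoted approximation, not something from this thread, and real pools mix block sizes, so treat the result as order-of-magnitude only:

```python
# Back-of-envelope DDT sizing: one entry per unique block, with an
# ASSUMED ~320 bytes of core per entry and 128 KiB average blocks.
def ddt_ram_estimate_gb(pool_bytes, avg_block_bytes=128 * 1024,
                        bytes_per_entry=320):
    entries = pool_bytes / avg_block_bytes      # unique blocks, worst case
    return entries * bytes_per_entry / 2**30    # GiB of DDT to keep in core

# A 10 TiB pool of 128 KiB blocks:
est = ddt_ram_estimate_gb(10 * 2**40)
print(round(est, 1))  # 25.0 (GiB) -- far beyond the poster's 8 GB of RAM
```

Which is consistent with the observed behavior: once the DDT no longer fits in ARC, every destroy turns into small random reads against the pool itself.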
Re: [zfs-discuss] Reading ZFS config for an extended period
The DDT is stored within the pool, IIRC, but there is an RFE open to allow you to store it on a separate top level VDEV, like a SLOG. The other thing I've noticed with all of the "destroyed a large dataset with dedup enabled and it's taking forever to import/destroy/ wrote: > Just thought I'd chime in for anyone who had read this - the import > operation completed this time, after 60 hours of disk grinding. > > :) > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Different Hash algorithm
Well, it's an attack, right? Neither Skein nor Threefish has been compromised. In fact, this is what you want to see - researchers attacking an algorithm which goes a long way toward furthering or proving the security of said algorithm. I think I agree with Darren overall, but this still looks promising because these researchers, while attacking Threefish and clearly finding some way to simplify a further attack, have still not managed to compromise it. Exposing the algo to the scrutiny of the community will either help strengthen it, or expose its weakness, and all will be better as a result (in theory). I am now curious, though, along with David, as to the reason Skein in particular was pointed out? Is there any particular reason, or is it just that Joerg came across it while working on his blog posts? There may not be a reason, which is perfectly fine, but for the sake of curiosity, if there is one, please share Joerg. On Sun, Feb 7, 2010 at 15:53, David Magda wrote: > > On Feb 7, 2010, at 15:10, Darren J Moffat wrote: > > On 07/02/2010 20:07, Joerg Moellenkamp wrote: >> >>> Hello, >>> >>> while writing some articles about dedup, hashes and ZFS for my blog, i >>> asked myself: When fletcher4 is fast, but collision prone and sha256 is >>> slower, but relatively secure, wouldn't it be reasonable to integrate >>> Skein (http://www.schneier.com/skein.pdf) into ZFS to yield faster >>> checksumming as well as a reduced probability of false positive >>> deduplications due to hash collisions? >>> >> >> If Skein passes the cryptanlaysis for the SHA3 competition being run by >> NIST and is the winner of that competition or is otherwise considered sounds >> by the crypto community then yes until then I think it is premature to do so >> as it is a very new algorithm. 
>> > > A new attack on Threefish (which Skein is based on) was recently announced: > >http://www.schneier.com/blog/archives/2010/02/new_attack_on_t.html > > Any reason why the OP prefers Skein over any of the other SHA-3 candidates? > >http://en.wikipedia.org/wiki/NIST_hash_function_competition > > > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private "If Linux is faster, it's a Solaris bug." - Phil Harman Blog - http://whatderass.blogspot.com/ Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
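For anyone curious what the fletcher4-vs-SHA-256 trade-off in this thread looks like in code, here is a simplified sketch. The fletcher4 below mirrors the shape of the ZFS algorithm (four 64-bit running sums over little-endian 32-bit words) but is illustrative only; the authoritative implementation is in the OpenSolaris source (zfs_fletcher.c):

```python
import hashlib
import struct

def fletcher4(data: bytes):
    # Four running sums over 32-bit LE words; input length must be a
    # multiple of 4 bytes, as ZFS block sizes always are.
    a = b = c = d = 0
    m = (1 << 64) - 1
    for (w,) in struct.iter_unpack("<I", data):
        a = (a + w) & m
        b = (b + a) & m
        c = (c + b) & m
        d = (d + c) & m
    return (a, b, c, d)

block = b"\x00" * 4096
# fletcher4 costs a few integer adds per word; SHA-256 does far more
# mixing per byte -- hence the speed vs. collision-resistance trade-off.
print(fletcher4(block))                        # (0, 0, 0, 0) for all zeros
print(hashlib.sha256(block).hexdigest()[:16])  # first 64 bits of the digest
```

A Skein (or any SHA-3 finalist) checksum would slot in between these two: stronger than fletcher4, and claimed to be faster than SHA-256 on 64-bit hardware, which is why it keeps coming up.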
Re: [zfs-discuss] Help: Advice for a NAS
I think the point, Chester, which everyone seems to be dancing around or missing, is that your planning may need to go back to the drawing board on this one. Absorb the resources out there for how to best configure your pools and vdevs, *then* implement. That's the most efficient way to go about doing what you want to do. You can't add a single disk to an existing vdev, as near as I can tell. There are other discussions on this list about expanding or transforming vdevs of a certain type to another type, but this functionality appears to be low on the ZFS development team's list of priorities (probably with good reason) and somewhat high on complexity. Now, I'll include my list of things you might want to go back and read (and possibly re-read) to do planning for your migration from the current RAIDZ to a new implementation of ZFS that gives you your desired outcome. Please excuse me if you have spent time with these, but I figure it's worth saying just in case: ZFS Admin Guide: http://docs.sun.com/app/docs/doc/819-5461?l=en ZFS @ Solaris Internals: http://www.solarisinternals.com//wiki/index.php?title=Category:ZFS Of course, the ZFS community on OpenSolaris.org: http://opensolaris.org/os/community/zfs/ No need to mention zfs-discuss. Anyway, the answer is that what you say you want to do, you cannot do. There is probably another way to accomplish your ultimate goal, but you want to get clear about that goal in DETAIL then see how, or if, you can use ZFS to achieve that goal. Since ZFS hails from a world of people who are fairly accustomed to thinking in detailed, enterprise terms, you'll either want to start looking at things in this way or seek out solutions which fit how you want to use your software and hardware. HTH. Cheers! On Mon, Aug 10, 2009 at 00:17, Chester wrote: > Hi guys, > > Previously, I had three 1TB drives in my desktop using the Intel's > southbridge RAID for storage. 
The only problem with that is every time > Windows Vista took a dump, I would be in jeopardy of corrupting the storage > space; thus, I decided to have a dedicated machine just for serving up > files. > > I now have a 3ware 9650SE 16 port host controller and currently four 2TB > Western Digital WD2002FYPS drives (I also have a leftover drive from an old > machine that's currently the boot drive), Supermicro server MB and a Celeron > 440 2GHz chip (primarily because it's 35watts and I wanted to minimize power > usage when the machine sat idle). > > I was wary about using a Microsoft OS as I wanted it to be reliable and > also didn't want to have to pay for another license, etc. I tried FreeNAS > and it was ok, but somewhat limited in what you can do with it. I have a > squeezebox and would want to install their SqueezeCenter server to serve up > lossless compressed audio files. I was also intrigued by ZFS after reading > so much about it. > > I did play around with creating a raidz1 zpool, but then learned that the > current implementation is somewhat limited in that you can't expand the > raidset. Using a raidz storage pool would also not take advantage of the > 3ware's dedicated hardware for computing parity, etc. Anyway, I > successfully created a RAID5 set using three drives and created a zpool. I > then migrated the RAID5 set to add an additional 2TB drive (took 3 days!). > The server is currently down because I needed to use the RAM elsewhere, but > after expanding the storage area, the filesystem still stayed the original > size using the three 2TB drives. I tried finding a way to get the full > space allocation, but it seems that many use simple SATA ports and the raidz > solution. I'm also a n00b, so any advice would be greatly appreciated. If > you think I'm going down the wrong path, I would like to hear it. 
> > TIA, > Chester > -- > This message posted from opensolaris.org > ___ > zfs-discuss mailing list > zfs-discuss@opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss > -- "You can choose your friends, you can choose the deals." - Equity Private AlphaGuy - http://alphaguy.blogspot.com On Twitter - @khyron4eva ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss