Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
On Sun, Jan 24, 2010 at 19:34, Toby Thain wrote:
>
> On 24-Jan-10, at 11:26 AM, R.G. Keen wrote:
>
> ...
>> I'll just blather a bit. The most durable data backup medium humans have
>> come up with was invented about 4000-6000 years ago. It's fired cuneiform
>> tablets as used in the Middle East. Perhaps one could include stone
>> carvings of Egyptian and/or Maya cultures in that. ...
>>
>> The modern computer era has nothing that even comes close. ...
>>
>> And I can't bet on a really archival data storage technology becoming
>> available. It may not get there in my lifetime.
>
> A better digital archival medium may already exist:
> http://hardware.slashdot.org/story/09/11/13/019202/Synthetic-Stone-DVD-Claimed-To-Last-1000-Years

That would be nice - but - I have to wonder how they would test it in order
to justify the actual lifespan claim. Seems like the first real aging test
would be down the road a ways. Just as well it's a start-up, I guess.
Mezzanine funding round...to come...

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Best 1.5TB drives for consumer RAID?
On Sun, Jan 24, 2010 at 08:36, Erik Trimble wrote:
> These days, I've switched to 2.5" SATA laptop drives for large-storage
> requirements. They're going to cost more $/GB than 3.5" drives, but
> they're still not horrible ($100 for a 500GB/7200rpm Seagate Momentus).
> It's also easier to cram large numbers of them into smaller spaces, so
> it's easier to get a larger number of spindles in the same case. Not to
> mention being lower-power than equivalent 3.5" drives.
>
> My sole problem is finding well-constructed high-density 2.5" hot-swap
> bay/chassis setups. If anyone has a good recommendation for a 1U or 2U
> JBOD chassis for 2.5" drives, that would really be helpful.

Erik, try this one on for size:
http://www.supermicro.com/products/accessories/mobilerack/CSE-M28E1.cfm

Supermicro has a number of variations on this theme, but I deployed this
one at a client site, and - so far - no complaints. I'm not sure I'd run
one of these personally, because it seems that drives would tend to run
hotter than if individually stacked in a conventional PC case*, but write
some of that off to me being excessively conservative when it comes to
cooling. That said, it's Supermicro after all, and they tend to sell
well-engineered gear.

*Unless you stuffed it with SSDs.

-Me
[zfs-discuss] ZFS Defragmentation - will it happen?
Can anyone comment on the likelihood of ZFS defrag becoming a reality some
day? If so, any approximation as to when? I realize this isn't exactly a
trivial endeavor, but it sure would be nice to see.
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
On Sat, Jan 2, 2010 at 13:10, Markus Kovero wrote:
> If the pool isn't rpool, you might want to boot into single-user mode (-s
> after kernel parameters on boot), remove /etc/zfs/zpool.cache and then
> reboot. After that you can merely ssh into the box and watch iostat while
> it imports.

Wow, it's utterly priceless tidbits like this that keep me addicted to
zfs-discuss.
Re: [zfs-discuss] Can I destroy a Zpool without importing it?
Are there any negative consequences as a result of a force import? I mean
STUNT: "Sudden, Totally Unexpected and Nasty Things".

-Me

On Sun, Dec 27, 2009 at 17:55, Sriram Narayanan wrote:
> opensolaris has a newer version of ZFS than Solaris. What you have is a
> pool that was not marked as exported for use on a different OS install.
>
> Simply force import the pool using zpool import -f
>
> -- Sriram
>
> On 12/27/09, Havard Kruger wrote:
>> Hi, in the process of building a new fileserver, I'm currently playing
>> around with various operating systems. I created a pool in Solaris
>> before I decided to try OpenSolaris as well, so I installed OpenSolaris
>> 2009.06, but I forgot to destroy the pool I created in Solaris, so now I
>> can't import it because the version of ZFS in Solaris is newer than the
>> one in OpenSolaris.
>>
>> And I cannot seem to find a way to destroy the pool without importing it
>> first. I guess I could format the drives in another OS, but that is a
>> lot more work than it should be. Is there any way to do this in
>> OpenSolaris?
[zfs-discuss] zfs pool Configuration "calculator"?
Does anyone know if a zfs "pool configuration calculator" (for want of a
better term) exists?

Reason for the question: we're looking at building a configuration which
has some hard limits (case size, for one). It's a collaborative project,
and not everyone resides in the same place (or even timezone), so we would
like to "build" various disk configurations for folks to look at, and
hammer out arguments for/against. Not everyone knows how to calculate the
net space available by stuffing config-a or config-b in the box, so if a
tool exists to help us visualize this, that would be invaluable (to put it
mildly).

I saw Adam's approach here:
http://blogs.sun.com/ahl/entry/sun_storage_7410_space_calculator
but, as he says:

> Remember that you need a Sun Storage 7000 appliance (even a virtual one)
> to execute the capacity calculation

Oops - I wish we had such a creature ("one day" I tell myself!) :)

As always, my thanks in anticipation of any suggestions, pointers and the
like.

Warm regards to all,
-Colin
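[List-editor's aside: lacking such a calculator, the net-space comparison the
poster describes can be roughed out in a few lines. This is a hedged
back-of-the-envelope sketch, not a real ZFS tool: it assumes each raidzN
vdev loses N disks' worth of space to parity and each mirror vdev stores one
disk's worth, and it ignores ZFS metadata overhead and slop reservation, so
real usable space will be somewhat lower. The function name and layout
labels are made up for illustration.]

```python
# Rough usable-capacity estimator for candidate pool layouts.
# Assumptions (approximate, for comparison only): each raidzN vdev
# loses N disks' worth of space to parity; a mirror vdev keeps one
# disk's worth; ZFS metadata and slop space are ignored.

PARITY = {"stripe": 0, "raidz1": 1, "raidz2": 2, "raidz3": 3}

def usable_tb(layout, disks_per_vdev, vdevs, disk_tb):
    """Return approximate usable TB for a pool of identical vdevs."""
    if layout == "mirror":
        per_vdev = disk_tb  # an n-way mirror stores one disk's worth
    else:
        per_vdev = (disks_per_vdev - PARITY[layout]) * disk_tb
    return per_vdev * vdevs

# Compare two configs that fit the same 8-bay case:
a = usable_tb("raidz2", 8, 1, 1.0)   # one 8-disk raidz2
b = usable_tb("mirror", 2, 4, 1.0)   # four 2-way mirrors
print(a, b)  # 6.0 vs 4.0 - capacity traded against redundancy/IOPS
```

Printing a small table of candidate configs this way gives collaborators in
different timezones something concrete to argue over.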
Re: [zfs-discuss] How do I determine dedupe effectiveness?
On Sun, Dec 20, 2009 at 16:23, Nick wrote:
> IMHO, snapshots are not a replacement for backups. Backups should
> definitely reside outside the system, so that if you lose your entire
> array, SAN, controller, etc., you can recover somewhere else. Snapshots,
> on the other hand, give you the ability to quickly recover to a point in
> time when something not-so-catastrophic happens - like a user deletes a
> file, an O/S update fails and hoses your system, etc. - without going to
> a backup system. Snapshots are nice, but they're no replacement for
> backups.

I agree, and said so, in response to:

> You seem to be confusing "snapshots" with "backup".

To which I replied: No, I wasn't confusing them at all. Backups are
backups. Snapshots, however, do have some limited value as backups. They're
no substitute, but they augment a planned backup schedule rather nicely in
many situations.

Please note that I said snapshots AUGMENT a well-planned backup schedule,
and in no way are they - nor should they be - considered a replacement.
Your quoted scenario is the perfect illustration: a user-deleted file, a
rollback for that update that "didn't quite work out as you hoped", and so
forth. Agreed, no argument.

The (one and only) point I was making was that - like backups - snapshots
should be kept "elsewhere", whether by using zfs send, or zipping up the
whole shebang and ssh'ing it someplace..."elsewhere" meaning beyond the
pool. Rolling 15-minute and hourly snapshots...no, they stay local, but
daily/weekly/monthly snapshots get stashed "offsite" (off-box). Apart from
anything else, it's one heck of a space-saver - in the long run.
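[List-editor's aside: the split the poster describes - frequent snapshots
stay in the pool, coarser ones get shipped off-box - amounts to a small
policy function. The sketch below is hypothetical: the
"pool/fs@interval-timestamp" naming convention is an assumption for
illustration (ZFS imposes no such format), and actually shipping a snapshot
would be done with zfs send piped over ssh, which is not shown.]

```python
# Sketch of the retention split described above: which snapshots
# should be sent beyond the pool. The name format
# "pool/fs@<interval>-<timestamp>" is an assumed convention.

OFFSITE_INTERVALS = {"daily", "weekly", "monthly"}

def ship_offsite(snapshot_name):
    """Return True if this snapshot should leave the box."""
    label = snapshot_name.split("@", 1)[1]    # e.g. "daily-2009-12-20"
    interval = label.split("-", 1)[0]         # e.g. "daily"
    return interval in OFFSITE_INTERVALS

snaps = ["tank/home@frequent-2009-12-20-1615",
         "tank/home@hourly-2009-12-20-16",
         "tank/home@daily-2009-12-20"]
print([s for s in snaps if ship_offsite(s)])
# ['tank/home@daily-2009-12-20']
```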
Re: [zfs-discuss] How do I determine dedupe effectiveness?
On Sat, Dec 19, 2009 at 19:08, Toby Thain wrote:
> On 19-Dec-09, at 11:34 AM, Colin Raven wrote:
>> Then again (not sure how gurus feel on this point) but I have this
>> probably naive and foolish belief that snapshots (mostly) oughtta reside
>> on a separate physical box/disk_array...
>
> That is not possible, except in the case of a mirror, where one side is
> recoverable separately.

I was referring to zipping up a snapshot and getting it outta Dodge onto
another physical box, or a separate array.

> You seem to be confusing "snapshots" with "backup".

No, I wasn't confusing them at all. Backups are backups. Snapshots,
however, do have some limited value as backups. They're no substitute, but
they augment a planned backup schedule rather nicely in many situations.
Re: [zfs-discuss] How do I determine dedupe effectiveness?
On Sat, Dec 19, 2009 at 17:20, Bob Friesenhahn wrote:
> On Sat, 19 Dec 2009, Colin Raven wrote:
>> There is no original, there is no copy. There is one block with
>> reference counters.
>>
>> - Fred can rm his "file" (because clearly it isn't a file, it's a
>> filename and that's all)
>> - result: the reference count is decremented by one - the data remains
>> on disk.
>
> While the similarity to hard links is a good analogy, there really is a
> unique "file" in this case. If Fred does a 'rm' on the file then the
> reference count on all the file blocks is reduced by one, and the block
> is freed if the reference count goes to zero. Behavior is similar to the
> case where a snapshot references the file block. If Janet updates a block
> in the file, then that updated block becomes unique to her "copy" of the
> file (and the reference count on the original is reduced by one) and it
> remains unique unless it happens to match a block in some other existing
> file (or snapshot of a file).

Wait...whoah, hold on. If snapshots reside within the confines of the pool,
are you saying that dedup will also count what's contained inside the
snapshots? I'm not sure why, but that thought is vaguely disturbing on some
level. Then again (not sure how gurus feel on this point) but I have this
probably naive and foolish belief that snapshots (mostly) oughtta reside on
a separate physical box/disk_array..."someplace else" anyway. I say
"mostly" because I s'pose keeping 15-minute snapshots on board is perfectly
OK - and in fact handy. Hourly...ummm, maybe the same - but daily/monthly
should reside "elsewhere".

> When we are children, we are told that sharing is good. In the case of
> references, sharing is usually good, but if there is a huge amount of
> sharing, then it can take longer to delete a set of files since the
> mutual references create a "hot spot" which must be updated sequentially.

Y'know, that is a GREAT point. Taking this one step further then - does
that also imply that there's one "hot spot" physically on a disk that keeps
getting read/written to? If so, then your point has even greater merit for
more reasons...disk wear for starters, and other stuff too, no doubt.

> Files are usually created slowly so we don't notice much impact from this
> sharing, but we expect (hope) that files will be deleted almost
> instantaneously.

Indeed, that's completely logical. Also, something most of us don't spend
time thinking about.

Bob, thanks. Your thoughts and insights are always interesting - and
usually most revealing!
Re: [zfs-discuss] How do I determine dedupe effectiveness?
On Sat, Dec 19, 2009 at 05:25, Ian Collins wrote:
> Stacy Maydew wrote:
>> The commands "zpool list" and "zpool get dedup" both show a ratio of
>> 1.10. So thanks for that answer. I'm a bit confused though: if dedup is
>> applied per zfs filesystem, not zpool, why can I only see the dedup on a
>> per-pool basis rather than for each zfs filesystem?
>>
>> Seems to me there should be a way to get this information for a given
>> zfs filesystem?
>
> The information, if present, would probably be meaningless. Consider
> which filesystem holds the block and which the dupe? What happens if the
> original is removed?

AHA - "original/copy" - I fell into the same trap. This is the question I
had back in November. Michael Schuster (http://blogs.sun.com/recursion)
helped me out, and that's my reference point. Here was my scenario:

> in /home/fred there's a photo collection
> another collection exists in /home/janet
> at some point in the past, fred sent janet a party picture, let's call it
> DSC4456.JPG
> In the dataset, there are now two copies of the file, which are genuinely
> identical.
>
> So then:
> - When you de-dupe, which copy of the file gets flung?

Michael provided the following really illuminating explanation:

> dedup (IIRC) operates at block level, not file level, so the question, as
> it stands, has no answer. what happens - again, from what I read in
> Jeff's blog - is this: zfs detects that a copy of a block with the same
> hash is being created, so instead of storing the block again, it just
> increments the reference count and makes sure whatever "thing" references
> this piece of data points to the "old" data.
>
> In that sense, you could probably argue that the "new" copy never gets
> created.

("Jeff's blog" referred to above is here:
http://blogs.sun.com/bonwick/entry/zfs_dedup)

OK, fair enough, but I still couldn't quite get my head around what's
actually happening, so I posed this followup question, in order to cement
the idea in my silly head (because I still wasn't focused on "the new copy
never gets created"):

Fred has an image (DSC4456.JPG in my example) in his home directory, and
he's sent it to Janet. Arguably - when Janet pulled the attachment out of
the email and saved it to her $HOME - that copy never got written! Instead,
the reference count was incremented by one. Fair enough, but what is Janet
"seeing" when she does an ls and greps for that image? What is she seeing:
- a symlink?
- an "apparition" of some kind?
She sees the file, it's there, but what exactly is she seeing?

Michael stepped in and described this:

> they're going to see the same file (the blocks of which now have a ref.
> counter that is one more than it was before).
>
> think posix-style hard links: two directory entries pointing to the same
> inode - both "files" are actually one, but as long as you don't change
> it, it doesn't matter. when you "remove" one (by removing the name), the
> other remains, the ref. count in the inode is decremented by one.

So, coming around full circle to your question - "What happens if the
original is removed?" - it can be answered this way:

There is no original, there is no copy. There is one block with reference
counters.

- Fred can rm his "file" (because clearly it isn't a file, it's a filename
and that's all)
- result: the reference count is decremented by one - the data remains on
disk.
OR
- Janet can rm her "filename"
- result: the reference count is decremented by one - the data remains on
disk.
OR
- both can rm the filename
- result: the reference count is now decremented by two - but there were
only two, so now it's really REALLY gone.

Or is it really REALLY gone? Nope - if you snapshotted the pool, it isn't!
:)

For me, within the core of the explanation, the posix hard link reference
somehow tipped the scales and made me understand, but we all have mental
hooks into different parts of an explanation (the "aha" moment), so YMMV :)

Dedup is fascinating. I hope you don't mind me sharing this little
list-anecdote, because it honestly made a huge difference to my
understanding of the concept. Once again, many thanks to Michael Schuster
at Sun for having the patience to walk a n00b through the steps towards
enlightenment.

-- 
-Me
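[List-editor's aside: the Fred/Janet walkthrough above can be condensed
into a toy model. This is a hedged sketch of block-level reference
counting as described in the thread - a dictionary keyed by content hash -
and NOT the real ZFS dedup table (DDT) implementation; snapshots are also
left out, so the "last rm really frees it" ending holds here but would not
in a snapshotted pool.]

```python
import hashlib

# Toy model of block-level dedup: one shared store keyed by content
# hash; per-user "files" are just names pointing at hashes.
store = {}      # hash -> [refcount, data]
namespace = {}  # (user, filename) -> hash

def write(user, filename, data):
    h = hashlib.sha256(data).hexdigest()
    if h in store:
        store[h][0] += 1      # duplicate: bump refcount, allocate nothing
    else:
        store[h] = [1, data]  # first copy: actually store the data
    namespace[(user, filename)] = h

def rm(user, filename):
    h = namespace.pop((user, filename))
    store[h][0] -= 1
    if store[h][0] == 0:      # last reference gone: free the block
        del store[h]

photo = b"...jpeg bytes..."
write("fred", "DSC4456.JPG", photo)
write("janet", "DSC4456.JPG", photo)   # "never gets created": refcount 2
assert len(store) == 1                  # one block, two names
rm("fred", "DSC4456.JPG")
assert len(store) == 1                  # data remains on "disk"
rm("janet", "DSC4456.JPG")
assert len(store) == 0                  # now it's really REALLY gone
```

There is no original and no copy in this model either - both names point at
the same refcounted block, which is the "aha" the hard-link analogy buys.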
Re: [zfs-discuss] force 4k writes
On Thu, Dec 17, 2009 at 09:14, Eric D. Mudama wrote:
> On Wed, Dec 16 at 7:35, Bill Sprouse wrote:
>> The question behind the question is, given the really bad things that
>> can happen performance-wise with writes that are not 4k aligned when
>> using flash devices, is there any way to insure that any and all writes
>> from ZFS are 4k aligned?
>
> Some flash devices can handle this better than others, often several
> orders of magnitude better. Not all devices (as you imply) are
> so-affected.

Is there - somewhere - a list of flash devices, with some (perhaps
subjective) indication of how they handle issues like this?

-- 
-Me
Re: [zfs-discuss] will deduplication know about old blocks?
Adam,

So therefore the best way is to set this at pool creation time...OK, that
makes sense: it operates only on fresh data that's coming over the fence.

BUT

What happens if you snapshot, send, destroy, recreate (with dedup on this
time around) and then write the contents of the cloned snapshot to the
various places in the pool - which properties are in the ascendancy here,
the "host pool" or the contents of the clone? The host pool, I assume,
because the clone contents are (in this scenario) "just some new data"?

-Me

On Wed, Dec 9, 2009 at 18:43, Adam Leventhal wrote:
> Hi Kjetil,
>
> Unfortunately, dedup will only apply to data written after the setting is
> enabled. That also means that new blocks cannot dedup against old blocks
> regardless of how they were written. There is therefore no way to
> "prepare" your pool for dedup -- you just have to enable it when you have
> the new bits.
>
> On Dec 9, 2009, at 3:40 AM, Kjetil Torgrim Homme wrote:
>> I'm planning to try out deduplication in the near future, but started
>> wondering if I can prepare for it on my servers. one thing which struck
>> me was that I should change the checksum algorithm to sha256 as soon as
>> possible. but I wonder -- is that sufficient? will the dedup code know
>> about old blocks when I store new data?
>>
>> let's say I have an existing file img0.jpg. I turn on dedup, and copy it
>> twice, to img0a.jpg and img0b.jpg. will all three files refer to the
>> same block(s), or will only img0a and img0b share blocks?
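[List-editor's aside: Kjetil's img0/img0a/img0b question has a crisp answer
in toy-model form. The sketch below is an assumption-laden illustration of
the *observable behavior* Adam describes - the dedup table only learns
about blocks written while dedup is on - and not the real DDT; the Pool
class and its fields are invented for this example.]

```python
import hashlib

# Toy illustration: blocks written before dedup is enabled are
# invisible to the dedup table, so later identical writes cannot
# match them. Mirrors Kjetil's img0 / img0a / img0b scenario.
class Pool:
    def __init__(self):
        self.dedup = False
        self.ddt = {}         # hash -> refcount (populated only when on)
        self.blocks_used = 0

    def write(self, data):
        h = hashlib.sha256(data).hexdigest()
        if self.dedup and h in self.ddt:
            self.ddt[h] += 1  # deduped: no new allocation
            return
        if self.dedup:
            self.ddt[h] = 1   # first dedup-aware copy of this block
        self.blocks_used += 1

p = Pool()
p.write(b"img0")      # img0.jpg, written with dedup off: unknown to DDT
p.dedup = True
p.write(b"img0")      # img0a.jpg: allocates again (can't see old block)
p.write(b"img0")      # img0b.jpg: dedups against img0a's block
print(p.blocks_used)  # 2, not 1 - only img0a and img0b share
```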
Re: [zfs-discuss] Deduplication - deleting the original
On Tue, Dec 8, 2009 at 22:54, Jeff Bonwick wrote:
>> i am no pro in zfs, but to my understanding there is no original.
>
> That is correct. From a semantic perspective, there is no change in
> behavior between dedup=off and dedup=on. Even the accounting remains the
> same: each reference to a block is charged to the dataset making the
> reference. The only place you see the effect of dedup is at the pool
> level, which can now have more logical than physical data. You may also
> see a difference in performance, which can be either positive or negative
> depending on a whole bunch of factors.
>
> At the implementation level, all that's really happening with dedup is
> that when you write a block whose contents are identical to an existing
> block, instead of allocating new disk space we just increment a reference
> count on the existing block. When you free the block (from the dataset's
> perspective), the storage pool decrements the reference count, but the
> block remains allocated at the pool level. When the reference count goes
> to zero, the storage pool frees the block for real (returns it to the
> storage pool's free space map).
>
> But, to reiterate, none of this is visible semantically. The only way you
> can even tell dedup is happening is to observe that the total space used
> by all datasets exceeds the space allocated from the pool -- i.e. that
> the pool's dedup ratio is greater than 1.0.

Jeff, Thomas, Ed & Michael;

Thank you all for assisting in the education of a n00bie in this most
important ZFS feature. I *think* I have a better overall understanding now.
This list is a resource treasure trove! I hope I'm able to acquire
sufficient knowledge over time to eventually be able to contribute help to
other newcomers.

Regards & Thanks for all the help,
-Me
[zfs-discuss] Deduplication - deleting the original
In reading this blog post:
http://blogs.sun.com/bobn/entry/taking_zfs_deduplication_for_a
a question came to mind. To understand the context of the question,
consider the opening paragraph from the above post:

> Here is my test case: I have 2 directories of photos, totaling about 90MB
> each. And here's the trick - they are almost complete duplicates of each
> other. I downloaded all of the photos from the same camera on 2 different
> days. How many of you do that? Yeah, me too.

OK, I consider myself in that category most certainly. Through just plain
'ol sloppiness I must have multiple copies of some images. Sad
self-indictment...but anyway.

What happens if, once dedup is on, I (or someone else with delete rights)
open a photo management app containing that collection, and start deleting
dupes - AND - happen to delete the original that all other references are
pointing to? I know, I know, it doesn't matter - snapshots save the day -
but in this instance that's not the point, because I'm trying to properly
understand the underlying dedup concept.

Logically, if you delete what everything is pointing at, all the pointers
are now null values; they are - in effect - pointing at nothing...an empty
hole. I have the feeling the answer to this is "no they don't, there is no
spoon ("original"), you're still OK". I suspect that only because the
people who thought this up couldn't possibly have missed such an "obvious"
point. The problem I have is in trying to mentally frame this in such a way
that I can subsequently explain it, if asked to do so (which I see coming
for sure). Help in understanding this would be hugely helpful - anyone?

Regards & TIA,
-Me
Re: [zfs-discuss] ZFS send | verify | receive
On Sat, Dec 5, 2009 at 17:17, Richard Elling wrote:
> On Dec 4, 2009, at 4:11 PM, Edward Ned Harvey wrote:
>>> Depending on your version of OS, I think the following post from
>>> Richard Elling will be of great interest to you:
>>> http://richardelling.blogspot.com/2009/10/check-integrity-of-zfs-send-streams.html
>>
>> Thanks! :-)
>> No, wait!
>>
>> According to that page, if you "zfs receive -n" then you should get a 0
>> exit status for success, and 1 for error.
>>
>> Unfortunately, I've been sitting here and testing just now ... I created
>> a "zfs send" datastream, then I made a copy of it and toggled a bit in
>> the middle to make it corrupt ...
>>
>> I found that the "zfs receive -n" always returns 0 exit status, even if
>> the data stream is corrupt. In order to get the "1" exit status, you
>> have to get rid of the "-n", which unfortunately means writing the
>> completely restored filesystem to disk.
>
> I believe it will depend on the nature of the corruption. Regardless, the
> answer is to use zstreamdump.

Richard, do you know of any usage examples of zstreamdump? I've been
searching for examples since you posted this, and don't see anything that
shows how to use it in practice. argh.

-C
Re: [zfs-discuss] ZFS dedup issue
Hey Cindy!

Any idea of when we might see 129? (An approximation only.) I ask the
question because I'm pulling budget funds to build a filer, but it may not
be in service until mid-January. Would it be reasonable to say that we
might see 129 by then, or are we looking at summer...or even beyond? I
don't see that there's a "wrong answer" here necessarily :) :) :) I'll go
with what's out, but dedup is a big one, and a feature that made me commit
to this project.

-Colin

On Wed, Dec 2, 2009 at 17:06, Cindy Swearingen wrote:
> Hi Jim,
>
> Nevada build 128 had some problems so will not be released.
>
> The dedup space fixes should be available in build 129.
>
> Thanks,
> Cindy
>
> On 12/02/09 02:37, Jim Klimov wrote:
>> Hello all
>>
>> Sorry for bumping an old thread, but now that snv_128 is due to appear
>> as a public DVD download, I wonder: has this fix for zfs-accounting and
>> other issues with zfs dedup been integrated into build 128?
>>
>> We have a fileserver which is likely to have much redundant data, and
>> we'd like to clean up its space with zfs-deduping (even if that takes
>> copying files over to a temp dir and back - so their common blocks are
>> noticed by the code). Will build 128 be ready for the task - and
>> increase our server's available space after deduping - or should we
>> better wait for another one?
>>
>> In general, were there any stability issues with snv_128 during
>> internal/BFU testing?
>>
>> TIA,
>> //Jim
[zfs-discuss] Dedup question
Folks,

I've been reading Jeff Bonwick's fascinating dedup post. This is going to
sound like either the dumbest or the most obvious question ever asked, but
if you don't know and can't produce meaningful RTFM results...ask. So here
goes:

Assuming you have a dataset in a zfs pool that's been deduplicated, with
pointers all nicely in place and so on - doesn't this mean that you're now
always and forever tied to ZFS (and why not? I'm certainly not saying
that's a Bad Thing), because no other "wannabe file system" will be able to
read those ZFS pointers? Or am I horribly misunderstanding the concept in
some way?

Regards - and as always - TIA,
-Me
[zfs-discuss] Resilver/scrub times?
Hi all!

I've decided to take the "big jump" and build a ZFS home filer (although it
might also do "other work" like caching DNS, mail, usenet, bittorrent and
so forth). YAY!

I wonder if anyone can shed some light on how long a pool scrub would take
on a fairly decent rig. These are the specs as-ordered:

- Asus P5Q-EM mainboard
- Core2 Quad 2.83 GHz
- 8GB DDR2/80
- OS: 2 x SSDs in RAID 0 (brand/size not decided on yet, but they will
  definitely be some flavor of SSD)
- Data: 4 x 1TB Samsung Spin Point 7200 RPM 32MB cache SATA HDs (RAIDZ)

Data payload initially will be around 550GB or so (before loading any stuff
from another NAS and so on).

Does scrub like memory, or CPU, or both? There's enough horsepower
available, I would think. Same question applies to resilvering, if I need
to swap out drives at some point. [cough]

I can't wait to get this thing built! :)

Regards & TIA,
Your just-subscribed total ZFS n00b,
-Me
Re: [zfs-discuss] ZFS on SAN?
On Sun, Feb 15, 2009 at 8:02 PM, Bob Friesenhahn
<bfrie...@simple.dallas.tx.us> wrote:
> On Sun, 15 Feb 2009, Colin Raven wrote:
>> Pardon me for jumping into this discussion. I invariably lurk and keep
>> mouth firmly shut. In this case however, curiosity and a degree of alarm
>> bade me to jump in...could you elaborate on 'fragmentation', since the
>> only context I know this from is Windows. Now surely, ZFS doesn't suffer
>> from the same sickness?
>
> ZFS is "fragmented by design". Regardless, it takes steps to minimize
> fragmentation, and the costs of fragmentation. Files written sequentially
> at a reasonable rate of speed are usually contiguous on disk as well. A
> "slab" allocator is used in order to allocate space in larger units, and
> then dice this space up into ZFS 128K blocks so that related blocks will
> be close together on disk. The use of larger block sizes (default 128K vs
> 4K or 8K) dramatically reduces the amount of disk seeking required for
> sequential I/O when fragmentation is present. Written data is buffered in
> RAM for up to 5 seconds before being written so that opportunities for
> contiguous storage are improved. When the pool has multiple vdevs, then
> ZFS's "load share" can also intelligently allocate file blocks across
> multiple disks such that there is minimal head movement, and multiple
> seeks can take place at once.
>
>> As a followup: is there any ongoing sensible way to defend against the
>> dreaded fragmentation? A [shudder] "defrag" routine of some kind
>> perhaps? Forgive the "silly questions" from the sidelines...ignorance
>> knows no bounds apparently :)
>
> The most important thing is to never operate your pool close to 100%
> full. Always leave a reserve so that ZFS can use reasonable block
> allocation policies, and is not forced to allocate blocks in a way which
> causes additional performance penalty. Installing more RAM in the system
> is likely to decrease fragmentation, since then ZFS can defer writes
> longer and make better choices about where to put the data.
>
> Updating already-written portions of files "in place" will convert a
> completely contiguous file into a fragmented file due to ZFS's
> copy-on-write design.

Thank you for a most lucid and readily understandable explanation. I shall
now return to the sidelines...hoping to have a zfs box up and running
sometime in the near future when budget and time permit. Keeping up with
this list is helpful in anticipation of that time arriving.
Re: [zfs-discuss] ZFS on SAN?
On Sun, Feb 15, 2009 at 5:00 PM, Bob Friesenhahn
<bfrie...@simple.dallas.tx.us> wrote:
> On Sun, 15 Feb 2009, Robert Milkowski wrote:
>> Well, in most cases resilver in ZFS should be quicker than resilver in a
>> disk array, because ZFS will resilver only blocks which are actually in
>> use, while most disk arrays will blindly resilver full disk drives. So
>> assuming you still have plenty of unused disk space in a pool, then zfs
>> resilver should take less time.
>
> It is reasonable to assume that storage will eventually become close to
> full. Then the user becomes entrapped by their design. Adding to the
> issues is that as the ZFS pool ages and becomes full, it becomes slower
> as well due to increased fragmentation, and this fragmentation slows down
> resilver performance.

Pardon me for jumping into this discussion. I invariably lurk and keep
mouth firmly shut. In this case however, curiosity and a degree of alarm
bade me to jump in...could you elaborate on 'fragmentation', since the only
context I know this from is Windows. Now surely, ZFS doesn't suffer from
the same sickness?

As a followup: is there any ongoing sensible way to defend against the
dreaded fragmentation? A [shudder] "defrag" routine of some kind perhaps?
Forgive the "silly questions" from the sidelines...ignorance knows no
bounds apparently :)

Warm Regards,
-Colin
Re: [zfs-discuss] The ZFS inventor and Linus sitting in a tree?
On Mon, May 19, 2008 at 10:06 PM, Bill McGonigle <[EMAIL PROTECTED]> wrote:
> On May 18, 2008, at 14:01, Mario Goebbels wrote:
>> I mean, if the Linux folks want it, fine. But if Sun's actually helping
>> with such a possible effort, then it's just shooting itself in the foot
>> here, in my opinion.
>
> [...] they're quick to do it - they threatened to sue me when they
> couldn't figure out how to take back a try-out server.

There's a story contained within that, for sure! :) You brought a smile to
this subscriber when I read it.

> Having ZFS as a de-facto standard lifts all boats, IMHO.

It's still hard to believe (in one sense) that the entire world isn't
beating a path to Sun's door and PLEADING for ZFS. This is (if y'all will
forgive the colloquialism) a kick-ass amazing piece of software. It appears
to defy all the rules, a bit like levitation in a way - or perhaps it just
rewrites those rules. There are days I still can't get my head around what
ZFS really is.

In general, licensing issues just make my brain bleed, but one hopes that
the licensing gurus can get their heads together and find a way to get this
done. I don't personally believe that OpenSolaris *OR* Solaris will lose if
ZFS makes its way over the fence to Linux; I think that this is a big
enough tent for everyone. Sure hope so anyway - it would be immensely sad
to see technology like this not being adopted/ported/migrated/whatever more
widely because of "damn lawyers" and the morass called licensing.

Perhaps (gazing into a cloudy crystal ball that hasn't been cleaned in a
while) Solaris/OpenSolaris can manage to hold onto ZFS-on-boot, which is
perhaps *the* most mind-bending accomplishment within the ZFS concept, and
let the rest procreate elsewhere. That could contribute to the
"must-have/must-install" cachet of Solaris/OpenSolaris.

Umm, my uninspiring and non-expert contribution to this (unusually)
non-expert thread. Thanks to all involved on this list; sometimes it seems
like every post is a kind of mini-tutorial all on its own.

Warm Regards, and thanks for all the fish/knowledge nuggets,
-Colin

-- 
Colin J. Raven
"A wide-eyed neophyte staring through the window at the wizards toiling
within"