Re: [zfs-discuss] Server Cloning With ZFS?
On Thu, Jun 18, 2009 at 10:56 AM, Dave Ringkor <no-re...@opensolaris.org> wrote:
> But what if I used zfs send to save a recursive snapshot of my root pool on the old server, booted my new server (with the same architecture) from the DVD in single user mode and created a ZFS pool on its local disks, and did zfs receive to install the boot environments there? The filesystems don't care about the underlying disks. The pool hides the disk specifics. There's no vfstab to edit.

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Root_Pool_Recovery

-- Fajar

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] compression at zfs filesystem creation
Bob Friesenhahn wrote:
> On Wed, 17 Jun 2009, Haudy Kazemi wrote:
>> usable with very little CPU consumed. If the system is dedicated to serving files rather than also being used interactively, it should not matter much what the CPU usage is. CPU cycles can't be stored for later use. Ultimately, it (mostly*) does not matter if
> Clearly you have not heard of the software flywheel: http://www.simplesystems.org/users/bfriesen/software_flywheel.html

I had not heard of such a device; however, from the description it appears to be made from virtual unobtanium :)

My line of reasoning is that unused CPU cycles are to some extent a wasted resource, paralleling the idea that having system RAM sitting empty/unused is also a waste and that it should be used for caching until the system needs that RAM for other purposes (how the ZFS cache is supposed to work). This isn't a perfect parallel, as CPU power consumption and heat output vary by load much more than RAM's do. I'm sure someone could come up with a formula for the optimal CPU loading to maximize energy efficiency. There has been work on this; see the paper 'Dynamic Data Compression in Multi-hop Wireless Networks' at http://enl.usc.edu/~abhishek/sigmpf03-sharma.pdf .

> If I understand the blog entry correctly, for text data the task took up to 3.5X longer to complete, and for media data, the task took about 2.2X longer to complete with a maximum storage compression ratio of 2.52X. For my backup drive using lzjb compression I see a compression ratio of only 1.53x.

I linked to several blog posts. It sounds like you are referring to http://blogs.sun.com/dap/entry/zfs_compression#comments ?
This blog's test results show that on their quad-core platform (the Sun 7410 has quad-core 2.3 GHz AMD Opteron CPUs*):

* http://sunsolve.sun.com/handbook_pub/validateUser.do?target=Systems/7410/spec

- For text data, LZJB compression had negligible performance benefits (task times were unchanged or marginally better) and less storage space was consumed (1.47:1).
- For media data, LZJB compression had negligible performance benefits (task times were unchanged or marginally worse) and storage space consumed was unchanged (1:1).

Take-away message: as currently configured, their system has nothing to lose from enabling LZJB.

- For text data, GZIP compression at any setting had a significant negative impact on write times (CPU bound), no performance impact on read times, and significant positive improvements in compression ratio.
- For media data, GZIP compression at any setting had a significant negative impact on write times (CPU bound), no performance impact on read times, and marginal improvements in compression ratio.

Take-away message: with GZIP, as their system is currently configured, write performance would suffer in exchange for a higher compression ratio. This may be acceptable if the system fulfills a role with a read-heavy usage profile of compressible content. (An archive.org backend would be such an example.) This is similar to the tradeoff made when comparing RAID1 or RAID10 vs. RAID5.

Automatic benchmarks could be used to detect and select the optimal compression settings for best performance, with the basic case assuming the system is a dedicated file server, and more advanced cases accounting for the CPU needs of other processes run on the same platform. Another way would be to ask the administrator what the usage profile for the machine will be and preconfigure compression settings suitable for that use case. Single- and dual-core systems are more likely to become CPU bound from enabling compression than a quad-core.
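For context, the compression settings being benchmarked are per-dataset ZFS properties; a minimal sketch of trying them out follows (the pool/dataset names 'tank/text' and 'tank/media' are hypothetical examples):

```shell
# Enable the default LZJB algorithm on one dataset and gzip level 6 on
# another; only blocks written after the change are compressed.
zfs set compression=lzjb tank/text
zfs set compression=gzip-6 tank/media

# After writing representative data, check the achieved ratios:
zfs get compression,compressratio tank/text tank/media
```

These commands require a live pool, so treat this as an illustration of the property names rather than a recipe.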
All systems have bottlenecks in them somewhere by virtue of design decisions. One or more of these bottlenecks will be the rate-limiting factor for any given workload, such that even if you speed up the rest of the system, the process will still take the same amount of time to complete. The LZJB compression benchmarks on the quad-core above demonstrate that LZJB is not the rate limiter on either writes or reads. The GZIP benchmarks show that it is a rate limiter, but only during writes. On a more powerful platform (6x faster CPU), GZIP writes may no longer be the bottleneck (assuming that the network bandwidth and drive I/O bandwidth remain unchanged). System component balancing also plays a role. If the server is connected via a 100 Mbps CAT5e link, and all I/O activity is from client computers on that link, does it make any difference if the server is actually capable of GZIP writes at 200 Mbps, 500 Mbps, or 1500 Mbps? If the network link is later upgraded to gigabit Ethernet, now only the system capable of GZIPing at 1500 Mbps can keep up. The rate-limiting factor changes as different components are upgraded. In many systems and for many workloads, hard drive I/O bandwidth is the rate-limiting factor with the most significant performance impact, such that a 20% boost
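The min-of-components reasoning above can be sketched as a trivial calculation (the rates are the hypothetical Mbps figures from the paragraph, not measurements):

```shell
# Effective throughput is bounded by the slowest stage; print the minimum.
effective_rate() {
  printf '%s\n' "$@" | sort -n | head -1
}

# 100 Mbps link, GZIP at 200 Mbps, disks at 500 Mbps: the network limits.
effective_rate 100 200 500    # prints 100
# Upgrade the link to 1000 Mbps: GZIP at 200 Mbps becomes the bottleneck.
effective_rate 1000 200 500   # prints 200
```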
Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
2009/6/18 Timh Bergström <timh.bergst...@diino.net>:
> USB-sticks has proven a bad idea with zfs mirrors I think,

USB sticks are a bad idea for mirrors in general... :-)

> ZFS on iSCSI *is* flaky

OK, so what is the status of your bug report about this? Was it ignored or just rejected?..

> Flaming people on ./

Nobody is flaming people, neither in the current directory (./) nor on /. (slash-dot). All that was asked for is practical steps or bug reports.

P.S. Additionally, everyone can spend their true anger on an installed Solaris somewhere on spare hardware and kill that sucker with stress-tests. Effect: you're relaxed and the Sun folks have a job. :-)

-- Kind regards, BM

Things that are stupid at the beginning rarely end up wisely.
Re: [zfs-discuss] zfs on 32 bit?
yeah. many of those ARM systems will be low-power, builtin-crypto-accel, builtin-gigabit-MAC boxes based on Orion and similar, NAS (NSLU2-ish) things begging for ZFS. So what's the boot environment they use?

cd> It's true for most of the Intel Atom family (Zxxx and Nxxx, but
cd> not the 230 and 330, as those are 64 bit). Those are new systems.

the 64-bit Atoms are desktop, and the 32-bit are laptop. They are both current chips right now---the 64-bit are not newer than the 32-bit.

I know; I'm not sure about the recent Pineview boxes, though.

Casper
Re: [zfs-discuss] Using single SSD for l2arc on multiple pools?
Hi Joseph;

You can't share SSDs between pools (at least for today) unless you slice. Also, it's better to use two SSDs for L2ARC, as depending on your system there can be slight limitations to using one SSD.

Best regards,
Mertol

Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email mertol.ozyo...@sun.com

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Joseph Mocker
Sent: Tuesday, June 16, 2009 10:28 PM
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] Using single SSD for l2arc on multiple pools?

Hello, I'm curious if it is possible to use a single SSD for the l2arc for multiple pools? I'm guessing that I can break the SSD into multiple slices and assign a slice as a cache device in each pool. That doesn't seem very flexible though, so I was wondering if there is another way to do this? Thanks... --joe
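The slicing approach Mertol describes could look roughly like this (device and pool names are hypothetical; the SSD would first be partitioned into slices s0 and s1 with format(1M)):

```shell
# Attach one slice of the shared SSD to each pool as an L2ARC cache device.
zpool add tank1 cache c2t0d0s0
zpool add tank2 cache c2t0d0s1

# Cache devices can later be removed without affecting pool data:
zpool remove tank1 c2t0d0s0
```

This is a sketch of the idea only; it needs a live system with real device names to run.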
Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 18 June 2009 at 09:42, Bogdan M. Maryniuk <bogdan.maryn...@gmail.com> wrote:
>> ZFS on iSCSI *is* flaky
> OK, so what is the status of your bugreport about this? Was ignored or just rejected?..

No bug report, because I don't think it's the file system's fault, and why bother when disappearing vdevs (even though the pool is fully redundant (raidz) and has enough vdevs to be theoretically working) cause the machine to panic and crash, when there are other solutions/file systems that are more robust (for me) when using iSCSI/FC. If my data is gone (or inaccessible), I have other things to worry about than filing bug reports and/or getting on the list and getting flamed for not having proper backups. :-]

How to reproduce? Create a raidz2 pool (with Solaris 10u3) over two iSCSI enclosures, shut down one of the enclosures, and observe the results. It would probably work better if I upgraded Solaris/ZFS, but as I said - at the time I had other things to worry about.

No flaming/blaming/hating; I simply don't use the combination ZFS+iSCSI/FC for critical data anymore, and that's OK with me.

-- Best Regards, Timh
Re: [zfs-discuss] ZFS: Re-Propragate inheritable ACL permissions
Hi Cindy and Christo,

this is a good example of how useless ZFS ACLs are. Nobody understands how to use them! Please note in Cindy's examples above:

- You cannot use file_inherit on files. Inheritance can only be set on directories.
- Depending on the zfs aclinherit mode, the result may not be what you want.
- When you have set ACL inheritance on a directory and use chmod in the old way, e.g. chmod g-w dir1, the ACL inheritance of dir1 is modified!
- Be extremely careful with chmod A=... since this replaces any ACL set on a file/dir, including the trivial ACLs for owner@, group@ and everyone@.

My experience: avoid ACLs wherever you can. They are simply not manageable.

Andreas

-- This message posted from opensolaris.org
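For what it's worth, a hypothetical Solaris session illustrating two of the points above (the directory name dir1 and group name staff are examples; syntax per Solaris chmod(1)):

```shell
# Set an inheritable allow entry on a directory; file_inherit/dir_inherit
# only make sense on directories, as noted above.
/usr/bin/chmod A+group:staff:read_data/execute:file_inherit/dir_inherit:allow dir1
ls -dv dir1     # shows the ACL, including the inheritance flags

# An old-style mode change rewrites the ACL entries, so the inheritance
# set above may be modified by something as innocent as:
/usr/bin/chmod g-w dir1
ls -dv dir1
```

This needs a Solaris system with ZFS to run; it is meant only to make the pitfall concrete.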
Re: [zfs-discuss] compression at zfs filesystem creation
On Thu, 18 Jun 2009, Haudy Kazemi wrote:
> For text data, LZJB compression had negligible performance benefits (task times were unchanged or marginally better) and less storage space was consumed (1.47:1). For media data, LZJB compression had negligible performance benefits (task times were unchanged or marginally worse) and storage space consumed was unchanged (1:1). Take-away message: as currently configured, their system has nothing to lose from enabling LZJB.

My understanding is that these tests were done with NFS and one client over gigabit Ethernet (a file server scenario). So in this case, the system is able to keep up with NFS over gigabit Ethernet when LZJB is used. In a stand-alone power-user desktop scenario, the situation may be quite different. In that case, application CPU usage may be competing with storage CPU usage. Since ZFS often defers writes, it may be that the compression is performed at the same time as application compute cycles.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
[zfs-discuss] 7110 questions
Hi all, (down to the wire here on EDU grant pricing :)

I'm looking at buying a pair of 7110's in the EDU grant sale. The price is sure right. I'd use them in a mirrored, cold-failover config. I'd primarily be using them to serve a VMware cluster; the current config is two standalone ESX servers with local storage, 450G of SAS RAID10 each. The 7110 price point is great, and I think I have a reasonable understanding of how this stuff ought to work.

I'm curious about a couple of things that would be unsupported. Specifically, whether they are merely unsupported or have specifically been crippled in the software.

1) SSDs. I can imagine buying an Intel SSD, slotting it into the 7110, and using it as a ZFS L2ARC (I mean the equivalent of Readzilla).

2) Expandability. I can imagine buying a SAS card and a JBOD and hooking it up to the 7110; it has plenty of PCI slots.

Finally, one question - I presume that I need to devote a pair of disks to the OS, so I really only get 14 disks for data. Correct?

thanks! danno

-- Dan Pritts, Sr. Systems Engineer Internet2 office: +1-734-352-4953 | mobile: +1-734-834-7224 ESCC/Internet2 Joint Techs July 19-23, 2009 - Indianapolis, Indiana http://jointtechs.es.net/indiana2009/
Re: [zfs-discuss] problems with l2arc in 2009.06
correct ratio of arc to l2arc? From http://blogs.sun.com/brendan/entry/l2arc_screenshots:

> It costs some DRAM to reference the L2ARC, at a rate proportional to record size. For example, it currently takes about 15 Gbytes of DRAM to reference 600 Gbytes of L2ARC - at an 8 Kbyte ZFS record size. If you use a 16 Kbyte record size, that cost would be halved - 7.5 Gbytes. This means you shouldn't, for example, configure a system with only 8 Gbytes of DRAM, 600 Gbytes of L2ARC, and an 8 Kbyte record size - if you did, the L2ARC would never fully populate.
Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files
hi Dirk,

How might we explain running find on a Linux client against an NFS-mounted file system under the 7000 taking significantly longer (i.e. performance behaving as though the command was run from Solaris)? I'm not sure find would have the intelligence to differentiate between file system types and run different sections of code based upon what it finds?

louis

On 06/17/09 11:38, Dirk Nitschke wrote:
> Hi Louis! Solaris /usr/bin/find and Linux (GNU-) find work differently! I have experienced dramatic runtime differences some time ago. The reason is that Solaris find and GNU find use different algorithms. GNU find uses the st_nlink (number of links) field of the stat structure to optimize its work. Solaris find does not use this kind of optimization because the meaning of the number of links is not well defined and is file system dependent. If you are interested, take a look at, say, CR 4907267 "link count problem in hsfs" and CR 4462534 "RFE: pcfs should emulate link counts for directories". Dirk

On 17.06.2009 at 18:08, Louis Romero wrote:
> Jose, I believe the problem is endemic to Solaris. I have run into similar problems doing a simple find(1) in /etc. On Linux, a find operation in /etc is almost instantaneous. On Solaris, it has a tendency to spin for a long time. I don't know what their use of find might be but, running updatedb on the Linux clients (with the NFS file system mounted, of course) and using locate(1) will give you a work-around on the Linux clients. Caveat emptor: there is a staleness factor associated with this solution, as any new files dropped in after updatedb runs will not show up until the next updatedb is run. HTH louis

On 06/16/09 11:55, Jose Martins wrote:
> Hello experts, IHAC that wants to put more than 250 million files on a single mountpoint (in a directory tree with no more than 100 files in each directory). He wants to share such a filesystem by NFS and mount it through many Linux Debian clients. We are proposing a 7410 Openstore appliance...
> He is claiming that certain operations like find, even if run from the Linux clients on such an NFS mountpoint, take significantly more time than if such an NFS share was provided by other NAS vendors like NetApp... Can someone confirm if this is really a problem for ZFS filesystems?... Is there any way to tune it?... We thank any input. Best regards, Jose
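Dirk's point about GNU find's st_nlink trick can be seen directly on a Linux filesystem that follows the classic convention: a directory's link count is 2 plus its number of subdirectories, so find can stop calling stat() once it has found nlink-2 subdirectories. The demo paths below are throwaway examples, and note that some filesystems (btrfs, for instance) don't honor the convention - which is exactly why Solaris find avoids relying on it:

```shell
# Create a directory with 2 subdirectories and 1 plain file, then read
# its link count; on ext4/tmpfs this prints 4 (self + parent + 2 subdirs),
# so GNU find knows only 2 of the 3 entries can be directories.
demo=$(mktemp -d)
mkdir "$demo/a" "$demo/b"
touch "$demo/file"
stat -c '%h' "$demo"
rm -rf "$demo"
```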
Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files
Hi Jose,

Well, it depends on the total size of your zpool and how often these files are changed.

I was at a customer, a huge internet provider, who had 40 X4500s with standard Solaris, using ZFS. All the machines were equipped with 48x 1TB disks. The machines were used to provide the email platform, so all the user email accounts were on the system. This also meant millions of files in one zpool. What they noticed on the X4500 systems was that when the zpool filled up to about 50-60%, the performance of the system dropped enormously. They claim this has to do with fragmentation of the ZFS filesystem.

So we tried putting an S7410 system in there with about the same disk config, 44x 1TB SATA, BUT 4x 18GB Writezilla (in a stripe), and we were able to get much, much more I/O from the system than the comparable X4500. However, they put it in production for a couple of weeks, and as soon as the ZFS filesystem came into the range of about 50-60% full, they saw the same problem. The performance dropped enormously.

NetApp has the same problem with their WAFL filesystem (they also tested this); however, NetApp does provide a defragmentation tool for it. This is also NOT a nice solution, because you have to run it, manually or scheduled, and it takes a lot of system resources, but it helps.

I did hear Sun is denying we have this problem in ZFS, and that therefore we don't need a kind of defragmentation mechanism; however, our customer experiences are different. Maybe it is good for the ZFS group to look at this (potential) problem. The customer I am talking about is willing to share their experiences with Sun engineering.

greetings, Cor Beumer

Jose Martins wrote:
> Hello experts, IHAC that wants to put more than 250 million files on a single mountpoint (in a directory tree with no more than 100 files in each directory).
> He wants to share such a filesystem by NFS and mount it through many Linux Debian clients. We are proposing a 7410 Openstore appliance... He is claiming that certain operations like find, even if run from the Linux clients on such an NFS mountpoint, take significantly more time than if such an NFS share was provided by other NAS vendors like NetApp... Can someone confirm if this is really a problem for ZFS filesystems?... Is there any way to tune it?... We thank any input. Best regards, Jose

-- Cor Beumer Data Management Storage Sun Microsystems Nederland BV Saturnus 1 3824 ME Amersfoort The Netherlands Phone +31 33 451 5172 Mobile +31 6 51 603 142 Email cor.beu...@sun.com
Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
bmm == Bogdan M Maryniuk <bogdan.maryn...@gmail.com> writes:
tt == Toby Thain <t...@telegraphics.com.au> writes:

bmm> That's why I think that speaking "My $foo crashes therefore it is all crap" is bad idea: either help to fix it or just don't use it,

First, people are allowed to speak and share information, and yes, even complain, without helping to fix things. You do not get to silence people who lack the talent, time, and interest to fix problems. Everyone's allowed to talk here.

Second, I do use ZFS. But I keep a backup pool. And although my primary pool is iSCSI-based, the backup pools are direct-attached. Thanks to the open discussion on the list, I know that using iSCSI puts me at higher risk of pool loss. I know I need to budget for the backup pool equipment if I want to switch from $oldfilesystem to ZFS and not take a step down in reliability. I know that, while there is no time-consuming fsck to draw out downtime, pretty much every corruption event results in ``restore the pool from backup'', which takes a while, so I need to expect that by, for example, being prepared to run critical things directly off the backup pools.

Finally, I know that ZFS pool corruption almost always results in loss of the whole pool, while other filesystem corruption tends to do crazier things which happen to be less catastrophic to my particular dataset: some files but not all are lost after fsck, some files remain but lose their names, or more usefully retain their names but lose the name of one of their parent directories, or the insides of some files are silently corrupted. There's actionable information in here. Technical discussion is worth more than sucks/rules armwrestling.

bmm> The same way, if you have a mirror of USB hard drives, then swap cables and reboot — your mirror gone. But that's not because of ZFS, if you will look more closely...

actually I think you are the one not looking closely enough.
You say no one is losing pools, and then 10 minutes later reply to a post about running zdb on a lost pool. You shouldn't need me to tell you something's wrong. When you limit your thesis to ``ZFS rules'' and then actively mislead people, we all lose.

tt> /. is no person...

right, so I use a word like ad hominem, and you stray from the main point to say ``Erm ayctually your use of rhetorical terminology is incorrect.'' maybe, maybe not, whatever, but again [x2], the posts in the slashdot thread complaining about corruption were just pointers to original posts on this list, so attacking the forum where you saw the pointer instead of the content of its destination really is clearly _ad hominem_. *brrk* *brr* ``no! no it's not ad hominem! it's a different word! ah, ha ah thought you'd slip one past me there, eh?'' QUIT BEING SO DAMNED ADD. We can get nowhere.

As for the posts being rubbish, you and I both know it's plausible speculation that Apple delayed unleashing ZFS on their consumers because of the lost-pool problems. ZFS doesn't suck, I do use it, I hope and predict it will get better---so just back off and calm down with the rotten fruit. But neither who's saying it nor your not wanting to hear it makes it less plausible.
Re: [zfs-discuss] problems with l2arc in 2009.06
> correct ratio of arc to l2arc? from http://blogs.sun.com/brendan/entry/l2arc_screenshots

Thanks Rob. Hmm...that ratio isn't awesome.
Re: [zfs-discuss] 7110 questions
On Thu, Jun 18, 2009 at 11:51:44AM -0400, Dan Pritts wrote:
> I'm curious about a couple things that would be unsupported. Specifically, whether they are not supported if they have specifically been crippled in the software.

We have not crippled the software in any way, but we have designed an appliance with some specific uses. Doing things from the Solaris shell by hand may damage your system and void your support contract.

> 1) SSD's I can imagine buying an intel SSD, slotting it into the 7110, and using it as a ZFS L2ARC (? i mean the equivalent of readzilla)

That's not supported, it won't work easily, and if you get it working you'll be out of luck if you have a problem.

> 2) expandability I can imagine buying a SAS card and a JBOD and hooking it up to the 7110; it has plenty of PCI slots.

Ditto.

> finally, one question - I presume that I need to devote a pair of disks to the OS, so I really only get 14 disks for data. Correct?

That's right. We market the 7110 as either 2TB = 146GB x 14 or 4.2TB = 300GB x 14 raw capacity.

Adam

-- Adam Leventhal, Fishworks http://blogs.sun.com/ahl
Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files
Cor Beumer - Storage Solution Architect wrote:
> Hi Jose,
> Well it depends on the total size of your Zpool and how often these files are changed.

...and the average size of the files. For small files, it is likely that the default recordsize will not be optimal, for several reasons. Are these small files?

-- richard
[zfs-discuss] ZFS metadata and cloning filesystem layout across machines
Hey ZFS experts,

Where is the ZFS metadata stored? Can it be viewed through some commands?

Here is my requirement: I have a machine with lots of ZFS filesystems on it under a couple of zpools, and there is another new machine with empty disks. What I want now is a similar layout of the pools and filesystems, with quota, reservation and other properties intact, along with the naming conventions, created on the new machine. How do I create a schema, or reverse engineer the zfs/zpool commands required, to create the similar layout of ZFS filesystems on the new machine? Is there an import facility or something like that?

Thanks.

-- This message posted from opensolaris.org
Re: [zfs-discuss] problems with l2arc in 2009.06
Ethan Erchinger wrote:
>> correct ratio of arc to l2arc? from http://blogs.sun.com/brendan/entry/l2arc_screenshots
> Thanks Rob. Hmm...that ratio isn't awesome.

TANSTAAFL. A good SWAG is about 200 bytes of L2ARC directory in the ARC for each record in the L2ARC. So if your recordsize is 512 bytes (pathologically the worst case), you'll need 200/512 * size of L2ARC for a minimum ARC size, so the ARC needs to be about 40% of the size of the L2ARC. For an 8 kByte recordsize it will be about 200/8192, or 2.5%. Neel liked using a 16 kByte recordsize for InnoDB, so figure about 1.2%.

In this case, if you have about 150 GBytes of L2ARC disk and are using an 8 kByte recordsize, you'll need at least 3.75 GBytes for the ARC, instead of 2 GBytes. Since this space competes with the regular ARC caches, you'll want even more headroom, so maybe 5 GBytes would be a reasonable minimum ARC cap?

-- richard
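Richard's 200-bytes-per-record SWAG above lends itself to a quick back-of-the-envelope calculation (the 200-byte figure is his stated estimate, not a guaranteed constant):

```shell
# Estimated ARC space (GBytes) needed to index an L2ARC of a given size
# (GBytes) at a given ZFS recordsize (bytes), assuming ~200 bytes of
# directory entry per L2ARC record.
l2arc_overhead() {
  awk -v l2="$1" -v rs="$2" 'BEGIN { printf "%.2f\n", l2 * 200 / rs }'
}

l2arc_overhead 150 8192    # 150 GB L2ARC, 8 kByte records -> prints 3.66
l2arc_overhead 150 16384   # 16 kByte records halve the cost -> prints 1.83
```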
Re: [zfs-discuss] ZFS metadata and cloning filesystem layout across machines
Hi Nikhil,

take a look at the output from 'zpool history'. You should get/see all the information you need to be able to recreate your configuration.

http://docs.sun.com/app/docs/doc/819-5461/gdswe?a=view

Cheers, Henrik

On Jun 18, 2009, at 8:47 PM, Nikhil wrote:
> Hey ZFS experts, Where is the ZFS metadata stored? Can it be viewed through some commands? Here is my requirement: I have a machine with lots of ZFS filesystems on it under a couple of zpools, and there is another new machine with empty disks. What I want now is a similar layout of the pools and filesystems, with quota, reservation and other properties intact, along with the naming conventions, created on the new machine. How do I create a schema, or reverse engineer the zfs/zpool commands required, to create the similar layout of ZFS filesystems on the new machine? Is there an import facility or something like that? Thanks.
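A sketch of how that might look in practice (the pool name 'tank' is an example; exact output formats vary by release):

```shell
# Replay log of every zfs/zpool command that shaped the pool; this shows
# the create commands, property settings, etc., in order.
zpool history tank

# Dump only locally set properties (quota, reservation, compression, ...)
# in a parseable form, ready to be turned into 'zfs set' commands for
# the new machine:
zfs get -r -s local -H -o name,property,value all tank
```

These commands are read-only, but they need a live pool, so this is an illustration rather than something to paste blindly.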
Re: [zfs-discuss] 7110 questions
We have a 7110 on the try-and-buy program. We tried using the 7110 with XenServer 5 over iSCSI and NFS. Nothing seems to solve the slow write problem. Within the VM, we observed around 8MB/s on writes. Read performance is fantastic. Some troubleshooting was done with the local Sun rep. The conclusion is that the 7110 does not have a write cache in the form of SSDs or controller DRAM write cache. The solution from Sun is to buy a StorageTek or a 7000-series model with an SSD write cache. Adam, please advise if there are any fixes for the 7110. I am still shopping for a SAN and would rather buy a 7110 than a StorageTek or something else.

-- This message posted from opensolaris.org
Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files
On Thu, Jun 18, 2009 at 12:12:16PM +0200, Cor Beumer - Storage Solution Architect wrote:
> What they noticed on the the X4500 systems, that when the zpool became filled up for about 50-60% the performance of the system did drop enormously. They do claim this has to do with the fragmentation of the ZFS filesystem. So we did try over there putting an S7410 system in with about the same config on disks, 44x 1TB SATA BUT 4x 18GB WriteZilla (in a stripe) we were able to get much and much more i/o's from the system the the comparable X4500, however they did put it in production for a couple of weeks, and as soon as the ZFS filesystem did come in the range of about 50-60% filling the did see the same problem.

We had a similar problem with a T2000 and 2 TB of ZFS storage. Once the usage reached 1 TB, the write performance dropped considerably and the CPU consumption increased. Our problem was indirectly a result of fragmentation, but it was solved by a ZFS patch. I understand that this patch, which fixes a whole bunch of ZFS bugs, should be released soon. I wonder if this was your problem.

-- -Gary Mills--Unix Support--U of M Academic Computing and Networking-
Re: [zfs-discuss] Server Cloning With ZFS?
Hi Dave, Until the ZFS/flash support integrates into an upcoming Solaris 10 release, I don't think we have an easy way to clone a root pool/dataset from one system to another system, because system-specific info is still maintained. Your manual solution sounds plausible but probably won't work because of the system-specific info. Here are some options: 1. Wait for the ZFS/flash support in an upcoming Solaris 10 release. You can track CR 6690473 for this support. 2. Review interim solutions that involve UFS to ZFS migration but might give you some ideas: http://blogs.sun.com/scottdickson/entry/flashless_system_cloning_with_zfs http://blogs.sun.com/scottdickson/entry/a_much_better_way_to 3. Do an initial installation of your new server with a two-disk mirrored root pool. Set up a separate pool for data/applications. Snapshot data from the E450 and send/receive it over to the data/app pool on the new server. Cindy Dave Ringkor wrote: So I had an E450 running Solaris 8 with a VxVM-encapsulated root disk. I upgraded it to Solaris 10 ZFS root using this method: - Unencapsulate the root disk - Remove VxVM components from the second disk - Live Upgrade from 8 to 10 on the now-unused second disk - Boot to the new Solaris 10 install - Create a ZFS pool on the now-unused first disk - Use Live Upgrade to migrate root filesystems to the ZFS pool - Add the now-unused second disk to the ZFS pool as a mirror Now my E450 is running Solaris 10 5/09 with ZFS root, and all the same users, software, and configuration that it had previously. That is pretty slick in itself. But the server itself is dog slow and more than half the disks are failing, and maybe I want to clone the server on new(er) hardware. With ZFS, this should be a lot simpler than it used to be, right? A new server has new hardware, new disks with different names and different sizes. But that doesn't matter anymore.
There's a procedure in the ZFS manual to recover a corrupted server by using zfs receive to reinstall a copy of the boot environment into a newly created pool on the same server. But what if I used zfs send to save a recursive snapshot of my root pool on the old server, booted my new server (with the same architecture) from the DVD in single-user mode, created a ZFS pool on its local disks, and did zfs receive to install the boot environments there? The filesystems don't care about the underlying disks. The pool hides the disk specifics. There's no vfstab to edit. Off the top of my head, all I can think of that would have to change is the network interfaces. And that change is as simple as cd /etc ; mv hostname.hme0 hostname.qfe0 or whatever. Is there anything else I'm not thinking of?
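For what it's worth, Dave's proposed procedure can be sketched as a handful of commands. This is only a sketch of the idea from the thread, not a tested recipe: the pool name (rpool), snapshot name (@migrate), and backup path are assumptions, and Cindy's caveat about system-specific info still applies. The helper prints the plan unless RUN=1 is set.

```shell
# Hypothetical sketch of the send/receive root-pool clone Dave describes.
# With RUN unset, each step is echoed instead of executed, so the plan can
# be reviewed safely; set RUN=1 on the real systems.
run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi; }

clone_root_pool() {
    # Old server: take a recursive snapshot and generate a replication
    # stream of the whole pool (in practice: > /backup/rpool.zfs).
    run zfs snapshot -r rpool@migrate
    run zfs send -R rpool@migrate
    # New server, booted single-user from DVD with a pool already created
    # (in practice: < /backup/rpool.zfs).
    run zfs receive -Fd rpool
    # Dave's interface fix-up for the new hardware.
    run mv /etc/hostname.hme0 /etc/hostname.qfe0
}

clone_root_pool   # with RUN unset, this just prints the four steps
```

The -R flag asks zfs send for a replication stream (all descendant filesystems, snapshots, and properties), which is what makes the recursive root-pool copy a single stream.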
Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
On 18-Jun-09, at 12:14 PM, Miles Nordin wrote: bmm == Bogdan M Maryniuk bogdan.maryn...@gmail.com writes: tt == Toby Thain t...@telegraphics.com.au writes: ... tt /. is no person... ... you and I both know it's plausible speculation that Apple delayed unleashing ZFS on their consumers because of the lost-pool problems. ZFS doesn't suck, I do use it, I hope and predict it will get better, so just back off and calm down with the rotten fruit. But neither who's saying it nor your not wanting to hear it makes it less plausible. In my opinion, a more plausible explanation is: Apple has not made ZFS integration a high priority [for 10.6]. There is no doubt Apple has the engineering resources to make it perfectly reliable as a component of Mac OS X, if that were a high-priority goal. I run OS X but I am not at all tempted to play with ZFS on it there; life is too short for betas. If I want ZFS I install Solaris 10. --Toby
Re: [zfs-discuss] 7110 questions
Both iSCSI and NFS are slow? I would expect NFS to be slow, but in my iSCSI testing with OpenSolaris 2008.11, performance was reasonable, about 2x NFS. Setup: Dell 2950 with a SAS HBA and SATA 3x5 raidz (15 disks, no separate ZIL), iSCSI using the VMware ESXi 3.5 software initiator. Scott
Re: [zfs-discuss] 7110 questions
There's a configuration issue in there somewhere. I have a ZFS-based system serving storage to some ESX servers, working great with a few exceptions. At first, performance was awful, but there was some confusion about how to optimize network traffic on ESX, so I installed a fresh one using only the defaults, no jumbo frames, no etherchannel, and I was able to push the ZFS server to wire speed on reads and writes over iSCSI. I still have the write problem over NFS, though. I should be back in the datacenter tomorrow to see if it's specific to the ESX NFS client. So my advice is to start by looking at all of the tweaks that have been applied to the networking setup on the Xen side. Regards, Erik Ableson +33.6.80.83.58.28 Sent from my iPhone
Re: [zfs-discuss] 7110 questions
With XenServer 4 and NFS, you had to grow the disks (modified manually from thin to fat) in order to get decent performance.
Re: [zfs-discuss] 7110 questions
Hey Lawrence, Make sure you're running the latest software update. Note that this forum is not the appropriate place to discuss support issues. Please contact your official Sun support channel. Adam -- Adam Leventhal, Fishworks http://blogs.sun.com/ahl
Re: [zfs-discuss] Lots of metadata overhead on filesystems with 100M files
Gary Mills wrote: Our problem was indirectly a result of fragmentation, but it was solved by a ZFS patch. I understand that this patch, which fixes a whole bunch of ZFS bugs, should be released soon. George would probably have the latest info, but there were a number of things which circled around the notorious "Stop looking and start ganging" bug report, http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6596237 -- richard
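Since both reports in this thread put the performance cliff at roughly 50-60% pool occupancy, that point is easy to watch for before it bites. A minimal sketch, with the pool name (tank) and the 50% threshold as assumptions, not figures from any official guidance:

```shell
# Hypothetical capacity check: warn before a pool enters the region where
# the posters saw write performance fall off. 50% is an assumed threshold.
warn_if_full() {            # args: pool-name capacity-percent
    pool=$1; cap=$2
    if [ "$cap" -ge 50 ]; then
        echo "WARNING: $pool is ${cap}% full; expect write slowdown"
    else
        echo "OK: $pool is ${cap}% full"
    fi
}

# On a live system the number would come from the pool itself, e.g.:
#   cap=$(zpool list -H -o capacity tank | tr -d '%')
warn_if_full tank 62   # prints: WARNING: tank is 62% full; expect write slowdown
```

Dropping a call like this into a cron job gives early warning well before fragmentation-related slowdowns (or the gang-block allocation behavior in CR 6596237) become visible to users.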
Re: [zfs-discuss] zfs on 32 bit?
cd == Casper Dik casper@sun.com writes: yeah. many of those ARM systems will be low-power, builtin-crypto-accel, builtin-gigabit-MAC boxes based on Orion and similar, NAS (NSLU2-ish) things begging for ZFS. cd So what's the boot environment they use? I think it is called U-Boot: http://forum.openwrt.org/viewtopic.php?pid=60387
Re: [zfs-discuss] APPLE: ZFS need bug corrections instead of new func! Or?
Toby, On 17-Jun-09, at 7:37 AM, Orvar Korvar wrote: Ok, so you mean the comments are mostly FUD and bullshit? Because there are no bug reports from the whiners? Could this be the case? It is mostly FUD? Hmmm...? Having read the thread, I would say without a doubt. Slashdot was never the place to go for accurate information about ZFS. Many would even say: Slashdot was never the place to go for accurate information. Slashdot was never the place to go for information. Slashdot was never the place to go. Slashdot? Never. Take your pick ;-) Regards... Sean.
Re: [zfs-discuss] zfs on 32 bit?
On Thu, Jun 18, 2009 at 4:28 AM, Miles Nordin car...@ivy.net wrote: djm http://opensolaris.org/os/project/osarm/ yeah. many of those ARM systems will be low-power builtin-crypto-accel builtin-gigabit-MAC based on Orion and similar, NAS (NSLU2-ish) things begging for ZFS. Are they feasible targets for ZFS? The N610N that I have (BCM3302, 300MHz, 64MB) isn't even powerful enough to saturate either the gigabit wired or 802.11n wireless; it only does about 25Mbps. Last time I tested on an EeePC 2G's Celeron, ZFS was slow to the point of being unusable. Will it be usable enough on most ARMs? -- Fajar
Re: [zfs-discuss] zfs on 32 bit?
Fajar A. Nugraha wrote: Are they feasible targets for zfs? ... Will it be usable enough on most ARMs? Well, given that ARM processors use a completely different ISA (i.e., they're not x86-compatible), OpenSolaris won't run on them currently. If you'd like to do the port *wink* I can't say as to the entire Atom line of stuff, but I've found the Atoms are OK for desktop use, and not anywhere near powerful enough for even a basic NAS server. The demands of wire-speed gigabit, ZFS, and encryption/compression are hard on the little Atom guys. Plus, it seems to be hard to find an Atom motherboard which supports more than 2GB of RAM, which is a serious problem. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA
Re: [zfs-discuss] zfs on 32 bit?
On Fri, Jun 19, 2009 at 11:16 AM, Erik Trimble erik.trim...@sun.com wrote: I can't say as to the entire Atom line of stuff, but I've found the Atoms are OK for desktop use, and not anywhere powerful enough for even a basic NAS server. The demands of wire-speed Gigabit, ZFS, and encryption/compression are hard on the little Atom guys. +1. I wanted to skip this, but I will reply. I have two Asus EeePC Box 202 / 2GB. These are running numerous zones (snv_111b) for me with various services on them and are still very usable and fast enough. Additionally, I overclocked each up to 1.75GHz, made some adjustments to Solaris's TCP/IP stack, removed some unnecessary services, and they are just fine. Plus, it seems to be hard to find an Atom motherboard which supports more than 2GB of RAM, which is a serious problem. Well, let's not forget that the Atom is also a small, low-power processor designed for cheap and small nettops/netbooks that don't ever need 4GB of RAM. Despite that: http://www.mini-itx.com/store/?c=53 -- Kind regards, BM Things that are stupid at the beginning rarely end up wisely.