Re: [zfs-discuss] zfs-discuss mailing list opensolaris EOL
Is it possible to replicate the whole opensolaris site to the illumos/openindiana/smartos/omnios site, in a sub-directory, as an archive? -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov Sent: Sunday, February 17, 2013 7:42 AM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs-discuss mailing list opensolaris EOL Hello Cindy, Are there any plans to preserve the official mailing lists' archives, or will they go the way of the Jive forums, so that future digs for bits of knowledge would rely on alternate mirrors and caches? I understand that Oracle has some business priorities, but retiring hardware causes a site shutdown? They've gotta be kidding, with all the buzz about clouds and virtualization ;) I'd guess you also are not authorized to say whether Oracle might permit re-use (re-hosting) of current OpenSolaris.Org materials, or even give away the site and domain for community steering and rid itself of more black PR from shooting down another public project of the Sun legacy (hint: if the site does wither and die in the community's hands - it is not Oracle's fault; and if it lives on - Oracle did something good for karma... win-win, at no price). Thanks for your helpfulness in the past years, //Jim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] .send* TAG will be left if there is a corrupt/aborted zfs send?
This is the first time I have gotten this tag.

zfs holds cn03/3/is8119aw@issi-backup:daily-2012-12-14-17:26
NAME                                                TAG            TIMESTAMP
cn03/3/is8119aw@issi-backup:daily-2012-12-14-17:26  .send-24928-0  Sun Jan  6 17:49:59 2013
cn03/3/is8119aw@issi-backup:daily-2012-12-14-17:26  keep           Fri Dec 14 17:27:39 2012

Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
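A .send-* tag like this appears to be a user hold left behind by the interrupted send; as long as nothing else still needs the snapshot held, it can be released by hand. A minimal sketch, assuming the tag name is exactly what zfs holds printed above:

zfs release .send-24928-0 cn03/3/is8119aw@issi-backup:daily-2012-12-14-17:26
zfs holds cn03/3/is8119aw@issi-backup:daily-2012-12-14-17:26    # verify only the 'keep' hold remains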
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
Even with infinite wire speed, you're bound by the ability of the source server to generate the snapshot stream and the ability of the destination server to write the snapshots to the media. Our little servers in-house using ZFS don't read/write that fast when pulling snapshot contents off the disks, since the reads are essentially random access on a server that's been creating/deleting snapshots for a long time. --eric That is true. This discussion assumes ideal conditions; we want to minimize the overhead in the transport layer only. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
I've heard you could, but I've never done it. Sorry I'm not much help, except as a cheer leader. You can do it! I think you can! Don't give up! heheheheh Please post back whatever you find, or if you have to figure it out for yourself, then blog about it and post that. Aha! Gotcha! I will give it a try. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
Post in the list. -Original Message- From: Fred Liu Sent: 星期五, 十二月 14, 2012 23:41 To: 'real-men-dont-cl...@gmx.net' Subject: RE: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel? Hi Fred, I played with zfs send/receive some time ago. One important thing I learned was that netcat is not the first choice to use. There is a tool called mbuffer out there. mbuffer works similarly to netcat but allows a specific buffer size and block size. From various resources I found out that the best buffer and block sizes for zfs send/receive seem to be 1GB for the buffer with a block size of 131072. Replacing netcat with mbuffer dramatically increases the throughput. The resulting commands are like:

ssh -f $REMOTESRV "/opt/csw/bin/mbuffer -q -I $PORT -m 1G -s 131072 | zfs receive -vFd $REMOTEPOOL"
zfs send $CURRENTLOCAL | /opt/csw/bin/mbuffer -q -O $REMOTESRV:$PORT -m 1G -s 131072 > /dev/null

cu Carsten Carsten, Thank you so much for the sharing, and I will try it. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
We have found mbuffer to be the fastest solution. Our rates for large transfers on 10GbE are:

280 MB/s  mbuffer
220 MB/s  rsh
180 MB/s  HPN-ssh unencrypted
 60 MB/s  standard ssh

The tradeoff: mbuffer is a little more complicated to script; rsh is, well, you know; and hpn-ssh requires rebuilding ssh and (probably) maintaining a second copy of it. -- Trey Palmer In a 10GbE environment, even 280MB/s is not such a decent result. Maybe the alternative could be a two-step approach: putting the snapshots on the destination via NFS/iSCSI and receiving them locally. But that is not perfect. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
Assuming a secure and trusted environment, we want to get the maximum transfer speed without the overhead from ssh. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
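For a trusted LAN, plain netcat is the usual zero-overhead baseline the thread converges on. A minimal sketch, assuming a hypothetical pool tank and port 9090; flag spelling varies between netcat flavors (some want nc -l 9090, others nc -l -p 9090):

# on the receiving host, start the listener first
nc -l -p 9090 | zfs receive -vFd tank

# on the sending host
zfs send -i tank/fs@snap1 tank/fs@snap2 | nc receiver-host 9090

The stream is unauthenticated and unencrypted, which is only acceptable because the environment is assumed secure and trusted.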
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
Adrian, That is cool! Thank you so much! BTW, has anyone played with NDMP in Solaris? Or is it feasible to transfer snapshots via the NDMP protocol? Before the acquisition, Sun advocated the NDMP backup feature in OpenStorage/Fishworks. I am sorry if this is the wrong place to ask this question. Thanks. Fred From: Adrian Smith [mailto:adrian.sm...@rmit.edu.au] Sent: 星期五, 十二月 14, 2012 12:08 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel? Hi Fred, Try mbuffer (http://www.maier-komor.de/mbuffer.html) On 14 December 2012 15:01, Fred Liu fred_...@issi.com wrote: Assuming a secure and trusted environment, we want to get the maximum transfer speed without the overhead from ssh. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Adrian Smith (ISUnix), Ext: 55070 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] any more efficient way to transfer snapshot between two hosts than ssh tunnel?
Add the HPN patches to OpenSSH and enable the NONE cipher. We can saturate a gigabit link (980 Mbps) between two FreeBSD hosts using that. Without it, we were only able to hit ~480 Mbps on a good day. If you want 0 overhead, there's always netcat. :) 980 Mbps is awesome! I am thinking of running two ssh services -- one normal and one with the HPN patches, only for backup jobs. But I'm not sure they can work together before I try them. I will also try netcat. Many thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
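For reference, a hedged sketch of how the NONE cipher is usually wired up; NoneEnabled/NoneSwitch are options added by the HPN patch set, not stock OpenSSH, so check the docs of your particular build:

# sshd_config on the HPN-patched receiver
NoneEnabled yes

# sender side: authentication stays encrypted, only the bulk data switches to NONE
zfs send tank/fs@snap | ssh -o NoneEnabled=yes -o NoneSwitch=yes host "zfs receive -vFd tank"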
Re: [zfs-discuss] current status of SAM-QFS?
Still a fully supported product from Oracle: http://www.oracle.com/us/products/servers-storage/storage/storage-software/qfs-software/overview/index.html Yeah. But it seems there have been no more updates since the Sun acquisition. I don't know Oracle's roadmap with respect to data tiering. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
The time is the creation time of the snapshots. Yes. That is true. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
The size accounted for by the userused@ and groupused@ properties is the referenced space, which is used as the basis for many other space accounting values in ZFS (e.g. du / ls -s / stat(2), and the zfs accounting properties referenced, refquota, refreservation, refcompressratio, written). It includes changes local to the dataset (compression, the copies property, file-specific metadata such as indirect blocks), but ignores pool-wide or cross-dataset changes (space shared between a clone and its origin, mirroring, raid-z, dedup[*]). --matt [*] Although dedup can be turned on and off per-dataset, the data is deduplicated against all dedup-enabled data in the pool. Ie, identical data in different datasets will be stored only once, if dedup is enabled for both datasets. Can we also get the *ignored* space accounted? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
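For reference, those referenced-space numbers can be read without setting any quota at all; a small example, assuming a user fred and a hypothetical dataset tank/home:

zfs get -H -o value -p userused@fred tank/home     # referenced bytes charged to fred
zfs userspace -o type,name,used,quota tank/home    # the full per-user table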
Re: [zfs-discuss] current status of SAM-QFS?
If you want to know Oracle's roadmap for SAM-QFS then I recommend contacting your Oracle account rep rather than asking on a ZFS discussion list. You won't get SAM-QFS or Oracle roadmap answers from this alias. My original purpose was to ask if there is an effort to integrate open-sourced SAM-QFS into illumos or smartos/oi/illumian. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] current status of SAM-QFS?
IIRC, the senior product architects and perhaps some engineers have left Oracle. A better question for your Oracle rep is whether there is a plan to do anything other than sustaining engineering for the product. I see. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
What problem are you trying to solve? How would you want referenced or userused@... to work? To be more clear: space shared between a clone and its origin is referenced by both the clone and the origin, so it is charged to both the clone's and origin's userused@... properties. The additional space used by mirroring and raid-z applies to all blocks in the pool[*], and is not charged anywhere (except by /sbin/zpool). --matt [*] Assuming you are using the recommended configuration of all the same type of top-level vdevs; if you are not then there's no control over which blocks go to which vdevs. There is no specific problem to resolve. I just want to get a reasonably accurate equation between the raw storage size and the usable storage size, although the metadata size is trivial. If you do mass storage budgeting, this equation is meaningful. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
I don't think accurate equations are applicable in this case. You can have estimates like no more/no less than X based on, basically, the level of redundancy and its overhead. ZFS metadata overhead can also be smaller or bigger, depending on your data's typical block size (fixed for zvols at creation time, variable for files); i.e. if your data is expected to be in very small pieces (comparable to a single sector size), you'd have big overhead due to required redundancy and metadata. For data in large chunks the overheads would be smaller. This gives you something like available space won't be smaller than M disks from my M+N redundant raidzN arrays minus O percent for metadata. You can also constrain these estimates' range by other assumptions like expected dedup or compression ratios, and hope that your end-users would be able to stuff even more of their addressable data into the pool (because it would be sparse, compressible, and/or not unique), but in the end that's unpredictable from the start. Totally agree. We have had similar experience in practice. It varies case by case. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
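As a concrete instance of such an estimate: two 8+2 raidz2 vdevs built from 1TB disks yield no more than 2 x 8 = 16TB of usable space, minus a metadata percentage that grows as the average block size shrinks; compression and dedup can only raise the effective (logical) capacity above that figure, never the physical floor.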
[zfs-discuss] current status of SAM-QFS?
The subject says it all. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
On Apr 26, 2012, at 12:27 AM, Fred Liu wrote: zfs 'userused@' properties and 'zfs userspace' command are good enough to gather usage statistics. I think I mixed that up with NetApp. If my memory is correct, we have to set quotas to get usage statistics under DataOnTAP. Further, if we can add an ILM-like feature to poll the time-related info (atime, mtime, ctime, etc.) with those statistics from ZFS, that will be really cool. In general, file-based ILM has limitations that cause all sorts of issues for things like operating systems, where files might only be needed infrequently, but when they are needed, they are needed right now Have you looked at zfs diff for changed files? By ILM-like feature, I mean knowing how the data is distributed by time per pool/filesystem - e.g., how much data was modified/accessed before mm/dd/. And we don't need to do the actual storage-tiering operations immediately (moving the infrequently-used data to tier-2 storage). The time-related usage statistics are a very useful reference for us. zfs diff will show the delta, but it does not come with the time info. Since no one is focusing on enabling default user/group quotas now, the temporary remedy could be a script which traverses all the users/groups in the directory tree. Though it is not so elegant. The largest market for user/group quotas is .edu. But they represent only a small market when measured by $. There are also many corner cases in this problem space. One might pine for the days of VMS and its file resource management features, but those features don't scale well to company-wide LDAP and thousands of file systems. My understanding is that quota management is needed as long as zfs storage is used in a NAS way (shared by multiple users). So, for now, the fastest method to solve the problem might be to script some walkers. Yes. That is true. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
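A minimal sketch of such a walker, summing from zfs userspace rather than crawling the directory tree with find(1); the pool layout is hypothetical and the output is bytes per user across all filesystems:

#!/bin/sh
# sum per-user referenced space across every filesystem on the box
for fs in $(zfs list -H -o name -t filesystem); do
    zfs userspace -H -p -o name,used "$fs"
done | awk '{u[$1] += $2} END {for (n in u) print n, u[n]}'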
Re: [zfs-discuss] cluster vs nfs
I jump into this loop with a different alternative -- an ip-based block device. And I have seen a few successful cases with HAST + UCARP + ZFS + FreeBSD. If zfsonlinux is robust enough, trying DRBD + PACEMAKER + ZFS + LINUX is definitely encouraged. Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Nico Williams Sent: 星期四, 四月 26, 2012 14:00 To: Richard Elling Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] cluster vs nfs On Thu, Apr 26, 2012 at 12:10 AM, Richard Elling richard.ell...@gmail.com wrote: On Apr 25, 2012, at 8:30 PM, Carson Gaspar wrote: Reboot requirement is a lame client implementation. And lame protocol design. You could possibly migrate read-write NFSv3 on the fly by preserving FHs and somehow updating the clients to go to the new server (with a hiccup in between, no doubt), but only entire shares at a time -- you could not migrate only part of a volume with NFSv3. Of course, having migration support in the protocol does not equate to getting it in the implementation, but it's certainly a good step in that direction. You are correct, a ZFS send/receive will result in different file handles on the receiver, just like rsync, tar, ufsdump+ufsrestore, etc. That's understandable for NFSv2 and v3, but for v4 there's no reason that an NFSv4 server stack and ZFS could not arrange to preserve FHs (if, perhaps, at the price of making the v4 FHs rather large). Although even for v3 it should be possible for servers in a cluster to arrange to preserve devids... Bottom line: live migration needs to be built right into the protocol. For me one of the exciting things about Lustre was/is the idea that you could just have a single volume where all new data (and metadata) is distributed evenly as you go. Need more storage? Plug it in, either to an existing head or via a new head, then flip a switch and there it is. No need to manage allocation. Migration may still be needed, both within a cluster and between clusters, but that's much more manageable when you have a protocol where data locations can be all over the place in a completely transparent manner. Nico -- ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
“zfs 'userused@' properties” and “'zfs userspace' command” are good enough to gather usage statistics. I think I mixed that up with NetApp. If my memory is correct, we have to set quotas to get usage statistics under DataOnTAP. Further, if we can add an ILM-like feature to poll the time-related info (atime, mtime, ctime, etc.) with those statistics from ZFS, that will be really cool. Since no one is focusing on enabling default user/group quotas now, the temporary remedy could be a script which traverses all the users/groups in the directory tree. Though it is not so elegant. Currently, dedup/compression is pool-based; it doesn't have granularity at the file system, user, or group level. There is also a lot of room for improvement in this aspect. Thanks. Fred From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Richard Elling Sent: 星期四, 四月 26, 2012 0:48 To: Eric Schrock Cc: zfs-discuss@opensolaris.org; develo...@lists.illumos.org Subject: Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]? On Apr 25, 2012, at 8:14 AM, Eric Schrock wrote: ZFS will always track per-user usage information even in the absence of quotas. See the zfs 'userused@' properties and 'zfs userspace' command. tip: zfs get -H -o value -p userused@username filesystem Yes, and this is the logical size, not physical size. Some ZFS features increase logical size (copies) while others decrease physical size (compression, dedup) -- richard -- ZFS Performance and Training richard.ell...@richardelling.com +1-760-896-4422 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
2012/4/26 Fred Liu fred_...@issi.com Currently, dedup/compression is pool-based right now, they don't have the granularity on file system or user or group level. There is also a lot of improving space in this aspect. Compression is not pool-based, you can control it with the 'compression' property on a per-filesystem level, and is fundamentally per-block. Dedup is also controlled per-filesystem, though the DDT is global to the pool. If you think there are compelling features lurking here, then by all means grab the code and run with it :-) - Eric -- Thanks for correcting me. I will have a try and see how far I can go. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
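For example, both knobs are ordinary dataset properties, and both only affect blocks written after the property is set (tank/projects is a hypothetical dataset):

zfs set compression=gzip tank/projects   # per-filesystem, applied block by block
zfs set dedup=on tank/projects           # per-filesystem switch; the DDT itself stays pool-wide
zfs get compression,dedup tank/projects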
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
On 2012-04-26 11:27, Fred Liu wrote: zfs 'userused@' properties and 'zfs userspace' command are good enough to gather usage statistics. ... Since no one is focusing on enabling default user/group quotas now, the temporary remedy could be a script which traverses all the users/groups in the directory tree. Though it is not so elegant. find /export/home -type f -uid 12345 -exec du -ks '{}' \; | summing-script I think you could use some prefetch of dirtree traversal, like a slocate database, or roll your own (perl script). But yes, it does seem like stone age compared to ZFS ;) Thanks for the hint. I meant traversing all the users/groups in the directory tree, i.e., getting all user/group info from a naming service like nis/ldap for a specific file system. And for each found item, we can use zfs set userquota@/groupquota@ to set the default value. As for usage accounting, zfs 'userused@' properties and the 'zfs userspace' command are good enough. We can also use a script to do the summing job by traversing all the pools/filesystems. Currently, dedup/compression is pool-based right now, Dedup is pool-wide, compression is dataset-wide, applied to individual blocks. Even deeper, both settings apply to new writes after the corresponding dataset's property was set (i.e. a dataset can have files with mixed compression levels, as well as both deduped and unique files). they don't have the granularity on file system or user or group level. There is also a lot of room for improvement in this aspect. This particular problem was discussed a number of times back on the OpenSolaris forum. It boiled down to what you actually want to have accounted and perhaps billed - the raw resources spent by the storage system, or the logical resources accessed and used by its users? Say, you provide VMs with 100Gb of disk space, but your dedup is lucky enough to use 1TB overall for say 100 VMs. You can bill 100 users for the full 100Gb each, but your operations budget (and further planning, etc.) has only been hit for 1Tb. The ideal situation is that we know exactly both the logical usage and the physical usage per user/group. But that is not achievable for now. And even assuming we knew it, we still could not estimate the physical usage, since dedup/compression varies by usage pattern. Yes. We do get a bonus from dedup/compression. But there is no good way to make it fit into a budget plan from my side. HTH, //Jim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
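A hedged sketch of that interim script, assuming accounts come from the naming service via getent and that UIDs below 100 are system accounts to skip (both assumptions; adjust the cutoff and the 10G default as needed):

#!/bin/sh
# emulate a default user quota by setting one explicitly for every account
FS=tank/home
getent passwd | awk -F: '$3 >= 100 {print $1}' | while read u; do
    zfs set "userquota@${u}=10G" "$FS"
done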
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
On Apr 24, 2012, at 2:50 PM, Fred Liu wrote: Yes. Thanks. I am not aware of anyone looking into this. I don't think it is very hard, per se. But such quotas don't fit well with the notion of many file systems. There might be some restricted use cases where it makes good sense, but I'm not convinced it will scale well -- user quotas never scale well. -- richard OK. I see. And I agree such quotas will scale well. From users' side, they always ask for more space or even no quotas at all. One of the main purposes behind such quotas is that we can account usage and get the statistics. Is it possible to do it without setting such quotas? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] [developer] Setting default user/group quotas[usage accounting]?
Missing an important ‘NOT’: OK. I see. And I agree such quotas will **NOT** scale well. From users' side, they always ask for more space or even no quotas at all. One of the main purposes behind such quotas is that we can account usage and get the statistics. Is it possible to do it without setting such quotas? Thanks. Fred _ From: Fred Liu Sent: 星期三, 四月 25, 2012 20:05 To: develo...@lists.illumos.org Cc: 'zfs-discuss@opensolaris.org' Subject: RE: [developer] Setting default user/group quotas[usage accounting]? On Apr 24, 2012, at 2:50 PM, Fred Liu wrote: Yes. Thanks. I am not aware of anyone looking into this. I don't think it is very hard, per se. But such quotas don't fit well with the notion of many file systems. There might be some restricted use cases where it makes good sense, but I'm not convinced it will scale well -- user quotas never scale well. -- richard OK. I see. And I agree such quotas will scale well. From users' side, they always ask for more space or even no quotas at all. One of the main purposes behind such quotas is that we can account usage and get the statistics. Is it possible to do it without setting such quotas? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] FW: Setting default user/group quotas?
-Original Message- From: Fred Liu Sent: 星期二, 四月 24, 2012 11:41 To: develo...@lists.illumos.org Subject: Setting default user/group quotas? It seems this feature is still not there yet. Any plan to do it? Or is it hard to do it? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Windows 8 ReFS (OT)
Looks really beautiful... -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of David Magda Sent: 星期二, 一月 17, 2012 8:06 To: zfs-discuss Subject: [zfs-discuss] Windows 8 ReFS (OT) Kind of off topic, but I figured of some interest to the list. There will be a new file system in Windows 8 with some features that we all know and love in ZFS: As mentioned previously, one of our design goals was to detect and correct corruption. This not only ensures data integrity, but also improves system availability and online operation. Thus, all ReFS metadata is check-summed at the level of a B+ tree page, and the checksum is stored independently from the page itself. [...] Once ReFS detects such a failure, it interfaces with Storage Spaces to read all available copies of data and chooses the correct one based on checksum validation. It then tells Storage Spaces to fix the bad copies based on the good copies. All of this happens transparently from the point of view of the application. http://tinyurl.com/839wnbe http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next- generation-file-system-for-windows-refs.aspx http://news.ycombinator.com/item?id=3472857 (via) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Oracle releases Solaris 11 for Sparc and x86 servers
___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Oracle releases Solaris 11 for Sparc and x86 servers
... so when will zfs-related improvements make it to solaris-derivatives :D ? I am also very curious about Oracle's policy about source code. ;-) Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] FS Reliability WAS: about btrfs and zfs
Paul, Thanks. I understand now. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Paul Kraus Sent: 星期一, 十月 24, 2011 22:38 To: ZFS Discussions Subject: Re: [zfs-discuss] FS Reliability WAS: about btrfs and zfs On Sat, Oct 22, 2011 at 12:36 AM, Paul Kraus p...@kraus-haus.org wrote: Recently someone posted to this list of that _exact_ situation, they loaded an OS to a pair of drives while a pair of different drives containing an OS were still attached. The zpool on the first pair ended up not being able to be imported, and were corrupted. I can post more info when I am back in the office on Monday. See the thread started on Tue, Aug 2, 2011 at 12:23 PM with a Subject of [zfs-discuss] Wrong rpool used after reinstall!, the followups, and at least one additional related thread. While I agree that you _should_ be able to have multiple unrelated boot environments on hard drives at once, it seems prudent to me to NOT do such. I assume you _can_ manage multiple ZFS based boot environments using Live Upgrade (or whatever has replaced it in 11). NOTE that I have not done such (managed multiple ZFS boot environments with Live Upgrade), but I ASSUME you can. I suspect that the root of this potential problem is in the ZFS boot code and the use of the same zpool name for multiple zpools at once. By having the boot loader use the zpool directly you get the benefit of having the redundancy of ZFS much earlier in the boot process (the only thing that appears to load off of a single drive is the boot loader, everything from there on loads from the mirrored zpool, at least on my NCP 3 system, my first foray into ZFS root). The danger is that if there are multiple zpools with the same (required) name, then the boot loader may become confused, especially if drives get physically moved around. -- {1-2-3-4-5-6-7- } Paul Kraus - Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ ) - Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ ) - Technical Advisor, RPI Players ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] FS Reliability WAS: about btrfs and zfs
Some people have trained their fingers to use the -f option on every command that supports it, to force the operation. For instance, how often do you do rm -rf vs. rm -r and answer questions about every file? If various zpool commands (import, create, replace, etc.) are used against the wrong disk with a force option, you can clobber a zpool that is in active use by another system. In a previous job, my lab environment had a bunch of LUNs presented to multiple boxes. This was done for convenience in an environment where there would be little impact if an errant command were issued. I'd never do that in production without some form of I/O fencing in place. I also have that habit. And it is a good practice to bear in mind. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] FS Reliability WAS: about btrfs and zfs
3. Do NOT let a system see drives with more than one OS zpool at the same time (I know you _can_ do this safely, but I have seen too many horror stories on this list that I just avoid it). Can you elaborate #3? In what situation will it happen? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] remove wrongly added device from zpool
Hi, Due to my carelessness, I added two disks into a raid-z2 zpool as normal data disks, when in fact I wanted to make them ZIL devices. Are there any remedies? Many thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
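For the record, the mistake and the intended command differ only by the log keyword; what should have been typed is something along the lines of:

zpool add cn03 log mirror c22t2d0 c22t3d0   # the two disks as a mirrored slog
# versus the accident, which grafts them in as top-level data vdevs:
# zpool add -f cn03 c22t2d0 c22t3d0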
Re: [zfs-discuss] remove wrongly added device from zpool
That's a huge bummer, and it's the main reason why device removal has been a priority request for such a long time... There is no solution. You can only destroy & recreate your pool, or learn to live with it that way. Sorry... Yeah, I also realized this when I sent out this message. In NetApp, it is so easy to change the raid group size. There is still a long way for zfs to go. Hope I can see that in the future. I also made another huge mistake which has really brought me deep pain: I physically removed these two added devices, because I thought raidz2 could tolerate it. But now the whole pool is corrupted. I don't know where I can go ... Any help will be tremendously appreciated. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
This one missing feature of ZFS, IMHO, does not result in a long way for zfs to go in relation to netapp. I shut off my netapp 2 years ago in favor of ZFS, because ZFS performs so darn much better, and has such immensely greater robustness. Try doing ndmp, cifs, nfs, iscsi on netapp (all extra licenses). Try experimenting with the new version of netapp to see how good it is (you can't unless you buy a whole new box.) Try mirroring a production box onto a lower-cost secondary backup box (there is no such thing). Try storing your backup on disk and rotating your disks offsite. Try running any normal utilities - iostat, top, wireshark - you can't. Try backing up with commercial or otherwise modular (agent-based) backup software. You can't. You have to use CIFS/NFS/NDMP. Just try finding a public mailing list like this one where you can even so much as begin such a conversation about netapp... Been there done that, it's not even in the same ballpark. etc etc. (end rant.) I hate netapp. Yeah, it is kind of a touchy topic; we may discuss more in the future. I want to focus on how to repair my pool first. ;-( Um... Wanna post your zpool status and cat /etc/release and zpool upgrade I exported the pool because I want to use zpool import -F to fix it. But now I get "one or more devices is currently unavailable. Destroy and re-create the pool from a backup source." I use opensolaris b134 and zpool version 22. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
You can add mirrors to those lonely disks. Can it repair the pool? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
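Attaching mirrors does not undo the accidental top-level vdevs, but it does give them redundancy again; a sketch, with c22t6d0 and c22t7d0 standing in as hypothetical spare disks:

zpool attach cn03 c22t2d0 c22t6d0   # turns the lone disk into a 2-way mirror
zpool attach cn03 c22t3d0 c22t7d0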
Re: [zfs-discuss] remove wrongly added device from zpool
I'll tell you what does not help. This email. Now that you know what you're trying to do, why don't you post the results of your zpool import command? How about an error message, and how you're trying to go about fixing your pool? Nobody here can help you without information.

User     tty           login@  idle   JCPU   PCPU  what
root     console       9:25pm                      w

root@cn03:~# df
Filesystem                      1K-blocks    Used  Available Use% Mounted on
rpool/ROOT/opensolaris           94109412 6880699   87228713   8% /
swap                            108497952     344  108497608   1% /etc/svc/volatile
/usr/lib/libc/libc_hwcap1.so.1   94109412 6880699   87228713   8% /lib/libc.so.1
swap                            108497616       8  108497608   1% /tmp
swap                            108497688      80  108497608   1% /var/run
rpool/export                        46864      23      46841   1% /export
rpool/export/home                   46864      23      46841   1% /export/home
rpool/export/home/fred              48710    5300      43410  11% /export/home/fred
rpool                           102155158      80  102155078   1% /rpool

root@cn03:~# !z
zpool import cn03
cannot import 'cn03': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
I also used zpool import -fFX cn03 in b134 and b151a (via the SX11 live cd). It resulted in a core dump and a reboot after about 15 min. I can see all the LEDs on the HDDs blinking within those 15 min. Can replacing the empty ZIL devices help? Thanks. Fred -Original Message- From: Fred Liu Sent: 星期一, 九月 19, 2011 21:54 To: 'Edward Ned Harvey'; 'Krunal Desai' Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] remove wrongly added device from zpool I'll tell you what does not help. This email. Now that you know what you're trying to do, why don't you post the results of your zpool import command? How about an error message, and how you're trying to go about fixing your pool? Nobody here can help you without information.

User     tty           login@  idle   JCPU   PCPU  what
root     console       9:25pm                      w

root@cn03:~# df
Filesystem                      1K-blocks    Used  Available Use% Mounted on
rpool/ROOT/opensolaris           94109412 6880699   87228713   8% /
swap                            108497952     344  108497608   1% /etc/svc/volatile
/usr/lib/libc/libc_hwcap1.so.1   94109412 6880699   87228713   8% /lib/libc.so.1
swap                            108497616       8  108497608   1% /tmp
swap                            108497688      80  108497608   1% /var/run
rpool/export                        46864      23      46841   1% /export
rpool/export/home                   46864      23      46841   1% /export/home
rpool/export/home/fred              48710    5300      43410  11% /export/home/fred
rpool                           102155158      80  102155078   1% /rpool

root@cn03:~# !z
zpool import cn03
cannot import 'cn03': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
The core dump:

 r10: ff19a5592000 r11:                0 r12:                0
 r13:                0 r14:                0 r15: ff00ba4a5c60
 fsb: fd7fff172a00 gsb: ff19a5592000  ds:                0
  es:                0  fs:                0  gs:                0
 trp:                e err:                0 rip: f782f81a
  cs:               30 rfl:            10246 rsp: ff00b9bf0a40
  ss:               38

ff00b9bf0830 unix:die+10f ()
ff00b9bf0940 unix:trap+177b ()
ff00b9bf0950 unix:cmntrap+e6 ()
ff00b9bf0ab0 procfs:prchoose+72 ()
ff00b9bf0b00 procfs:prgetpsinfo+2b ()
ff00b9bf0ce0 procfs:pr_read_psinfo+4e ()
ff00b9bf0d30 procfs:prread+72 ()
ff00b9bf0da0 genunix:fop_read+6b ()
ff00b9bf0f00 genunix:pread+22c ()
ff00b9bf0f10 unix:brand_sys_syscall+20d ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
0:17 100% done
100% done: 1041082 pages dumped, dump succeeded
rebooting...

-Original Message- From: Fred Liu Sent: 星期一, 九月 19, 2011 22:00 To: Fred Liu; 'Edward Ned Harvey'; 'Krunal Desai' Cc: 'zfs-discuss@opensolaris.org' Subject: RE: [zfs-discuss] remove wrongly added device from zpool I also used zpool import -fFX cn03 in b134 and b151a (via the SX11 live cd). It resulted in a core dump and a reboot after about 15 min. I can see all the LEDs on the HDDs blinking within those 15 min. Can replacing the empty ZIL devices help? Thanks. Fred -Original Message- From: Fred Liu Sent: 星期一, 九月 19, 2011 21:54 To: 'Edward Ned Harvey'; 'Krunal Desai' Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] remove wrongly added device from zpool I'll tell you what does not help. This email. Now that you know what you're trying to do, why don't you post the results of your zpool import command? How about an error message, and how you're trying to go about fixing your pool? Nobody here can help you without information.

root@cn03:~# !z
zpool import cn03
cannot import 'cn03': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
I use opensolaris b134. Thanks. Fred -Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期一, 九月 19, 2011 22:21 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] remove wrongly added device from zpool On Sep 19, 2011, at 12:10 AM, Fred Liu fred_...@issi.com wrote: Hi, For my carelessness, I added two disks into a raid-z2 zpool as normal data disk, but in fact I want to make them as zil devices. You don't mention which OS you are using, but for the past 5 years of [Open]Solaris releases, the system prints a warning message and will not allow this to occur without using the force option (-f). -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
You don't mention which OS you are using, but for the past 5 years of [Open]Solaris releases, the system prints a warning message and will not allow this to occur without using the force option (-f). -- richard Yes. There is a warning message, I used zpool add -f. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
I get some good progress like the following:

zpool import
  pool: cn03
    id: 1907858070511204110
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        cn03                       UNAVAIL  missing device
          raidz2-0                 ONLINE
            c4t5000C5000970B70Bd0  ONLINE
            c4t5000C5000972C693d0  ONLINE
            c4t5000C500097009DBd0  ONLINE
            c4t5000C500097040BFd0  ONLINE
            c4t5000C5000970727Fd0  ONLINE
            c4t5000C50009707487d0  ONLINE
            c4t5000C50009724377d0  ONLINE
            c4t5000C50039F0B447d0  ONLINE
          c22t3d0                  ONLINE
          c4t50015179591C238Fd0    ONLINE
        logs
          c22t4d0                  ONLINE
          c22t5d0                  ONLINE

        Additional devices are known to be part of this pool, though their exact configuration cannot be determined.

Any suggestions? Thanks. Fred -Original Message- From: Fred Liu Sent: 星期一, 九月 19, 2011 22:28 To: 'Richard Elling' Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] remove wrongly added device from zpool You don't mention which OS you are using, but for the past 5 years of [Open]Solaris releases, the system prints a warning message and will not allow this to occur without using the force option (-f). -- richard Yes. There is a warning message; I used zpool add -f. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
No, but your pool is not imported. YES. I see. and look to see which disk is missing? The label, as displayed by zdb -l, contains the hierarchy of the expected pool config. The contents are used to build the output you see in the zpool import or zpool status commands. zpool is complaining that it cannot find one of these disks, so look at the labels on the disks to determine what is or is not missing. The next steps depend on this knowledge.

zdb -l /dev/rdsk/c22t2d0s0
cannot open '/dev/rdsk/c22t2d0s0': I/O error
root@cn03:~# zdb -l /dev/rdsk/c22t3d0s0
LABEL 0
    version: 22
    name: 'cn03'
    state: 0
    txg: 18269872
    pool_guid: 1907858070511204110
    hostid: 13564652
    hostname: 'cn03'
    top_guid: 11074483144412112931
    guid: 11074483144412112931
    vdev_children: 6
    vdev_tree:
        type: 'disk'
        id: 1
        guid: 11074483144412112931
        path: '/dev/dsk/c22t3d0s0'
        devid: 'id1,sd@s4154412020202020414e53393031305f324e4e4e324e4e4e202020202020202035363238363739005f31/a'
        phys_path: '/pci@0,0/pci15d9,400@1f,2/disk@3,0:a'
        whole_disk: 1
        metaslab_array: 37414
        metaslab_shift: 24
        ashift: 9
        asize: 1895563264
        is_log: 0
        create_txg: 18269863
LABEL 1
    version: 22
    name: 'cn03'
    state: 0
    txg: 18269872
    pool_guid: 1907858070511204110
    hostid: 13564652
    hostname: 'cn03'
    top_guid: 11074483144412112931
    guid: 11074483144412112931
    vdev_children: 6
    vdev_tree:
        type: 'disk'
        id: 1
        guid: 11074483144412112931
        path: '/dev/dsk/c22t3d0s0'
        devid: 'id1,sd@s4154412020202020414e53393031305f324e4e4e324e4e4e202020202020202035363238363739005f31/a'
        phys_path: '/pci@0,0/pci15d9,400@1f,2/disk@3,0:a'
        whole_disk: 1
        metaslab_array: 37414
        metaslab_shift: 24
        ashift: 9
        asize: 1895563264
        is_log: 0
        create_txg: 18269863
LABEL 2
failed to unpack label 2
LABEL 3
failed to unpack label 3

c22t2d0 and c22t3d0 are the devices I physically removed and connected back to the server. How can I fix them? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
-Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期二, 九月 20, 2011 3:57 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] remove wrongly added device from zpool more below… On Sep 19, 2011, at 9:51 AM, Fred Liu wrote: Is this disk supposed to be available? You might need to check the partition table, if one exists, to determine if s0 has a non-zero size. Yes. I use format to write an EFI label to it. Now this error is gone. But all four label are failed to unpack under zdb -l now. This is a bad sign, but can be recoverable, depending on how you got here. zdb is saying that it could not find labels at the end of the disk. Label 2 and label 3 are 256KB each, located at the end of the disk, aligned to 256KB boundary. zpool import is smarter than zdb in these cases, and can often recover from it -- up to the loss of all 4 labels, but you need to make sure that the partition tables look reasonable and haven't changed. I have tried zpool import -fFX cn03. But it will do core-dump and reboot about 1 hour later. Unless I'm mistaken, these are ACARD SSDs that have an optional CF backup. Let's hope that the CF backup worked. Yes. It is ACARD. You mean push the restore from CF button to see what will happen? Thanks for your nice help! Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] remove wrongly added device from zpool
zdb -l /dev/rdsk/c22t2d0s0
LABEL 0
failed to unpack label 0
LABEL 1
failed to unpack label 1
LABEL 2
failed to unpack label 2
LABEL 3
failed to unpack label 3

-Original Message- From: Fred Liu Sent: 星期二, 九月 20, 2011 4:06 To: 'Richard Elling' Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] remove wrongly added device from zpool -Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期二, 九月 20, 2011 3:57 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] remove wrongly added device from zpool more below… On Sep 19, 2011, at 9:51 AM, Fred Liu wrote: Is this disk supposed to be available? You might need to check the partition table, if one exists, to determine if s0 has a non-zero size. Yes. I use format to write an EFI label to it. Now this error is gone. But all four label are failed to unpack under zdb -l now. This is a bad sign, but can be recoverable, depending on how you got here. zdb is saying that it could not find labels at the end of the disk. Label 2 and label 3 are 256KB each, located at the end of the disk, aligned to 256KB boundary. zpool import is smarter than zdb in these cases, and can often recover from it -- up to the loss of all 4 labels, but you need to make sure that the partition tables look reasonable and haven't changed. I have tried zpool import -fFX cn03. But it will do core-dump and reboot about 1 hour later. Unless I'm mistaken, these are ACARD SSDs that have an optional CF backup. Let's hope that the CF backup worked. Yes. It is ACARD. You mean push the restore from CF button to see what will happen? Thanks for your nice help! Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] all the history
Hi, I did this:
1): prtvtoc /dev/rdsk/c22t3d0s0 | fmthard -s - /dev/rdsk/c22t2d0s0
2): zpool import cn03
3): zpool status

  pool: cn03
 state: DEGRADED
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        cn03                       DEGRADED     0     0    53
          raidz2-0                 ONLINE       0     0     0
            c4t5000C5000970B70Bd0  ONLINE       0     0     0
            c4t5000C5000972C693d0  ONLINE       0     0     0
            c4t5000C500097009DBd0  ONLINE       0     0     0
            c4t5000C500097040BFd0  ONLINE       0     0     0
            c4t5000C5000970727Fd0  ONLINE       0     0     0
            c4t5000C50009707487d0  ONLINE       0     0     0
            c4t5000C50009724377d0  ONLINE       0     0     0
            c4t5000C50039F0B447d0  ONLINE       0     0     0
          c22t3d0                  DEGRADED     0     0   120  too many errors
          c22t2d0                  DEGRADED     0     0    40  too many errors
          c4t50015179591C238Fd0    ONLINE       0     0     0
        logs
          c22t4d0                  ONLINE       0     0     0
          c22t5d0                  ONLINE       0     0     0
        spares
          c4t5000C5003AC39D5Fd0    UNAVAIL   cannot open

errors: 1 data errors, use '-v' for a list

  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        rpool                      ONLINE       0     0     0
          c4t500151795910D221d0s0  ONLINE       0     0     0

errors: No known data errors

Thanks. Fred From: Fred Liu Sent: 星期二, 九月 20, 2011 9:23 To: Tony Kim; 'Richard Elling' Subject: all the history Hi, Following is the history: I found a ZIL device offline at about 2:00PM today from syslog. And I removed it and replaced it with a backup device. But I mis-typed the adding command -- "zpool add cn03 c22t2d0", while the correct command should have been "zpool add cn03 log c22t2d0". The ZIL was wrongly added as a data device in the cn03 pool. I noticed it, so I physically removed this device from the server, as I thought raidz2 could afford this. The commands I issued were "zpool add cn03 c22t2d0" and "zpool add -f cn03 c22t3d0". But then the tragedy came: the whole cn03 pool is corrupted now:

zpool import
  pool: cn03
    id: 1907858070511204110
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        cn03                       UNAVAIL  missing device
          raidz2-0                 ONLINE
            c4t5000C5000970B70Bd0  ONLINE
            c4t5000C5000972C693d0  ONLINE
            c4t5000C500097009DBd0  ONLINE
            c4t5000C500097040BFd0  ONLINE
            c4t5000C5000970727Fd0  ONLINE
            c4t5000C50009707487d0  ONLINE
            c4t5000C50009724377d0  ONLINE
            c4t5000C50039F0B447d0  ONLINE
          c22t3d0                  ONLINE
          c4t50015179591C238Fd0    ONLINE
        logs
          c22t4d0                  ONLINE
          c22t5d0                  ONLINE

        Additional devices are known to be part of this pool, though their exact configuration cannot be determined.

Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive and ashift
The only way you will know if decrypting and decompressing causes a problem in that case is if you try it on your systems. I seriously doubt it will be, unless the system is already heavily CPU bound and your backup window is already very tight. That is true. My understanding of the NDMP protocol is that it would be a translator that did that; it isn't part of the core protocol. The way I would do it is to use a T1C tape drive and have it do the compression and encryption of the data. http://www.oracle.com/us/products/servers-storage/storage/tape-storage/t1c-tape-drive-292151.html The alternative is to have the node in your NDMP network that does the writing to the tape do the compression and encryption of the data stream before putting it on the tape. I see. T1C is a monster to have if possible ;-). And doing the job on the NDMP node (Solaris) needs extra software, is that correct? For starters, SSL/TLS (which is what the Oracle ZFSSA provides for replication) or IPsec are possibilities as well, depending on what risk you are trying to protect against and what the transport layer is. But basically it is not provided by ZFS itself; it is up to the person building the system to secure the transport layer used for ZFS send. You could also write directly to a T10k encrypting tape drive. -- Gotcha! Many thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive and ashift
The ZFS send stream is at the DMU layer; at this layer the data is uncompressed and decrypted - i.e., exactly how the application wants it. Even data compressed/encrypted by ZFS will be decrypted? If that is true, will there be any CPU overhead? And does ZFS send/receive tunneled over ssh become the only way to encrypt the data transmission? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs send/receive and ashift
Yes, which is exactly what I said. All data as seen by the DMU is decrypted and decompressed; the DMU layer is what the ZPL layer is built on top of, so it has to be that way. Understand. Thank you. ;-) There is always some overhead for doing a decryption and decompression; the question is really: can you detect it, and if you can, does it matter. If you are running Solaris on processors with built-in support for AES (e.g. SPARC T2, T3 or Intel with AES-NI) the overhead is reduced significantly in many cases. For many people getting the stuff from disk takes more time than doing the transform to get back your plaintext. In some of the testing I did, I found that gzip decompression can be more significant to a workload than doing the AES decryption. So basically yes, of course, but does it actually matter? It depends on how big the delta is. It does matter if the data backup cannot be finished within the required backup window when people use zfs send/receive to do mass data backup. BTW, adding a sort of off-topic question -- will the NDMP protocol in Solaris do decompression and decryption? Thanks. And ZFS send/receive tunneled by ssh becomes the only way to encrypt the data transmission? That isn't the only way. -- Any alternatives, if you don't mind? ;-) Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Encryption accelerator card recommendations.[GPU acceleration of ZFS]
-Original Message- From: David Magda [mailto:dma...@ee.ryerson.ca] Sent: 星期二, 六月 28, 2011 10:41 To: Fred Liu Cc: Bill Sommerfeld; ZFS Discuss Subject: Re: [zfs-discuss] Encryption accelerator card recommendations.[GPU acceleration of ZFS] On Jun 27, 2011, at 22:03, Fred Liu wrote: FYI There is another thread named -- GPU acceleration of ZFS in this list to discuss the possibility of utilizing the power of GPGPU. I posted here: In a similar vein I recently came across SSLShader: http://shader.kaist.edu/sslshader/ http://www.usenix.org/event/nsdi11/tech/full_papers/Jang.pdf http://www.google.com/search?q=sslshader This could be handy for desktops doing ZFS crypto (and even browser SSL and/or SSH), but few servers have decent graphics cards (and SPARC systems don't even have video ports by default :). Agree. The most challenging part is coding, as long as there is an empty PCIE slot in the server. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Encryption accelerator card recommendations.[GPU acceleration of ZFS]
FYI There is another thread named -- GPU acceleration of ZFS in this list to discuss the possibility of utilizing the power of GPGPU. I posted here: Good day, I think ZFS can take advantage of using a GPU for sha256 calculation, encryption and maybe compression. Modern video cards, like the 5xxx or 6xxx ATI HD Series, can do sha256 calculation 50-100 times faster than a modern 4-core CPU. The kgpu project for linux shows nice results. 'zfs scrub' would work freely on high performance ZFS pools. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. Is anyone interested in it? Best regards, Anatoly Legkodymov. On Tue, May 10, 2011 at 11:29 AM, Anatoly legko...@fastmail.fm wrote: Good day, I think ZFS can take advantage of using a GPU for sha256 calculation, encryption and maybe compression. Modern video cards, like the 5xxx or 6xxx ATI HD Series, can do sha256 calculation 50-100 times faster than a modern 4-core CPU. Ignoring optimizations from SIMD extensions like SSE and friends, this is probably true. However, the GPU also has to deal with the overhead of data transfer to itself before it can even begin crunching data. Granted, a Gen. 2 x16 link is quite speedy, but is CPU performance really so poor that a GPU can still out-perform it? My undergrad thesis dealt with computational acceleration utilizing CUDA, and the datasets had to scale quite a ways before there was a noticeable advantage in using a Tesla or similar over a bog-standard i7-920. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. This, and keep in mind that most of the professional users here will likely be using professional hardware, where a simple 8MB Rage XL gets the job done thanks to the magic of out-of-band management cards and other such facilities. Even as a home user, I have not placed a high-end videocard into my machine; I use a $5 ATI PCI videocard that saw about an hour of use whilst I installed Solaris 11. -- --khd IMHO, zfs needs to run on all kinds of HW; T-series CMT servers have been able to help with sha calculation since the T1 days, but I did not see any work in ZFS to take advantage of it. On 5/10/2011 11:29 AM, Anatoly wrote: Good day, I think ZFS can take advantage of using a GPU for sha256 calculation, encryption and maybe compression. Modern video cards, like the 5xxx or 6xxx ATI HD Series, can do sha256 calculation 50-100 times faster than a modern 4-core CPU. The kgpu project for linux shows nice results. 'zfs scrub' would work freely on high performance ZFS pools. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. Is anyone interested in it? Best regards, Anatoly Legkodymov. On Tue, May 10, 2011 at 10:29 PM, Anatoly legko...@fastmail.fm wrote: Good day, I think ZFS can take advantage of using a GPU for sha256 calculation, encryption and maybe compression. Modern video cards, like the 5xxx or 6xxx ATI HD Series, can do sha256 calculation 50-100 times faster than a modern 4-core CPU. The kgpu project for linux shows nice results. 'zfs scrub' would work freely on high performance ZFS pools. The only problem is that there are no AMD/Nvidia drivers for Solaris that support hardware-assisted OpenCL. Is anyone interested in it? This isn't technically true. The NVIDIA drivers support compute, but there are other parts of the toolchain missing.
/* I don't know about ATI/AMD, but I'd guess they likely don't support compute across platforms */ /* Disclaimer - The company I work for has a working HMPP compiler for Solaris/FreeBSD and we may soon support CUDA */ On 10 May 2011, at 16:44, Hung-Sheng Tsao (LaoTsao) Ph. D. wrote: IMHO, zfs needs to run on all kinds of HW. T-series CMT servers have had hardware that can help with sha calculation since the T1 days, but I did not see any work in ZFS to take advantage of it. That support would be in the crypto framework though, not ZFS per se. So I think the OP might consider how best to add GPU support to the crypto framework. Chris ___ Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of David Magda Sent: 星期二, 六月 28, 2011 9:23 To: Bill Sommerfeld Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Encryption accelerator card recommendations. On Jun 27, 2011, at 18:32, Bill Sommerfeld wrote: On 06/27/11 15:24, David Magda wrote: Given the amount of transistors that are available nowadays I think it'd be simpler to just create a series of SIMD instructions right in/on general CPUs, and skip the whole co-processor angle. see: http://en.wikipedia.org/wiki/AES_instruction_set Present in many current Intel CPUs; also expected to be present in AMD's Bulldozer based CPUs. Now compare that with the T-series stuff that also handles 3DES, RC4, RSA2048, DSA, DH, ECC, MD5,
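Before assuming the hashing has to be offloaded at all, it is worth measuring the CPU baseline and checking for on-chip crypto support. A rough sketch, assuming OpenSSL and the standard Solaris tools are installed (none of these commands come from the thread itself):

  # single-thread SHA-256 throughput using OpenSSL's built-in benchmark
  openssl speed sha256

  # Solaris alternative: time the bundled digest(1) over a large file
  time digest -a sha256 /path/to/large/file

  # on recent x86 builds, isainfo -v lists instruction-set extensions;
  # an "aes" entry means AES-NI is available to the crypto framework
  isainfo -v

If the CPU already pushes hundreds of MB/s per core, the PCIe transfer overhead discussed above makes a GPU win much less likely.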
Re: [zfs-discuss] zfs global hot spares?
zpool status -x output would be useful. These error reports do not include a pointer to the faulty device. fmadm can also give more info. Yes. Thanks. mpathadm can be used to determine the device paths for this disk. Notice how the disk is offline multiple times. There is some sort of recovery going on here that continues to fail later. I call these wounded soldiers because they take a lot more care than a dead soldier. You would be better off if the drive completely died. I think it only works with mpt_sas (sas2), where multi-path is forcibly enabled. I agree the disk was in a sort of critical state before it died. The difficult point is that the OS can NOT automatically offline the wounded disk in the middle of the night (maybe because of the coming scsi reset storm), and nobody is around to do it manually. In my experience they start randomly and in some cases are not reproducible. It seems sort of agnostic? Isn't it? :-) Are you asking for fault tolerance? If so, then you need a fault tolerant system like a Tandem. If you are asking for a way to build a cost effective solution using commercial, off-the-shelf (COTS) components, then that is far beyond what can be easily said in a forum posting. -- richard Yeah. High availability is another topic which has more technical challenges. Anyway, thank you very much. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
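For anyone digging through a similar failure, the commands mentioned above look roughly like this (a sketch; the device name is hypothetical, modeled on the WWN in the logs):

  zpool status -x     # report only pools with problems
  fmadm faulty        # FMA's view: fault class, affected FRU, suspect device
  mpathadm list lu    # enumerate multipathed logical units
  mpathadm show lu /dev/rdsk/c0t5000C50009723937d0s2   # map one LU to its paths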
Re: [zfs-discuss] zfs global hot spares?
-Original Message- From: Fred Liu Sent: 星期四, 六月 16, 2011 17:28 To: Fred Liu; 'Richard Elling' Cc: 'Jim Klimov'; 'zfs-discuss@opensolaris.org' Subject: RE: [zfs-discuss] zfs global hot spares? Fixing a typo in my last thread... -Original Message- From: Fred Liu Sent: 星期四, 六月 16, 2011 17:22 To: 'Richard Elling' Cc: Jim Klimov; zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] zfs global hot spares? This message is from the disk saying that it aborted a command. These are usually preceded by a reset, as shown here. What caused the reset condition? Was it actually target 11 or did target 11 get caught up in the reset storm? It happened in the middle of the night and nobody touched the file box. I assume it is the transitional state before the disk is *thoroughly* damaged: Jun 10 09:34:11 cn03 fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-FD, TYPE: Fault, VER: 1, SEVERITY: Major Jun 10 09:34:11 cn03 EVENT-TIME: Fri Jun 10 09:34:11 CST 2011 Jun 10 09:34:11 cn03 PLATFORM: X8DTH-i-6-iF-6F, CSN: 1234567890, HOSTNAME: cn03 Jun 10 09:34:11 cn03 SOURCE: zfs-diagnosis, REV: 1.0 Jun 10 09:34:11 cn03 EVENT-ID: 4f4bfc2c-f653-ed20-ab13-eef72224af5e Jun 10 09:34:11 cn03 DESC: The number of I/O errors associated with a ZFS device exceeded Jun 10 09:34:11 cn03 acceptable levels. Refer to http://sun.com/msg/ZFS-8000-FD for more information. Jun 10 09:34:11 cn03 AUTO-RESPONSE: The device has been offlined and marked as faulted. An attempt Jun 10 09:34:11 cn03 will be made to activate a hot spare if available. Jun 10 09:34:11 cn03 IMPACT: Fault tolerance of the pool may be compromised. Jun 10 09:34:11 cn03 REC-ACTION: Run 'zpool status -x' and replace the bad device. After I rebooted it, I got: Jun 10 11:38:49 cn03 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.11 Version snv_134 64-bit Jun 10 11:38:49 cn03 genunix: [ID 683174 kern.notice] Copyright 1983-2010 Sun Microsystems, Inc. All rights reserved. Jun 10 11:38:49 cn03 Use is subject to license terms. Jun 10 11:38:49 cn03 unix: [ID 126719 kern.info] features: 7f7fsse4_2,sse4_1,ssse3,cpuid,mwait,tscp,cmp,cx16,sse3,nx,asysc,htt,sse2,sse,sep,pat,cx8,pae,mca,mmx,cmov,de,pge,mtrr,msr,tsc,lgpg Jun 10 11:39:06 cn03 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,3410@9/pci1000,72@0 (mpt_sas0): Jun 10 11:39:06 cn03 mptsas0 unrecognized capability 0x3 Jun 10 11:39:42 cn03 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009723937 (sd3): Jun 10 11:39:42 cn03 drive offline Jun 10 11:39:47 cn03 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009723937 (sd3): Jun 10 11:39:47 cn03 drive offline Jun 10 11:39:52 cn03 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009723937 (sd3): Jun 10 11:39:52 cn03 drive offline Jun 10 11:39:57 cn03 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009723937 (sd3): Jun 10 11:39:57 cn03 drive offline Hot spare will not help you here. The problem is not constrained to one disk. In fact, a hot spare may be the worst thing here because it can kick in for the disk complaining about a clogged expander or spurious resets. This causes a resilver that reads from the actual broken disk, that causes more resets, that kicks out another disk that causes a resilver, and so on. -- richard So warm spares could be the better choice in this situation? BTW, under what conditions will the scsi reset storm happen? How can we be immune to it, so as NOT to interrupt the file service? Thanks. 
Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs global hot spares?
-Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期三, 六月 15, 2011 14:25 To: Fred Liu Cc: Jim Klimov; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs global hot spares? On Jun 14, 2011, at 10:31 PM, Fred Liu wrote: -Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期三, 六月 15, 2011 11:59 To: Fred Liu Cc: Jim Klimov; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs global hot spares? On Jun 14, 2011, at 2:36 PM, Fred Liu wrote: What is the difference between warm spares and hot spares? Warm spares are connected and powered. Hot spares are connected, powered, and automatically brought online to replace a failed disk. The reason I'm leaning towards warm spares is because I see more replacements than failed disks... a bad thing. -- richard You mean so-called failed disks replaced by hot spares are not really physically damaged? Do I misunderstand? That is not how I would phrase it, let's try: assuming the disk is failed because you can't access it or it returns bad data is a bad assumption. -- richard Gotcha! But if there is a real failed disk, we have to do a manual warm-spare disk replacement. If the pool's failmode is set to wait, we experience an NFS service time-out, which interrupts the NFS service. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
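A manual warm-spare swap is a single zpool replace; something like the following sketch (pool and disk names hypothetical):

  # c2t3d0 is the failed disk, c2t9d0 the powered-but-unconfigured warm spare
  zpool replace tank c2t3d0 c2t9d0
  zpool status tank     # watch the resilver run against the new disk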
Re: [zfs-discuss] zfs global hot spares?
This is only true if the pool is not protected. Please protect your pool with mirroring or raidz*. -- richard Yes. We use a raidz2 without any spares. In theory, with one disk broken, there should be no problem. But in reality, we saw NFS service interrupted: Jun 9 23:28:59 cn03 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1 Jun 9 23:28:59 cn03 scsi: [ID 365881 kern.info] /pci@0,0/pci8086,3410@9/pci1000,72@0 (mpt_sas0): Jun 9 23:28:59 cn03 Log info 0x3114 received for target 11. Jun 9 23:28:59 cn03 scsi_status=0x0, ioc_status=0x8048, scsi_state=0xc Truncating similar scsi errors Jun 10 08:04:38 cn03 svc.startd[9]: [ID 122153 daemon.warning] svc:/network/nfs/server:default: Method or service exit timed out. Killing contract 71840. Jun 10 08:04:38 cn03 svc.startd[9]: [ID 636263 daemon.warning] svc:/network/nfs/server:default: Method /lib/svc/method/nfs-server stop 105 failed due to signal KILL. Truncating similar scsi errors Jun 10 09:04:38 cn03 svc.startd[9]: [ID 122153 daemon.warning] svc:/network/nfs/server:default: Method or service exit timed out. Killing contract 71855. Jun 10 09:04:38 cn03 svc.startd[9]: [ID 636263 daemon.warning] svc:/network/nfs/server:default: Method /lib/svc/method/nfs-server stop 105 failed due to signal KILL. This was outside my original assumptions when I designed this file box. But this NFS interruption may **NOT** be due to the degraded zpool, although one broken disk is almost the only **obvious** event in the night. I will add a hot spare and enable autoreplace to see if it will happen again. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
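The two pool properties in play here can be inspected and tuned; a sketch (pool name hypothetical):

  zpool get failmode tank            # wait (default) | continue | panic
  zpool set failmode=continue tank   # return EIO to callers instead of blocking
  zpool set autoreplace=on tank      # auto-replace a disk inserted into the
                                     # same physical slot as a failed one

With failmode=wait, a pool that becomes unavailable blocks I/O indefinitely, which may explain why the NFS service hung and had to be killed rather than erroring out.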
Re: [zfs-discuss] zfs global hot spares?
What is the difference between warm spares and hot spares? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] The length of zpool history
I assume the history is stored in the metadata. Is it possible to configure how much history can be stored/displayed? I know it is doable via external/additional automation, like porting it to a database. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
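For reference, the history is read back with zpool history; as far as I know it lives in a ring buffer whose size is fixed at pool creation (a small fraction of pool capacity), so the retention length is not directly tunable. A sketch (pool name hypothetical):

  zpool history tank       # commands logged since the pool was created
  zpool history -l tank    # long format: user, hostname and zone per record
  zpool history -i tank    # include internally logged ZFS events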
[zfs-discuss] zpool/zfs properties in SNMP
Hi, Has anyone successfully polled the zpool/zfs properties through SNMP? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
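I have not seen a native ZFS MIB; the usual workaround is Net-SNMP's extend mechanism shelling out to the CLI. A sketch, untested — paths and the config file location depend on your snmpd build (Solaris' SMA is Net-SNMP based), and the pool name is hypothetical:

  # in snmpd.conf:
  extend zpool-health /usr/sbin/zpool list -H -o health tank
  extend zfs-used /usr/sbin/zfs get -H -o value used tank

  # then from the manager:
  snmpwalk -v2c -c public filer NET-SNMP-EXTEND-MIB::nsExtendOutputFull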
Re: [zfs-discuss] zfs global hot spares?
-Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: 星期三, 六月 15, 2011 11:59 To: Fred Liu Cc: Jim Klimov; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] zfs global hot spares? On Jun 14, 2011, at 2:36 PM, Fred Liu wrote: What is the difference between warm spares and hot spares? Warm spares are connected and powered. Hot spares are connected, powered, and automatically brought online to replace a failed disk. The reason I'm leaning towards warm spares is because I see more replacements than failed disks... a bad thing. -- richard You mean so-called failed disks replaced by hot spares are not really physically damaged? Do I misunderstand? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
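In zpool terms the difference is only who initiates the replacement; a sketch with hypothetical disk names:

  zpool add tank spare c4t0d0        # hot spare: part of the pool config,
                                     # brought in automatically when FMA retires a disk
  # a warm spare is simply a powered disk left out of the config,
  # swapped in by hand once *you* decide the disk is really dead:
  zpool replace tank c2t1d0 c4t0d0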
[zfs-discuss] degraded pool will stop NFS service when there is no hot spare?
Hi, We met this yesterday. The degraded pool was exported and I had to re-import it manually. Is that normal? I assume it should not be, but has anyone met a similar case? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ACARD ANS-9010 cannot work well with LSI 9211-8i SAS interface
Just want to share with you. We have found, and been suffering from, some weird issues because of this combination. It is better to connect them directly to the SATA ports on the mainboard. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How does ZFS dedup space accounting work with quota?
-Original Message- From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: 星期二, 四月 26, 2011 12:47 To: Ian Collins Cc: Fred Liu; ZFS discuss Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work with quota? On 4/25/2011 6:23 PM, Ian Collins wrote: On 04/26/11 01:13 PM, Fred Liu wrote: Hmm, it seems dedup is pool-based, not filesystem-based. That's correct. Although it can be turned off and on at the filesystem level (assuming it is enabled for the pool). Which is effectively the same as choosing per-filesystem dedup. Just the inverse. You turn it on at the pool level, and off at the filesystem level, which is identical to the off at the pool level, on at the filesystem level model that NetApp uses. My original thought was just to enable dedup on one file system to check whether it is mature enough in the production env. And I have only one pool. If dedup were filesystem-based, the effect of dedup would be confined to one file system and would not propagate to the whole pool. Just disabling dedup cannot get rid of all the effects (such as the possible performance degradation ... etc), because the already-dedup'd data is still there and the DDT is still there. The only thorough way I can think of is totally removing all the dedup'd data. But is that the real thorough way? And also the dedup space saving is kind of indirect. We cannot directly get the space saving in the file system where dedup is actually enabled, for it is pool-based. Even from the pool perspective, it is still sort of indirect and obscure in my opinion; the real space saving is the absolute delta between the output of 'zpool list' and the sum of 'du' on all the folders in the pool (or 'df' on the mount point folder; not sure if a percentage like 123% will occur or not... grinning ^:^ ). But in NetApp, we can use 'df -s' to directly and easily get the space saving. If it could have fine-grained granularity (like per-filesystem), that would be great! It is a pity! NetApp is sweet in this aspect. So what happens to user B's quota if user B stores a ton of data that is a duplicate of user A's data and then user A deletes the original? Actually, right now, nothing happens to B's quota. He's always charged the un-deduped amount for his quota usage, whether or not dedup is enabled, and regardless of how much of his data is actually deduped. Which is as it should be, as quotas are about limiting how much a user is consuming, not how much the backend needs to store that data consumption. e.g. A, B, C, D all have 100MB of data in the pool, with dedup on. 20MB of storage has a dedup factor of 3:1 (common to A, B, C) 50MB of storage has a dedup factor of 2:1 (common to A and B) Thus, the amount of unique data would be: A: 100 - 20 - 50 = 30MB B: 100 - 20 - 50 = 30MB C: 100 - 20 = 80MB D: 100MB Summing it all up, you would have an actual storage consumption of 70 (50+20 deduped) + 30+30+80+100 (unique data) = 310MB of actual storage, for 400MB of apparent storage (i.e. a dedup ratio of 1.29:1). A, B, C, D would each still have a quota usage of 100MB. It is true, quota is in charge of logical data, not physical data. Let's assume an interesting scenario -- say the pool is 100% full in logical data (such that 'df' tells you 100% used) but not full in physical data (such that 'zpool list' tells you there is still some space available). Can we continue writing data into this pool? Anybody interested in doing this experiment? ;-) Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
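The pool-level and filesystem-level views can be put side by side; a sketch with hypothetical names:

  zpool list tank        # physical: SIZE, ALLOC, FREE, plus the DEDUP ratio
  zpool get dedupratio tank
  zfs list -o name,used,avail,refer tank/fs   # logical sizes, which is what quota sees
  df -h /tank/fs         # also logical; not dedup-aware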
Re: [zfs-discuss] How does ZFS dedup space accounting work with quota?
-Original Message- From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Wednesday, April 27, 2011 12:07 AM To: Fred Liu Cc: Ian Collins; ZFS discuss Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work with quota? On 4/26/2011 3:59 AM, Fred Liu wrote: -Original Message- From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: 星期二, 四月 26, 2011 12:47 To: Ian Collins Cc: Fred Liu; ZFS discuss Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work with quota? On 4/25/2011 6:23 PM, Ian Collins wrote: On 04/26/11 01:13 PM, Fred Liu wrote: Hmm, it seems dedup is pool-based, not filesystem-based. That's correct. Although it can be turned off and on at the filesystem level (assuming it is enabled for the pool). Which is effectively the same as choosing per-filesystem dedup. Just the inverse. You turn it on at the pool level, and off at the filesystem level, which is identical to the off at the pool level, on at the filesystem level model that NetApp uses. My original thought was just to enable dedup on one file system to check whether it is mature enough in the production env. And I have only one pool. If dedup were filesystem-based, the effect of dedup would be confined to one file system and would not propagate to the whole pool. Just disabling dedup cannot get rid of all the effects (such as the possible performance degradation ... etc), because the already-dedup'd data is still there and the DDT is still there. The only thorough way I can think of is totally removing all the dedup'd data. But is that the real thorough way? You can do that now. Enable Dedup at the pool level. Turn it OFF on all the existing filesystems. Make a new test filesystem, and run your tests. Remember, only data written AFTER the dedup value is turned on will be de-duped. Existing data will NOT be. And, though dedup is enabled at the pool level, it will only consider data written into filesystems that have the dedup value as ON. Thus, in your case, writing to the single filesystem with dedup on will NOT have ZFS check for duplicates from the other filesystems. It will check only inside itself, as it's the only filesystem with dedup enabled. If the experiment fails, you can safely destroy your test dedup filesystem, then unset dedup at the pool level, and you're fine. Thanks. I will have a try. And also the dedup space saving is kind of indirect. We cannot directly get the space saving in the file system where dedup is actually enabled, for it is pool-based. Even from the pool perspective, it is still sort of indirect and obscure in my opinion; the real space saving is the absolute delta between the output of 'zpool list' and the sum of 'du' on all the folders in the pool (or 'df' on the mount point folder; not sure if a percentage like 123% will occur or not... grinning ^:^ ). But in NetApp, we can use 'df -s' to directly and easily get the space saving. That is true. Honestly, however, it would be hard to do this on a per-filesystem basis. ZFS allows for the creation of an arbitrary number of filesystems in a pool, far more than NetApp does. The result is that the filesystem concept is much more flexible in ZFS. The downside is that keeping dedup statistics for a given arbitrary set of data is logistically difficult. An analogy with NetApp is thus: Can you use any tool to find the dedup ratio of an arbitrary directory tree INSIDE a NetApp filesystem? That is true. There is no apples-to-apples counterpart in NetApp for a file system in ZFS. 
If we treat a 'volume' in NetApp as the counterpart of a 'file system' in ZFS, then that is doable, because dedup in NetApp is volume-based. It is true, quota is in charge of logical data, not physical data. Let's assume an interesting scenario -- say the pool is 100% full in logical data (such that 'df' tells you 100% used) but not full in physical data (such that 'zpool list' tells you there is still some space available). Can we continue writing data into this pool? Sure, you can keep writing to the volume. What matters to the OS is what *it* thinks, not what some userland app thinks. OK. And then what will the output of 'df' be? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
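Erik's recipe from the previous message, in command form — a sketch with hypothetical dataset names (dedup is a dataset property, so "pool level" here means the pool's root dataset):

  zfs set dedup=on tank                # enable at the top; children inherit
  zfs set dedup=off tank/home          # pin it off on every existing filesystem
  zfs set dedup=off tank/builds
  zfs create -o dedup=on tank/ddtest   # only this filesystem writes dedup'd blocks
  # if the experiment fails:
  zfs destroy tank/ddtest
  zfs set dedup=off tank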
Re: [zfs-discuss] How does ZFS dedup space accounting work with quota?
-Original Message- From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Wednesday, April 27, 2011 1:06 AM To: Fred Liu Cc: Ian Collins; ZFS discuss Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work with quota? On 4/26/2011 9:29 AM, Fred Liu wrote: From: Erik Trimble [mailto:erik.trim...@oracle.com] It is true, quota is in charge of logical data, not physical data. Let's assume an interesting scenario -- say the pool is 100% full in logical data (such that 'df' tells you 100% used) but not full in physical data (such that 'zpool list' tells you there is still some space available). Can we continue writing data into this pool? Sure, you can keep writing to the volume. What matters to the OS is what *it* thinks, not what some userland app thinks. OK. And then what will the output of 'df' be? Thanks. Fred 110% full. Or whatever. df will just keep reporting what it sees. Even if what it *thinks* doesn't make sense to the human reading it. Gotcha! Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How does ZFS dedup space accounting work with quota?
Cindy, Following is quoted from ZFS Dedup FAQ: Deduplicated space accounting is reported at the pool level. You must use the zpool list command rather than the zfs list command to identify disk space consumption when dedup is enabled. If you use the zfs list command to review deduplicated space, you might see that the file system appears to be increasing because we're able to store more data on the same physical device. Using the zpool list will show you how much physical space is being consumed and it will also show you the dedup ratio. The df command is not dedup-aware and will not provide accurate space accounting. So how can I set the quota size on a file system with dedup enabled? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] How does ZFS dedup space accounting work with quota?
Hmm, it seems dedup is pool-based, not filesystem-based. If it could have fine-grained granularity (like per-filesystem), that would be great! It is a pity! NetApp is sweet in this aspect. Thanks. Fred -Original Message- From: Brandon High [mailto:bh...@freaks.com] Sent: 星期二, 四月 26, 2011 8:50 To: Fred Liu Cc: cindy.swearin...@oracle.com; ZFS discuss Subject: Re: [zfs-discuss] How does ZFS dedup space accounting work with quota? On Mon, Apr 25, 2011 at 4:53 PM, Fred Liu fred_...@issi.com wrote: So how can I set the quota size on a file system with dedup enabled? I believe the quota applies to the non-dedup'd data size. If a user stores 10G of data, it will use 10G of quota, regardless of whether it dedups at 100:1 or 1:1. -B -- Brandon High : bh...@freaks.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS
Thanks. But does noacl work with NFSv3? Thanks. Fred -Original Message- From: Cameron Hanover [mailto:chano...@umich.edu] Sent: 星期四, 三月 17, 2011 1:34 To: Fred Liu Cc: ZFS Discussions Subject: Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based- NFS I thought this explained it well. http://www.cuddletech.com/blog/pivot/entry.php?id=939 'NFSv3, ACL's and ZFS' is the relevant part. I've told my customers that run into this to use the noacl mount option. - Cameron Hanover chano...@umich.edu Fill with mingled cream and amber, I will drain that glass again. Such hilarious visions clamber Through the chamber of my brain ― Quaintest thoughts ― queerest fancies Come to life and fade away; What care I how time advances? I am drinking ale today. ―-Edgar Allan Poe On Mar 16, 2011, at 9:56 AM, Fred Liu wrote: It always shows info like 'operation not supported'. Any workaround? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
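On the Linux side the mount would look roughly like this (a sketch; whether noacl is honored depends on the client's kernel and nfs-utils, and as I understand it the option only applies to NFSv2/v3; server and paths are hypothetical):

  mount -t nfs -o vers=3,noacl filer:/cn03/3 /mnt/cn03

With noacl the client skips the NFSACL side protocol, so tools like GNU cp -p stop tripping over attempts to copy POSIX-draft ACLs that the ZFS server cannot express.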
Re: [zfs-discuss] best migration path from Solaris 10
Probably, we need to place a tag before zfs -- Opensource-ZFS or Oracle-ZFS -- after the Solaris 11 release. If that is true, these two ZFSes will definitely evolve in different directions. BTW, did Oracle unveil the actual release date? We are also at the crossroads... Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Fajar A. Nugraha Sent: 星期日, 三月 20, 2011 14:55 To: Pawel Jakub Dawidek Cc: openindiana-disc...@openindiana.org; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] best migration path from Solaris 10 On Sun, Mar 20, 2011 at 4:05 AM, Pawel Jakub Dawidek p...@freebsd.org wrote: On Fri, Mar 18, 2011 at 06:22:01PM -0700, Garrett D'Amore wrote: Newer versions of FreeBSD have newer ZFS code. Yes, we are at v28 at this point (the latest open-source version). That said, ZFS on FreeBSD is kind of a 2nd class citizen still. [...] That's actually not true. There are more FreeBSD committers working on ZFS than on UFS. How is the performance of ZFS under FreeBSD? Is it comparable to that in Solaris, or still slower due to some needed compatibility layer? -- Fajar ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS
It always shows info like 'operation not supported'. Any workaround? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS's ACL
It is from the ZFS ACL. Thanks. Fred From: Fred Liu Sent: Wednesday, March 16, 2011 9:57 PM To: ZFS Discussions Subject: GNU 'cp -p' can't work well with ZFS-based-NFS It always shows info like 'operation not supported'. Any workaround? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS's ACL
Yeah. But we are on a Linux NFS client. ;-( Is it doable to build the Sun cp on Linux? Where can I find the source code? Thanks. Fred -Original Message- From: jason.brian.k...@gmail.com [mailto:jason.brian.k...@gmail.com] On Behalf Of Jason Sent: Wednesday, March 16, 2011 10:06 PM To: Fred Liu Cc: ZFS Discussions Subject: Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS's ACL Use the Solaris cp (/usr/bin/cp) instead On Wed, Mar 16, 2011 at 8:59 AM, Fred Liu fred_...@issi.com wrote: It is from the ZFS ACL. Thanks. Fred From: Fred Liu Sent: Wednesday, March 16, 2011 9:57 PM To: ZFS Discussions Subject: GNU 'cp -p' can't work well with ZFS-based-NFS It always shows info like 'operation not supported'. Any workaround? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS's ACL
Sorry, I put the post in cc. I use NFSv3 (Linux 2.4 kernel) and coreutils-8.9. Thanks. Fred -Original Message- From: David Magda [mailto:dma...@ee.ryerson.ca] Sent: Wednesday, March 16, 2011 10:29 PM To: Fred Liu Cc: ZFS Discussions Subject: Re: [zfs-discuss] GNU 'cp -p' can't work well with ZFS-based-NFS's ACL On Wed, March 16, 2011 10:08, Fred Liu wrote: Yeah. But we are on a Linux NFS client. ;-( Is it doable to build the Sun cp on Linux? Where can I find the source code? Are you using NFSv4? Also, what version of GNU coreutils are you using ('cp' is usually part of the coreutils package)? What distribution and version? P.S. Please try to not top post. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] dual protocol on one file system?
Hi, Is it possible to run both CIFS and NFS on one file system over ZFS? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dual protocol on one file system?
Tim, Thanks. Is there a mapping mechanism like what Data ONTAP does, to map the permissions/ACLs between NIS/LDAP and AD? Thanks. Fred From: Tim Cook [mailto:t...@cook.ms] Sent: 星期日, 三月 13, 2011 9:53 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] dual protocol on one file system? On Sat, Mar 12, 2011 at 7:42 PM, Fred Liu fred_...@issi.com wrote: Hi, Is it possible to run both CIFS and NFS on one file system over ZFS? Thanks. Fred Yes, but managing permissions in that scenario is generally a nightmare. If you're using NFSv4 with AD integration, it's a bit more manageable, but it's still definitely a work in progress. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
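On (Open)Solaris the rough equivalent is dual sharing plus the idmap service; a sketch with hypothetical dataset and identity names:

  zfs set sharenfs=on tank/proj
  zfs set sharesmb=on tank/proj    # served by the in-kernel CIFS service

  # map a Windows identity to a Unix one, similar in spirit to NetApp's usermap:
  idmap add winuser:fred@CORP.EXAMPLE.COM unixuser:fred
  idmap list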
[zfs-discuss] performance of the whole pool suddenly degrades for a while and recovers when one file system tries to exceed the quota
Hi, Has anyone met this? I see it every time, just as if somebody suddenly steps on the brake pedal in a car and then releases it. Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reliable, enterprise worthy JBODs?
Rocky, Can individuals buy your products in the retail market? Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Rocky Shek Sent: 星期五, 一月 28, 2011 7:02 To: 'Pasi Kärkkäinen' Cc: 'Philip Brown'; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reliable, enterprise worthy JBODs? Pasi, I have not tried the Opensolaris FMA yet. But we have developed a tool called DSM that allows users to locate disk drives, identify failed drives, and check FRU parts status. http://dataonstorage.com/dataon-products/dsm-30-for-nexentastor.html We also spent time in the past making sure the SES chip works with major RAID controller cards. Rocky -Original Message- From: Pasi Kärkkäinen [mailto:pa...@iki.fi] Sent: Tuesday, January 25, 2011 1:30 PM To: Rocky Shek Cc: 'Philip Brown'; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reliable, enterprise worthy JBODs? On Tue, Jan 25, 2011 at 11:53:49AM -0800, Rocky Shek wrote: Philip, You can consider DataON DNS-1600 4U 24Bay 6Gb/s SAS JBOD Storage. http://dataonstorage.com/dataon-products/dns-1600-4u-6g-sas-to-sas-sata-jbod-storage.html It is the best fit for ZFS Storage application. It can be a good replacement of Sun/Oracle J4400 and J4200 There are also Ultra density DNS-1660 4U 60 Bay 6Gb/s SAS JBOD Storage and other form factor JBOD. http://dataonstorage.com/dataon-products/6g-sas-jbod/dns-1660-4u-60-bay-6g-35inch-sassata-jbod.html Does (Open)Solaris FMA work with these DataON JBODs? .. meaning do the failure LEDs work automatically in the case of disk failure? I guess that requires the SES chip on the JBOD to include proper drive identification for all slots. -- Pasi Rocky -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Philip Brown Sent: Tuesday, January 25, 2011 10:05 AM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] reliable, enterprise worthy JBODs? So, another hardware question :) ZFS has been touted as taking maximal advantage of disk hardware, to the point where it can be used efficiently and cost-effectively on JBODs, rather than having to throw more expensive RAID arrays at it. Only trouble is.. JBODs seem to have disappeared :( Sun/Oracle has discontinued its j4000 line, with no replacement that I can see. IBM seems to have some nice looking hardware in the form of its EXP3500 expansion trays... but they only support it connected to an IBM (SAS) controller... which is only supported when plugged into IBM server hardware :( Any other suggestions for (large-)enterprise-grade, supported JBOD hardware for ZFS these days? Either fibre or SAS would be okay. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] reliable, enterprise worthy JBODs?
Khushil, Thanks. Fred From: Khushil Dep [mailto:khushil@gmail.com] Sent: 星期一, 一月 31, 2011 17:37 To: Fred Liu Cc: Rocky Shek; Pasi Kärkkäinen; Philip Brown; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reliable, enterprise worthy JBODs? You should also check out VA Technologies (http://www.va-technologies.com/servicesStorage.php) in the UK, which supplies a range of JBODs. I've used these in very large deployments with no JBOD-related failures to date. Interestingly, they also list Coraid boxes. --- W. A. Khushil Dep - khushil@gmail.com - 07905374843 Windows - Linux - Solaris - ZFS - XenServer - FreeBSD - C/C++ - PHP/Perl - LAMP - Nexenta - Development - Consulting Contracting http://www.khushil.com/ - http://www.facebook.com/GlobalOverlord On 31 January 2011 09:15, Fred Liu fred_...@issi.com wrote: Rocky, Can individuals buy your products in the retail market? Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Rocky Shek Sent: 星期五, 一月 28, 2011 7:02 To: 'Pasi Kärkkäinen' Cc: 'Philip Brown'; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reliable, enterprise worthy JBODs? Pasi, I have not tried the Opensolaris FMA yet. But we have developed a tool called DSM that allows users to locate disk drives, identify failed drives, and check FRU parts status. http://dataonstorage.com/dataon-products/dsm-30-for-nexentastor.html We also spent time in the past making sure the SES chip works with major RAID controller cards. Rocky -Original Message- From: Pasi Kärkkäinen [mailto:pa...@iki.fi] Sent: Tuesday, January 25, 2011 1:30 PM To: Rocky Shek Cc: 'Philip Brown'; zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] reliable, enterprise worthy JBODs? On Tue, Jan 25, 2011 at 11:53:49AM -0800, Rocky Shek wrote: Philip, You can consider DataON DNS-1600 4U 24Bay 6Gb/s SAS JBOD Storage. http://dataonstorage.com/dataon-products/dns-1600-4u-6g-sas-to-sas-sata-jbod-storage.html It is the best fit for ZFS Storage application. It can be a good replacement of Sun/Oracle J4400 and J4200 There are also Ultra density DNS-1660 4U 60 Bay 6Gb/s SAS JBOD Storage and other form factor JBOD. http://dataonstorage.com/dataon-products/6g-sas-jbod/dns-1660-4u-60-bay-6g-35inch-sassata-jbod.html Does (Open)Solaris FMA work with these DataON JBODs? .. meaning do the failure LEDs work automatically in the case of disk failure? I guess that requires the SES chip on the JBOD to include proper drive identification for all slots. -- Pasi Rocky -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Philip Brown Sent: Tuesday, January 25, 2011 10:05 AM To: zfs-discuss@opensolaris.org Subject: [zfs-discuss] reliable, enterprise worthy JBODs? So, another hardware question :) ZFS has been touted as taking maximal advantage of disk hardware, to the point where it can be used efficiently and cost-effectively on JBODs, rather than having to throw more expensive RAID arrays at it. Only trouble is.. JBODs seem to have disappeared :( Sun/Oracle has discontinued its j4000 line, with no replacement that I can see. 
IBM seems to have some nice-looking hardware in the form of its EXP3500 expansion trays... but they only support it connected to an IBM (SAS) controller... which is only supported when plugged into IBM server hardware :( Any other suggestions for (large-)enterprise-grade, supported JBOD hardware for ZFS these days? Either fibre or SAS would be okay. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL
I do the same with ACARD… Works well enough. Fred From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jason Warr Sent: 星期四, 十二月 30, 2010 8:56 To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] SAS/short stroking vs. SSDs for ZIL HyperDrive5 = ACard ANS9010 I have personally been wanting to try one of these for some time as a ZIL device. On 12/29/2010 06:35 PM, Kevin Walker wrote: You do seem to misunderstand ZIL. ZIL is quite simply write cache and using a short stroked rotating drive is never going to provide a performance increase that is worth talking about and more importantly ZIL was designed to be used with a RAM/Solid State Disk. We use sata2 HyperDrive5 RAM disks in mirrors and they work well and are far cheaper than STEC or other enterprise SSD's and have none of the issues related to TRIM... Highly recommended... ;-) http://www.hyperossystems.co.uk/ Kevin On 29 December 2010 13:40, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote: From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Tuesday, December 28, 2010 9:23 PM The question of IOPS here is relevant to conversation because of ZIL dedicated log. If you have advanced short-stroking to get the write latency of a log device down to zero, then it can compete against SSD for purposes of a log device, but nobody seems to believe such technology currently exists, and it certainly couldn't compete against SSD for random reads. (ZIL log is the only situation I know of, where write performance of a drive matters and read performance does not matter.) It seems that you may be confused. For the ZIL the drive's rotational latency (based on RPM) is the dominating factor and not the lateral head seek time on the media. In this case, the short-stroking you are talking about does not help any. The ZIL is already effectively short-stroking since it writes in order. Nope. I'm not confused at all. I'm making a distinction between short stroking and advanced short stroking. Where simple short stroking does as you said - eliminates the head seek time but is still susceptible to rotational latency. As you said, the ZIL already effectively accomplishes that end result, provided a dedicated spindle disk for log device, but does not do that if your ZIL is on the pool storage. And what I'm calling advanced short stroking are techniques that effectively eliminate or minimize both seek and rotational latency, to zero or near-zero. What I'm calling advanced short stroking doesn't exist as far as I know, but is theoretically possible through either special disk hardware or special drivers. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
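Whatever device ends up as the slog, attaching it is the same one-liner; a sketch with hypothetical pool and device names:

  # mirror the log so a dying ZIL device cannot take sync writes down with it
  zpool add tank log mirror c5t0d0 c5t1d0
  zpool status tank    # the mirrored slog shows up under a separate "logs" section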
Re: [zfs-discuss] Looking for 3.5 SSD for ZIL
ACARD 9010 is good enough in this aspect, if you need extremely high iops... Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Erik Trimble Sent: 星期四, 十二月 23, 2010 14:36 To: Christopher George Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Looking for 3.5 SSD for ZIL On 12/22/2010 7:05 AM, Christopher George wrote: I'm not sure if TRIM will work with ZFS. Neither ZFS nor the ZIL code in particular support TRIM. I was concerned that with trim support the SSD life and write throughput will get affected. Your concerns about sustainable write performance (IOPS) for a Flash based SSD are valid, the resulting degradation will vary depending on the controller used. Best regards, Christopher George Founder/CTO www.ddrdrive.com Christopher is correct, in that SSDs will suffer from (non-trivial) performance degradation after they've exhausted their free list, and haven't been told to reclaim emptied space. True battery-backed DRAM is the only permanent solution currently available which never runs into this problem. Even TRIM-supported SSDs eventually need reconditioning. However, this *can* be overcome by frequently re-formatting the SSD (not the Solaris format, a low-level format using a vendor-supplied utility). It's generally a simple thing, but requires pulling the SSD from the server, connecting it to either a Linux or Windows box, running the reformatter, then replacing the SSD. Which is a PITA. But, still a bit cheaper than buying a DDRdrive. wink -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Looking for 3.5 SSD for ZIL
ACARD 9010 is good enough in this aspect, if you DON'T need extremely high IOPS... Sorry for the typo. Fred -Original Message- From: Fred Liu Sent: 星期四, 十二月 23, 2010 15:30 To: 'Erik Trimble'; Christopher George Cc: zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] Looking for 3.5 SSD for ZIL ACARD 9010 is good enough in this aspect, if you need extremely high iops... Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Erik Trimble Sent: 星期四, 十二月 23, 2010 14:36 To: Christopher George Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Looking for 3.5 SSD for ZIL On 12/22/2010 7:05 AM, Christopher George wrote: I'm not sure if TRIM will work with ZFS. Neither ZFS nor the ZIL code in particular support TRIM. I was concerned that with trim support the SSD life and write throughput will get affected. Your concerns about sustainable write performance (IOPS) for a Flash based SSD are valid, the resulting degradation will vary depending on the controller used. Best regards, Christopher George Founder/CTO www.ddrdrive.com Christopher is correct, in that SSDs will suffer from (non-trivial) performance degradation after they've exhausted their free list, and haven't been told to reclaim emptied space. True battery-backed DRAM is the only permanent solution currently available which never runs into this problem. Even TRIM-supported SSDs eventually need reconditioning. However, this *can* be overcome by frequently re-formatting the SSD (not the Solaris format, a low-level format using a vendor-supplied utility). It's generally a simple thing, but requires pulling the SSD from the server, connecting it to either a Linux or Windows box, running the reformatter, then replacing the SSD. Which is a PITA. But, still a bit cheaper than buying a DDRdrive. wink -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] very slow boot: stuck at mounting zfs filesystems
Failed ZIL devices will also cause this... Fred From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Wolfram Tomalla Sent: Wednesday, December 08, 2010 10:40 PM To: Frank Van Damme Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] very slow boot: stuck at mounting zfs filesystems Hi Frank, you might face the problem of lots of snapshots of your filesystems. For each snapshot a device is created during import of the pool. This can easily lead to an extended startup time. On my system it took about 15 minutes for 3500 snapshots. 2010/12/8 Frank Van Damme frank.vanda...@gmail.com Hello list, I'm having trouble with a server holding a lot of data. After a few months of uptime, it is currently rebooting from a lockup (reason unknown so far) but it is taking hours to boot up again. The boot process is stuck at the stage where it says: mounting zfs filesystems (1/5). The machine responds to pings and keystrokes. I can see disk activity; the disk leds blink one after another. The file system layout is: a 40 GB mirror for the syspool, and a raidz volume over 4 2TB disks which I use for taking backups (=the purpose of this machine). I have deduplication enabled on the backups pool (which turned out to be pretty slow for file deletes since there are a lot of files on the backups pool and I haven't installed an l2arc yet). The main memory is 6 GB, it's an HP server running Nexenta core platform (kernel version 134f). I assume sooner or later the machine will boot up, but I'm in a bit of a panic about how to solve this permanently - after all the last thing I want is not being able to restore data one day because it takes days to boot the machine. Does anyone have an idea how much longer it may take and if the problem may have anything to do with dedup? -- Frank Van Damme No part of this copyright message may be reproduced, read or seen, dead or alive or by any means, including but not limited to telepathy without the benevolence of the author. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
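Once the pool is finally imported, it is easy to see how much the import had to plumb through; a sketch:

  zfs list -H -t snapshot -o name | wc -l   # snapshots the import has to walk
  zfs list -H -t volume -o name | wc -l     # zvols (and their snapshots) also get device nodes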
Re: [zfs-discuss] 3TB HDD in ZFS
I haven't tested them, but we're using multi-terabyte iSCSI volumes now, so I don't really see what could be different. The only possible issue I know of is that 3TB drives use 4K sectors, which might not be optimal in all environments. Vennlige hilsener / Best regards A 3TB HDD needs UEFI rather than the traditional BIOS, plus OS support. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] 3TB HDD in ZFS
Hi, Does anyone have experience with 3TB HDDs in ZFS? Can Solaris recognize these new HDDs? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] RamSan experience in ZFS
Hi, Does anyone have experience with Texas Memory Systems' RamSan in ZFS? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] WarpDrive SLP-300
http://www.lsi.com/channel/about_channel/whatsnew/warpdrive_slp300/index.html Good stuff for ZFS. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] WarpDrive SLP-300
Yeah, no driver issue. BTW, were any new storage-controller-related drivers introduced in snv151a? LSI seems to be the only one working very closely with Oracle/Sun. Thanks. Fred -Original Message- From: James C. McPherson [mailto:j...@opensolaris.org] Sent: 星期四, 十一月 18, 2010 12:36 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] WarpDrive SLP-300 On 18/11/10 01:49 PM, Fred Liu wrote: http://www.lsi.com/channel/about_channel/whatsnew/warpdrive_slp300/index.html Good stuff for ZFS. Looks a bit like the Sun/Oracle Flash Accelerator card, only with a 2nd generation SAS controller - which would probably use the mpt_sas(7d) driver. James C. McPherson -- Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] WarpDrive SLP-300
Sure. Gotcha! ^:^ -Original Message- From: James C. McPherson [mailto:j...@opensolaris.org] Sent: 星期四, 十一月 18, 2010 13:16 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] WarpDrive SLP-300 On 18/11/10 03:05 PM, Fred Liu wrote: Yeah, no driver issue. BTW, any new storage-controller-related drivers introduced in snv151a? LSI seems the only one who works very closely with Oracle/Sun. You would have to have a look at what's in the repo, I'm not allowed to tell you :| James C. McPherson -- Oracle http://www.jmcp.homeunix.com/blog ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] get quota showed in precision of byte?
Hi, Is it possible to do zfs get -??? quota filesystem ? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] get quota showed in precision of byte?
I got the answer: -p. -Original Message- From: Fred Liu Sent: Saturday, August 28, 2010 9:00 To: zfs-discuss@opensolaris.org Subject: get quota showed in precision of byte? Hi, Is it possible to do zfs get -??? quota filesystem ? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
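For scripting, -p combines well with -H (no headers, tab-separated); a minimal sketch, assuming a hypothetical dataset tank/fs with an 800G quota:

# -p prints exact byte values, -H drops the header line
zfs get -Hp -o value quota tank/fs
858993459200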
Re: [zfs-discuss] make df have accurate out upon zfs?
Can you shed more light on the *other commands* which output that information? Much appreciated. Fred From: Thomas Burgess [mailto:wonsl...@gmail.com] Sent: Friday, August 20, 2010 17:34 To: Fred Liu Cc: ZFS Discuss Subject: Re: [zfs-discuss] make df have accurate out upon zfs? df serves a purpose though. There are other commands which output that information.. On Thu, Aug 19, 2010 at 3:01 PM, Fred Liu fred_...@issi.com wrote: Not sure if there were similar threads in this list before. Three scenarios: 1): df cannot count snapshot space in a file system with quota set. 2): df cannot count sub-filesystem space in a file system with quota set. 3): df cannot count space saved by de-dup in a file system with quota set. Are they possible? Btw, what is the difference between /usr/gnu/bin/df and /bin/df? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] make df have accurate out upon zfs?
Sure, I know this. What I want to say is the following:

r...@cn03:~# /usr/gnu/bin/df -h /cn03/3
Filesystem            Size  Used Avail Use% Mounted on
cn03/3                298G  154K  298G   1% /cn03/3
r...@cn03:~# /bin/df -h /cn03/3
Filesystem            size   used  avail capacity  Mounted on
cn03/3                800G   154K   297G     1%    /cn03/3
r...@cn03:~# zfs get all cn03/3
NAME    PROPERTY               VALUE                  SOURCE
cn03/3  type                   filesystem             -
cn03/3  creation               Sat Jul 10  9:35 2010  -
cn03/3  used                   503G                   -
cn03/3  available              297G                   -
cn03/3  referenced             154K                   -
cn03/3  compressratio          1.00x                  -
cn03/3  mounted                yes                    -
cn03/3  quota                  800G                   local
cn03/3  reservation            none                   default
cn03/3  recordsize             128K                   default
cn03/3  mountpoint             /cn03/3                default
cn03/3  sharenfs               rw,root=nfsroot        local
cn03/3  checksum               on                     default
cn03/3  compression            off                    default
cn03/3  atime                  on                     default
cn03/3  devices                on                     default
cn03/3  exec                   on                     default
cn03/3  setuid                 on                     default
cn03/3  readonly               off                    default
cn03/3  zoned                  off                    default
cn03/3  snapdir                hidden                 default
cn03/3  aclmode                groupmask              default
cn03/3  aclinherit             restricted             default
cn03/3  canmount               on                     default
cn03/3  shareiscsi             off                    default
cn03/3  xattr                  on                     default
cn03/3  copies                 1                      default
cn03/3  version                4                      -
cn03/3  utf8only               off                    -
cn03/3  normalization          none                   -
cn03/3  casesensitivity        sensitive              -
cn03/3  vscan                  off                    default
cn03/3  nbmand                 off                    default
cn03/3  sharesmb               off                    default
cn03/3  refquota               none                   default
cn03/3  refreservation         none                   default
cn03/3  primarycache           all                    default
cn03/3  secondarycache         all                    default
cn03/3  usedbysnapshots        46.8G                  -
cn03/3  usedbydataset          154K                   -
cn03/3  usedbychildren         456G                   -
cn03/3  usedbyrefreservation   0                      -
cn03/3  logbias                latency                default
cn03/3  dedup                  off                    default
cn03/3  mlslabel               none                   default
cn03/3  com.sun:auto-snapshot  true                   inherited from cn03

Thanks. Fred From: Thomas Burgess [mailto:wonsl...@gmail.com] Sent: Friday, August 20, 2010 18:44 To: Fred Liu Cc: ZFS Discuss Subject: Re: [zfs-discuss] make df have accurate out upon zfs? as for the difference between the two df's, one is the gnu df (like you'd have on linux) and the other is the solaris df. 2010/8/20 Thomas Burgess wonsl...@gmail.com can't the zfs command provide that information? 2010/8/20 Fred Liu fred_...@issi.com Can you shed more light on the *other commands* which output that information? Much appreciated. Fred From: Thomas Burgess [mailto:wonsl...@gmail.com] Sent: Friday, August 20, 2010 17:34 To: Fred Liu Cc: ZFS Discuss Subject: Re: [zfs-discuss] make df have accurate out upon zfs? df serves a purpose though. There are other commands which output that information.. On Thu, Aug 19, 2010 at 3:01 PM, Fred Liu fred_...@issi.com wrote: Not sure if there were similar threads in this list before. Three scenarios: 1): df cannot count snapshot space in a file system with quota set. 2): df cannot count sub-filesystem space in a file system with quota set. 3): df cannot count space saved by de-dup in a file system with quota set. Are they possible? Btw, what is the difference between /usr/gnu/bin/df and /bin/df? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
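For completeness, the zfs-native way to see where the 503G actually went, using the usedby* properties from the output above:

zfs list -o name,quota,used,usedbydataset,usedbysnapshots,usedbychildren,available cn03/3

This makes the df discrepancy concrete: GNU df sizes the filesystem by its referenced data (154K) plus available space, Solaris df sizes it by the 800G quota, and neither can attribute the 46.8G held in snapshots or the 456G in child filesystems that zfs list reports.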
Re: [zfs-discuss] Opensolaris is apparently dead
Really sad. Will all the opensolaris-related mailing lists be dead? Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Andrej Podzimek Sent: Saturday, August 14, 2010 23:36 To: Russ Price Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Opensolaris is apparently dead 3. Just stick with b134. Actually, I've managed to compile my way up to b142, but I'm having trouble getting beyond it - my attempts to install later versions just result in new boot environments with the old kernel, even with the latest pkg-gate code in place. Still, even if I get the latest code to install, it's not viable for the long term unless I'm willing to live with stasis. I run build 146. There have been some heads-up messages on the topic. You need b137 or later in order to build b143 or later. Plus the latest packaging bits and other stuff. http://mail.opensolaris.org/pipermail/on-discuss/2010-June/001932.html When compiling b146, it's good to read this first: http://mail.opensolaris.org/pipermail/on-discuss/2010-August/002110.html Instead of using the tagged onnv_146 code, you have to apply all the changesets up to 13011:dc5824d1233f. 6. Abandon ZFS completely and go back to LVM/MD-RAID. I ran it for years before switching to ZFS, and it works - but it's a bitter pill to swallow after drinking the ZFS Kool-Aid. Or Btrfs. It may not be ready for production now, but it could become a serious alternative to ZFS in one year's time or so. (I have been using it for some time with absolutely no issues, but some people (Edward Shishkin) say it has obvious bugs related to fragmentation.) Andrej ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
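For reference, applying the changesets mentioned above would look roughly like this; a sketch only - the repository URL is from memory, so verify it before relying on it:

hg clone ssh://anon@hg.opensolaris.org/hg/onnv/onnv-gate
cd onnv-gate
# move the working copy to the changeset named above
hg update -r dc5824d1233f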
[zfs-discuss] snapshot .zfs folder can only be seen in the top of a file system?
Hi, Is it true? Is there any way to find it at every level of the hierarchy? Thanks. Fred ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] snapshot .zfs folder can only be seen in the top of a file system?
Thanks. But too many file systems may be a management issue, and a normal user cannot create file systems. I think it should work the way NetApp's snapshots do. It is a pity. Thanks. Fred -Original Message- From: Edward Ned Harvey [mailto:sh...@nedharvey.com] Sent: Saturday, July 24, 2010 10:22 To: Fred Liu; zfs-discuss@opensolaris.org Subject: RE: [zfs-discuss] snapshot .zfs folder can only be seen in the top of a file system? From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Fred Liu Is it true? Is there any way to find it at every level of the hierarchy? Yup. Nope. If you use ZFS, you make a filesystem at whatever level you need it, in order for the .zfs directory to be available to whatever clients will be needing it... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
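One related knob: .zfs is hidden by default, but it can at least be made visible at each filesystem's root (still not in every subdirectory); a minimal sketch, assuming a hypothetical dataset tank/home:

# make the .zfs directory show up in directory listings at the fs root
zfs set snapdir=visible tank/home
ls /tank/home/.zfs/snapshot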
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
-Original Message- From: Erik Trimble [mailto:erik.trim...@oracle.com] Sent: Thursday, July 01, 2010 11:45 To: Fred Liu Cc: Bob Friesenhahn; 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On 6/30/2010 7:17 PM, Fred Liu wrote: I see. Thanks. Does it have the hardware functionality to detect a power outage and force a cache flush while the cache is enabled? Any more detailed info about the capacity (farads) of this supercap and how long one discharge will last? Thanks. Fred I don't think the actual size of the supercap matters. As Bob said, it only needs to be sized big enough to allow all on-board DRAM to be flushed out to Flash. How big that should be is easily determined by the manufacturer, and they'd be grossly negligent if it wasn't at least that size. Any capacity beyond that needed to do a single full flush is excess, so I would hazard a guess that the supercap is just big enough, and no more. That is, just enough to power the SSD for the partial second or so it takes to flush to flash. I don't think we need to worry how big that actually is. Understood and agreed. It is a bit picky to ask this without the manufacturer's help ;-) Answering your second question first - my reading of the info is that it will force a cache flush (if the cache is enabled) upon loss of power under any circumstance. That is, it will flush the cache both in a controlled power-down (regardless of whether the OS says to do so) and in an immediate power loss. All this is in the SSD's firmware. That is exactly what I expected. Speaking a bit more broadly, cache-flush behavior is somewhat ambiguous across HDDs in general. Is it OS-controlled, firmware-controlled, or both? At least in this case, I have got what I expected. But what about all (generic) HDDs? Thanks. Fred -Erik -Original Message- From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Thursday, July 01, 2010 10:01 To: Fred Liu Cc: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On Wed, 30 Jun 2010, Fred Liu wrote: Any duration limit on the supercap? How long can it sustain the data? A supercap on an SSD drive only needs to sustain the data until it has been saved (perhaps 10 milliseconds). It is different from a RAID array battery. Bob -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
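On the generic-HDD part of the question: under Solaris the volatile write cache is a drive (firmware) setting that the OS can query and toggle, at least for disks behind the sd driver; a rough sketch of the format expert-mode path, with menu names from memory, so treat as approximate:

format -e
(select the disk)
format> cache
cache> write_cache
write_cache> display

ZFS itself assumes the write cache may be volatile and issues explicit flushes (see the DKIOCFLUSHWRITECACHE discussion elsewhere in this list), so correctness depends on the drive honoring the flush, not on the cache being off.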
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
Any duration limit on the supercap? How long can it sustain the data? Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of David Magda Sent: Saturday, June 26, 2010 21:48 To: Arne Jansen Cc: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On Jun 26, 2010, at 02:09, Arne Jansen wrote: Geoff Nordli wrote: Is this the one (http://www.ocztechnology.com/products/solid-state-drives/2-5--sata-ii/maximum-performance-enterprise-solid-state-drives/ocz-vertex-2-pro-series-sata-ii-2-5--ssd-.html) with the built-in supercap? Yes. Crikey. Who's the genius who thinks of these URLs? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers
I see. Thanks. Does it have the hardware functionality to detect a power outage and force a cache flush while the cache is enabled? Any more detailed info about the capacity (farads) of this supercap and how long one discharge will last? Thanks. Fred -Original Message- From: Bob Friesenhahn [mailto:bfrie...@simple.dallas.tx.us] Sent: Thursday, July 01, 2010 10:01 To: Fred Liu Cc: 'OpenSolaris ZFS discuss' Subject: Re: [zfs-discuss] OCZ Vertex 2 Pro performance numbers On Wed, 30 Jun 2010, Fred Liu wrote: Any duration limit on the supercap? How long can it sustain the data? A supercap on an SSD drive only needs to sustain the data until it has been saved (perhaps 10 milliseconds). It is different from a RAID array battery. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Crucial RealSSD C300 and cache flush?
Looking forward to seeing your test reports from the Intel X25 and OCZ Vertex 2 Pro... Thanks. Fred -Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Arne Jansen Sent: Thursday, June 24, 2010 16:15 To: Roy Sigurd Karlsbakk Cc: OpenSolaris ZFS discuss Subject: Re: [zfs-discuss] Crucial RealSSD C300 and cache flush? Hi, Roy Sigurd Karlsbakk wrote: The Crucial RealSSD C300 has been released and is showing good numbers for use as ZIL and L2ARC. Does anyone know if this unit flushes its cache on request, as opposed to Intel units etc? I had a chance to get my hands on a Crucial RealSSD C300/128GB yesterday and did some quick testing. Here are the numbers first; some explanation follows below:

cache enabled, 32 buffers:
linear read, 64k blocks: 134 MB/s
random read, 64k blocks: 134 MB/s
linear read, 4k blocks: 87 MB/s
random read, 4k blocks: 87 MB/s
linear write, 64k blocks: 107 MB/s
random write, 64k blocks: 110 MB/s
linear write, 4k blocks: 76 MB/s
random write, 4k blocks: 32 MB/s

cache enabled, 1 buffer:
linear write, 4k blocks: 51 MB/s (12800 ops/s)
random write, 4k blocks: 7 MB/s (1750 ops/s)
linear write, 64k blocks: 106 MB/s (1610 ops/s)
random write, 64k blocks: 59 MB/s (920 ops/s)

cache disabled, 1 buffer:
linear write, 4k blocks: 4.2 MB/s (1050 ops/s)
random write, 4k blocks: 3.9 MB/s (980 ops/s)
linear write, 64k blocks: 40 MB/s (650 ops/s)
random write, 64k blocks: 40 MB/s (650 ops/s)

cache disabled, 32 buffers:
linear write, 4k blocks: 4.5 MB/s, 1120 ops/s
random write, 4k blocks: 4.2 MB/s, 1050 ops/s
linear write, 64k blocks: 43 MB/s, 680 ops/s
random write, 64k blocks: 44 MB/s, 690 ops/s

cache enabled, 1 buffer, with cache flushes:
linear write, 4k blocks, flush after every write: 1.5 MB/s, 385 writes/s
linear write, 4k blocks, flush after every 4th write: 4.2 MB/s, 1120 writes/s

The numbers are rough numbers read quickly from iostat, so please don't multiply block size by ops and compare with the bandwidth given ;) The test operates directly on top of LDI, just like ZFS. - "nk blocks" means the size of each read/write given to the device driver - "n buffers" means the number of buffers I keep in flight, to keep the command queue of the device busy - "cache flush" means a synchronous ioctl DKIOCFLUSHWRITECACHE. These numbers contain a few surprises (at least for me). The biggest surprise is that with the cache disabled one cannot get good data rates with small blocks, even if one keeps the command queue filled. This is completely different from what I've seen from hard drives. Also, the IOPS with cache flushes is quite low; 385 is not much better than a 15k HDD, while the latter scales better. On the other hand, from the large drop in performance when using flushes one could infer that they do indeed flush properly, but I haven't built a test setup for that yet. Conclusion: from the measurements I'd infer the device makes a good L2ARC, but for a slog device the latency is too high and it doesn't scale well. I'll do similar tests on an X25 and an OCZ Vertex 2 Pro as soon as they arrive. If there are numbers you are missing, please tell me and I'll measure them if possible. Also please ask if there are questions regarding the test setup. -- Arne ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
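A cruder probe that needs no custom LDI harness, approximating the cache-enabled, single-buffer write case above: synchronous 4k writes through the raw device. A destructive sketch for a scratch disk only (the device name is illustrative), and note it issues no cache flushes, so it will flatter the drive compared to real slog duty:

# each write(2) to the character device completes before the next begins
ptime dd if=/dev/zero of=/dev/rdsk/c2t1d0p0 bs=4k count=10000

ops/s is then simply count divided by the elapsed real time ptime prints.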
Re: [zfs-discuss] SMI label and EFI label in one disk?
Thanks, but that doesn't seem to be the case. I just recalled a thread on this list saying that an SMI label and an EFI label cannot be on one disk. Is that correct? Let me describe my case. I have a 160GB HDD -- say c0t0d0. I used the OpenSolaris installer to cut a 100GB slice -- c0t0d0s0 for rpool. And I want to use the remaining space for a cache device -- say c0t0d0s1. But when I use the format command, I cannot see the remaining space.

partition> p
Current partition table (original):
Total disk cylinders available: 13048 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 13047      99.95GB    (13047/0/0) 209600055
  1 unassigned    wm       0               0         (0/0/0)             0
  2     backup    wu       0 - 13047      99.95GB    (13048/0/0) 209616120
  3 unassigned    wm       0               0         (0/0/0)             0
  4 unassigned    wm       0               0         (0/0/0)             0
  5 unassigned    wm       0               0         (0/0/0)             0
  6 unassigned    wm       0               0         (0/0/0)             0
  7 unassigned    wm       0               0         (0/0/0)             0
  8       boot    wu       0 -     0       7.84MB    (1/0/0)         16065
  9 unassigned    wm       0               0         (0/0/0)             0

Has anyone run into a similar case? Thanks. Fred -Original Message- From: Richard Elling [mailto:richard.ell...@gmail.com] Sent: Tuesday, June 01, 2010 7:42 To: Fred Liu Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Can root pool slice co-exist with non-root pool slice in one HDD? On May 31, 2010, at 4:20 PM, Fred Liu wrote: Hi, The subject says it all. Yes. The reply says it all. :-) Making it happen is a feature of the installer(s). -- richard -- ZFS and NexentaStor training, Rotterdam, July 13-15, 2010 http://nexenta-rotterdam.eventbrite.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
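A first diagnostic for this, as a sketch using the device names from the example above: compare what the label and the fdisk partition cover against the drive's real capacity.

# how many sectors does the SMI label's geometry cover?
prtvtoc /dev/rdsk/c0t0d0s2
# what does the drive itself report (~160GB expected)?
iostat -En c0t0d0
# x86 only: print the fdisk table; does the Solaris partition span the whole disk?
fdisk -W - /dev/rdsk/c0t0d0p0

If the Solaris fdisk partition (or the SMI label geometry) covers only ~100GB, format can only show cylinders inside it, which would explain the invisible 60GB. Treat the flags above as approximate for your build, and be careful with anything that can write a new label or partition table, since that would destroy the rpool.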