Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
zpool import done! Back online. Total downtime for 4TB pool was about 8 hours, don't know how much of this was completing the destroy transaction. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] freeNAS moves to Linux from FreeBSD
Bob Friesenhahn wrote: On Mon, 7 Dec 2009, Michael DeMan (OA) wrote: Args for FreeBSD + ZFS: - Limited budget - We are familiar with managing FreeBSD. - We are familiar with tuning FreeBSD. - Licensing model Args against OpenSolaris + ZFS: - Hardware compatibility - Lack of knowledge for tuning and associated costs for training staff to learn 'yet one more operating system' they need to support. - Licensing model If you think about it a little bit, you will see that there is no significant difference in the licensing model between FreeBSD+ZFS and OpenSolaris+ZFS. It is not possible to be a little bit pregnant. Either one is pregnant, or one is not. There is a huge difference practically - OpenSolaris has no free security updates for stable releases, unlike FreeBSD. And I'm sure you don't recommend running /dev in production. This is off-topic, and isn't specifically related to CDDL vs BSD, just how Sun chooses to do things. Sure, there have been claims (since before 2008.05) that it might happen some day, but until 2009.06 users can freely get a non-vulnerable Firefox or Samba, or fixes for various network kernel panics, the claims are meaningless. http://mail.opensolaris.org/pipermail/opensolaris-help/2009-November/015824.html -- James Andrewartha ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] ZFS dedup report tool
Hi all, Is there any way to generate some report related to the de-duplication feature of ZFS within a zpool/zfs pool? I mean, it's nice to have the dedup ratio, but I think it would also be good to have a report where we could see what directories/files were found to be duplicates and were therefore deduplicated. Thanks for your time, Bruno smime.p7s Description: S/MIME Cryptographic Signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote: Hi all, Is there any way to generate some report related to the de-duplication feature of ZFS within a zpool/zfs pool? I mean, its nice to have the dedup ratio, but it think it would be also good to have a report where we could see what directories/files have been found as repeated and therefore they suffered deduplication. Nice to have at first glance, but could you detail on any specific use-case you see? Regards, Andrey Thanks for your time, Bruno ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] will deduplication know about old blocks?
I'm planning to try out deduplication in the near future, but started wondering if I can prepare for it on my servers. one thing which struck me was that I should change the checksum algorithm to sha256 as soon as possible. but I wonder -- is that sufficient? will the dedup code know about old blocks when I store new data? let's say I have an existing file img0.jpg. I turn on dedup, and copy it twice, to img0a.jpg and img0b.jpg. will all three files refer to the same block(s), or will only img0a and img0b share blocks? -- Kjetil T. Homme Redpill Linpro AS - Changing the game ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
Hi Andrey, For instance, I talked about deduplication to my manager and he was happy because less data = less storage, and therefore lower costs. However, now the IT group of my company needs to provide to the management board a report of duplicated data found per share, and in our case one share means one specific company department/division. Bottom line, the mindset is something like: * one share equals a specific department within the company * the department demands an X amount of data storage * the data storage costs Y * making a report of the amount of data consumed by a department, before and after deduplication, means that data storage costs can be seen per department * if there's a cost reduction due to the usage of deduplication, part of that money can be used for business, either IT-related subjects or general business * the management board wants to see numbers related to costs, and not things like 'the ratio of deduplication in SAN01 is 3x', because for management this is geek talk I hope I was somewhat clear, but I can try to explain better if needed. Thanks, Bruno Andrey Kuzmin wrote: On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote: Hi all, Is there any way to generate some report related to the de-duplication feature of ZFS within a zpool/zfs pool? I mean, its nice to have the dedup ratio, but it think it would be also good to have a report where we could see what directories/files have been found as repeated and therefore they suffered deduplication. Nice to have at first glance, but could you detail on any specific use-case you see? Regards, Andrey Thanks for your time, Bruno ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss smime.p7s Description: S/MIME Cryptographic Signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
zpool import done! Back online. Total downtime for 4TB pool was about 8 hours, don't know how much of this was completing the destroy transaction. Lucky You! :) My box has gone totally unresponsive again :( I cannot even ping it now and I can't hear the disks thrashing. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Wed, Dec 9, 2009 at 2:47 PM, Bruno Sousa bso...@epinfante.com wrote: Hi Andrey, For instance, i talked about deduplication to my manager and he was happy because less data = less storage, and therefore less costs . However, now the IT group of my company needs to provide to management board, a report of duplicated data found per share, and in our case one share means one specific company department/division. Bottom line, the mindset is something like : * one share equals to a specific department within the company * the department demands a X value of data storage * the data storage costs Y * making a report of the amount of data consumed by a department, before and after deduplication, means that data storage costs can be seen per department Do you currently have tools that report storage usage per share? What you ask for looks like a request to make these deduplication-aware. * if theres a cost reduction due to the usage of deduplication, part of that money can be used for business , either IT related subjects or general business * management board wants to see numbers related to costs, and not things like the racio of deduplication in SAN01 is 3x, because for management this is geek talk Just divide storage costs by deduplication factor (1), and here you are (provided you can do it by department). Regards, Andrey I hope i was somehow clear, but i can try to explain better if needed. Thanks, Bruno Andrey Kuzmin wrote: On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote: Hi all, Is there any way to generate some report related to the de-duplication feature of ZFS within a zpool/zfs pool? I mean, its nice to have the dedup ratio, but it think it would be also good to have a report where we could see what directories/files have been found as repeated and therefore they suffered deduplication. Nice to have at first glance, but could you detail on any specific use-case you see? Regards, Andrey Thanks for your time, Bruno ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
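A rough sketch of that "divide by the dedup ratio" idea, in ksh. The pool name and cost figure are made-up assumptions, it needs a build recent enough to have the dedupratio pool property, and because the ratio is pool-wide the per-share numbers are estimates rather than exact physical usage:

  #!/bin/ksh
  # Rough per-share report: divide each dataset's "used" bytes by the
  # pool-wide dedup ratio. POOL and COST_PER_GB are hypothetical.
  POOL=tank
  COST_PER_GB=0.50

  # zpool get prints "NAME PROPERTY VALUE SOURCE"; grab the value, drop the 'x'
  RATIO=$(zpool get dedupratio $POOL | awk 'NR==2 { sub(/x$/, "", $3); print $3 }')

  zfs get -Hp -o name,value used -r $POOL | grep -v '@' | while read ds bytes; do
          gb=$(echo "scale=2; $bytes / (1024 * 1024 * 1024)" | bc)
          eff=$(echo "scale=2; $gb / $RATIO" | bc)
          cost=$(echo "scale=2; $eff * $COST_PER_GB" | bc)
          echo "$ds: ${gb} GB used, ~${eff} GB after dedup, ~\$${cost}/month"
  done

With one dataset per departmental share, the output gives a per-department line that can be dropped straight into a cost report.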
[zfs-discuss] Planned ZFS-Features - Is there a List or something else
Hi there, does anybody know if there's a roadmap or simply a list of the future features of ZFS? It would be interesting to see what will happen in the future. THX Henri -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] How to destroy ZFS pool with dump on ZVOL
Hi ZFS guys, when playing with one of the recent versions of the OpenSolaris GUI installer, I tried to restart it after a previous failure. However, the installer failed when trying to destroy the previously created ZFS root pool. It was discovered that this is due to the fact that the dump ZFS volume could not be released and thus the subsequent 'zpool destroy' command failed (as expected): # zpool destroy -f rpool cannot destroy 'rpool': pool is busy # zfs list rpool/dump NAME USED AVAIL REFER MOUNTPOINT rpool/dump 750M 9.93G 750M - # dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/dump (dedicated) Savecore directory: /var/crash/opensolaris Savecore enabled: no Save compressed: on There was a discussion on the zfs-discuss mailing list some time ago about how the dump ZVOL could be released, and the recommended approach was to try to move it to the swap ZVOL - the attempt failed, but the dump ZVOL was released. That no longer seems to work: # swap -l swapfile dev swaplo blocks free /dev/zvol/dsk/rpool/swap 8,1 8 2287608 2287608 # dumpadm -d swap dumpadm: no swap devices could be configured as the dump device # dumpadm Dump content: kernel pages Dump device: /dev/zvol/dsk/rpool/dump (dedicated) Savecore directory: /var/crash/opensolaris Savecore enabled: no Save compressed: on Bug 13180 was filed against the OpenSolaris installer to track this issue. Could I please ask somebody from the ZFS team to help the install folks understand what changed and how the installer has to be modified, so that it can destroy a ZFS root pool containing dump on a ZVOL? Thank you very much, Jan ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] panic when rebooting from snapshot
Folks, I've been seeing this for a while, but never had the urge to ask, until now. When I take a snapshot of my current root-FS and tell the system to reboot off that snapshot, I'm faced with an assertion failure (running DEBUG bits) that looks like this: r...@codemonkey:~# df -h / Filesystem Size Used Avail Use% Mounted on rpool/ROOT/bfu 129G 8.2G 121G 7% / r...@codemonkey:~# zfs snapshot rpool/ROOT/b...@ro r...@codemonkey:~# reboot rpool/ROOT/b...@ro Dec 8 20:41:17 codemonkey reboot: initiated by root on /dev/console panic[cpu0]/thread=ff01e1023040: assertion failed: vfsp->vfs_count != 0, file: ../../common/fs/vfs.c, line: 4374 ff0007fb1c90 genunix:assfail+7e () ff0007fb1cc0 genunix:vfs_rele+86 () ff0007fb1ce0 zfs:zfs_freevfs+2a () ff0007fb1d00 genunix:fsop_freefs+1a () ff0007fb1d30 genunix:vfs_rele+3b () ff0007fb1d60 genunix:vfs_remove+65 () ff0007fb1db0 genunix:dounmount+a3 () ff0007fb1de0 genunix:vfs_unmountall+92 () ff0007fb1e50 genunix:kadmin+549 () ff0007fb1eb0 genunix:uadmin+10f () ff0007fb1f00 unix:brand_sys_syscall32+295 () syncing file systems... done dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel 0:14 100% done 100% done: 256267 pages dumped, dump succeeded rebooting... I've tried to reason why this happens, but fail to come up with a plausible answer. Has anyone else seen this? Anyone knows what's amiss? I'm hesitant to file a bug without a pointer to a possible cause. This only happens when I try to reboot off a snapshot. If I first create a clone of the snapshot and reboot off that, the system is perfectly happy... TIA, Joep ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am now starting this process locally on the server via the console with about 3.4 GB free of RAM. I still have my entries in /etc/system for limiting how much RAM zfs can use. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
Hi, The tool to report storage usage per share is du -h / df -h :) , so yes, these tools could be deduplication aware. I know for instance that microsoft has a feature (in Win2003 R2), called File Server Resource Manager, and inside theres the possibility to make Storage Reports, and one of those reports is Duplicated Files. Bottom line, if ZFS can deliver such a capability, i think that Solaris/OpenSolaris would gain yet another competitive edge over other solutions, therefore more customers could see more and more advantages by choosing ZFS based storage. Bruno Andrey Kuzmin wrote: On Wed, Dec 9, 2009 at 2:47 PM, Bruno Sousa bso...@epinfante.com wrote: Hi Andrey, For instance, i talked about deduplication to my manager and he was happy because less data = less storage, and therefore less costs . However, now the IT group of my company needs to provide to management board, a report of duplicated data found per share, and in our case one share means one specific company department/division. Bottom line, the mindset is something like : * one share equals to a specific department within the company * the department demands a X value of data storage * the data storage costs Y * making a report of the amount of data consumed by a department, before and after deduplication, means that data storage costs can be seen per department Do you currently have tools that report storage usage per share? What you ask for looks like a request to make these deduplication-aware. * if theres a cost reduction due to the usage of deduplication, part of that money can be used for business , either IT related subjects or general business * management board wants to see numbers related to costs, and not things like the racio of deduplication in SAN01 is 3x, because for management this is geek talk Just divide storage costs by deduplication factor (1), and here you are (provided you can do it by department). Regards, Andrey I hope i was somehow clear, but i can try to explain better if needed. Thanks, Bruno Andrey Kuzmin wrote: On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote: Hi all, Is there any way to generate some report related to the de-duplication feature of ZFS within a zpool/zfs pool? I mean, its nice to have the dedup ratio, but it think it would be also good to have a report where we could see what directories/files have been found as repeated and therefore they suffered deduplication. Nice to have at first glance, but could you detail on any specific use-case you see? Regards, Andrey Thanks for your time, Bruno ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss smime.p7s Description: S/MIME Cryptographic Signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Changing ZFS drive pathing
Alex, thanks for the info. You made my heart stop a little when reading about your problem with PowerPath, but MPxIO seems like it might be a good option for me. I will try that as well although I have not used it before. Thank you! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] freeNAS moves to Linux from FreeBSD
On Wed, 9 Dec 2009, James Andrewartha wrote: There is a huge difference practically - OpenSolaris has no free security updates for stable releases, unlike FreeBSD. And I'm sure you don't recommend running /dev in production. If OpenSolaris was to do that, then it would be called Solaris. :-) It seems that Solaris 10 offers free security and critical updates. Of course the desktop application software is quite old and OS features lag behind OpenSolaris. Sun needs to find a way to improve its profit margins and retain its valuable employees, and the way it does that is by selling service contracts. The base service contract for Solaris 10 is not terribly expensive, although it mostly just offers full access to patches and the Sunsolve site. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] panic when rebooting from snapshot
Hi Joep, Booting from a snapshot isn't possibly because the snapshot is not writable and the boot operation writes to the BE. Booting from a clone is successful because the clone is writable. The second issue is whether the reboot command understands what a snapshot is. I see from the reboot man page that it supports -e environment but I haven't tested this feature with a ZFS BE or clone. Thanks, Cindy On 12/08/09 12:48, Joep Vesseur wrote: Folks, I've been seeing this for a while, but never had the urge to ask, until now. When I take a snapshot of my current root-FS and tell the system to reboot off that snapshot, I'm faced with an assertion failure (running DEBUG bits) that looks like this: r...@codemonkey:~# df -h / FilesystemSize Used Avail Use% Mounted on rpool/ROOT/bfu129G 8.2G 121G 7% / r...@codemonkey:~# zfs snapshot rpool/ROOT/b...@ro r...@codemonkey:~# reboot rpool/ROOT/b...@ro Dec 8 20:41:17 codemonkey reboot: initiated by root on /dev/console panic[cpu0]/thread=ff01e1023040: assertion failed: vfsp-vfs_count != 0, file: ../../common/fs/vfs.c, line: 4374 ff0007fb1c90 genunix:assfail+7e () ff0007fb1cc0 genunix:vfs_rele+86 () ff0007fb1ce0 zfs:zfs_freevfs+2a () ff0007fb1d00 genunix:fsop_freefs+1a () ff0007fb1d30 genunix:vfs_rele+3b () ff0007fb1d60 genunix:vfs_remove+65 () ff0007fb1db0 genunix:dounmount+a3 () ff0007fb1de0 genunix:vfs_unmountall+92 () ff0007fb1e50 genunix:kadmin+549 () ff0007fb1eb0 genunix:uadmin+10f () ff0007fb1f00 unix:brand_sys_syscall32+295 () syncing file systems... done dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel 0:14 100% done 100% done: 256267 pages dumped, dump succeeded rebooting... I've tried to reason why this happens, but fail to come up with a plausible answer. Has anyone else seen this? Anyone knows what's amiss? I'm hesitant to file a bug without a pointer to a possible cause. This only happens when I try to reboot off a snapshot. If I first create a clone of the snapshot and reboot off that, the system is perfectly happy... TIA, Joep ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
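For reference, a minimal sketch of the clone workaround Joep describes, using the names from his mail:

  # the snapshot itself is read-only, so make a writable clone and boot that
  zfs snapshot rpool/ROOT/bfu@ro
  zfs clone rpool/ROOT/bfu@ro rpool/ROOT/bfu-ro
  reboot rpool/ROOT/bfu-ro     # booting the clone works, per Joep's report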
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
Hi Henri, The slides from the SNIA conference this past fall provide a description of upcoming features, here: http://www.snia.org/events/storage-developer2009/presentations/monday/JeffBonwick_zfs-What_Next-SDC09.pdf Cindy On 12/09/09 05:25, Henri Maddox wrote: Hi There, does anybody know, if there's a Roadmap oder simply a List of the future features of ZFS? Would be interesting to see, what will happen in the future. THX Henri ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] zfs allow - internal error
I've just done a fresh install of Solaris 10 u8 (2009.10) onto a Thumper. Running zfs allow gives the following delightful output: -bash-3.00$ zfs allow internal error: /usr/lib/zfs/pyzfs.py not found I've confirmed it on a second thumper, also running Solaris 10 u8 installed about 2 months ago. Has anyone else seen this? Thanks, Andrew -- Systems Developer e: andrew.nic...@luns.net.uk im: a.nic...@jabber.lancs.ac.uk t: +44 (0)1524 5 10147 Lancaster University Network Services is a limited company registered in England and Wales. Registered number: 04311892. Registered office: University House, Lancaster University, Lancaster, LA1 4YW signature.asc Description: Digital signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
On Wed, 9 Dec 2009, Markus Kovero wrote: From what I've noticed, if one destroys dataset that is say 50-70TB and reboots before destroy is finished, it can take up to several _days_ before it's back up again. So, nowadays I'm doing rm -fr BEFORE issuing zfs destroy whenever possible. It stands to reason that if deduplication is done via reference counting then whenever a deduplicated block is freed its duplication count needs to be reduced and it needs to be done atomically. Blocks such as full-length zeroed blocks (common for zfs logical volumes) are likely to be quite heavily duplicated. That may be where this bottleneck is coming from. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs allow - internal error
On 09 December, 2009 - Andrew Robert Nicols sent me these 1,6K bytes: I've just done a fresh install of Solaris 10 u8 (2009.10) onto a Thumper. Running zfs allow gives the following delightful output: -bash-3.00$ zfs allow internal error: /usr/lib/zfs/pyzfs.py not found I've confirmed it on a second thumper, also running Solaris 10 u8 installed about 2 months ago. Has anyone else seen this? Yes. You haven't got SUNWPython installed, which is wrongly marked as belonging to the GNOME2 cluster. Install SUNWPython and SUNWPython-share and it'll work. Some ZFS stuff (userspace, allow, ..) started using python in u8. /Tomas -- Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
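In other words, something along these lines should get it working again; the media path below is an assumption, so adjust it for your install source:

  # install the missing Python packages from the Solaris 10 u8 media, then retest
  pkgadd -d /cdrom/cdrom0/Solaris_10/Product SUNWPython SUNWPython-share
  zfs allow tank    # should now print the delegated permissions instead of the pyzfs.py error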
Re: [zfs-discuss] will deduplication know about old blocks?
Hi Kjetil, Unfortunately, dedup will only apply to data written after the setting is enabled. That also means that new blocks cannot dedup against old blocks regardless of how they were written. There is therefore no way to prepare your pool for dedup -- you just have to enable it when you have the new bits. Adam On Dec 9, 2009, at 3:40 AM, Kjetil Torgrim Homme wrote: I'm planning to try out deduplication in the near future, but started wondering if I can prepare for it on my servers. one thing which struck me was that I should change the checksum algorithm to sha256 as soon as possible. but I wonder -- is that sufficient? will the dedup code know about old blocks when I store new data? let's say I have an existing file img0.jpg. I turn on dedup, and copy it twice, to img0a.jpg and img0b.jpg. will all three files refer to the same block(s), or will only img0a and img0b share blocks? -- Kjetil T. Homme Redpill Linpro AS - Changing the game ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Adam Leventhal, Fishworks http://blogs.sun.com/ahl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
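A small illustration of the consequence, with hypothetical dataset names: since only new writes are deduplicated, existing data has to be rewritten, for example via send/receive, before it can share blocks with anything:

  # note: pre-setting checksum=sha256 does not help old blocks (see above)
  zfs set dedup=on tank                  # only writes from this point on are dedup candidates
  zfs snapshot -r tank/data@prededup
  # receiving rewrites every block, so the new copy goes through the dedup table
  zfs send -R tank/data@prededup | zfs recv tank/data-dedup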
Re: [zfs-discuss] my ZFS backup script -- feedback appreciated
On Tue, December 8, 2009 19:23, Andrew Daugherity wrote: Description/rationale of the script (more detailed comments within the script): # This supplements zfs-auto-snapshot, but runs independently. I prefer that # snapshots continue to be taken even if the backup fails. # # This aims to be much more robust than the backup functionality of # zfs-auto-snapshot, namely: # * it uses 'zfs send -I' to send all intermediate snapshots (including # any daily/weekly/etc.), and should still work even if it isn't run # every hour -- as long as the newest remote snapshot hasn't been # rotated out locally yet # * 'zfs recv -dF' on the destination host removes any snapshots not # present locally so you don't have to worry about manually removing # old snapshots there. I realize this doesn't meet everybody's needs but hopefully someone will find it useful. This description matches pretty well (I'm doing my own snapshots) what I've been working on, and having trouble getting to work (getting ZFS errors on one end or the other depending on options). I'm working with OpenSolaris, though, which may make a difference. Dunno when I next get a chance to work on this, maybe this weekend; but having a working example will be great. I'll either just use yours, or at least benefit from seeing what works; I'll figure out which when I look more closely. So thanks! -- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
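For readers who just want the flavour of that approach, here is a stripped-down sketch; the host, pool and filesystem names are placeholders, and it assumes the newest snapshot on the backup host still exists locally:

  #!/bin/ksh
  # Minimal sketch: send everything between the newest snapshot the backup host
  # already has and the newest local snapshot, letting 'zfs recv -dF' prune
  # snapshots that no longer exist on the sending side.
  FS=tank/home                 # filesystem to back up (placeholder)
  REMOTE=backuphost            # destination host (placeholder)
  RPOOL=backup                 # destination pool (placeholder)

  # newest snapshot already present on the remote copy of $FS
  last=$(ssh $REMOTE zfs list -H -o name -t snapshot -s creation -r $RPOOL/${FS#*/} | tail -1 | cut -d@ -f2)
  # newest snapshot on the local side
  latest=$(zfs list -H -o name -t snapshot -s creation -r $FS | tail -1 | cut -d@ -f2)

  [ "$last" = "$latest" ] && exit 0      # nothing new to send

  zfs send -I @$last $FS@$latest | ssh $REMOTE zfs recv -dF $RPOOL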
Re: [zfs-discuss] will deduplication know about old blocks?
Adam Leventhal a...@eng.sun.com writes: Unfortunately, dedup will only apply to data written after the setting is enabled. That also means that new blocks cannot dedup against old block regardless of how they were written. There is therefore no way to prepare your pool for dedup -- you just have to enable it when you have the new bits. thank you for the clarification! -- Kjetil T. Homme Redpill Linpro AS - Changing the game ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] will deduplication know about old blocks?
Adam, So therefore, the best way is to set this at pool creation time? OK, that makes sense; it operates only on fresh data that's coming over the fence. BUT: what happens if you snapshot, send, destroy, recreate (with dedup on this time around) and then write the contents of the cloned snapshot to the various places in the pool - which properties are in the ascendancy here? the host pool or the contents of the clone? The host pool I assume, because clone contents are (in this scenario) just some new data? -Me On Wed, Dec 9, 2009 at 18:43, Adam Leventhal a...@eng.sun.com wrote: Hi Kjetil, Unfortunately, dedup will only apply to data written after the setting is enabled. That also means that new blocks cannot dedup against old block regardless of how they were written. There is therefore no way to prepare your pool for dedup -- you just have to enable it when you have the new bits. On Dec 9, 2009, at 3:40 AM, Kjetil Torgrim Homme wrote: I'm planning to try out deduplication in the near future, but started wondering if I can prepare for it on my servers. one thing which struck me was that I should change the checksum algorithm to sha256 as soon as possible. but I wonder -- is that sufficient? will the dedup code know about old blocks when I store new data? let's say I have an existing file img0.jpg. I turn on dedup, and copy it twice, to img0a.jpg and img0b.jpg. will all three files refer to the same block(s), or will only img0a and img0b share blocks? ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Dec 9, 2009, at 3:47 AM, Bruno Sousa wrote: Hi Andrey, For instance, i talked about deduplication to my manager and he was happy because less data = less storage, and therefore less costs . However, now the IT group of my company needs to provide to management board, a report of duplicated data found per share, and in our case one share means one specific company department/division. Bottom line, the mindset is something like : * one share equals to a specific department within the company * the department demands a X value of data storage * the data storage costs Y * making a report of the amount of data consumed by a department, before and after deduplication, means that data storage costs can be seen per department * if theres a cost reduction due to the usage of deduplication, part of that money can be used for business , either IT related subjects or general business * management board wants to see numbers related to costs, and not things like the racio of deduplication in SAN01 is 3x, because for management this is geek talk I hope i was somehow clear, but i can try to explain better if needed. Snapshots, copies, compression, deduplication, and (eventually) encryption occurs at the block level, not the file level. Hence, file-level accounting works as long as you do not try to make a 1:1 relationship to physical space. But your problem, as described above, is one of managerial accounting. IMHO, trying to apply a technical solution to a managerial accounting problem is akin to catching a greased pig. It is much easier to just do what businessmen do -- manage managerial accounting. http://en.wikipedia.org/wiki/Managerial_accounting -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
Hi, Despite the fact that I agree in general with your comments, in reality it all comes down to money... So in this case, if I could prove that ZFS was able to find X amount of duplicated data, and that X amount of data has a price of Y per GB, IT could be seen as a business enabler instead of a cost centre. But indeed, you're right, in my case a possible technical solution is trying to answer a managerial problem... however, isn't that why IT was invented? I believe that's why I get my paycheck each month :) Bruno Richard Elling wrote: On Dec 9, 2009, at 3:47 AM, Bruno Sousa wrote: Hi Andrey, For instance, i talked about deduplication to my manager and he was happy because less data = less storage, and therefore less costs . However, now the IT group of my company needs to provide to management board, a report of duplicated data found per share, and in our case one share means one specific company department/division. Bottom line, the mindset is something like : * one share equals to a specific department within the company * the department demands a X value of data storage * the data storage costs Y * making a report of the amount of data consumed by a department, before and after deduplication, means that data storage costs can be seen per department * if theres a cost reduction due to the usage of deduplication, part of that money can be used for business , either IT related subjects or general business * management board wants to see numbers related to costs, and not things like the racio of deduplication in SAN01 is 3x, because for management this is geek talk I hope i was somehow clear, but i can try to explain better if needed. Snapshots, copies, compression, deduplication, and (eventually) encryption occurs at the block level, not the file level. Hence, file-level accounting works as long as you do not try to make a 1:1 relationship to physical space. But your problem, as described above, is one of managerial accounting. IMHO, trying to apply a technical solution to a managerial accounting problem is akin to catching a greased pig. It is much easier to just do what businessmen do -- manage managerial accounting. http://en.wikipedia.org/wiki/Managerial_accounting -- richard smime.p7s Description: S/MIME Cryptographic Signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] will deduplication know about old blocks?
What happens if you snapshot, send, destroy, recreate (with dedup on this time around) and then write the contents of the cloned snapshot to the various places in the pool - which properties are in the ascendancy here? the host pool or the contents of the clone? The host pool I assume, because clone contents are (in this scenario) just some new data? The dedup property applies to all writes so the settings for the pool of origin don't matter, just those on the destination pool. Adam -- Adam Leventhal, Fishworkshttp://blogs.sun.com/ahl ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Wed, 9 Dec 2009, Bruno Sousa wrote: Despite the fact that i agree in general with your comments, in reality it all comes to money.. So in this case, if i could prove that ZFS was able to find X amount of duplicated data, and since that X amount of data has a price of Y per GB, IT could be seen as business enabler instead of a cost centre. Most of the cost of storing business data is related to the cost of backing it up and administering it rather than the cost of the system on which it is stored. In this case it is reasonable to know the total amount of user data (and charge for it), since it likely needs to be backed up and managed. Deduplication does not help much here. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Wed, Dec 9, 2009 at 10:43 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Wed, 9 Dec 2009, Bruno Sousa wrote: Despite the fact that i agree in general with your comments, in reality it all comes to money.. So in this case, if i could prove that ZFS was able to find X amount of duplicated data, and since that X amount of data has a price of Y per GB, IT could be seen as business enabler instead of a cost centre. Most of the cost of storing business data is related to the cost of backing it up and administering it rather than the cost of the system on which it is stored. In this case it is reasonable to know the total amount of user data (and charge for it), since it likely needs to be backed up and managed. Deduplication does not help much here. Um, I thought deduplication had been invented to reduce backup window :). Regards, Andrey Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
Hi, The data needs to be stored somewhere, and usually we need to have a server, a disk array, and disks; more data means more disks, and more active disks means more power usage, therefore higher costs and less green IT :) So, from my point of view, deduplication is relevant for lowering costs, but in order to do that, there has to be a way to measure those costs/savings. But yes, these costs probably represent less than 20% of the total cost, but it's a cost no matter what. However, maybe I'm heading down the wrong road... Bruno Bob Friesenhahn wrote: On Wed, 9 Dec 2009, Bruno Sousa wrote: Despite the fact that i agree in general with your comments, in reality it all comes to money.. So in this case, if i could prove that ZFS was able to find X amount of duplicated data, and since that X amount of data has a price of Y per GB, IT could be seen as business enabler instead of a cost centre. Most of the cost of storing business data is related to the cost of backing it up and administering it rather than the cost of the system on which it is stored. In this case it is reasonable to know the total amount of user data (and charge for it), since it likely needs to be backed up and managed. Deduplication does not help much here. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ smime.p7s Description: S/MIME Cryptographic Signature ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Expected ZFS behavior?
On 7 dec 2009, at 18.40, Bob Friesenhahn wrote: On Mon, 7 Dec 2009, Richard Bruce wrote: I started copying over all the data from my existing workstation. When copying files (mostly multi-gigabyte DV video files), network throughput drops to zero for ~1/2 second every 8-15 seconds. This throughput drop corresponds to drive activity on the Opensolaris box. The ZFS pool drives show no activity except every 8-15 seconds. As best as I can guess, the Opensolaris box is caching traffic and batching it to disk every so often. I guess I didn't expect disk writes to interrupt network traffic. Is this correct? This is expected behavior. From what has been posted here, these are the current buffering rules: Is it really? Shouldn't it start on the next txg while the previous txg commits, and just continue writing? /ragge ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Wed, 9 Dec 2009, Andrey Kuzmin wrote: Um, I thought deduplication had been invented to reduce backup window :). Unless the backup system also supports deduplication, in what way does deduplication reduce the backup window? Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
What you're talking about is a side-benefit of the BP rewrite section of the linked slides. I believe that once BP rewrite is fully baked, we'll soon afterwards see a device removal feature arrive. /dale On Dec 9, 2009, at 3:46 PM, R.G. Keen wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Expected ZFS behavior?
On Wed, 9 Dec 2009, Ragnar Sundblad wrote: This is expected behavior. From what has been posted here, these are the current buffering rules: Is it really? Shouldn't it start on the next txg and while the previous txg commits, and just continue writing? The pause is clearly not during the entire TXG commit. The TXG commit could take up to five seconds to complete. Perhaps the pause occurs only during the start of the commit, or perhaps it is at the end, or perhaps it is because the next TXG has already become 100% full while waiting for the current TXG to commit, and zfs is not willing to endanger more than one TXG worth of data so it pauses? To my recollection, none of the zfs developers have been interested in discussing the cause of the pause, although they are clearly interested in maximizing performance. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
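For anyone who wants to correlate the stall with txg write-out using nothing but standard tools, something like this (pool name assumed) shows the bursts:

  # run these side by side while copying; the network stalls should line up
  # with the write bursts that zpool iostat reports every few seconds
  zpool iostat tank 1      # pool-level read/write bandwidth, one line per second
  iostat -xn 1             # per-device view of the same bursts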
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
R.G. Keen wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? It is too complicated to implement directly. As with lvm2 and comparable technologies, one would have to first have a feature that moves all extents from one physical volume to the other available phys.-vols. Then, when allocating the replacement blocks, the algorithm could quickly become _very_ unwieldy, because the pool will still have to keep its redundancy guarantees [1]. As you can imagine this can be very complex with ZFS's mixture of RAID, parity, and _dynamic_ striping (simply reallocating the blocks could cause massively fragmented disks if the pool/vdev previously used dynamic striping). Using 'copies=n' and extra parity (raidz2, raidz3) further complicates the matter. In all circumstances about the only algorithm to specify for the transformation _with_ the guarantee that all invariants are (logically[2]) checked is to use the well-known send/recv kludge. In that case you'll simply need double the storage and a lot of processing resources to make the transform. There are a number of situations in which the logic can safely be simplified (using only dynamic striping, using only full disks, and when there is a 'third' (recent disk) not involved in any of the existing stripes to receive the relocated stripes, etc.). In effect, I doubt that these situations are ever going to cover more than what 'detach' and 'replace' offer at this moment in time. So, in a word, yes this is (very) complicated. The complicating thing is that ZFS does dynamic striping and RAID redundancy properties _automagically_. This dynamicity makes it very hard to define what needs to happen when a disk is removed (likewise for replacing with a smaller disk). 'Static' RAID tools have the advantage here, because they can guarantee how stripes are laid out across a 'pool', and also because the admin can limit the options used for a pool precisely to enable 'special operations' like 'remove physdev'. However, even if so, removal of a disk (as opposed to replacement) is a very uncommon use case for any RAID solution that I know of. [1] of course, you could replace that complexity with a burden on the user: let removal of a drive have the same effect as physically failing that device, degrading the pool. Then you would have to either replace the vdev or re-add a vdev to restore the redundancy. [2] by which I mean, barring bugs in, say, send/recv ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
Dale Ghent wrote: What you're talking about is a side-benefit of the BP rewrite section of the linked slides. I believe that once BP rewrite is fully baked, we'll soon afterwards see a device removal feature arrive. /dale On Dec 9, 2009, at 3:46 PM, R.G. Keen wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? -- BP rewrite is key to several oft-asked features: vdev removal, defrag, raidz expansion, among others. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
* R.G. Keen (k...@geofex.com) wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? You missed: Too hard to do correctly with current resource levels and other higher priority work. As always, volunteers I'm sure are welcome. :-) Cheers, -- Glenn ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
On 12/09/09 13:52, Glenn Lagasse wrote: * R.G. Keen (k...@geofex.com) wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? You missed: Too hard to do correctly with current resource levels and other higher priority work. As always, volunteers I'm sure are welcome. :-) This gives the impression that development is not actively working on it. This is not true. As has been said often it is a difficult problem and has been actively worked on for a few months now. I don't think we are prepared to give a date as to when it will be delivered though. Neil. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
Neil Perrin wrote: On 12/09/09 13:52, Glenn Lagasse wrote: * R.G. Keen (k...@geofex.com) wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? You missed: Too hard to do correctly with current resource levels and other higher priority work. As always, volunteers I'm sure are welcome. :-) This gives the impression that development is not actively working on it. This is not true. As has been said often it is a difficult problem and has been actively worked on for a few months now. I don't think we are prepared to give a date as to when it will be delivered though. This should go on a collection of things people ask about a lot, if such a thing were to exist. Oh, wait... http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HCandevicesberemovedfromaZFSpool ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Planned ZFS-Features - Is there a List or something else
* Neil Perrin (neil.per...@sun.com) wrote: On 12/09/09 13:52, Glenn Lagasse wrote: * R.G. Keen (k...@geofex.com) wrote: I didn't see remove a simple device anywhere in there. Is it: too hard to even contemplate doing, or too silly a thing to do to even consider letting that happen or too stupid a question to even consider or too easy and straightforward to do the procedure I see recommended (export the whole pool, destroy the pool, remove the device, remake the pool, then reimport the pool) to even bother with? You missed: Too hard to do correctly with current resource levels and other higher priority work. As always, volunteers I'm sure are welcome. :-) This gives the impression that development is not actively working on it. This is not true. As has been said often it is a difficult problem True. I apologize for the misleading nature of my comment. I should have pointed out that I don't work on the ZFS project but was relating what I believed the possible answer could be based upon past list postings of the subject. and has been actively worked on for a few months now. I don't think we are prepared to give a date as to when it will be delivered though. Cool! -- Glenn ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Resend : zfs: questions on ARC membership based on type/ordering of Reads/Writes
hi, i'm re-sending this because I'm hoping that someone has some answers to the following questions. I'm working a hot Escalation on AmberRoad and am trying to understand what's under zfs' hood. thanks Solaris RPE /andrew rutz On 11/25/09 13:55, andrew.r...@sun.com wrote: I am trying to understand the ARC's behavior based on different permutations of (a)sync Reads and (a)sync Writes. thank you, in advance o does the data for a *sync-write* *ever* go into the ARC? eg, my understanding is that the data goes to the ZIL (and the SLOG, if present), but how does it get from the ZIL to the ZIO layer? eg, does it go to the ARC on its way to the ZIO ? o if the sync-write-data *does* go to the ARC, does it go to the ARC *after* it is written to the ZIL's backing-store, or does the data go to the ZIL and the ARC in parallel ? o if a sync-write's data goes to the ARC and ZIL *in parallel*, then does zfs prevent an ARC-hit until the data is confirmed to be on the ZIL's nonvolatile media (eg, disk-platter or SLOG) ? or could a Read get an ARC-hit on a block *before* it's written to zil's backing-store? o is the DMU where the Serialization of transactions occurs? o if an async-Write for block-X hits the Serializer before a Read for block-X hits the Serializer, i am assuming the Read can pass the async-Write; eg, the Read is *not* pended behind the async-write. however, if a Read hits the Serializer after a *sync*-write, then i'm assuming the Read is pended until the sync-write is written to the ZIL's nonvolatile media. o if a Read passes an async-write, then i'm assuming the Read can be satisfied by either the arc, l2arc, or disk. o it's stated that the L2ARC is for random-reads. however, there's nothing to prevent the L2ARC from containing blocks derived from *sequential*-reads, right ? also, blocks from async-writes can also live in l2arc, right? how about sync-writes ? o is the l2arc literally simply a *larger* ARC? eg, does the l2arc obey the normal cache property where everything that is in the L1$ (eg, ARC) is also in the L2$ (eg, l2arc) ? (I have a feeling that the set-theoretic intersection of ARC and L2ARC is empty (for some reason). o does the l2arc use the ARC algorithm (as the name suggests) ? thank you, /andrew Solaris RPE ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Andrew Rutzandrew.r...@sun.com Solaris RPE Ph: (x64089) 512-401-1089 Austin, TX 78727Fax: 512-401-1452 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] will deduplication know about old blocks?
On 10/12/2009, at 5:36 AM, Adam Leventhal wrote: The dedup property applies to all writes so the settings for the pool of origin don't matter, just those on the destination pool. Just a quick related question I’ve not seen answered anywhere else: Is it safe to have dedup running on your rpool? (at install time, or if you need to migrate your rpool to new media) cheers, James ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am now starting this process locally on the server via the console with about 3.4 GB free of RAM. I still have my entries in /etc/system for limiting how much RAM zfs can use. Going on 10 hours now, still importing. Still at just under 2MB/S read speed on each disk in the pool. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am now starting this process locally on the server via the console with about 3.4 GB free of RAM. I still have my entries in /etc/system for limiting how much RAM zfs can use. Going on 10 hours now, still importing. Still at just under 2MB/S read speed on each disk in the pool. And it's now frozen again. It's been frozen for 10 minutes now. I had iostat running on the console; at the time of the freeze, it started writing to the zfs pool disks, whereas previous to that it had been all reads. The console cursor is still blinking at least, so it's not a hard lock. I'm just gonna let it sit for a while and see what happens. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
I wonder if you are hitting this bug: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905936 Deleting large files or filesystems on a dedup=on filesystem stalls the whole system Cindy On 12/09/09 16:41, Jack Kielsmeier wrote: I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am now starting this process locally on the server via the console with about 3.4 GB free of RAM. I still have my entries in /etc/system for limiting how much RAM zfs can use. Going on 10 hours now, still importing. Still at just under 2MB/S read speed on each disk in the pool. And it's now froze again. Been frozen for 10 minutes now. I had iostat working on the console, At the time of the freeze, it started writing to the zfs pool disks, previous to that, it has been all reads. The console cursor is still blinking at least, so it's not a hard lock. I'm just gonna let it sit for a while and see what happens. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled
Ah that could be it! This leaves me hopeful, as it looks like that bug says it'll eventually finish! -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Resend : zfs: questions on ARC membership based on type/ordering of Reads/Writes
I replied... maybe I don't count anymore, boo hoo :-) http://opensolaris.org/jive/thread.jspa?threadID=118667tstart=15 -- richard On Dec 9, 2009, at 1:57 PM, andrew.r...@sun.com wrote: hi, i'm re-sending this because I'm hoping that someone has some answers to the following questions. I'm working a hot Escalation on AmberRoad and am trying to understand what's under zfs' hood. thanks Solaris RPE /andrew rutz On 11/25/09 13:55, andrew.r...@sun.com wrote: I am trying to understand the ARC's behavior based on different permutations of (a)sync Reads and (a)sync Writes. thank you, in advance o does the data for a *sync-write* *ever* go into the ARC? eg, my understanding is that the data goes to the ZIL (and the SLOG, if present), but how does it get from the ZIL to the ZIO layer? eg, does it go to the ARC on its way to the ZIO ? o if the sync-write-data *does* go to the ARC, does it go to the ARC *after* it is written to the ZIL's backing-store, or does the data go to the ZIL and the ARC in parallel ? o if a sync-write's data goes to the ARC and ZIL *in parallel*, then does zfs prevent an ARC-hit until the data is confirmed to be on the ZIL's nonvolatile media (eg, disk-platter or SLOG) ? or could a Read get an ARC-hit on a block *before* it's written to zil's backing-store? o is the DMU where the Serialization of transactions occurs? o if an async-Write for block-X hits the Serializer before a Read for block-X hits the Serializer, i am assuming the Read can pass the async-Write; eg, the Read is *not* pended behind the async-write. however, if a Read hits the Serializer after a *sync*-write, then i'm assuming the Read is pended until the sync-write is written to the ZIL's nonvolatile media. o if a Read passes an async-write, then i'm assuming the Read can be satisfied by either the arc, l2arc, or disk. o it's stated that the L2ARC is for random-reads. however, there's nothing to prevent the L2ARC from containing blocks derived from *sequential*-reads, right ? also, blocks from async-writes can also live in l2arc, right? how about sync-writes ? o is the l2arc literally simply a *larger* ARC? eg, does the l2arc obey the normal cache property where everything that is in the L1$ (eg, ARC) is also in the L2$ (eg, l2arc) ? (I have a feeling that the set-theoretic intersection of ARC and L2ARC is empty (for some reason). o does the l2arc use the ARC algorithm (as the name suggests) ? thank you, /andrew Solaris RPE ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Andrew Rutzandrew.r...@sun.com Solaris RPE Ph: (x64089) 512-401-1089 Austin, TX 78727Fax: 512-401-1452 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS dedup report tool
On Dec 9, 2009, at 11:07 AM, Bruno Sousa wrote: Hi, Despite the fact that I agree in general with your comments, in reality it all comes down to money. So in this case, if I could prove that ZFS was able to find X amount of duplicated data, and since that X amount of data has a price of Y per GB, IT could be seen as a business enabler instead of a cost centre. But indeed, you're right: in my case a technical solution is being asked to answer a managerial question... however, isn't that why IT was invented? I believe that's why I get my paycheck each month :)

OK, I think I've pulled your leg just a bit :-) Here is the problem: if you charge per byte, then when you dedup, the cost per byte increases. Why? Because you have both fixed and variable costs, and dedup will reduce only your variable (per-byte) cost:

    cost = fixed cost ($) + [per-byte cost ($/byte) * bytes]

The best way to solve this is through managerial accounting (aka change the rules :-), which happens quite often in business. See also Captain Kirk's response to the Kobayashi Maru: http://en.wikipedia.org/wiki/Kobayashi_Maru Finally, as my managerial accounting professor says, don't lose money :-) -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
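To put rough numbers on that formula (the figures below are made up purely for illustration, assuming a 2x dedup ratio and a flat per-GB variable cost):

    # illustrative arithmetic only -- every number here is an assumption
    awk 'BEGIN {
        fixed   = 10000;     # fixed cost per month, $
        per_gb  = 0.10;      # variable cost, $ per GB actually stored
        logical = 100000;    # GB the users think they are storing
        stored  =  50000;    # GB really consumed after 2x dedup

        total = fixed + per_gb * stored;
        printf("total cost:          $%.2f\n", total);
        printf("cost per stored GB:  $%.4f\n", total / stored);
        printf("cost per logical GB: $%.4f\n", total / logical);
    }'

Without dedup (stored = 100000) the total is $20,000 at $0.20 per stored GB; with 2x dedup the total drops to $15,000 but the cost per stored GB rises to $0.30, because the fixed cost is now spread over half as many bytes -- exactly the per-byte increase Richard describes. Charged against the logical (pre-dedup) bytes, though, the unit cost falls to $0.15, which is the sort of accounting-rule change he is hinting at.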
[zfs-discuss] CR#6850837 libshare enhancements to address performance and scalability
I've had a case open for a year or so now regarding the inefficiencies of having a large number of ZFS filesystems, in particular how long it takes to share/unshare them (resulting in a reboot cycle time of over two hours on my x4500 with 8000 filesystems). I got an update indicating that the bug fix mentioned in the subject was going to resolve this, but that they did not plan to backport it to Solaris 10. They also were not able to provide any technical details of what was fixed or how much the performance might improve. Would anyone happen to have any details of what changes were made, what kind of improvements might be expected, and why it's not going to be feasible to backport that change to Solaris 10? Thanks... -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] CR#6850837 libshare enhancements to address performance and scalability
Ditto, and also some estimate of when we can expect to see them in OpenSolaris.

On Wed, Dec 9, 2009 at 11:02 PM, Paul B. Henson hen...@acm.org wrote: I've had a case open for a year or so now regarding the inefficiencies of having a large number of zfs filesystems, in particular how long it takes to share/unshare them (resulting in a reboot cycle time on my x4500 with 8000 file systems of over two hours). I got an update indicating that the subject mentioned bugfix was going to resolve this, but that they did not plan to back port it to Solaris 10. They also were not able to provide any technical details of what was fixed or how much the performance might improve. Would anyone happen to have any details of what changes were made, what kind of improvements might be expected, and why it's not going to be feasible to backport that change to Solaris 10? Thanks... -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
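In the meantime, a crude way to quantify the problem on a given box (and to compare builds once the fix lands) is simply to time sharing and unsharing everything. This assumes sharenfs is already set on the datasets and that it's done in a maintenance window, since clients lose their shares during the unshare:

    # how long does it take to publish, then withdraw, every NFS share?
    time zfs share -a
    time zfs unshare -a

This only measures the share/unshare step itself, not the boot-time mount of thousands of filesystems, but it appears to be the part this CR targets.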
Re: [zfs-discuss] Opensolaris with J4400 - Experiences
OK, today I played with a J4400 connected to a Txxx server running S10 10/09.

First off: read the release notes. I spent about 4 hours pulling my hair out because I could not get stmsboot to work, until we read in the release notes that 500GB SATA drives do not work!!!

Initial setup: a pair of dual-port SAS controllers (c4 and c5), and a J4400 with 6x 1TB SATA disks. The J4400 had two controllers and these were connected to one SAS card (physical controller c4).

Test 1: First, a reboot -- -r. format shows 12 disks on c4 (each disk having two paths). If you picked the same disk via both paths, ZFS stopped you from doing stupid things by knowing the disk was already in use.

Test 2: Run stmsboot -e. format now shows six disks on controller c6, a new virtual controller. The two internal disks are also now on c6, and stmsboot has done the right stuff with the rpool, so I would guess you could multipath at a later date if you don't want to first off, but I did not test this. stmsboot -L only showed the two internal disks, not the six in the J4400 -- strange, but we pressed on.

Test 3: I created a zpool (two disks mirrored) using two of the new devices on c6 and created some I/O load. I then unplugged one of the cables from the SAS card (physical c4). Result: nothing, everything just keeps working - cool stuff!

Test 4: I plugged the unplugged cable into the other controller (physical c5). Result: nothing, everything just keeps working - cool stuff!

Test 5: Being bold, I then unplugged the remaining cable from the physical c4 controller. Result: nothing, everything just keeps working - cool stuff! So I had gone from dual-pathed on a single controller (c4) to single-pathed on a different controller (c5).

Test 6: I added the other four drives to the zpool (plain old zfs stuff - a bit boring).

Test 7: I plugged in four more disks. Result: their multipathed devices just showed up in format; I added them to the pool and also added them as spares, all the while the I/O load was happening. No noticeable stops or glitches.

Conclusion: If you RTFM first, then stmsboot does everything it is documented to do. You don't need to play with cfgadm or anything like that, just as I said originally (below). The multipathing stuff is easy to set up, and even a very rusty admin like me found it very easy (a minimal command sketch follows this message). Note: there may be patches for the 500GB SATA disks, I don't know; fortunately that's not what I've sold - phew!! TTFN Trevor

From: zfs-discuss-boun...@opensolaris.org [zfs-discuss-boun...@opensolaris.org] On Behalf Of Trevor Pretty [trevor_pre...@eagle.co.nz] Sent: Monday, 30 November 2009 2:48 p.m. To: Karl Katzke Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] Opensolaris with J4400 - Experiences

Karl - Don't you just use stmsboot? http://docs.sun.com/source/820-3223-14/SASMultipath.html#50511899_pgfId-1046940

Bruno - Next week I'm playing with an M3000 and a J4200 in the local NZ distributor's lab. I had planned to just use the latest version of S10, but if I get the time I might play with OpenSolaris as well; I don't think there is anything radically different between the two here. From what I've read in preparation (and I stand to be corrected):

* Will I be able to achieve multipath support if I connect the J4400 to 2 LSI HBAs in one server, with SATA disks, or is this only possible with SAS disks? This server will have OpenSolaris (any release, I think). Disk type does not matter (see link above).
* The CAM (StorageTek Common Array Manager) is only for hardware management of the JBOD, leaving disk/volume/zpool/LUN/whatever management up to the server operating system, correct? That is my understanding; see: http://docs.sun.com/source/820-3765-11/

* Can I put some Readzillas/Writezillas in the J4400 along with SATA disks, and if so will I have any benefit, or should I place those *zillas directly into the server's disk tray? On the Unified Storage products they go in both: Readzillas in the server, Logzillas in the J4400. This is quite logical: if you want to move the array between hosts, all the data needs to be in the array. Read data can always be re-created, so the closer to the CPU the better. See: http://catalog.sun.com/

* Does anyone have experience with these JBODs? If so, are they in general solid/reliable? No. But get a support contract!

* The server will probably be a Sun x44xx series with 32GB RAM, but for the best possible performance should I invest in more and more spindles, or a couple fewer spindles plus some Readzillas? This system will be mainly used to export some volumes over iSCSI to a Windows 2003 fileserver, and to hold some NFS shares. Check Brendan Gregg's blogs; *I think* he has done some work here, from memory.

Karl Katzke wrote: Bruno - Sorry, I don't have experience with OpenSolaris, but
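For reference, a minimal command-level sketch of the MPxIO setup Trevor describes; the device names are placeholders, and stmsboot -e only takes effect after the reboot it prompts for:

    # enable multipathing (MPxIO) for the SAS HBA paths, then reboot when prompted
    stmsboot -e

    # after the reboot, list the mapping from the old device names to the new multipathed ones
    stmsboot -L

    # confirm each LUN is visible and how many paths it has
    mpathadm list lu

    # the multipathed devices appear under a new virtual controller (c6 in Trevor's case);
    # use them like any other devices
    zpool create tank mirror c6t<WWN0>d0 c6t<WWN1>d0
    zpool add tank spare c6t<WWN2>d0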
Re: [zfs-discuss] will deduplication know about old blocks?
On Thu, Dec 10, 2009 at 12:37 AM, James Lever j...@jamver.id.au wrote: On 10/12/2009, at 5:36 AM, Adam Leventhal wrote: The dedup property applies to all writes, so the settings for the pool of origin don't matter, just those on the destination pool. Just a quick related question I've not seen answered anywhere else: is it safe to have dedup running on your rpool? (at install time, or if you need to migrate your rpool to new media)

I have it on on my laptop and on a couple of other machines. I also have a number of fresh installations (albeit in VB) where dedup has been on from the very beginning. So far it works OK. BTW, are there any implications of having dedup=on on rpool/dump? I know that compression is turned off explicitly for rpool/dump. -- Regards, Cyril ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
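For anyone wanting to check or change this on an installed system, a small sketch (dataset names as on a stock install; note that dedup, like compression, only applies to blocks written after it is enabled):

    # see what the root pool datasets currently have
    zfs get -r dedup,compression rpool

    # turn dedup on for the root pool and everything that inherits from it
    zfs set dedup=on rpool

    # the dump zvol is special-cased (compression is turned off there, as noted above);
    # worth checking what it reports for dedup rather than assuming
    zfs get dedup,compression rpool/dump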