Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Michael Herf
zpool import done! Back online.

Total downtime for the 4 TB pool was about 8 hours; I don't know how much
of this was spent completing the destroy transaction.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] freeNAS moves to Linux from FreeBSD

2009-12-09 Thread James Andrewartha

Bob Friesenhahn wrote:

On Mon, 7 Dec 2009, Michael DeMan (OA) wrote:


Args for FreeBSD + ZFS:

- Limited budget
- We are familiar with managing FreeBSD.
- We are familiar with tuning FreeBSD.
- Licensing model

Args against OpenSolaris + ZFS:
- Hardware compatibility
- Lack of knowledge for tuning and associated costs for training staff 
to learn 'yet one more operating system' they need to support.

- Licensing model


If you think about it a little bit, you will see that there is no 
significant difference in the licensing model between FreeBSD+ZFS and 
OpenSolaris+ZFS.  It is not possible to be a little bit pregnant. 
Either one is pregnant, or one is not.


There is a huge difference practically - OpenSolaris has no free security 
updates for stable releases, unlike FreeBSD. And I'm sure you don't 
recommend running /dev in production.


This is off-topic, and isn't specifically related to CDDL vs BSD, just how
Sun chooses to do things. Sure, there have been claims (since before
2008.05) that it might happen some day, but until 2009.06 users can freely
get a non-vulnerable Firefox or Samba, or fixes for various network kernel
panics, the claims are meaningless.


http://mail.opensolaris.org/pipermail/opensolaris-help/2009-November/015824.html

--
James Andrewartha
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bruno Sousa
Hi all,

Is there any way to generate a report on the de-duplication feature of
ZFS within a zpool/zfs pool?
I mean, it's nice to have the dedup ratio, but I think it would also be
good to have a report showing which directories/files were found to be
duplicates and were therefore deduplicated.

Thanks for your time,
Bruno


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Andrey Kuzmin
On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote:
 Hi all,

 Is there any way to generate some report related to the de-duplication
 feature of ZFS within a zpool/zfs pool?
 I mean, its nice to have the dedup ratio, but it think it would be also
 good to have a report where we could see what directories/files have
 been found as repeated and therefore they suffered deduplication.

Nice to have at first glance, but could you elaborate on any specific
use case you have in mind?

Regards,
Andrey


 Thanks for your time,
 Bruno

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Kjetil Torgrim Homme
I'm planning to try out deduplication in the near future, but started
wondering if I can prepare for it on my servers.  one thing which struck
me was that I should change the checksum algorithm to sha256 as soon as
possible.  but I wonder -- is that sufficient?  will the dedup code know
about old blocks when I store new data?

let's say I have an existing file img0.jpg.  I turn on dedup, and copy
it twice, to img0a.jpg and img0b.jpg.  will all three files refer to the
same block(s), or will only img0a and img0b share blocks?

-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bruno Sousa
Hi Andrey,

For instance, I talked about deduplication to my manager and he was
happy, because less data = less storage, and therefore lower costs.
However, the IT group of my company now needs to provide the management
board with a report of duplicated data found per share, and in our case
one share means one specific company department/division.
Bottom line, the mindset is something like:

    * one share corresponds to a specific department within the company
    * the department requires X amount of data storage
    * that data storage costs Y
    * a report of the amount of data consumed by a department, before
      and after deduplication, means that data storage costs can be
      seen per department
    * if there's a cost reduction due to deduplication, part of that
      money can be put back into the business, either IT-related
      subjects or the general business
    * the management board wants to see numbers related to costs, not
      things like "the deduplication ratio on SAN01 is 3x", because
      for management that is geek talk

I hope I was reasonably clear, but I can try to explain further if needed.

Thanks,
Bruno

Andrey Kuzmin wrote:
 On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote:
   
 Hi all,

 Is there any way to generate some report related to the de-duplication
 feature of ZFS within a zpool/zfs pool?
 I mean, its nice to have the dedup ratio, but it think it would be also
 good to have a report where we could see what directories/files have
 been found as repeated and therefore they suffered deduplication.
 

 Nice to have at first glance, but could you detail on any specific
 use-case you see?

 Regards,
 Andrey

   
 Thanks for your time,
 Bruno

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


 

   



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Jack Kielsmeier
 zpool import done! Back online.
 
 Total downtime for 4TB pool was about 8 hours, don't
 know how much of this was completing the destroy
 transaction.

Lucky You! :)

My box has gone totally unresponsive again :( I cannot even ping it now and I 
can't hear the disks thrashing.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Andrey Kuzmin
On Wed, Dec 9, 2009 at 2:47 PM, Bruno Sousa bso...@epinfante.com wrote:
 Hi Andrey,

 For instance, i talked about deduplication to my manager and he was
 happy because less data = less storage, and therefore less costs .
 However, now the IT group of my company needs to provide to management
 board, a report of duplicated data found per share, and in our case one
 share means one specific company department/division.
 Bottom line, the mindset is something like :

    * one share equals to a specific department within the company
    * the department demands a X value of data storage
    * the data storage costs Y
    * making a report of the amount of data consumed by a department,
      before and after deduplication, means that data storage costs can
      be seen per department

Do you currently have tools that report storage usage per share? What
you ask for looks like a request to make these deduplication-aware.

    * if theres a cost reduction due to the usage of deduplication, part
      of that money can be used for business , either IT related
      subjects or general business
    * management board wants to see numbers related to costs, and not
      things like the racio of deduplication in SAN01 is 3x, because
      for management this is geek talk

Just divide storage costs by the deduplication factor, and there you
are (provided you can do it per department).
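
(A quick illustration of that arithmetic, using a hypothetical pool
'tank', a hypothetical share 'tank/dept1' and made-up numbers:)

  zpool get dedupratio tank         # pool-wide ratio, e.g. 3.00x
  zfs list -o name,used tank/dept1  # logical usage charged to the share,
                                    # e.g. 300G
  # effective physical cost ~= 300 GB / 3.0 = 100 GB times the per-GB
  # price -- note the ratio is pool-wide, so per-department numbers are
  # only an approximation.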

Regards,
Andrey


 I hope i was somehow clear, but i can try to explain better if needed.

 Thanks,
 Bruno

 Andrey Kuzmin wrote:
 On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote:

 Hi all,

 Is there any way to generate some report related to the de-duplication
 feature of ZFS within a zpool/zfs pool?
 I mean, its nice to have the dedup ratio, but it think it would be also
 good to have a report where we could see what directories/files have
 been found as repeated and therefore they suffered deduplication.


 Nice to have at first glance, but could you detail on any specific
 use-case you see?

 Regards,
 Andrey


 Thanks for your time,
 Bruno

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss







___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Henri Maddox
Hi There,

does anybody know if there's a roadmap, or simply a list, of the future
features of ZFS?

It would be interesting to see what will happen in the future.

THX
Henri
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] How to destroy ZFS pool with dump on ZVOL

2009-12-09 Thread Jan Damborsky

Hi ZFS guys,

while playing with one of the recent versions of the OpenSolaris GUI
installer, I tried to restart it after a previous failure.
However, the installer failed when trying to destroy the previously
created ZFS root pool. It turned out that the dump ZFS volume could not
be released, and thus the subsequent 'zpool destroy' command failed (as
expected):

# zpool destroy -f rpool
cannot destroy 'rpool': pool is busy

# zfs list rpool/dump
NAME USED  AVAIL  REFER  MOUNTPOINT
rpool/dump   750M  9.93G   750M  -

# dumpadm
 Dump content: kernel pages
  Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/opensolaris
 Savecore enabled: no
  Save compressed: on


There was a discussion on the zfs-discuss mailing list some time ago
about how the dump ZVOL could be released; the recommended approach was
to try to move dump to the swap ZVOL - the attempt failed, but the dump
ZVOL was released. That no longer seems to work:


# swap -l
swapfile                  dev  swaplo   blocks     free
/dev/zvol/dsk/rpool/swap  8,1       8  2287608  2287608
# dumpadm -d swap
dumpadm: no swap devices could be configured as the dump device
# dumpadm
 Dump content: kernel pages
  Dump device: /dev/zvol/dsk/rpool/dump (dedicated)
Savecore directory: /var/crash/opensolaris
 Savecore enabled: no
  Save compressed: on


Bug 13180 was filed against the OpenSolaris installer to track this issue.

Could I please ask somebody from the ZFS team to help the install folks
understand what changed and how the installer has to be modified, so
that it can destroy a ZFS root pool containing a dump ZVOL?

Thank you very much,
Jan

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] panic when rebooting from snapshot

2009-12-09 Thread Joep Vesseur
Folks,

I've been seeing this for a while, but never had the urge to ask, until now.
When I take a snapshot of my current root-FS and tell the system to reboot off
that snapshot, I'm faced with an assertion failure (running DEBUG bits) that
looks like this:

r...@codemonkey:~# df -h /
FilesystemSize  Used Avail Use% Mounted on
rpool/ROOT/bfu129G  8.2G  121G   7% /
r...@codemonkey:~# zfs snapshot rpool/ROOT/b...@ro
r...@codemonkey:~# reboot rpool/ROOT/b...@ro
Dec  8 20:41:17 codemonkey reboot: initiated by root on /dev/console

panic[cpu0]/thread=ff01e1023040: assertion failed: vfsp->vfs_count != 0,
file: ../../common/fs/vfs.c, line: 4374

ff0007fb1c90 genunix:assfail+7e ()
ff0007fb1cc0 genunix:vfs_rele+86 ()
ff0007fb1ce0 zfs:zfs_freevfs+2a ()
ff0007fb1d00 genunix:fsop_freefs+1a ()
ff0007fb1d30 genunix:vfs_rele+3b ()
ff0007fb1d60 genunix:vfs_remove+65 ()
ff0007fb1db0 genunix:dounmount+a3 ()
ff0007fb1de0 genunix:vfs_unmountall+92 ()
ff0007fb1e50 genunix:kadmin+549 ()
ff0007fb1eb0 genunix:uadmin+10f ()
ff0007fb1f00 unix:brand_sys_syscall32+295 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
 0:14 100% done
100% done: 256267 pages dumped, dump succeeded
rebooting...

I've tried to reason about why this happens, but fail to come up with a
plausible answer. Has anyone else seen this? Does anyone know what's
amiss? I'm hesitant to file a bug without a pointer to a possible
cause.

This only happens when I try to reboot off a snapshot. If I first create
a clone of the snapshot and reboot off that, the system is perfectly happy...

TIA,

Joep
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Jack Kielsmeier
I have disabled all 'non-essential' processes (gdm, ssh, vnc, etc.). I am
now starting this process locally on the server via the console, with
about 3.4 GB of free RAM.

I still have my entries in /etc/system for limiting how much RAM ZFS can use.
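
(The exact /etc/system entries aren't shown in this thread; a typical
ARC cap looks like the line below -- the value is just an example for
limiting the ARC to 2 GB, and a reboot is needed for it to take effect:)

  set zfs:zfs_arc_max = 0x80000000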
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bruno Sousa
Hi,

The tools for reporting storage usage per share are du -h / df -h :) ,
so yes, these tools could be made deduplication-aware.
I know, for instance, that Microsoft has a feature in Windows Server
2003 R2 called File Server Resource Manager, which can produce Storage
Reports, and one of those reports is Duplicated Files.
Bottom line: if ZFS can deliver such a capability, I think
Solaris/OpenSolaris would gain yet another competitive edge over other
solutions, and more customers could see the advantages of choosing
ZFS-based storage.
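
(For reference, what ZFS reports today is pool-level rather than
per-file; a rough sketch with a hypothetical pool 'tank':)

  zpool get dedupratio tank                  # overall ratio for the pool
  zfs list -r -o name,used,referenced tank   # logical usage per share
  # there is currently no built-in report that maps dedup savings back
  # to individual files or directories.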

Bruno

Andrey Kuzmin wrote:
 On Wed, Dec 9, 2009 at 2:47 PM, Bruno Sousa bso...@epinfante.com wrote:
   
 Hi Andrey,

 For instance, i talked about deduplication to my manager and he was
 happy because less data = less storage, and therefore less costs .
 However, now the IT group of my company needs to provide to management
 board, a report of duplicated data found per share, and in our case one
 share means one specific company department/division.
 Bottom line, the mindset is something like :

* one share equals to a specific department within the company
* the department demands a X value of data storage
* the data storage costs Y
* making a report of the amount of data consumed by a department,
  before and after deduplication, means that data storage costs can
  be seen per department
 

 Do you currently have tools that report storage usage per share? What
 you ask for looks like a request to make these deduplication-aware.

   
* if theres a cost reduction due to the usage of deduplication, part
  of that money can be used for business , either IT related
  subjects or general business
* management board wants to see numbers related to costs, and not
  things like the racio of deduplication in SAN01 is 3x, because
  for management this is geek talk
 

 Just divide storage costs by deduplication factor (1), and here you
 are (provided you can do it by department).

 Regards,
 Andrey

   
 I hope i was somehow clear, but i can try to explain better if needed.

 Thanks,
 Bruno

 Andrey Kuzmin wrote:
 
 On Wed, Dec 9, 2009 at 2:26 PM, Bruno Sousa bso...@epinfante.com wrote:

   
 Hi all,

 Is there any way to generate some report related to the de-duplication
 feature of ZFS within a zpool/zfs pool?
 I mean, its nice to have the dedup ratio, but it think it would be also
 good to have a report where we could see what directories/files have
 been found as repeated and therefore they suffered deduplication.

 
 Nice to have at first glance, but could you detail on any specific
 use-case you see?

 Regards,
 Andrey


   
 Thanks for your time,
 Bruno

 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



 
   
 

   



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Changing ZFS drive pathing

2009-12-09 Thread Mike
Alex, thanks for the info.  You made my heart stop a little when I read
about your problem with PowerPath, but MPxIO seems like it might be a
good option for me.  I will try that as well, although I have not used
it before.  Thank you!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] freeNAS moves to Linux from FreeBSD

2009-12-09 Thread Bob Friesenhahn

On Wed, 9 Dec 2009, James Andrewartha wrote:


There is a huge difference practically - OpenSolaris has no free security 
updates for stable releases, unlike FreeBSD. And I'm sure you don't recommend 
running /dev in production.


If OpenSolaris were to do that, then it would be called Solaris. :-)

It seems that Solaris 10 offers free security and critical updates. 
Of course the desktop application software is quite old and OS 
features lag behind OpenSolaris.


Sun needs to find a way to improve its profit margins and retain its 
valuable employees, and the way it does that is by selling service 
contracts.  The base service contract for Solaris 10 is not terribly 
expensive, although it mostly just offers full access to patches and 
the Sunsolve site.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] panic when rebooting from snapshot

2009-12-09 Thread Cindy Swearingen

Hi Joep,

Booting from a snapshot isn't possible because the snapshot is not
writable and the boot operation writes to the BE. Booting from a
clone succeeds because the clone is writable.

The second issue is whether the reboot command understands what
a snapshot is. I see from the reboot man page that it supports
'-e environment', but I haven't tested this feature with a ZFS BE
or clone.
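
(A rough sketch of the clone-based route on OpenSolaris using beadm; the
BE and snapshot names below are hypothetical:)

  beadm create -e mybe@ro mybe-clone   # new BE cloned from the snapshot
  beadm activate mybe-clone            # make it the boot default
  init 6                               # reboot into the clone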

Thanks,

Cindy



On 12/08/09 12:48, Joep Vesseur wrote:

Folks,

I've been seeing this for a while, but never had the urge to ask, until now.
When I take a snapshot of my current root-FS and tell the system to reboot off
that snapshot, I'm faced with an assertion failure (running DEBUG bits) that
looks like this:

r...@codemonkey:~# df -h /
FilesystemSize  Used Avail Use% Mounted on
rpool/ROOT/bfu129G  8.2G  121G   7% /
r...@codemonkey:~# zfs snapshot rpool/ROOT/b...@ro
r...@codemonkey:~# reboot rpool/ROOT/b...@ro
Dec  8 20:41:17 codemonkey reboot: initiated by root on /dev/console

panic[cpu0]/thread=ff01e1023040: assertion failed: vfsp->vfs_count != 0,
file: ../../common/fs/vfs.c, line: 4374

ff0007fb1c90 genunix:assfail+7e ()
ff0007fb1cc0 genunix:vfs_rele+86 ()
ff0007fb1ce0 zfs:zfs_freevfs+2a ()
ff0007fb1d00 genunix:fsop_freefs+1a ()
ff0007fb1d30 genunix:vfs_rele+3b ()
ff0007fb1d60 genunix:vfs_remove+65 ()
ff0007fb1db0 genunix:dounmount+a3 ()
ff0007fb1de0 genunix:vfs_unmountall+92 ()
ff0007fb1e50 genunix:kadmin+549 ()
ff0007fb1eb0 genunix:uadmin+10f ()
ff0007fb1f00 unix:brand_sys_syscall32+295 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
 0:14 100% done
100% done: 256267 pages dumped, dump succeeded
rebooting...

I've tried to reason why this happens, but fail to come up with a
plausible answer. Has anyone else seen this? Anyone knows what's
amiss? I'm hesitant to file a bug without a pointer to a possible
cause.

This only happens when I try to reboot off a snapshot. If I first create
a clone of the snapshot and reboot off that, the system is perfectly happy...

TIA,

Joep
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Cindy Swearingen

Hi Henri,

The slides from the SNIA conference this past fall provide a description
of upcoming features, here:

http://www.snia.org/events/storage-developer2009/presentations/monday/JeffBonwick_zfs-What_Next-SDC09.pdf

Cindy



On 12/09/09 05:25, Henri Maddox wrote:

Hi There,

does anybody know, if there's a Roadmap oder simply a List of the future 
features of ZFS?

Would be interesting to see, what will happen in the future.

THX
Henri

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs allow - internal error

2009-12-09 Thread Andrew Robert Nicols
I've just done a fresh install of Solaris 10 u8 (2009.10) onto a Thumper.
Running zfs allow gives the following delightful output:

-bash-3.00$ zfs allow
internal error: /usr/lib/zfs/pyzfs.py not found

I've confirmed it on a second thumper, also running Solaris 10 u8 installed
about 2 months ago.

Has anyone else seen this?

Thanks,

Andrew

-- 
Systems Developer

e: andrew.nic...@luns.net.uk
im: a.nic...@jabber.lancs.ac.uk
t: +44 (0)1524 5 10147

Lancaster University Network Services is a limited company registered in
England and Wales. Registered number: 04311892. Registered office:
University House, Lancaster University, Lancaster, LA1 4YW


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Bob Friesenhahn

On Wed, 9 Dec 2009, Markus Kovero wrote:


From what I've noticed, if one destroys a dataset that is, say, 50-70 TB
and reboots before the destroy is finished, it can take up to several
_days_ before the pool is back up again.

So nowadays I'm doing rm -fr BEFORE issuing zfs destroy whenever possible.


It stands to reason that if deduplication is done via reference
counting, then whenever a deduplicated block is freed its reference
count needs to be decremented, and that needs to be done atomically.
Blocks such as full-length zeroed blocks (common for zfs logical
volumes) are likely to be quite heavily duplicated.  That may be where
this bottleneck is coming from.
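
(One way to see how heavily blocks are referenced is the DDT histogram
that zdb can print; the pool name is hypothetical and the exact output
varies by build:)

  zdb -DD tank   # dedup-table statistics, including a histogram of how
                 # many blocks are referenced 1x, 2x, 4x, ... times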


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs allow - internal error

2009-12-09 Thread Tomas Ögren
On 09 December, 2009 - Andrew Robert Nicols sent me these 1,6K bytes:

 I've just done a fresh install of Solaris 10 u8 (2009.10) onto a Thumper.
 Running zfs allow gives the following delightful output:
 
 -bash-3.00$ zfs allow
 internal error: /usr/lib/zfs/pyzfs.py not found
 
 I've confirmed it on a second thumper, also running Solaris 10 u8 installed
 about 2 months ago.
 
 Has anyone else seen this?

Yes. You haven't got SUNWPython installed; it is wrongly marked as
belonging to the GNOME2 cluster. Install SUNWPython and SUNWPython-share
and it'll work. Some ZFS functionality (userspace, allow, ...) started
using Python in u8.
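
(On Solaris 10 the packages can be added back from the install media
with pkgadd; the media path below is a typical example, not taken from
this thread:)

  pkgadd -d /cdrom/cdrom0/Solaris_10/Product SUNWPython SUNWPython-share
  pkginfo SUNWPython SUNWPython-share   # confirm they are installed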

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Adam Leventhal
Hi Kjetil,

Unfortunately, dedup only applies to data written after the setting is
enabled. That also means that new blocks cannot dedup against old blocks,
regardless of how they were written. There is therefore no way to prepare
your pool for dedup -- you just have to enable it once you have the new bits.
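
(In terms of Kjetil's example, with a hypothetical dataset name, that
works out to:)

  zfs set dedup=on tank/photos    # affects blocks written from now on
  cp /tank/photos/img0.jpg /tank/photos/img0a.jpg
  cp /tank/photos/img0.jpg /tank/photos/img0b.jpg
  # img0a.jpg and img0b.jpg can share blocks with each other, but not
  # with the original img0.jpg, whose blocks predate dedup=on;
  # rewriting img0.jpg would bring its blocks into the dedup table too.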

Adam

On Dec 9, 2009, at 3:40 AM, Kjetil Torgrim Homme wrote:

 I'm planning to try out deduplication in the near future, but started
 wondering if I can prepare for it on my servers.  one thing which struck
 me was that I should change the checksum algorithm to sha256 as soon as
 possible.  but I wonder -- is that sufficient?  will the dedup code know
 about old blocks when I store new data?
 
 let's say I have an existing file img0.jpg.  I turn on dedup, and copy
 it twice, to img0a.jpg and img0b.jpg.  will all three files refer to the
 same block(s), or will only img0a and img0b share blocks?
 
 -- 
 Kjetil T. Homme
 Redpill Linpro AS - Changing the game
 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Adam Leventhal, Fishworkshttp://blogs.sun.com/ahl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] my ZFS backup script -- feedback appreciated

2009-12-09 Thread David Dyer-Bennet

On Tue, December 8, 2009 19:23, Andrew Daugherity wrote:

 Description/rationale of the script (more detailed comments within the
 script):
 # This supplements zfs-auto-snapshot, but runs independently.  I prefer
 that
 # snapshots continue to be taken even if the backup fails.
 #
 # This aims to be much more robust than the backup functionality of
 # zfs-auto-snapshot, namely:
 # * it uses 'zfs send -I' to send all intermediate snapshots (including
 #   any daily/weekly/etc.), and should still work even if it isn't run
 #   every hour -- as long as the newest remote snapshot hasn't been
 #   rotated out locally yet
 # * 'zfs recv -dF' on the destination host removes any snapshots not
 #   present locally so you don't have to worry about manually removing
 #   old snapshots there.

 I realize this doesn't meet everybody's needs but hopefully someone will
 find it useful.

This description matches pretty well what I've been working on (I'm
doing my own snapshots) and having trouble getting to work (I get ZFS
errors on one end or the other depending on options).  I'm working with
OpenSolaris, though, which may make a difference.  I don't know when I
next get a chance to work on this, maybe this weekend; but having a
working example will be great.  I'll either just use yours, or at least
benefit from seeing what works; I'll figure out which when I look more
closely.  So thanks!
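
(For readers following along, the core of the approach described in the
script is a single incremental send/receive; the host, pool and snapshot
names here are hypothetical:)

  zfs send -I tank/home@2009-12-08 tank/home@2009-12-09 | \
      ssh backuphost zfs recv -dF backup
  # -I sends all intermediate snapshots between the two named ones;
  # -d re-creates the dataset path under the receiving pool 'backup';
  # -F rolls the destination back and discards snapshots that no longer
  #    exist on the sending side.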

-- 
David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Kjetil Torgrim Homme
Adam Leventhal a...@eng.sun.com writes:
 Unfortunately, dedup will only apply to data written after the setting
 is enabled. That also means that new blocks cannot dedup against old
 block regardless of how they were written. There is therefore no way
 to prepare your pool for dedup -- you just have to enable it when
 you have the new bits.

thank you for the clarification!
-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Colin Raven
Adam,
So the best way is to set this at pool creation time. OK, that makes
sense; it operates only on fresh data that's coming over the fence.
BUT:
What happens if you snapshot, send, destroy, recreate (with dedup on this
time around) and then write the contents of the cloned snapshot to the
various places in the pool - which properties win here? Those of the
host pool or those of the contents of the clone? The host pool, I assume,
because the clone contents are (in this scenario) just some new data?

-Me

On Wed, Dec 9, 2009 at 18:43, Adam Leventhal a...@eng.sun.com wrote:

 Hi Kjetil,

 Unfortunately, dedup will only apply to data written after the setting is
 enabled. That also means that new blocks cannot dedup against old block
 regardless of how they were written. There is therefore no way to prepare
 your pool for dedup -- you just have to enable it when you have the new
 bits.



 On Dec 9, 2009, at 3:40 AM, Kjetil Torgrim Homme wrote:

  I'm planning to try out deduplication in the near future, but started
  wondering if I can prepare for it on my servers.  one thing which struck
  me was that I should change the checksum algorithm to sha256 as soon as
  possible.  but I wonder -- is that sufficient?  will the dedup code know
  about old blocks when I store new data?
 
  let's say I have an existing file img0.jpg.  I turn on dedup, and copy
  it twice, to img0a.jpg and img0b.jpg.  will all three files refer to the
  same block(s), or will only img0a and img0b share blocks?
 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Richard Elling

On Dec 9, 2009, at 3:47 AM, Bruno Sousa wrote:


Hi Andrey,

For instance, i talked about deduplication to my manager and he was
happy because less data = less storage, and therefore less costs .
However, now the IT group of my company needs to provide to management
board, a report of duplicated data found per share, and in our case  
one

share means one specific company department/division.
Bottom line, the mindset is something like :

   * one share equals to a specific department within the company
   * the department demands a X value of data storage
   * the data storage costs Y
   * making a report of the amount of data consumed by a department,
 before and after deduplication, means that data storage costs can
 be seen per department
   * if theres a cost reduction due to the usage of deduplication,  
part

 of that money can be used for business , either IT related
 subjects or general business
   * management board wants to see numbers related to costs, and not
 things like the racio of deduplication in SAN01 is 3x, because
 for management this is geek talk

I hope i was somehow clear, but i can try to explain better if needed.


Snapshots, copies, compression, deduplication, and (eventually)
encryption occur at the block level, not the file level. Hence,
file-level accounting works as long as you do not try to make a 1:1
relationship to physical space.

But your problem, as described above, is one of managerial accounting.
IMHO, trying to apply a technical solution to a managerial accounting
problem is akin to catching a greased pig.  It is much easier to just do
what businessmen do -- manage managerial accounting.
http://en.wikipedia.org/wiki/Managerial_accounting
 -- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bruno Sousa
Hi,

Despite the fact that I agree in general with your comments, in reality
it all comes down to money.
So in this case, if I could prove that ZFS was able to find X amount of
duplicated data, and that amount of data has a price of Y per GB, IT
could be seen as a business enabler instead of a cost centre.
But indeed, you're right: in my case a technical solution is being asked
to answer a managerial question. However, isn't that why IT was
invented? I believe that's why I get my paycheck each month :)

Bruno

Richard Elling wrote:
 On Dec 9, 2009, at 3:47 AM, Bruno Sousa wrote:

 Hi Andrey,

 For instance, i talked about deduplication to my manager and he was
 happy because less data = less storage, and therefore less costs .
 However, now the IT group of my company needs to provide to management
 board, a report of duplicated data found per share, and in our case one
 share means one specific company department/division.
 Bottom line, the mindset is something like :

* one share equals to a specific department within the company
* the department demands a X value of data storage
* the data storage costs Y
* making a report of the amount of data consumed by a department,
  before and after deduplication, means that data storage costs can
  be seen per department
* if theres a cost reduction due to the usage of deduplication, part
  of that money can be used for business , either IT related
  subjects or general business
* management board wants to see numbers related to costs, and not
  things like the racio of deduplication in SAN01 is 3x, because
  for management this is geek talk

 I hope i was somehow clear, but i can try to explain better if needed.

 Snapshots, copies, compression, deduplication, and (eventually)
 encryption
 occurs at the block level, not the file level. Hence, file-level
 accounting
 works as long as you do not try to make a 1:1 relationship to physical
 space.

 But your problem, as described above, is one of managerial accounting.
 IMHO, trying to apply a technical solution to a managerial accounting
 problem is akin to catching a greased pig.  It is much easier to just do
 what businessmen do -- manage managerial accounting.
 http://en.wikipedia.org/wiki/Managerial_accounting
  -- richard





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Adam Leventhal
 What happens if you snapshot, send, destroy, recreate (with dedup on this 
 time around) and then write the contents of the cloned snapshot to the 
 various places in the pool - which properties are in the ascendancy here? the 
 host pool or the contents of the clone? The host pool I assume, because 
 clone contents are (in this scenario) just some new data?

The dedup property applies to all writes so the settings for the pool of origin 
don't matter, just those on the destination pool.

Adam

--
Adam Leventhal, Fishworkshttp://blogs.sun.com/ahl

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bob Friesenhahn

On Wed, 9 Dec 2009, Bruno Sousa wrote:


Despite the fact that i agree in general with your comments, in reality
it all comes to money..
So in this case, if i could prove that ZFS was able to find X amount of
duplicated data, and since that X amount of data has a price of Y per
GB, IT could be seen as business enabler instead of a cost centre.


Most of the cost of storing business data is related to the cost of 
backing it up and administering it rather than the cost of the system 
on which it is stored.  In this case it is reasonable to know the 
total amount of user data (and charge for it), since it likely needs 
to be backed up and managed.  Deduplication does not help much here.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Andrey Kuzmin
On Wed, Dec 9, 2009 at 10:43 PM, Bob Friesenhahn
bfrie...@simple.dallas.tx.us wrote:
 On Wed, 9 Dec 2009, Bruno Sousa wrote:

 Despite the fact that i agree in general with your comments, in reality
 it all comes to money..
 So in this case, if i could prove that ZFS was able to find X amount of
 duplicated data, and since that X amount of data has a price of Y per
 GB, IT could be seen as business enabler instead of a cost centre.

 Most of the cost of storing business data is related to the cost of backing
 it up and administering it rather than the cost of the system on which it is
 stored.  In this case it is reasonable to know the total amount of user data
 (and charge for it), since it likely needs to be backed up and managed.
  Deduplication does not help much here.

Um, I thought deduplication had been invented to reduce the backup window :).

Regards,
Andrey

 Bob
 --
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bruno Sousa
Hi,

The data needs to be stored somewhere, and usually that means a server,
a disk array, and disks; more data means more disks, and more active
disks mean more power usage, therefore higher costs and less green IT :)
So, from my point of view, deduplication is relevant for lowering costs,
but in order to do that there has to be a way to measure those
costs/savings.
But yes, this cost probably represents less than 20% of the total cost;
still, it is a cost no matter what.

However, maybe I'm heading down the wrong road...

Bruno


Bob Friesenhahn wrote:
 On Wed, 9 Dec 2009, Bruno Sousa wrote:

 Despite the fact that i agree in general with your comments, in reality
 it all comes to money..
 So in this case, if i could prove that ZFS was able to find X amount of
 duplicated data, and since that X amount of data has a price of Y per
 GB, IT could be seen as business enabler instead of a cost centre.

 Most of the cost of storing business data is related to the cost of
 backing it up and administering it rather than the cost of the system
 on which it is stored.  In this case it is reasonable to know the
 total amount of user data (and charge for it), since it likely needs
 to be backed up and managed.  Deduplication does not help much here.

 Bob
 -- 
 Bob Friesenhahn
 bfrie...@simple.dallas.tx.us,
 http://www.simplesystems.org/users/bfriesen/
 GraphicsMagick Maintainer,http://www.GraphicsMagick.org/




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Expected ZFS behavior?

2009-12-09 Thread Ragnar Sundblad

On 7 dec 2009, at 18.40, Bob Friesenhahn wrote:

 On Mon, 7 Dec 2009, Richard Bruce wrote:
 
 I started copying over all the data from my existing workstation. When 
 copying files (mostly multi-gigabyte DV video files), network throughput 
 drops to zero for ~1/2 second every 8-15 seconds.  This throughput drop 
 corresponds to drive activity on the Opensolaris box.  The ZFS pool drives 
 show no activity except every 8-15 seconds.  As best as I can guess, the 
 Opensolaris box is caching traffic and batching it to disk every so often.  
 I guess I didn't expect disk writes to interrupt network traffic.  Is this 
 correct?
 
 This is expected behavior.  From what has been posted here, these are the 
 current buffering rules:

Is it really?

Shouldn't it start on the next txg while the previous txg commits, and
just continue writing?

/ragge

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread R.G. Keen
I didn't see remove a simple device anywhere in there.

Is it:
too hard to even contemplate doing, 
or
too silly a thing to do to even consider letting that happen
or 
too stupid a question to even consider
or
too easy and straightforward to do the procedure I see recommended (export the 
whole pool, destroy the pool, remove the device, remake the pool, then reimport 
the pool) to even bother with?
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Bob Friesenhahn

On Wed, 9 Dec 2009, Andrey Kuzmin wrote:


Um, I thought deduplication had been invented to reduce backup window :).


Unless the backup system also supports deduplication, in what way does 
deduplication reduce the backup window?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Dale Ghent

What you're talking about is a side benefit of the BP rewrite section of
the linked slides.

I believe that once BP rewrite is fully baked, we'll see a device-removal
feature arrive soon afterwards.

/dale

On Dec 9, 2009, at 3:46 PM, R.G. Keen wrote:

 I didn't see remove a simple device anywhere in there.
 
 Is it:
 too hard to even contemplate doing, 
 or
 too silly a thing to do to even consider letting that happen
 or 
 too stupid a question to even consider
 or
 too easy and straightforward to do the procedure I see recommended (export 
 the whole pool, destroy the pool, remove the device, remake the pool, then 
 reimport the pool) to even bother with?
 -- 
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Expected ZFS behavior?

2009-12-09 Thread Bob Friesenhahn

On Wed, 9 Dec 2009, Ragnar Sundblad wrote:


This is expected behavior.  From what has been posted here, these 
are the current buffering rules:


Is it really?

Shouldn't it start on the next txg and while the previous txg commits,
and just continue writing?


The pause is clearly not for the entire TXG commit.  The TXG commit
could take up to five seconds to complete.  Perhaps the pause occurs
only during the start of the commit, or perhaps at the end, or perhaps
it is because the next TXG has already become 100% full while waiting
for the current TXG to commit, and zfs is not willing to endanger more
than one TXG's worth of data, so it pauses?
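
(One way to watch the bursty pattern being discussed here, with a
hypothetical pool name:)

  zpool iostat tank 1   # per-second pool I/O; writes show up in bursts
                        # at each txg commit
  iostat -xn 1          # the same bursts at the device level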


To my recollection, none of the zfs developers have been interested in 
discussing the cause of the pause, although they are clearly 
interested in maximizing performance.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Seth Heeren
R.G. Keen wrote:
 I didn't see remove a simple device anywhere in there.

 Is it:
 too hard to even contemplate doing, 
 or
 too silly a thing to do to even consider letting that happen
 or 
 too stupid a question to even consider
 or
 too easy and straightforward to do the procedure I see recommended (export 
 the whole pool, destroy the pool, remove the device, remake the pool, then 
 reimport the pool) to even bother with?
   
It is too complicated to implement directly.

As with lvm2 and comparable technologies, one would first have to have a
feature that moves all extents from one physical volume to the other
available physical volumes. Then, when allocating the replacement blocks,
the algorithm could quickly become _very_ unwieldy, because the pool
still has to keep its redundancy guarantees [1].

As you can imagine, this can be very complex given ZFS's mixture of RAID,
parity and _dynamic_ striping (simply reallocating the blocks could cause
massively fragmented disks if the pool/vdev previously used dynamic
striping). Using 'copies=n' and extra parity (raidz2, raidz3) further
complicates the matter. In all circumstances, about the only
transformation that guarantees all invariants are (logically [2]) checked
is the well-known send/recv kludge. In that case you simply need double
the storage and a lot of processing resources to make the transform.
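
(The send/recv kludge referred to above amounts to rebuilding the pool;
a minimal sketch with hypothetical pool names, assuming 'newtank' was
created without the device to be removed -- exact receive flags may need
adjusting for mountpoints:)

  zfs snapshot -r tank@migrate
  zfs send -R tank@migrate | zfs recv -F newtank
  # -R sends the whole dataset tree with properties and snapshots;
  # once verified, the old pool can be destroyed and its disks reused.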

There are a number of situations in which the logic can safely be
simplified (using only dynamic striping, using only full disks, and when
there is a 'third' recent disk not involved in any of the existing
stripes to receive the relocated stripes, etc.). In effect, I doubt that
these situations are ever going to cover more than what 'detach' and
'replace' offer at this moment in time.

So, in a word, yes, this is (very) complicated.  The complicating thing
is that ZFS handles dynamic striping and RAID redundancy properties
_automagically_. This dynamism makes it very hard to define what needs
to happen when a disk is removed (likewise for replacing it with a
smaller disk). 'Static' RAID tools have the advantage here, because they
can guarantee how stripes are laid out across a 'pool', and also because
the admin can limit the options used for a pool precisely to enable
'special operations' like 'remove physdev'. However, even so, removal of
a disk (as opposed to replacement) is a very uncommon use case for any
RAID solution that I know of.




[1] Of course, you could replace that complexity with a burden on the
user: let removal of a drive have the same effect as physically failing
that device, degrading the pool. Then you would have to either replace
the vdev or re-add a vdev to restore the redundancy.

[2] by which I mean, barring bugs in, say, send/recv
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Erik Trimble

Dale Ghent wrote:

What you're talking about is a side-benefit of the BP rewrite section of the 
linked slides.

I believe that once BP rewrite is fully baked, we'll soon afterwards see a 
device removal feature arrive.

/dale

On Dec 9, 2009, at 3:46 PM, R.G. Keen wrote:

  

I didn't see remove a simple device anywhere in there.

Is it:
too hard to even contemplate doing, 
or

too silly a thing to do to even consider letting that happen
or 
too stupid a question to even consider

or
too easy and straightforward to do the procedure I see recommended (export the 
whole pool, destroy the pool, remove the device, remake the pool, then reimport 
the pool) to even bother with?
--
BP rewrite is key to several oft-asked features:  vdev removal, defrag, 
raidz expansion, among others.


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Glenn Lagasse
* R.G. Keen (k...@geofex.com) wrote:
 I didn't see remove a simple device anywhere in there.
 
 Is it:
 too hard to even contemplate doing, 
 or
 too silly a thing to do to even consider letting that happen
 or 
 too stupid a question to even consider
 or
 too easy and straightforward to do the procedure I see recommended (export 
 the whole pool, destroy the pool, remove the device, remake the pool, then 
 reimport the pool) to even bother with?

You missed:

Too hard to do correctly with current resource levels and other higher
priority work.

As always, volunteers I'm sure are welcome. :-)

Cheers,

-- 
Glenn
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Neil Perrin



On 12/09/09 13:52, Glenn Lagasse wrote:

* R.G. Keen (k...@geofex.com) wrote:

I didn't see remove a simple device anywhere in there.

Is it:
too hard to even contemplate doing, 
or

too silly a thing to do to even consider letting that happen
or 
too stupid a question to even consider

or
too easy and straightforward to do the procedure I see recommended (export the 
whole pool, destroy the pool, remove the device, remake the pool, then reimport 
the pool) to even bother with?


You missed:

Too hard to do correctly with current resource levels and other higher
priority work.

As always, volunteers I'm sure are welcome. :-)



This gives the impression that development is not actively working on
it. This is not true. As has been said often, it is a difficult problem,
and it has been actively worked on for a few months now. I don't think
we are prepared to give a date for when it will be delivered, though.

Neil.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Larry Wake

Neil Perrin wrote:



On 12/09/09 13:52, Glenn Lagasse wrote:

* R.G. Keen (k...@geofex.com) wrote:

I didn't see remove a simple device anywhere in there.

Is it:
too hard to even contemplate doing, or
too silly a thing to do to even consider letting that happen
or too stupid a question to even consider
or
too easy and straightforward to do the procedure I see recommended 
(export the whole pool, destroy the pool, remove the device, remake 
the pool, then reimport the pool) to even bother with?


You missed:

Too hard to do correctly with current resource levels and other higher
priority work.

As always, volunteers I'm sure are welcome. :-)



This gives the impression that development is not actively working
on it. This is not true. As has been said often it is a difficult problem
and has been actively worked on for a few months now. I don't think
we are prepared to give a date as to when it will be delivered though.



This should go on a collection of things people ask about a lot, if 
such a thing were to exist.  Oh, wait...


http://hub.opensolaris.org/bin/view/Community+Group+zfs/faq#HCandevicesberemovedfromaZFSpool
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Planed ZFS-Features - Is there a List or something else

2009-12-09 Thread Glenn Lagasse
* Neil Perrin (neil.per...@sun.com) wrote:
 
 
 On 12/09/09 13:52, Glenn Lagasse wrote:
 * R.G. Keen (k...@geofex.com) wrote:
 I didn't see remove a simple device anywhere in there.
 
 Is it:
 too hard to even contemplate doing, or
 too silly a thing to do to even consider letting that happen
 or too stupid a question to even consider
 or
 too easy and straightforward to do the procedure I see recommended (export 
 the whole pool, destroy the pool, remove the device, remake the pool, then 
 reimport the pool) to even bother with?
 
 You missed:
 
 Too hard to do correctly with current resource levels and other higher
 priority work.
 
 As always, volunteers I'm sure are welcome. :-)
 
 
 This gives the impression that development is not actively working
 on it. This is not true. As has been said often it is a difficult problem

True.  I apologize for the misleading nature of my comment.  I should
have pointed out that I don't work on the ZFS project and was relating
what I believed the answer might be, based upon past list postings on
the subject.

 and has been actively worked on for a few months now. I don't think
 we are prepared to give a date as to when it will be delivered though.

Cool!

-- 
Glenn
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Resend : zfs: questions on ARC membership based on type/ordering of Reads/Writes

2009-12-09 Thread Andrew . Rutz

hi,
I'm re-sending this because I'm hoping that someone has answers to the
following questions.  I'm working a hot escalation on AmberRoad and am
trying to understand what's under ZFS's hood.

thanks
Solaris RPE
/andrew rutz

On 11/25/09 13:55, andrew.r...@sun.com wrote:

I am trying to understand the ARC's behavior based on different
permutations of (a)sync Reads and (a)sync Writes.

thank you, in advance


o does the data for a *sync-write* *ever* go into the ARC?
  eg, my understanding is that the data goes to the ZIL (and
  the SLOG, if present), but how does it get from the ZIL to the ZIO layer?
  eg, does it go to the ARC on its way to the ZIO ?
  o if the sync-write-data *does* go to the ARC, does it go to
the ARC *after* it is written to the ZIL's backing-store,
or does the data go to the ZIL and the ARC in parallel ?
o if a sync-write's data goes to the ARC and ZIL *in parallel*,
  then does zfs prevent an ARC-hit until the data is confirmed
  to be on the ZIL's nonvolatile media (eg, disk-platter or SLOG) ?
  or could a Read get an ARC-hit on a block *before* it's written
  to zil's backing-store?


o is the DMU where the Serialization of transactions occurs?

o if an async-Write for block-X hits the Serializer before a Read
  for block-X hits the Serializer, i am assuming the Read can
  pass the async-Write; eg, the Read is *not* pended behind the
  async-write.  however, if a Read hits the Serializer after a
  *sync*-write, then i'm assuming the Read is pended until
  the sync-write is written to the ZIL's nonvolatile media.
  o if a Read passes an async-write, then i'm assuming the Read
can be satisfied by either the arc, l2arc, or disk.

o it's stated that the L2ARC is for random-reads.  however, there's
  nothing to prevent the L2ARC from containing blocks derived from
  *sequential*-reads, right ?   also, blocks from async-writes can
  also live in l2arc, right?  how about sync-writes ?

o is the l2arc literally simply a *larger* ARC?  eg, does the l2arc
  obey the normal cache property where everything that is in the L1$
  (eg, ARC) is also in the L2$ (eg, l2arc)?  (I have a feeling that
  the set-theoretic intersection of ARC and L2ARC is empty, for some
  reason.)
  o does the l2arc use the ARC algorithm (as the name suggests) ?

thank you,

/andrew
Solaris RPE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Andrew Rutzandrew.r...@sun.com
Solaris RPE  Ph: (x64089) 512-401-1089
Austin, TX  78727Fax: 512-401-1452
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread James Lever

On 10/12/2009, at 5:36 AM, Adam Leventhal wrote:

 The dedup property applies to all writes so the settings for the pool of 
 origin don't matter, just those on the destination pool.

Just a quick related question I’ve not seen answered anywhere else:

Is it safe to have dedup running on your rpool? (at install time, or if you 
need to migrate your rpool to new media)

cheers,
James

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Jack Kielsmeier
 I have disabled all 'non-important' processes (gdm,
 ssh, vnc, etc). I am now starting this process
 locally on the server via the console with about 3.4
 GB free of RAM.
 
 I still have my entries in /etc/system for limiting
 how much RAM zfs can use.

Going on 10 hours now, still importing. Still at just under 2 MB/s read
speed on each disk in the pool.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Jack Kielsmeier
  I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am
  now starting this process locally on the server via the console with
  about 3.4 GB free of RAM.
  
  I still have my entries in /etc/system for limiting how much RAM zfs can
  use.
 
 Going on 10 hours now, still importing. Still at just under 2 MB/s read
 speed on each disk in the pool.

And now it's frozen again. It has been frozen for 10 minutes.

I had iostat running on the console. At the time of the freeze it started 
writing to the zfs pool disks; before that it had been all reads.

The console cursor is still blinking at least, so it's not a hard lock. I'm 
just going to let it sit for a while and see what happens.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Cindy Swearingen

I wonder if you are hitting this bug:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6905936
Deleting large files or filesystems on a dedup=on filesystem stalls the
whole system

Cindy

On 12/09/09 16:41, Jack Kielsmeier wrote:

I have disabled all 'non-important' processes (gdm, ssh, vnc, etc). I am now
starting this process locally on the server via the console with about
3.4 GB free of RAM.

I still have my entries in /etc/system for limiting how much RAM zfs can
use.

Going on 10 hours now, still importing. Still at just under 2 MB/s read
speed on each disk in the pool.


And now it's frozen again. It has been frozen for 10 minutes.

I had iostat running on the console. At the time of the freeze it started 
writing to the zfs pool disks; before that it had been all reads.

The console cursor is still blinking at least, so it's not a hard lock. I'm 
just going to let it sit for a while and see what happens.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS pool unusable after attempting to destroy a dataset with dedup enabled

2009-12-09 Thread Jack Kielsmeier
Ah, that could be it!

This leaves me hopeful, as it looks like that bug says it'll eventually finish!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Resend : zfs: questions on ARC membership based on type/ordering of Reads/Writes

2009-12-09 Thread Richard Elling

I replied... maybe I don't count anymore, boo hoo :-)
http://opensolaris.org/jive/thread.jspa?threadID=118667&tstart=15
 -- richard

On Dec 9, 2009, at 1:57 PM, andrew.r...@sun.com wrote:


hi,
I'm re-sending this because I'm hoping that someone has some answers
to the following questions.  I'm working a hot escalation on AmberRoad
and am trying to understand what's under zfs' hood.

thanks
Solaris RPE
/andrew rutz

On 11/25/09 13:55, andrew.r...@sun.com wrote:

I am trying to understand the ARC's behavior based on different
permutations of (a)sync Reads and (a)sync Writes.
thank you, in advance
o does the data for a *sync-write* *ever* go into the ARC?
 eg, my understanding is that the data goes to the ZIL (and
 the SLOG, if present), but how does it get from the ZIL to the ZIO  
layer?

 eg, does it go to the ARC on its way to the ZIO ?
 o if the sync-write-data *does* go to the ARC, does it go to
   the ARC *after* it is written to the ZIL's backing-store,
   or does the data go to the ZIL and the ARC in parallel ?
   o if a sync-write's data goes to the ARC and ZIL *in parallel*,
 then does zfs prevent an ARC-hit until the data is confirmed
 to be on the ZIL's nonvolatile media (eg, disk-platter or  
SLOG) ?

 or could a Read get an ARC-hit on a block *before* it's written
 to zil's backing-store?
o is the DMU where the Serialization of transactions occurs?
o if an async-Write for block-X hits the Serializer before a Read
 for block-X hits the Serializer, i am assuming the Read can
 pass the async-Write; eg, the Read is *not* pended behind the
 async-write.  however, if a Read hits the Serializer after a
 *sync*-write, then i'm assuming the Read is pended until
 the sync-write is written to the ZIL's nonvolatile media.
 o if a Read passes an async-write, then i'm assuming the Read
   can be satisfied by either the arc, l2arc, or disk.
o it's stated that the L2ARC is for random-reads.  however, there's
 nothing to prevent the L2ARC from containing blocks derived from
 *sequential*-reads, right ?   also, blocks from async-writes can
 also live in l2arc, right?  how about sync-writes ?
o is the l2arc literally simply a *larger* ARC?  eg, does the l2arc
 obey the normal cache property where everything that is in the L1$
 (eg, ARC) is also in the L2$ (eg, l2arc) ?  (I have a feeling that
 the set-theoretic intersection of ARC and L2ARC is empty, for some
 reason.)
 o does the l2arc use the ARC algorithm (as the name suggests) ?
thank you,
/andrew
Solaris RPE
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--
Andrew Rutz              andrew.r...@sun.com
Solaris RPE  Ph: (x64089) 512-401-1089
Austin, TX  78727Fax: 512-401-1452
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS dedup report tool

2009-12-09 Thread Richard Elling

On Dec 9, 2009, at 11:07 AM, Bruno Sousa wrote:

Hi,

Despite the fact that I agree in general with your comments, in reality it
all comes down to money.
So in this case, if I could prove that ZFS was able to find X amount of
duplicated data, and since that X amount of data has a price of Y per
GB, IT could be seen as a business enabler instead of a cost centre.
But indeed, you're right: in my case a possible technical solution is
trying to answer a managerial question. However, isn't that why IT was
invented? I believe that's why I get my paycheck each month :)


OK, I think I've pulled your leg just a bit :-)  Here is the problem: if you
charge per byte, then when you dedup, the cost per byte increases.
Why? Because you have both fixed and variable costs, and dedup
will reduce your variable (per-byte) cost.

cost = fixed cost ($) + [per-byte cost ($/byte) * bytes stored]
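
To make that concrete, a quick worked example with made-up numbers (purely
illustrative):

  fixed cost = $1000, per-byte cost = $1 per GB stored
  no dedup:   1000 GB stored -> $1000 + 1000*$1 = $2000  ($2.00 per stored GB)
  2x dedup:    500 GB stored -> $1000 +  500*$1 = $1500  ($3.00 per stored GB)

Cheaper in total, but more expensive per stored byte - which is exactly the
chargeback wrinkle.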

The best way to solve this is through managerial accounting (aka
change the rules :-) which happens quite often in business.  See also
Captain Kirk's response to the Kobayashi Maru
http://en.wikipedia.org/wiki/Kobayashi_Maru

Finally, as my managerial accounting professor says, don't lose  
money :-)
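
As for the report you originally asked about: today the visibility is
pool-wide only. A rough sketch, using a hypothetical pool named 'tank'
(exact output varies by build):

  # pool-wide dedup ratio
  zpool get dedupratio tank

  # histogram of the dedup table: how many blocks are referenced 1x, 2x, ...
  zdb -DD tank

Neither of these maps deduplicated blocks back to particular files or
directories, so the per-directory report you describe would need new
tooling on top.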

 -- richard


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] CR#6850837 libshare enhancements to address performance and scalability

2009-12-09 Thread Paul B. Henson

I've had a case open for a year or so now regarding the inefficiencies of
having a large number of zfs filesystems, in particular how long it takes
to share/unshare them (resulting in a reboot cycle time on my x4500 with
8000 file systems of over two hours).
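
(For anyone who wants to reproduce the numbers, a rough way to measure it -
the commands below are a sketch, not exactly what I run:

  # how many filesystems are actually set to be shared
  zfs list -H -t filesystem -o name,sharenfs | awk '$2 != "off"' | wc -l

  # time one full share pass, which is essentially what boot has to do
  time zfs share -a
)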

I got an update indicating that the bug fix mentioned in the subject was going
to resolve this, but that they did not plan to backport it to Solaris 10.
They also were not able to provide any technical details of what was fixed
or how much the performance might improve.

Would anyone happen to have any details of what changes were made, what
kind of improvements might be expected, and why it's not going to be
feasible to backport that change to Solaris 10?

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] CR#6850837 libshare enhancements to address performance and scalability

2009-12-09 Thread Alastair Neil
Ditto, and also some estimate of when we can see them in OpenSolaris.

On Wed, Dec 9, 2009 at 11:02 PM, Paul B. Henson hen...@acm.org wrote:


 I've had a case open for a year or so now regarding the inefficiencies of
 having a large number of zfs filesystems, in particular how long it takes
 to share/unshare them (resulting in a reboot cycle time on my x4500 with
 8000 file systems of over two hours).

 I got an update indicating that the subject mentioned bugfix was going to
 resolve this, but that they did not plan to back port it to Solaris 10.
 They also were not able to provide any technical details of what was fixed
 or how much the performance might improve.

 Would anyone happen to have any details of what changes were made, what
 kind of improvements might be expected, and why it's not going to be
 feasible to backport that change to Solaris 10?

 Thanks...


 --
 Paul B. Henson  |  (909) 979-6361  |  
 http://www.csupomona.edu/~henson/
 Operating Systems and Network Analyst  |  hen...@csupomona.edu
 California State Polytechnic University  |  Pomona CA 91768
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Opensolaris with J4400 - Experiences

2009-12-09 Thread Trevor Pretty
OK. Today I played with a J4400 connected to a Txxx server running S10 10/09.

First off: read the release notes. I spent about 4 hours pulling my hair out 
because I could not get stmsboot to work, until we read in the release notes 
that 500GB SATA drives do not work!

Initial Setup:
A pair of dual port SAS controllers (c4 and c5)
A J4400 with 6x 1TB SATA disks

The J4400 had two controllers, and these were connected to one SAS card 
(physical controller c4).

Test 1:

First a reboot -- -r

format shows 12 disks on c4 (each disk having two paths). If you picked the 
same disk via both paths, ZFS stopped you from doing anything stupid because 
it knew the disk was already in use.

Test 2:

run stmsboot -e

format now shows six disks on controller c6, a new virtual controller. The two 
internal disks are also now on c6, and stmsboot has done the right stuff with 
the rpool, so I would guess you could multipath at a later date if you don't 
want to at first, but I did not test this.

stmsboot -L only showed the two internal disks, not the six in the J4400 - 
strange, but we pressed on.

Test 3:

I created a zpool (two disks mirrored) using two of the new devices on c6.

I created some I/O load

I then unplugged one of the cables from the SAS card (physical c4).

Result: Nothing - everything just keeps working. Cool stuff!

Test 4:

I plugged the unplugged cable into the other controller (physical c5)

Result: Nothing - everything just keeps working. Cool stuff!

Test 5:

Being bold, I then unplugged the remaining cable from the physical c4 controller.

Result: Nothing - everything just keeps working. Cool stuff!

So I had gone from dual-pathed on a single controller (c4) to single-pathed 
on a different controller (c5).


Test 6:

I added the other four drives to the zpool (plain old zfs stuff - a bit boring).


Test 7:

I plugged in four more disks.

Result: Their multipathed devices just showed up in format. I added them to the 
pool and also added them as spares, all the while the I/O load was happening. No 
noticeable stops or glitches.

Conclusion:

If you RTFM first, then stmsboot does everything it is documented to do. You 
don't need to play with cfgadm or anything like that, just as I said originally 
(below). The multipathing stuff is easy to set up, and even a very rusty admin 
like me found it very easy.
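
For the record, the whole exercise boils down to a handful of commands; a
rough sketch from my notes (the pool and device names are examples only,
yours will differ):

  # enable the MPxIO virtual controller for the SAS HBAs, then reboot
  stmsboot -e

  # after the reboot, check the mapping from old to new device names
  stmsboot -L

  # build the pool on the new c6 devices
  zpool create tank mirror c6tWWN1d0 c6tWWN2d0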

Note: There may be patches for the 500GB SATA disks, I don't know; fortunately 
that's not what I've sold - phew!

TTFN
Trevor

From: zfs-discuss-boun...@opensolaris.org [zfs-discuss-boun...@opensolaris.org] 
On Behalf Of Trevor Pretty [trevor_pre...@eagle.co.nz]
Sent: Monday, 30 November 2009 2:48 p.m.
To: Karl Katzke
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Opensolaris with J4400 - Experiences

Karl

Don't you just use stmsboot?

http://docs.sun.com/source/820-3223-14/SASMultipath.html#50511899_pgfId-1046940

Bruno

Next week I'm playing with an M3000 and a J4200 in the local NZ distributor's 
lab. I had planned to just use the latest version of S10, but if I get the time 
I might play with OpenSolaris as well; I don't think there is anything 
radically different between the two here.

From what I've read in preparation (and I stand to be corrected):



* Will I be able to achieve multipath support if I connect the
  J4400 to two LSI HBAs in one server, with SATA disks, or is this only
  possible with SAS disks? This server will have OpenSolaris (any
  release, I think).

Disk type does not matter (see link above).

* The CAM (StorageTek Common Array Manager) is only for hardware
  management of the JBOD, leaving
  disk/volumes/zpools/luns/whatever_name management up to the server
  operating system, correct?

That is my understanding see:- http://docs.sun.com/source/820-3765-11/

* Can I put some readzillas/writezillas in the J4400 along with SATA
  disks, and if so will I have any benefit, or should I place
  those *zillas directly into the server's disk tray?

On the Unified Storage products they go in both: Readzilla in the server, 
Logzillas in the J4400. This is quite logical: if you want to move the array 
between hosts, all the data needs to be in the array, whereas read data can 
always be re-created, so the closer to the CPU the better (see the sketch 
after this list). See: 
http://catalog.sun.com/

* Does anyone have experience with those JBODs? If so, are they in
  general solid/reliable?

No. But get a support contract!

* The server will probably be a Sun x44xx series with 32 GB RAM, but
  for the best possible performance, should I invest in more and
  more spindles, or a couple fewer spindles and buy some readzillas?
  This system will be mainly used to export some volumes over iSCSI
  to a Windows 2003 fileserver, and to hold some NFS shares.

Check Brendan Gregg's blogs - *I think* he has done some work here, from memory.
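
On the *zilla placement above, the mechanics are simple once the devices are
visible to the host - a minimal sketch with made-up device names and a
hypothetical pool called 'tank':

  # Logzilla (separate ZIL log device) out in the J4400, so it travels with the data
  zpool add tank log c6tSSD1d0

  # Readzilla (L2ARC cache device) in the server, since its contents can always be rebuilt
  zpool add tank cache c1t2d0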

Karl Katzke wrote:

Bruno -

Sorry, I don't have experience with OpenSolaris, but 

Re: [zfs-discuss] will deduplication know about old blocks?

2009-12-09 Thread Cyril Plisko
On Thu, Dec 10, 2009 at 12:37 AM, James Lever j...@jamver.id.au wrote:

 On 10/12/2009, at 5:36 AM, Adam Leventhal wrote:

 The dedup property applies to all writes so the settings for the pool of 
 origin don't matter, just those on the destination pool.

 Just a quick related question I’ve not seen answered anywhere else:

 Is it safe to have dedup running on your rpool? (at install time, or if you 
 need to migrate your rpool to new media)

I have it on on my laptop and on a couple of other machines. I also have a
number of fresh installations (albeit in VB) where dedup is on
from the very beginning.
So far it works OK.

BTW, are there any implications of having dedup=on on rpool/dump? I
know that compression is turned off explicitly for rpool/dump.
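
A quick way to see how things stand on a given root pool - a sketch, assuming
the default dataset layout:

  zfs get -r dedup,compression rpool

which shows whether dedup was left on anywhere under rpool and lets you
confirm that rpool/dump really does have compression=off.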

-- 
Regards,
Cyril
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss