Re: [zfs-discuss] Is ZFS internal reservation excessive?

2011-06-18 Thread Richard Elling
On Jun 17, 2011, at 4:07 PM, MasterCATZ wrote:
 
 
 OK, what is the point of the RESERVE
 
 when we cannot even delete a file when there is no space left!!!
 
 If they are going to have a RESERVE, they should make it a little smarter and
 maybe have the FS use some of that free space, so that when we do hit 0 bytes
 data can still be deleted, because there is over 50 gig free in the reserve...

Is there a quota?
 -- richard

 
 
 # zfs list
 NAME   USED  AVAIL  REFER  MOUNTPOINT
 tank  2.68T      0  2.68T  /tank
 # zpool list
 NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
 tank  3.64T  3.58T  58.2G  98%  1.00x  ONLINE  -
 
 rm -f -r downloads
 rm: downloads: No space left on device
 
 
 
 
 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is ZFS internal reservation excessive?

2011-06-17 Thread MasterCATZ


OK, what is the point of the RESERVE

when we cannot even delete a file when there is no space left!!!

If they are going to have a RESERVE, they should make it a little smarter and
maybe have the FS use some of that free space, so that when we do hit 0 bytes
data can still be deleted, because there is over 50 gig free in the reserve...


# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
tank  2.68T      0  2.68T  /tank
# zpool list
NAME   SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
tank  3.64T  3.58T  58.2G  98%  1.00x  ONLINE  -

rm -f -r downloads
rm: downloads: No space left on device
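
A quick arithmetic check on the figures above, as a sketch only: it assumes the sizes printed by zpool use binary units (T = 2^40 bytes, G = 2^30 bytes), and the helper name parse_size is made up for this example. It compares 1/64 of the reported pool SIZE with the reported FREE value. The two come out within a fraction of a percent of each other, which is consistent with the remaining free space being the 1/64 internal reservation discussed in this thread; that is an inference from the numbers, not something the command output states.

```python
# Hypothetical helper: convert a zpool/zfs style size string to bytes,
# assuming binary multipliers.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def parse_size(text: str) -> float:
    """Convert a size such as '3.64T' or '58.2G' to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


pool_size = parse_size("3.64T")   # SIZE column of zpool list
pool_free = parse_size("58.2G")   # FREE column of zpool list
one_64th = pool_size / 64         # the 1/64 figure quoted in this thread

print(f"1/64 of pool size: {one_64th / UNITS['G']:.1f}G")
print(f"reported FREE:     {pool_free / UNITS['G']:.1f}G")
```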







Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-19 Thread Jesus Cea

On 01/18/2010 09:37 PM, Peter Jeremy wrote:
 Maybe it would be useful if ZFS allowed the reserved space to be
 tuned lower but, at least for ZFS v13, the reserved space seems to
 actually be a bit less than is needed for ZFS to function reasonably.

In fact, filling a 1.5 terabyte ZFS disk (leaving the 1.5% implicit
reservation alone) reduces my write speed to half (and this is using BIG
512MB files). But it seems more an implementation artifact than a
natural law. For instance, once we have block rewrite, we can
coalesce free space so that it can be found easily and quickly.

If I understand correctly, with this implicit reservation I don't need
(anymore) to create a dummy dataset with a small (64MBytes) reservation
to be sure I can delete files, etc., when the disk is full. That was
important to have in the beginning of ZFS. Can I forget this
requirement in modern ZFS implementations?

I think ZFS doesn't reserve space for root, so you had better keep the
root (and /tmp and /var, if separate datasets) separate from
normal user-fillable datasets. Is this correct?

-- 
Jesus Cea Avion _/_/  _/_/_/_/_/_/
j...@jcea.es - http://www.jcea.es/ _/_/_/_/  _/_/_/_/  _/_/
jabber / xmpp:j...@jabber.org _/_/_/_/  _/_/_/_/_/
.  _/_/  _/_/_/_/  _/_/  _/_/
Things are not so easy  _/_/  _/_/_/_/  _/_/_/_/  _/_/
My name is Dump, Core Dump   _/_/_/_/_/_/  _/_/  _/_/
El amor es poner tu felicidad en la felicidad de otro - Leibniz


Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-19 Thread Jesus Cea

On 01/18/2010 09:45 PM, Mattias Pantzare wrote:
 No, the reservation in UFS/FFS is to keep the performance up. It will
 be harder and harder to find free space as the disk fills. It is even
 more important for ZFS to be able to find free space as all writes
 need free space.
 
 The root-thing is just a side effect.

I stand corrected.

I think ZFS doesn't allow root to eat from the implicit reservation,
so we lose the side effect. Am I right?



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-19 Thread Jesus Cea

On 01/19/2010 12:25 AM, Erik Trimble wrote:
[Block rewrite]
 Once all this gets done, I'd think we seldom would need more than a GB
 or two as reserve space...

I agree. Without block rewrite, it is best to have a percentage instead of
a hard limit. After block rewrite, it would be better to have an absolute
maximum limit, since free space will be easy to find.



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-19 Thread Jesus Cea

On 01/19/2010 01:14 AM, Richard Elling wrote:
 For example, b129
 includes a fix for CR6869229, zfs should switch to shiny new metaslabs more
 frequently.
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6869229
 I think the CR is worth reading if you have an interest in allocators and 
 performance.

Where are the docs? That link has little info.



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-19 Thread Richard Elling
On Jan 19, 2010, at 4:36 AM, Jesus Cea wrote:
 On 01/19/2010 01:14 AM, Richard Elling wrote:
 For example, b129
 includes a fix for CR6869229, zfs should switch to shiny new metaslabs more
 frequently.
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6869229
 I think the CR is worth reading if you have an interest in allocators and 
 performance.
 
 Where are the docs? That link has little info.


Jeff wrote a high-level overview of the allocation here:
http://blogs.sun.com/bonwick/entry/zfs_block_allocation

Finally, UTSL.  There are several allocators already implemented: first
fit, best fit, and dynamic block (uses first fit until a threshold is reached
where it changes to best fit). The default is dynamic block.  The source
also contains a CDF and NDF allocator, but I'm not sure if or where
they are used; they are commented as being experimental [*].
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/metaslab.c
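
As a rough illustration of the three policies named above (and not the code in metaslab.c), the sketch below models free space as a list of (offset, length) segments. The function names, the segment list and the min_free_fraction threshold are all assumptions made for this example; the real allocators use their own data structures and their own switch-over conditions.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (offset, length) of one free region


def first_fit(free: List[Segment], size: int) -> Optional[Segment]:
    """Return the lowest-offset free segment that can hold `size`."""
    for seg in sorted(free):
        if seg[1] >= size:
            return seg
    return None


def best_fit(free: List[Segment], size: int) -> Optional[Segment]:
    """Return the smallest free segment that can still hold `size`."""
    candidates = [seg for seg in free if seg[1] >= size]
    if not candidates:
        return None
    return min(candidates, key=lambda seg: (seg[1], seg[0]))


def dynamic_fit(free: List[Segment], size: int, capacity: int,
                min_free_fraction: float = 0.04) -> Optional[Segment]:
    """First fit while free space is plentiful, best fit once it runs low.

    min_free_fraction is an illustrative threshold, not a ZFS tunable.
    """
    free_fraction = sum(length for _, length in free) / capacity
    chooser = first_fit if free_fraction >= min_free_fraction else best_fit
    return chooser(free, size)


free_segments = [(0, 8), (100, 3), (200, 4)]
print(dynamic_fit(free_segments, 3, capacity=100))    # plenty free: first fit
print(dynamic_fit(free_segments, 3, capacity=1000))   # nearly full: best fit
```

First fit is cheap but tends to carve up large segments; best fit searches harder and preserves them, which matters most when little free space is left.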

For those not inclined to read source, there is already a method defined
for a metaslab to be marked as fragmented. This can be used to improve
Jeff's #2 item, metaslab selection. If you want to explore your pool, try

    zdb -m poolname
    zdb -mmm poolname

and wish you had my "spacemaps from space" code :-)
http://blogs.sun.com/relling/entry/space_maps_from_space

[*] opportunity for fame and glory, invent the perfect allocator :-)

 -- richard



[zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Jesus Cea

zpool and zfs report different free space because zfs takes into account
an internal reservation of 32MB or 1/64 of the capacity of the pool,
whichever is bigger.

So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
excessive to me...
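
The rule described above can be written down as a small sketch. It assumes binary units (MB = 2^20 bytes, TB = 2^40 bytes) and takes the "32MB or 1/64, whichever is bigger" rule exactly as stated in this message; the function name implicit_reservation is invented for the example and is not a ZFS interface.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def implicit_reservation(pool_bytes: int) -> int:
    """Return the larger of 32 MiB and 1/64 of the pool capacity, in bytes."""
    return max(32 * MIB, pool_bytes // 64)


for size in (1 * GIB, 2 * GIB, 2 * TIB):
    reserved = implicit_reservation(size)
    print(f"pool of {size / GIB:6.0f} GiB -> {reserved / MIB:6.0f} MiB reserved")
```

Under these assumptions the 32MB floor only matters for pools smaller than 2GB; above that the 1/64 term (about 1.56%) dominates and grows without bound.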



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread David Magda

On Jan 18, 2010, at 10:55, Jesus Cea wrote:

 zpool and zfs report different free space because zfs takes into account
 an internal reservation of 32MB or 1/64 of the capacity of the pool,
 whichever is bigger.

 So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
 excessive to me...


1/64 is ~1.5% according to my math.

Ext2/3 uses 5% by default for root's usage; 8% under FreeBSD for FFS.  
Solaris (10) uses a bit more nuance for its UFS:


The default is ((64 Mbytes/partition size) * 100), rounded down to  
the nearest integer and limited between 1% and 10%, inclusively.
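
Read literally, the quoted UFS default can be sketched as follows. This assumes "Mbytes" means 2^20 bytes and that the partition size is given in bytes; the function name ufs_default_minfree is made up for the example.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def ufs_default_minfree(partition_bytes: int) -> int:
    """Percent reserved: (64 MB / partition size) * 100, rounded down,
    then limited to the range 1..10 inclusive."""
    pct = (64 * MIB * 100) // partition_bytes
    return min(10, max(1, pct))


for size in (100 * MIB, 1 * GIB, 2048 * GIB):
    print(f"{size / GIB:8.2f} GiB partition -> {ufs_default_minfree(size)}%")
```

With this formula any partition of about 6.4GB or more ends up at the 1% floor, so on a 2TB disk UFS would reserve roughly 20GB compared with the 32GB computed for ZFS above.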


32 GB may seem like a lot (and it can hold a lot of stuff), but it's  
not what it used to be. :)




Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Jesus Cea

On 01/18/2010 05:11 PM, David Magda wrote:
 On Jan 18, 2010, at 10:55, Jesus Cea wrote:
 
 zpool and zfs report different free space because zfs takes into account
 an internal reservation of 32MB or 1/64 of the capacity of the pool,
 whichever is bigger.

 So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
 excessive to me...
 
 1/64 is ~1.5% according to my math.
 
 Ext2/3 uses 5% by default for root's usage; 8% under FreeBSD for FFS.
 Solaris (10) uses a bit more nuance for its UFS:

That reservation is there to prevent users from exhausting disk space to the
point that even root cannot log in and solve the problem.

 32 GB may seem like a lot (and it can hold a lot of stuff), but it's not
 what it used to be. :)

I agree that it is a lot of space, but only about 2% of a modern disk. My
point is that 32GB is a lot of space to reserve just to be able, for instance,
to delete a file when the pool is full (which needs free space because of
COW). Even more so when the minimum reservation is 32MB and ZFS can get away
with that. I think it would be a good thing to put a cap on the maximum
implicit reservation.



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Mattias Pantzare
 Ext2/3 uses 5% by default for root's usage; 8% under FreeBSD for FFS.
 Solaris (10) uses a bit more nuance for its UFS:

 That reservation is there to prevent users from exhausting disk space to the
 point that even root cannot log in and solve the problem.

No, the reservation in UFS/FFS is to keep the performance up. It will
be harder and harder to find free space as the disk fills. It is even
more important for ZFS to be able to find free space as all writes
need free space.

The root-thing is just a side effect.


Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Richard Elling
On Jan 18, 2010, at 7:55 AM, Jesus Cea wrote:
 zpool and zfs report different free space because zfs takes into account
 an internal reservation of 32MB or 1/64 of the capacity of the pool,
 whichever is bigger.

This space is also used for the ZIL.

 So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
 excessive to me...

Me too.  Before firing off an RFE, what would be a reasonable upper
bound?  A percentage?
 -- richard



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Tim Cook
On Mon, Jan 18, 2010 at 3:49 PM, Richard Elling <richard.ell...@gmail.com> wrote:

 On Jan 18, 2010, at 7:55 AM, Jesus Cea wrote:
  zpool and zfs report different free space because zfs takes into account
  an internal reservation of 32MB or 1/64 of the capacity of the pool,
  whichever is bigger.

 This space is also used for the ZIL.

  So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
  excessive to me...

 Me too.  Before firing off an RFE, what would be a reasonable upper
 bound?  A percentage?
  -- richard



Not being intimate with the guts of ZFS, it would seem to me that a
percentage would be the best choice.  I'll make the (perhaps incorrect)
assumption that as disks grow, if you have a set amount of free space (say
5GB), it becomes harder and harder to find/get to that free space, resulting
in performance tanking.  Whereas we can expect linear performance if it's a
percentage.  No?

-- 
--Tim


Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Peter Jeremy
On 2010-Jan-19 00:26:27 +0800, Jesus Cea <j...@jcea.es> wrote:
 On 01/18/2010 05:11 PM, David Magda wrote:
  Ext2/3 uses 5% by default for root's usage; 8% under FreeBSD for FFS.
  Solaris (10) uses a bit more nuance for its UFS:

 That reservation is there to prevent users from exhausting disk space to the
 point that even root cannot log in and solve the problem.

At least for UFS-derived filesystems (ie FreeBSD and Solaris), the
primary reason for the 8-10% reserved space is to minimise FS
fragmentation and improve space allocation performance:  More total
free space means it's quicker and easier to find the required
contiguous (or any) free space whilst searching a free space bitmap.
Allowing root to eat into that reserved space provided a neat
solution to resource starvation issues but was not the justification.

 I agree that it is a lot of space, but only about 2% of a modern disk. My
 point is that 32GB is a lot of space to reserve just to be able, for instance,
 to delete a file when the pool is full (which needs free space because of
 COW). Even more so when the minimum reservation is 32MB and ZFS can get away
 with that. I think it would be a good thing to put a cap on the maximum
 implicit reservation.

AFAIK, it's also necessary to ensure reasonable ZFS performance - the
"find some free space" issue becomes much more time critical with a
COW filesystem.  I recently had a 2.7TB RAIDZ1 pool get to the point
where zpool was reporting ~2% free space - and performance was
absolutely abysmal (fsync() was taking over 16 seconds).  When I
freed up a few percent more space, the performance recovered.

Maybe it would be useful if ZFS allowed the reserved space to be
tuned lower but, at least for ZFS v13, the reserved space seems to
actually be a bit less than is needed for ZFS to function reasonably.

-- 
Peter Jeremy




Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Erik Trimble

Tim Cook wrote:
 On Mon, Jan 18, 2010 at 3:49 PM, Richard Elling
 <richard.ell...@gmail.com> wrote:

  On Jan 18, 2010, at 7:55 AM, Jesus Cea wrote:
   zpool and zfs report different free space because zfs takes into account
   an internal reservation of 32MB or 1/64 of the capacity of the pool,
   whichever is bigger.

  This space is also used for the ZIL.

   So on a 2TB hard disk, the reservation would be 32 gigabytes. Seems a bit
   excessive to me...

  Me too.  Before firing off an RFE, what would be a reasonable upper
  bound?  A percentage?
   -- richard

 Not being intimate with the guts of ZFS, it would seem to me that a
 percentage would be the best choice.  I'll make the (perhaps
 incorrect) assumption that as disks grow, if you have a set amount of
 free space (say 5GB), it becomes harder and harder to find/get to that
 free space, resulting in performance tanking.  Whereas we can expect
 linear performance if it's a percentage.  No?

 --
 --Tim

Actually, as a section of the reservation is ZIL, that portion's impact
on performance is directly tied to the PERFORMANCE of the underlying
zpool, not its size.  As such, given that hard drive performance has
pretty much hit a wall, I think we should look at having a
non-size-determined limit on the size of the reserved area, regardless
of the actual size of the zpool.  The limit would still need to be
heuristic, since higher-performing zpools would need to have a larger
maximum reservation than lower-performing ones.


Given my (imperfect) understanding of the internals of ZFS, the non-ZIL
portions of the reserved space are there mostly to ensure that there is
sufficient (reasonably) contiguous space for doing COW.  Hopefully, once
BP rewrite materializes (I know, I'm treating this much too much as a
Holy Grail, here to save us from all the ZFS limitations, but
really...), we can implement defragmentation which will seriously reduce
the amount of reserved space required to keep up performance.


Once all this gets done, I'd think we seldom would need more than a GB 
or two as reserve space...


--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Daniel Carosone
On Mon, Jan 18, 2010 at 03:25:56PM -0800, Erik Trimble wrote:

 Hopefully, once BP rewrite materializes (I know, I'm treating this
 much too much as a Holy Grail, here to save us from all the ZFS
 limitations, but really...), we can implement defragmentation which
 will seriously reduce the amount of reserved space required to keep
 up performance. 

I doubt that.  I expect bp-rewrite in general, and its use for
effective defragmentation in particular, to require rather *more* free
space to be available.  Of course, you may be able to add that space by
stretching a raidz vdev to one more disk, but you also may not due to
other constraints (not enough ports, etc).

Another poster pointed out recently that you can readily add more
reserved space using an unmounted filesystem with a reservation of the
appropriate size.  This is most relevant for systems without the "stop
looking and start ganging" fix.
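
A minimal sketch of that workaround, written as a Python helper that only builds the zfs command lines rather than running them. The dataset name "reserve" and the helper reserve_commands are hypothetical, and the 64M default merely echoes the small dummy reservation mentioned elsewhere in this thread; pick a size that suits the pool. The first command creates an unmounted dataset holding the reservation, and the second releases it when the pool has filled up and space is needed for cleanup.

```python
from typing import List


def reserve_commands(pool: str, size: str = "64M",
                     name: str = "reserve") -> List[List[str]]:
    """Build (but do not run) the zfs commands for a spare-space dataset.

    `name` is a hypothetical dataset name chosen for this example.
    """
    dataset = f"{pool}/{name}"
    return [
        # Create an unmounted dataset that holds the reservation.
        ["zfs", "create", "-o", f"reservation={size}",
         "-o", "mountpoint=none", dataset],
        # Later, when the pool is full: give the space back.
        ["zfs", "set", "reservation=none", dataset],
    ]


for argv in reserve_commands("tank"):
    print(" ".join(argv))
```

To actually execute the commands, each list can be passed to subprocess.run(argv, check=True) on a system with ZFS installed and sufficient privileges.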





Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Richard Elling
On Jan 18, 2010, at 3:25 PM, Erik Trimble wrote:
 Given my (imperfect) understanding of the internals of ZFS, the non-ZIL
 portions of the reserved space are there mostly to ensure that there is
 sufficient (reasonably) contiguous space for doing COW.  Hopefully, once BP
 rewrite materializes (I know, I'm treating this much too much as a Holy Grail,
 here to save us from all the ZFS limitations, but really...), we can
 implement defragmentation which will seriously reduce the amount of reserved
 space required to keep up performance.

[Richard pauses to remember the first garbage collection presentation by
a bunch of engineers dressed as garbage men... they're probably still working
at the same job :-)]

There is still some work being done on the allocation front. From my experience
with other allocators (and garbage collectors) I'm sure the technology will
change right after it gets near perfect and we'll have to adapt again.  For
example, b129 includes a fix for CR6869229, zfs should switch to shiny new
metaslabs more frequently.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6869229
I think the CR is worth reading if you have an interest in allocators and
performance.

 Once all this gets done, I'd think we seldom would need more than a GB or two 
 as reserve space...

I hope :-)
 -- richard



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Erik Trimble

Richard Elling wrote:
 On Jan 18, 2010, at 3:25 PM, Erik Trimble wrote:
  Given my (imperfect) understanding of the internals of ZFS, the non-ZIL
  portions of the reserved space are there mostly to ensure that there is
  sufficient (reasonably) contiguous space for doing COW.  Hopefully, once BP
  rewrite materializes (I know, I'm treating this much too much as a Holy Grail,
  here to save us from all the ZFS limitations, but really...), we can implement
  defragmentation which will seriously reduce the amount of reserved space
  required to keep up performance.

 [Richard pauses to remember the first garbage collection presentation by
 a bunch of engineers dressed as garbage men... they're probably still working
 at the same job :-)]

I think I work with a couple of those guys over here in Java-land.

:-)



 There is still some work being done on the allocation front. From my experience
 with other allocators (and garbage collectors) I'm sure the technology will
 change right after it gets near perfect and we'll have to adapt again.  For
 example, b129 includes a fix for CR6869229, zfs should switch to shiny new
 metaslabs more frequently.
 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6869229
 I think the CR is worth reading if you have an interest in allocators and
 performance.
  
I'd be interested in knowing if there is any idea cross-pollination 
between memory and disk allocation/GC methods.  The GC guys over here 
are damned slick, and there's quite a bit of good academic literature on 
memory GC (and the GC in Sun JVM has undergone considerable work - there 
are now several built-in, and having such flexibility might be a good 
thing for a GC/defragger).  Maybe it's time for a guest symposium for 
the ZFS folks? 


  Once all this gets done, I'd think we seldom would need more than a GB or two
  as reserve space...

 I hope :-)
  -- richard
  

Please!



Re: [zfs-discuss] Is ZFS internal reservation excessive?

2010-01-18 Thread Erik Trimble

Daniel Carosone wrote:
 On Mon, Jan 18, 2010 at 03:25:56PM -0800, Erik Trimble wrote:
  Hopefully, once BP rewrite materializes (I know, I'm treating this
  much too much as a Holy Grail, here to save us from all the ZFS
  limitations, but really...), we can implement defragmentation which
  will seriously reduce the amount of reserved space required to keep
  up performance.

 I doubt that.  I expect bp-rewrite in general, and its use for
 effective defragmentation in particular, to require rather *more* free
 space to be available.  Of course, you may be able to add that space by
 stretching a raidz vdev to one more disk, but you also may not due to
 other constraints (not enough ports, etc).
  
Really?  I would expect that frequent defragging (i.e. defragging as the
pool is used (say on a nightly basis), not just once it gets to 90%+
utilization) would seriously reduce the amount of reserved space
required, as it keeps the pool in a much more optimal layout, and thus
has a lower overhead requirement.


Another thought is that pools which are heavily used (and thus likely to
need frequent defrag) would require more reserve than those which
contain relatively static datasets.  It would certainly be helpful to
include a tunable parameter in ZFS so that it knows whether the dataset
is likely to be very write-intensive, or is generally
write-once-read-many.   In the latter case, I would expect that a couple
hundred MB is all that a pool would ever need for reserve space,
regardless of the actual pool size.



 Another poster pointed out recently that you can readily add more
 reserved space using an unmounted filesystem with a reservation of the
 appropriate size.  This is most relevant for systems without the "stop
 looking and start ganging" fix.
  


