Re: [OmniOS-discuss] Fragmentation

2017-06-25 Thread Jim Klimov
On June 23, 2017 9:01:20 PM GMT+02:00, Richard Elling wrote:
>ZIL pre-allocates at the block level, so think along the lines of 12k
>or 132k.
> — richard
>
>> On Jun 23, 2017, at 11:30 AM, Günther Alka wrote:
>>
>> hello Richard
>>
>> I can follow that the Zil does not add more fragmentation to the free
>> space but is this effect relevant?
>> If a ZIL pre-allocates say 4G and the remaining fragmented poolsize
>> for regular writes is 12T
>>
>> Gea
>>
>> On 23.06.2017 at 19:30, Richard Elling wrote:
>>> A slog helps fragmentation because the space for ZIL is pre-allocated
>>> based on a prediction of how big the write will be. The pre-allocated
>>> space includes a physical-block-sized chain block for the ZIL. An 8k
>>> write can allocate 12k for the ZIL entry that is freed when the txg
>>> commits. Thus, a slog can help decrease free space fragmentation in
>>> the pool.
>>>  -- richard
>>>
>>>> On Jun 23, 2017, at 8:56 AM, Guenther Alka wrote:
>>>>
>>>> A Zil or better dedicated Slog device will not help as this is not a
>>>> write cache but a logdevice. Its only there to commit every written
>>>> datablock and to put it onto stable storage. It is read only after a
>>>> crash to redo a missing committed write.
>>>>
>>>> All writes, does not matter if sync or not, are going over the
>>>> rambased write cache (per default up to 4GB). This is flushed from
>>>> time to time as a large sequential write. Writes are fragmented then
>>>> depending on the fragmentation of the free space.
>>>>
>>>> Gea
>>>>
>>>>> To prevent it, a ZIL caching all writes (including sync ones, e.g.
>>>>> nfs) can help. Perhaps a DDR drive (or mirror of these) with battery
>>>>> and flash protection from poweroffs, so it does not wear out like
>>>>> flash would. In this case, how-ever random writes come, ZFS does not
>>>>> have to put them on media asap - so it can do larger writes later.
>>>>> This can also protect SSD arrays from excessive small writes and
>>>>> wear-out, though there a bad(ly sized) ZIL can become a bottleneck.
>>>>>
>>>>> Hope this helps,
>>>>> Jim
>>>>> --

@Gea, IIRC one can set the sync mode on a dataset so that all writes are
forced through the (dedicated) ZIL, while the data stays in memory until it
is flushed to persistent bulk storage the way normal pool writes are. This
way more consolidated writes can be sent to the disks of the pool, instead
of forcing many small (sync) allocations and deallocations when (sync)
writes are small and intensive enough, e.g. appending to log files.

For SSD pools this is also thought to ease wear, because whole pages can be
reprogrammed at once. It also compensates for small intensive random writes,
since random LBAs can live in the same page.

Jim

I hope Richard will correct me if I got something wrong ;)
--
Typos courtesy of K-9 Mail on my Redmi Android
___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss


Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Richard Elling
ZIL pre-allocates at the block level, so think along the lines of 12k or 132k.
 — richard

> On Jun 23, 2017, at 11:30 AM, Günther Alka  wrote:
> 
> hello Richard
> 
> I can follow that the Zil does not add more fragmentation to the free space 
> but is this effect relevant?
> If a ZIL pre-allocates say 4G and the remaining fragmented poolsize for 
> regular writes is 12T
> 
> Gea
> 
> On 23.06.2017 at 19:30, Richard Elling wrote:
>> A slog helps fragmentation because the space for ZIL is pre-allocated based 
>> on a prediction of
>> how big the write will be. The pre-allocated space includes a 
>> physical-block-sized chain block for the
>> ZIL. An 8k write can allocate 12k for the ZIL entry that is freed when the 
>> txg commits. Thus, a slog
>> can help decrease free space fragmentation in the pool.
>>  — richard
>> 
>> 
>>> On Jun 23, 2017, at 8:56 AM, Guenther Alka  wrote:
>>> 
>>> A Zil or better dedicated Slog device will not help as this is not a write 
>>> cache but a logdevice. Its only there to commit every written datablock and 
>>> to put it onto stable storage. It is read only after a crash to redo a 
>>> missing committed write.
>>> 
>>> All writes, does not matter if sync or not, are going over the rambased 
>>> write cache (per default up to 4GB). This is flushed from time to time as a 
>>> large sequential write. Writes are fragmented then depending on the 
>>> fragmentation of the free space.
>>> 
>>> Gea
>>> 
>>> 
>>>> To prevent it, a ZIL caching all writes (including sync ones, e.g. nfs)
>>>> can help. Perhaps a DDR drive (or mirror of these) with battery and flash
>>>> protection from poweroffs, so it does not wear out like flash would. In
>>>> this case, how-ever random writes come, ZFS does not have to put them on
>>>> media asap - so it can do larger writes later. This can also protect SSD
>>>> arrays from excessive small writes and wear-out, though there a bad(ly
>>>> sized) ZIL can become a bottleneck.
>>>>
>>>> Hope this helps,
>>>> Jim
>>>> --


Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Günther Alka

Hello Richard,

I can follow that the ZIL does not add more fragmentation to the free
space, but is this effect relevant if a ZIL pre-allocates, say, 4G and the
remaining fragmented pool space for regular writes is 12T?


Gea

On 23.06.2017 at 19:30, Richard Elling wrote:

> A slog helps fragmentation because the space for ZIL is pre-allocated based
> on a prediction of how big the write will be. The pre-allocated space
> includes a physical-block-sized chain block for the ZIL. An 8k write can
> allocate 12k for the ZIL entry that is freed when the txg commits. Thus, a
> slog can help decrease free space fragmentation in the pool.
>   -- richard
>
>> On Jun 23, 2017, at 8:56 AM, Guenther Alka wrote:
>>
>> A Zil or better dedicated Slog device will not help as this is not a write
>> cache but a logdevice. Its only there to commit every written datablock and
>> to put it onto stable storage. It is read only after a crash to redo a
>> missing committed write.
>>
>> All writes, does not matter if sync or not, are going over the rambased
>> write cache (per default up to 4GB). This is flushed from time to time as a
>> large sequential write. Writes are fragmented then depending on the
>> fragmentation of the free space.
>>
>> Gea
>>
>>> To prevent it, a ZIL caching all writes (including sync ones, e.g. nfs)
>>> can help. Perhaps a DDR drive (or mirror of these) with battery and flash
>>> protection from poweroffs, so it does not wear out like flash would. In
>>> this case, how-ever random writes come, ZFS does not have to put them on
>>> media asap - so it can do larger writes later. This can also protect SSD
>>> arrays from excessive small writes and wear-out, though there a bad(ly
>>> sized) ZIL can become a bottleneck.
>>>
>>> Hope this helps,
>>> Jim
>>> --



Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Richard Elling
A slog helps with fragmentation because the space for the ZIL is
pre-allocated based on a prediction of how big the write will be. The
pre-allocated space includes a physical-block-sized chain block for the ZIL.
An 8k write can allocate 12k for the ZIL entry, which is freed when the txg
commits. With a slog, these short-lived allocations land on the log device
rather than on the pool disks. Thus, a slog can help decrease free space
fragmentation in the pool.
 — richard
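
The figures in this thread (12k for an 8k write, and "12k or 132k" in the
follow-up) are consistent with adding one physical-block-sized chain block to
the write size. Below is a minimal Python sketch of that arithmetic. It
assumes a 4 KiB physical block size, which the thread does not state, and
that 132k corresponds to a 128k write. The function name is made up for
illustration and this is not how the ZIL code itself computes sizes.

```python
KIB = 1024


def zil_entry_alloc(write_bytes, phys_block_bytes=4 * KIB):
    """Rough size of a ZIL pre-allocation: the write plus one chain block.

    Simplified model inferred from the numbers quoted in the thread.
    The 4 KiB physical block size is an assumption.
    """
    return write_bytes + phys_block_bytes


for size_kib in (8, 128):
    alloc = zil_entry_alloc(size_kib * KIB)
    print(f"{size_kib}k write -> {alloc // KIB}k ZIL allocation")
# prints:
#   8k write -> 12k ZIL allocation
#   128k write -> 132k ZIL allocation
```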


> On Jun 23, 2017, at 8:56 AM, Guenther Alka  wrote:
> 
> A Zil or better dedicated Slog device will not help as this is not a write 
> cache but a logdevice. Its only there to commit every written datablock and 
> to put it onto stable storage. It is read only after a crash to redo a 
> missing committed write.
> 
> All writes, does not matter if sync or not, are going over the rambased write 
> cache (per default up to 4GB). This is flushed from time to time as a large 
> sequential write. Writes are fragmented then depending on the fragmentation 
> of the free space.
> 
> Gea
> 
> 
>> To prevent it, a ZIL caching all writes (including sync ones, e.g. nfs) can 
>> help. Perhaps a DDR drive (or mirror of these) with battery and flash 
>> protection from poweroffs, so it does not wear out like flash would. In this 
>> case, how-ever random writes come, ZFS does not have to put them on media 
>> asap - so it can do larger writes later. This can also protect SSD arrays 
>> from excessive small writes and wear-out, though there a bad(ly sized) ZIL 
>> can become a bottleneck.
>> 
>> Hope this helps,
>> Jim
>> --


Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Guenther Alka
A ZIL, or better a dedicated slog device, will not help, as it is not a
write cache but a log device. It is only there to commit every written
data block to stable storage. It is read only after a crash, to redo a
committed write that is missing from the pool.

All writes, sync or not, go through the RAM-based write cache (by default
up to 4GB). This is flushed from time to time as a large sequential write.
Those writes are then fragmented according to the fragmentation of the
free space.


Gea



> To prevent it, a ZIL caching all writes (including sync ones, e.g. nfs) can
> help. Perhaps a DDR drive (or mirror of these) with battery and flash
> protection from poweroffs, so it does not wear out like flash would. In this
> case, how-ever random writes come, ZFS does not have to put them on media
> asap - so it can do larger writes later. This can also protect SSD arrays
> from excessive small writes and wear-out, though there a bad(ly sized) ZIL
> can become a bottleneck.
>
> Hope this helps,
> Jim
> --




Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Jim Klimov
On June 23, 2017 4:13:52 PM GMT+02:00, Artyom Zhandarovsky wrote:
> disk errors: none
>
> CAP Alert
>
> Is there any way to decrease fragmentation of dr_tank ?
>
> zpool list (Sum of RAW disk capacity without redundancy counted)
>
> NAME     SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
> dr_slow  9.06T  77.6M  9.06T  -         0%    0%   1.00x  ONLINE  -
> dr_tank  48.9T  35.1T  13.9T  -         23%   71%  1.00x  ONLINE  -
> rpool    272G   42.1G  230G   -         10%   15%  1.00x  ONLINE  -
>
> Real Pool capacity from zfs list
>
> NAME     USED   AVAIL  MOUNTPOINT  %
> dr_slow  7.69T  1.26T  /dr_slow    14%!
> dr_tank  41.6T  6.33T  /dr_tank    13%!
> rpool    45.6G  218G   /rpool      83%

The problem with ZFS fragmentation is that at some point it becomes hard to
find free spots to write into, and to do large writes contiguously, so
performance drops suddenly and noticeably. This can impact reads as well,
especially if atime=on is left as the default.

To recover from existing fragmentation you must free up space: for example,
zfs-send the datasets to another pool, empty as much as you can on this one,
and send the data back, so that it lands in large contiguous writes.

To prevent it, a ZIL caching all writes (including sync ones, e.g. NFS) can
help. Perhaps use a DDR drive (or a mirror of these) with battery and flash
protection against power-offs, so it does not wear out like flash would. In
this case, however randomly the writes arrive, ZFS does not have to put them
on media ASAP, so it can do larger writes later. This can also protect SSD
arrays from excessive small writes and wear-out, though there a bad (or
badly sized) ZIL can become a bottleneck.

Hope this helps,
Jim
--
Typos courtesy of K-9 Mail on my Redmi Android


Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Guenther Alka

Yes, but if you increase your pool by adding a new vdev, your current data
is not auto-rebalanced. This only happens over time with new or modified
data.

If you want the best performance then, you must copy the current data over,
e.g. by renaming a filesystem, replicating it to the former name, and then
deleting the renamed copy.
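
The rename / replicate / delete cycle described above could be planned
roughly as follows. This is only a sketch: the dataset name dr_tank/data,
the "_old" suffix and the snapshot name "rebalance" are hypothetical, and
the helper just prints the commands instead of running them. The pool needs
enough free space to hold a second copy of the data while both exist.

```python
def rebalance_commands(dataset, suffix="_old", snap="rebalance"):
    """Return shell commands for a rename / replicate / destroy cycle.

    Nothing is executed here; review the commands before running them.
    """
    old = f"{dataset}{suffix}"
    return [
        f"zfs rename {dataset} {old}",                        # move data aside
        f"zfs snapshot -r {old}@{snap}",                      # snapshot to send
        f"zfs send -R {old}@{snap} | zfs receive {dataset}",  # rewrite the data
        f"zfs destroy -r {old}",                              # drop the old copy
    ]


for cmd in rebalance_commands("dr_tank/data"):
    print(cmd)
```

Verify the new copy before running the final destroy step.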


Gea


On 23.06.2017 at 17:19, Artyom Zhandarovsky wrote:

> So basically i need to add just more drives... ?
>
> 2017-06-23 18:09 GMT+03:00 Guenther Alka:
>
>> The fragmentation info does not describe the fragmentation of the
>> data on pool but the fragmentation of the free space.  A high
>> fragmentation value will result in high data fragmentation only
>> when you write or modify data.
>>
>> https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning
>>
>> So the best and only way to reduce data fragmentation is not to
>> fill up a pool say over 70-80%.
>>
>> You should also know that CopyOnWrite filesystems where a complete
>> datablock ex 128k is written newly even if you change a "house" to
>> a "mouse" in a textfile are more vulnerable to fragmentation than
>> older filesystems. This is the price for the crash resilience where
>> a power outage during a write cannot lead to a corrupted
>> filesystem like with older filesystems where it can happen that
>> the data is modified "infile" while the according metadata update
>> is not happening. ZFS over-compensates this with its advanced
>> rambased read and write caches. A "defrag tool" is not available
>> for ZFS.
>>
>> Gea
>>
>> On 23.06.2017 at 16:13, Artyom Zhandarovsky wrote:
>>
>>> Is there any way to decrease fragmentation of dr_tank ?




Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Artyom Zhandarovsky
So basically I just need to add more drives?

2017-06-23 18:09 GMT+03:00 Guenther Alka:

> The fragmentation info does not describe the fragmentation of the data on
> pool but the fragmentation of the free space.  A high fragmentation value
> will result in high data fragmentation only when you write or modify data.
>
> https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning
> So the best and only way to reduce data fragmentation is not to fill up a
> pool say over 70-80%.
>
> You should also know that CopyOnWrite filesystems where a complete
> datablock ex 128k is written newly even if you change a "house" to a
> "mouse" in a textfile are more vulnerable to fragmentation than older
> filesystems. This is the price for the crash resilience where a power outage
> during a write cannot lead to a corrupted filesystem like with older
> filesystems where it can happen that the data is modified "infile" while
> the according metadata update is not happening. ZFS over-compensates this
> with its advanced rambased read and write caches. A "defrag tool" is not
> available for ZFS.
>
> Gea
>
>
> On 23.06.2017 at 16:13, Artyom Zhandarovsky wrote:
>
>> Is there any way to decrease fragmentation of dr_tank ?
>>


Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Guenther Alka
The fragmentation figure does not describe the fragmentation of the data
on the pool but the fragmentation of the free space. A high fragmentation
value results in high data fragmentation only when you write or modify
data.

https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning

So the best and only way to limit data fragmentation is not to fill a pool
beyond, say, 70-80%.
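
As a rough illustration of checking that rule of thumb, the Python sketch
below scans zpool list style output and reports pools above a capacity
limit. The sample rows are the ones posted in this thread with the header
removed. The column positions (FRAG sixth, CAP seventh) and the 70% default
are assumptions taken from that output and from the range given above.

```python
SAMPLE = """\
dr_slow  9.06T  77.6M  9.06T  -  0%   0%   1.00x  ONLINE  -
dr_tank  48.9T  35.1T  13.9T  -  23%  71%  1.00x  ONLINE  -
rpool    272G   42.1G  230G   -  10%  15%  1.00x  ONLINE  -
"""


def pools_over(listing, cap_limit=70):
    """Return (name, frag%, cap%) for pools whose CAP exceeds cap_limit."""
    flagged = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        name, frag, cap = fields[0], fields[5], fields[6]
        cap_pct = int(cap.rstrip("%"))
        # FRAG can be "-" when the pool does not report it.
        frag_pct = int(frag.rstrip("%")) if frag != "-" else None
        if cap_pct > cap_limit:
            flagged.append((name, frag_pct, cap_pct))
    return flagged


print(pools_over(SAMPLE))  # [('dr_tank', 23, 71)]
```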


You should also know that copy-on-write filesystems, where a complete
data block (e.g. 128k) is written anew even if you only change "house" to
"mouse" in a text file, are more vulnerable to fragmentation than older
filesystems. This is the price of crash resilience: a power outage during
a write cannot lead to a corrupted filesystem, as it can with older
filesystems, where the data may be modified in place while the
corresponding metadata update never happens. ZFS over-compensates for this
with its advanced RAM-based read and write caches. A "defrag tool" is not
available for ZFS.
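
To make the copy-on-write effect concrete, here is a toy Python model. It is
deliberately simplistic and is not the ZFS allocator: the "disk" is a short
list of blocks, and every update writes the new copy after the last used
block and frees the old location. The function names are made up for this
illustration.

```python
def free_runs(blocks):
    """Lengths of contiguous runs of free (None) blocks."""
    runs, run = [], 0
    for b in blocks:
        if b is None:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    return runs


def cow_update(blocks, index):
    """Rewrite one block copy-on-write style.

    Toy allocator: the new copy is appended after the last used block,
    then the old location is freed and becomes a hole.
    """
    last_used = max(i for i, b in enumerate(blocks) if b is not None)
    blocks[last_used + 1] = blocks[index]
    blocks[index] = None


# Toy disk: 8 data blocks followed by 8 free blocks.
disk = [f"d{i}" for i in range(8)] + [None] * 8
print(free_runs(disk))  # [8]
for i in (1, 3, 5):
    cow_update(disk, i)
print(free_runs(disk))  # [1, 1, 1, 5]
```

The total amount of free space is unchanged, but it is now split into one
larger run and several single-block holes.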


Gea


On 23.06.2017 at 16:13, Artyom Zhandarovsky wrote:

> Is there any way to decrease fragmentation of dr_tank ?




Re: [OmniOS-discuss] Fragmentation

2017-06-23 Thread Chris Siebenmann
> Is there any way to decrease fragmentation of dr_tank ?
>
> zpool list (Sum of RAW disk capacity without redundancy counted)
>
> NAME     SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
> dr_slow  9.06T  77.6M  9.06T  -         0%    0%   1.00x  ONLINE  -
> dr_tank  48.9T  35.1T  13.9T  -         23%   71%  1.00x  ONLINE  -
> rpool    272G   42.1G  230G   -         10%   15%  1.00x  ONLINE  -

 Note that 'FRAG' probably doesn't mean what you expect it to mean,
and you probably don't need to worry about it.

 The ZFS FRAG percentage here is how fragmented *free space* is, not
how fragmented your data is, and the details are arcane. A pool with
low FRAG has most of its free space in large contiguous segments; a
pool with high FRAG has most of the free space broken up into small
pieces. FRAG is essentially a measure of how hard ZFS will have to
work to find space for new data.
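
The difference between low and high FRAG can be pictured with a toy Python
calculation. It is not the formula ZFS uses for FRAG (the articles linked in
this message cover that); it only shows that two pools with the same amount
of free space can differ greatly in how much of it sits in small pieces. The
128 KiB cutoff and the segment lists are invented for the example.

```python
def small_segment_share(free_segments_kib, small_kib=128):
    """Fraction of free space held in segments smaller than small_kib."""
    total = sum(free_segments_kib)
    small = sum(s for s in free_segments_kib if s < small_kib)
    return small / total if total else 0.0


low_frag = [1024 * 1024] * 4   # 4 GiB free in four 1 GiB segments
high_frag = [64] * 65536       # the same 4 GiB, all in 64 KiB pieces

print(sum(low_frag) == sum(high_frag))   # True
print(small_segment_share(low_frag))     # 0.0
print(small_segment_share(high_frag))    # 1.0
```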

 For more details, you can read:

 https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning
 https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationDetails

(These were current as of late 2015, but the details might have changed
slightly since then.)

- cks


[OmniOS-discuss] Fragmentation

2017-06-23 Thread Artyom Zhandarovsky
disk errors: none

CAP Alert

Is there any way to decrease fragmentation of dr_tank ?

zpool list (Sum of RAW disk capacity without redundancy counted)

NAME     SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
dr_slow  9.06T  77.6M  9.06T  -         0%    0%   1.00x  ONLINE  -
dr_tank  48.9T  35.1T  13.9T  -         23%   71%  1.00x  ONLINE  -
rpool    272G   42.1G  230G   -         10%   15%  1.00x  ONLINE  -

Real Pool capacity from zfs list

NAME     USED   AVAIL  MOUNTPOINT  %
dr_slow  7.69T  1.26T  /dr_slow    14%!
dr_tank  41.6T  6.33T  /dr_tank    13%!
rpool    45.6G  218G   /rpool      83%