Re: [OmniOS-discuss] Fragmentation
On June 23, 2017 9:01:20 PM GMT+02:00, Richard Elling wrote:
> ZIL pre-allocates at the block level, so think along the lines of 12k
> or 132k.
> -- richard

@Gea, IIRC one can set sync mode on a dataset, effectively forcing all writes to go to the (dedicated) ZIL, while the data remains in memory until it is flushed to persistent bulk storage, as normal pool writes are. This way more consolidated writes can be sent to the disks of the pool, rather than forcing many small (sync) allocations and deallocations when the (sync) writes are small and intensive enough, e.g. appending to log files. For SSD pools this is thought to also ease wear, thanks to the ability to reprogram whole pages; it also compensates for small, intensive random writes, since random LBAs can live in the same page.

Jim

Hope Richard will correct me if I got something wrong ;)

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
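As a rough illustration of the "sync mode on a dataset" idea above, the sketch below builds the `zfs set sync=...` and `zfs get sync` commands and, by default, only prints them. The dataset name `dr_tank/logs` and the helper names are hypothetical, and whether `sync=always` actually reduces fragmentation is disputed within this thread, so treat it as a way to experiment rather than a recommendation.

```python
import subprocess

# Values accepted by the ZFS "sync" dataset property.
VALID_SYNC_MODES = ("standard", "always", "disabled")


def sync_commands(dataset, mode="always"):
    """Return the commands that set, then display, the sync property."""
    if mode not in VALID_SYNC_MODES:
        raise ValueError(f"unknown sync mode: {mode!r}")
    return [
        ["zfs", "set", f"sync={mode}", dataset],
        ["zfs", "get", "sync", dataset],
    ]


def apply_sync_mode(dataset, mode="always", dry_run=True):
    """Print the commands (dry run) or execute them with subprocess."""
    for cmd in sync_commands(dataset, mode):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# "dr_tank/logs" is a hypothetical dataset, used only for illustration.
apply_sync_mode("dr_tank/logs")
```

Passing `dry_run=False` would run the commands for real, which needs the appropriate privileges on a host with ZFS.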
Re: [OmniOS-discuss] Fragmentation
ZIL pre-allocates at the block level, so think along the lines of 12k or 132k.
-- richard

> On Jun 23, 2017, at 11:30 AM, Günther Alka wrote:
>
> hello Richard
>
> I can follow that the Zil does not add more fragmentation to the free space
> but is this effect relevant?
> If a ZIL pre-allocates say 4G and the remaining fragmented poolsize for
> regular writes is 12T
>
> Gea
Re: [OmniOS-discuss] Fragmentation
hello Richard

I can follow that the ZIL does not add more fragmentation to the free space, but is this effect relevant if a ZIL pre-allocates, say, 4G and the remaining fragmented pool size for regular writes is 12T?

Gea

On 23.06.2017 at 19:30, Richard Elling wrote:
> A slog helps fragmentation because the space for ZIL is pre-allocated based
> on a prediction of how big the write will be. The pre-allocated space
> includes a physical-block-sized chain block for the ZIL. An 8k write can
> allocate 12k for the ZIL entry that is freed when the txg commits. Thus, a
> slog can help decrease free space fragmentation in the pool.
> -- richard
Re: [OmniOS-discuss] Fragmentation
A slog helps with fragmentation because the space for the ZIL is pre-allocated based on a prediction of how big the write will be. The pre-allocated space includes a physical-block-sized chain block for the ZIL. An 8k write can allocate 12k for the ZIL entry, which is freed when the txg commits. Thus, a slog can help decrease free space fragmentation in the pool.
-- richard

> On Jun 23, 2017, at 8:56 AM, Guenther Alka wrote:
>
> A ZIL, or better a dedicated slog device, will not help, as this is not a
> write cache but a log device. It is only there to commit every written
> data block and to put it onto stable storage. It is read only after a
> crash, to redo a missing committed write.
>
> All writes, no matter whether sync or not, go through the RAM-based write
> cache (per default up to 4GB). This is flushed from time to time as a large
> sequential write. Writes are then fragmented depending on the fragmentation
> of the free space.
>
> Gea
>
>> To prevent it, a ZIL caching all writes (including sync ones, e.g. nfs) can
>> help. Perhaps a DDR drive (or mirror of these) with battery and flash
>> protection from poweroffs, so it does not wear out like flash would. In this
>> case, however random writes come, ZFS does not have to put them on media
>> asap - so it can do larger writes later. This can also protect SSD arrays
>> from excessive small writes and wear-out, though there a bad(ly sized) ZIL
>> can become a bottleneck.
>>
>> Hope this helps,
>> Jim
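The 8k to 12k figure above, and the "12k or 132k" mentioned elsewhere in this thread, can be reproduced with simple arithmetic. The sketch assumes the physical-block-sized chain block is 4k (a pool with 4 KiB physical blocks); that size and the helper name are assumptions made for illustration, not something stated in the thread.

```python
# Sketch only, sizes in KiB. The 4k chain block is an ASSUMPTION
# (4 KiB physical blocks); the thread only says "physical-block-sized".
ASSUMED_CHAIN_BLOCK_K = 4


def zil_alloc_k(write_k, chain_block_k=ASSUMED_CHAIN_BLOCK_K):
    """Space pre-allocated for one ZIL entry: payload plus one chain block."""
    return write_k + chain_block_k


for write_k in (8, 128):
    print(f"{write_k}k write -> {zil_alloc_k(write_k)}k ZIL allocation")
```

Under that assumption an 8k write yields 12k and a 128k write yields 132k, which matches the numbers quoted in the thread.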
Re: [OmniOS-discuss] Fragmentation
On June 23, 2017 4:13:52 PM GMT+02:00, Artyom Zhandarovsky wrote:
> disk errors: none
>
> CAP Alert
>
> Is there any way to decrease fragmentation of dr_tank ?
>
> zpool list (Sum of RAW disk capacity without redundancy counted)
>
> NAME      SIZE   ALLOC  FREE   EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
> dr_slow   9.06T  77.6M  9.06T  -         0%    0%   1.00x  ONLINE  -
> dr_tank   48.9T  35.1T  13.9T  -         23%   71%  1.00x  ONLINE  -
> rpool     272G   42.1G  230G   -         10%   15%  1.00x  ONLINE  -
>
> Real Pool capacity from zfs list
>
> NAME      USED   AVAIL  MOUNTPOINT  %
> dr_slow   7.69T  1.26T  /dr_slow    14%!
> dr_tank   41.6T  6.33T  /dr_tank    13%!
> rpool     45.6G  218G   /rpool      83%

The issue with ZFS fragmentation is that at some point it becomes hard to find free spots to write into, as well as to do large writes contiguously, so performance suddenly and noticeably drops. This can impact reads as well, especially if atime=on is left as default.

To recover from existing fragmentation you must free up space: perhaps zfs-send datasets to another pool, empty as much as you can on this one, and send the data back, so that it lands in large contiguous writes.

To prevent it, a ZIL caching all writes (including sync ones, e.g. NFS) can help. Perhaps a DDR drive (or a mirror of these) with battery and flash protection from power-offs, so it does not wear out like flash would. In this case, however randomly the writes arrive, ZFS does not have to put them on media ASAP, so it can do larger writes later. This can also protect SSD arrays from excessive small writes and wear-out, though there a bad(ly sized) ZIL can become a bottleneck.

Hope this helps,
Jim
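To make the point about contiguous writes concrete, here is a toy model of how fragmented free space splits a large write into many pieces. It is a deliberately simplified first-fit sketch and not ZFS's actual allocator; the function name and the extent sizes are invented for illustration.

```python
def first_fit_split(free_extents, write_size):
    """Toy model: consume free extents in order, return the pieces written.

    free_extents is a list of free extent sizes (arbitrary units).
    This is NOT how ZFS allocates; it only illustrates the effect.
    """
    pieces = []
    remaining = write_size
    for size in free_extents:
        if remaining == 0:
            break
        take = min(size, remaining)
        pieces.append(take)
        remaining -= take
    if remaining:
        raise ValueError("not enough free space")
    return pieces


contiguous = [1024]        # one large free extent
fragmented = [64] * 16     # same total free space, in small extents

print(len(first_fit_split(contiguous, 512)))   # pieces needed: 1
print(len(first_fit_split(fragmented, 512)))   # pieces needed: 8
```

Both pools have the same amount of free space, but the fragmented one needs eight separate pieces for the same write, which is the effect a send, empty and receive cycle is meant to undo.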