Kai Krakow <hurikha...@gmail.com> writes:

> On Sat, 20 Feb 2016 11:24:56 +0100, lee <l...@yagibdah.de> wrote:
>
>> > It uses some very clever ideas to place files into groups and into
>> > proper order - other than using file mod and access times like
>> > other defrag tools do (which even make the problem worse by doing
>> > so because this destroys locality of data even more).
>>
>> I've never heard of MyDefrag, I might try it out.  Does it make
>> updating any faster?
>
> Ah well, difficult question...  Short answer: it uses countermeasures
> against performance decreasing too fast after updates.  It does this
> by using a "gapped" on-disk file layout - leaving some gaps for
> Windows to put temporary files in.  That way, files don't become as
> far spread out as they usually do during updates.  But yes, it
> improves installation time.
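
If I understand the "gapped" layout correctly, the point is that a file
which grows during an update can land in the gap right next to its old
blocks instead of at the far end of the used area.  A toy sketch in
Python of that effect (sizes are made up, and this is not MyDefrag's
actual algorithm, just the gap idea):

# Toy model of a "packed" vs. a "gapped" on-disk layout.  Sizes are made
# up and this is not MyDefrag's actual algorithm - it only shows why
# leaving gaps keeps a growing file's blocks close together.

FILES = [("a", 50), ("b", 80), ("c", 60)]    # (name, size in blocks)

def layout(files, gap):
    """Place files back to back, leaving `gap` free blocks after each."""
    placement, pos = {}, 0
    for name, size in files:
        placement[name] = list(range(pos, pos + size))
        pos += size + gap
    return placement, pos                    # pos = first block past the data

def grow(placement, end_of_data, name, extra, gap):
    """Append `extra` blocks to a file: fill its trailing gap first,
    spill whatever doesn't fit to the end of the used area."""
    last = placement[name][-1]
    in_gap = min(gap, extra)
    placement[name] += list(range(last + 1, last + 1 + in_gap))
    placement[name] += list(range(end_of_data, end_of_data + extra - in_gap))

def spread(blocks):
    """Distance between a file's first and last block (roughly head travel)."""
    return max(blocks) - min(blocks)

for gap in (0, 20):                          # 0 = packed, 20 = gapped
    placement, end_of_data = layout(FILES, gap)
    grow(placement, end_of_data, "a", 15, gap)
    print(f"gap={gap:2d}: file 'a' spread over {spread(placement['a'])} blocks")

Packed, the grown file ends up spread over the whole used area; with a
gap of 20 blocks it stays within its own neighbourhood.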

What difference would that make with an SSD?

> Apparently it's been unmaintained for a few years, but it still does
> a good job.  It was built upon a theory by a student about how to
> properly reorganize the file layout on a spinning disk so that it
> keeps performing as well as possible.

For spinning disks, I can see how it can be beneficial.

>> > But even SSDs can use _proper_ defragmentation from time to time
>> > for increased lifetime and performance (this is due to how the FTL
>> > works and because erase blocks are huge, I won't get into detail
>> > unless someone asks).  This is why mydefrag also supports flash
>> > optimization.  It works by moving as few files as possible while
>> > coalescing free space into big chunks, which in turn relaxes
>> > pressure on the FTL and allows for more free and contiguous erase
>> > blocks, which reduces early flash chip wear.  A filled SSD with a
>> > long usage history can certainly gain back some performance from
>> > this.
>>
>> How does it improve performance?  It seems to me that, for practical
>> use, almost all of the better performance with SSDs is due to
>> reduced latency.  And IIUC, it doesn't matter for the latency where
>> data is stored on an SSD.  If its performance degrades over time
>> when data is written to it, the SSD sucks, and the manufacturer
>> should have done a better job.  Why else would I buy an SSD?  If it
>> needs to reorganise the data stored on it, the firmware should do
>> that.
>
> There are different factors which have an impact on performance, not
> just seek times (which, as you write, is the worst performance
> breaker):
>
> * management overhead: the OS has to do more housekeeping, which
>   (a) introduces more IOPS (which is the only relevant limiting
>   factor for an SSD) and (b) introduces more CPU cycles and data
>   structure locking within the OS routines while performing IO,
>   which comes down to more CPU cycles spent during IO

How would that be reduced by defragmenting an SSD?

> * erasing a block is where SSDs really suck performance-wise; plus
>   blocks are essentially read-only once written - that's how flash
>   works, a flash data block needs to be erased prior to being
>   rewritten - and that is (compared to the rest of its performance)
>   a really REALLY HUGE time factor

So let the SSD do it when it's idle.  For applications in which it
isn't idle enough, an SSD won't be the best solution.

> * erase blocks are huge compared to common filesystem block sizes
>   (erase block = 1 or 2 MB vs. the file system block usually being
>   4-64k), which happens to result in this effect:
>
>   - the OS replaces a file by writing a new one and deleting the old
>     one (common during updates), or the user deletes files
>   - the OS marks some blocks as free in its FS structures; it
>     depends on the file size and its fragmentation whether this
>     gives you a continuous area of free blocks or many small blocks
>     scattered across the disk: it results in free space fragmentation
>   - free space fragments tend to become small over time, much
>     smaller than the erase block size
>   - if your system has TRIM/discard support, it will tell the SSD
>     firmware: here, I no longer use those 4k blocks
>   - as you already figured out: those small blocks marked as free do
>     not properly align with the erase block size - so you may end up
>     with a lot of free space but essentially no complete erase block
>     marked as free

Use smaller erase blocks.
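
Playing with the numbers, the misalignment effect you describe is easy
to reproduce.  A quick Python sketch (block sizes are just assumptions
for the example: 4 KiB filesystem blocks, 2 MiB erase blocks) counts
how many erase blocks are completely free when the same amount of free
space is scattered versus coalesced:

import random

# Assumed sizes for the example: 4 KiB filesystem blocks, 2 MiB erase blocks.
FS_BLOCK = 4 * 1024
ERASE_BLOCK = 2 * 1024 * 1024
BLOCKS_PER_ERASE = ERASE_BLOCK // FS_BLOCK    # 512 FS blocks per erase block

TOTAL_FS_BLOCKS = 256 * 1024                  # a 1 GiB slice of the drive
FREE_FS_BLOCKS = 64 * 1024                    # 256 MiB of it is free

def fully_free_erase_blocks(free_set):
    """Count erase blocks in which every single FS block is free, i.e.
    blocks the firmware could erase in advance."""
    count = 0
    for eb in range(TOTAL_FS_BLOCKS // BLOCKS_PER_ERASE):
        start = eb * BLOCKS_PER_ERASE
        if all(b in free_set for b in range(start, start + BLOCKS_PER_ERASE)):
            count += 1
    return count

random.seed(0)

# Case 1: free space scattered as random 4 KiB holes (aged filesystem).
scattered = set(random.sample(range(TOTAL_FS_BLOCKS), FREE_FS_BLOCKS))

# Case 2: the same amount of free space coalesced into one contiguous
# region (what a free-space defragmenter tries to achieve).
coalesced = set(range(FREE_FS_BLOCKS))

print("free space either way:", FREE_FS_BLOCKS * FS_BLOCK // 2**20, "MiB")
print("fully free erase blocks, scattered:", fully_free_erase_blocks(scattered))
print("fully free erase blocks, coalesced:", fully_free_erase_blocks(coalesced))

With a quarter of the blocks free but scattered, not a single erase
block comes out completely free; coalesced, the same amount of free
space yields 128 of them.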

>   - this situation means the SSD firmware cannot reclaim this free
>     space to do "free block erasure" in advance, so if you write
>     another small block of data you may end up with the SSD going
>     into a direct "read/modify/erase/write" cycle instead of just
>     "read/modify/write" with the erase deferred until later - ah
>     yes, that's probably when it becomes slow
>   - what do we learn: (a) defragment free space from time to time,
>     (b) enable TRIM/discard to reclaim blocks in advance, (c) you
>     may want to over-provision your SSD: just don't ever use 10-15%
>     of your SSD, trim that space, and leave it there for the
>     firmware to shuffle erase blocks around

Use better firmware for SSDs.

>   - the latter point also increases lifetime for obvious reasons, as
>     SSDs only support a limited number of write cycles per block
>   - this "shuffling around" of blocks is called wear-levelling: the
>     firmware chooses a candidate block with the fewest write cycles
>     for doing "read/modify/write"
>
> So, SSDs actually do this "reorganization", as you call it - but
> they are limited to doing it within the bounds of erase block sizes
> - and the firmware knows nothing about the on-disk format and its
> smaller blocks, so it can do nothing to go down to a finer-grained
> reorganization.

Well, I can't help it.  I'm going to need to use 2 SSDs on a hardware
RAID controller in a RAID-1.  I expect the SSDs to just work fine.  If
they don't, then there isn't much point in spending the extra money on
them.  The system needs to boot from them.  So what choice do I have
to make these SSDs happy?

> These facts are apparently unknown to most people; that's why they
> deny that an SSD could become slow or need some specialized form of
> "defragmentation".  The usual recommendation is to do a "secure
> erase" of the disk if it becomes slow - which I consider pretty
> harmful as it rewrites ALL blocks (reducing their remaining write
> cycles/lifetime), plus it's time-consuming and could be avoided.

That isn't an option because it would be way too much hassle.

> BTW: OS makers (and FS designers) actually optimize their systems
> for this kind of reorganization by the SSD firmware.  NTFS may use
> different allocation strategies on SSDs (just a guess), and in Linux
> there is F2FS, which actually exploits this reorganization for
> increased performance and lifetime.  Ext4 and Btrfs use different
> allocation strategies and prefer spreading file data instead of free
> space (which is just the opposite of what's done for HDDs).  So,
> with a modern OS you are much less prone to the effects described
> above.

Does F2FS come with some sort of redundancy?

Reliability and booting from these SSDs are requirements, so I can't
really use btrfs because it's troublesome to boot from, and its
reliability is questionable.  Ext4 doesn't have RAID.  Using ext4 on
mdadm probably won't be any better than using the hardware RAID, so
there's no point in doing that, and I'd rather spare myself the
overhead.

After your explanation, I have to wonder even more than before what
the point of using SSDs is, considering that current hardware and
software doesn't properly use them.  OTOH, so far they do seem to
provide better performance than hard disks even when not used with
all the special precautions I don't want to have to think about.

BTW, why would anyone use SSDs for ZFS's zil or l2arc?  Does ZFS treat
SSDs properly in this application?
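
PS: To check whether I followed the wear-levelling / over-provisioning
part, here is a toy flash-translation-layer model in Python (all
numbers made up, and real firmware is obviously far more elaborate
than this).  The host keeps rewriting one hot logical block; the
firmware always writes into the least-worn spare block and sends the
freshly erased old block back to the spare pool.  The more spare
(over-provisioned) blocks there are, the more physical blocks the same
writes get spread over:

def simulate(logical_blocks, spare_blocks, rewrites):
    """Rewrite one hot logical block over and over; the firmware always
    writes into the least-worn spare and erases the old physical block."""
    total = logical_blocks + spare_blocks
    erase_count = [0] * total
    mapping = list(range(logical_blocks))           # logical -> physical
    spares = list(range(logical_blocks, total))     # over-provisioned pool

    for _ in range(rewrites):
        hot = 0                                     # host rewrites block 0
        # wear levelling: pick the spare with the fewest erase cycles
        new = min(spares, key=lambda b: erase_count[b])
        spares.remove(new)
        old = mapping[hot]
        mapping[hot] = new
        erase_count[old] += 1                       # old block erased...
        spares.append(old)                          # ...and back to the pool

    return max(erase_count)

for spare in (1, 8, 64):
    worst = simulate(logical_blocks=1000, spare_blocks=spare, rewrites=100_000)
    print(f"{spare:3d} spare blocks -> worst-case erases on one block: {worst}")

With a single spare block, two physical blocks absorb all the erases;
with 64 spares, the worst-case count drops by more than an order of
magnitude.  If that's roughly what the firmware does, I can at least
see why leaving 10-15% of the drive unused and trimmed would help.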