On Fri, Nov 10, 2023 at 11:31:37AM -0500, Kent Overstreet wrote:
> Large stack of changes - performance optimizations and improvements to
> related code.
>
> The first big one, killing journal pre-reservations, is already merged:
> besides being a performance improvement it turned out to fix a few tests
> that were sporadically failing with "journal stuck" - nice. The idea
> with that patch is that instead of reserving space in the journal ahead
> of time (for journal reservations that must succeed or risk deadlock),
> we instead rely on watermarks and tracking which operations are
> necessary for reclaim - when we're low on space in the journal and
> flushing things for reclaim, we're able to flush things in order, or
> track when we're not flushing things in order and allow it to fail.
>
> Most of the other patches are prep work leading up to the big btree
> write buffer rewrite, which changes the btree write buffer to pull keys
> from the journal, instead of having the transaction commit path directly
> add keys to the btree write buffer.
>
> btree write buffer performance is pretty important for multithreaded
> update workloads - it's an Amdahl's law situation: flushing it is single
> threaded. More performance optimizations there would still be
> worthwhile, and I've been looking at it because I'm sketching out a
> rewrite of disk space accounting (which will allow for per-snapshot ID
> accounting) that will lean heavily on the btree write buffer.
Forgot to mention performance numbers -

fio, 4k random write iops, no data IO mode, checksums off (this is my
benchmark setup for just testing core bcachefs code):

          1 job    8 jobs
before    169k     645k
after     207k     798k
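
As a rough sanity check on those numbers, here is a small Python sketch that computes the relative gain at each job count and the serial fraction that Amdahl's law would imply from the 1 job to 8 jobs scaling. The function names are made up for illustration, and attributing all of the lost scaling to serialized work (such as the single-threaded write buffer flush) is an assumption, not something measured here:

```python
# Throughput figures (in thousands of IOPS) copied from the table above.
iops = {
    "before": {1: 169, 8: 645},
    "after": {1: 207, 8: 798},
}


def improvement(jobs):
    """Relative gain of 'after' over 'before' at a given job count."""
    return iops["after"][jobs] / iops["before"][jobs] - 1.0


def scaling(run, jobs=8):
    """Speedup going from 1 job to `jobs` jobs within one run."""
    return iops[run][jobs] / iops[run][1]


def serial_fraction(run, jobs=8):
    """Serial fraction s implied by Amdahl's law, S = 1 / (s + (1 - s) / N),
    solved for s. ASSUMPTION: all lost scaling is due to serialized work."""
    s = scaling(run, jobs)
    return (jobs / s - 1.0) / (jobs - 1.0)


for jobs in (1, 8):
    print(f"{jobs} job(s): {improvement(jobs):+.1%}")
for run in ("before", "after"):
    print(f"{run}: {scaling(run):.2f}x at 8 jobs, "
          f"implied serial fraction {serial_fraction(run):.3f}")
```

Under those assumptions the gain works out to roughly 22-24% at both job counts, while the implied serial fraction barely moves, which fits the point that more optimization of the single-threaded flush path would still be worthwhile.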