Hello,

>> Basically yes, I'm suggesting a mutex in the VFD struct.

> I can't see that being ok. I mean what would that thing even do? VFD
> isn't shared between processes, and if we get a smgr flush we have to
> apply it, or risk breaking other things.

Probably something is eluding my comprehension :-)

My basic assumption is that the fopen and its fd are per-process, so we only
have to deal with the one held by the checkpointer process. Is it then enough
that the checkpointer does not close the file while it is flushing things to
it?
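To make that assumption concrete, here is a minimal sketch of what I have in mind. This is illustrative only, not PostgreSQL's actual fd.c/VFD code, and all names (Vfd, vfd_flush, vfd_close) are hypothetical: the flush path holds the mutex across fsync(), so a concurrent close cannot invalidate the descriptor mid-flush.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical sketch, NOT PostgreSQL's fd.c: a per-process "VFD" whose
 * descriptor is guarded by a mutex, so closing and flushing cannot race. */
typedef struct Vfd
{
	int				fd;		/* -1 once closed */
	pthread_mutex_t	lock;
} Vfd;

/* Flush while holding the lock: the fd cannot be closed under us. */
int
vfd_flush(Vfd *v)
{
	int		rc = -1;

	pthread_mutex_lock(&v->lock);
	if (v->fd >= 0)
		rc = fsync(v->fd);
	pthread_mutex_unlock(&v->lock);
	return rc;
}

/* Close waits for any in-progress flush before invalidating the fd. */
void
vfd_close(Vfd *v)
{
	pthread_mutex_lock(&v->lock);
	if (v->fd >= 0)
	{
		close(v->fd);
		v->fd = -1;
	}
	pthread_mutex_unlock(&v->lock);
}
```

With this scheme a flush that loses the race simply finds fd == -1 and reports failure, instead of fsync-ing a stale descriptor.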

>>> * my laptop, 16 GB RAM, 840 EVO 1TB as storage. With 2GB
>>> shared_buffers. Tried checkpoint timeouts from 60 to 300s.

>> Hmmm. This is quite short.

> Indeed. I'd never do that in a production scenario myself. But
> nonetheless it showcases a problem.

I would say that it would render sorting ineffective, because all the
rewriting is done by the bgwriter or the backends. That does not totally
explain why the throughput would be worse than before, though: I would expect
it to be merely as bad as before...

> Well, you can't easily sort bgwriter/backend writes stemming from cache
> replacement. Unless your access patterns are entirely sequential the
> data in shared buffers will be laid out in a nearly entirely random
> order.  We could try sorting the data, but with any reasonable window,
> for many workloads the likelihood of actually achieving much with that
> seems low.

>> Maybe the sorting could be shared with others so that everybody uses the
>> same order?
>>
>> That would suggest having one global sorting of buffers, maybe maintained
>> by the checkpointer, which could be used by all processes that need to scan
>> the buffers (in file order), instead of scanning them in memory order.

> Uh. Cache replacement is based on an approximated LRU, you can't just
> remove that without serious regressions.

I understand that, but there is a balance to find: generating random I/Os is
very bad for performance, so the decision process must combine the LRU/LFU
heuristics with considering buffers in some file order as well.
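To make the file-order idea concrete, here is a toy sketch. The BufferTag below is a simplified stand-in for PostgreSQL's actual (relation, fork, block) tag, and the function name is hypothetical: the point is only that sorting pending writes by (relation file, block number) lets the kernel see mostly sequential I/O.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for PostgreSQL's buffer tag: which relation file a
 * buffer belongs to, and its block number within that file. */
typedef struct BufferTag
{
	unsigned	rel;	/* which relation file */
	unsigned	block;	/* block number within the file */
} BufferTag;

/* Order by file first, then block, so writes to one file are sequential. */
static int
tag_cmp(const void *a, const void *b)
{
	const BufferTag *x = a;
	const BufferTag *y = b;

	if (x->rel != y->rel)
		return x->rel < y->rel ? -1 : 1;
	if (x->block != y->block)
		return x->block < y->block ? -1 : 1;
	return 0;
}

/* Sort the to-be-written buffers into file order before issuing writes. */
void
sort_checkpoint_writes(BufferTag *tags, size_t n)
{
	qsort(tags, n, sizeof(BufferTag), tag_cmp);
}
```

If such an ordering were maintained centrally (say, by the checkpointer), every process scanning buffers could reuse it instead of recomputing it.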

>>>> Hmmm. The shorter the timeout, the less likely the sorting is to be
>>>> effective.

>>> You mean, as evidenced by the results, or is that what you'd actually
>>> expect?

>> What I would expect...

> I don't see why then? If you very quickly write lots of data the OS
> will continuously flush dirty data to the disk, in which case sorting is
> rather important?

What I have in mind is: the shorter the timeout, the fewer neighboring buffers
will have been dirtied in the interval, so the fewer nice sequential writes
the sorting will find, and the smaller its positive impact on performance...
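The intuition can be illustrated with a toy run-counting function (illustrative only, not PostgreSQL code): once the dirty blocks are sorted, the number of distinct runs of consecutive block numbers approximates how many separate seeks remain. With a short timeout fewer neighbors are dirty, so the same sort yields more, shorter runs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy illustration: given sorted block numbers of dirty buffers, count
 * runs of consecutive blocks.  Each run is one nice sequential write;
 * more runs means more seeks despite the sorting.
 */
size_t
count_runs(const unsigned *blocks, size_t n)
{
	size_t	runs = 0;
	size_t	i;

	for (i = 0; i < n; i++)
		if (i == 0 || blocks[i] != blocks[i - 1] + 1)
			runs++;
	return runs;
}
```

A long interval that dirties blocks 1..8 sorts into a single run (one sequential write), while a short interval that only dirtied blocks 1, 3, 5, 7 sorts into four runs: the sorting is still correct, it just has less to gain.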

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers