Tom Lane wrote:
Heikki Linnakangas <[EMAIL PROTECTED]> writes:
                      imola-336  imola-337  imola-340
writes by checkpoint      38302      30410      39529
writes by bgwriter       350113    2205782    1418672
writes by backends      1834333     265755     787633
writes total            2222748    2501947    2245834
allocations             2683170    2657896    2699974
It looks like Tom's idea is not a winner; it leads to more writes than
necessary.
The incremental number of writes is not that large; only about 10% more.
The interesting thing is that those "extra" writes must represent
buffers that were re-touched after their usage_count went to zero, but
before they could be recycled by the clock sweep. While you'd certainly
expect some of that, I'm surprised it is as much as 10%. Maybe we need
to play with the buffer allocation strategy some more.
The very small difference in NOTPM among the three runs says that either
this whole area is unimportant, or DBT2 isn't a good test case for it;
or maybe that there's something wrong with the patches?
On imola-340, there's still a significant amount of backend writes. I'm
still not sure what we should be aiming at. Is 0 backend writes our goal?
Well, the lower the better, but not at the cost of a very large increase
in total writes.
Imola-340 was with a patch along the lines of Itagaki's original patch,
ensuring that there are as many clean pages in front of the clock head
as were consumed by backends since the last bgwriter iteration.
This seems intuitively wrong, since in the presence of bursty request
behavior it'll constantly be getting caught short of buffers. I think
you need a safety margin and a moving-average decay factor. Possibly
something like
    buffers_to_clean = Max(buffers_used * 1.1,
                           buffers_to_clean * 0.999);
where buffers_used is the current observation of demand. This would
give us a safety margin such that buffers_to_clean is not less than
the largest demand observed in the last 100 iterations (0.999 ^ 100
is about 0.90, cancelling out the initial 10% safety margin), and it
takes quite a while for the memory of a demand spike to be forgotten
completely.
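
The rule above can be tried out in isolation. What follows is a minimal
sketch, not code from any of the patches under discussion: the function
name next_buffers_to_clean and the two constant names are made up for
illustration, and the running target is assumed to be kept in floating
point.

```c
#include <assert.h>

/* Hypothetical constants, taken from the numbers in the formula above. */
#define SAFETY_MARGIN 1.1    /* clean 10% more than the observed demand */
#define DECAY_FACTOR  0.999  /* per-iteration decay of the remembered target */

/*
 * One bgwriter iteration of the proposed rule:
 *   buffers_to_clean = Max(buffers_used * 1.1, buffers_to_clean * 0.999)
 * prev_target is the previous buffers_to_clean; buffers_used is the
 * demand observed since the last iteration.
 */
double
next_buffers_to_clean(double prev_target, double buffers_used)
{
    double from_demand = buffers_used * SAFETY_MARGIN;
    double from_memory = prev_target * DECAY_FACTOR;

    assert(prev_target >= 0.0 && buffers_used >= 0.0);
    return (from_demand > from_memory) ? from_demand : from_memory;
}
```

Feeding it a single spike of 100 buffers followed by 100 idle iterations
leaves the target at roughly 99.5, which is the "0.90 times 110" figure
from the paragraph above.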
That would be overly aggressive on a workload that's steady on average,
but consists of small bursts. Like this: 0 0 0 0 100 0 0 0 0 100 0 0 0 0
100. You'd end up writing ~100 pages on every bgwriter round, but you
only need an average of 20 pages per round. That'd be effectively the
same as keeping all buffers with usage_count=0 clean.
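
The effect described above is easy to reproduce with a small simulation.
This is only a sketch under assumptions: it starts from a target of
zero, reuses the 1.1 and 0.999 constants from the proposed formula, and
the helper names (next_target, simulate_rounds) are invented here rather
than taken from any patch.

```c
#include <assert.h>
#include <stddef.h>

/* Proposed rule, restated: Max(buffers_used * 1.1, buffers_to_clean * 0.999). */
double
next_target(double prev_target, double buffers_used)
{
    double from_demand = buffers_used * 1.1;
    double from_memory = prev_target * 0.999;

    return (from_demand > from_memory) ? from_demand : from_memory;
}

/*
 * Feed demand[0..n-1] (buffers allocated per bgwriter round) through the
 * rule, starting from a target of zero.  The target after each round is
 * stored in targets[]; the return value is the average demand per round.
 */
double
simulate_rounds(const int *demand, size_t n, double *targets)
{
    double target = 0.0;
    double total_demand = 0.0;

    assert(n == 0 || (demand != NULL && targets != NULL));

    for (size_t i = 0; i < n; i++)
    {
        target = next_target(target, (double) demand[i]);
        targets[i] = target;
        total_demand += demand[i];
    }
    return (n > 0) ? total_demand / (double) n : 0.0;
}
```

With the pattern 0 0 0 0 100 repeated, the average demand comes out at 20
per round, but once the first burst has been seen the target never drops
below about 109.5 before the next burst pushes it back to 110.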
BTW, I believe that kind of workload is actually very common. That's
what you get if one transaction causes say 10-100 buffer allocations,
and you execute one such transaction every few seconds.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com