On Tue, Apr 14, 2015 at 7:10 PM, Greg Stark <st...@mit.edu> wrote: > The way the clock sweep algorithm is meant to be thought about is that it's > an approximate lru. Each usage count corresponds to an ntile of the lru. So > we don't know which buffer is least recently used but it must be in the set > of buffers with usage count 0 and that should be 1/nth of all the buffers.
Agreed. > In order for that property to be maintained though the usage count for some > buffer should be getting decremented every time we touch a buffer. That is, > every time we promote one buffer to the most recently moved ntile we should > be demoting some other buffer. Agreed. It's not easy to get this behavior exactly, though, because the buffer you kick out necessarily has a usage count of 0 and the one you bring in probably shouldn't. And we don't wanna have to run the clock sweep every time somebody touches a non-maximal usage count. But I think it is still true that this is, to some degree, what our algorithm is trying to approximate, and I also think it's pretty clear that our current approximation isn't that great. > The way our cache works we promote when a buffer is accessed but we only > demote when a buffer is flushed. We flush a lot less often than we touch > buffers so it's not surprising that the cache ends up full of buffers that > are all in the "most recently used" section. This isn't really correct. We promote when it's accessed, but we demote it when the clock sweep hand passes over it, which happens each time we consider it for eviction. It does not have to do with flushing dirty date to disk, and it does not happen only when the buffer is actually evicted. > Now it's complicated by the fact that we aren't promoting buffers directly > to the most recently used ntile. We're incrementing the usage count by one. > That makes it more of a "least frequently used" list rather than a lru. I > think that's a mistake but I recall some debate about that when it first > went in. Note that the discussion of 2Q, LRU(k), and perhaps others ask not only how recently the page was used, but how frequently it was used. The recency of the next-to-last access is often used as a proxy for frequency. Consider two buffers. One gets touched 5 times in a row, once a day. The other gets touched 5 times per day, at equal intervals. In general, the second buffer is a better choice to retain than the first, even if it has been touched less recently. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers