On 15 Apr 2015 15:43, "Simon Riggs" <si...@2ndquadrant.com> wrote:
>
> It all depends upon who is being selfish. Why is a user "selfish" for
> not wanting to clean every single block they scan, when the people
> that made the mess do nothing and go faster 10 minutes from now?
> Randomly and massively penalising large SELECTs makes no sense. Some
> cleanup is OK, with reasonable limits, which is why that is proposed.

I don't think it's productive to think of a query as a different actor with
only an interest in its own performance and no interest in overall system
performance.

From a holistic point of view, the question is how many times a given HOT
chain will need to be followed before it's pruned, or, to put it another
way, how expensive creating a HOT chain is. Does it cause a single prune? A
fixed number of chain readers followed by a prune? Does the amount of work
depend on the workload, or is it consistent?

My intuition is that a fixed cutoff like "five pages" is dangerous because
if you update many pages there's no limit to the number of times they'll be
read before they're all pruned. The steady state could easily be that every
query has to follow HOT chains forever.

My intuition, again, is that what we need is a percentage, such as "do 10
prunes, then ignore the next 1000 clean pages with HOT chains". That
guarantees that after 100 SELECTs the HOT chains will all be pruned, but
each SELECT only prunes about 1% of the clean pages it sees.
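Roughly what I have in mind, as a minimal sketch and not actual PostgreSQL
code (the names ScanPruneState, scan_should_prune and page_is_prunable are
all hypothetical), is a per-scan counter that prunes the first few prunable
pages a scan touches, then skips a large number, and repeats:

    /* Sketch only: a per-scan throttle that allows PRUNE_BUDGET prunes,
     * then skips the next SKIP_BUDGET prunable pages, and repeats. */

    #include <stdbool.h>

    #define PRUNE_BUDGET  10      /* prunes allowed per cycle */
    #define SKIP_BUDGET   1000    /* prunable pages then ignored per cycle */

    typedef struct
    {
        int pruned;   /* prunable pages pruned in the current cycle */
        int skipped;  /* prunable pages skipped in the current cycle */
    } ScanPruneState;

    /*
     * Decide whether this scan should prune the page it is about to read.
     * About PRUNE_BUDGET / (PRUNE_BUDGET + SKIP_BUDGET) of the prunable
     * pages a scan touches get cleaned, i.e. roughly 1% with the numbers
     * above.
     */
    static bool
    scan_should_prune(ScanPruneState *state, bool page_is_prunable)
    {
        if (!page_is_prunable)
            return false;

        if (state->pruned < PRUNE_BUDGET)
        {
            state->pruned++;
            return true;
        }

        if (++state->skipped >= SKIP_BUDGET)
        {
            /* start a new cycle for the next prunable page */
            state->pruned = 0;
            state->skipped = 0;
        }
        return false;
    }

The point of the counter is only to bound each individual scan's share of
the cleanup while still guaranteeing that repeated scans converge on fully
pruned pages.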
