On Tue, Aug 16, 2011 at 4:04 PM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> I can see that that's equal to the formula given in the paper,
> log_B(M/4B), but I couldn't see any explanation for that formula in the
> paper. Your explanation makes sense, but where did it come from?

I didn't find it either. But it has to reserve memory for both the
subtree and the active buffers, while we're reserving memory for the
subtree in effective_cache_size and memory for the last pages of the
buffers in maintenance_work_mem.

> It seems a bit pessimistic: while it's true that a subtree can't be
> larger than 2 * maxIndexTuplesPerPage ^ levelStep, you can put a
> tighter upper bound on it. The max size of a subtree of depth n can be
> calculated as the geometric series:
>
> r^0 + r^1 + r^2 + ... + r^n = (1 - r^(n + 1)) / (1 - r)
>
> where r = maxIndexTuplesPerPage. For r = 2 those formulas are equal,
> but for a large r and small n (which is typical), the
> 2 * maxIndexTuplesPerPage ^ levelStep formula gives a value that's
> almost twice as large as the real max size of a subtree.

Thus, we can calculate:

levelstep = min(log_r(1 + effective_cache_size_in_pages * (r - 1)) - 1,
                log_r(maintenance_work_mem_in_pages - 1))

and get a more precise result. But we also need at least a very rough
estimate of the memory occupied by the node buffers hash table and the
path items.

------
With best regards,
Alexander Korotkov.
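P.S. As a sketch of the proposed calculation (variable and function names here are illustrative, not from the patch; r stands for maxIndexTuplesPerPage, and sizes are in pages):

```python
import math

def level_step(r, effective_cache_size_pages, maintenance_work_mem_pages):
    """Sketch of the tighter levelStep bound discussed above.

    A subtree of depth n contains at most
        r^0 + r^1 + ... + r^n = (r^(n+1) - 1) / (r - 1)
    pages (geometric series). Requiring that to fit in
    effective_cache_size_pages and solving for n gives
        n <= log_r(1 + effective_cache_size_pages * (r - 1)) - 1.
    """
    cache_bound = math.log(1 + effective_cache_size_pages * (r - 1), r) - 1
    mem_bound = math.log(maintenance_work_mem_pages - 1, r)
    return int(min(cache_bound, mem_bound))
```

For example, with a fan-out of r = 100, this bound allows a deeper subtree for the same cache size than the pessimistic 2 * r^levelStep estimate would.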