On Sun, Nov 20, 2016 at 11:54 AM, John Gorman <johngorm...@gmail.com> wrote:
> I reviewed the dht-v2.patch and found that it is in excellent shape.
Thanks for reviewing! And sorry for the late reply.

> Benchmarking shows that this performs somewhat faster than
> dynahash which is surprising because it is doing DSA address
> translations on the fly.

That is indeed surprising and I think warrants more investigation.

> One area where this could run faster is to reduce the amount of
> time when the hash is in the RESIZE_IN_PROGRESS state.
>
> When a hash partition reaches 75% full, a resize begins by allocating
> an empty new hash bucket array with double the number of buckets.
> This begins the RESIZE_IN_PROGRESS state, during which there are
> both an old and a new hash bucket array.
>
> During the RESIZE state all operations such as find, insert,
> delete and iterate run slower due to having to look items up in
> two hash bucket arrays.
>
> Whenever a new item is inserted into the hash while it is in the
> RESIZE state, the current code takes the time to move one item
> from the old hash partition to the new hash partition. During
> insertion an exclusive lock is already held on the partition, so
> this is an efficient time to do the resize cleanup work.
>
> However, since we only clean up one item per insertion, we do not
> complete the cleanup and free the old hash bucket array until we
> are due to start yet another resize operation.
>
> So we are effectively always in the RESIZE state, which slows down
> operations and wastes some space.

Right, that is the case in a workload that inserts a bunch of data but
then becomes read-only forever.  A workload that constantly does a mix
of writing and reading should settle down at a reasonable size and
stop resizing.

> Here are the timings for insert and find in nanoseconds on a
> MacBook Pro. The insert_resize figure includes the resize work to
> move one item from the old to the new hash bucket array.
> insert_resize: 945.98
> insert_clean:  450.39
> find_resize:   348.90
> find_clean:    293.16
>
> The attached dht-v2-resize-cleanup.patch applies on top of the
> dht-v2.patch and speeds up the resize cleanup process so that
> hashes spend most of their time in the clean state.
>
> It does this by cleaning up more than one old item during
> inserts. This patch cleans up three old items.
>
> There is also the case where a hash receives all of its inserts
> at the beginning and the rest of the workload is all finds. This
> patch also cleans up two items for each find.
>
> The downside of doing extra cleanup early is some additional
> latency. Here are the experimental figures and the approximate
> formulas for different numbers of cleanups for inserts. Note that
> we cannot go lower than one cleanup per insert.
>
> Resize work in inserts: 729.79 insert + 216.19 * cleanups
> 1   945.98
> 2  1178.13
> 3  1388.73
> 4  1617.04
> 5  1796.91
>
> Here are the timings for different numbers of cleanups for finds.
> Note that since we do not hold an exclusive partition lock on a find,
> there is the additional overhead of taking that lock.
>
> Resize work in finds: 348.90 dirty_find + 233.45 lock + 275.78 * cleanups
> 0   348.90
> 1   872.88
> 2  1138.98
> 3  1448.94
> 4  1665.31
> 5  1975.91
>
> The new effective latencies during the resize vs. the clean states:
>
> #define DHT_INSERT_CLEANS 3
> #define DHT_SEARCH_CLEANS 2
>
> insert_resize: 1388.73  -- 3 cleaned
> insert_clean:   450.39
> find_resize:   1138.98  -- 2 cleaned
> find_clean:     293.16
>
> The overall performance will be faster due to not having to examine
> more than one hash bucket array most of the time.

Thanks for doing all these experiments.  Yeah, I think it makes sense
to merge this change to improve that case.  Since Dilip Kumar's
Parallel Bitmap Heap Scan project is no longer using this, I think I
should park it here unless/until another potential use case pops up.
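For anyone following along, here is a toy, single-threaded C sketch of
the incremental cleanup scheme described above.  There is no DSA, no
partitioning and no locking here, and every name (toyht_*,
INSERT_CLEANS, FIND_CLEANS) is illustrative rather than the patch's
actual API; it only shows the shape of the idea: while a resize is in
progress, each insert migrates up to three items and each find up to
two items from the old bucket array, so the table leaves the RESIZE
state quickly instead of lingering in it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INSERT_CLEANS 3         /* old items migrated per insert while resizing */
#define FIND_CLEANS   2         /* old items migrated per find while resizing */

typedef struct item
{
    int         key;
    int         value;
    struct item *next;
} item;

typedef struct
{
    item      **buckets;        /* current bucket array */
    item      **old_buckets;    /* non-NULL => RESIZE_IN_PROGRESS */
    int         nbuckets;
    int         old_nbuckets;
    int         nitems;
    int         clean_bucket;   /* next old bucket to drain */
} toyht;

static void
toyht_init(toyht *ht, int nbuckets)
{
    memset(ht, 0, sizeof(*ht));
    ht->nbuckets = nbuckets;
    ht->buckets = calloc(nbuckets, sizeof(item *));
}

/* Move up to 'budget' items from the old bucket array into the new one. */
static void
clean_some(toyht *ht, int budget)
{
    while (budget > 0 && ht->old_buckets != NULL)
    {
        item       *it = ht->old_buckets[ht->clean_bucket];

        if (it != NULL)
        {
            ht->old_buckets[ht->clean_bucket] = it->next;
            it->next = ht->buckets[it->key % ht->nbuckets];
            ht->buckets[it->key % ht->nbuckets] = it;
            budget--;
        }
        else if (++ht->clean_bucket == ht->old_nbuckets)
        {
            /* Old array fully drained: leave the RESIZE state. */
            free(ht->old_buckets);
            ht->old_buckets = NULL;
        }
    }
}

static void
toyht_insert(toyht *ht, int key, int value)
{
    item       *it = malloc(sizeof(item));

    clean_some(ht, INSERT_CLEANS);      /* an insert already holds the lock */
    it->key = key;
    it->value = value;
    it->next = ht->buckets[key % ht->nbuckets];
    ht->buckets[key % ht->nbuckets] = it;
    if (++ht->nitems * 4 > ht->nbuckets * 3 && ht->old_buckets == NULL)
    {
        /* 75% full: start a resize with double the number of buckets. */
        ht->old_buckets = ht->buckets;
        ht->old_nbuckets = ht->nbuckets;
        ht->clean_bucket = 0;
        ht->nbuckets *= 2;
        ht->buckets = calloc(ht->nbuckets, sizeof(item *));
    }
}

static item *
toyht_find(toyht *ht, int key)
{
    item       *it;

    if (ht->old_buckets != NULL)
        clean_some(ht, FIND_CLEANS);    /* real code must take the lock here */
    for (it = ht->buckets[key % ht->nbuckets]; it != NULL; it = it->next)
        if (it->key == key)
            return it;
    if (ht->old_buckets != NULL)        /* still resizing: probe old array too */
        for (it = ht->old_buckets[key % ht->old_nbuckets]; it != NULL; it = it->next)
            if (it->key == key)
                return it;
    return NULL;
}
```

The insert-heavy-then-read-only case falls out naturally: after the
inserts stop partway through a resize, a handful of finds drain the
remaining old items, and from then on every find probes only one
bucket array.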
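The cost models above are easy to evaluate directly.  A couple of
details worth making explicit: the insert model only applies for one
or more cleanups, and at zero cleanups a find skips the exclusive
lock, so only the dirty_find term remains.  The function names below
are mine, the constants are the quoted fits, and since the formulas
are approximate they reproduce the measured rows only to within a few
tens of nanoseconds.

```c
/* Evaluate the approximate resize-latency models quoted above (in ns). */
#include <assert.h>

static double
insert_resize_ns(int cleanups)          /* valid for cleanups >= 1 */
{
    return 729.79 + 216.19 * cleanups;
}

static double
find_resize_ns(int cleanups)
{
    if (cleanups == 0)
        return 348.90;                  /* no cleanups: no exclusive lock */
    return 348.90 + 233.45 + 275.78 * cleanups;
}
```

With the patch's DHT_INSERT_CLEANS = 3 and DHT_SEARCH_CLEANS = 2, the
models predict roughly 1378 ns per insert and 1134 ns per find during
a resize, in the same ballpark as the measured 1388.73 and 1138.98.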
Some interesting candidates have been mentioned already, and I'm fairly
sure there are other uses too, but I don't expect anyone to be
interested in committing this patch until something concrete shows up,
so I'll go work on other things until then.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers