On 01/19/2016 10:54 PM, Peter Geoghegan wrote:
On Tue, Jan 19, 2016 at 9:37 AM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
Our transcript seems to predate that bugfix commit, so I assume we need
to apply this to our copy too.  Sadly, Hideaki-san's commit message isn't
very descriptive.

Fortunately, the function mergeHyperLogLog() in our hyperloglog.c
currently has no callers.

I don't really know how HyperLogLog works, so maybe we can't or
shouldn't apply the patch because of how the hash stuff is used.

I think that Hideaki's confusion comes from whether this HLL state is
a sparse or a dense/full representation. The distinction is explained
in the README for postgresql-hll:

https://github.com/aggregateknowledge/postgresql-hll

postgresql-hll has no support for merging HLLs that are sparse:

https://github.com/aggregateknowledge/postgresql-hll/blob/master/hll.c#L1888

Can't we just tear mergeHyperLogLog() out?

FWIW I've been considering adding an APPROX_COUNT_DISTINCT() aggregate, similar to what other databases (e.g. Vertica) have built in. Now, that would not require the merge either, but we're currently baking in support for 'combine' functions, and that's exactly what merge does.

So why not just fix the bug?
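
To make the 'combine' point concrete, here is a minimal sketch of what merging two dense HLL states amounts to: take the larger value of each pair of registers. The DemoHLL struct and demo_hll_merge() are hypothetical names made up for illustration; this is not the code in our hyperloglog.c, and it assumes both states are dense and were built with the same register count and hash function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dense HLL state, for illustration only. */
typedef struct DemoHLL
{
    uint8_t  register_width;   /* hash bits used to select a register */
    size_t   n_registers;      /* assumed to be 1 << register_width */
    uint8_t *registers;        /* highest rank seen so far, per register */
} DemoHLL;

/*
 * Merge src into dst by keeping the larger rank in each register.
 * Returns false, leaving dst untouched, if the register counts differ,
 * since such states cannot be combined register by register.
 */
static bool
demo_hll_merge(DemoHLL *dst, const DemoHLL *src)
{
    size_t i;

    if (dst->n_registers != src->n_registers)
        return false;

    for (i = 0; i < dst->n_registers; i++)
    {
        if (src->registers[i] > dst->registers[i])
            dst->registers[i] = src->registers[i];
    }
    return true;
}
```

Because a per-register maximum is commutative and associative, partial states can be merged in any order, which is the property a combine function would need.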

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

