Re: [PATCH] ltree hash functions
jian he writes: > I also made the following changes: > from > uint64 levelHash = hash_any_extended((unsigned char *) al->name, al->len, > seed); > uint32 levelHash = hash_any((unsigned char *) al->name, al->len); > to > uint64 levelHash = DatumGetUInt64(hash_any_extended((unsigned char *) > al->name, al->len, seed)); > uint32 levelHash = DatumGetUInt32(hash_any((unsigned char *) al->name, > al->len)); Yeah, that'd fail on 32-bit machines. Pushed v5 with some minor cosmetic tweaking. regards, tom lane
Re: [PATCH] ltree hash functions
On Thu, Feb 1, 2024 at 11:11 PM vignesh C wrote: > > On Wed, 6 Dec 2023 at 04:08, Tommy Pavlicek wrote: > > > > Thanks. > > > > I've attached the latest version that updates the naming in line with > > the convention. > > > > On Mon, Dec 4, 2023 at 12:46 AM jian he wrote: > > > > > > On Fri, Dec 1, 2023 at 8:44 AM Tommy Pavlicek > > > wrote: > > > > > > > > > > > > Patch updated for those comments (and a touch of cleanup in the tests) > > > > attached. > > > > > > it would be a better name as hash_ltree than ltree_hash, similar logic > > > applies to ltree_hash_extended. > > > that would be the convention. see: > > > https://stackoverflow.com/a/69650940/15603477 > > > > > > > > > Other than that, it looks good. > > CFBot shows one of the test is failing as in [1]: > diff -U3 /tmp/cirrus-ci-build/contrib/ltree/expected/ltree.out > /tmp/cirrus-ci-build/build-32/testrun/ltree/regress/results/ltree.out > --- /tmp/cirrus-ci-build/contrib/ltree/expected/ltree.out 2024-01-31 > 15:18:42.893039599 + > +++ /tmp/cirrus-ci-build/build-32/testrun/ltree/regress/results/ltree.out > 2024-01-31 15:23:25.309028749 + > @@ -1442,9 +1442,14 @@ > ('0.1.2'::ltree), ('0'::ltree), ('0_asd.1_ASD'::ltree)) x(v) > WHERE hash_ltree(v)::bit(32) != hash_ltree_extended(v, 0)::bit(32) > OR hash_ltree(v)::bit(32) = hash_ltree_extended(v, 1)::bit(32); > - value | standard | extended0 | extended1 > +--+---+--- > -(0 rows) > +value| standard | > extended0 |extended1 > +-+--+--+-- > + 0 | 10001010100010011011 | > 0101100001000100011001011011 | 010110000100010001101001 > + 0.1 | 101010001010110001001110 | > 0000100010001100110111010101 | 0000100010001101100011010101 > + 0.1.2 | 0111010101110100 | > 1010111001110101100011010111 | 101011100111011100100011 > + 0 | 10001010100010011011 | > 0101100001000100011001011011 | 010110000100010001101001 > + 0_asd.1_ASD | 011000101000101001001101 | > 0000100010001100110111010101 | 0000100010001101100011010101 > +(5 rows) > > Please post an updated version for the same. > > [1] - > https://api.cirrus-ci.com/v1/artifact/task/5572544858685440/testrun/build-32/testrun/ltree/regress/regression.diffs > It only fails on Linux - Debian Bullseye - Meson. I fixed the white space, named it v5. I also made the following changes: from uint64 levelHash = hash_any_extended((unsigned char *) al->name, al->len, seed); uint32 levelHash = hash_any((unsigned char *) al->name, al->len); to uint64 levelHash = DatumGetUInt64(hash_any_extended((unsigned char *) al->name, al->len, seed)); uint32 levelHash = DatumGetUInt32(hash_any((unsigned char *) al->name, al->len)); (these two line live in different functions) I have some problems testing it locally, so I post the patch. From ffc0e3daea5e490d8b9e6b7a0345f780d88b5624 Mon Sep 17 00:00:00 2001 From: jian he Date: Fri, 2 Feb 2024 16:03:56 +0800 Subject: [PATCH v5 1/1] add hash functions for the ltree extension. --- contrib/ltree/Makefile| 2 +- contrib/ltree/expected/ltree.out | 65 + contrib/ltree/ltree--1.2--1.3.sql | 23 +++ contrib/ltree/ltree.control | 2 +- contrib/ltree/ltree_op.c | 68 +++ contrib/ltree/ltreetest.sql | 1 + contrib/ltree/meson.build | 1 + contrib/ltree/sql/ltree.sql | 43 +++ doc/src/sgml/ltree.sgml | 8 9 files changed, 211 insertions(+), 2 deletions(-) create mode 100644 contrib/ltree/ltree--1.2--1.3.sql diff --git a/contrib/ltree/Makefile b/contrib/ltree/Makefile index 770769a7..f50f2c94 100644 --- a/contrib/ltree/Makefile +++ b/contrib/ltree/Makefile @@ -14,7 +14,7 @@ OBJS = \ ltxtquery_op.o EXTENSION = ltree -DATA = ltree--1.1--1.2.sql ltree--1.1.sql ltree--1.0--1.1.sql +DATA = ltree--1.2--1.3.sql ltree--1.1--1.2.sql ltree--1.1.sql ltree--1.0--1.1.sql PGFILEDESC = "ltree - hierarchical label data type" HEADERS = ltree.h diff --git a/contrib/ltree/expected/ltree.out b/contrib/ltree/expected/ltree.out index 984cd030..dd4f0fc1 100644 --- a/contrib/ltree/expected/ltree.out +++ b/contrib/ltree/expected/ltree.out @@ -1433,8 +1433,27 @@ SELECT '{j.k.l.m, g.b.c.d.e}'::ltree[] ?~ 'A*@|g.b.c.d.e'; g.b.c.d.e (1 row) +-- Check that the hash_ltree() and hash_ltree_extended() function's lower +-- 32 bits match when the seed is 0 and do not match when the seed != 0 +SELECT v as value, hash_ltree(v)::bit(32) as standard, + hash_ltree_extended(v, 0)::bit(32) as extended0, + hash_ltree_extended(v, 1)::bit(32) as extended1 +FROM (VALUES (NULL::ltree), (''::ltree), ('0'::ltree), ('0.1'::ltree), + ('0.1.2'::ltree), ('0':
Re: [PATCH] ltree hash functions
On Wed, 6 Dec 2023 at 04:08, Tommy Pavlicek wrote: > > Thanks. > > I've attached the latest version that updates the naming in line with > the convention. > > On Mon, Dec 4, 2023 at 12:46 AM jian he wrote: > > > > On Fri, Dec 1, 2023 at 8:44 AM Tommy Pavlicek wrote: > > > > > > > > > Patch updated for those comments (and a touch of cleanup in the tests) > > > attached. > > > > it would be a better name as hash_ltree than ltree_hash, similar logic > > applies to ltree_hash_extended. > > that would be the convention. see: > > https://stackoverflow.com/a/69650940/15603477 > > > > > > Other than that, it looks good. CFBot shows one of the test is failing as in [1]: diff -U3 /tmp/cirrus-ci-build/contrib/ltree/expected/ltree.out /tmp/cirrus-ci-build/build-32/testrun/ltree/regress/results/ltree.out --- /tmp/cirrus-ci-build/contrib/ltree/expected/ltree.out 2024-01-31 15:18:42.893039599 + +++ /tmp/cirrus-ci-build/build-32/testrun/ltree/regress/results/ltree.out 2024-01-31 15:23:25.309028749 + @@ -1442,9 +1442,14 @@ ('0.1.2'::ltree), ('0'::ltree), ('0_asd.1_ASD'::ltree)) x(v) WHERE hash_ltree(v)::bit(32) != hash_ltree_extended(v, 0)::bit(32) OR hash_ltree(v)::bit(32) = hash_ltree_extended(v, 1)::bit(32); - value | standard | extended0 | extended1 +--+---+--- -(0 rows) +value| standard | extended0 |extended1 +-+--+--+-- + 0 | 10001010100010011011 | 0101100001000100011001011011 | 010110000100010001101001 + 0.1 | 101010001010110001001110 | 0000100010001100110111010101 | 0000100010001101100011010101 + 0.1.2 | 0111010101110100 | 1010111001110101100011010111 | 101011100111011100100011 + 0 | 10001010100010011011 | 0101100001000100011001011011 | 010110000100010001101001 + 0_asd.1_ASD | 011000101000101001001101 | 0000100010001100110111010101 | 0000100010001101100011010101 +(5 rows) Please post an updated version for the same. [1] - https://api.cirrus-ci.com/v1/artifact/task/5572544858685440/testrun/build-32/testrun/ltree/regress/regression.diffs Regards, Vignesh
Re: [PATCH] ltree hash functions
Thanks. I've attached the latest version that updates the naming in line with the convention. On Mon, Dec 4, 2023 at 12:46 AM jian he wrote: > > On Fri, Dec 1, 2023 at 8:44 AM Tommy Pavlicek wrote: > > > > > > Patch updated for those comments (and a touch of cleanup in the tests) > > attached. > > it would be a better name as hash_ltree than ltree_hash, similar logic > applies to ltree_hash_extended. > that would be the convention. see: > https://stackoverflow.com/a/69650940/15603477 > > > Other than that, it looks good. 0001-ltree-hash-v4.patch Description: Binary data
Re: [PATCH] ltree hash functions
On Fri, Dec 1, 2023 at 8:44 AM Tommy Pavlicek wrote: > > > Patch updated for those comments (and a touch of cleanup in the tests) > attached. it would be a better name as hash_ltree than ltree_hash, similar logic applies to ltree_hash_extended. that would be the convention. see: https://stackoverflow.com/a/69650940/15603477 Other than that, it looks good.
Re: [PATCH] ltree hash functions
On Tue, Nov 28, 2023 at 7:38 PM jian he wrote: > you only change Makefile, you also need to change contrib/ltree/meson.build? > Do you need to use EXPLAIN to demo the index usage? Thanks! Yes, I missed the Meson build file. I added additional commands with EXPLAIN (COSTS OFF) as I found in other places. Patch updated for those comments (and a touch of cleanup in the tests) attached. 0001-ltree-hash-v3.patch Description: Binary data
Re: [PATCH] ltree hash functions
On Wed, Nov 29, 2023 at 6:09 AM Tommy Pavlicek wrote: > > On Thu, Jul 6, 2023 at 2:18 AM Daniel Gustafsson wrote: > > > > > On 19 Jun 2023, at 11:18, Tommy Pavlicek wrote: > > > > > Tommy, are you interested in extending ALTER OPERATOR to allow this, > > > which would also allow fixing the ltree operator? > > > > > > Yes, I can do that. I took a look over the code and email thread and it > > > seems like it should be relatively straight forward. I'll put a patch > > > together for that and then update this patch to alter the operator. > > > > Did you have a chance to look at this for an updated patch for this > > commitfest? > > I finally had a chance to look at this and I've updated the patch to > alter the = operator to enable hash joins. > > This is ready to be looked at now. > > Is there anything I need to do to move this forward? > you only change Makefile, you also need to change contrib/ltree/meson.build? +drop index tstidx; +create index tstidx on ltreetest using hash (t); +set enable_seqscan=off; + +SELECT * FROM ltreetest WHERE t = '12.3' order by t asc; Do you need to use EXPLAIN to demo the index usage?
Re: [PATCH] ltree hash functions
On Thu, Jul 6, 2023 at 2:18 AM Daniel Gustafsson wrote: > > > On 19 Jun 2023, at 11:18, Tommy Pavlicek wrote: > > > Tommy, are you interested in extending ALTER OPERATOR to allow this, > > which would also allow fixing the ltree operator? > > > > Yes, I can do that. I took a look over the code and email thread and it > > seems like it should be relatively straight forward. I'll put a patch > > together for that and then update this patch to alter the operator. > > Did you have a chance to look at this for an updated patch for this > commitfest? I finally had a chance to look at this and I've updated the patch to alter the = operator to enable hash joins. This is ready to be looked at now. Is there anything I need to do to move this forward? Cheers, Tommy 0001-ltree-hash-v2.patch Description: Binary data
Re: [PATCH] ltree hash functions
> On 19 Jun 2023, at 11:18, Tommy Pavlicek wrote: > Tommy, are you interested in extending ALTER OPERATOR to allow this, > which would also allow fixing the ltree operator? > > Yes, I can do that. I took a look over the code and email thread and it seems > like it should be relatively straight forward. I'll put a patch together for > that and then update this patch to alter the operator. Did you have a chance to look at this for an updated patch for this commitfest? -- Daniel Gustafsson
Re: [PATCH] ltree hash functions
> > FWIW the CREATE OPERATOR documentation only talks about hash joins for HASHES, maybe it should be updated to also mention hash aggregates? I think I might have been a bit unclear here, the hash aggregate does work without altering the operator so it's just the join that's blocked. Sorry about the confusion. I wonder what's the use case for this. I wonder how often people join on > ltree, for example. Did you just notice ltree can't hash and decided to > fix that, or do you have a practical use case / need for this feature? I mostly want to add hash indexes. Beyond selecting specific values, you can use them to get ancestors (trim the path and do an exact select) and descendents (using a functional index calculating the parent path for each row). For example, I've found it can be faster to calculate the path of every ancestor and use select ltree path = ANY([ancestor paths]) compared to using a gist index. It's not ideal, but unfortunately I've found that with enough rows, gist indexes get very large and slow. Btree indexes are better, but for ltree they can still be up to around 10x bigger than a hash index. I've also seen ltree hash indexes outperform btree indexes in very large tables, but I suspect in most cases they'll be similar. Tommy, are you interested in extending ALTER OPERATOR to allow this, > which would also allow fixing the ltree operator? Yes, I can do that. I took a look over the code and email thread and it seems like it should be relatively straight forward. I'll put a patch together for that and then update this patch to alter the operator. On Sat, Jun 17, 2023 at 9:57 PM Tomas Vondra wrote: > > > On 6/17/23 20:19, Tom Lane wrote: > > Tomas Vondra writes: > >> I guess the "correct" solution would be to extend ALTER OPERATOR. I > >> wonder why it's not supported - it's clearly an intentional decision > >> (per comment in AlterOperator). So what might break if this changes for > >> an existing operator? > > > > This code was added by commit 321eed5f0. The thread leading up to > > that commit is here: > > > > https://www.postgresql.org/message-id/flat/3348985.V7xMLFDaJO%40dinodell > > > > There are some nontrivial concerns in there about breaking the > > semantics of existing exclusion constraints, for instance. I think > > we mostly rejected the concern about invalidation of cached plans > > as already-covered, but that wasn't the only problem. > > > > However, I think we could largely ignore the issues if we restricted > > ALTER OPERATOR to only add commutator, negator, hashes, or merges > > properties to operators that lacked them before --- which'd be the > > primary if not only use-case anyway. That direction can't break > > anything. > > > > Sound reasonable. > > Tommy, are you interested in extending ALTER OPERATOR to allow this, > which would also allow fixing the ltree operator? > > > regards > > -- > Tomas Vondra > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [PATCH] ltree hash functions
On 6/17/23 20:19, Tom Lane wrote: > Tomas Vondra writes: >> I guess the "correct" solution would be to extend ALTER OPERATOR. I >> wonder why it's not supported - it's clearly an intentional decision >> (per comment in AlterOperator). So what might break if this changes for >> an existing operator? > > This code was added by commit 321eed5f0. The thread leading up to > that commit is here: > > https://www.postgresql.org/message-id/flat/3348985.V7xMLFDaJO%40dinodell > > There are some nontrivial concerns in there about breaking the > semantics of existing exclusion constraints, for instance. I think > we mostly rejected the concern about invalidation of cached plans > as already-covered, but that wasn't the only problem. > > However, I think we could largely ignore the issues if we restricted > ALTER OPERATOR to only add commutator, negator, hashes, or merges > properties to operators that lacked them before --- which'd be the > primary if not only use-case anyway. That direction can't break > anything. > Sound reasonable. Tommy, are you interested in extending ALTER OPERATOR to allow this, which would also allow fixing the ltree operator? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [PATCH] ltree hash functions
Tomas Vondra writes: > I guess the "correct" solution would be to extend ALTER OPERATOR. I > wonder why it's not supported - it's clearly an intentional decision > (per comment in AlterOperator). So what might break if this changes for > an existing operator? This code was added by commit 321eed5f0. The thread leading up to that commit is here: https://www.postgresql.org/message-id/flat/3348985.V7xMLFDaJO%40dinodell There are some nontrivial concerns in there about breaking the semantics of existing exclusion constraints, for instance. I think we mostly rejected the concern about invalidation of cached plans as already-covered, but that wasn't the only problem. However, I think we could largely ignore the issues if we restricted ALTER OPERATOR to only add commutator, negator, hashes, or merges properties to operators that lacked them before --- which'd be the primary if not only use-case anyway. That direction can't break anything. regards, tom lane
Re: [PATCH] ltree hash functions
Hi, I've created a CF entry for the patch: https://commitfest.postgresql.org/43/4375/ I only briefly skimmed the code, so a couple comments. On 6/17/23 17:45, Tommy Pavlicek wrote: > Hi All, > > I've written a patch to add hash functions for the ltree extension. It > adds support for hash indexes and hash aggregation. I've reused the > existing logic that's used to hash arrays and added tests that mirror > elsewhere (i.e. hstore and hash_func regression tests). > Reusing code/logic is the right approach, IMHO. > The patch doesn't currently support hash joins as the ltree = operator > was created without support for it. The ALTER OPERATOR command doesn't > support changing the hash join support, so I'm not sure what the best > strategy to change it is. Is it ok to update the operator's row in the > pg_operator system catalog or is there a better way to change this that > someone could recommend? > I guess the "correct" solution would be to extend ALTER OPERATOR. I wonder why it's not supported - it's clearly an intentional decision (per comment in AlterOperator). So what might break if this changes for an existing operator? FWIW the CREATE OPERATOR documentation only talks about hash joins for HASHES, maybe it should be updated to also mention hash aggregates? > Any comments on the overall approach or other feedback would be appreciated. > I wonder what's the use case for this. I wonder how often people join on ltree, for example. Did you just notice ltree can't hash and decided to fix that, or do you have a practical use case / need for this feature? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[PATCH] ltree hash functions
Hi All, I've written a patch to add hash functions for the ltree extension. It adds support for hash indexes and hash aggregation. I've reused the existing logic that's used to hash arrays and added tests that mirror elsewhere (i.e. hstore and hash_func regression tests). The patch doesn't currently support hash joins as the ltree = operator was created without support for it. The ALTER OPERATOR command doesn't support changing the hash join support, so I'm not sure what the best strategy to change it is. Is it ok to update the operator's row in the pg_operator system catalog or is there a better way to change this that someone could recommend? Any comments on the overall approach or other feedback would be appreciated. Thanks, Tommy ltree-hash-v1.patch Description: Binary data