https://github.com/tvondra/hashset
On Mon, Aug 14, 2023 at 11:23 PM Florents Tselai
wrote:
>
> Has anyone put this in a git repo / extension package or similar ?
>
> I’d like to try it out outside the core pg tree.
>
> > On 1 Jul 2023, at 12:04 PM, Joel Jacobson wrote:
> >
> > On Fri, Jun 30,
Has anyone put this in a git repo / extension package or similar ?
I’d like to try it out outside the core pg tree.
> On 1 Jul 2023, at 12:04 PM, Joel Jacobson wrote:
>
> On Fri, Jun 30, 2023, at 06:50, jian he wrote:
>> more like a C questions
>> in this context does
>> #define
On Fri, Jun 30, 2023, at 06:50, jian he wrote:
> more like a C questions
> in this context does
> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
> CEIL_DIV((set)->capacity, 8)))
> define first, then define struct int4hashset_t. Is this normally ok?
Yes, it's fine. Macros are just text
On Thu, Jun 29, 2023 at 4:43 PM Joel Jacobson wrote:
>
> On Thu, Jun 29, 2023, at 08:54, jian he wrote:
> > Anyway, this time, I added another macro,which seems to simplify the code.
> >
> > #define SET_DATA_PTR(a) \
> > (((char *) (a->data)) + CEIL_DIV(a->capacity, 8))
> >
> > it passed all the
On Wed, Jun 28, 2023 at 4:50 PM Joel Jacobson wrote:
>
> On Wed, Jun 28, 2023, at 08:26, jian he wrote:
>
> > Hi there.
> > I changed the function hashset_contains to strict.
>
> Changing hashset_contains to STRICT would cause it to return NULL
> if any of the operands are NULL, which I don't
On Wed, Jun 28, 2023, at 08:26, jian he wrote:
> Hi there.
> I changed the function hashset_contains to strict.
Changing hashset_contains to STRICT would cause it to return NULL
if any of the operands are NULL, which I don't believe is correct, since:
SELECT NULL = ANY('{}'::int4[]);
?column?
On Tue, Jun 27, 2023 at 4:27 PM Joel Jacobson wrote:
>
> On Tue, Jun 27, 2023, at 04:35, jian he wrote:
> > in SQLMultiSets.pdf(previously thread) I found a related explanation
> > on page 45, 46.
> > /home/jian/hashset/0001-make-int4hashset_contains-strict-and-header-file-change.patch
> > (CASE
On Tue, Jun 27, 2023, at 10:26, Joel Jacobson wrote:
> Attachments:
> * hashset-0.0.1-b7e5614-full.patch
> * hashset-0.0.1-b7e5614-incremental.patch
To help verify that the semantics, I thought it might be helpful to provide
a comprehensive set of examples that tries to cover all different ways
On Tue, Jun 27, 2023 at 4:55 AM Joel Jacobson wrote:
>
> On Mon, Jun 26, 2023, at 13:06, jian he wrote:
> > Can you try to glue the attached to the hashset data type input
> > function.
> > the attached will parse cstring with double quote and not. so '{1,2,3}'
> > == '{"1","2","3"}'. obviously
On Mon, Jun 26, 2023 at 4:55 PM Joel Jacobson wrote:
> On Mon, Jun 26, 2023, at 13:06, jian he wrote:
> > Can you try to glue the attached to the hashset data type input
> > function.
> > the attached will parse cstring with double quote and not. so '{1,2,3}'
> > == '{"1","2","3"}'. obviously
On Mon, Jun 26, 2023, at 13:06, jian he wrote:
> Can you try to glue the attached to the hashset data type input
> function.
> the attached will parse cstring with double quote and not. so '{1,2,3}'
> == '{"1","2","3"}'. obviously quote will preserve the inner string as
> is.
> currently
On Mon, Jun 26, 2023 at 4:36 AM Joel Jacobson wrote:
>
> On Sun, Jun 25, 2023, at 11:42, Joel Jacobson wrote:
> > SELECT hashset_contains('{}'::int4hashset, NULL::int);
> >
> > would be False, according to the General Rules.
> >
> ...
> > Applying the same rules, we'd have to return Unknown
On Mon, Jun 26, 2023 at 2:56 AM Tomas Vondra
wrote:
>
>
>
> On 6/25/23 15:32, jian he wrote:
> >> Or maybe I just don't understand the proposal. Perhaps it'd be best if
> >> jian wrote a patch illustrating the idea, and showing how it performs
> >> compared to the current approach.
> >
> >
On Sun, Jun 25, 2023, at 11:42, Joel Jacobson wrote:
> SELECT hashset_contains('{}'::int4hashset, NULL::int);
>
> would be False, according to the General Rules.
>
...
> Applying the same rules, we'd have to return Unknown (which we represent as
> null) for:
>
> SELECT
On 6/25/23 15:32, jian he wrote:
>> Or maybe I just don't understand the proposal. Perhaps it'd be best if
>> jian wrote a patch illustrating the idea, and showing how it performs
>> compared to the current approach.
>
> currently joel's idea is a int4hashset. based on the code first tomas
> Or maybe I just don't understand the proposal. Perhaps it'd be best if
> jian wrote a patch illustrating the idea, and showing how it performs
> compared to the current approach.
currently joel's idea is a int4hashset. based on the code first tomas wrote.
it looks like a non-nested an
On Sat, Jun 24, 2023, at 21:16, Joel Jacobson wrote:
> New version of int4hashset_contains() that should follow the same
> General Rules as MULTISET's MEMBER OF (8.16 ).
...
> SELECT hashset_contains('{}'::int4hashset, NULL::int); -- false
...
> SELECT hashset_contains('{null}'::int4hashset,
New version of int4hashset_contains() that should follow the same
General Rules as MULTISET's MEMBER OF (8.16 ).
The first rule is to return False if the cardinality is 0 (zero).
However, we must first check if the first argument is null,
in which case the cardinality cannot be 0 (zero),
so if
On Thu, Jun 22, 2023, at 07:51, Joel Jacobson wrote:
> For instance, how should hashset_count() work?
>
> Given the query,
>
> SELECT hashset_count('{1,2,3,null}'::int4hashset);
>
> Should we,
>
> a) threat NULL as a distinct value and return 4?
>
> b) ignore NULL and return 3?
>
> c) return NULL?
On 6/23/23 13:47, Andrew Dunstan wrote:
>
> On 2023-06-23 Fr 04:23, Joel Jacobson wrote:
>> On Fri, Jun 23, 2023, at 08:40, jian he wrote:
>>> I played around array_func.c
>>> many of the code can be used for multiset data type.
>>> now I imagine multiset as something like one dimension array.
On 2023-06-23 Fr 04:23, Joel Jacobson wrote:
On Fri, Jun 23, 2023, at 08:40, jian he wrote:
I played around array_func.c
many of the code can be used for multiset data type.
now I imagine multiset as something like one dimension array. (nested
is somehow beyond the imagination...).
Are you
On Fri, Jun 23, 2023 at 4:23 PM Joel Jacobson wrote:
> On Fri, Jun 23, 2023, at 08:40, jian he wrote:
> > I played around array_func.c
> > many of the code can be used for multiset data type.
> > now I imagine multiset as something like one dimension array. (nested
> > is somehow beyond the
On Fri, Jun 23, 2023, at 08:40, jian he wrote:
> I played around array_func.c
> many of the code can be used for multiset data type.
> now I imagine multiset as something like one dimension array. (nested
> is somehow beyond the imagination...).
Are you suggesting it might be a better idea to
I played around array_func.c
many of the code can be used for multiset data type.
now I imagine multiset as something like one dimension array. (nested is
somehow beyond the imagination...).
* A standard varlena array has the following internal structure:
*- standard varlena header word
*
On 6/22/23 19:52, Joel Jacobson wrote:
> On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
>> This is also what the SQL standard does for multisets - there's SQL:20nn
>> draft at http://www.wiscorp.com/SQLStandards.html, and the > predicate> section (p. 475) explains how this should work with
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> This is also what the SQL standard does for multisets - there's SQL:20nn
> draft at http://www.wiscorp.com/SQLStandards.html, and the predicate> section (p. 475) explains how this should work with NULL.
I've looked again at the paper you
On Tue, Jun 20, 2023, at 18:25, Tomas Vondra wrote:
> On 6/20/23 16:56, Joel Jacobson wrote:
>> The reference to consistency with what we do elsewhere might not be entirely
>> applicable in this context, since the set feature we're designing is a new
>> beast
>> in the SQL landscape.
>
> I don't
On 6/20/23 20:08, jian he wrote:
> On Wed, Jun 21, 2023 at 12:25 AM Tomas Vondra
> ...
>> http://www.wiscorp.com/sqlmultisets.zip
>
>> Conceptually, a multiset is an unordered collection of elements, all of the
>> same type, with dupli-
>> cates permitted. Unlike arrays, a multiset is an
On Wed, Jun 21, 2023 at 12:25 AM Tomas Vondra
wrote:
>
>
>
> On 6/20/23 16:56, Joel Jacobson wrote:
> > On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> >> On 6/20/23 12:59, Joel Jacobson wrote:
> >>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select
On 6/20/23 16:56, Joel Jacobson wrote:
> On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
>> On 6/20/23 12:59, Joel Jacobson wrote:
>>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
select hashset_contains('{1,2}'::int4hashset,NULL::int);
should return null?
>>>
>>> I agree, it
On Tue, Jun 20, 2023, at 16:56, Joel Jacobson wrote:
> I think we have an opportunity here to innovate and potentially influence a
> future set concept in the SQL standard.
Adding to my previous note - If there's a worry about future SQL standards
introducing SETs with NULLs, causing
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> On 6/20/23 12:59, Joel Jacobson wrote:
>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>>> should return null?
>>
>> I agree, it should.
>>
>> I've now changed all functions
On 6/20/23 14:10, Tomas Vondra wrote:
> ...
>
> This is also what the SQL standard does for multisets - there's SQL:20nn
> draft at http://www.wiscorp.com/SQLStandards.html, and the predicate> section (p. 475) explains how this should work with NULL.
>
BTW I just notices there's also a
On 6/20/23 12:59, Joel Jacobson wrote:
> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>> should return null?
>
> I agree, it should.
>
> I've now changed all functions except int4hashset() (the init function)
> and the aggregate
On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select hashset_contains('{1,2}'::int4hashset,NULL::int);
> should return null?
I agree, it should.
I've now changed all functions except int4hashset() (the init function)
and the aggregate functions to be STRICT.
I think this patch is OK to send
On Tue, Jun 20, 2023, at 02:04, Tomas Vondra wrote:
> For UPDATE, it'd be pretty clear too, I think. It's possible to do
>
>UPDATE table SET col = SET[1,2,3]
>
> and it's clear the first is the command SET, while the second is a set
> constructor. For SELECT there'd be conflict, and for ALTER
On 6/20/23 00:50, Joel Jacobson wrote:
> On Mon, Jun 19, 2023, at 14:59, Tomas Vondra wrote:
>> What unexpected issues you mean? Sure, if someone uses multisets as if
>> they were sets (so ignoring the handling of duplicates), things will go
>> booom! quickly.
>
> The unexpected issues I had
On Mon, Jun 19, 2023, at 14:59, Tomas Vondra wrote:
> What unexpected issues you mean? Sure, if someone uses multisets as if
> they were sets (so ignoring the handling of duplicates), things will go
> booom! quickly.
The unexpected issues I had in mind are subtle bugs due to treating multisets
as
Andrew Dunstan writes:
> Yes, Multisets (a.k.a. bags and a large number of other names) would be
> interesting. But I wouldn't like to abandon pure sets either. Maybe a
> typmod indicating the allowed multiplicity of the type?
I don't think trying to use typmod to carry fundamental semantic
On 6/19/23 13:33, Joel Jacobson wrote:
> On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
>> AFAICS the standard only defines arrays and multisets. Arrays are pretty
>> much the thing we have, including the ARRAY[] constructor etc. Multisets
>> are similar to hashset discussed here, except
On 6/19/23 13:50, Andrew Dunstan wrote:
>
> On 2023-06-19 Mo 05:21, Tomas Vondra wrote:
>> On 6/18/23 18:45, Andrew Dunstan wrote:
>>> On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
New patch is attached, which will henceforth always be a complete patch,
to avoid the hassle of having to
On Mon, Jun 19, 2023, at 11:49, jian he wrote:
> hashset_to_array function should be strict?
>
> I noticed hashset_symmetric_difference and hashset_difference handle
> null in a different way, seems they should handle null in a consistent
> way?
Yes, I agree, they should be consistent.
I've
On 2023-06-19 Mo 05:21, Tomas Vondra wrote:
On 6/18/23 18:45, Andrew Dunstan wrote:
On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
New patch is attached, which will henceforth always be a complete patch,
to avoid the hassle of having to assemble incremental patches.
Cool, thanks.
It might
On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
> AFAICS the standard only defines arrays and multisets. Arrays are pretty
> much the thing we have, including the ARRAY[] constructor etc. Multisets
> are similar to hashset discussed here, except that it tracks the number
> of elements for each
On Mon, Jun 19, 2023 at 2:51 PM Joel Jacobson wrote:
>
> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> > select hashset_contains('{1,2}'::int4hashset,NULL::int);
> > should return null?
>
> Hmm, that's a good philosophical question.
>
> I notice Tomas Vondra in the initial commit opted for
On 6/18/23 18:45, Andrew Dunstan wrote:
>
> On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
>>
>> New patch is attached, which will henceforth always be a complete patch,
>> to avoid the hassle of having to assemble incremental patches.
>
>
> Cool, thanks.
>
It might still be convenient to
On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select hashset_contains('{1,2}'::int4hashset,NULL::int);
> should return null?
Hmm, that's a good philosophical question.
I notice Tomas Vondra in the initial commit opted for allowing NULL inputs,
treating them as empty sets, e.g. in
On Sat, Jun 17, 2023 at 8:38 AM Joel Jacobson wrote:
>
> On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote:
> > I realise int4hashset_hash() is broken,
> > since two int4hashset's that are considered equal,
> > can by coincidence get different hashes:
> ...
> > Do we have any ideas on how to
On Sun, Jun 18, 2023, at 18:45, Andrew Dunstan wrote:
> . It might be worth sending a version number with the send function
> (c.f. jsonb_send / jsonb_recv). That way would would not be tied forever
> to some wire representation.
Great idea; implemented.
> . I think there are some important
On 2023-06-16 Fr 20:38, Joel Jacobson wrote:
New patch is attached, which will henceforth always be a complete patch,
to avoid the hassle of having to assemble incremental patches.
Cool, thanks.
A couple of random thoughts:
. It might be worth sending a version number with the send
On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote:
> I realise int4hashset_hash() is broken,
> since two int4hashset's that are considered equal,
> can by coincidence get different hashes:
...
> Do we have any ideas on how to fix this without sacrificing performance?
The problem was due to
On Fri, Jun 16, 2023, at 13:57, jian he wrote:
> similar to (int[] || int4) and (int4 || int[])
> should we expect ('{1,2}'::int4hashset || 3) == (3 ||
> '{1,2}'::int4hashset) == (select hashset_add('{1,2}'::int4hashset,3));
> *?*
Good idea, makes sense to support it.
Implemented in attached
similar to (int[] || int4) and (int4 || int[])
should we expect ('{1,2}'::int4hashset || 3) == (3 ||
'{1,2}'::int4hashset) == (select hashset_add('{1,2}'::int4hashset,3)); *?*
The following is the general idea on how to make it work by looking at
similar code
CREATE OPERATOR || (
leftarg
New patch attached:
Add customizable params to int4hashset() and collision count function
This commit enhances int4hashset() by introducing adjustable capacity,
load, and growth factors, providing flexibility for performance optimization.
Also added is a new function, hashset_collisions(), to
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> In hashset/test/sql/order.sql, can we add the following to test whether
> the optimizer will use our index.
>
> CREATE INDEX ON test_int4hashset_order (int4hashset_col
> int4hashset_btree_ops);
>
> -- to make sure that this work with just two
On Thu, Jun 15, 2023, at 06:29, jian he wrote:
> I am not sure the following results are correct.
> with cte as (
> select hashset(x) as x
> ,hashset_capacity(hashset(x))
> ,hashset_count(hashset(x))
> from generate_series(1,10) g(x))
> select *
> ,'|' as
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> I didn't install the extension directly. I copied the
> hashset--0.0.1.sql to another place, using gcc to compile these
> functions.
..
> Because even make
> PG_CONFIG=/home/jian/postgres/2023_05_25_beta5421/bin/pg_config still
> has an error.
In hashset/test/sql/order.sql, can we add the following to test
whether the optimizer
will use our index.
CREATE INDEX ON test_int4hashset_order (int4hashset_col
int4hashset_btree_ops);
-- to make sure that this work with just two rows
SET enable_seqscan TO off;
explain(costs off) SELECT *
On Thu, Jun 15, 2023, at 04:22, jian he wrote:
> Attachments:
> * temp.patch
Thanks for good suggestions.
New patch attached:
Enhance parsing and reorder headers in hashset module
Allow whitespaces in hashset input and reorder the inclusion of
header files, placing PostgreSQL headers first.
On Thu, Jun 15, 2023 at 5:04 AM Joel Jacobson wrote:
> On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> > On 6/14/23 14:57, Joel Jacobson wrote:
> >> Would it be feasible to teach the planner to utilize the internal hash
> table of
> >> hashset directly? In the case of arrays, the hash
On Thu, Jun 15, 2023 at 5:04 AM Joel Jacobson wrote:
> On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> > On 6/14/23 14:57, Joel Jacobson wrote:
> >> Would it be feasible to teach the planner to utilize the internal hash
> table of
> >> hashset directly? In the case of arrays, the hash
On Wed, 14 Jun 2023 at 19:14, Tomas Vondra
wrote:
> > ...So we'd want the same index usage as
> > =ANY(array) but would like faster row checking than we get with an array
> > when other indexes are used.
>
> We kinda already do this since PG14 (commit 50e17ad281), actually. If
> the list is long
On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> On 6/14/23 14:57, Joel Jacobson wrote:
>> Would it be feasible to teach the planner to utilize the internal hash table
>> of
>> hashset directly? In the case of arrays, the hash table construction is an
...
> It's definitely something I'd
On 2023-06-14 We 05:44, Tomas Vondra wrote:
The thing is, adding stuff to core is not free - it means the community
becomes responsible for maintenance, testing, fixing issues, etc. It's
not feasible (or desirable) to have all extensions in core, and cloud
vendors generally do have ways to
On 6/14/23 14:57, Joel Jacobson wrote:
> On Wed, Jun 14, 2023, at 11:44, Tomas Vondra wrote:
>>> Perspective from a potential user: I'm currently working on something
>>> where an array-like structure with fast membership test performance
>>> would be very useful. The main type of query is
On Wed, Jun 14, 2023, at 11:44, Tomas Vondra wrote:
>> Perspective from a potential user: I'm currently working on something
>> where an array-like structure with fast membership test performance
>> would be very useful. The main type of query is doing an =ANY(the set)
>> filter, where the set
On 6/14/23 06:31, Tom Dunstan wrote:
> On Mon, 12 Jun 2023 at 22:37, Tomas Vondra
> mailto:tomas.von...@enterprisedb.com>>
> wrote:
>
> Perhaps. So you're proposing to have this as a regular built-in type?
> It's hard for me to judge how popular this feature would be, but I guess
>
On Wed, Jun 14, 2023, at 06:31, Tom Dunstan wrote:
> On Mon, 12 Jun 2023 at 22:37, Tomas Vondra
> wrote:
>> Perhaps. So you're proposing to have this as a regular built-in type?
>> It's hard for me to judge how popular this feature would be, but I guess
>> people often use arrays while they
On Mon, 12 Jun 2023 at 22:37, Tomas Vondra
wrote:
> Perhaps. So you're proposing to have this as a regular built-in type?
> It's hard for me to judge how popular this feature would be, but I guess
> people often use arrays while they actually want set semantics ...
>
Perspective from a
On Tue, Jun 13, 2023, at 20:50, Joel Jacobson wrote:
> hashset is now using hash_bytes_uint32() from hashfn.h
I spotted a problem in the ordering logic of the comparison functions.
The issue was with handling hashsets containing empty positions,
causing non-lexicographic ordering.
The updated
On Mon, Jun 12, 2023, at 22:36, Joel Jacobson wrote:
> On Mon, Jun 12, 2023, at 21:58, Tomas Vondra wrote:
>> My suggestion is to be lazy, just use the lookup3 we have in hashfn.c
>> (through hash_bytes or something), and at the same time make it possible
>> to switch to a different function in
On Mon, Jun 12, 2023, at 21:58, Tomas Vondra wrote:
> But those are numbers for large keys - if you restrict the input to
> 4B-16B (which is what we planned to do by focusing on int, bigint and
> uuid), there's no significant difference:
Oh, sorry, I completely failed to read the meaning of the
On 6/12/23 19:34, Joel Jacobson wrote:
> On Sat, Jun 10, 2023, at 22:12, Tomas Vondra wrote:
1) better hash table implementation
>
> I noticed src/include/common/hashfn.h provides implementation
> of the Jenkins/lookup3 hash function, and thought maybe
> we could simply use it in hashset?
On Sat, Jun 10, 2023, at 22:12, Tomas Vondra wrote:
>>> 1) better hash table implementation
I noticed src/include/common/hashfn.h provides implementation
of the Jenkins/lookup3 hash function, and thought maybe
we could simply use it in hashset?
However, I noticed that according to SMHasher [1],
On Sat, Jun 10, 2023, at 17:46, Andrew Dunstan wrote:
> Maybe you can post a full patch as well as incremental?
Attached patch is based on tvondra's last commit 375b072.
> Stylistically I think you should reduce reliance on magic numbers (like 13).
> Probably need some #define's?
Great idea,
On 2023-06-12 Mo 09:00, Tomas Vondra wrote:
What was the json/jsonb story, was it ever an extension before
being included in core?
I don't recall what the exact story was, but I guess the "json" type was
added to core very long ago (before we started to push back a bit), and
we added some
On 6/12/23 14:46, Andrew Dunstan wrote:
>
> On 2023-06-11 Su 16:15, Joel Jacobson wrote:
>> On Sun, Jun 11, 2023, at 17:03, Tomas Vondra wrote:
>>> On 6/11/23 12:26, Joel Jacobson wrote:
I think there are some good arguments that speaks in favour of including
it in core:
>> ...
>>>
On 6/11/23 22:15, Joel Jacobson wrote:
> On Sun, Jun 11, 2023, at 17:03, Tomas Vondra wrote:
>> On 6/11/23 12:26, Joel Jacobson wrote:
>>> I think there are some good arguments that speaks in favour of including it
>>> in core:
> ...
>>
>> I agree with all of that, but ...
>>
>> It's just past
On 2023-06-11 Su 16:15, Joel Jacobson wrote:
On Sun, Jun 11, 2023, at 17:03, Tomas Vondra wrote:
On 6/11/23 12:26, Joel Jacobson wrote:
I think there are some good arguments that speaks in favour of including it in
core:
...
I agree with all of that, but ...
It's just past feature freeze,
On Sun, Jun 11, 2023, at 17:03, Tomas Vondra wrote:
> On 6/11/23 12:26, Joel Jacobson wrote:
>> I think there are some good arguments that speaks in favour of including it
>> in core:
...
>
> I agree with all of that, but ...
>
> It's just past feature freeze, so the earliest release this could
On Sun, Jun 11, 2023, at 16:58, Andrew Dunstan wrote:
>>On 2023-06-11 Su 06:26, Joel Jacobson wrote:
>>Perhaps "set" would have been a better name, since the use of hash functions
>>from an end-user perspective is >>implementation details, but we cannot use
>>that word since it's a reserved
On 6/11/23 12:26, Joel Jacobson wrote:
> On Sat, Jun 10, 2023, at 22:26, Tomas Vondra wrote:
>> On 6/10/23 17:46, Andrew Dunstan wrote:
>>>
>>> Maybe you can post a full patch as well as incremental?
>>>
>>
>> I wonder if we should keep discussing this extension here, considering
>> it's going
On 2023-06-11 Su 06:26, Joel Jacobson wrote:
On Sat, Jun 10, 2023, at 22:26, Tomas Vondra wrote:
On 6/10/23 17:46, Andrew Dunstan wrote:
Maybe you can post a full patch as well as incremental?
I wonder if we should keep discussing this extension here, considering
it's going to be out of
On Sat, Jun 10, 2023, at 22:26, Tomas Vondra wrote:
> On 6/10/23 17:46, Andrew Dunstan wrote:
>>
>> Maybe you can post a full patch as well as incremental?
>>
>
> I wonder if we should keep discussing this extension here, considering
> it's going to be out of core (at least for now). Not sure
On 6/10/23 17:46, Andrew Dunstan wrote:
>
> On 2023-06-09 Fr 07:56, Joel Jacobson wrote:
>> On Fri, Jun 9, 2023, at 13:33, jian he wrote:
>> > Hi, I am quite new about C.
>> > The following function I have 3 questions.
>> > 1. 7691,4201, I assume they are just random prime ints?
>>
>> Yes,
On 6/9/23 12:58, Joel Jacobson wrote:
> On Thu, Jun 8, 2023, at 12:19, Tomas Vondra wrote:
>> Would you be interested in helping with / working on some of that? I
>> don't have immediate need for this stuff, so it's not very high on my
>> TODO list.
>
> Sure, I'm willing to help!
>
> I've
in funcion. hashset_in
int32 value = strtol(str, , 10);
there is no int32 value range check?
imitate src/backend/utils/adt/int.c. the following way is what I came up
with.
int64 value = strtol(str, , 10);
if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
ereturn(fcinfo->context, (Datum)
On 2023-06-09 Fr 07:56, Joel Jacobson wrote:
On Fri, Jun 9, 2023, at 13:33, jian he wrote:
> Hi, I am quite new about C.
> The following function I have 3 questions.
> 1. 7691,4201, I assume they are just random prime ints?
Yes, 7691 and 4201 are likely chosen as random prime numbers.
In
On Fri, Jun 9, 2023, at 13:33, jian he wrote:
> Hi, I am quite new about C.
> The following function I have 3 questions.
> 1. 7691,4201, I assume they are just random prime ints?
Yes, 7691 and 4201 are likely chosen as random prime numbers.
In hash functions, prime numbers are often used to
On Fri, Jun 9, 2023 at 6:58 PM Joel Jacobson wrote:
> On Thu, Jun 8, 2023, at 12:19, Tomas Vondra wrote:
> > Would you be interested in helping with / working on some of that? I
> > don't have immediate need for this stuff, so it's not very high on my
> > TODO list.
>
> Sure, I'm willing to
On Thu, Jun 8, 2023, at 12:19, Tomas Vondra wrote:
> Would you be interested in helping with / working on some of that? I
> don't have immediate need for this stuff, so it's not very high on my
> TODO list.
Sure, I'm willing to help!
I've attached a patch that works on some of the items on your
On 6/8/23 11:41, Joel Jacobson wrote:
> On Wed, Jun 7, 2023, at 19:37, Tomas Vondra wrote:
>> Interesting, considering how dumb the the hash table implementation is.
>
> That's promising.
>
Yeah, not bad for sleep-deprived on-plane hacking ...
There's a bunch of stuff that needs to be improved
On Wed, Jun 7, 2023, at 19:37, Tomas Vondra wrote:
> Interesting, considering how dumb the the hash table implementation is.
That's promising.
>> I tested Neo4j and the results are surprising; it appears to be
>> significantly *slower*.
>> However, I've probably misunderstood something, maybe I
On 6/7/23 16:21, Joel Jacobson wrote:
> On Tue, Jun 6, 2023, at 13:20, Tomas Vondra wrote:
>> it cuts the timing to about 50% on my laptop, so maybe it'll be ~300ms
>> on your system. There's a bunch of opportunities for more improvements,
>> as the hash table implementation is pretty
On Tue, Jun 6, 2023, at 13:20, Tomas Vondra wrote:
> it cuts the timing to about 50% on my laptop, so maybe it'll be ~300ms
> on your system. There's a bunch of opportunities for more improvements,
> as the hash table implementation is pretty naive/silly, the on-disk
> format is wasteful and so
On 6/5/23 21:52, Joel Jacobson wrote:
> On Mon, Jun 5, 2023, at 01:44, Tomas Vondra wrote:
>> Anyway, I hacked together a trivial set backed by an open addressing
>> hash table:
>>
>> https://github.com/tvondra/hashset
>>
>> It's super-basic, providing just some bare minimum of functions, but
On Mon, Jun 5, 2023, at 01:44, Tomas Vondra wrote:
> Anyway, I hacked together a trivial set backed by an open addressing
> hash table:
>
> https://github.com/tvondra/hashset
>
> It's super-basic, providing just some bare minimum of functions, but
> hopefully good enough for experiments.
>
> -
On Fri, Jun 2, 2023, at 10:01, Ants Aasma wrote:
> Have you looked at roaring bitmaps? There is a pg_roaringbitmap
> extension [1] already available that offers very fast unions,
> intersections and membership tests over integer sets. I used it to get
> some pretty impressive performance results
On 6/1/23 09:14, Joel Jacobson wrote:
> On Thu, Jun 1, 2023, at 09:02, Joel Jacobson wrote:
>> Here is an example using a real anonymised social network.
>
> I realised the "found" column is not necessary in this particular example,
> since we only care about the friends at the exact depth
On Wed, 31 May 2023 at 18:40, Joel Jacobson wrote:
>
> On Wed, May 31, 2023, at 16:53, Tomas Vondra wrote:
> > I think this needs a better explanation - what exactly is a hashset in
> > this context? Something like an array with a hash for faster lookup of
> > unique elements, or what?
>
> In
1 - 100 of 107 matches
Mail list logo