On Tue, Jul 9, 2024, at 14:01, Dean Rasheed wrote:
> One thing I noticed while testing the earlier patches on this thread
> was that they were significantly faster if they used unsigned integers
> rather than signed integers. I think the reason is that operations
> like "x / 1" and "x % 1"
On Tue, Jul 9, 2024, at 16:11, Joel Jacobson wrote:
> I added some more ndigits test cases:
Ops, please ignore previous benchmark;
I had forgot to commit in between the measurements,
so they all ran in the same db txn,
which caused a lot of noise on few ndigits.
New benchmark:
> /*
>
On Tue, Jul 9, 2024, at 14:01, Dean Rasheed wrote:
> Before considering the other patches to optimise for larger inputs, I
> think it's worth optimising the existing mul_var() code as much as
> possible.
>
> One thing I noticed while testing the earlier patches on this thread
> was that they were
On Mon, Jul 8, 2024, at 12:45, Matthias van de Meent wrote:
> On Sun, 7 Jul 2024, 22:40 Joel Jacobson, wrote:
>> Today, since 64-bit architectures are dominant, NBASE=1e8 seems like it would
>> have been the best choice, since the square of that still fits in
>> a 64-bit
On Mon, Jul 8, 2024, at 11:45, Dean Rasheed wrote:
> On Mon, 8 Jul 2024 at 00:40, Joel Jacobson wrote:
>>
>> On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:
>> > I've also tidied up a bit by replacing all instances of SHRT_MAX with
>> > a new constant NU
On Mon, Jul 8, 2024, at 10:34, Michael Paquier wrote:
> Thanks for the patch. I have been looking at it for a few hours,
> eyeing a bit on the ObjectProperty parts a bit if we were to extend it
> for sub-object IDs, and did not like the complexity this introduces,
> so I'd be OK to live with the
On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:
> I've also tidied up a bit by replacing all instances of SHRT_MAX with
> a new constant NUMERIC_WEIGHT_MAX, whose name more accurately
> describes the limit, as used in various other overflow checks.
Having thought a bit more on this, I think we
Hello hackers,
I'm not hopeful this idea will be fruitful, but maybe we can find solutions
to the problems together.
The idea is to increase the numeric NBASE from 1e4 to 1e8, which could possibly
give a significant performance boost of all operations across the board,
on 64-bit architectures,
Hello hackers,
This patch adds a mul_var_large() that is dispatched to from mul_var()
for var1ndigits >= 8, regardless of rscale.
The main idea with mul_var_large() is to reduce the "n" in O(n^2) by a factor
of two.
This is achieved by first converting the (ndigits) number of int16 NBASE
On Sun, Jul 7, 2024, at 13:28, Dean Rasheed wrote:
> The numeric round() and trunc() functions clamp the scale argument to
> the range between +/- NUMERIC_MAX_RESULT_SCALE, which is +/- 2000.
> That's a long way short of the actual allowed range of type numeric,
> so they produce incorrect results
On Sat, Jul 6, 2024, at 17:36, Dean Rasheed wrote:
> In the numeric width_bucket() code, we currently do the following:
..
> Instead, this can be done more simply and efficiently, using division
> with truncation as follows:
..
>
> Patch attached. I didn't bother with any new test cases, since
On Sat, Jul 6, 2024, at 11:34, Dean Rasheed wrote:
> On Fri, 5 Jul 2024 at 18:37, Joel Jacobson wrote:
>>
>> On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote:
>> > Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
>> > is now the winne
On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote:
> Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> is now the winner on all my CPUs:
I thought it would be interesting to also measure the isolated effect
on just numeric_mul() without the query overhead.
In
On Fri, Jul 5, 2024, at 17:41, Dean Rasheed wrote:
> On Fri, 5 Jul 2024 at 12:56, Joel Jacobson wrote:
>>
>> Interesting you got so bad bench results for v6-mul_var_int64.patch
>> for var1ndigits=4, that patch is actually the winner on AMD Ryzen 9 7950X3D.
>
> Inte
On Thu, Jul 4, 2024, at 20:43, Dean Rasheed wrote:
> On Wed, 3 Jul 2024 at 21:45, Joel Jacobson wrote:
>>
>> > On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote:
>> >> I wouldn't expect it to ever be off by more than 1
>> >
>> > OK,
On Fri, Jul 5, 2024, at 01:18, Michael Paquier wrote:
> I would still stick to only one function, with arguments coming from
> scanning pg_[sh]depend.
>
>> I found some code in aclchk.c on line 4452-4468 that seems useful,
>> but not sure. Maybe there is some other existing code that is better
>>
On Thu, Jul 4, 2024, at 09:38, Joel Jacobson wrote:
> Summary of benchmark results:
>
> cpu | var1ndigits | winner
> --+-+-
..
> v5-optimize-numeri
On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> Anyway, here are both patches for comparison. I'll stop hacking for a
> while and let you see what you make of these.
>
> Regards,
> Dean
>
> Attachments:
> * v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> *
On Wed, Jul 3, 2024, at 22:27, Joel Jacobson wrote:
> On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote:
>> I wouldn't expect it to ever be off by more than 1, given that
>> MUL_GUARD_DIGITS = 2, which corresponds to 8 decimal digits, and the
>> number of digits in the sma
On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote:
> Ah yes, I think I was looking at a newer version of the code where I'd
> already fixed that bug. Unless you think there are still bugs in any
> of the boundary checks, which is entirely possible.
Ah, that explains it.
And no, I can't find any
On Wed, Jul 3, 2024, at 15:48, Joel Jacobson wrote:
> On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
>> On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote:
>>>
>>> I found the bug in the case 3 code,
>>> and it turns out the same type o
On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> On Tue, 2 Jul 2024 at 21:10, Joel Jacobson wrote:
>>
>> I found the bug in the case 3 code,
>> and it turns out the same type of bug also exists in the case 2 code:
>>
>> case 2:
>>
Hello hackers,
I have discovered a peculiar behavior in mul_var() when it is called with
rscale=0, but the input variables have many decimal digits, resulting in a
product with a .5 decimal part. Given that no decimals are requested by the
caller, I would expect the result to be rounded up.
On Tue, Jul 2, 2024, at 22:10, Joel Jacobson wrote:
> * v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
Instead of these boundary checks, maybe it would be cleaner to just
skip this optimization if there are too few res_ndigits?
/Joel
On Tue, Jul 2, 2024, at 21:55, Joel Jacobson wrote:
> On Tue, Jul 2, 2024, at 20:53, Joel Jacobson wrote:
>> Trying to wrap my head around what could cause this.
I found the bug in the case 3 code,
and it turns out the same type of bug also exists in the case 2 code:
On Tue, Jul 2, 2024, at 20:53, Joel Jacobson wrote:
> Trying to wrap my head around what could cause this.
>
> It's rounding down instead of up, and these cases all end with decimal
> .500.
Interesting, I actually think there is a bug in the normal mul_var() code.
Found a case
On Tue, Jul 2, 2024, at 18:20, Joel Jacobson wrote:
> * v3-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
Hmm, v3 contains a bug which I haven't been able to solve yet.
Reporting now to avoid time waste reviewing it since it's buggy.
The attached patch is how I tested and found the
On Tue, Jul 2, 2024, at 13:44, Dean Rasheed wrote:
>> Can you think of an example that should trigger the bug?
>
> .0001 * 5000._ with rscale = 0 triggers it (returned
> 50004999 instead of 50005000).
Thanks, helpful.
Attached patch adds the var1ndigits=3 case.
Benchmark:
/*
*
On Tue, Jun 25, 2024, at 09:13, Joel Jacobson wrote:
> Attachments:
> * v8-0001-Add-pg_get_acl.patch
Rebased version.
Uses ACL acronym added in commit 00d819d46a6f5b7e9d2e02948a1c80d11c4ce260:
doc: Add ACL acronym for "Access Control List"
/Joel
v9-0001-Add-pg_get_acl.p
On Tue, Jul 2, 2024, at 11:05, Joel Jacobson wrote:
> On Tue, Jul 2, 2024, at 10:22, Dean Rasheed wrote:
>> Shortly after posting that, I realised that there was a small bug. This bit:
>>
>> case 2:
>> newdig = (int) var1digits[1] *
On Tue, Jul 2, 2024, at 10:22, Dean Rasheed wrote:
> Shortly after posting that, I realised that there was a small bug. This bit:
>
> case 2:
> newdig = (int) var1digits[1] * var2digits[res_ndigits - 4];
>
> isn't quite right in the case where rscale is less than the
On Tue, Jul 2, 2024, at 00:19, Dean Rasheed wrote:
> I had a play with this, and came up with a slightly different way of
> doing it that works for var2 of any size, as long as var1 is just 1 or
> 2 digits.
>
> Repeating your benchmark where both numbers have up to 2 NBASE-digits,
> this new
On Mon, Jul 1, 2024, at 15:14, Joel Jacobson wrote:
> * 0001-Optimize-mul_var-for-var2ndigits-4.patch
Found a typo, fixed in new version.
The int128 version is still slower though,
I wonder if there is something that can be done to speed it up further.
Below is a more realistic benchmark t
On Mon, Jul 1, 2024, at 15:11, Joel Jacobson wrote:
> Not really sure why. Maybe the code I tried can be optimized further:
If anyone want experiment with the int128 version,
here is a patch that adds a separate numeric_mul_patched() function,
so it's easier to benchmark against the unmodif
On Mon, Jul 1, 2024, at 14:25, Dagfinn Ilmari Mannsåker wrote:
> div_var() also has an optimisation for 3- and 4-digit operands under
> HAVE_INT128 (added in commit 0aa38db56bf), would that make sense in
> mul_var() too?
I considered it, but it only gives a marginal speed-up on Intel Core
On Mon, Jul 1, 2024, at 09:35, Daniel Gustafsson wrote:
> Avoding redirects is generally a good thing, not everyone is on lightning fast
> internet. Wikipedia is however not doing any 30X redirects so it's not really
> an issue for those links, it's all 200 requests.
Yes, I noticed that too when
On Mon, Jul 1, 2024, at 08:04, Joel Jacobson wrote:
> * 0001-optimize-numeric-mul_var-small-factors.patch
New version to silence maybe-uninitialized error reported by cfbot.
/Joel
v2-0001-optimize-numeric-mul_var-small-factors.patch
Description: Binary data
Hello hackers,
Attached patch introduces an optimization of mul_var() in numeric.c, targeting
cases where the multiplicands consist of only one or two base-NBASE digits.
Such small multiplicands can fit into an int64 and thus be computed directly,
resulting in a significant performance
On Sun, Jun 30, 2024, at 17:44, Tom Lane wrote:
> "Joel Jacobson" writes:
>> On Sat, Jun 29, 2024, at 17:25, Tom Lane wrote:
>>> (In general I find this patch seriously undercommented.)
>
>> However, I think the comments above split_var_at(),
>> mul_var_
On Sat, Jun 29, 2024, at 14:22, Dean Rasheed wrote:
> However, I really don't like having these magic constants at all,
> because in practice the threshold above which the Karatsuba algorithm
> is a win can vary depending on a number of factors, such as whether
> it's running on 32-bit or 64-bit,
On Sat, Jun 29, 2024, at 17:25, Tom Lane wrote:
> Dean Rasheed writes:
>> There's another complication though (if the threshold is made
>> configurable): the various numeric functions that use mul_var() are
>> immutable, which means that the results from the Karatsuba algorithm
>> must match
On Thu, Jun 27, 2024, at 10:42, Joel Jacobson wrote:
> I'll start a new separate thread about fixing the other non-canonical URLs.
Here is the separate thread to fix the docs to use canonical links:
https://postgr.es/m/8ccc96c7-0515-491b-be98-cfacdaeda...@app.fastmail.com
On Sat, Jun 29, 2024, at 14:22, Dean Rasheed wrote:
> On Sun, Jun 23, 2024 at 09:00:29AM +0200, Joel Jacobson wrote:
>> Attached, rebased version of the patch that implements the Karatsuba
>> algorithm in numeric.c's mul_var().
>>
>
> Something to watch out for is that
Hello hackers,
During work in the separate thread [1], I discovered more cases
where the link in docs wasn't the canonical link [2].
[1]
https://postgr.es/m/cakfquwyex9pj9g0zhjewsmsbnquygh+fycw-66ezjfvg4ko...@mail.gmail.com
[2] https://en.wikipedia.org/wiki/Canonical_link_element
The. below
On Wed, Jun 26, 2024, at 18:54, David G. Johnston wrote:
> On Wed, Jun 26, 2024 at 8:47 AM Nathan Bossart
> wrote:
>> On Wed, Jun 26, 2024 at 07:58:55AM -0700, David G. Johnston wrote:
>> > On Wed, Jun 26, 2024 at 7:52 AM Joel Jacobson wrote:
>> >> Want me
On Wed, Jun 26, 2024, at 02:59, David G. Johnston wrote:
> Though there was no comment on the fact we should be linking to:
>
> https://en.wikipedia.org/wiki/Access-control_list
>
> not:
>
> https://en.wikipedia.org/wiki/Access_Control_List
>
> to avoid the dis-ambiguation redirect.
>
> If we are
On Tue, Jun 25, 2024, at 08:42, Michael Paquier wrote:
> On Tue, Jun 25, 2024 at 08:06:41AM +0200, Joel Jacobson wrote:
>> Not sure if I see how to implement it for pg_get_acl() though.
>>
>> I've had a look at how pg_describe_object() works for this case:
>>
>&g
On Tue, Jun 25, 2024, at 07:11, Michael Paquier wrote:
> On Tue, Jun 25, 2024 at 12:20:20AM +0200, Joel Jacobson wrote:
>> Thanks, much better. New version attached.
>
> + The PostgreSQL documentation, and code,
> refers
> + to the specifications within t
On Tue, Jun 25, 2024, at 03:57, Michael Paquier wrote:
> On Tue, Jun 25, 2024 at 01:21:14AM +0200, Joel Jacobson wrote:
>> Good idea, I've started a separate thread for this:
>>
>> https://postgr.es/m/9253b872-dbb1-42a6-a79e-b1e96effc857%40app.fastmail.com
>>
&g
On Mon, Jun 24, 2024, at 01:46, Michael Paquier wrote:
> Rather unrelated to this patch, still this patch makes the situation
> more complicated in the docs, but wouldn't it be better to add ACL as
> a term in acronyms.sql, and reuse it here? It would be a doc-only
> patch that applies on top of
On Mon, Jun 24, 2024, at 23:15, David G. Johnston wrote:
> I really dislike "For avoidance of doubt and clarity" - and in terms of
> being equivalent the following seems like a more accurate description
> of reality.
>
> The PostgreSQL documentation, and code, refers to the specifications
>
On Mon, Jun 24, 2024, at 21:51, David G. Johnston wrote:
> On Mon, Jun 24, 2024 at 12:46 PM Joel Jacobson wrote:
>> On Mon, Jun 24, 2024, at 18:02, David G. Johnston wrote:
>>
>> > The page we link to uses "permissions" while we consistently use
>>
On Mon, Jun 24, 2024, at 18:02, David G. Johnston wrote:
> On Mon, Jun 24, 2024 at 8:44 AM Nathan Bossart
> wrote:
>> I think we could omit "i.e. privileges list."
>>
>
> Agreed. Between the docs and code we say "privileges list" once and
> that refers to the dumputIls description of the
Hello hackers,
This patch is based on a suggestion from a separate thread [1]:
On Mon, Jun 24, 2024, at 01:46, Michael Paquier wrote:
> Rather unrelated to this patch, still this patch makes the situation
> more complicated in the docs, but wouldn't it be better to add ACL as
> a term in
On Fri, Jun 14, 2024, at 03:07, Aaron Altman wrote:
> Thanks for the detailed background and comments, Joel!
>
> The new status of this patch is: Ready for Committer
Thanks for reviewing.
Attached, rebased version of the patch that implements the Karatsuba algorithm
in numeric.c's mul_var().
On Sat, Jun 22, 2024, at 11:44, Joel Jacobson wrote:
> * v5-0001-Add-pg_get_acl.patch
> * v2-0002-Add-pg_get_acl-overloads.patch
Rename files to ensure cfbot applies them in order; both need to have same
version prefix.
v6-0001-Add-pg_get_acl.patch
Description: Binary data
v6-00
On Sat, Jun 22, 2024, at 02:54, Joel Jacobson wrote:
> Attachments:
> * v4-0001-Add-pg_get_acl.patch
> * 0002-Add-pg_get_acl-overloads.patch
Rebase and reduced diff for src/test/regress/sql/privileges.sql between patches.
/Joel
v5-0001-Add-pg_get_acl.patch
Description: Binary data
v2
On Fri, Jun 21, 2024, at 05:25, Michael Paquier wrote:
> Interesting idea.
>
> I am not really convinced that the regproc and regclass overloads are
> really necessary, considering the fact that one of the goals
> mentioned, as far as I understand, is to be able to get an idea of the
> ACLs
On Wed, Jun 19, 2024, at 16:23, Isaac Morland wrote:
> I have no idea how often this would be useful, but I wonder if it could
> work to have overloaded single-parameter versions for each of
> regprocedure (pg_proc.proacl), regclass (pg_class.relacl), …. To call,
> just cast the OID to the
On Wed, Jun 19, 2024, at 15:51, Ranier Vilela wrote:
> Regarding the patch, could it be written in the following style?
Thanks for nice improvement. New version attached.
Best,
Joel
v2-0001-Add-pg_get_acl.patch
Description: Binary data
You also need to run `initdb` to create a new database cluster, with the new
catalog version.
Let me know if you need more specific instructions.
Best,
Joel
On Wed, Jun 19, 2024, at 14:59, Ranier Vilela wrote:
> Em qua., 19 de jun. de 2024 às 08:35, Joel Jacobson
> escreveu:
&g
...@www.fastmail.com
On Thu, Mar 25, 2021, at 16:16, Alvaro Herrera wrote:
> On 2021-Mar-25, Joel Jacobson wrote:
>
>> pg_shdepend doesn't contain the aclitem info though,
>> so it won't work for pg_permissions if we want to expose
>> privilege_type, is_grantable and grantor.
>
>
On Tue, Jun 11, 2024, at 19:16, Aaron Altman wrote:
> Hi Joel, thanks for posting this. Although I have only a cursory
> familiarity with fast multiplication algorithms, I'd like to try and
> give it a review. To start with, can you help me understand the choice
> of this algorithm versus
?
Here is a new attempt, file content unchanged, just named to 0007 and added
"pg_get_acl" to the name.
On Thu, Jun 13, 2024, at 04:00, Joel Jacobson wrote:
> On Thu, Jun 13, 2024, at 00:14, Joel Jacobson wrote:
>> Changes since patch 0005 from 2021-03-25:
>
On Thu, Jun 13, 2024, at 00:14, Joel Jacobson wrote:
> Changes since patch 0005 from 2021-03-25:
> * 0006-pg_privileges-and-pg_ownerships.patch
- Also much faster now thanks to pg_get_acl():
Test with 10 tables:
SELECT COUNT(*) FROM pg_permissions_union_all;
Time: 1466.504 ms (00:
On Thu, Mar 10, 2022, at 22:02, Chapman Flack wrote:
> It looked to me as if the -hackers messages of 25 and 26 March 2021 had
> found a consensus that a pg_get_acl() function would be a good thing,
> with the views to be implemented over that.
>
> I'm just not seeing any later patch that adds
Hi,
This patch introduces the Karatsuba algorithm to speed up multiplication
operations in numeric.c, where the operands have many digits.
It is implemented via a new conditional in mul_var() that determines whether
the sizes of the factors are sufficiently large to justify its use. This
On Tue, Feb 13, 2024, at 23:56, Corey Huinker wrote:
> This patch came out of a discussion at the last PgCon with the person
> who made the "fringe feature" quote, who seemed quite supportive of
> documenting the technique. The comment may have been in regards to
> actually implementing a LIMIT
On Tue, Feb 13, 2024, at 10:28, Laurenz Albe wrote:
> On Mon, 2024-02-12 at 12:24 -0500, Corey Huinker wrote:
>> > Do you plan to add it to the commitfest? If yes, I'd set it "ready for
>> > committer".
>>
>> Commitfest entry reanimated.
>
> Truly... you created a revenant in the already
On Sun, Feb 11, 2024, at 14:58, Robert Haas wrote:
> It's not entirely clear to me what our wider vision is here. Some
> people seem to want a whole series of flags that can disable various
> things that the superuser might otherwise be able to do,
Yes, that's what bothers me a little with the
On Fri, Sep 8, 2023, at 16:17, Gabriele Bartolini wrote:
> ```
> postgres=# ALTER SYSTEM SET wal_level TO minimal;
> ERROR: could not open file "postgresql.auto.conf": Permission denied
> ```
+1 to simply mark postgresql.auto.conf file as not being writeable.
To improve the UX experience, how
On Fri, Jun 30, 2023, at 06:50, jian he wrote:
> more like a C questions
> in this context does
> #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data +
> CEIL_DIV((set)->capacity, 8)))
> define first, then define struct int4hashset_t. Is this normally ok?
Yes, it's fine. Macros are just text
On Wed, Jun 28, 2023, at 08:26, jian he wrote:
> Hi there.
> I changed the function hashset_contains to strict.
Changing hashset_contains to STRICT would cause it to return NULL
if any of the operands are NULL, which I don't believe is correct, since:
SELECT NULL = ANY('{}'::int4[]);
?column?
On Tue, Jun 27, 2023, at 10:26, Joel Jacobson wrote:
> Attachments:
> * hashset-0.0.1-b7e5614-full.patch
> * hashset-0.0.1-b7e5614-incremental.patch
To help verify that the semantics, I thought it might be helpful to provide
a comprehensive set of examples that tries to cover all diffe
On Mon, Jun 26, 2023, at 13:06, jian he wrote:
> Can you try to glue the attached to the hashset data type input
> function.
> the attached will parse cstring with double quote and not. so '{1,2,3}'
> == '{"1","2","3"}'. obviously quote will preserve the inner string as
> is.
> currently
On Sun, Jun 25, 2023, at 11:42, Joel Jacobson wrote:
> SELECT hashset_contains('{}'::int4hashset, NULL::int);
>
> would be False, according to the General Rules.
>
...
> Applying the same rules, we'd have to return Unknown (which we represent as
> null) for:
>
>
On Sat, Jun 24, 2023, at 21:16, Joel Jacobson wrote:
> New version of int4hashset_contains() that should follow the same
> General Rules as MULTISET's MEMBER OF (8.16 ).
...
> SELECT hashset_contains('{}'::int4hashset, NULL::int); -- false
...
> SELECT hashset_contains('{null}'::int4h
New version of int4hashset_contains() that should follow the same
General Rules as MULTISET's MEMBER OF (8.16 ).
The first rule is to return False if the cardinality is 0 (zero).
However, we must first check if the first argument is null,
in which case the cardinality cannot be 0 (zero),
so if
On Thu, Jun 22, 2023, at 07:51, Joel Jacobson wrote:
> For instance, how should hashset_count() work?
>
> Given the query,
>
> SELECT hashset_count('{1,2,3,null}'::int4hashset);
>
> Should we,
>
> a) threat NULL as a distinct value and return 4?
>
> b) ignore NULL
On Fri, Jun 23, 2023, at 08:40, jian he wrote:
> I played around array_func.c
> many of the code can be used for multiset data type.
> now I imagine multiset as something like one dimension array. (nested
> is somehow beyond the imagination...).
Are you suggesting it might be a better idea to
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> This is also what the SQL standard does for multisets - there's SQL:20nn
> draft at http://www.wiscorp.com/SQLStandards.html, and the predicate> section (p. 475) explains how this should work with NULL.
I've looked again at the paper you
On Tue, Jun 20, 2023, at 18:25, Tomas Vondra wrote:
> On 6/20/23 16:56, Joel Jacobson wrote:
>> The reference to consistency with what we do elsewhere might not be entirely
>> applicable in this context, since the set feature we're designing is a new
>> beast
>> in the
On Tue, Jun 20, 2023, at 16:56, Joel Jacobson wrote:
> I think we have an opportunity here to innovate and potentially influence a
> future set concept in the SQL standard.
Adding to my previous note - If there's a worry about future SQL standards
introducing SETs with NULLs, c
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> On 6/20/23 12:59, Joel Jacobson wrote:
>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>>> should return null?
>>
>> I agree, it should
On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select hashset_contains('{1,2}'::int4hashset,NULL::int);
> should return null?
I agree, it should.
I've now changed all functions except int4hashset() (the init function)
and the aggregate functions to be STRICT.
I think this patch is OK to send
On Tue, Jun 20, 2023, at 02:04, Tomas Vondra wrote:
> For UPDATE, it'd be pretty clear too, I think. It's possible to do
>
>UPDATE table SET col = SET[1,2,3]
>
> and it's clear the first is the command SET, while the second is a set
> constructor. For SELECT there'd be conflict, and for ALTER
On Mon, Jun 19, 2023, at 14:59, Tomas Vondra wrote:
> What unexpected issues you mean? Sure, if someone uses multisets as if
> they were sets (so ignoring the handling of duplicates), things will go
> booom! quickly.
The unexpected issues I had in mind are subtle bugs due to treating multisets
as
On Mon, Jun 19, 2023, at 11:49, jian he wrote:
> hashset_to_array function should be strict?
>
> I noticed hashset_symmetric_difference and hashset_difference handle
> null in a different way, seems they should handle null in a consistent
> way?
Yes, I agree, they should be consistent.
I've
On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
> AFAICS the standard only defines arrays and multisets. Arrays are pretty
> much the thing we have, including the ARRAY[] constructor etc. Multisets
> are similar to hashset discussed here, except that it tracks the number
> of elements for each
On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> select hashset_contains('{1,2}'::int4hashset,NULL::int);
> should return null?
Hmm, that's a good philosophical question.
I notice Tomas Vondra in the initial commit opted for allowing NULL inputs,
treating them as empty sets, e.g. in
On Sun, Jun 18, 2023, at 18:45, Andrew Dunstan wrote:
> . It might be worth sending a version number with the send function
> (c.f. jsonb_send / jsonb_recv). That way would would not be tied forever
> to some wire representation.
Great idea; implemented.
> . I think there are some important
On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote:
> I realise int4hashset_hash() is broken,
> since two int4hashset's that are considered equal,
> can by coincidence get different hashes:
...
> Do we have any ideas on how to fix this without sacrificing performance?
The pro
On Fri, Jun 16, 2023, at 13:57, jian he wrote:
> similar to (int[] || int4) and (int4 || int[])
> should we expect ('{1,2}'::int4hashset || 3) == (3 ||
> '{1,2}'::int4hashset) == (select hashset_add('{1,2}'::int4hashset,3));
> *?*
Good idea, makes sense to support it.
Implemented in attached
New patch attached:
Add customizable params to int4hashset() and collision count function
This commit enhances int4hashset() by introducing adjustable capacity,
load, and growth factors, providing flexibility for performance optimization.
Also added is a new function, hashset_collisions(), to
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> In hashset/test/sql/order.sql, can we add the following to test whether
> the optimizer will use our index.
>
> CREATE INDEX ON test_int4hashset_order (int4hashset_col
> int4hashset_btree_ops);
>
> -- to make sure that this work with just two
On Thu, Jun 15, 2023, at 06:29, jian he wrote:
> I am not sure the following results are correct.
> with cte as (
> select hashset(x) as x
> ,hashset_capacity(hashset(x))
> ,hashset_count(hashset(x))
> from generate_series(1,10) g(x))
> select *
> ,'|' as
On Thu, Jun 15, 2023, at 11:44, jian he wrote:
> I didn't install the extension directly. I copied the
> hashset--0.0.1.sql to another place, using gcc to compile these
> functions.
..
> Because even make
> PG_CONFIG=/home/jian/postgres/2023_05_25_beta5421/bin/pg_config still
> has an error.
On Thu, Jun 15, 2023, at 04:22, jian he wrote:
> Attachments:
> * temp.patch
Thanks for good suggestions.
New patch attached:
Enhance parsing and reorder headers in hashset module
Allow whitespaces in hashset input and reorder the inclusion of
header files, placing PostgreSQL headers first.
On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> On 6/14/23 14:57, Joel Jacobson wrote:
>> Would it be feasible to teach the planner to utilize the internal hash table
>> of
>> hashset directly? In the case of arrays, the hash table construction is an
...
> It's
On Wed, Jun 14, 2023, at 11:44, Tomas Vondra wrote:
>> Perspective from a potential user: I'm currently working on something
>> where an array-like structure with fast membership test performance
>> would be very useful. The main type of query is doing an =ANY(the set)
>> filter, where the set
1 - 100 of 421 matches
Mail list logo