Optimize numeric.c mul_var() using the Karatsuba algorithm

2024-04-15 Thread Joel Jacobson
Hi, This patch introduces the Karatsuba algorithm to speed up multiplication operations in numeric.c, where the operands have many digits. It is implemented via a new conditional in mul_var() that determines whether the sizes of the factors are sufficiently large to justify its use. This

Re: Document efficient self-joins / UPDATE LIMIT techniques.

2024-02-14 Thread Joel Jacobson
On Tue, Feb 13, 2024, at 23:56, Corey Huinker wrote: > This patch came out of a discussion at the last PgCon with the person > who made the "fringe feature" quote, who seemed quite supportive of > documenting the technique. The comment may have been in regards to > actually implementing a LIMIT

Re: Document efficient self-joins / UPDATE LIMIT techniques.

2024-02-13 Thread Joel Jacobson
On Tue, Feb 13, 2024, at 10:28, Laurenz Albe wrote: > On Mon, 2024-02-12 at 12:24 -0500, Corey Huinker wrote: >> > Do you plan to add it to the commitfest? If yes, I'd set it "ready for >> > committer". >> >> Commitfest entry reanimated. > > Truly... you created a revenant in the already

Re: Possibility to disable `ALTER SYSTEM`

2024-02-12 Thread Joel Jacobson
On Sun, Feb 11, 2024, at 14:58, Robert Haas wrote: > It's not entirely clear to me what our wider vision is here. Some > people seem to want a whole series of flags that can disable various > things that the superuser might otherwise be able to do, Yes, that's what bothers me a little with the

Re: Possibility to disable `ALTER SYSTEM`

2024-02-07 Thread Joel Jacobson
On Fri, Sep 8, 2023, at 16:17, Gabriele Bartolini wrote: > ``` > postgres=# ALTER SYSTEM SET wal_level TO minimal; > ERROR: could not open file "postgresql.auto.conf": Permission denied > ``` +1 to simply mark postgresql.auto.conf file as not being writeable. To improve the UX experience, how

Re: Do we want a hashset type?

2023-07-01 Thread Joel Jacobson
On Fri, Jun 30, 2023, at 06:50, jian he wrote: > more like a C questions > in this context does > #define HASHSET_GET_VALUES(set) ((int32 *) ((set)->data + > CEIL_DIV((set)->capacity, 8))) > define first, then define struct int4hashset_t. Is this normally ok? Yes, it's fine. Macros are just text

Re: Do we want a hashset type?

2023-06-28 Thread Joel Jacobson
On Wed, Jun 28, 2023, at 08:26, jian he wrote: > Hi there. > I changed the function hashset_contains to strict. Changing hashset_contains to STRICT would cause it to return NULL if any of the operands are NULL, which I don't believe is correct, since: SELECT NULL = ANY('{}'::int4[]); ?column?

Re: Do we want a hashset type?

2023-06-27 Thread Joel Jacobson
On Tue, Jun 27, 2023, at 10:26, Joel Jacobson wrote: > Attachments: > * hashset-0.0.1-b7e5614-full.patch > * hashset-0.0.1-b7e5614-incremental.patch To help verify that the semantics, I thought it might be helpful to provide a comprehensive set of examples that tries to cover all diffe

Re: Do we want a hashset type?

2023-06-26 Thread Joel Jacobson
On Mon, Jun 26, 2023, at 13:06, jian he wrote: > Can you try to glue the attached to the hashset data type input > function. > the attached will parse cstring with double quote and not. so '{1,2,3}' > == '{"1","2","3"}'. obviously quote will preserve the inner string as > is. > currently

Re: Do we want a hashset type?

2023-06-25 Thread Joel Jacobson
On Sun, Jun 25, 2023, at 11:42, Joel Jacobson wrote: > SELECT hashset_contains('{}'::int4hashset, NULL::int); > > would be False, according to the General Rules. > ... > Applying the same rules, we'd have to return Unknown (which we represent as > null) for: > >

Re: Do we want a hashset type?

2023-06-25 Thread Joel Jacobson
On Sat, Jun 24, 2023, at 21:16, Joel Jacobson wrote: > New version of int4hashset_contains() that should follow the same > General Rules as MULTISET's MEMBER OF (8.16 ). ... > SELECT hashset_contains('{}'::int4hashset, NULL::int); -- false ... > SELECT hashset_contains('{null}'::int4h

Re: Do we want a hashset type?

2023-06-24 Thread Joel Jacobson
New version of int4hashset_contains() that should follow the same General Rules as MULTISET's MEMBER OF (8.16 ). The first rule is to return False if the cardinality is 0 (zero). However, we must first check if the first argument is null, in which case the cardinality cannot be 0 (zero), so if

Re: Do we want a hashset type?

2023-06-24 Thread Joel Jacobson
On Thu, Jun 22, 2023, at 07:51, Joel Jacobson wrote: > For instance, how should hashset_count() work? > > Given the query, > > SELECT hashset_count('{1,2,3,null}'::int4hashset); > > Should we, > > a) threat NULL as a distinct value and return 4? > > b) ignore NULL

Re: Do we want a hashset type?

2023-06-23 Thread Joel Jacobson
On Fri, Jun 23, 2023, at 08:40, jian he wrote: > I played around array_func.c > many of the code can be used for multiset data type. > now I imagine multiset as something like one dimension array. (nested > is somehow beyond the imagination...). Are you suggesting it might be a better idea to

Re: Do we want a hashset type?

2023-06-22 Thread Joel Jacobson
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote: > This is also what the SQL standard does for multisets - there's SQL:20nn > draft at http://www.wiscorp.com/SQLStandards.html, and the predicate> section (p. 475) explains how this should work with NULL. I've looked again at the paper you

Re: Do we want a hashset type?

2023-06-21 Thread Joel Jacobson
On Tue, Jun 20, 2023, at 18:25, Tomas Vondra wrote: > On 6/20/23 16:56, Joel Jacobson wrote: >> The reference to consistency with what we do elsewhere might not be entirely >> applicable in this context, since the set feature we're designing is a new >> beast >> in the

Re: Do we want a hashset type?

2023-06-20 Thread Joel Jacobson
On Tue, Jun 20, 2023, at 16:56, Joel Jacobson wrote: > I think we have an opportunity here to innovate and potentially influence a > future set concept in the SQL standard. Adding to my previous note - If there's a worry about future SQL standards introducing SETs with NULLs, c

Re: Do we want a hashset type?

2023-06-20 Thread Joel Jacobson
On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote: > On 6/20/23 12:59, Joel Jacobson wrote: >> On Mon, Jun 19, 2023, at 02:00, jian he wrote: >>> select hashset_contains('{1,2}'::int4hashset,NULL::int); >>> should return null? >> >> I agree, it should

Re: Do we want a hashset type?

2023-06-20 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 02:00, jian he wrote: > select hashset_contains('{1,2}'::int4hashset,NULL::int); > should return null? I agree, it should. I've now changed all functions except int4hashset() (the init function) and the aggregate functions to be STRICT. I think this patch is OK to send

Re: Do we want a hashset type?

2023-06-19 Thread Joel Jacobson
On Tue, Jun 20, 2023, at 02:04, Tomas Vondra wrote: > For UPDATE, it'd be pretty clear too, I think. It's possible to do > >UPDATE table SET col = SET[1,2,3] > > and it's clear the first is the command SET, while the second is a set > constructor. For SELECT there'd be conflict, and for ALTER

Re: Do we want a hashset type?

2023-06-19 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 14:59, Tomas Vondra wrote: > What unexpected issues you mean? Sure, if someone uses multisets as if > they were sets (so ignoring the handling of duplicates), things will go > booom! quickly. The unexpected issues I had in mind are subtle bugs due to treating multisets as

Re: Do we want a hashset type?

2023-06-19 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 11:49, jian he wrote: > hashset_to_array function should be strict? > > I noticed hashset_symmetric_difference and hashset_difference handle > null in a different way, seems they should handle null in a consistent > way? Yes, I agree, they should be consistent. I've

Re: Do we want a hashset type?

2023-06-19 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote: > AFAICS the standard only defines arrays and multisets. Arrays are pretty > much the thing we have, including the ARRAY[] constructor etc. Multisets > are similar to hashset discussed here, except that it tracks the number > of elements for each

Re: Do we want a hashset type?

2023-06-19 Thread Joel Jacobson
On Mon, Jun 19, 2023, at 02:00, jian he wrote: > select hashset_contains('{1,2}'::int4hashset,NULL::int); > should return null? Hmm, that's a good philosophical question. I notice Tomas Vondra in the initial commit opted for allowing NULL inputs, treating them as empty sets, e.g. in

Re: Do we want a hashset type?

2023-06-18 Thread Joel Jacobson
On Sun, Jun 18, 2023, at 18:45, Andrew Dunstan wrote: > . It might be worth sending a version number with the send function > (c.f. jsonb_send / jsonb_recv). That way would would not be tied forever > to some wire representation. Great idea; implemented. > . I think there are some important

Re: Do we want a hashset type?

2023-06-16 Thread Joel Jacobson
On Fri, Jun 16, 2023, at 17:42, Joel Jacobson wrote: > I realise int4hashset_hash() is broken, > since two int4hashset's that are considered equal, > can by coincidence get different hashes: ... > Do we have any ideas on how to fix this without sacrificing performance? The pro

Re: Do we want a hashset type?

2023-06-16 Thread Joel Jacobson
On Fri, Jun 16, 2023, at 13:57, jian he wrote: > similar to (int[] || int4) and (int4 || int[]) > should we expect ('{1,2}'::int4hashset || 3) == (3 || > '{1,2}'::int4hashset) == (select hashset_add('{1,2}'::int4hashset,3)); > *?* Good idea, makes sense to support it. Implemented in attached

Re: Do we want a hashset type?

2023-06-16 Thread Joel Jacobson
New patch attached: Add customizable params to int4hashset() and collision count function This commit enhances int4hashset() by introducing adjustable capacity, load, and growth factors, providing flexibility for performance optimization. Also added is a new function, hashset_collisions(), to

Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 11:44, jian he wrote: > In hashset/test/sql/order.sql, can we add the following to test whether > the optimizer will use our index. > > CREATE INDEX ON test_int4hashset_order (int4hashset_col > int4hashset_btree_ops); > > -- to make sure that this work with just two

Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 06:29, jian he wrote: > I am not sure the following results are correct. > with cte as ( > select hashset(x) as x > ,hashset_capacity(hashset(x)) > ,hashset_count(hashset(x)) > from generate_series(1,10) g(x)) > select * > ,'|' as

Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 11:44, jian he wrote: > I didn't install the extension directly. I copied the > hashset--0.0.1.sql to another place, using gcc to compile these > functions. .. > Because even make > PG_CONFIG=/home/jian/postgres/2023_05_25_beta5421/bin/pg_config still > has an error.

Re: Do we want a hashset type?

2023-06-15 Thread Joel Jacobson
On Thu, Jun 15, 2023, at 04:22, jian he wrote: > Attachments: > * temp.patch Thanks for good suggestions. New patch attached: Enhance parsing and reorder headers in hashset module Allow whitespaces in hashset input and reorder the inclusion of header files, placing PostgreSQL headers first.

Re: Do we want a hashset type?

2023-06-14 Thread Joel Jacobson
On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote: > On 6/14/23 14:57, Joel Jacobson wrote: >> Would it be feasible to teach the planner to utilize the internal hash table >> of >> hashset directly? In the case of arrays, the hash table construction is an ... > It's

Re: Do we want a hashset type?

2023-06-14 Thread Joel Jacobson
On Wed, Jun 14, 2023, at 11:44, Tomas Vondra wrote: >> Perspective from a potential user: I'm currently working on something >> where an array-like structure with fast membership test performance >> would be very useful. The main type of query is doing an =ANY(the set) >> filter, where the set

Re: Do we want a hashset type?

2023-06-14 Thread Joel Jacobson
On Wed, Jun 14, 2023, at 06:31, Tom Dunstan wrote: > On Mon, 12 Jun 2023 at 22:37, Tomas Vondra > wrote: >> Perhaps. So you're proposing to have this as a regular built-in type? >> It's hard for me to judge how popular this feature would be, but I guess >> people often use arrays while they

Re: Do we want a hashset type?

2023-06-13 Thread Joel Jacobson
On Tue, Jun 13, 2023, at 20:50, Joel Jacobson wrote: > hashset is now using hash_bytes_uint32() from hashfn.h I spotted a problem in the ordering logic of the comparison functions. The issue was with handling hashsets containing empty positions, causing non-lexicographic ordering. The upda

Re: Do we want a hashset type?

2023-06-13 Thread Joel Jacobson
On Mon, Jun 12, 2023, at 22:36, Joel Jacobson wrote: > On Mon, Jun 12, 2023, at 21:58, Tomas Vondra wrote: >> My suggestion is to be lazy, just use the lookup3 we have in hashfn.c >> (through hash_bytes or something), and at the same time make it possible >> to switch to

Re: Do we want a hashset type?

2023-06-12 Thread Joel Jacobson
On Mon, Jun 12, 2023, at 21:58, Tomas Vondra wrote: > But those are numbers for large keys - if you restrict the input to > 4B-16B (which is what we planned to do by focusing on int, bigint and > uuid), there's no significant difference: Oh, sorry, I completely failed to read the meaning of the

Re: Do we want a hashset type?

2023-06-12 Thread Joel Jacobson
On Sat, Jun 10, 2023, at 22:12, Tomas Vondra wrote: >>> 1) better hash table implementation I noticed src/include/common/hashfn.h provides implementation of the Jenkins/lookup3 hash function, and thought maybe we could simply use it in hashset? However, I noticed that according to SMHasher [1],

Re: Do we want a hashset type?

2023-06-12 Thread Joel Jacobson
On Sat, Jun 10, 2023, at 17:46, Andrew Dunstan wrote: > Maybe you can post a full patch as well as incremental? Attached patch is based on tvondra's last commit 375b072. > Stylistically I think you should reduce reliance on magic numbers (like 13). > Probably need some #define's? Great idea,

Re: Let's make PostgreSQL multi-threaded

2023-06-12 Thread Joel Jacobson
On Mon, Jun 12, 2023, at 13:53, Tomas Vondra wrote: > In a way, I think this "split into independently beneficial steps" > strategy is the only option with a meaningful chance of success. +1 /Joel

Re: Do we want a hashset type?

2023-06-11 Thread Joel Jacobson
On Sun, Jun 11, 2023, at 17:03, Tomas Vondra wrote: > On 6/11/23 12:26, Joel Jacobson wrote: >> I think there are some good arguments that speaks in favour of including it >> in core: ... > > I agree with all of that, but ... > > It's just past feature freeze, so the

Re: Do we want a hashset type?

2023-06-11 Thread Joel Jacobson
On Sun, Jun 11, 2023, at 16:58, Andrew Dunstan wrote: >>On 2023-06-11 Su 06:26, Joel Jacobson wrote: >>Perhaps "set" would have been a better name, since the use of hash functions >>from an end-user perspective is >>implementation details, but we cannot use

Re: Do we want a hashset type?

2023-06-11 Thread Joel Jacobson
On Sat, Jun 10, 2023, at 22:26, Tomas Vondra wrote: > On 6/10/23 17:46, Andrew Dunstan wrote: >> >> Maybe you can post a full patch as well as incremental? >> > > I wonder if we should keep discussing this extension here, considering > it's going to be out of core (at least for now). Not sure

Re: Do we want a hashset type?

2023-06-09 Thread Joel Jacobson
On Fri, Jun 9, 2023, at 13:33, jian he wrote: > Hi, I am quite new about C. > The following function I have 3 questions. > 1. 7691,4201, I assume they are just random prime ints? Yes, 7691 and 4201 are likely chosen as random prime numbers. In hash functions, prime numbers are often used to

Re: Do we want a hashset type?

2023-06-09 Thread Joel Jacobson
On Thu, Jun 8, 2023, at 12:19, Tomas Vondra wrote: > Would you be interested in helping with / working on some of that? I > don't have immediate need for this stuff, so it's not very high on my > TODO list. Sure, I'm willing to help! I've attached a patch that works on some of the items on your

[btree-indexed column] <@ [range | multirange]

2023-06-08 Thread Joel Jacobson
I've noticed the planner is not yet smart enough to do an index scan when the left operand of a contains operator (<@) is a btree-indexed column and the right operand is a range or multirange type of the same type as the column. For instance, given a users table with an id int primary key column,

Re: Do we want a hashset type?

2023-06-08 Thread Joel Jacobson
On Wed, Jun 7, 2023, at 19:37, Tomas Vondra wrote: > Interesting, considering how dumb the the hash table implementation is. That's promising. >> I tested Neo4j and the results are surprising; it appears to be >> significantly *slower*. >> However, I've probably misunderstood something, maybe I

Re: Do we want a hashset type?

2023-06-07 Thread Joel Jacobson
On Tue, Jun 6, 2023, at 13:20, Tomas Vondra wrote: > it cuts the timing to about 50% on my laptop, so maybe it'll be ~300ms > on your system. There's a bunch of opportunities for more improvements, > as the hash table implementation is pretty naive/silly, the on-disk > format is wasteful and so

Re: Do we want a hashset type?

2023-06-05 Thread Joel Jacobson
On Mon, Jun 5, 2023, at 01:44, Tomas Vondra wrote: > Anyway, I hacked together a trivial set backed by an open addressing > hash table: > > https://github.com/tvondra/hashset > > It's super-basic, providing just some bare minimum of functions, but > hopefully good enough for experiments. > > -

Re: Do we want a hashset type?

2023-06-05 Thread Joel Jacobson
On Fri, Jun 2, 2023, at 10:01, Ants Aasma wrote: > Have you looked at roaring bitmaps? There is a pg_roaringbitmap > extension [1] already available that offers very fast unions, > intersections and membership tests over integer sets. I used it to get > some pretty impressive performance results

Re: Do we want a hashset type?

2023-06-01 Thread Joel Jacobson
On Thu, Jun 1, 2023, at 09:02, Joel Jacobson wrote: > Here is an example using a real anonymised social network. I realised the "found" column is not necessary in this particular example, since we only care about the friends at the exact depth level. Simplified query: CREATE OR

Re: Do we want a hashset type?

2023-06-01 Thread Joel Jacobson
On Wed, May 31, 2023, at 18:59, Tomas Vondra wrote: > How does storing just the IDs solves anything? Isn't the main challenge > the random I/O when fetching the adjacent nodes? This does not really > improve that, no? I'm thinking of a recursive query where a lot of time is just spent following

Re: Do we want a hashset type?

2023-05-31 Thread Joel Jacobson
On Wed, May 31, 2023, at 16:53, Tomas Vondra wrote: > I think this needs a better explanation - what exactly is a hashset in > this context? Something like an array with a hash for faster lookup of > unique elements, or what? In this context, by "hashset" I am indeed referring to a data structure

Do we want a hashset type?

2023-05-31 Thread Joel Jacobson
Hi, I've been working with a social network start-up that uses PostgreSQL as their only database. Recently, they became interested in graph databases, largely because of an article [1] suggesting that a SQL database "just chokes" when it encounters a depth-five friends-of-friends query (for a

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-20 Thread Joel Jacobson
On Fri, May 19, 2023, at 18:06, Daniel Verite wrote: > COPY FROM file CSV somewhat differs as your example shows, > but it still mishandle \. when unquoted. For instance, consider this > file to load with COPYt FROM '/tmp/t.csv' WITH CSV > $ cat /tmp/t.csv > line 1 > \. > line 3 > line 4 >

Re: New COPY options: DELIMITER NONE and QUOTE NONE

2023-05-20 Thread Joel Jacobson
On Fri, May 19, 2023, at 19:03, Andrew Dunstan wrote: > I think you've been a bit too cute with the grammar changes, but as you say > this is a POC. Thanks for feedback. The approach I took for the new grammar rules was inspired by previous commits, such as de7531a971b, which introduced support

New COPY options: DELIMITER NONE and QUOTE NONE

2023-05-19 Thread Joel Jacobson
The thread "Should CSV parsing be stricter about mid-field quotes?" [1] forked into a new topic, with two new ideas, hence this new thread. 1. COPY ... QUOTE NONE In the [1] thread, Andrew Dunstan suggested a trick on how to deal with unquoted but delimited files, such as TSV-files produced by

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-19 Thread Joel Jacobson
On Thu, May 18, 2023, at 18:48, Daniel Verite wrote: > Joel Jacobson wrote: >> OTOH, one would then need to inspect the TSV file doesn't contain \. on an >> empty line... > > Note that this is the case for valid CSV contents, since backslash-dot > on a line by itself is bo

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-18 Thread Joel Jacobson
On Thu, May 18, 2023, at 08:35, Pavel Stehule wrote: > Maybe there is another third implementation in Libre Office. > > Generally TSV is not well specified, and then the implementations are not > consistent. Thanks Pavel, that was a very interesting case indeed: Libre Office (tested on Mac)

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-18 Thread Joel Jacobson
On Thu, May 18, 2023, at 08:00, Joel Jacobson wrote: > 1. How about adding a `WITHOUT QUOTE` or `QUOTE NONE` option in conjunction > with `COPY ... WITH CSV`? More ideas: [ QUOTE 'quote_character' | UNQUOTED ] or [ QUOTE 'quote_character' | NO_QUOTE ] Thinking about it, I recall anothe

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-18 Thread Joel Jacobson
On Thu, May 18, 2023, at 00:18, Kirk Wolak wrote: > Here you go. Not horrible handling. (I use DataGrip so I saved it from there > directly as TSV, just for an extra datapoint). > > FWIW, if you copy/paste in windows, the data, the field with the tab gets > split into another column in Excel. But

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-17 Thread Joel Jacobson
On Wed, May 17, 2023, at 19:42, Andrew Dunstan wrote: > You can use CSV mode pretty reliably for TSV files. The trick is to use a > quoting char that shouldn't appear, such as E'\x01' as well as setting the > delimiter to E'\t'. Yes, it's far from obvious. I've been using that trick myself many

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-16 Thread Joel Jacobson
On Tue, May 16, 2023, at 13:43, Joel Jacobson wrote: >If we made midfield quoting a CSV error, those users who are currently mistaken >about their TSV/TEXT files being CSV while also having balanced quotes in their >data, would encounter an error rather than a silent failure, which

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-16 Thread Joel Jacobson
On Sun, May 14, 2023, at 16:58, Andrew Dunstan wrote: > And if people do follow the method you describe then their input with > unescaped quotes will be rejected 999 times out of 1000. It's only cases where > the field happens to have an even number of embedded quotes, like Joel's > somewhat

Re: Should CSV parsing be stricter about mid-field quotes?

2023-05-13 Thread Joel Jacobson
On Fri, May 12, 2023, at 21:57, Andrew Dunstan wrote: > Maybe this is unexpected by you, but it's not by me. What other sane > interpretation of that data could there be? And what CSV producer outputs > such horrible content? As you've noted, ours certainly does not. Our rules > are clear:

Should CSV parsing be stricter about mid-field quotes?

2023-05-11 Thread Joel Jacobson
Hi hackers, I've come across an unexpected behavior in our CSV parser that I'd like to bring up for discussion. % cat example.csv id,rating,review 1,5,"Great product, will buy again." 2,3,"I bought this for my 6" laptop but it didn't fit my 8" tablet" % psql CREATE TABLE reviews (id int, rating

Re: Missing free_var() at end of accum_sum_final()?

2023-03-05 Thread Joel Jacobson
On Fri, Mar 3, 2023, at 16:11, Dean Rasheed wrote: > Attachments: > * make-result-using-vars-buf-v2.patch One suggestion: maybe add a comment explaining why the allocated buffer which size is based on strlen(cp) for the decimal digit values, is guaranteed to be large enough also for the result's

Re: Missing free_var() at end of accum_sum_final()?

2023-03-04 Thread Joel Jacobson
On Fri, Mar 3, 2023, at 16:11, Dean Rasheed wrote: > So IMO the results just don't justify such an extensive set of > changes, and I think we should abandon this fixed buffer approach. I agree. I was hoping it would be possible to reduce the invasiveness, but I think it's difficult and probably

Re: Missing free_var() at end of accum_sum_final()?

2023-02-20 Thread Joel Jacobson
Hi, My apologies, it seems my email didn't reach the list, probably due to the benchmark plot images being to large. Here is the email again, but with URLs to the images instead, and benchmark updated with results for the 0005-fixed-buf.patch. -- On Mon, Feb 20, 2023, at 12:32, Dean Rasheed

Re: Missing free_var() at end of accum_sum_final()?

2023-02-20 Thread Joel Jacobson
Hi, I found another small but significant improvement of the previous patch: else if (ndigits < var->buf_len) { -memset(var->buf, 0, var->buf_len); +var->buf[0] = 0; var->digits = var->buf + 1; var->ndigits = ndigits; } We don't need to set all buf elements to zero, only the

Re: Missing free_var() at end of accum_sum_final()?

2023-02-20 Thread Joel Jacobson
On Mon, Feb 20, 2023, at 08:38, Michael Paquier wrote: > Perhaps you should register this patch to the commit of March? Here > it is: > https://commitfest.postgresql.org/42/ Thanks, done. /Joel

Re: Missing free_var() at end of accum_sum_final()?

2023-02-19 Thread Joel Jacobson
On Sun, Feb 19, 2023, at 23:16, Joel Jacobson wrote: > Hi again, > > Ignore previous patch, new correct version attached, that also keeps > track of if the buf-field is in use or not. Ops! One more thinko, detected when trying fixed_buf[8], which caused a seg fault. To fix

Re: Missing free_var() at end of accum_sum_final()?

2023-02-19 Thread Joel Jacobson
results was produced with: https://github.com/joelonsql/pg-timeit Now, the question is how big of a fixed_buf we want. I will run some more benchmarks. /Joel On Sun, Feb 19, 2023, at 20:54, Joel Jacobson wrote: > On Fri, Feb 17, 2023, at 21:26, Andres Freund wrote: >> Removing the free_va

Re: Missing free_var() at end of accum_sum_final()?

2023-02-19 Thread Joel Jacobson
On Fri, Feb 17, 2023, at 21:26, Andres Freund wrote: > Removing the free_var()s from numeric_add_opt_error() speeds it up from ~361ms > to ~338ms. I notice numeric_add_opt_error() is extern and declared in numeric.h, called from e.g. timestamp.c and jsonpath_exec.c. Is that a problem, i.e. is

Re: Missing free_var() at end of accum_sum_final()?

2023-02-15 Thread Joel Jacobson
On Thu, Feb 16, 2023, at 07:26, Michael Paquier wrote: > Indeed, it is true that any code path of numeric.c that relies on a > NumericVar with an allocation done in its buffer is careful enough to > free it, except for generate_series's SRF where one step of the > computation is done. I don't see

Missing free_var() at end of accum_sum_final()?

2023-02-15 Thread Joel Jacobson
Hi, I noticed the NumericVar's pos_var and neg_var are not free_var()'d at the end of accum_sum_final(). The potential memory leak seems small, since the function is called only once per sum() per worker (and from a few more places), but maybe they should be free'd anyways for correctness?

[PATCH] FIx alloc_var() ndigits thinko

2023-02-15 Thread Joel Jacobson
Hi, I came across another harmless thinko in numeric.c. It is harmless since 20/DEC_DIGITS and 40/DEC_DIGITS happens to be exactly 5 and 10 since DEC_DIGITS == 4, but should be fixed anyway for correctness IMO. - alloc_var(var, 20 / DEC_DIGITS); + alloc_var(var, (20 + DEC_DIGITS -

Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Joel Jacobson
On Tue, Jan 31, 2023, at 20:25, Dean Rasheed wrote: > That seems a bit wordy, given the context of this comment. I think > it's sufficient to just give the formula, and note that it simplifies > when DEC_DIGITS is even (not just 4): > > /* > * Assume the input was normalized, so

Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Joel Jacobson
Hi, On Tue, Jan 31, 2023, at 14:40, Dean Rasheed wrote: > That's still not right. If you want a proper mathematically justified > formula, it's fairly easy to derive. ... > or equivalently, in code with truncated integer division: > > if (arg.weight >= 0) > sweight = arg.weight *

Re: [PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-31 Thread Joel Jacobson
Hi, On Sun, Jan 29, 2023, at 14:33, Dean Rasheed wrote: > On Sat, 28 Jan 2023 at 22:14, Joel Jacobson wrote: >> HEAD, patched: >> sweight = (arg.weight * DEC_DIGITS) / 2 + 1 > > You haven't actually said why this formula is more correct than the > current one. I

[PATCH] Fix old thinko in formula to compute sweight in numeric_sqrt().

2023-01-28 Thread Joel Jacobson
Hi, I found what appears to be a small harmless error in numeric.c, that seems worthwhile to fix only because it's currently causes confusion. It hasn't caused any problems, since the incorrect formula happens to always produce the same result for DEC_DIGITS==4. However, for other DEC_DIGITS

Re: Non-decimal integer literals

2023-01-23 Thread Joel Jacobson
On Fri, Jan 13, 2023, at 07:01, Dean Rasheed wrote: > Attachments: > * 0001-Add-non-decimal-integer-support-to-type-numeric.patch Nice! This also simplifies when dealing with non-negative integers represented as byte arrays, common in e.g. cryptography code. Before, one had to implement

Re: [PATCH] Use 128-bit math to accelerate numeric division, when 8 < divisor digits <= 16

2023-01-22 Thread Joel Jacobson
On Sun, Jan 22, 2023, at 14:25, Dean Rasheed wrote: > I just modified the previous test you posted: > > \timing on > SELECT count(numeric_div_volatile(1e131071,123456)) FROM > generate_series(1,1e4); > > Time: 2048.060 ms (00:02.048)-- HEAD > Time: 2422.720 ms (00:02.423)-- With patch >

Re: [PATCH] Use 128-bit math to accelerate numeric division, when 8 < divisor digits <= 16

2023-01-22 Thread Joel Jacobson
On Sun, Jan 22, 2023, at 11:06, Dean Rasheed wrote: > Seems like a reasonable idea, with some pretty decent gains. > > Note, however, that for a divisor having fewer than 5 or 6 digits, > it's now significantly slower because it's forced to go through > div_var_int64() instead of div_var_int() for

Re: [PATCH] pg_permissions

2022-02-26 Thread Joel Jacobson
On Fri, Feb 25, 2022, at 22:12, Chapman Flack wrote: > I would be happy to review this patch, but a look through the email leaves me > thinking it may still be waiting on a C implementation of pg_get_acl(). Is > that > right? Not sure. > And perhaps a view rename to pg_privileges, following

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-23 Thread Joel Jacobson
On Wed, Feb 23, 2022, at 15:23, Andrew Dunstan wrote: > Yeah, I have several others on bitbucket that might be of interest (in > varying states of doneness): Thanks, added. /Joel

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-23 Thread Joel Jacobson
On Fri, Feb 11, 2022, at 04:46, Noah Misch wrote: > How did you make the list? (I'd imagine doing it by searching for > repositories containing evidence like \bpgxs\b matches.) Searching Github for repos with a *.control file in the root dir and a Makefile containing ^PGXS Hmm, now that you

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-23 Thread Joel Jacobson
On Fri, Feb 11, 2022, at 01:22, Ian Lawrence Barwick wrote: > More precisely it's a list of all? the repositories with PostgreSQL > extensions on > GitHub? Yes, the ambition is to list all repos at GitHub, and to manually add repos hosted at GitLab and other places. > OTOH not everything is on

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-23 Thread Joel Jacobson
On Tue, Feb 22, 2022, at 09:13, Aleksander Alekseev wrote: > Hi Joel, > >> I've compiled a list of all* PostgreSQL EXTENSIONs in the world: > > The list is great. Thanks for doing this! Glad to hear! > Just curious, is it a one-time experiment or you are going to keep the > list in an actual

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-21 Thread Joel Jacobson
On Mon, Feb 21, 2022, at 21:16, Joel Jacobson wrote: > As a first attempt, I've added the description from the Github-repos, and two > categories to start the discussion: > > - Uncategorized > - Foreign Data Wrappers Some more categories added: - Access Methods - Aggregate F

Re: List of all* PostgreSQL EXTENSIONs in the world

2022-02-21 Thread Joel Jacobson
On Thu, Feb 10, 2022, at 21:35, Robert Haas wrote: > I think a list like this is probably not useful without at least a > brief description of each one, and maybe some attempt at > categorization. +1 As a first attempt, I've added the description from the Github-repos, and two categories to

List of all* PostgreSQL EXTENSIONs in the world

2022-02-10 Thread Joel Jacobson
Hi hackers, I've compiled a list of all* PostgreSQL EXTENSIONs in the world: https://gist.github.com/joelonsql/e5aa27f8cc9bd22b8999b7de8aee9d47 *) It's not all, but 1041, compared to the 338 found on PGXN. Maybe it would be an idea to provide a lightweight solution, e.g. maintaining a simple

Re: [RFC] building postgres with meson

2022-02-09 Thread Joel Jacobson
On Tue, Oct 12, 2021, at 10:37, Andres Freund wrote: > - PGXS - and I don't yet know what to best do about it. One > backward-compatible way would be to continue use makefiles for pgxs, > but do the necessary replacement of Makefile.global.in via meson (and > not use that for postgres' own

Re: PSA: Autoconf has risen from the dead

2022-01-23 Thread Joel Jacobson
On Sun, Jan 23, 2022, at 17:29, Tom Lane wrote: >While chasing something else, I was surprised to learn that the >Autoconf project has started to make releases again. There are >2.70 (2020-12-08) and 2.71 (2021-01-28) versions available at >https://ftp.gnu.org/gnu/autoconf/ > >Right now, I'm not

Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Joel Jacobson
On Thu, Jan 13, 2022, at 20:12, Pavel Stehule wrote: >I cannot imagine how the "window" keyword can work in SQL context. In >Javascript "window" is an object - it is not a keyword, and it makes sense in >usual Javascript context inside HTML browsers. I was thinking since Javascript is by far

Re: Schema variables - new implementation for Postgres 15

2022-01-13 Thread Joel Jacobson
On Thu, Jan 13, 2022, at 18:24, Dean Rasheed wrote: > Those are examples that a malicious user might use, but even without > such examples, I think it would be far too easy to inadvertently break > a large application by defining a variable that conflicted with a > column name you didn't know

Re: pl/pgsql feature request: shorthand for argument and local variable references

2022-01-06 Thread Joel Jacobson
On Thu, Jan 6, 2022, at 21:38, Pavel Stehule wrote: > I say, semantically - how I understand the meaning of the word "in" is not > good to use it for generic alias of function arguments (when we have out > arguments too). Trying to imagine a situation where you would need a shorthand also for

Re: pl/pgsql feature request: shorthand for argument and local variable references

2022-01-06 Thread Joel Jacobson
I merely suggest reusing an existing reserved keyword. >>čt 6. 1. 2022 v 20:03 odesílatel Joel Jacobson napsal: >> >>If "in." would work, due to "in" being a reserved SQL keyword, >>don't you think the benefits of a SQL standardized solution would outw

Re: pl/pgsql feature request: shorthand for argument and local variable references

2022-01-06 Thread Joel Jacobson
On Thu, Jan 6, 2022, at 19:03, Pavel Stehule wrote: > The possibility to define a label dynamically is a better solution (not by > some buildin keyword), > because it allows some possibility for the end user to define what he prefers. I'm trying to understand why you think a user-defined

  1   2   3   4   >