Re: Refactoring SysCacheGetAttr to know when attr cannot be NULL
Justin Pryzby writes:
> On Mon, Mar 13, 2023 at 02:19:07PM +0100, Daniel Gustafsson wrote:
>> +			elog(ERROR,
>> +				 "unexpected NULL value in cached tuple for pg_catalog.%s.%s",
>> +				 get_rel_name(cacheinfo[cacheId].reloid),

> Question: Is it safe to be making catalog access inside an error
> handler, when one of the most likely reasons for hitting the error is
> catalog corruption?

I don't see a big problem here.  If there were catalog corruption
preventing fetching the catalog's pg_class row, it's highly unlikely
that you'd have managed to retrieve a catalog row to complain about.
(That is, corruption in this particular catalog entry probably does not
extend to the metadata about the catalog containing it.)

> Maybe the answer is that it's not "safe" but "safe enough"

Right.  If we got concerned about this we could dodge the extra catalog
access by adding the catalog's name to CatCache entries.  I doubt it's
worth it though.  We can always re-evaluate if we see actual evidence
of problems.

			regards, tom lane
Re: Refactoring SysCacheGetAttr to know when attr cannot be NULL
On Mon, Mar 13, 2023 at 02:19:07PM +0100, Daniel Gustafsson wrote:
> > On 2 Mar 2023, at 15:44, Tom Lane wrote:
> >
> > Peter Eisentraut writes:
> >> I think an error message like
> >> "unexpected null value in system cache %d column %d"
> >> is sufficient.  Since these are "can't happen" errors, we don't need to
> >> spend too much extra effort to make it prettier.
> >
> > I'd at least like to see it give the catalog's OID.  That's easily
> > convertible to a name, and it doesn't tend to move around across PG
> > versions, neither of which are true for syscache IDs.
> >
> > Also, I'm fairly unconvinced that it's a "can't happen" --- this
> > would be very likely to fire as a result of catalog corruption,
> > so it would be good if it's at least minimally interpretable
> > by a non-expert.  Given that we'll now have just one copy of the
> > code, ISTM there's a good case for doing the small extra work
> > to report catalog and column by name.
>
> Rebased v3 on top of recent conflicting ICU changes causing the patch
> to not apply anymore.  Also took another look around the tree to see if
> there were missed callsites but found none new.

+++ b/src/backend/utils/cache/syscache.c
@@ -77,6 +77,7 @@
 #include "catalog/pg_user_mapping.h"
 #include "lib/qunique.h"
 #include "utils/catcache.h"
+#include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/syscache.h"
@@ -1099,6 +1100,32 @@ SysCacheGetAttr(int cacheId, HeapTuple tup,
+		elog(ERROR,
+			 "unexpected NULL value in cached tuple for pg_catalog.%s.%s",
+			 get_rel_name(cacheinfo[cacheId].reloid),

Question: Is it safe to be making catalog access inside an error
handler, when one of the most likely reasons for hitting the error is
catalog corruption?

Maybe the answer is that it's not "safe" but "safe enough" - IOW, if
you're willing to throw an assertion, it's good enough to try to show
the table name, and if the error report crashes the server, that's "not
much worse" than having Asserted().

--
Justin
Re: Request for comment on setting binary format output per session
Dave Cramer

On Sat, 25 Mar 2023 at 19:06, Jeff Davis wrote:
> On Thu, 2023-03-23 at 15:37 -0400, Dave Cramer wrote:
> > So where do we go from here?
> >
> > I can implement using type names as well as OIDs
>
> My current thought is that you should use the protocol extension and
> make type names work (in addition to OIDs) at least for fully-qualified
> type names.  I don't really like the GUC -- perhaps I could be convinced
> it's OK, but until we find a problem with protocol extensions, it looks
> like a protocol extension is the way to go here.

Well, as I said, if I use any external pool that shares connections with
multiple clients, some of which may not know how to decode binary data,
then we have to have a way to set the binary format after the connection
is established.  I did float the idea of using the cancel key as a
unique identifier which, passed with the parameter, would allow setting
the parameter after connection establishment.

> I like Peter's idea for some kind of global identifier, but we can do
> that independently at a later time.
>
> If search path works fine and we're all happy with it, we could also
> support unqualified type names.  It feels slightly off to me to use
> search_path for something like that, though.
>
> There's still the problem about the connection pools.  pgbouncer could
> consider the binary formats to be an important parameter like the
> database name, where the connection pooler would not mingle connections
> with different settings for binary_formats.  That would work, but it
> would be weird because if a new version of a driver adds new binary
> format support, it could cause worse connection pooling performance
> until all the other drivers also support that binary format.  I'm not
> sure if that's a major problem or not.
>
> Another idea would be for the connection pooler to also have a
> binary_formats config, and it would do some checking to make sure all
> connecting clients understand some minimal set of binary formats, so
> that it could still mingle the connections.  Either way, I think this
> is solvable by the connection pooler.

Well, that means that connection poolers all have to be fixed.  There
are more than just pgbouncer.  It seems rather harsh that a new feature
breaks a connection pooler or makes the pooler unusable.

Dave
Re: Request for comment on setting binary format output per session
On Thu, 2023-03-23 at 15:37 -0400, Dave Cramer wrote:
> So where do we go from here?
>
> I can implement using type names as well as OIDs

My current thought is that you should use the protocol extension and
make type names work (in addition to OIDs) at least for fully-qualified
type names.  I don't really like the GUC -- perhaps I could be convinced
it's OK, but until we find a problem with protocol extensions, it looks
like a protocol extension is the way to go here.

I like Peter's idea for some kind of global identifier, but we can do
that independently at a later time.

If search path works fine and we're all happy with it, we could also
support unqualified type names.  It feels slightly off to me to use
search_path for something like that, though.

There's still the problem about the connection pools.  pgbouncer could
consider the binary formats to be an important parameter like the
database name, where the connection pooler would not mingle connections
with different settings for binary_formats.  That would work, but it
would be weird because if a new version of a driver adds new binary
format support, it could cause worse connection pooling performance
until all the other drivers also support that binary format.  I'm not
sure if that's a major problem or not.  Another idea would be for the
connection pooler to also have a binary_formats config, and it would do
some checking to make sure all connecting clients understand some
minimal set of binary formats, so that it could still mingle the
connections.  Either way, I think this is solvable by the connection
pooler.

Regards,
Jeff Davis
Re: Request for comment on setting binary format output per session
On Fri, 2023-03-24 at 07:52 -0500, Merlin Moncure wrote:
> I think so.  Dave's idea puts a lot of flexibility into the client
> side, and that's good.  Search path mechanics are really well
> understood and well integrated with extensions already (create
> extension ..schema) assuming that the precise time UDT are looked up
> in an unqualified way is very clear to- or invoked via- the client
> code.  I'll say it again though; OIDs really ought to be considered a
> transient cache of type information rather than a permanent
> identifier.

I'm not clear on what proposal you are making and/or endorsing?

> Regarding UDT, lots of common and useful scenarios (containers, enum,
> range, etc), do not require special knowledge to parse beyond the
> kind of type it is.  Automatic type creation from tables is one of
> the most genius things about postgres and directly wiring client
> structures to them through binary is really nifty.

Perhaps not special knowledge, but you need to know the structure.  If
you have a query like "SELECT '...'::sometable", you still need to know
the structure of sometable to parse it.

> This undermines the case that binary parsing requires special
> knowledge IMO, UDT might in fact be the main long term target...I
> could see scenarios where java classes might be glued directly to
> postgres tables by the driver...this would be a lot more efficient
> than using json which is how everyone does it today.  Then again,
> maybe *I* might be overthinking this.

Wouldn't that only work if someone is doing a "SELECT *"?

Regards,
Jeff Davis
Re: Refactoring SysCacheGetAttr to know when attr cannot be NULL
> On 14 Mar 2023, at 08:00, Peter Eisentraut wrote:
> I prefer to use "null value" for SQL null values, and NULL for the C
> symbol.

That's a fair point, I agree with that.

> I'm a bit hesitant about hardcoding pg_catalog here.  That happens to
> be true, of course, but isn't actually enforced, I think.  I think that
> could be left off.  It's not like people will be confused about which
> schema "pg_class.relname" is in.
>
> Also, the cached tuple isn't really for the attribute, so maybe split
> that up a bit, like
>
> "unexpected null value in cached tuple for catalog %s column %s"

No objections, so changed to that wording.  With these changes and a
pgindent run across it per David's comment downthread, I've pushed this
now.  Thanks for the review!

I'm keeping a watchful eye on the buildfarm; francolin has errored in
recoveryCheck, which I'm looking into, but at first glance I don't think
it's related (other animals have since passed it and it works locally,
but I'll keep digging at it to make sure).

--
Daniel Gustafsson
Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
On Tue, Mar 21, 2023 at 6:03 AM Ants Aasma wrote:
> On Mon, 20 Mar 2023 at 00:59, Melanie Plageman wrote:
> > On Wed, Mar 15, 2023 at 6:46 AM Ants Aasma wrote:
> > > On Wed, 15 Mar 2023 at 02:29, Melanie Plageman wrote:
> > > > As for routine vacuuming and the other buffer access strategies, I think
> > > > there is an argument for configurability based on operator knowledge --
> > > > perhaps your workload will use the data you are COPYing as soon as the
> > > > COPY finishes, so you might as well disable a buffer access strategy or
> > > > use a larger fraction of shared buffers.  Also, the ring sizes were
> > > > selected sixteen years ago and average server memory and data set sizes
> > > > have changed.
> > >
> > > To be clear I'm not at all arguing against configurability.  I was
> > > thinking that dynamic use could make the configuration simpler by self
> > > tuning to use no more buffers than is useful.
> >
> > Yes, but I am struggling with how we would define "useful".
>
> For copy and vacuum, the only reason I can see for keeping visited
> buffers around is to avoid flushing WAL or at least doing it in larger
> batches.  Once the ring is big enough that WAL doesn't need to be
> flushed on eviction, making it bigger only wastes space that could be
> used by something that is not going to be evicted soon.

Well, I think if you know you will use the data you are COPYing right
away in your normal workload, it could be useful to have the ring be
large or to disable use of the ring.  And, for vacuum, if you need to
get it done as quickly as possible, again, it could be useful to have
the ring be large or disable use of the ring.

> > > > StrategyRejectBuffer() will allow bulkreads to, as you say, use more
> > > > buffers than the original ring size, since it allows them to kick
> > > > dirty buffers out of the ring and claim new shared buffers.
> > > >
> > > > Bulkwrites and vacuums, however, will inevitably dirty buffers and
> > > > require flushing the buffer (and thus flushing the associated WAL) when
> > > > reusing them.  Bulkwrites and vacuum do not kick dirtied buffers out of
> > > > the ring, since dirtying buffers is their common case.  A dynamic
> > > > resizing like the one you suggest would likely devolve to vacuum and
> > > > bulkwrite strategies always using the max size.
> > >
> > > I think it should self stabilize around the point where the WAL is
> > > either flushed by other commit activity, WAL writer or WAL buffers
> > > filling up.  Writing out their own dirtied buffers will still happen,
> > > just the associated WAL flushes will be in larger chunks and possibly
> > > done by other processes.
> >
> > They will have to write out any WAL associated with modifications to
> > the dirty buffer before flushing it, so I'm not sure I understand how
> > this would work.
>
> By the time the dirty buffer needs eviction the WAL associated with it
> can already be written out by concurrent commits, WAL writer or by WAL
> buffers filling up.  The bigger the ring is, the higher the chance that
> one of these will happen before we loop around.

Ah, I think I understand the idea now.

So, I think it is an interesting idea to try and find the goldilocks
size for the ring buffer.  It is especially interesting to me in the
case in which we are enlarging the ring.  However, given concurrent
workload variability, machine I/O latency fluctuations, etc, we will
definitely have to set a max value of some kind anyway for the ring
size.  So, this seems more like a complementary feature to
vacuum_buffer_usage_limit.  If we added some kind of adaptive sizing in
a later version, we could emphasize in the guidance for setting
vacuum_buffer_usage_limit that it is the *maximum* size you would like
to allow vacuum to use.  And, of course, there are the other operations
which use buffer access strategies.

> > > > As for decreasing the ring size, buffers are only "added" to the ring
> > > > lazily and, technically, as it is now, buffers which have been added
> > > > to the ring can always be reclaimed by the clocksweep (as long as
> > > > they are not pinned).  The buffer access strategy is more of a
> > > > self-imposed restriction than it is a reservation.  Since the ring is
> > > > small and the buffers are being frequently reused, odds are the usage
> > > > count will be 1 and we will be the one who set it to 1, but there is no
> > > > guarantee.  If, when attempting to reuse the buffer, its usage count is
> > > > > 1 (or it is pinned), we also will kick it out of the ring and go look
> > > > for a replacement buffer.
> > >
> > > Right, but while the buffer is actively used by the ring it is
> > > unlikely that clocksweep will find it at usage 0 as the ring buffer
> > > should cycle more often than the clocksweep.  Whereas if the ring
> > > stops using a buffer, clocksweep will eventually come and reclaim it.
> > > And if the ring shrinking decision turns out to be wrong before the
> > >
Re: what should install-world do when docs are not available?
Hi,

On 2023-03-25 16:40:03 -0400, Tom Lane wrote:
> Andres Freund writes:
> > Right now if one does install-world with autoconf, without having
> > xmllint or xsltproc available, one gets an error:
> > ERROR: `xmllint' is missing on your system.
>
> > Is that good? Should meson behave the same?
>
> Since install-world is defined to install documentation, it should
> do so or fail trying.  Maybe we could skip the xmllint step, but you'd
> still need xsltproc so I'm not sure that that moves the bar very far.

xmllint is the more commonly installed tool (it's part of libxml, which
libxslt depends on), so that wouldn't help much - and we now depend on
xmllint to build the input to xsltproc anyway...

> > If that's what we decide to do, perhaps "docs" should be split
> > further? The dependencies for pdf generation are a lot more
> > heavyweight.
>
> Yeah.  Personally I think "docs" should just build/install the HTML
> docs, but maybe I'm too narrow-minded.

Sorry, I meant docs as a meson option, not as a build target.  The
'docs' target builds just the html docs (as with autoconf), and
install-doc installs both html and manpages (also as with autoconf).

I am basically wondering if we should make it so that if you say
-Ddocs=enabled and xmllint or xsltproc aren't available, you get an
error at configure time.  And if -Ddocs=auto, the summary at the end of
configure will tell you if the necessary tools to build the docs are
available, but not error out.

The extension to that could be to have a separate -Ddoc_pdf option,
which'd mirror the above.

Greetings,

Andres Freund
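[Editorial note: as a sketch of the behavior being discussed — not code from the actual tree — a meson "feature" option gives exactly the enabled/auto/disabled semantics Andres describes.  The option name and description here are hypothetical:]

```meson
# Hypothetical meson_options.txt entry.  A 'feature' option is tristate:
# 'enabled' makes a required-tool lookup fail at configure time,
# 'auto' merely records the result in the configure summary,
# 'disabled' skips the lookup entirely.
option('docs', type: 'feature', value: 'auto',
       description: 'HTML documentation (requires xmllint and xsltproc)')
```

In meson.build one would then pass the option to the tool lookups, e.g. `find_program('xsltproc', required: get_option('docs'))`, so `meson setup -Ddocs=enabled` errors out at configure time when xsltproc is missing, while `-Ddocs=auto` does not.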
Re: Parallel Full Hash Join
On Sat, Mar 25, 2023 at 09:21:34AM +1300, Thomas Munro wrote:
> * reuse the same umatched_scan_{chunk,idx} variables as above
> * rename the list of chunks to scan to work_queue
> * fix race/memory leak if we see PHJ_BATCH_SCAN when we attach (it
>   wasn't OK to just fall through)

ah, good catch.

> I don't love the way that both ExecHashTableDetachBatch() and
> ExecParallelPrepHashTableForUnmatched() duplicate logic relating to
> the _SCAN/_FREE protocol, but I'm struggling to find a better idea.
> Perhaps I just need more coffee.

I'm not sure if I have strong feelings either way.  To confirm I
understand, though: in ExecHashTableDetachBatch(), the call to
BarrierArriveAndDetachExceptLast() serves only to advance the barrier
phase through _SCAN, right?  It doesn't really matter if this worker is
the last worker since BarrierArriveAndDetach() handles that for us.
There isn't another barrier function to do this (and I mostly think it
is fine), but I did have to think on it for a bit.

Oh, and, unrelated, but it is maybe worth updating the BarrierAttach()
function comment to mention BarrierArriveAndDetachExceptLast().

> I think your idea of opportunistically joining the scan if it's
> already running makes sense to explore for a later step, ie to make
> multi-batch PHFJ fully fair, and I think that should be a fairly easy
> code change, and I put in some comments where changes would be needed.

makes sense.

I have some very minor pieces of feedback, mainly about extraneous
commas that made me uncomfortable ;)

> From 8b526377eb4a4685628624e75743aedf37dd5bfe Mon Sep 17 00:00:00 2001
> From: Thomas Munro
> Date: Fri, 24 Mar 2023 14:19:07 +1300
> Subject: [PATCH v12 1/2] Scan for unmatched hash join tuples in memory order.
>
> In a full/right outer join, we need to scan every tuple in the hash
> table to find the ones that were not matched while probing, so that we

Given how you are using the word "so" here, I think the comma before it
is not needed.
> @@ -2083,58 +2079,45 @@ bool
> ExecScanHashTableForUnmatched(HashJoinState *hjstate, ExprContext *econtext)
> {
> 	HashJoinTable hashtable = hjstate->hj_HashTable;
> -	HashJoinTuple hashTuple = hjstate->hj_CurTuple;
> +	HashMemoryChunk chunk;
>
> -	for (;;)
> +	while ((chunk = hashtable->unmatched_scan_chunk))
> 	{
> -		/*
> -		 * hj_CurTuple is the address of the tuple last returned from the
> -		 * current bucket, or NULL if it's time to start scanning a new
> -		 * bucket.
> -		 */
> -		if (hashTuple != NULL)
> -			hashTuple = hashTuple->next.unshared;
> -		else if (hjstate->hj_CurBucketNo < hashtable->nbuckets)
> -		{
> -			hashTuple = hashtable->buckets.unshared[hjstate->hj_CurBucketNo];
> -			hjstate->hj_CurBucketNo++;
> -		}
> -		else if (hjstate->hj_CurSkewBucketNo < hashtable->nSkewBuckets)
> +		while (hashtable->unmatched_scan_idx < chunk->used)
> 		{
> -			int			j = hashtable->skewBucketNums[hjstate->hj_CurSkewBucketNo];
> +			HashJoinTuple hashTuple = (HashJoinTuple)
> +			(HASH_CHUNK_DATA(hashtable->unmatched_scan_chunk) +
> +			 hashtable->unmatched_scan_idx);
>
> -			hashTuple = hashtable->skewBucket[j]->tuples;
> -			hjstate->hj_CurSkewBucketNo++;
> -		}
> -		else
> -			break;				/* finished all buckets */
> +			MinimalTuple tuple = HJTUPLE_MINTUPLE(hashTuple);
> +			int			hashTupleSize = (HJTUPLE_OVERHEAD + tuple->t_len);
>
> -		while (hashTuple != NULL)
> -		{
> -			if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hashTuple)))
> -			{
> -				TupleTableSlot *inntuple;
> +			/* next tuple in this chunk */
> +			hashtable->unmatched_scan_idx += MAXALIGN(hashTupleSize);
>
> -				/* insert hashtable's tuple into exec slot */
> -				inntuple = ExecStoreMinimalTuple(HJTUPLE_MINTUPLE(hashTuple),
> -												 hjstate->hj_HashTupleSlot,
> -												 false);	/* do not pfree */
> -				econtext->ecxt_innertuple = inntuple;
> +			if (HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hashTuple)))
> +				continue;
>
> -				/*
> -				 * Reset temp memory each time; although this function doesn't
> -
Re: what should install-world do when docs are not available?
Andres Freund writes:
> Right now if one does install-world with autoconf, without having
> xmllint or xsltproc available, one gets an error:
> ERROR: `xmllint' is missing on your system.

> Is that good? Should meson behave the same?

Since install-world is defined to install documentation, it should
do so or fail trying.  Maybe we could skip the xmllint step, but you'd
still need xsltproc so I'm not sure that that moves the bar very far.

> If that's what we decide to do, perhaps "docs" should be split further?
> The dependencies for pdf generation are a lot more heavyweight.

Yeah.  Personally I think "docs" should just build/install the HTML
docs, but maybe I'm too narrow-minded.

> We should probably also generate a useful error when the stylesheets
> aren't available.

Maybe, but we haven't had that in the autoconf case either, and there
have not been too many complaints.  Not sure it's worth putting extra
effort into.

			regards, tom lane
Re: Infinite Interval
On Sat, 25 Mar 2023 at 15:59, Tom Lane wrote:
> Joseph Koshakow writes:
> > In terms of adding/subtracting infinities, the IEEE standard is
> > paywalled and I don't have a copy.  I tried finding information online
> > but I also wasn't able to find anything useful.  I additionally
> > checked the results of C++, C, and Java, and they all match, which
> > increases my confidence that we're doing the right thing.  Does anyone
> > happen to have a copy of the standard who can confirm?
>
> I think you can take it as read that simple C test programs on modern
> platforms will exhibit IEEE-compliant handling of float infinities.

Additionally, the Java language specification claims to follow IEEE 754:

https://docs.oracle.com/javase/specs/jls/se11/html/jls-15.html#jls-15.18.2

So either C and Java agree with each other and with the spec, or they
disagree in the same way even while at least one of them explicitly
claims to be following the spec.  I think you're on pretty firm ground.
what should install-world do when docs are not available?
Hi,

A question by Justin [1] made me wonder what the right behaviour for
world / install-world should be when the docs tools aren't available.
I'm wondering from the angle of meson, but it also seems like something
we possibly should think about for autoconf.

Right now if one does install-world with autoconf, without having
xmllint or xsltproc available, one gets an error:
ERROR: `xmllint' is missing on your system.

Is that good? Should meson behave the same?

I wonder if, for meson, the best behaviour would be to make 'docs' a
feature set to auto.  If docs is set to enabled, and the necessary tools
are not available, fail at that time, instead of doing so while
building.

If that's what we decide to do, perhaps "docs" should be split further?
The dependencies for pdf generation are a lot more heavyweight.

We should probably also generate a useful error when the stylesheets
aren't available.  Right now we just generate a long error:

/usr/bin/xsltproc --nonet --path . --stringparam pg.version '16devel' /home/andres/src/postgresql/doc/src/sgml/stylesheet.xsl postgres-full.xml
I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl
warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl"
compilation error: file /home/andres/src/postgresql/doc/src/sgml/stylesheet.xsl line 6 element import
xsl:import : unable to load http://docbook.sourceforge.net/release/xsl/current/xhtml/chunk.xsl
I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/common/entities.ent
/home/andres/src/postgresql/doc/src/sgml/stylesheet-html-common.xsl:4: warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/common/entities.ent"
%common.entities;
^
/home/andres/src/postgresql/doc/src/sgml/stylesheet-html-common.xsl:116: parser error : Entity 'primary' not defined
translate(substring(, 1, 1), ...

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/20230325180310.o6drykb3uz4u4x4r%40awork3.anarazel.de
Re: windows/meson cfbot warnings
On Sun, Feb 26, 2023 at 09:54:25AM -0800, Andres Freund wrote:
> On 2023-02-25 16:45:38 -0600, Justin Pryzby wrote:
> > Unrelated, but something else changed and now there's this.
> >
> > https://cirrus-ci.com/task/6202242768830464
> >
> > [20:10:34.310] c:\cirrus>call sh -c 'if grep ": warning " build.txt; then exit 1; fi; exit 0'
> > [20:10:34.397] C:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
> > [20:10:34.397] C:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
> > [20:10:34.397] C:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
> > [20:10:34.397] C:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
>
> Hm, odd.
>
> There's a bit more context about the warning in the output:
>
> [21:43:58.782] [1509/2165] Compiling C object src/pl/plpython/plpython3.dll.p/plpy_exec.c.obj
> [21:43:58.782] C:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
> [21:43:58.924] src/pl/plpython/plpython3.dll.p/meson_pch-c.c: note: see previous definition of 'MS_WIN64'

I don't remember right now what I did to get the output, but it's like:

[18:33:22.351] c:\python\Include\pyconfig.h(117): warning C4005: 'MS_WIN64': macro redefinition
[18:33:22.351] c:\python\Include\pyconfig.h(117): note: 'MS_WIN64' previously declared on the command line

I found that this is caused by changes in meson between 1.0.0 and 1.0.1.
Probably one of these.

https://github.com/mesonbuild/meson/commit/aa69cf04484309f82d2da64c433539d2f6f2fa82
https://github.com/mesonbuild/meson/commit/9bf718fcee0d9e30b5de2d6a2f154aa417aa8d4c

--
Justin
Re: Infinite Interval
Joseph Koshakow writes:
> In terms of adding/subtracting infinities, the IEEE standard is
> paywalled and I don't have a copy.  I tried finding information online
> but I also wasn't able to find anything useful.  I additionally checked
> the results of C++, C, and Java, and they all match, which increases my
> confidence that we're doing the right thing.  Does anyone happen to
> have a copy of the standard who can confirm?

I think you can take it as read that simple C test programs on modern
platforms will exhibit IEEE-compliant handling of float infinities.

			regards, tom lane
Re: Progress report of CREATE INDEX for nested partitioned tables
I pushed 0001 with some cosmetic changes (for instance, trying to make
the style of the doc entries for partitions_total/partitions_done match
the rest of their table).

I'm not touching 0002 or 0003, because I think they're fundamentally a
bad idea.  Progress reporting is inherently inexact, because it's so
hard to predict the amount of work to be done in advance -- have you
ever seen a system anywhere whose progress bars reliably advance at a
uniform rate?  I think adding assertions that the estimates are
error-free is just going to cause headaches.

As an example, I added a comment pointing out that the current fix won't
crash and burn if the caller failed to lock all the child tables in
advance: the find_all_inheritors call should be safe anyway, so the
worst consequence would be an imprecise partitions_total estimate.  But
that argument falls down if we're going to add assertions that
partitions_total isn't in error.

			regards, tom lane
Re: Non-superuser subscription owners
On Fri, 2023-03-24 at 09:24 -0400, Robert Haas wrote:
> I certainly agree that the security model isn't in a reasonable place
> right now.  However, I feel that:
>
> (1) adding an extra predefined role
> (2) even adding the connection string security stuff

I don't see how these points are related to the question of whether you
should commit your non-superuser-subscription-owners patch or
logical-repl-as-table-owner patch first.

My perspective is that logical replication is an unfinished feature with
an incomplete design.  As I said earlier, that's why I backed away from
trying to do non-superuser subscriptions as a documented feature: it
feels like we need to settle some of the underlying pieces first.

There are some big issues, like the security model for replaying
changes.  And some smaller issues like feature gaps (RLS doesn't work,
if I remember correctly, and maybe something with partitioning).  There
are potential clashes with other proposals, like CREATE SUBSCRIPTION
... SERVER, which I hope can be sorted out later.  And I don't feel like
I have a good handle on the publisher security model and threats, which
hopefully is just a matter of documenting some best practices.

Each time we dig into one of these issues I learn something, and I think
others do, too.  If we skip past that process and start adding new
features on top of this unfinished design, then I think we are setting
ourselves up for trouble that is going to be harder to fix later.

I don't mean to say all of the above issues are blockers or that they
should all be resolved in my favor.  But there are enough issues, and
some of those issues are serious enough, that I feel it's premature to
just go ahead with non-superuser subscriptions and the predefined role.
There are already users, which complicates things.  And you make a good
point that some important users may already be working around the flaws.

But there's already a patch and discussion going on for some security
model improvements (thanks to you), so let's try to get that one in
first.  If we can't, it's probably because we learned something
important.

Regards,
Jeff Davis
Re: Should vacuum process config file reload more often
On Thu, Mar 23, 2023 at 8:27 PM Melanie Plageman wrote: > On Thu, Mar 23, 2023 at 2:09 AM Masahiko Sawada wrote: > > > And, I was wondering if it was worth trying to split up the part that > > > reloads the config file and all of the autovacuum stuff. The reloading > > > of the config file by itself won't actually result in autovacuum workers > > > having updated cost delays because of them overwriting it with > > > wi_cost_delay, but it will allow VACUUM to have those updated values. > > > > It makes sense to me to have changes for overhauling the rebalance > > mechanism in a separate patch. > > > > Looking back at the original concern you mentioned[1]: > > > > speed up long-running vacuum of a large table by > > decreasing autovacuum_vacuum_cost_delay/vacuum_cost_delay, however the > > config file is only reloaded between tables (for autovacuum) or after > > the statement (for explicit vacuum). > > > > does it make sense to have autovac_balance_cost() update workers' > > wi_cost_delay too? Autovacuum launcher already reloads the config file > > and does the rebalance. So I thought autovac_balance_cost() can update > > the cost_delay as well, and this might be a minimal change to deal > > with your concern. This doesn't have the effect for manual VACUUM but > > since vacuum delay is disabled by default it won't be a big problem. > > As for manual VACUUMs, we would need to reload the config file in > > vacuum_delay_point() as the part of your patch does. Overhauling the > > rebalance mechanism would be another patch to improve it further. > > So, we can't do this without acquiring an access shared lock on every > call to vacuum_delay_point() because cost delay is a double. > > I will work on a patchset with separate commits for reloading the config > file, though (with autovac not benefitting in the first commit). So, I realized we could actually do as you say and have autovac workers update their wi_cost_delay and keep the balance changes in a separate commit. 
I've done this in attached v8. Workers take the exclusive lock to update their wi_cost_delay and wi_cost_limit only when there is a config reload. So, there is one commit that implements this behavior and a separate commit to revise the worker rebalancing. Note that we must have the workers also update wi_cost_limit_base and then call autovac_balance_cost() when they reload the config file (instead of waiting for launcher to call autovac_balance_cost()) to avoid potentially calculating the sleep with a new value of cost delay and an old value of cost limit. In the commit which revises the worker rebalancing, I'm still wondering if wi_dobalance should be an atomic flag -- probably not worth it, right? On Fri, Mar 24, 2023 at 1:27 PM Melanie Plageman wrote: > I realized nworkers_for_balance should be initialized to 0 and not 1 -- > 1 is misleading since there are often 0 autovac workers. We just never > want to use nworkers_for_balance when it is 0. But, workers put a floor > of 1 on the number when they divide limit/nworkers_for_balance (since > they know there must be at least one worker right now since they are a > worker). I thought about whether or not they should call > autovac_balance_cost() if they find that nworkers_for_balance is 0 when > updating their own limit, but I'm not sure. I've gone ahead and updated this. I haven't made the workers call autovac_balance_cost() if they find that nworkers_for_balance is 0 when they try and use it when updating their limit because I'm not sure if this can happen. I would be interested in input here. I'm also still interested in feedback on the variable name av_nworkers_for_balance. > On Fri, Mar 24, 2023 at 1:21 AM Masahiko Sawada wrote: > > > > I need another few readthroughs to figure out of VacuumFailsafeActive > > > > does what > > > > I think it does, and should be doing, but in general I think this is a > > > > good > > > > idea and a patch in good condition close to being committable. 
> > > > Another approach would be to make VacuumCostActive a ternary value: > > on, off, and never. When we trigger the failsafe mode we switch it to > > never, meaning that it never becomes active even after reloading the > > config file. A good point is that we don't need to add a new global > > variable, but I'm not sure it's better than the current approach. > > Hmm, this is interesting. I don't love the word "never" since it kind of > implies a duration longer than the current table being vacuumed. But we > could find a different word or just document it well. For clarity, we > might want to call it failsafe_mode or something. > > I wonder if the primary drawback to converting > LVRelState->failsafe_active to a global VacuumFailsafeActive is just the > general rule of limiting scope to the minimum needed. Okay, so I've changed my mind about this. I like having a ternary for VacuumCostActive and keeping failsafe_active in LVRelState. What I didn't like was having non-vacuum code have to care
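For concreteness, a minimal sketch of the ternary idea debated above — the enum, field names, and reload hook here are illustrative assumptions, not what any posted patch actually contains:

```c
#include <stdbool.h>

/* Hypothetical three-state replacement for the boolean VacuumCostActive. */
typedef enum VacuumCostStatus
{
	VACUUM_COST_INACTIVE_LOCKED,	/* the "never" state: failsafe fired,
									 * stays off for the rest of this vacuum */
	VACUUM_COST_INACTIVE,			/* off, but a config reload may turn it on */
	VACUUM_COST_ACTIVE
} VacuumCostStatus;

static VacuumCostStatus VacuumCost = VACUUM_COST_ACTIVE;

/*
 * Called on config reload: honor the config file unless the failsafe has
 * latched the state to "never".
 */
static void
reload_cost_active(bool enabled_in_config)
{
	if (VacuumCost == VACUUM_COST_INACTIVE_LOCKED)
		return;					/* "never" wins over the config file */
	VacuumCost = enabled_in_config ? VACUUM_COST_ACTIVE : VACUUM_COST_INACTIVE;
}
```

The attraction is that no extra global is needed; the drawback, as noted above, is that "never" really means "for the remainder of the current table", which whatever name is chosen should make clear.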
Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
On Sat, Mar 25, 2023 at 11:17 AM Andres Freund wrote:
> Thinking more about this, I think there's no inherent deadlock danger with
> reading the VM while holding a buffer lock, "just" an efficiency issue. If we
> avoid needing to do IO nearly all the time, by trying to pin the right page
> before extending, it's probably good enough.

Uh, it was quite possible for lazy_vacuum_heap_page() to do that up until very recently (it was fixed by my commit 980ae17310), since it would call visibilitymap_get_status() with an exclusive buffer lock on the heap page, which sometimes had to change the VM page. It potentially did an IO at that point, to read in a later VM page, replacing the caller's initially-pinned one.

In other words, up until recently there was a strange idiom used by lazy_vacuum_heap_page/lazy_vacuum_heap_rel, where we'd abuse visibilitymap_get_status() as a replacement for calling visibilitymap_pin() right before acquiring a heap page buffer lock. But now the second heap pass does it the same way as the first heap pass. (Even still, I have no reason to believe that the previous approach was all that bad; it was just a bit ugly.)

There are still a few visibilitymap_get_status()-with-buffer-lock calls in vacuumlazy.c, FWIW, but they don't run the risk of needing to change the vmbuffer we have pinned while the heap page buffer lock is held.

-- Peter Geoghegan
Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
Hi,

On 2023-03-25 12:57:17 -0400, Tom Lane wrote:
> Andres Freund writes:
> > On 2023-03-25 14:34:25 +0100, Tomas Vondra wrote:
> >> Can't we ensure we actually lock the vm buffer too in ReadBufferBI,
> >> before calling ReadBufferExtended? Or am I confused and that's not
> >> possible for some reason?
>
> > Note that this is using P_NEW. I.e. we don't know the buffer location yet.
>
> Maybe the relation-extension logic needs to include the ability to get
> the relevant vm page?

I don't see how that's easily possible with the current lock ordering rules, at least without giving up using RBM_ZERO_AND_LOCK for extending, or stuffing even more things to happen with the extension lock held, which I don't think we want. I don't think INSERT_FROZEN is worth that price.

Perhaps we should just try to heuristically pin the right VM buffer before trying to extend?

Thinking more about this, I think there's no inherent deadlock danger with reading the VM while holding a buffer lock, "just" an efficiency issue. If we avoid needing to do IO nearly all the time, by trying to pin the right page before extending, it's probably good enough.

Greetings,

Andres Freund
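The "heuristically pin the right VM buffer before extending" idea works because the heap-block-to-VM-block mapping is pure arithmetic, so the candidate VM page can be computed (and pinned) before any heap buffer lock is taken. A toy version of that mapping — the header-size constant here is simplified; PostgreSQL derives the real HEAPBLOCKS_PER_PAGE from the actual page header size:

```c
#include <stdint.h>

/* Two visibility-map bits (all-visible, all-frozen) per heap block. */
#define BLCKSZ				8192
#define PAGE_HEADER_SZ		24		/* simplified stand-in, an assumption */
#define MAPSIZE				(BLCKSZ - PAGE_HEADER_SZ)
#define BITS_PER_HEAPBLOCK	2
#define HEAPBLOCKS_PER_PAGE	(MAPSIZE * 8 / BITS_PER_HEAPBLOCK)

/* Which VM block covers a given heap block - computable without any lock. */
static inline uint32_t
vm_block_for_heap_block(uint32_t heap_blk)
{
	return heap_blk / HEAPBLOCKS_PER_PAGE;
}
```

Since each VM page covers tens of thousands of heap blocks, a guess based on the current relation size will almost always pin the correct VM page, making the IO-under-buffer-lock case rare.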
Re: Bug with ICU for merge join
On Fri, 2023-03-24 at 15:45 -0400, Jeff Janes wrote: > Ever since 27b62377b47f9e7bf58613, I have been getting "ERROR: > mergejoin input data is out of order" for the attached reproducer. Thank you for the report! And the simple repro. Fixed in 81a6d57e33. Regards, Jeff Davis
Re: Progress report of CREATE INDEX for nested partitioned tables
Justin Pryzby writes: > On Sat, Mar 25, 2023 at 11:55:13AM -0400, Tom Lane wrote: >> That's why I wanted list_length() not list_length() - 1. We are >> doing *something* at the top partitioned table, it just doesn't >> involve a table scan, so I don't find this totally unreasonable. >> If you agree we are doing work at intermediate partitioned tables, >> how are we not doing work at the top one? > What you're proposing would redefine the meaning of > PARTITIONS_DONE/TOTAL, even in the absence of intermediate partitioned > tables. Which might be okay, but the scope of this thread/patch was to > fix the behavior involving intermediate partitioned tables. I'm a little skeptical of that argument, because this patch is already redefining the meaning of PARTITIONS_TOTAL. The fact that the existing documentation is vague enough to be read either way doesn't make it not a change. Still, in the interests of getting something done I'll drop the issue. regards, tom lane
Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
Andres Freund writes: > On 2023-03-25 14:34:25 +0100, Tomas Vondra wrote: >> Can't we ensure we actually lock the vm buffer too in ReadBufferBI, >> before calling ReadBufferExtended? Or am I confused and that's not >> possible for some reason? > Note that this is using P_NEW. I.e. we don't know the buffer location yet. Maybe the relation-extension logic needs to include the ability to get the relevant vm page? regards, tom lane
Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
Hi, On 2023-03-25 14:34:25 +0100, Tomas Vondra wrote: > On 3/25/23 03:57, Andres Freund wrote: > > 2) Change relevant code so that we only return a valid vmbuffer if we could > > do > >so without blocking / IO and, obviously, skip updating the VM if we > >couldn't get the buffer. > > > > I don't recall the exact details about the vm locking/pinning, but can't > we just ensure we actually follow the proper locking order? I mean, this > only deals with new pages, requested at line ~624: > > buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate); > > Can't we ensure we actually lock the vm buffer too in ReadBufferBI, > before calling ReadBufferExtended? Or am I confused and that's not > possible for some reason? Note that this is using P_NEW. I.e. we don't know the buffer location yet. Greetings, Andres Freund
Re: meson/msys2 fails with plperl/Strawberry
Hi,

On 2023-03-25 08:46:42 -0400, Andrew Dunstan wrote:
> config/perl.m4 contains this:
>
>    AC_MSG_CHECKING(for flags to link embedded Perl)
>    if test "$PORTNAME" = "win32" ; then
>        perl_lib=`basename $perl_archlibexp/CORE/perl[[5-9]]*.lib .lib`
>        if test -e "$perl_archlibexp/CORE/$perl_lib.lib"; then
>            perl_embed_ldflags="-L$perl_archlibexp/CORE -l$perl_lib"
>        else
>            perl_lib=`basename $perl_archlibexp/CORE/libperl[[5-9]]*.a .a | sed 's/^lib//'`
>            if test -e "$perl_archlibexp/CORE/lib$perl_lib.a"; then
>                perl_embed_ldflags="-L$perl_archlibexp/CORE -l$perl_lib"
>            fi
>        fi
>    else
>        pgac_tmp1=`$PERL -MExtUtils::Embed -e ldopts`
>        pgac_tmp2=`$PERL -MConfig -e 'print "$Config{ccdlflags} $Config{ldflags}"'`
>        perl_embed_ldflags=`echo X"$pgac_tmp1" | sed -e "s/^X//" -e "s%$pgac_tmp2%%"`
>    fi
>    AC_SUBST(perl_embed_ldflags)dnl
>
> I don't see any equivalent in meson.build of the win32 logic, and thus I am
> getting a setup failure on fairywren when trying to move it to meson, while
> it will happily build with autoconf.

I did not try to build with strawberry perl using mingw - it doesn't seem like a very interesting thing, given that mingw has a much more reasonable perl than strawberry - but with the mingw perl it works well.

The above logic actually did *not* work well with mingw for me, because the names are not actually what configure expects, and it seems like a seriously bad idea to encode that much knowledge about library naming and locations.
https://cirrus-ci.com/task/6421536551206912 [16:32:28.997] Has header "perl.h" : YES [16:32:28.997] Message: CCFLAGS recommended by perl: -DWIN32 -DWIN64 -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D__USE_MINGW_ANSI_STDIO -fno-strict-aliasing -mms-bitfields [16:32:28.997] Message: CCFLAGS for embedding perl: -IC:\msys64\ucrt64\lib\perl5\core_perl/CORE -DWIN32 -DWIN64 -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPLPERL_HAVE_UID_GID [16:32:28.997] Message: LDFLAGS recommended by perl: "-s -L"C:\msys64\ucrt64\lib\perl5\core_perl\CORE" -L"C:\msys64\ucrt64\lib" "C:\msys64\ucrt64\lib\perl5\core_perl\CORE\libperl532.a" "C:\msys64\ucrt64\lib\libmoldname.a" "C:\msys64\ucrt64\lib\libkernel32.a" "C:\msys64\ucrt64\lib\libuser32.a" "C:\msys64\ucrt64\lib\libgdi32.a" "C:\msys64\ucrt64\lib\libwinspool.a" "C:\msys64\ucrt64\lib\libcomdlg32.a" "C:\msys64\ucrt64\lib\libadvapi32.a" "C:\msys64\ucrt64\lib\libshell32.a" "C:\msys64\ucrt64\lib\libole32.a" "C:\msys64\ucrt64\lib\liboleaut32.a" "C:\msys64\ucrt64\lib\libnetapi32.a" "C:\msys64\ucrt64\lib\libuuid.a" "C:\msys64\ucrt64\lib\libws2_32.a" "C:\msys64\ucrt64\lib\libmpr.a" "C:\msys64\ucrt64\lib\libwinmm.a" "C:\msys64\ucrt64\lib\libversion.a" "C:\msys64\ucrt64\lib\libodbc32.a" "C:\msys64\ucrt64\lib\libodbccp32.a" "C:\msys64\ucrt64\lib\libcomctl32.a"" [16:32:28.997] Message: LDFLAGS for embedding perl: "C:\msys64\ucrt64\lib\perl5\core_perl\CORE\libperl532.a C:\msys64\ucrt64\lib\libmoldname.a C:\msys64\ucrt64\lib\libkernel32.a C:\msys64\ucrt64\lib\libuser32.a C:\msys64\ucrt64\lib\libgdi32.a C:\msys64\ucrt64\lib\libwinspool.a C:\msys64\ucrt64\lib\libcomdlg32.a C:\msys64\ucrt64\lib\libadvapi32.a C:\msys64\ucrt64\lib\libshell32.a C:\msys64\ucrt64\lib\libole32.a C:\msys64\ucrt64\lib\liboleaut32.a C:\msys64\ucrt64\lib\libnetapi32.a C:\msys64\ucrt64\lib\libuuid.a C:\msys64\ucrt64\lib\libws2_32.a C:\msys64\ucrt64\lib\libmpr.a C:\msys64\ucrt64\lib\libwinmm.a 
C:\msys64\ucrt64\lib\libversion.a C:\msys64\ucrt64\lib\libodbc32.a C:\msys64\ucrt64\lib\libodbccp32.a C:\msys64\ucrt64\lib\libcomctl32.a"
[16:32:28.997] Checking if "libperl" : links: YES

> I would expect the ld flags to be "-LC:/STRAWB~1/perl/lib/CORE -lperl532"

You didn't say what they ended up as?

> (Off topic peeve - one of the things I dislike about meson is that the
> meson.build files are written in YA bespoke language).

I don't really disagree. However, all the general-purpose-language-using build tools I found were awful. And meson's language is a heck of a lot nicer than e.g. cmake's...

Greetings,

Andres Freund
Re: Progress report of CREATE INDEX for nested partitioned tables
On Sat, Mar 25, 2023 at 11:55:13AM -0400, Tom Lane wrote: > Justin Pryzby writes: > > On Thu, Mar 23, 2023 at 04:35:46PM -0400, Tom Lane wrote: > >> Furthermore, as things stand it's not hard > >> for PARTITIONS_TOTAL to be zero --- there's at least one such case > >> in the regression tests --- and that seems just weird to me. > > > I don't know why it'd seem weird. postgres doesn't create partitions > > automatically, so by default there are none. If we create a table but > > never load any data, it'll have no partitions. > > My problem with it is that it's not clear how to tell "no partitioned > index creation in progress" from "partitioned index creation in progress, > but total = 0". Maybe there's some out-of-band way to tell that in the > stats reporting system, but still it's a weird corner case. > > > Also, the TOTAL=0 case > > won't go away just because we start counting intermediate partitions. > > That's why I wanted list_length() not list_length() - 1. We are > doing *something* at the top partitioned table, it just doesn't > involve a table scan, so I don't find this totally unreasonable. > If you agree we are doing work at intermediate partitioned tables, > how are we not doing work at the top one? What you're proposing would redefine the meaning of PARTITIONS_DONE/TOTAL, even in the absence of intermediate partitioned tables. Which might be okay, but the scope of this thread/patch was to fix the behavior involving intermediate partitioned tables. It's somewhat weird to me that find_all_inheritors(rel) returns the rel itself. But it's an internal function, and evidently that's what's needed/desirable to do, so that's fine. However, "PARTITIONS_TOTAL" has a certain user-facing definition, and "Number of partitions" is easier to explain than "Number of partitions plus the rel itself", and IMO an easier definition is a better one. 
Your complaint seems similar to something I've said a few times before: it's weird to expose macroscopic progress reporting of partitioned tables in the same view and in the same *row* as microscopic progress of its partitions. But changing that is a job for another patch. I won't be opposed to it if someone were to propose a patch to remove partitions_{done,total}. See also: https://www.postgresql.org/message-id/flat/YCy5ZMt8xAyoOMmv%40paquier.xyz#b20d1be226a93dacd3fd40b402315105 -- Justin
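The disagreement above reduces to whether the root is counted: find_all_inheritors(root) returns the root plus every descendant, so the two candidate PARTITIONS_TOTAL definitions differ by exactly one. A toy illustration (the tree-walking code is mine, not PostgreSQL's):

```c
#include <stddef.h>

/* A toy inheritance tree: each node may have child partitions. */
typedef struct PartNode
{
	int					nchildren;
	struct PartNode   **children;
} PartNode;

/* Count a node plus all of its descendants, like find_all_inheritors(). */
static int
count_inheritors(const PartNode *node)
{
	int			total = 1;		/* the node itself */

	for (int i = 0; i < node->nchildren; i++)
		total += count_inheritors(node->children[i]);
	return total;
}
```

With a root, one intermediate partitioned table, and two leaves, count_inheritors(root) is 4: list_length() would report PARTITIONS_TOTAL = 4, while list_length() - 1 reports 3.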
Re: Infinite Interval
In terms of adding/subtracting infinities, the IEEE standard is paywalled and I don't have a copy. I tried finding information online but I also wasn't able to find anything useful. I additionally checked the results of C++, C, and Java, and they all match, which increases my confidence that we're doing the right thing. Does anyone happen to have a copy of the standard and can confirm?

- Joe Koshakow
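For what it's worth, the behavior in question is easy to spot-check from C: on any platform whose doubles follow IEEE 754 (effectively all mainstream ones), addition and subtraction involving infinities behave as below. This is a spot-check of an implementation, not a substitute for the standard's text:

```c
#include <math.h>

/* inf + inf is positive infinity. */
static int
inf_plus_inf_is_pos_inf(void)
{
	double		r = INFINITY + INFINITY;

	return isinf(r) && r > 0;
}

/* inf - inf is an invalid operation, producing NaN. */
static int
inf_minus_inf_is_nan(void)
{
	return isnan(INFINITY - INFINITY);
}

/* inf + any finite value is still infinity. */
static int
inf_plus_finite_is_inf(void)
{
	double		r = INFINITY + 42.0;

	return isinf(r) && r > 0;
}
```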
Re: running logical replication as the subscription owner
On Fri, Mar 24, 2023 at 10:00:36AM -0400, Robert Haas wrote: > On Fri, Mar 24, 2023 at 3:59 AM Jeff Davis wrote: > > That's a little confusing, why not just always use the > > SECURITY_RESTRICTED_OPERATION? Is there a use case I'm missing? > > Some concern was expressed -- not sure exactly where the email is > exactly, and it might've been on pgsql-security -- that doing that > categorically might break things that are currently working. The > people who were concerned included Andres and I forget who else. My SECURITY_RESTRICTED_OPERATION blocks deferred triggers and CREATE TEMP TABLE. If you create a DEFERRABLE INITIALLY DEFERRED fk constraint and replicate to the constraint's table in a SECURITY_RESTRICTED_OPERATION, I expect an error. > gut reaction was the same as yours, just do it always and don't worry > about it. But if people think that users are likely to run afoul of Hard to know. It's the sort of thing that I model as creating months of work for epsilon users to adapt their applications. Epsilon could be zero. Most users don't notice. > the SECURITY_RESTRICTED_OPERATION restrictions in practice, then this > is better, and the implementation complexity isn't high. We could even > think of extending this kind of logic to other places where > SECURITY_RESTRICTED_OPERATION is enforced, if desired. Firing a trigger in an index expression or materialized view query is not reasonable, so today's uses of SECURITY_RESTRICTED_OPERATION would not benefit from this approach. Being able to create temp tables in those places has some value, but I would allow temp tables in SECURITY_RESTRICTED_OPERATION instead of proliferating the check on the ability to SET ROLE. (One might allow temp tables by introducing NewTempSchemaNestLevel(), called whenever we call NewGUCNestLevel(). The transaction would then proceed as though it has no temp schema, allocating an additional schema if creating a temp object.)
Re: Progress report of CREATE INDEX for nested partitioned tables
Justin Pryzby writes: > On Thu, Mar 23, 2023 at 04:35:46PM -0400, Tom Lane wrote: >> Furthermore, as things stand it's not hard >> for PARTITIONS_TOTAL to be zero --- there's at least one such case >> in the regression tests --- and that seems just weird to me. > I don't know why it'd seem weird. postgres doesn't create partitions > automatically, so by default there are none. If we create a table but > never load any data, it'll have no partitions. My problem with it is that it's not clear how to tell "no partitioned index creation in progress" from "partitioned index creation in progress, but total = 0". Maybe there's some out-of-band way to tell that in the stats reporting system, but still it's a weird corner case. > Also, the TOTAL=0 case > won't go away just because we start counting intermediate partitions. That's why I wanted list_length() not list_length() - 1. We are doing *something* at the top partitioned table, it just doesn't involve a table scan, so I don't find this totally unreasonable. If you agree we are doing work at intermediate partitioned tables, how are we not doing work at the top one? regards, tom lane
Re: Infinite Interval
On Fri, Mar 24, 2023 at 9:43 AM Ashutosh Bapat wrote:
>
> You don't need to do this, but looks like we can add DAYS_PER_WEEK macro and
> use it here.

I've attached a patch with this new macro. There's probably tons of places it can be used instead of hardcoding the number 7, but I'll save that for a future patch.

- Joe Koshakow

From 41fa5de65c757d72331aff6bb626fab76390e9db Mon Sep 17 00:00:00 2001
From: Joseph Koshakow
Date: Sat, 18 Mar 2023 12:26:28 -0400
Subject: [PATCH 1/2] Move integer helper function to int.h

---
 src/backend/utils/adt/datetime.c | 25 -
 src/include/common/int.h         | 13 +
 2 files changed, 17 insertions(+), 21 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index be2e55bb29..64f28a85b0 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -51,7 +51,6 @@
 static int	DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 					   struct pg_tm *tm);
 static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 						   int precision, bool fillzeros);
-static bool int64_multiply_add(int64 val, int64 multiplier, int64 *sum);
 static bool AdjustFractMicroseconds(double frac, int64 scale,
 									struct pg_itm_in *itm_in);
 static bool AdjustFractDays(double frac, int scale,
@@ -515,22 +514,6 @@ AppendTimestampSeconds(char *cp, struct pg_tm *tm, fsec_t fsec)
 	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
-
-/*
- * Add val * multiplier to *sum.
- * Returns true if successful, false on overflow.
- */
-static bool
-int64_multiply_add(int64 val, int64 multiplier, int64 *sum)
-{
-	int64		product;
-
-	if (pg_mul_s64_overflow(val, multiplier, &product) ||
-		pg_add_s64_overflow(*sum, product, sum))
-		return false;
-	return true;
-}
-
 /*
  * Multiply frac by scale (to produce microseconds) and add to itm_in->tm_usec.
  * Returns true if successful, false if itm_in overflows.
@@ -621,7 +604,7 @@ AdjustMicroseconds(int64 val, double fval, int64 scale,
 				   struct pg_itm_in *itm_in)
 {
 	/* Handle the integer part */
-	if (!int64_multiply_add(val, scale, &itm_in->tm_usec))
+	if (pg_mul_add_s64_overflow(val, scale, &itm_in->tm_usec))
 		return false;
 	/* Handle the float part */
 	return AdjustFractMicroseconds(fval, scale, itm_in);
@@ -2701,9 +2684,9 @@ DecodeTimeForInterval(char *str, int fmask, int range,
 		return dterr;
 	itm_in->tm_usec = itm.tm_usec;
-	if (!int64_multiply_add(itm.tm_hour, USECS_PER_HOUR, &itm_in->tm_usec) ||
-		!int64_multiply_add(itm.tm_min, USECS_PER_MINUTE, &itm_in->tm_usec) ||
-		!int64_multiply_add(itm.tm_sec, USECS_PER_SEC, &itm_in->tm_usec))
+	if (pg_mul_add_s64_overflow(itm.tm_hour, USECS_PER_HOUR, &itm_in->tm_usec) ||
+		pg_mul_add_s64_overflow(itm.tm_min, USECS_PER_MINUTE, &itm_in->tm_usec) ||
+		pg_mul_add_s64_overflow(itm.tm_sec, USECS_PER_SEC, &itm_in->tm_usec))
 		return DTERR_FIELD_OVERFLOW;
 	return 0;

diff --git a/src/include/common/int.h b/src/include/common/int.h
index 450800894e..81726c65f7 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -254,6 +254,19 @@ pg_mul_s64_overflow(int64 a, int64 b, int64 *result)
 #endif
 }
+/*
+ * Add val * multiplier to *sum.
+ * Returns false if successful, true on overflow.
+ */
+static inline bool
+pg_mul_add_s64_overflow(int64 val, int64 multiplier, int64 *sum)
+{
+	int64		product;
+
+	return pg_mul_s64_overflow(val, multiplier, &product) ||
+		pg_add_s64_overflow(*sum, product, sum);
+}
+
 /*
  * Overflow routines for unsigned integers
  *
--
2.34.1

From 242ffd232bef606c9c948f0ee9980152fb9e3bec Mon Sep 17 00:00:00 2001
From: Joseph Koshakow
Date: Sat, 18 Mar 2023 12:38:58 -0400
Subject: [PATCH 2/2] Check for overflow in make_interval

---
 src/backend/utils/adt/timestamp.c      | 24 +++-
 src/include/common/int.h               | 13 +
 src/include/datatype/timestamp.h       |  1 +
 src/test/regress/expected/interval.out |  5 +
 src/test/regress/sql/interval.sql      |  4
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index aaadc68ae6..ccf0019a3c 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1517,13 +1517,27 @@ make_interval(PG_FUNCTION_ARGS)
 				 errmsg("interval out of range")));
 	result = (Interval *) palloc(sizeof(Interval));
-	result->month = years * MONTHS_PER_YEAR + months;
-	result->day = weeks * 7 + days;
+	result->month = months;
+	if (pg_mul_add_s32_overflow(years, MONTHS_PER_YEAR, &result->month))
+		ereport(ERROR,
+				(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
+				 errmsg("interval out of range")));
+	result->day = days;
+	if (pg_mul_add_s32_overflow(weeks, DAYS_PER_WEEK, &result->day))
+		ereport(ERROR,
+
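The helper the patch moves into int.h is a multiply-then-add with overflow propagation. A standalone equivalent using the GCC/Clang builtins (which is what pg_mul_s64_overflow/pg_add_s64_overflow expand to when the compiler provides them) — note the inverted convention the patch adopts, returning true on overflow:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Add val * multiplier to *sum.  Returns true on overflow, false on
 * success (mirroring the patch's pg_mul_add_s64_overflow convention).
 */
static bool
mul_add_overflow64(int64_t val, int64_t multiplier, int64_t *sum)
{
	int64_t		product;

	return __builtin_mul_overflow(val, multiplier, &product) ||
		__builtin_add_overflow(*sum, product, sum);
}
```

Either the multiply or the subsequent add can overflow; short-circuit evaluation reports the first failure without touching *sum on the multiply-overflow path.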
Re: About the constant-TRUE clause in reconsider_outer_join_clauses
Richard Guo writes: > Should we instead mark the constant-TRUE clause with required_relids > plus the OJ relid? I do not think it matters. > Even if the join does become clauseless, it will end up being an > unqualified nestloop. I think the join ordering algorithm will force > this join to be formed when necessary. We would find *some* valid plan, but not necessarily a *good* plan. The point of the dummy clause is to ensure that the join is considered as soon as possible. That might not be the ideal join order of course, but we'll consider it among other join orders and arrive at a cost-based decision. With no dummy clause, the join order heuristics would always delay this join as long as possible; so even if another ordering is better, we'd not find it. regards, tom lane
Re: hio.c does visibilitymap_pin()/IO while holding buffer lock
On 3/25/23 03:57, Andres Freund wrote: > Hi, > > Starting with > > commit 7db0cd2145f2bce84cac92402e205e4d2b045bf2 > Author: Tomas Vondra > Date: 2021-01-17 22:11:39 +0100 > > Set PD_ALL_VISIBLE and visibility map bits in COPY FREEZE > That's a bummer :-( > RelationGetBufferForTuple does > > /* >* The page is empty, pin vmbuffer to set all_frozen bit. >*/ > if (options & HEAP_INSERT_FROZEN) > { > Assert(PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0); > visibilitymap_pin(relation, BufferGetBlockNumber(buffer), > vmbuffer); > } > > while holding a buffer lock. visibilitymap_pin() reads pages, if vmbuffer > doesn't already point to the right block. > > > The lock ordering rules are to lock VM pages *before* locking heap pages. > > > I think the reason this hasn't yet bitten us badly, is that INSERT_FROZEN > effectively requires that the relation is access exclusive locked. There > shouldn't be other backends locking multiple buffers in the relation (bgwriter > / checkpointer can lock a single buffer at a time, but that's it). > Right. Still, it seems a bit fragile ... > > I see roughly two ways forward: > > 1) We add a comment explaining why it's safe to violate lock ordering rules in >this one situation > Possible, although I feel uneasy about just documenting a broken rule. Would be better to maintain the locking order. > 2) Change relevant code so that we only return a valid vmbuffer if we could do >so without blocking / IO and, obviously, skip updating the VM if we >couldn't get the buffer. > I don't recall the exact details about the vm locking/pinning, but can't we just ensure we actually follow the proper locking order? I mean, this only deals with new pages, requested at line ~624: buffer = ReadBufferBI(relation, P_NEW, RBM_ZERO_AND_LOCK, bistate); Can't we ensure we actually lock the vm buffer too in ReadBufferBI, before calling ReadBufferExtended? Or am I confused and that's not possible for some reason? 
regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
meson/msys2 fails with plperl/Strawberry
Hi,

config/perl.m4 contains this:

    AC_MSG_CHECKING(for flags to link embedded Perl)
    if test "$PORTNAME" = "win32" ; then
        perl_lib=`basename $perl_archlibexp/CORE/perl[[5-9]]*.lib .lib`
        if test -e "$perl_archlibexp/CORE/$perl_lib.lib"; then
            perl_embed_ldflags="-L$perl_archlibexp/CORE -l$perl_lib"
        else
            perl_lib=`basename $perl_archlibexp/CORE/libperl[[5-9]]*.a .a | sed 's/^lib//'`
            if test -e "$perl_archlibexp/CORE/lib$perl_lib.a"; then
                perl_embed_ldflags="-L$perl_archlibexp/CORE -l$perl_lib"
            fi
        fi
    else
        pgac_tmp1=`$PERL -MExtUtils::Embed -e ldopts`
        pgac_tmp2=`$PERL -MConfig -e 'print "$Config{ccdlflags} $Config{ldflags}"'`
        perl_embed_ldflags=`echo X"$pgac_tmp1" | sed -e "s/^X//" -e "s%$pgac_tmp2%%"`
    fi
    AC_SUBST(perl_embed_ldflags)dnl

I don't see any equivalent in meson.build of the win32 logic, and thus I am getting a setup failure on fairywren when trying to move it to meson, while it will happily build with autoconf.

I would expect the ld flags to be "-LC:/STRAWB~1/perl/lib/CORE -lperl532"

(Off topic peeve - one of the things I dislike about meson is that the meson.build files are written in YA bespoke language).

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Re: [PATCH] Add XMLText function (SQL/XML X038)
On 25.03.23 12:53, Pavel Stehule wrote:

> On Sat, 25 Mar 2023 at 12:49, Jim Jones wrote:
>> Hi,
>>
>> This small patch proposes the implementation of the standard SQL/XML
>> function XMLText (X038). It basically converts a text parameter into an
>> xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to
>> escape possible predefined entities.
>>
>> This patch also contains documentation and regression tests.
>>
>> Any thoughts?
>
> +1
>
> Pavel

Thanks! I just realized that I forgot to add a few examples to my last message :D

postgres=# SELECT xmltext('foo ´/[({bar?})]\`');
      xmltext
 foo ´/[({bar?})]\`
(1 row)

postgres=# SELECT xmltext('foo & ');
 xmltext
---
 foo bar
(1 row)
Re: [PATCH] Add XMLText function (SQL/XML X038)
On Sat, 25 Mar 2023 at 12:49, Jim Jones wrote:

> Hi,
>
> This small patch proposes the implementation of the standard SQL/XML
> function XMLText (X038). It basically converts a text parameter into an
> xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to
> escape possible predefined entities.
>
> This patch also contains documentation and regression tests.
>
> Any thoughts?

+1

Pavel

> Best,
> Jim
>
> 1 -
> https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-entities.html#xmlEncodeSpecialChars
[PATCH] Add XMLText function (SQL/XML X038)
Hi,

This small patch proposes the implementation of the standard SQL/XML function XMLText (X038). It basically converts a text parameter into an xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to escape possible predefined entities.

This patch also contains documentation and regression tests.

Any thoughts?

Best,
Jim

1 - https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-entities.html#xmlEncodeSpecialChars

From 84d6e026724d7e3869f28b09c686e2fc4da67873 Mon Sep 17 00:00:00 2001
From: Jim Jones
Date: Sat, 25 Mar 2023 12:24:49 +0100
Subject: [PATCH v1] Add XMLText function (SQL/XML X038)

This function implements the standard XMLText function, which converts text
into xml text nodes. It uses the libxml2 function xmlEncodeSpecialChars to
escape predefined entities (&"<>), so that those do not cause any conflict
when concatenating the text node output with existing xml documents.

This patch includes also documentation and regression tests.
---
 doc/src/sgml/func.sgml               | 30 +++
 src/backend/catalog/sql_features.txt |  2 +-
 src/backend/utils/adt/xml.c          | 30 +++
 src/include/catalog/pg_proc.dat      |  3 +++
 src/test/regress/expected/xml.out    | 36
 src/test/regress/expected/xml_1.out  | 23 ++
 src/test/regress/expected/xml_2.out  | 36
 src/test/regress/sql/xml.sql         |  7 ++
 8 files changed, 166 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index a3a13b895f..45195cf05b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14001,6 +14001,36 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
     documents for processing in client applications.

+    xmltext
+
+    xmltext
+
+    xmltext ( text ) xml
+
+    The function xmltext returns an XML value with a single
+    text node containing the input argument as its content. Predefined entities
+    like ampersand (&), left and right angle brackets
+    (< and >), and quotation marks (")
+    are escaped.
+
+
+
+    Example:
+
+
+
+
    xmlcomment

diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index bb4c135a7f..82e6ab8258 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -595,7 +595,7 @@ X034	XMLAgg	YES
 X035	XMLAgg: ORDER BY option	YES
 X036	XMLComment	YES
 X037	XMLPI	YES
-X038	XMLText	NO
+X038	XMLText	YES
 X040	Basic table mapping	YES
 X041	Basic table mapping: null absent	YES
 X042	Basic table mapping: null as nil	YES

diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 15adbd6a01..6621d69219 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -47,6 +47,7 @@
 #ifdef USE_LIBXML
 #include
+#include
 #include
 #include
 #include
@@ -505,6 +506,10 @@ xmlcomment(PG_FUNCTION_ARGS)
 	appendStringInfoText(&buf, arg);
 	appendStringInfoString(&buf, "-->");
+
+
+
+
 	PG_RETURN_XML_P(stringinfo_to_xmltype(&buf));
 #else
 	NO_XML_SUPPORT();
@@ -5006,3 +5011,28 @@ XmlTableDestroyOpaque(TableFuncScanState *state)
 	NO_XML_SUPPORT();
 #endif							/* not USE_LIBXML */
 }
+
+Datum
+xmltext(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+
+	text	   *arg = PG_GETARG_TEXT_PP(0);
+	text	   *result;
+	xmlChar    *xmlbuf = NULL;
+
+	xmlbuf = xmlEncodeSpecialChars(NULL, xml_text2xmlChar(arg));
+
+	Assert(xmlbuf);
+
+	result = cstring_to_text_with_len((const char *) xmlbuf, xmlStrlen(xmlbuf));
+
+	xmlFree(xmlbuf);
+
+	PG_RETURN_XML_P(result);
+
+#else
+	NO_XML_SUPPORT();
+	return 0;
+#endif
+}
\ No newline at end of file

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 5cf87aeb2c..293ae66adc 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -8772,6 +8772,9 @@
 { oid => '2922', descr => 'serialize an XML value to a character string',
   proname => 'text', prorettype => 'text', proargtypes => 'xml',
   prosrc => 'xmltotext' },
+{ oid => '3813', descr => 'generate XML text node',
+  proname => 'xmltext', proisstrict => 't', prorettype => 'xml',
+
+  proargtypes => 'text', prosrc => 'xmltext' },
 { oid => '2923', descr => 'map table contents to XML',
   proname => 'table_to_xml', procost => '100', provolatile => 's',

diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 398345ca67..93cbe4edf6 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1785,3 +1785,39 @@ SELECT * FROM XMLTABLE('.' PASSING XMLELEMENT(NAME a) columns a varchar(20) PATH
  | foo/
 (1 row)
 
+SELECT xmltext(NULL);
+ xmltext
+-
+
+(1 row)
+
+SELECT xmltext('');
+ xmltext
+-
+
+(1 row)
+
+SELECT xmltext(' ');
+ xmltext
+-
+
+(1 row)
+
+SELECT xmltext('foo
Re: Disable vacuuming to provide data history
On Sat, 25 Feb 2023 at 03:11 +0100, Vik Fearing wrote:
> On 2/24/23 22:06, Corey Huinker wrote:
> > The main design blocker for me is how to handle dump/restore.
>
> The standard does not bother thinking about that.

That would be a little difficult. Most probably you would need to operate on the history view to dump/restore.

Best regards,
Marek Mosiewicz
Re: Data is copied twice when specifying both child and parent table in publication
On Fri, Mar 24, 2023 at 2:36 PM wangw.f...@fujitsu.com wrote:
>
> On Fri, Mar 24, 2023 at 14:17 PM Amit Kapila wrote:
>
> And I found there is a problem in the three back-branch patches
> (HEAD_v21_0002*, REL15_* and REL14_*):
> In the function fetch_table_list, we use pg_partition_ancestors to get the
> list of tables from the publisher. But pg_partition_ancestors was introduced
> in v12, which means that if the publisher is v11 and the subscriber is v14+,
> this will cause an error.
>

Yeah, I am also not sure how to fix this for the back-branches. I didn't see any field report of this problem, so I am hesitant to make any complicated changes in the back-branches that would deviate them from HEAD. Let's try to fix it for HEAD at this stage.

I have slightly modified the attached patch. The changes are: (a) I have removed the retail pfrees added in pg_get_publication_tables(), as that memory will anyway be freed when we call SRF_RETURN_DONE(). It is also inconsistent to sometimes do a retail pfree and other times not in the same function. I have also referred to a few similar functions and didn't find them doing retail pfrees. (b) Changed the comments in a few places.

The patch looks good to me, so I am planning to push it sometime early next week unless there are more suggestions or comments.

--
With Regards,
Amit Kapila.

v23-0001-Avoid-syncing-data-twice-for-the-publish_via_par.patch
Description: Binary data
Re: Commitfest 2023-03 starting tomorrow!
On Fri, Mar 24, 2023 at 5:42 AM Greg Stark wrote:
>
> * asynchronous execution support for Custom Scan
>   - This is a pretty small isolated feature.

I was planning to review this patch, but unfortunately I did not have time for that, and I do not think I will for v16. So if anyone wants to work on this, please do so; if not, I want to in the next development cycle, for v17.

Best regards,
Etsuro Fujita
Re: running logical replication as the subscription owner
> As things stand today, if you think somebody's going to send you
> changes to tables other than the ones you intend to replicate, you
> could handle that by making sure that the user that owns the
> subscription only has permission to write to the tables that are
> expected to receive replicated data. It's a bit awkward to set up
> because you have to initially make the subscription owner a superuser
> and then later remove the superuser bit, so I think this is another
> argument for the pg_create_subscription patch posted elsewhere, but if
> you're a superuser yourself, you can do it. However, with this patch,
> that wouldn't work any more, because the permissions checks don't
> happen until after we've switched to the target role.

For my purposes I always trust the publisher; what I don't trust is the table owners. But I can indeed imagine scenarios where that's the other way around, and indeed you can protect against that currently, but not with your new patch. That seems fairly easily solvable though.

+	if (!member_can_set_role(context->save_userid, userid))
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				 errmsg("role \"%s\" cannot SET ROLE to \"%s\"",
+						GetUserNameFromId(context->save_userid, false),
+						GetUserNameFromId(userid, false))));

If we don't throw an error here, but instead simply return, then the current behaviour is preserved and people can manually configure permissions to protect against an untrusted publisher. This would still mean that the table owners can escalate privileges to the subscription owner, but if that subscription owner actually has fewer privileges than the table owner, then you don't have that issue.
About the constant-TRUE clause in reconsider_outer_join_clauses
I happened to notice a constant-TRUE clause whose is_pushed_down flag is true while its required_relids do not include the OJ being formed, which seems abnormal to me. It turns out that this clause comes from reconsider_outer_join_clauses(), as a dummy replacement when we've generated a derived clause. The comment explains this as:

 * If we do generate a derived clause,
 * however, the outer-join clause is redundant.  We must still put some
 * clause into the regular processing, because otherwise the join will be
 * seen as a clauseless join and avoided during join order searching.
 * We handle this by generating a constant-TRUE clause that is marked with
 * required_relids that make it a join between the correct relations.

Should we instead mark the constant-TRUE clause with required_relids plus the OJ relid?

Besides, I think 'otherwise the join will be seen as a clauseless join' is not necessarily true, because the join may have other join clauses that do not have any match. As an example, consider

    select * from a left join b on a.i = b.i and a.j = b.j where a.i = 2;

So should we use 'may' rather than 'will' here?

Even if the join does become clauseless, it will end up being an unqualified nestloop. I think the join ordering algorithm will force this join to be formed when necessary. So I begin to wonder if it's really necessary to generate this dummy constant-TRUE clause.

Thanks
Richard