Re: ICU for global collation

2022-06-25 Thread Julien Rouhaud
Hi,

On Sat, Jun 25, 2022 at 10:19:30AM -0500, Justin Pryzby wrote:
> commit f2553d43060edb210b36c63187d52a632448e1d2 says >=1500 in a few places,
> but in pg_upgrade says <=1500, which looks wrong for upgrades from v15.
> I think it should say <= 1400.
> 
> On Wed, Feb 02, 2022 at 02:01:23PM +0100, Peter Eisentraut wrote:
> > diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
> > index 69ef23119f..2a9ca0e389 100644
> > --- a/src/bin/pg_upgrade/info.c
> > +++ b/src/bin/pg_upgrade/info.c
> > @@ -312,11 +312,20 @@ get_db_infos(ClusterInfo *cluster)
> > i_spclocation;
> > char        query[QUERY_ALLOC];
> >  
> > snprintf(query, sizeof(query),
> > -         "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
> > +         "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
> > +   if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)
> > +   snprintf(query + strlen(query), sizeof(query) - strlen(query),
> > +            "'c' AS datcollprovider, NULL AS daticucoll, ");
> > +   else
> > +   snprintf(query + strlen(query), sizeof(query) - strlen(query),
> > +            "datcollprovider, daticucoll, ");
> > +   snprintf(query + strlen(query), sizeof(query) - strlen(query),
> >           "pg_catalog.pg_tablespace_location(t.oid) AS spclocation "
> >           "FROM pg_catalog.pg_database d "
> >           " LEFT OUTER JOIN pg_catalog.pg_tablespace t "

Indeed!




Re: NAMEDATALEN increase because of non-latin languages

2022-06-25 Thread Andres Freund
Hi,

On 2022-06-26 10:48:24 +0800, Julien Rouhaud wrote:
> Anyway, per the nearby discussions I don't see much interest, especially not 
> in
> the context of varlena identifiers (I should have started a different thread,
> sorry about that), so I don't think it's worth investing more efforts into
> it.

FWIW, I still think it's a worthwhile feature - just, to me, unrelated to
varlena identifiers. I've seen tremendous amounts of space wasted in tables
just because of alignment issues...

- Andres




Re: NAMEDATALEN increase because of non-latin languages

2022-06-25 Thread Julien Rouhaud
On Thu, Jun 23, 2022 at 10:19:44AM -0400, Robert Haas wrote:
> On Thu, Jun 23, 2022 at 6:13 AM Julien Rouhaud  wrote:
>
> > And should record_in / record_out use the logical position, as in:
> > SELECT ab::text FROM ab / SELECT (a, b)::ab;
> >
> > I would think not, as relying on a possibly dynamic order could break 
> > things if
> > you store the results somewhere, but YMMV.
>
> I think here the answer is yes again. I mean, consider that you could
> also ALTER TABLE DROP COLUMN and then ALTER TABLE ADD COLUMN with the
> same name. That is surely going to affect the meaning of such things.
> I don't think we want to have one meaning if you reorder things that
> way and a different meaning if you reorder things using whatever
> commands we create for changing the display column positions.

It indeed would, but ALTER TABLE DROP COLUMN is a destructive operation, and
I'm assuming that anyone doing that is aware that it will have an impact on
stored data and such.  I initially thought that changing the display order of
columns shouldn't affect the stability of an otherwise unchanged record
definition, as doing so would make such reordering much more invasive.
But I agree that having different behaviors seems worse.

> > Then, what about joinrels expansion?  I learned that the column ordering 
> > rules
> > are far from being obvious, and I didn't find those in the documentation 
> > (note
> > that I don't know if that something actually described in the SQL standard).
> > So for instance, if a join is using an explicit USING clause rather than an 
> > ON
> > clause, the merged columns are expanded first, so:
> >
> > SELECT * FROM ab ab1 JOIN ab ab2 USING (b)
> >
> > should unexpectedly expand to (b, a, a).  Is this order a strict 
> > requirement?
>
> I dunno, but I can't see why it creates a problem for this patch to
> maintain the current behavior. I mean, just use the logical column
> position instead of the physical one here and forget about the details
> of how it works beyond that.

I'm not that familiar with this part of the code so I may have missed
something, but I didn't see any place where I could just simply do that.

To be clear, the approach I used is to change the expansion ordering but
otherwise keep the current behavior, to try to minimize the changes.  This is
done by keeping the attribute in the physical ordering pretty much everywhere,
including in the nsitems, and just logically reorder them during the expansion.
In other words all the code still knows that the 1st column is the first
physical column and so on.

So in that query, the ordering is supposed to happen when handling the "SELECT
*", which makes it impossible to retain that order.

I'm assuming that what you meant is to change the ordering when processing the
JOIN and retain the old "SELECT *" behavior, which is to emit items in the
order they're found.  But IIUC the only way to do that would be to change the
order when building the nsitems themselves, and have the code believe that the
attributes are physically stored in the logical order.  That's probably doable,
but it would be a far more invasive change.  Or did you mean to keep the
approach I used, and just special-case "SELECT *" when it refers to a joinrel,
handling the logical expansion in the join instead?  AFAICS that would require
adding some extra info to the parsing structures, as the code doesn't really
store any positions; it just relies on array offsets / list positions and maps
things that way.

> > Another problem (that probably wouldn't be a problem for system catalogs) is
> > that defaults are evaluated in the physical position.  This example from the
> > regression test will clearly have a different behavior if the columns are 
> > in a
> > different physical order:
> >
> >  CREATE TABLE INSERT_TBL (
> > x INT DEFAULT nextval('insert_seq'),
> > y TEXT DEFAULT '-NULL-',
> > z INT DEFAULT -1 * currval('insert_seq'),
> > CONSTRAINT INSERT_TBL_CON CHECK (x >= 3 AND y <> 'check failed' AND 
> > x < 8),
> > CHECK (x + z = 0));
> >
> > But changing the behavior to rely on the logical position seems quite
> > dangerous.
>
> Why?

It feels to me like a POLA violation, and people probably wouldn't expect it to
behave this way (even if this is clearly a corner-case problem).  Even if you
argue that this is not simply a default display order but something more like
the real column order, with the physical position being an implementation
detail, it still doesn't really feel right.

The main reason for having the possibility to change the logical position is to
have better-looking, easier-to-work-with relations even when you have some
requirements on the real physical order, like optimizing things as much as
possible (reordering columns to avoid padding space, putting non-nullable
columns first...).  The order in which defaults are evaluated looks like the
same kind of requirement.  How useful would 

Re: Backends stunk in wait event IPC/MessageQueueInternal

2022-06-25 Thread Thomas Munro
On Tue, May 17, 2022 at 3:31 PM Thomas Munro  wrote:
> On Mon, May 16, 2022 at 3:45 PM Japin Li  wrote:
> > Maybe use the __illumos__ macro more accurately.
> >
> > +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> > +   !defined(__sun__)
>
> Thanks, updated, and with a new commit message.

Pushed to master and REL_14_STABLE.

I'll email the illumos build farm animal owners to say that they
should be able to remove -DWAIT_USE_POLL.

Theoretically, it might be useful that we've separated the
WAIT_USE_SELF_PIPE code from WAIT_USE_POLL if someone eventually wants
to complete the set of possible WaitEventSet implementations by adding
/dev/poll (Solaris, HPUX) and pollset (AIX) support.  I don't think
those have a nicer way to receive race-free signal wakeups.
Realistically no one's likely to show up with a patch for those old
proprietary Unixen at this point on the timeline, I just think it's
interesting that every OS had something better than poll(), we just
need that fallback for lack of patches, not lack of kernel features.
Ironically the typical monster AIX systems I've run into in the wild
are probably much more capable of suffering from poll() contention
than all these puny x86 systems, with oodles of CPUs and NUMA nodes.
If someone *is* still interested in scalability on AIX, I'd recommend
looking at pollset for latch.c, and also the stalled huge pages
thing[1].

[1] 
https://www.postgresql.org/message-id/CA%2BhUKGJE4dq%2BNZHrm%3DpNSNCYwDCH%2BT6HtaWm5Lm8vZzygknPpA%40mail.gmail.com




Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-25 Thread Hannu Krosing
Having a superuser.conf file could also be used to solve another issue:
currently, if you remove the SUPERUSER attribute from all users, your
only option to get it back is to run in single-user mode. With a
conf file one could add an "always" option there which makes any
matching user a superuser, to fix whatever needs fixing as superuser.

Cheers
Hannu

On Sat, Jun 25, 2022 at 10:54 PM Hannu Krosing  wrote:
>
> What are your ideas of applying a change similar to above to actually
> being a superuser ?
>
> That is adding a check for "superuser being currently available" to
> function superuser() in
> ./src/backend/utils/misc/superuser.c ?
>
> It could be as simple as a flag that can be set only at startup for
> maximum speed - though the places needing superuser check are never in
> speed-critical path as far as I have seen.
>
> Or it could be more complex and dynamic, like having a file similar to
> pg_hba.conf that defines the ability to run code as superuser based on
> a set of attributes, each of which could be * meaning "any"
>
> These could be
>   * database (to limit superuser to only certain databases)
>   * session user (to allow only some users to become superuser,
> including via SECURITY DEFINER functions)
>   * backend pid (this would allow a kind of 2-factor authentication -
> connect, then use that session's pid to add a row to the
> pg_ok_to_be_sup.conf file, then continue with your superuser-y stuff)
>   * valid-until - this lets one allow superuser for a limited period
> without fear of forgetting to disable it
>
> This approach would have the benefit of being very low-code while
> delivering the extra protection of needing pre-existing access to the
> filesystem to enable/disable it.
>
> For easiest usability the pg_ok_to_be_sup.conf file should be outside
> pg_reload_conf() - either just read each time the superuser() check is
> run, or watched via inotify and reloaded each time it changes.
>
> Cheers,
> Hannu
>
> P.S: - thanks Magnus for the "please don't top-post" notice - I also
> need to remember to check if all the quoted mail history is left in
> when I just write a reply without touching any of it. I hope it does
> the right thing and leaves it out, but it just might under some
> conditions bluntly append it anyway just in case :)




Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-25 Thread Hannu Krosing
What are your ideas of applying a change similar to above to actually
being a superuser ?

That is adding a check for "superuser being currently available" to
function superuser() in
./src/backend/utils/misc/superuser.c ?

It could be as simple as a flag that can be set only at startup for
maximum speed - though the places needing superuser check are never in
speed-critical path as far as I have seen.

Or it could be more complex and dynamic, like having a file similar to
pg_hba.conf that defines the ability to run code as superuser based on
a set of attributes, each of which could be * meaning "any"

These could be
  * database (to limit superuser to only certain databases)
  * session user (to allow only some users to become superuser,
including via SECURITY DEFINER functions)
  * backend pid (this would allow a kind of 2-factor authentication -
connect, then use that session's pid to add a row to the
pg_ok_to_be_sup.conf file, then continue with your superuser-y stuff)
  * valid-until - this lets one allow superuser for a limited period
without fear of forgetting to disable it

This approach would have the benefit of being very low-code while
delivering the extra protection of needing pre-existing access to the
filesystem to enable/disable it.

For easiest usability the pg_ok_to_be_sup.conf file should be outside
pg_reload_conf() - either just read each time the superuser() check is
run, or watched via inotify and reloaded each time it changes.

Cheers,
Hannu

P.S: - thanks Magnus for the "please don't top-post" notice - I also
need to remember to check if all the quoted mail history is left in
when I just write a reply without touching any of it. I hope it does
the right thing and leaves it out, but it just might under some
conditions bluntly append it anyway just in case :)




Re: Amcheck verification of GiST and GIN

2022-06-25 Thread Andrey Borodin



> On 23 Jun 2022, at 00:27, Nikolay Samokhvalov  wrote:
> 
> Since you're talking to yourself, just wanted to support you – this is an 
> important thing, definitely should be very useful for many projects; I hope 
> to find time to test it in the next few days. 

Thanks Nikolay!


> On 23 Jun 2022, at 04:29, Andres Freund  wrote:
Thanks for looking into the patch, Andres!

> On 2022-06-22 20:40:56 +0300, Andrey Borodin wrote:
>>> diff --git a/contrib/amcheck/amcheck.c b/contrib/amcheck/amcheck.c
>> new file mode 100644
>> index 00..7a222719dd
>> --- /dev/null
>> +++ b/contrib/amcheck/amcheck.c
>> @@ -0,0 +1,187 @@
>> +/*-
>> + *
>> + * amcheck.c
>> + *  Utility functions common to all access methods.
> 
> This'd likely be easier to read if the reorganization were split into its own
> commit.
> 
> I'd also split gin / gist support. It's a large enough patch that that imo
> makes reviewing easier.
I will split the patch into 3 steps:
1. extract generic functions to amcheck.c
2. add GiST functions
3. add GIN functions
But each of these steps just adds a few independent files + some lines to the
Makefile.

I'll fix other notes too in the next version.

Thanks!

Best regards, Andrey Borodin.



Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-25 Thread Andrey Borodin



> On 25 Jun 2022, at 03:08, Hannu Krosing  wrote:
> 
> Currently the file system access is controlled via being a SUPERUSER

My 2 cents. The ongoing work on making superuser access unneeded seems much
more relevant to me.
IMO superuser == full OS access available from the postgres process. I think
there's an uncountable set of ways to affect the OS from superuser.
E.g. you can create a TOAST value compressed by pglz that allows you to look a
few kilobytes before the detoasted datum. Or make archive_command = 'gcc my
shell code'.
It's not even funny to invent things that you can hack as a superuser.

Best regards, Andrey Borodin.



Re: Postgres perl module namespace

2022-06-25 Thread Noah Misch
On Thu, Jun 23, 2022 at 10:45:40PM -0700, Noah Misch wrote:
> On Wed, Jun 22, 2022 at 11:03:22AM -0400, Andrew Dunstan wrote:
> > On 2022-06-22 We 03:21, Noah Misch wrote:
> > > On Tue, Apr 19, 2022 at 07:24:58PM -0400, Andrew Dunstan wrote:
> > >> On 2022-04-19 Tu 18:39, Michael Paquier wrote:
> > >>> +*generate_ascii_string = *TestLib::generate_ascii_string;
> > >>> +*slurp_dir = *TestLib::slurp_dir;
> > >>> +*slurp_file = *TestLib::slurp_file;
> > >>>
> > >>> I am not sure if it is possible and my perl-fu is limited in this
> > >>> area, but could a failure be enforced when loading this path if a new
> > >>> routine added in TestLib.pm is forgotten in this list?
> > >> Not very easily that I'm aware of, but maybe some superior perl wizard
> > >> will know better.
> > > One can alias the symbol table, like 
> > > https://metacpan.org/pod/Package::Alias
> > > does.  I'm attaching what I plan to use.  Today, check-world fails after
> > >
> > >   sed -i 's/TestLib/PostgreSQL::Test::Utils/g; 
> > > s/PostgresNode/PostgreSQL::Test::Cluster/g' **/*.pl
> > >
> > > on REL_14_STABLE, because today's alias list is incomplete.  With this 
> > > change,
> > > the same check-world passes.
> 
> The patch wasn't sufficient to make that experiment pass for REL_10_STABLE,
> where 017_shm.pl uses the %params argument of get_new_node().  The problem
> > call stack had PostgreSQL::Test::Cluster->get_new_node calling
> PostgreSQL::Test::Cluster->new, which needs v14- semantics.  Here's a fixed
> version, just changing the new() hack.

I pushed this, but it broke lapwing and wrasse.  I will investigate.




pg_upgrade allows itself to be run twice

2022-06-25 Thread Justin Pryzby
Now that I've gotten your attention..

I expect pg_upgrade to fail if I run it twice in a row.

It would be reasonable if it were to fail during the "--check" phase,
maybe by failing like this:
| New cluster database "..." is not empty: found relation "..."

But that fails to happen if the cluster has neither tables nor matviews, in
which case, it passes the check phase and then fails like this:

Performing Upgrade
--
Analyzing all rows in the new cluster   ok
Freezing all rows in the new clusterok
Deleting files from new pg_xact ok
Copying old pg_clog to new server   ok
Setting oldest XID for new cluster  ok
Setting next transaction ID and epoch for new cluster   ok
Deleting files from new pg_multixact/offsetsok
Copying old pg_multixact/offsets to new server  ok
Deleting files from new pg_multixact/membersok
Copying old pg_multixact/members to new server  ok
Setting next multixact ID and offset for new clusterok
Resetting WAL archives  ok
Setting frozenxid and minmxid counters in new cluster   connection to server on socket "/home/pryzbyj/src/postgres/.s.PGSQL.50432" failed: FATAL:  could not open relation with OID 2610
Failure, exiting

I'll concede that a cluster which has no tables sounds a lot like a toy, but I
find it disturbing that nothing prevents running the process twice, up to the
point that it's evidently corrupted the catalog.

While looking at this, I noticed that starting postgres --single immediately
after initdb allows creating objects with OIDs below FirstNormalObjectId
(thereby defeating pg_upgrade's check).  I'm not familiar with the behavioral
differences of single user mode, and couldn't find anything in the
documentation.




Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-25 Thread Magnus Hagander
On Sat, Jun 25, 2022 at 1:13 AM Andres Freund  wrote:

>
> On 2022-06-25 00:08:13 +0200, Hannu Krosing wrote:
> > Currently the file system access is controlled via being a SUPREUSER
> > or having the pg_read_server_files, pg_write_server_files and
> > pg_execute_server_program roles. The problem with this approach is
> > that it will not stop an attacker who has managed to become the
> > PostgreSQL  SUPERUSER from  breaking out of the server to reading and
> > writing files and running programs in the surrounding container, VM or
> > OS.
>
> If a user has superuser rights, they automatically can execute arbitrary
> code. It's that simple. Removing roles isn't going to change that. Our code
> doesn't protect against C functions mismatching their SQL level
> definitions. With that you can do a lot of things.
>

Yeah, trying to close this hole is a *very* large architectural change.

I think a much better use of time is to further reduce the *need* to use
superuser to the point that the vast majority of installations can run
without having it. For example the addition of the pg_monitor role has made
a huge difference to the number of things needing superuser access. As does
the "grantable GUCs" feature in 15, for example. Enumerating what remaining things
can be done safely without such access and working on turning that into
grantable permissions or roles  will be a much safer way to work on the
underlying problem (which definitely is a real one), and as a bonus that
gives a more granular control over things even *beyond* just the file
system access.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-25 Thread Magnus Hagander
(please don't top-post. Surely you've been around this community long
enough to know that)


On Sat, Jun 25, 2022 at 1:59 AM Hannu Krosing  wrote:

> My understanding was that unless activated by admin these changes
> would change nothing.
>

That is assuming you can do this with changing just a couple of lines of
code. Which you will not be able to do. The risk of back patching something
like that even if off by default is *way* too large.


> And they would be (borderline :) ) security fixes
>

No, they would not. Not anymore than adding a new authentication method for
example could be considered a security fix.



> And the versioning policy link actually does not say anything about
> not adding features to older versions (I know this is the policy, just
> pointing out the info is not on that page).
>

Yes it does:

The PostgreSQL Global Development Group releases a new major version
containing new features about once a year. Each major version receives bug
fixes and, if need be, security fixes that are released at least once every
three months in what we call a "minor release."

And slightly further down:

While upgrading will always contain some level of risk, PostgreSQL minor
releases fix only frequently-encountered bugs, security issues, and data
corruption problems to reduce the risk associated with upgrading.


So unless you claim this is a frequently encountered bug (it's not -- it's
acting exactly as intended), a security issue (same) or data corruption
(unrelated), it should not go in a minor version. It's very clear.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: ICU for global collation

2022-06-25 Thread Justin Pryzby
commit f2553d43060edb210b36c63187d52a632448e1d2 says >=1500 in a few places,
but in pg_upgrade says <=1500, which looks wrong for upgrades from v15.
I think it should say <= 1400.

On Wed, Feb 02, 2022 at 02:01:23PM +0100, Peter Eisentraut wrote:
> --- a/src/bin/pg_dump/pg_dump.c
> +++ b/src/bin/pg_dump/pg_dump.c
> @@ -2792,6 +2794,10 @@ dumpDatabase(Archive *fout)
>   appendPQExpBuffer(dbQry, "datminmxid, ");
>   else
>   appendPQExpBuffer(dbQry, "0 AS datminmxid, ");
> + if (fout->remoteVersion >= 15)
> + appendPQExpBuffer(dbQry, "datcollprovider, ");
> + else
> + appendPQExpBuffer(dbQry, "'c' AS datcollprovider, ");
> 	appendPQExpBuffer(dbQry,
> 			  "(SELECT spcname FROM pg_tablespace t WHERE t.oid = dattablespace) AS tablespace, "
> 			  "shobj_description(oid, 'pg_database') AS description "

> diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
> index 69ef23119f..2a9ca0e389 100644
> --- a/src/bin/pg_upgrade/info.c
> +++ b/src/bin/pg_upgrade/info.c
> @@ -312,11 +312,20 @@ get_db_infos(ClusterInfo *cluster)
>   i_spclocation;
> 	char		query[QUERY_ALLOC];
>  
> 	snprintf(query, sizeof(query),
> -		 "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "
> +		 "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
> +	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)
> +		snprintf(query + strlen(query), sizeof(query) - strlen(query),
> +			 "'c' AS datcollprovider, NULL AS daticucoll, ");
> +	else
> +		snprintf(query + strlen(query), sizeof(query) - strlen(query),
> +			 "datcollprovider, daticucoll, ");
> +	snprintf(query + strlen(query), sizeof(query) - strlen(query),
> 		 "pg_catalog.pg_tablespace_location(t.oid) AS spclocation "
> 		 "FROM pg_catalog.pg_database d "
> 		 " LEFT OUTER JOIN pg_catalog.pg_tablespace t "

> --- a/src/bin/psql/describe.c
> +++ b/src/bin/psql/describe.c
> @@ -896,6 +896,18 @@ listAllDbs(const char *pattern, bool verbose)
> 			  gettext_noop("Encoding"),
> 			  gettext_noop("Collate"),
> 			  gettext_noop("Ctype"));
> +	if (pset.sversion >= 15)
> +		appendPQExpBuffer(&buf,
> +				  "   d.daticucoll as \"%s\",\n"
> +				  "   CASE d.datcollprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\",\n",
> +				  gettext_noop("ICU Collation"),
> +				  gettext_noop("Coll. Provider"));
> +	else
> +		appendPQExpBuffer(&buf,
> +				  "   d.datcollate as \"%s\",\n"
> +				  "   'libc' AS \"%s\",\n",
> +				  gettext_noop("ICU Collation"),
> +				  gettext_noop("Coll. Provider"));
> 	appendPQExpBufferStr(&buf, "   ");
> 	printACLColumn(&buf, "d.datacl");
> 	if (verbose)

> @@ -4617,6 +4629,15 @@ listCollations(const char *pattern, bool verbose, bool showSystem)
> 			  gettext_noop("Collate"),
> 			  gettext_noop("Ctype"));
>  
> +	if (pset.sversion >= 15)
> +		appendPQExpBuffer(&buf,
> +				  ",\n   c.collicucoll AS \"%s\"",
> +				  gettext_noop("ICU Collation"));
> +	else
> +		appendPQExpBuffer(&buf,
> +				  ",\n   c.collcollate AS \"%s\"",
> +				  gettext_noop("ICU Collation"));
> +
> 	if (pset.sversion >= 10)
> 		appendPQExpBuffer(&buf,
> 				  ",\n   CASE c.collprovider WHEN 'd' THEN 'default' WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\"",




Re: making relfilenodes 56 bits

2022-06-25 Thread Robert Haas
On Fri, Jun 24, 2022 at 9:30 PM Andres Freund  wrote:
> If we "inline" RelFileNumber, it's probably worth reorder the members so that
> the most distinguishing elements come first, to make it quicker to detect hash
> collisions. It shows up in profiles today...
>
> I guess it should be blockNum, fileNumber, forkNumber, dbOid, spcOid? I think
> as long as blockNum, fileNumber are first, the rest doesn't matter much.

Hmm, I guess we could do that. Possibly as a separate, very small patch.

> > And then like this in 0003:
> >
> > typedef struct buftag
> > {
> > Oid spcOid;
> > Oid dbOid;
> > RelFileNumber   fileNumber:56;
> > ForkNumber  forkNum:8;
> > } BufferTag;
>
> Probably worth checking the generated code / the performance effects of using
> bitfields (vs manual maskery). I've seen some awful cases, but here it's at a
> byte boundary, so it might be ok.

One advantage of using bitfields is that it might mean we don't need
to introduce accessor macros. Now, if that's going to lead to terrible
performance I guess we should go ahead and add the accessor macros -
Dilip had those in an earlier patch anyway. But it'd be nice if it
weren't necessary.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_upgrade (12->14) fails on aggregate

2022-06-25 Thread Andrey Borodin



> On 25 Jun 2022, at 01:28, Justin Pryzby  wrote:
> 
> But what I wrote already shows what you want.
Just tested that, you are right. My version was printing the name; I didn't know
regproc prints such a nice definition.

>  this is my latest.
> <0001-WIP-pg_upgrade-check-detect-old-polymorphics-from-pr.patch>

Let's rename "databases_with_old_polymorphics.txt" to something like
"old_polymorphics.txt" or maybe even "incompatible_polymorphics_usage.txt"?
I think you will come up with a better name; my point is that here everything
is in "databases", and "old" doesn't describe the essence of the problem.

Also, let's check that the OIDs of the functions used belong to the system
catalog (< 16384)?  We don't care about user-defined functions with the same name.

And, probably, we can do this unconditionally:
if (old_cluster.major_version >= 9500)
appendPQExpBufferStr(_polymorphics,
Nothing bad will happen if we blacklist usage of nonexistent functions.  I see
there's a lot of code to have a dynamic list; if you think this exclusion for
pre-9.5 is justified - OK, from my POV we can keep this code.

This comment is unneeded too:
// "AND aggtranstype='anyarray'::regtype

Thank you!

Best regards, Andrey Borodin.






Re: Core dump in range_table_mutator()

2022-06-25 Thread Dean Rasheed
On Sat, 25 Jun 2022 at 04:39, Tom Lane  wrote:
>
> Well, if we want to clean this up a bit rather than just doing the
> minimum safe fix ... I spent some time wondering why we were bothering
> with the FLATCOPY step at all, rather than just mutating the Query* pointer.
> I think the reason is to not fail if the QTW_DONT_COPY_QUERY flag is
> set, but maybe we should clear that flag when recursing?
>

Hmm, interesting, but we don't actually pass on that flag when
recursing anyway. Since it is the mutator routine's responsibility to
make a possibly-modified copy of its input node, if it wants to
recurse into the subquery, it should always call query_tree_mutator()
with QTW_DONT_COPY_QUERY unset, and range_table_mutator() should never
need to FLATCOPY() the subquery.

But then, in the interests of further tidying up, why does
range_table_mutator() call copyObject() on the subquery if
QTW_IGNORE_RT_SUBQUERIES is set? If QTW_IGNORE_RT_SUBQUERIES isn't
set, the mutator routine will either copy and modify the subquery, or
it will return the original unmodified subquery node via
expression_tree_mutator(), without copying it. So then if
QTW_IGNORE_RT_SUBQUERIES is set, why not also just return the original
unmodified subquery node?

So then the RTE_SUBQUERY case in range_table_mutator() would only have to do:

case RTE_SUBQUERY:
if (!(flags & QTW_IGNORE_RT_SUBQUERIES))
MUTATE(newrte->subquery, newrte->subquery, Query *);
break;

which wouldn't fall over if the subquery were NULL.

Regards,
Dean




Re: Race condition in TransactionIdIsInProgress

2022-06-25 Thread Simon Riggs
On Sat, 25 Jun 2022 at 10:18, Heikki Linnakangas  wrote:
>
> On 24/06/2022 04:43, Andres Freund wrote:
> > On 2022-06-23 22:03:27 +0300, Heikki Linnakangas wrote:
> >> In summary, I think we should:
> >> - commit and backpatch Simon's
> >> just_remove_TransactionIdIsKnownCompleted_call.v1.patch
> >> - fix pg_xact_status() to check TransactionIdIsInProgress()
> >> - in v15, remove TransactionIdIsKnownCompleted function altogether
> >>
> >> I'll try to get around to that in the next few days, unless someone beats 
> >> me
> >> to it.
> >
> > Makes sense.
>
> This is what I came up with for master. One difference from Simon's
> original patch is that I decided to not expose the
> TransactionIdIsKnownNotInProgress(), as there are no other callers of it
> in core, and it doesn't seem useful to extensions. I inlined it into the
> caller instead.

Looks good, thanks.

> BTW, should we worry about XID wraparound with the cache? Could you have
> a backend sit idle for 2^32 transactions, without making any
> TransactionIdIsKnownNotInProgress() calls? That's not new with this
> patch, though, it could happen with the single-item cache in transam.c, too.

Yes, as a separate patch only in PG15+, imho.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: Race condition in TransactionIdIsInProgress

2022-06-25 Thread Heikki Linnakangas

On 24/06/2022 04:43, Andres Freund wrote:

On 2022-06-23 22:03:27 +0300, Heikki Linnakangas wrote:

In summary, I think we should:
- commit and backpatch Simon's
just_remove_TransactionIdIsKnownCompleted_call.v1.patch
- fix pg_xact_status() to check TransactionIdIsInProgress()
- in v15, remove TransactionIdIsKnownCompleted function altogether

I'll try to get around to that in the next few days, unless someone beats me
to it.


Makes sense.


This is what I came up with for master. One difference from Simon's 
original patch is that I decided to not expose the 
TransactionIdIsKnownNotInProgress(), as there are no other callers of it 
in core, and it doesn't seem useful to extensions. I inlined it into the 
caller instead.


BTW, should we worry about XID wraparound with the cache? Could you have 
a backend sit idle for 2^32 transactions, without making any 
TransactionIdIsKnownNotInProgress() calls? That's not new with this 
patch, though, it could happen with the single-item cache in transam.c, too.


- Heikki

From de81639af2ce0ed635778d4f7cafb3dfc4fd84f5 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Sat, 25 Jun 2022 12:02:39 +0300
Subject: [PATCH 1/1] Fix visibility check when XID is committed in CLOG but
 not in procarray.

TransactionIdIsInProgress had a fast path to return 'false' if the
single-item CLOG cache said that the transaction was known to be
committed. However, that was wrong, because a transaction is first
marked as committed in the CLOG but doesn't become visible to others
until it has removed its XID from the proc array. That could lead to an
error:

ERROR:  t_xmin is uncommitted in tuple to be updated

or for an UPDATE to go ahead without blocking, before the previous
UPDATE on the same row was made visible.

The window is usually very short, but synchronous replication makes it
much wider, because the wait for synchronous replica happens in that
window.

Another thing that makes it hard to hit is that it's hard to get such
a commit-in-progress transaction into the single item CLOG cache.
Normally, if you call TransactionIdIsInProgress on such a transaction,
it determines that the XID is in progress without checking the CLOG
and without populating the cache. One way to prime the cache is to
explicitly call pg_xact_status() on the XID. Another way is to use a lot
of subtransactions, so that the subxid cache in the proc array is
overflowed, making TransactionIdIsInProgress rely on CLOG checks.

This has been broken ever since it was introduced in 2008, but the race
condition is very hard to hit, especially without synchronous
replication.  There were a couple of reports of this starting from
summer 2021, but no one was able to find the root cause then.

TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
but I left it in place in backbranches in case it's used by extensions.

Also change pg_xact_status() to check TransactionIdIsInProgress().
Previously, it only checked the CLOG, and returned "committed" before
the transaction was actually made visible to other queries. Note that
this also means that you cannot use pg_xact_status() to reproduce the
bug anymore, even if the code wasn't fixed.

Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
the pg_xact_status() change added by me.

Author: Simon Riggs
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
---
 src/backend/access/transam/transam.c | 27 -
 src/backend/storage/ipc/procarray.c  | 12 ++-
 src/backend/utils/adt/xid8funcs.c| 30 +++-
 src/include/access/transam.h |  1 -
 4 files changed, 23 insertions(+), 47 deletions(-)

diff --git a/src/backend/access/transam/transam.c b/src/backend/access/transam/transam.c
index dbc5f884e8..5865810135 100644
--- a/src/backend/access/transam/transam.c
+++ b/src/backend/access/transam/transam.c
@@ -219,33 +219,6 @@ TransactionIdDidAbort(TransactionId transactionId)
 	return false;
 }
 
-/*
- * TransactionIdIsKnownCompleted
- *		True iff transaction associated with the identifier is currently
- *		known to have either committed or aborted.
- *
- * This does NOT look into pg_xact but merely probes our local cache
- * (and so it's not named TransactionIdDidComplete, which would be the
- * appropriate name for a function that worked that way).  The intended
- * use is just to short-circuit TransactionIdIsInProgress calls when doing
- * repeated heapam_visibility.c checks for the same XID.  If this isn't
- * extremely fast then it will be counterproductive.
- *
- * Note:
- *		Assumes transaction identifier is valid.
- */
-bool
-TransactionIdIsKnownCompleted(TransactionId transactionId)
-{
-	if (TransactionIdEquals(transactionId, cachedFetchXid))
-	{
-		/* If it's in the cache at all, it must be completed. */
-		return true;
-	}
-
-	return false;
-}
-
 /*
  * TransactionIdCommitTree
  *		Marks the given 

Re: Future Postgres 15 and Clang 15

2022-06-25 Thread Fabien COELHO


Hello Thomas,


>> llvmjit.c:1233:81: error: too few arguments to function call, expected 3, have 2
>>         ref_gen = LLVMOrcCreateCustomCAPIDefinitionGenerator(llvm_resolve_symbols, NULL);
>
> Ah yes, I hadn't seen that one yet.  That function grew a "Dispose"
> argument, which we can just pass NULL for, at a guess:
>
> https://github.com/llvm/llvm-project/commit/14b7c108a2bf46541efc3a5c9cbd589b3afc18e6

I agree with the guess. Whether the NULL-induced semantics are the right
ones is much less obvious…



>> The question is: should pg 15 try to be clang 15 ready? I'm afraid yes, as
>> LLVM does 2 releases per year, so clang 15 should come out this Fall,
>> together with pg 15. Possibly other changes will come before the
>> releases :-/
>
> OK let's try to get a patch ready first and then see what we can do.
> I'm more worried about code that compiles OK but then crashes or gives
> wrong query results (like the one for 9b4e4cfe) than I am about code
> that doesn't compile at all (meaning no one can actually ship it!).  I
> think the way our two projects' release cycles work, there will
> occasionally be short periods where we can't use their very latest
> release, but we can try to avoid that...


Yep. The part that worries me is the code complexity and kludges
induced by trying to support a moving API. Maybe careful header-handled
macros can do the trick (e.g. for an added parameter as above), but I'm
afraid it cannot always be that simple.


--
Fabien.