Re: [EXTERNAL] Re: WIP: WAL prefetch (another approach)

2020-08-27 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Aug 27, 2020 at 2:51 PM Stephen Frost  wrote:
> > > Hm? At least earlier versions didn't do prefetching for records with an 
> > > fpw, and only for subsequent records affecting the same page or if not
> > > in s_b anymore.
> >
> > We don't actually read the page when we're replaying an FPW though..?
> > If we don't read it, and we entirely write the page from the FPW, how is
> > pre-fetching helping..?
> 
> Suppose there is a checkpoint. Then we replay a record with an FPW,
> pre-fetching nothing. Then the buffer gets evicted from
> shared_buffers, and maybe the OS cache too. Then, before the next
> checkpoint, we again replay a record for the same page. At this point,
> pre-fetching should be helpful.

Sure- but if we're talking about 25GB of WAL, on a server that's got
32GB, then why would those pages end up getting evicted from memory
entirely?  Particularly, enough of them to end up with such a huge
difference in replay time..

I do agree that if we've got more outstanding WAL between checkpoints
than the system's got memory then that certainly changes things, but
that wasn't what I understood the case to be here.

> Admittedly, I don't quite understand whether that is what is happening
> in this test case, or why SDD vs. HDD should make any difference. But
> there doesn't seem to be any reason why it doesn't make sense in
> theory.

I agree that this could be a reason, but it doesn't seem to quite fit in
this particular case given the amount of memory and WAL.  I'm suspecting
that it's something else and I'd very much like to know if it's a
general "this applies to all (most?  a lot of?) SSDs because the
hardware has a larger than 8KB page size and therefore the kernel has to
read it", or if it's something odd about this particular system and
doesn't apply generally.

Thanks,

Stephen


signature.asc
Description: PGP signature


New default role- 'pg_read_all_data'

2020-08-27 Thread Stephen Frost
Greetings,

There's no shortage of requests and responses regarding how to have a
'read all of the data' role in PG, with various hacks involving "GRANT
ALL" and "ALTER DEFAULT PRIVILEGES" to "solve" this, neither of which
really works long term ("GRANT ALL" is one-time, and "ALTER DEFAULT"
only helps for the roles that exist today).

Now that we have the default role system, we can provide a proper
solution to this oft-requested capability.

This patch adds a default role to meet specifically that use-case, in
the long-term, by explicitly allowing SELECT rights on all relations,
and USAGE rights on all schemas, for roles who are members of the new
'pg_read_all_data' role.

No effort is made to prevent a user who has this role from writing data-
that's up to the admin, but this will allow someone to use pg_dump or
pg_dumpall in a much more reliable manner to make sure that the entire
database is able to be exported for the purpose of backups, upgrades, or
other common use-cases, without having to have that same user be a PG
superuser.

This role is given the Bypass RLS right, though to use it effectively, a
user would need to pass '--role=pg_read_all_data' to pg_dump/pg_dumpall,
since role attributes are not checked as part of role membership.

Thoughts?

Will add to the September CF.

Thanks,

Stephen
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 829decd883..09b3cd6cb8 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -517,6 +517,12 @@ DROP ROLE doomed_role;
   
  
  
+  
+   pg_read_all_data
+   Read all data (tables, sequences), even without having explicit
+   GRANT rights to the object.  Implicitly includes USAGE rights on all
+   schemas.
+  
   
pg_read_all_settings
Read all configuration variables, even those normally visible only to
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index c626161408..aca1deeca8 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -3849,6 +3849,15 @@ pg_class_aclmask(Oid table_oid, Oid roleid,
 
 	ReleaseSysCache(tuple);
 
+	/*
+	 * Check if ACL_SELECT is being checked and, if so, and not set already
+	 * as part of the result, then check if the user is a member of the
+	 * pg_read_all_data role, which allows read access to all relations.
+	 */
+	if (mask & ACL_SELECT && !(result & ACL_SELECT) &&
+		has_privs_of_role(roleid, DEFAULT_ROLE_READ_ALL_DATA))
+		result |= ACL_SELECT;
+
 	return result;
 }
 
@@ -4175,6 +4184,14 @@ pg_namespace_aclmask(Oid nsp_oid, Oid roleid,
 
 	ReleaseSysCache(tuple);
 
+	/*
+	 * Check if ACL_USAGE is being checked and, if so, and not set already
+	 * as part of the result, then check if the user is a member of the
+	 * pg_read_all_data role, which allows usage access to all schemas.
+	 */
+	if (mask & ACL_USAGE && !(result & ACL_USAGE) &&
+		has_privs_of_role(roleid, DEFAULT_ROLE_READ_ALL_DATA))
+		result |= ACL_USAGE;
 	return result;
 }
 
diff --git a/src/include/catalog/pg_authid.dat b/src/include/catalog/pg_authid.dat
index 7c08851550..0dc4b2baec 100644
--- a/src/include/catalog/pg_authid.dat
+++ b/src/include/catalog/pg_authid.dat
@@ -25,6 +25,11 @@
   rolcreaterole => 'f', rolcreatedb => 'f', rolcanlogin => 'f',
   rolreplication => 'f', rolbypassrls => 'f', rolconnlimit => '-1',
   rolpassword => '_null_', rolvaliduntil => '_null_' },
+{ oid => '3435', oid_symbol => 'DEFAULT_ROLE_READ_ALL_DATA',
+  rolname => 'pg_read_all_data', rolsuper => 'f', rolinherit => 't',
+  rolcreaterole => 'f', rolcreatedb => 'f', rolcanlogin => 'f',
+  rolreplication => 'f', rolbypassrls => 't', rolconnlimit => '-1',
+  rolpassword => '_null_', rolvaliduntil => '_null_' },
 { oid => '3374', oid_symbol => 'DEFAULT_ROLE_READ_ALL_SETTINGS',
   rolname => 'pg_read_all_settings', rolsuper => 'f', rolinherit => 't',
   rolcreaterole => 'f', rolcreatedb => 'f', rolcanlogin => 'f',
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 3ec22c20ea..411525bcd1 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -24,8 +24,10 @@ CREATE USER regress_priv_user2;
 CREATE USER regress_priv_user3;
 CREATE USER regress_priv_user4;
 CREATE USER regress_priv_user5;
+CREATE USER regress_priv_user6;
 CREATE USER regress_priv_user5;	-- duplicate
 ERROR:  role "regress_priv_user5" already exists
+GRANT pg_read_all_data TO regress_priv_user6;
 CREATE GROUP regress_priv_group1;
 CREATE GROUP regress_priv_group2 WITH USER regress_priv_user1, regress_priv_user2;
 ALTER GROUP regress_priv_group1 ADD USER regress_priv_user4;
@@ -129,6 +131,21 @@ SELECT * FROM atest2 WHERE ( col1 IN ( SELECT b FROM atest1 ) );
 --+--
 (0 rows)
 
+SET SESSION AUTHORIZATION regress_priv_user6;
+SELECT * FROM atest1; -- ok
+ a |  b  
+---+-
+ 1 | two
+ 1 | two
+(2 rows)
+
+SELECT * FROM atest2; -- ok
+ col1 | col2 
+--+--
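Stripped of backend details, the check this patch adds to pg_class_aclmask() is a small bitmask rule. Below is a standalone sketch with a stubbed has_privs_of_role() and simplified AclMode values (both are assumptions for illustration, not the backend's actual definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the backend's AclMode bits (values assumed). */
typedef uint32_t AclMode;
#define ACL_INSERT (1 << 0)
#define ACL_SELECT (1 << 1)

/* Stub for has_privs_of_role(): pretend role 42 is in pg_read_all_data. */
static bool member_of_read_all_data(int roleid)
{
	return roleid == 42;
}

/*
 * Shape of the pg_class_aclmask() addition: add ACL_SELECT to the result
 * only if it was requested, is not already set, and the role is a member
 * of pg_read_all_data.
 */
static AclMode apply_read_all_data(AclMode mask, AclMode result, int roleid)
{
	if ((mask & ACL_SELECT) && !(result & ACL_SELECT) &&
		member_of_read_all_data(roleid))
		result |= ACL_SELECT;
	return result;
}
```

The ACL_USAGE check added to pg_namespace_aclmask() has the same shape.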

Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> Without having actually looked at the code, definite +1 for this feature.
> It's much requested...

Thanks.

> But, should we also have a pg_write_all_data to go along with it?

Perhaps, but could certainly be a different patch, and it'd need to be
better defined, it seems to me...  read_all is pretty straight-forward
(the general goal being "make pg_dumpall/pg_dump work"), what would
write mean?  INSERT?  DELETE?  TRUNCATE?  ALTER TABLE?  System catalogs?

Doesn't seem like you could just declare it to be 'allow pg_restore'
either, as that might include creating untrusted functions, et al.

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Georgios Kokolatos (gkokola...@protonmail.com) wrote:
> The patch seems to be implementing a useful and requested feature.
> The patch applies cleanly and passes the basic regress tests. Also the 
> commitfest bot is happy.
> 
> A first pass at the code has not revealed any worthwhile comments.
> Please allow me a second and more thorough pass. The commitfest has 
> hardly started after all.

Great, thanks!

> Also allow me a series of genuine questions: 
> 
> What would the behaviour be with REVOKE?
> In a sequence similar to:
> GRANT ALL ON ...

GRANT ALL would be independently GRANT'ing rights to some role and
therefore unrelated.

> REVOKE pg_read_all_data FROM ...

This would simply REVOKE that role from the user.  Privileges
independently GRANT'd directly to the user wouldn't be affected.  Nor
would other role membership.

> What privileges would the user be left with? Would it be possible to end up 
> in the same privilege only with a GRANT command?

I'm not sure what's being asked here.

> Does the above scenario even make sense?

I definitely believe it makes sense for a given role/user to be a member
of pg_read_all_data and to be a member of other roles, or to have other
privileges GRANT'd directly to them.

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Isaac Morland (isaac.morl...@gmail.com) wrote:
> On Fri, 28 Aug 2020 at 08:43, Stephen Frost  wrote:
> > This would simply REVOKE that role from the user.  Privileges
> > independently GRANT'd directly to the user wouldn't be affected.  Nor
> > would other role membership.
> >
> > > What privileges would the user be left with? Would it be possible to end
> > up in the same privilege only with a GRANT command?
> 
> What about:
> 
> REVOKE SELECT ON [table] FROM pg_read_all_data;

Wouldn't have any effect, and I think that's correct.

> I guess what I’m really asking is whether pg_read_all_data is automatically
> granted SELECT on all newly-created relations, or if the permission
> checking system always returns TRUE when asked if pg_read_all_data can
> select from a relation? I’m guessing it’s the latter so that it would be
> ineffective to revoke select privilege as I think this is more useful, but
> I’d like to be sure and the documentation should be explicit on this point.

Yes, it's the latter.  I'm not really sure about the documentation
change you're contemplating- have a specific suggestion?

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* gkokola...@pm.me (gkokola...@pm.me) wrote:
> On Friday, 28 August 2020 15:43, Stephen Frost  wrote:
> > > What privileges would the user be left with? Would it be possible to end 
> > > up in the same privilege only with a GRANT command?
> >
> > I'm not sure what's being asked here.
> 
> You are correct. My phrasing is not clear. Please be patient and allow me to 
> try again.
> 
> I was playing around with the code and I was trying a bit the opposite of 
> what you have submitted in the test file.
> 
> You have, (snipped):
> 
> GRANT pg_read_all_data TO regress_priv_user6;
> 
> SET SESSION AUTHORIZATION regress_priv_user6;
> SELECT * FROM atest1; -- ok
> INSERT INTO atest2 VALUES ('foo', true); -- fail
> 
> 
> I was expecting:
> REVOKE pg_read_all_data FROM regress_priv_user6;

Are you sure this REVOKE was successful..?

> SET SESSION AUTHORIZATION regress_priv_user6;
> SELECT * FROM atest1; -- fail
> INSERT INTO atest2 VALUES ('foo', true); -- ok

=# create role r1;
CREATE ROLE
=*# grant pg_read_all_data to r1;
GRANT ROLE
=*# create table t1 (c1 int);
CREATE TABLE
=*# set role r1;
=*> select * from t1;
 c1 

(0 rows)
=*> reset role;
RESET
=*# revoke pg_read_all_data from r1;
REVOKE ROLE
=*# set role r1;
SET
=*> select * from t1;
ERROR:  permission denied for table t1

Seems to be working as expected here.

> My expectation was not met since in my manual test (unless I made a mistake 
> which is entirely possible), the SELECT above did not fail. The insert did 
> succeed though.

That the INSERT worked seems pretty odd- could you post the exact
changes you've made to the regression tests, or the exact script where
you aren't seeing what you expect?  I've not been able to reproduce the
GRANT allowing a user to INSERT into a table.

> The first question: Was my expectation wrong?

If there aren't any other privileges involved, then REVOKE'ing the role
from a user should prevent that user from being able to SELECT from the
table.

> The second question: Is there a privilege that can be granted to 
> regress_priv_user6 that will not permit the select operation but will permit 
> the insert operation? If no, should there be one?

GRANT INSERT ON atest1 TO regress_priv_user6; would allow just
INSERT'ing.

Magnus also brought up the idea of a 'write_all_data' role, but that's
pretty independent of this, imv.  Not against adding it, if we can agree
as to what it means, exactly, but we should probably discuss over in
that sub-thread.

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Fri, Aug 28, 2020 at 2:38 PM Stephen Frost  wrote:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> > > Without having actually looked at the code, definite +1 for this feature.
> > > It's much requested...
> >
> > Thanks.
> >
> > > But, should we also have a pg_write_all_data to go along with it?
> >
> > Perhaps, but could certainly be a different patch, and it'd need to be
> > better defined, it seems to me...  read_all is pretty straight-forward
> > (the general goal being "make pg_dumpall/pg_dump work"), what would
> > write mean?  INSERT?  DELETE?  TRUNCATE?  ALTER TABLE?  System catalogs?
> 
> Well, it's pg_write_all_*data*, so it certainly wouldn't be alter table or
> system catalogs.
> 
> I'd say insert/update/delete yes.
> 
> TRUNCATE is always an outlier.  Given it's generally classified as DDL, I
> wouldn't include it.

Alright, that seems like it'd be pretty easy.  We already have a check
in pg_class_aclmask to disallow modification of system catalogs w/o
being a superuser, so we should be alright to add a similar check for
insert/update/delete just below where I added the SELECT check.
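A standalone sketch of what that additional check could look like, with a stubbed role membership test and made-up AclMode bit values (neither is the backend's actual definition):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the backend's AclMode bits (values assumed). */
typedef uint32_t AclMode;
#define ACL_INSERT (1 << 0)
#define ACL_SELECT (1 << 1)
#define ACL_UPDATE (1 << 2)
#define ACL_DELETE (1 << 3)
#define ACL_WRITE_BITS (ACL_INSERT | ACL_UPDATE | ACL_DELETE)

/* Stub for has_privs_of_role(): pretend role 1 is in pg_write_all_data. */
static bool member_of_write_all_data(int roleid)
{
	return roleid == 1;
}

/*
 * Sketch of the proposed check below the SELECT one: grant only the
 * *requested* write bits, and only when none of them are already set.
 */
static AclMode apply_write_all_data(AclMode mask, AclMode result, int roleid)
{
	if ((mask & ACL_WRITE_BITS) && !(result & ACL_WRITE_BITS) &&
		member_of_write_all_data(roleid))
		result |= (mask & ACL_WRITE_BITS);
	return result;
}
```

Note that only the bits actually requested in the mask are added, mirroring how the SELECT check only adds ACL_SELECT.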

> > Doesn't seem like you could just declare it to be 'allow pg_restore'
> > either, as that might include creating untrusted functions, et al.
> 
> No definitely not. That wouldn't be the usecase at all :)

Good. :)

> (and fwiw to me the main use case for read_all_data also isn't pg_dump,
> because most people using pg_dump are already db owner or higher in my
> experience. But it is nice that it helps with that too)

Glad to have confirmation that there's other use-cases this helps with.

I'll post an updated patch with that added in a day or two.

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Gilles Darold (gil...@darold.net) wrote:
> Le 28/08/2020 à 15:26, Stephen Frost a écrit :
> >* Magnus Hagander (mag...@hagander.net) wrote:
> >>On Fri, Aug 28, 2020 at 2:38 PM Stephen Frost  wrote:
> >>>* Magnus Hagander (mag...@hagander.net) wrote:
> >>>>Without having actually looked at the code, definite +1 for this feature.
> >>>>It's much requested...
> >>>Thanks.
> >>>
> >>>>But, should we also have a pg_write_all_data to go along with it?
> >>>Perhaps, but could certainly be a different patch, and it'd need to be
> >>>better defined, it seems to me...  read_all is pretty straight-forward
> >>>(the general goal being "make pg_dumpall/pg_dump work"), what would
> >>>write mean?  INSERT?  DELETE?  TRUNCATE?  ALTER TABLE?  System catalogs?
> >>Well, it's pg_write_all_*data*, so it certainly wouldn't be alter table or
> >>system catalogs.
> >>
> >>I'd say insert/update/delete yes.
> >>
> >>TRUNCATE is always an outlier.  Given it's generally classified as DDL, I
> >>wouldn't include it.
> >Alright, that seems like it'd be pretty easy.  We already have a check
> >in pg_class_aclmask to disallow modification of system catalogs w/o
> >being a superuser, so we should be alright to add a similar check for
> >insert/update/delete just below where I added the SELECT check.
> >
> >>>Doesn't seem like you could just declare it to be 'allow pg_restore'
> >>>either, as that might include creating untrusted functions, et al.
> >>No definitely not. That wouldn't be the usecase at all :)
> >Good. :)
> >
> >>(and fwiw to me the main use case for read_all_data also isn't pg_dump,
> >>because most people using pg_dump are already db owner or higher in my
> >>experience. But it is nice that it helps with that too)
> >Glad to have confirmation that there's other use-cases this helps with.
> >
> >I'll post an updated patch with that added in a day or two.
> 
> Looking at this thread I was thinking about the FDW. Will the
> pg_read_all_data role allow executing pg_dump with the
> --include-foreign-data option? If so, what about privileges on the FDW
> and the foreign server? If this is the behavior we want, I guess the
> modification should be applied to pg_foreign_data_wrapper_aclmask() and
> pg_foreign_server_aclmask() too.

Using an FDW will often also require having a user mapping and there's
no way for that to be accomplished through only GRANT'ing a default
role, so I don't think we should mix this specific role with the FDW
permissions system.

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-28 Thread Stephen Frost
Greetings,

* Gilles Darold (gil...@darold.net) wrote:
> Le 28/08/2020 à 16:52, Stephen Frost a écrit :
> >Using an FDW will often also require having a user mapping and there's
> >no way for that to be accomplished through only GRANT'ing a default
> >role, so I don't think we should mix this specific role with the FDW
> >permissions system.
> 
> I'm fine with that, perhaps it should be mentioned in the documentation that
> foreign tables are not covered by this role.

We could say it doesn't GRANT CONNECT rights on databases, or EXECUTE on
functions too, but that doesn't seem like a terribly good approach for
the documentation to take- instead we document specifically what IS
included, which seems sufficiently clear to me.

Thanks,

Stephen




Re: Ideas about a better API for postgres_fdw remote estimates

2020-08-29 Thread Stephen Frost
Greetings,

* Andrey Lepikhov (a.lepik...@postgrespro.ru) wrote:
> During the implementation of sharding-related improvements I noticed that
> if we use a lot of foreign partitions, we get bad plans because vacuum
> doesn't update the statistics of foreign tables.  That is done by the
> ANALYZE command, but it is a very expensive operation for a foreign table.
> The problem with statistics is demonstrated by the TAP test in the first
> attached patch.

Yes, the way we handle ANALYZE today for FDWs is pretty terrible, since
we stream the entire table across to do it.

> I implemented some FDW + pg core machinery to reduce the weight of the
> problem.  The ANALYZE command on a foreign table executes a query on the
> foreign server that extracts the statistics tuple, serializes it into a
> json-formatted string, and returns it to the caller.  The caller
> deserializes this string, generates statistics for the foreign table, and
> updates it.  The second patch is a proof-of-concept.

Isn't this going to create a version dependency that we'll need to deal
with..?  What if a newer major version has some kind of improved ANALYZE
command, in terms of what it looks at or stores, and it's talking to an
older server?

When I was considering the issue with ANALYZE and FDWs, I had been
thinking it'd make sense to just change the query that's built in
deparseAnalyzeSql() to have a TABLESAMPLE clause, but otherwise run in
more-or-less the same manner as today.  If we don't like the available
TABLESAMPLE methods then we could add a new one that's explicitly the
'right' sample for an ANALYZE call and use that when it's available on
the remote side.  Not sure if it'd make any sense for ANALYZE itself to
start using that same TABLESAMPLE code, but maybe?  Not that I think
it'd be much of an issue if it's independent either, with appropriate
comments to note that we should probably try to make them match up for
the sake of FDWs.
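To make the idea concrete, here is a hypothetical sketch of the query such a deparse step could build; the function name, signature, and the SYSTEM sampling method are assumptions for illustration, not postgres_fdw's actual deparseAnalyzeSql() code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the remote sampling query an ANALYZE of a
 * foreign table could send, instead of streaming the whole table.
 */
static const char *deparse_analyze_sample_sql(const char *relname,
											  double percent)
{
	static char buf[256];

	snprintf(buf, sizeof(buf),
			 "SELECT * FROM %s TABLESAMPLE SYSTEM (%.2f)",
			 relname, percent);
	return buf;
}
```

A newer remote server could advertise a dedicated sampling method tuned for ANALYZE, with the sketch above falling back to SYSTEM when that's unavailable.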

> This patch speeds up the ANALYZE command and keeps the statistics of a
> foreign table relevant after an autovacuum operation.  Its effectiveness
> depends on the relevance of the statistics on the remote server, but still.

If we do decide to go down this route, wouldn't it mean we'd have to
solve the problem of what to do when it's a 9.6 foreign server being
queried from a v12 server and dealing with any difference in the
statistics structures of the two?

Seems like we would... in which case I would say that we should pull
that bit out and make it general, and use it for pg_upgrade too, which
would benefit a great deal from having the ability to upgrade stats
between major versions also.  That's a much bigger piece to take on, of
course, but seems to be what's implied with this approach for the FDW.

Thanks,

Stephen




Re: [EXTERNAL] Re: WIP: WAL prefetch (another approach)

2020-08-30 Thread Stephen Frost
Greetings,

* Tomas Vondra (tomas.von...@2ndquadrant.com) wrote:
> On Thu, Aug 27, 2020 at 04:28:54PM -0400, Stephen Frost wrote:
> >* Robert Haas (robertmh...@gmail.com) wrote:
> >>On Thu, Aug 27, 2020 at 2:51 PM Stephen Frost  wrote:
> >>> > Hm? At least earlier versions didn't do prefetching for records with an 
> >>> > fpw, and only for subsequent records affecting the same page or if not
> >>> > in s_b anymore.
> >>>
> >>> We don't actually read the page when we're replaying an FPW though..?
> >>> If we don't read it, and we entirely write the page from the FPW, how is
> >>> pre-fetching helping..?
> >>
> >>Suppose there is a checkpoint. Then we replay a record with an FPW,
> >>pre-fetching nothing. Then the buffer gets evicted from
> >>shared_buffers, and maybe the OS cache too. Then, before the next
> >>checkpoint, we again replay a record for the same page. At this point,
> >>pre-fetching should be helpful.
> >
> >Sure- but if we're talking about 25GB of WAL, on a server that's got
> >32GB, then why would those pages end up getting evicted from memory
> >entirely?  Particularly, enough of them to end up with such a huge
> >difference in replay time..
> >
> >I do agree that if we've got more outstanding WAL between checkpoints
> >than the system's got memory then that certainly changes things, but
> >that wasn't what I understood the case to be here.
> 
> I don't think it's very clear how much WAL there actually was in each
> case - the message only said there was more than 25GB, but who knows how
> many checkpoints that covers? In the cases with FPW=on this may easily
> be much less than one checkpoint (because with scale 45GB an update to
> every page will log 45GB of full-page images). It'd be interesting to
> see some stats from pg_waldump etc.

Also in the message was this:

--
In order to avoid checkpoints during benchmark, max_wal_size(200GB) and
checkpoint_timeout(200 mins) are set to a high value.
--

Which led me to suspect, at least, that this was much less than a
checkpoint, as you suggest.  Also, given that the comment was 'run is
cancelled when there is a reasonable amount of WAL (>25GB)', it seems
likely that it's at least *around* there.

Ultimately though, there just isn't enough information provided to
really be able to understand what's going on.  I agree, pg_waldump stats
would be useful.

> >>Admittedly, I don't quite understand whether that is what is happening
> >>in this test case, or why SDD vs. HDD should make any difference. But
> >>there doesn't seem to be any reason why it doesn't make sense in
> >>theory.
> >
> >I agree that this could be a reason, but it doesn't seem to quite fit in
> >this particular case given the amount of memory and WAL.  I'm suspecting
> >that it's something else and I'd very much like to know if it's a
> >general "this applies to all (most?  a lot of?) SSDs because the
> >hardware has a larger than 8KB page size and therefore the kernel has to
> >read it", or if it's something odd about this particular system and
> >doesn't apply generally.
> 
> Not sure. I doubt it has anything to do with the hardware page size,
> that's mostly transparent to the kernel anyway. But it might be that the
> prefetching on a particular SSD has more overhead than what it saves.

Right- I wouldn't have thought the hardware page size would matter
either, but it's entirely possible that assumption is wrong and that it
does matter for some reason- perhaps with just some SSDs, or maybe with
a lot of them, or maybe there's something else entirely going on.  About
all I feel like I can say at the moment is that I'm very interested in
ways to make WAL replay go faster and it'd be great to get more
information about what's going on here to see if there's something we
can do to generally improve WAL replay.
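For what it's worth, the mechanism under discussion boils down to issuing read hints ahead of replay.  A minimal illustration using posix_fadvise(), where the helper name and the 8KB block size are my assumptions rather than the patch's actual code:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Illustration only: the kind of hint a WAL prefetcher can issue.
 * posix_fadvise(..., POSIX_FADV_WILLNEED) asks the kernel to start
 * reading a block before replay actually needs it, so the later
 * synchronous read hopefully hits the page cache.
 */
static int prefetch_block(int fd, long blkno, long blksz)
{
	return posix_fadvise(fd, (off_t) blkno * blksz, blksz,
						 POSIX_FADV_WILLNEED);
}
```

Whether such hints help or hurt on a given SSD is exactly the open question here: the advice itself has some overhead, and the device may already be fast enough that it saves nothing.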

Thanks,

Stephen




Re: New default role- 'pg_read_all_data'

2020-08-30 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Fri, Aug 28, 2020 at 2:38 PM Stephen Frost  wrote:
> > > * Magnus Hagander (mag...@hagander.net) wrote:
> > > > Without having actually looked at the code, definite +1 for this 
> > > > feature.
> > > > It's much requested...
> > >
> > > Thanks.
> > >
> > > > But, should we also have a pg_write_all_data to go along with it?
> > >
> > > Perhaps, but could certainly be a different patch, and it'd need to be
> > > better defined, it seems to me...  read_all is pretty straight-forward
> > > (the general goal being "make pg_dumpall/pg_dump work"), what would
> > > write mean?  INSERT?  DELETE?  TRUNCATE?  ALTER TABLE?  System catalogs?
> > 
> > Well, it's pg_write_all_*data*, so it certainly wouldn't be alter table or
> > system catalogs.
> > 
> > I'd say insert/update/delete yes.
> > 
> > TRUNCATE is always an outlier.  Given it's generally classified as DDL, I
> > wouldn't include it.
> 
> Alright, that seems like it'd be pretty easy.  We already have a check
> in pg_class_aclmask to disallow modification of system catalogs w/o
> being a superuser, so we should be alright to add a similar check for
> insert/update/delete just below where I added the SELECT check.
> 
> > > Doesn't seem like you could just declare it to be 'allow pg_restore'
> > > either, as that might include creating untrusted functions, et al.
> > 
> > No definitely not. That wouldn't be the usecase at all :)
> 
> Good. :)
> 
> > (and fwiw to me the main use case for read_all_data also isn't pg_dump,
> > because most people using pg_dump are already db owner or higher in my
> > experience. But it is nice that it helps with that too)
> 
> Glad to have confirmation that there's other use-cases this helps with.
> 
> I'll post an updated patch with that added in a day or two.

Here's that updated patch, comments welcome.

Thanks!

Stephen
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
index 829decd883..1213125bfd 100644
--- a/doc/src/sgml/user-manag.sgml
+++ b/doc/src/sgml/user-manag.sgml
@@ -517,6 +517,20 @@ DROP ROLE doomed_role;
   
  
  
+  
+   pg_read_all_data
+   Read all data (tables, views, sequences), as if having SELECT
+   rights on those objects, and USAGE rights on all schemas, even without
+   having it explicitly.  This role also has BYPASSRLS
+   set for it.
+  
+  
+   pg_write_all_data
+   Write all data (tables, views, sequences), as if having INSERT,
+   UPDATE, and DELETE rights on those objects, and USAGE rights on all
+   schemas, even without having it explicitly.  This role also has
+   BYPASSRLS set for it.
+  
   
pg_read_all_settings
Read all configuration variables, even those normally visible only to
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index c626161408..3e6d060554 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -3849,6 +3849,26 @@ pg_class_aclmask(Oid table_oid, Oid roleid,
 
 	ReleaseSysCache(tuple);
 
+	/*
+	 * Check if ACL_SELECT is being checked and, if so, and not set already
+	 * as part of the result, then check if the user is a member of the
+	 * pg_read_all_data role, which allows read access to all relations.
+	 */
+	if (mask & ACL_SELECT && !(result & ACL_SELECT) &&
+		has_privs_of_role(roleid, DEFAULT_ROLE_READ_ALL_DATA))
+		result |= ACL_SELECT;
+
+	/*
+	 * Check if ACL_INSERT, ACL_UPDATE, or ACL_DELETE is being checked
+	 * and, if so, and not set already as part of the result, then check
+	 * if the user is a member of the pg_write_all_data role, which
+	 * allows INSERT/UPDATE/DELETE access to all relations.
+	 */
+	if (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE) &&
+	   !(result & (ACL_INSERT | ACL_UPDATE | ACL_DELETE)) &&
+		has_privs_of_role(roleid, DEFAULT_ROLE_WRITE_ALL_DATA))
+		result |= (mask & (ACL_INSERT | ACL_UPDATE | ACL_DELETE));
+
 	return result;
 }
 
@@ -4175,6 +4195,16 @@ pg_namespace_aclmask(Oid nsp_oid, Oid roleid,
 
 	ReleaseSysCache(tuple);
 
+	/*
+	 * Check if ACL_USAGE is being checked and, if so, and not set already
+	 * as part of the result, then check if the user is a member of the
+	 * pg_read_all_data or pg_write_all_data roles, which allow usage
+	 * access to all schemas.
+	 */
+	if (mask & ACL_USAGE && !(result & ACL_USAGE) &&
+		(has_privs_of_role(roleid, DEFAULT_ROLE_READ_ALL_DATA) ||
+		has_privs_of_role(roleid, DEFAULT_ROLE_WRITE_ALL_DATA)))

Re: Ideas about a better API for postgres_fdw remote estimates

2020-08-31 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Sat, Aug 29, 2020 at 12:50:59PM -0400, Tom Lane wrote:
> > Stephen Frost  writes:
> > > Isn't this going to create a version dependency that we'll need to deal
> > > with..?  What if a newer major version has some kind of improved ANALYZE
> > > command, in terms of what it looks at or stores, and it's talking to an
> > > older server?
> > 
> > Yeah, this proposal is a nonstarter unless it can deal with the remote
> > server being a different PG version with different stats.
> > 
> > Years ago (when I was still at Salesforce, IIRC, so ~5 years) we had
> > some discussions about making it possible for pg_dump and/or pg_upgrade
> > to propagate stats data forward to the new database.  There is at least
> > one POC patch in the archives for doing that by dumping the stats data
> > wrapped in a function call, where the target database's version of the
> > function would be responsible for adapting the data if necessary, or
> > maybe just discarding it if it couldn't adapt.  We seem to have lost
> > interest but it still seems like something worth pursuing.  I'd guess
> > that if such infrastructure existed it could be helpful for this.
> 
> I don't think there was enough value to do statistics migration just for
> pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might
> have enough demand to justify the required work and maintenance.

Not sure that it really matters much, but I disagree with the assessment
that there wasn't enough value to do it for pg_upgrade; I feel that it
just hasn't been something that's had enough people interested in
working on it, which isn't the same thing.

If a good patch showed up tomorrow, with someone willing to spend time
on it, I definitely think it'd be something we should include even if
it's just for pg_upgrade.  A solution that works for both pg_upgrade and
the postgres FDW would be even better, of course.

Thanks,

Stephen




Re: Ideas about a better API for postgres_fdw remote estimates

2020-08-31 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Aug 31, 2020 at 12:19:52PM -0400, Stephen Frost wrote:
> > * Bruce Momjian (br...@momjian.us) wrote:
> > > I don't think there was enough value to do statistics migration just for
> > > pg_upgrade, but doing it for pg_upgrade and FDWs seems like it might
> > > have enough demand to justify the required work and maintenance.
> > 
> > Not sure that it really matters much, but I disagree with the assessment
> > that there wasn't enough value to do it for pg_upgrade; I feel that it
> > just hasn't been something that's had enough people interested in
> > working on it, which isn't the same thing.
> 
> I am not sure what point you are trying to make, but if it had enough
> value, wouldn't people work on it, or are you saying that it had enough
> value, but people didn't realize it, so didn't work on it?  I guess I
> can see that.  For me, it was the maintenance burden that always scared
> me from getting involved since it would be the rare case where
> pg_upgrade would have to be modified for perhaps every major release.

The point I was making was that it has value and people did realize it
but there's only so many resources to go around when it comes to hacking
on PG and therefore it simply hasn't been done yet.

There's a big difference between "yes, we all agree that would be good
to have, but no one has had time to work on it" and "we don't think this
is worth having because of the maintenance work it'd require."  The
latter shuts down anyone thinking of working on it, which is why I said
anything.

> > If a good patch showed up tomorrow, with someone willing to spend time
> > on it, I definitely think it'd be something we should include even if
> > it's just for pg_upgrade.  A solution that works for both pg_upgrade and
> > the postgres FDW would be even better, of course.
> 
> Yep, see above.  The problem isn't mostly the initial patch, but someone
> who is going to work on it and test it for every major release,
> potentially forever.   Frankly, this is a pg_dump feature, rather than
> something pg_upgrade should be doing, because not having to run ANALYZE
> after restoring a dump is also a needed feature.

I tend to agree with it being more of a pg_dump issue- but that also
shows that your assessment above doesn't actually fit, because we
definitely change pg_dump every release.  Consider that if someone wants
to add some new option to CREATE TABLE, which gets remembered in the
catalog, they have to make sure that pg_dump support is added for that.
If we added the statistics export/import to pg_dump, someone changing
those parts of the system would also be expected to update pg_dump to
manage those changes, including working with older versions of PG.

Thanks,

Stephen




Re: Ideas about a better API for postgres_fdw remote estimates

2020-08-31 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Aug 31, 2020 at 12:56:21PM -0400, Stephen Frost wrote:
> > The point I was making was that it has value and people did realize it
> > but there's only so many resources to go around when it comes to hacking
> > on PG and therefore it simply hasn't been done yet.
> > 
> > There's a big difference between "yes, we all agree that would be good
> > to have, but no one has had time to work on it" and "we don't think this
> > is worth having because of the maintenance work it'd require."  The
> > latter shuts down anyone thinking of working on it, which is why I said
> > anything.
> 
> I actually don't know which statement above is correct, because of the
> "forever" maintenance.

I can understand not being sure which is correct, and we can all have
different points of view on it too, but that's a much softer stance than
what I, at least, understood from your up-thread comment which was-

> I don't think there was enough value to do statistics migration just
> for pg_upgrade [...]

That statement came across to me as saying the latter statement above.
Perhaps that wasn't what you intended it to, in which case it's good to
have the discussion and clarify it, for others who might be following
this thread and wondering if they should consider working on this area
of the code or not.

> Yes, very true, but technically any change in any aspect of the
> statistics system would require modification of the statistics dump,
> which usually isn't required for most feature changes.

Feature work either requires changes to pg_dump, or not.  I agree that
features which don't require pg_dump changes are definitionally less
work than features which do (presuming the rest of the feature is the
same in both cases) but that isn't a justification to not have pg_dump
support in cases where it's expected- we just don't currently expect it
for statistics (which is a rather odd exception when you consider that
nearly everything else that ends up in the catalog tables is included).

For my part, at least, I'd like to see us change that expectation, for a
number of reasons:

- pg_upgrade could leverage it and reduce downtime and/or confusion for
  users who are upgrading and dealing with poor statistics or no
  statistics for however long after the upgrade

- Tables restored wouldn't require an ANALYZE to get reasonable query
  plans against them

- Debugging query plans would involve a lot less guess-work, and less
  asking the user to export the statistics by hand from the catalog and
  then hand-hacking them in to try and reproduce what's happening,
  particularly when re-running an ANALYZE ends up giving different
  results, which isn't uncommon for edge cases

- The postgres_fdw would be able to leverage this, as discussed earlier
  on in this thread

- Logical replication could potentially leverage the existing stats and
  not require ANALYZE to be done after an import, leading to more
  predictable query plans on the replica

I suspect there's probably other benefits than the ones above, but these
all seem pretty valuable to me.

Thanks,

Stephen




Re: Kerberos support broken on MSVC builds for Windows x64?

2020-09-01 Thread Stephen Frost
Greetings,

* Dave Page (dp...@pgadmin.org) wrote:
> Attached is a patch against 12.4 for the build system in case anyone wants
> to play (I'll do it properly against the head branch later). I'm guessing
> this will work for < 12, as with 12 I'm now getting the following which
> looks like it's related to GSS encryption:
> 
> "C:\Users\dpage\Downloads\postgresql-12.4\pgsql.sln" (default target) (1) ->
> "C:\Users\dpage\Downloads\postgresql-12.4\pgcrypto.vcxproj" (default
> target) (2) ->
> "C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj" (default
> target) (3) ->
> (Link target) ->
>   be-secure-gssapi.obj : error LNK2019: unresolved external symbol setenv
> referenced in function secure_open_gssapi
> [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
>   .\Release\postgres\postgres.exe : fatal error LNK1120: 1 unresolved
> externals [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
> 
> I'll dig into that some more.

Yes, that'd be in the GSSENC code, which I hadn't been expecting to be
used under Windows.  If you're successful, I don't have any issue
helping to make that work, though I'm curious if you're trying to build
with MIT KfW (which is rather ancient these days, being based on krb5
1.13 and not updated since..) or with a more current release...?

Of course, it'd be good to get a buildfarm animal in place that's
actually testing this if we're going to make it work.

Regarding the setenv() call, we should be able to use pgwin32_putenv() in
its place on Windows, I'd think..?

Thanks,

Stephen




Re: Kerberos support broken on MSVC builds for Windows x64?

2020-09-02 Thread Stephen Frost
Greetings,

* Dave Page (dp...@pgadmin.org) wrote:
> On Tue, Sep 1, 2020 at 5:29 PM Stephen Frost  wrote:
> > * Dave Page (dp...@pgadmin.org) wrote:
> > > Attached is a patch against 12.4 for the build system in case anyone
> > wants
> > > to play (I'll do it properly against the head branch later). I'm guessing
> > > this will work for < 12, as with 12 I'm now getting the following which
> > > looks like it's related to GSS encryption:
> > >
> > > "C:\Users\dpage\Downloads\postgresql-12.4\pgsql.sln" (default target)
> > (1) ->
> > > "C:\Users\dpage\Downloads\postgresql-12.4\pgcrypto.vcxproj" (default
> > > target) (2) ->
> > > "C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj" (default
> > > target) (3) ->
> > > (Link target) ->
> > >   be-secure-gssapi.obj : error LNK2019: unresolved external symbol setenv
> > > referenced in function secure_open_gssapi
> > > [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
> > >   .\Release\postgres\postgres.exe : fatal error LNK1120: 1 unresolved
> > > externals [C:\Users\dpage\Downloads\postgresql-12.4\postgres.vcxproj]
> > >
> > > I'll dig into that some more.
> >
> > Yes, that'd be in the GSSENC code, which I hadn't been expecting to be
> > used under Windows.  If you're successful, I don't have any issue
> > helping to make that work, though I'm curious if you're trying to build
> > with MIT KfW (which is rather ancient these days, being based on krb5
> > 1.13 and not updated since..) or with a more current release...?
> 
> I'm currently using the KFW 4.1 build from MIT. I've tried building it
> myself but it requires a very old toolchain (which defeated the point of
> what I was trying to do at the time).

> I haven't yet looked to see if the source for krb5-1.8.2 will build or even
> has the right bits in it for Windows - as I'm sure you know MIT seem to
> maintain an entirely different version for Windows for which I assume
> there's a reason.

I'm a bit confused as to why you'd consider trying 1.8.2- did you mean
1.18.2 there, perhaps..?  That's what I would think to try, since, as I
understand it from following the Kerberos dev list (which is pretty
responsive...), it has been updated to work with newer Windows build
toolchains.

> > Of course, it'd be good to get a buildfarm animal in place that's
> > actually testing this if we're going to make it work.
> 
> Fixing the config on hamerkop should deal with that I think. Though I am
> confused as to why the Buildfarm UI thinks it has Kerberos support enabled
> - did we change the config parameter from krb5 to gss some time prior to
> 9.5? If so, that could explain it.

Looks to be run by SRA OSS..  Perhaps reaching out to them to ask about
it would help?

> > Regarding the setenv() call, should be able to use pgwin32_putenv() in
> > place on Windows, I'd think..?
> 
> Right, I imagine so. It's on my todo...

Alright.

Thanks,

Stephen




Re: Kerberos support broken on MSVC builds for Windows x64?

2020-09-02 Thread Stephen Frost
Greetings,

* Dave Page (dp...@pgadmin.org) wrote:
> On Wed, Sep 2, 2020 at 9:05 AM Dave Page  wrote:
> >> Yes, that'd be in the GSSENC code, which I hadn't been expecting to be
> >> used under Windows.
>
> Here's a patch to make it build successfully (against head). I believe the
> changes to Solution.pm should be back patched all the way, and the rest to
> 12.

Looks about right..  I might pull out the code from both places that are
setting that variable into a dedicated function to be used from both
though.

> Testing however, has been more problematic - I suspect at least partly
> because of my Kerberos newbie-ness. I have a test server in an Ubuntu VM,
> which I've used quite successfully to authenticate against another VM
> running PG 12  on Ubuntu, from both Ubuntu and Windows clients. Using that,
> but with a Windows client running MIT Kerberos I find that getting a ticket
> takes a good 30 seconds or so. Postgres also seems to get its ticket
> successfully via the keytab file:

So, from Windows clients that don't have MIT KfW installed, you're able
to authenticate against PG 12 on Ubuntu using Kerberos, right..?  With
PG built using SSPI on the client side, I'm guessing?

Kerberos uses reverse DNS to try to check what hostname to use when
requesting a ticket, I wonder if what you're seeing here is a delay due
to there not being reverse DNS functional in the environment, perhaps..?

> C:\pg>"c:\Program Files\MIT\Kerberos\bin\klist.exe"
> Ticket cache: API:Initial default ccache
> Default principal: dp...@pgadmin.org
> 
> Valid starting       Expires            Service principal
> 09/02/20 15:06:49  09/03/20 01:06:49  krbtgt/pgadmin@pgadmin.org
> renew until 09/03/20 15:06:31
> 09/02/20 15:07:06  09/03/20 01:06:49  postgres/win-ilt1arj8a9c@
> renew until 09/03/20 15:06:31
> 09/02/20 15:07:06  09/03/20 01:06:49  postgres/win-ilt1arj8...@pgadmin.org
> renew until 09/03/20 15:06:31
> 
> However, If I try to login using host + gss in the pg_hba.conf file, I then
> get:
> 
> C:\pg>bin\psql postgres
> psql: error: could not connect to server: SSPI continuation error: No
> credentials are available in the security package
>  (8009030e)

This is with PG compiled with GSS on the client side and using MIT KfW?

This particular error from SSPI seems to possibly be coming from the
constrained delegation system.  While not directly about this issue,
Microsoft has some documentation about configuring constrained
delegation (and how to turn it off) here:

https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/deploy/Set-up-hosts-for-live-migration-without-Failover-Clustering

Now, we aren't actually delegating credentials here, so it seems a bit
odd for it to be complaining about that, but perhaps it's throwing this
error because the MIT KfW library has no clue about constrained
delegation and therefore wouldn't be trying to enforce it.

> If I try to use hostgssenc + gss, it looks like it's not even trying to
> encrypt:
> 
> C:\pg>bin\psql postgres
> psql: error: could not connect to server: FATAL:  no pg_hba.conf entry for
> host "::1", user "dpage", database "postgres", SSL off
> 
> Any ideas?

If it's not trying then I would be suspicious that the
gss_acquire_creds() call is saying that there isn't a credential cache,
though that would be a bit odd given that klist seems to be working.

Would certainly be interesting to see if 1.18.2 changes anything in this
regard.

Thanks,

Stephen




Re: shared-memory based stats collector

2020-09-03 Thread Stephen Frost
Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> At Mon, 01 Jun 2020 18:00:01 +0900 (JST), Kyotaro Horiguchi 
>  wrote in 
> > Rebased on the current HEAD. 36ac359d36 conflicts with this. Tranche
> 
> Hmm. This conflicts with 0fd2a79a63. Rebased on it.

Thanks for working on this and keeping it updated!

I've started taking a look and at least right off...

> >From 4926e50e7635548f86dcd0d36cbf56d168a5d242 Mon Sep 17 00:00:00 2001
> From: Kyotaro Horiguchi 
> Date: Mon, 16 Mar 2020 17:15:35 +0900
> Subject: [PATCH v35 1/7] Use standard crash handler in archiver.
> 
> The commit 8e19a82640 changed SIGQUIT handler of almost all processes
> not to run atexit callbacks for safety. Archiver process should behave
> the same way for the same reason. Exit status changes 1 to 2 but that
> doesn't make any behavioral change.

Shouldn't this:

a) be back-patched, as the other change was
b) also include a change to have the stats collector do the same (which I
   realize is removed later on in this patch set, but we're talking about
   fixing existing things..), both for the same reason and because there
   isn't much point in trying to write out the stats after we get a
   SIGQUIT, since we're just going to blow them away again when we go
   through crash recovery..?

Might be good to have a separate thread to address these changes.

I've looked through (some of) this thread and through the patches also
and hope to provide a review of the bits that should be targetting v14
(unlike the above) soon.

Thanks,

Stephen




Re: shared-memory based stats collector

2020-09-07 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> > The commit 8e19a82640 changed SIGQUIT handler of almost all processes
> > not to run atexit callbacks for safety. Archiver process should behave
> > the same way for the same reason. Exit status changes 1 to 2 but that
> > doesn't make any behavioral change.
> 
> Shouldn't this:
> 
> a) be back-patched, as the other change was
> b) also include a change to have the stats collector (which I realize is
>removed later on in this patch set, but we're talking about fixing
>existing things..) for the same reason, and because there isn't much
>point in trying to write out the stats after we get a SIGQUIT, since
>we're just going to blow them away again since we're going to go
>through crash recovery..?

* Michael Paquier (mich...@paquier.xyz) wrote:
> 0001 is just a piece of refactoring, so I see no strong argument in
> favor of a backpatch, IMHO.

No, 0001 changes the SIGQUIT handler for the archiver process, which is
what 8e19a82640 was about changing for a bunch of the other subprocesses
and which was back-patched all the way, so I'm a bit confused about why
you're saying it's just refactoring..?

Note that exit() and _exit() aren't the same, and the latter is what's
now being used in SignalHandlerForCrashExit.

Thanks,

Stephen




Re: More aggressive vacuuming of temporary tables

2020-09-09 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2020-08-28 11:46:49 -0400, Tom Lane wrote:
> > It strikes me that when we are vacuuming a temporary table (which
> > necessarily will be one of our own session), we don't really need
> > to care what the global xmin horizon is.  If we're not inside a
> > user transaction block, then there are no tuples in the table that
> > could be in-doubt anymore.  Neither are there any snapshots in our
> > session that could see any dead tuples.  Nor do we give a fig what
> > other sessions might think of those tuples.  So we could just set
> > the xmin cutoff as aggressively as possible, which is to say
> > equal to the nextXid counter.  While vacuuming a temp table is
> > perhaps not something people do very often, I think when they do
> > do it they would like us to clean out all the dead tuples not just
> > some.
> 
> That seems like a good idea.

Agreed.

> I've been toying with a patch that introduces more smarts about when a
> row is removable, by looking more closely at whether specific row
> versions are visible (e.g. in the common case of one old snapshot and
> lots of newer rows). But that's orders of magnitude more complicated. So
> going for something as simple as this seems like a good idea.

I've wondered about this for a long time- very cool that you've found
time to actually work on a patch.  A couple of different ideas were
discussed previously about how to do that kind of a check- mind talking
about what method you're using, or perhaps just sharing that patch? :)

The potential of such an improvement is huge.

Thanks!

Stephen




Re: Reduce the time required for a database recovery from archive.

2020-09-09 Thread Stephen Frost
Greetings,

* Dmitry Shulga (d.shu...@postgrespro.ru) wrote:
> Overall archive file processing is done one by one, and this might
> create a performance bottleneck if archived WAL files are delivered slowly,
> because the database server has to wait for arrival of the next
> WAL segment before applying its records.
> 
> To address this issue it is proposed to receive archived WAL files in parallel
> so that when the next WAL segment file is required for processing of redo log
> records it would be already available.

Yes, pgbackrest already does exactly this (if configured)- uses parallel
processes to fetch the WAL and have it be available ahead of time.

> Implementation of this approach assumes running several background processes 
> (bgworkers)
> each of which runs a shell command specified by the parameter restore_command
> to deliver an archived WAL file. Number of running parallel processes is 
> limited
> by the new parameter max_restore_command_workers. If this parameter has value 0
> then WAL files delivery is performed using the original algorithm, that is in
> one-by-one manner. If this parameter has value greater than 0 then the 
> database
> server starts several bgworker processes up to the limit specified by
> the parameter max_restore_command_workers and passes to every process
> a WAL file name to deliver. Active processes start prefetching of the specified
> WAL files and store received files in the directory pg_wal/pgsql_tmp. After
> a bgworker process finishes receiving a file it marks itself as a free process
> and waits for a new request to receive the next WAL file. The main process
> performing database recovery still handles WAL files in a one-by-one manner,
> but instead of waiting for the next required WAL file's availability it checks
> for
> that file in the prefetched directory. If a new file is present there,
> the main process starts its processing.

I'm a bit confused about this description- surely it makes sense for the
parallel workers to continue to loop and fetch up to some specified
max..?  Then to monitor and to fetch more when the amount pre-fetched so
far drops below that level?  The description above makes it sound like
X WAL will be fetched ahead of time, and then the recovery process will
go through those until it runs out and then it'll have to wait for the
next X WAL to be fetched, which means it's still going to end up being
delayed even with these parallel processes, which isn't good.

Does this also properly handle timeline switches..?

Thanks,

Stephen




Re: SIGQUIT handling, redux

2020-09-09 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> This is to pull together a couple of recent threads that are seemingly
> unrelated to $SUBJECT.
> 
> Over in the long thread at [1] we've agreed to try to make the backend
> code actually do what assorted comments claim it does, namely allow
> SIGQUIT to be accepted at any point after a given process has established
> its signal handlers.  (Right now, it mostly is not accepted during error
> recovery, but there's no clear reason to preserve that exception.)
> 
> Of course, this is only safe if the SIGQUIT handler is safe to be invoked
> anywhere, so I did a quick survey of backend signal handlers to see if
> that is true.  I immediately found pgarch_exit() which surely is not.  It
> turns out that Horiguchi-san already noticed that and proposed to fix it,
> within the seemingly entirely unrelated patch series for a shared-memory
> based stats collector (see patch 0001 in [2]).  I think we should go ahead
> and commit that, and maybe even back-patch it.  There seems no good reason
> for the archiver to treat SIGQUIT as nonfatal when other postmaster
> children do not.

As I mentioned over there, I agree that we should do this and we should
further have the statistics collector also do so; it currently sets up a
SIGQUIT handler that just sets ShutdownRequestPending, and in its loop
decides it's fine to write out the stats file (which we're going to
remove during recovery anyway...) and then call exit(0).  I also think we
should
back-patch these changes, as the commit mentioned in Horiguchi-san's
patch for pgarch_exit() was.

> Another thing I found is startup_die() in postmaster.c, which catches
> SIGQUIT during the authentication phase of a new backend.  This calls
> proc_exit(1), which (a) is unsafe on its face, and (b) potentially
> demotes a SIGQUIT into something that will not cause the parent postmaster
> to perform a DB-wide restart.  There seems no good reason for that
> behavior, either.  I propose that we use SignalHandlerForCrashExit
> for SIGQUIT here too.

Yikes, agreed.

> In passing, it's worth noting that startup_die() isn't really much safer
> for SIGTERM than it is for SIGQUIT; the only argument for distinguishing
> those is that code that applies BlockSig will at least manage to block the
> former.  So it's slightly tempting to drop startup_die() altogether in
> favor of using SignalHandlerForCrashExit for the SIGTERM-during-auth case
> too.  However, that'd have the effect of causing a "fast shutdown" to get
> converted to a panic if it happens while any sessions are in
> authentication phase, so I think this cure is likely worse than the
> disease.

Agreed, that's definitely no good.

> Worth reading in connection with this is the thread that led up to
> commits 8e19a8264 et al [3].  We had a long discussion about how
> quickdie() and startup_die() might be made safe(r), without any
> real consensus emerging about any alternatives being better than the
> status quo.  I still don't have an improvement idea for quickdie();
> I don't want to give up trying to send a message to the client.
> Note, however, that quickdie() does end with _exit(2) so that at
> least it's not trying to execute arbitrary process-cleanup code.
> 
> In short then, I want to ensure that postmaster children's SIGQUIT
> handlers always end with _exit(2) rather than some other exit method.
> We have two exceptions now and they need to get fixed.

I agree we should fix these.

Thanks,

Stephen




Re: shared-memory based stats collector

2020-09-09 Thread Stephen Frost
Greetings,

* Kyotaro Horiguchi (horikyota@gmail.com) wrote:
> > Shouldn't this:
> > 
> > a) be back-patched, as the other change was
> > b) also include a change to have the stats collector (which I realize is
> >removed later on in this patch set, but we're talking about fixing
> >existing things..) for the same reason, and because there isn't much
> >point in trying to write out the stats after we get a SIGQUIT, since
> >we're just going to blow them away again since we're going to go
> >through crash recovery..?
> > 
> > Might be good to have a separate thread to address these changes.
> 
> +1

Just FYI, Tom's started a thread which includes this over here-

https://postgr.es/m/1850884.1599601...@sss.pgh.pa.us

Thanks,

Stephen




Re: TDE (Transparent Data Encryption) supported ?

2020-09-11 Thread Stephen Frost
Greetings,

* laurent.fe...@free.fr (laurent.fe...@free.fr) wrote:
> There is a fork named PostgreSQL 12.x TDE from Cybertec. The issue is that 
> there is no key management at all.

Key management is definitely an understood issue in the community and
there was a fair bit of work put into trying to solve that problem last
year and hopefully that'll continue and perhaps be to a point where we
have a solution in v14.  With that, perhaps we can also get TDE in, but
there certainly aren't any guarantees.

What really needs to be considered, however, is what your attack vectors
are and if TDE would really address them (often it doesn't, as with any
TDE the keys end up at least in the memory of the database server and
therefore available to an attacker, not to mention that the data ends up
being sent to the application unencrypted, and no, TLS/SSL doesn't help
that since an attacker on the server would be able to get at the data
before it's wrapped in TLS).

> Using pgcrypto has an impact on the application then I have to give up this 
> way.

pgcrypto is an option but, again, the keys end up on the database server
and therefore it may not be very helpful, depending on what the attack
vector is that you're concerned about.

> There is another alternative named "Client-Side Encryption'' that I have not 
> looked at in detail yet. I'm afraid that this solution has an impact on the 
> application too. And if there are two applications pointing to the same 
> database I am wondering how the encryption key is shared between the two 
> nodes.

Performing encryption of the sensitive data as early as possible is
certainly the best answer, and that means doing it in the application
before it ever reaches the database server.  Yes, that means that
changes have to be made to the application, but it's a much better
solution which will actually address real attack vectors, like the
database server being compromised.

In general, as it relates to multiple applications connecting to the
same database- I'd just downright discourage that.  PG is much lighter
weight than other RDBMS solutions and you don't need to have 100
different applications connecting to the same database server- instead
have an independent cluster for each application and use certificates or
other strong authentication mechanism to make sure that app A can only
connect to the PG cluster for app A and not to any others- this
completely removes the concern about an SQL injection attack on app A
allowing the attacker to gain access to app B's data.

Another advantage of this is that the individual clusters are smaller,
and they can be scaled independently, backed up independently, and
restored independently, all of which are really good things.

> The last point is about the backups, whatever the solution, the data has to 
> be in an encrypted format when "backuping".

Use a backup technology that provides encryption- pgbackrest does, and
some others probably do too.

Thanks,

Stephen




Re: TDE (Transparent Data Encryption) supported ?

2020-09-14 Thread Stephen Frost
Greetings,

We'd prefer it if you didn't top-post (just write some stuff at the top)
when you respond and post to these mailing lists.

* laurent.fe...@free.fr (laurent.fe...@free.fr) wrote:
> I come back to your comments about attack vectors. I know that TDE protects
> against disk theft, not really much more.

That is a data-at-rest concern and TDE is one approach to addressing it.

> But my company has some internal rules and some of them require "at rest"
> encryption, nothing more is mentioned.
> Then, even if TDE is not THE solution in terms of security, it is something
> that companies want.

Disk-based encryption is available for basically all operating systems
and PostgreSQL works reasonably well on top of encrypted filesystems or
block devices.  That's all available today, works quite well to deal
with the "someone stole the disk" or "someone forgot to wipe the drive
before throwing it away" attack vectors.

In particular, I'd encourage you to look at Linux with LUKS for data at
rest encryption.  You can then simply run PostgreSQL on top of that and
be protected without any of the complications which TDE introduces.

Thanks,

Stephen




Re: Function to execute a program

2020-09-14 Thread Stephen Frost
Greetings,

* Magnus Hagander (mag...@hagander.net) wrote:
> Would it make sense to have a pg_execute_program() that corresponds to COPY
> FROM PROGRAM? This would obviously have the same permissions restrictions
> as COPY FROM PROGRAM.

Eh, perhaps.

> The usecase would be to for example execute a command that returns json
> format output, which could then be parsed and processed as part of a query.
> Today, COPY FROM PROGRAM cannot do this, as we can't read a multiline json
> value with COPY.

I'd rather come up with a way to import this kind of object into PG by
using COPY rather than adding a different way to pull them in.

In the "this is a crazy idea" category- have a way to specify an input
function to COPY, or a target data type (and then use its input
function) to which COPY just keeps feeding bytes until the function
comes back with "ok, I got a value to return" or something.

Would be really neat to have a way to COPY in a JSON, XML, or whatever
even if it's multi-line and even if it has multiple objects (ie: 10
complete JSON documents in a single file would become 10 rows of jsonb).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Function to execute a program

2020-09-14 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Magnus Hagander (mag...@hagander.net) wrote:
> >> Would it make sense to have a pg_execute_program() that corresponds to COPY
> >> FROM PROGRAM? This would obviously have the same permissions restrictions
> >> as COPY FROM PROGRAM.
> 
> > I'd rather come up with a way to import this kind of object into PG by
> > using COPY rather than adding a different way to pull them in.
> 
> I'm not for overloading COPY to try to make it handle every data import
> use-case.  The issue here AIUI is that Magnus wants the program output
> to be read as an uninterpreted blob (which he'll then try to convert to
> jsonb or whatever, but that's not the concern of the import code).  This
> is exactly antithetical to COPY's mission of reading some rows that are
> made up of some columns and putting the result into a table.

I don't really think there's anything inherent in the fact that "COPY"
today only has one way to handle data that the user wants to import that
it should be required to always operate in that manner.

As for slowing down the current method- I don't think that we'd
implement such a change as just a modification to the existing optimized
parsing code as that just wouldn't make any sense and would slow COPY
down for this use-case, but having a COPY command that's able to work in
a few different modes when it comes to importing data seems like it
could be sensible, fast, and clear to users.

One could imagine creating some other top-level command to handle more
complex import cases than what COPY does today but I don't actually
think that'd be an improvement.

> Yeah, we could no doubt add some functionality to disable all the
> row-splitting and column-splitting and associated escaping logic,
> but that's going to make COPY slower and more complicated.  And it
> still doesn't address wanting to use the result directly in a query
> instead of sticking it into a table.

The way that's handled for the cases that COPY does work with today is
file_fdw.  Ideally, we'd do the same here.

Ultimately, COPY absolutely *is* our general data import tool- it's just
that today we push some of the work to make things 'fit' on the user and
that ends up with pain points like exactly what Magnus has pointed out
here.  We should be looking to improve that situation, and I don't
really care for the solution to that being "create some random other new
thing for data import that users then have to know exists and learn how
to use".

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: new heapcheck contrib module

2020-09-23 Thread Stephen Frost
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Tue, Sep 22, 2020 at 12:41 PM Robert Haas  wrote:
> > But now I see that there's no secondary permission check in the
> > verify_nbtree.c code. Is that intentional? Peter, what's the
> > justification for that?
> 
> As noted by comments in contrib/amcheck/sql/check_btree.sql (the
> verify_nbtree.c tests), this is intentional. Note that we explicitly
> test that a non-superuser role can perform verification following
> GRANT EXECUTE ON FUNCTION ... .

> As I mentioned earlier, this is supported (or at least it is supported
> in my interpretation of things). It just isn't documented anywhere
> outside the test itself.

Would certainly be good to document this but I tend to agree with the
comments that ideally-

a) it'd be nice if a relatively low-privileged user/process could run
   the checks on an ongoing basis
b) we don't want to add more is-superuser checks
c) users shouldn't really be given the ability to see rows they're not
   supposed to have access to

In other places in the code, when an error is generated and the user
doesn't have access to the underlying table or doesn't have BYPASSRLS,
we don't include the details or the actual data in the error.  Perhaps
that approach would make sense here (or perhaps not, but it doesn't seem
entirely crazy to me, anyway).  In other words:

a) keep the ability for someone who has EXECUTE on the function to be
   able to run the function against any relation
b) when we detect an issue, perform a permissions check to see if the
   user calling the function has rights to read the rows of the table
   and, if RLS is enabled on the table, if they have BYPASSRLS
c) if the user has appropriate privileges, log the detailed error, if
   not, return a generic error with a HINT that details weren't
   available due to lack of privileges on the relation
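The decision in (b) and (c) can be sketched in a few lines; this is
Python rather than C for brevity, and the dict-based privilege lookups
are stand-ins for the real catalog checks (that wiring is an assumption,
not how amcheck is actually implemented):

```python
def corruption_report(user, table, detail):
    """Decide how much detail a verification error may carry,
    based on the caller's rights on the underlying table."""
    can_read = table["select_privilege_for"].get(user, False)
    rls_ok = (not table["rls_enabled"]) or user in table["bypassrls_users"]
    if can_read and rls_ok:
        # Caller could read the rows anyway: full detail is safe.
        return {"error": detail}
    # Otherwise: generic error plus a HINT, no row data leaked.
    return {"error": "corruption detected",
            "hint": "details withheld due to lack of privileges on the relation"}

t = {"select_privilege_for": {"alice": True, "bob": True},
     "rls_enabled": True, "bypassrls_users": {"alice"}}
print(corruption_report("alice", t, "tuple (0,1): xmin precedes relfrozenxid"))
print(corruption_report("bob", t, "tuple (0,1): xmin precedes relfrozenxid"))
```

Here bob has SELECT on the table but not BYPASSRLS while RLS is enabled,
so he gets the generic message; alice gets the full detail.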

I can appreciate the concerns regarding dead rows ending up being
visible to someone who wouldn't normally be able to see them but I'd
argue we could simply document that fact rather than try to build
something to address it, for this particular case.  If there's push back
on that then I'd suggest we have a "can read dead rows" or some such
capability that can be GRANT'd (in the form of a default role, I would
think) which a user would also have to have in order to get detailed
error reports from this function.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-10 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 7:46 AM Robert Haas  wrote:
> > On Wed, Mar 9, 2022 at 4:31 PM Tom Lane  wrote:
> > > I don't think we need syntax to describe it.  As I just said in my
> > > other reply, we have a perfectly good precedent for this already
> > > in ordinary object permissions.  That is: an object owner always,
> > > implicitly, has GRANT OPTION for all the object's privileges, even
> > > if she revoked the corresponding plain privilege from herself.
> > >
> > > Yeah, this does mean that we're effectively deciding that the creator
> > > of a role is its owner.  What's the problem with that?
> >
> > I don't think that's entirely the wrong concept, but it doesn't make a
> > lot of sense in a world where the creator has to be a superuser. If
> > alice, bob, and charlie are superusers who take turns creating new
> > users, and then we let charlie go due to budget cuts, forcing alice
> > and bob to change the owner of all the users he created to some other
> > superuser as a condition of dropping his account is a waste of
> > everyone's time. They can do exactly the same things to every account
> > on the system after we change the role owner as before.
> 
> Then maybe we should just implement the idea that if a superuser would
> become the owner we instead substitute in the bootstrap user.  Or give the
> DBA the choice whether they want to retain knowledge of specific roles -
> and thus are willing to accept the "waste of time".

This doesn't strike me as going in the right direction.  Falling back to
the bootstrap superuser is generally a hack and not a great one.  I'll
also point out that the SQL spec hasn't got a concept of role ownership
either.

> > But wait, I hear you cry, what about CREATEROLE? Well, CREATEROLE is
> > generally agreed to be broken right now, and if you don't agree with
> > that, consider that it can grant pg_execute_server_programs to a
> > newly-created account and then explain to me how it's functionally
> > different from superuser.
> 
> CREATEROLE has long been defined as basically having "with admin option" on
> every role in the system.  The failure to special-case the roles that grant
> different aspects of superuser-ness to its members doesn't make CREATEROLE
> itself broken, it makes the implementation of pg_execute_server_programs
> broken.  Only superusers should be considered to have with admin option on
> these roles. They can delegate through the usual membership+admin mechanism
> to a CREATEROLE role if they desire.

No, CREATEROLE having admin option on every role in the system is broken
and always has been.  It's not just an issue for predefined roles like
pg_execute_server_program, it's an issue for any role that could become
a superuser either directly or indirectly and that extends beyond the
predefined ones.  As this issue with CREATEROLE existed way before
predefined roles were added to PG, claiming that it's an issue with
predefined roles doesn't make a bit of sense.

> > The whole area needs a rethink. I believe
> > everyone involved in the discussion on the other threads agrees that
> > some reform of CREATEROLE is necessary, and more generally with the
> > idea that it's useful for non-superusers to be able to create roles.
> 
> As the documentation says, using SUPERUSER for day-to-day administration is
> contrary to good security practices.  Role management is considered to be a
> day-to-day administration activity.  I agree with this principle.  It was
> designed to neither be a superuser nor grant superuser, so removing the
> ability to grant the pg_* role memberships remains consistent with its
> original intent.

That would not be sufficient to make CREATEROLE safe.  Far, far from it.

> > I want that because I want mini-superusers, where alice can administer
> > the users that alice creates just as if she were a superuser,
> > including having their permissions implicitly and dropping them when
> > she wants them gone, but where alice cannot break out to the operating
> > system as a true superuser could do.
> 
> CREATEROLE (once the pg_* with admin rules are fixed) + Ownership and rules
> restricting interfering with another role's objects (unless superuser)
> seems to handle this.

This is not sufficient- roles can be not-superuser themselves but have
the ability to become superuser if GRANT'd a superuser role and
therefore we can't have a system where CREATEROLE allows arbitrary
GRANT'ing of roles to each other.  I'm a bit confused too as anything
where we are curtailing what CREATEROLE roles are able to do in a manner
that means they're only able to modify some subset of roles should
equally apply to predefined roles too- that is, CREATEROLE shouldn't be
the determining factor in the question of if a role can GRANT a
predefined (or any other role) to some other role- that should be
governed by the admin option on that role, and that should work exactly
the same for predefined roles as for any other role.

Re: role self-revocation

2022-03-10 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 9:19 AM Stephen Frost  wrote:
> > * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > > On Thu, Mar 10, 2022 at 7:46 AM Robert Haas 
> > wrote:
> > > > On Wed, Mar 9, 2022 at 4:31 PM Tom Lane  wrote:
> > > > > I don't think we need syntax to describe it.  As I just said in my
> > > > > other reply, we have a perfectly good precedent for this already
> > > > > in ordinary object permissions.  That is: an object owner always,
> > > > > implicitly, has GRANT OPTION for all the object's privileges, even
> > > > > if she revoked the corresponding plain privilege from herself.
> > > > >
> > > > > Yeah, this does mean that we're effectively deciding that the creator
> > > > > of a role is its owner.  What's the problem with that?
> > > >
> > > > I don't think that's entirely the wrong concept, but it doesn't make a
> > > > lot of sense in a world where the creator has to be a superuser. If
> > > > alice, bob, and charlie are superusers who take turns creating new
> > > > users, and then we let charlie go due to budget cuts, forcing alice
> > > > and bob to change the owner of all the users he created to some other
> > > > superuser as a condition of dropping his account is a waste of
> > > > everyone's time. They can do exactly the same things to every account
> > > > on the system after we change the role owner as before.
> > >
> > > Then maybe we should just implement the idea that if a superuser would
> > > become the owner we instead substitute in the bootstrap user.  Or give
> > the
> > > DBA the choice whether they want to retain knowledge of specific roles -
> > > and thus are willing to accept the "waste of time".
> >
> > This doesn't strike me as going in the right direction.  Falling back to
> > the bootstrap superuser is generally a hack and not a great one.  I'll
> > also point out that the SQL spec hasn't got a concept of role ownership
> > either.
> >
> > > > But wait, I hear you cry, what about CREATEROLE? Well, CREATEROLE is
> > > > generally agreed to be broken right now, and if you don't agree with
> > > > that, consider that it can grant pg_execute_server_programs to a
> > > > newly-created account and then explain to me how it's functionally
> > > > different from superuser.
> > >
> > > CREATEROLE has long been defined as basically having "with admin option"
> > on
> > > every role in the system.  The failure to special-case the roles that
> > grant
> > > different aspects of superuser-ness to its members doesn't make
> > CREATEROLE
> > > itself broken, it makes the implementation of pg_execute_server_programs
> > > broken.  Only superusers should be considered to have with admin option
> > on
> > > these roles. They can delegate through the usual membership+admin
> > mechanism
> > > to a CREATEROLE role if they desire.
> >
> > No, CREATEROLE having admin option on every role in the system is broken
> > and always has been.  It's not just an issue for predefined roles like
> > pg_execute_server_program,
> 
> 
> 
> > it's an issue for any role that could become
> > a superuser either directly or indirectly and that extends beyond the
> > predefined ones.
> 
> 
> The only indirect way for a role to become superuser is to have been
> granted membership in a superuser group, then SET ROLE.  Non-superusers
> cannot do this.  If a superuser does this I consider the outcome to be no
> different than if they go and do:

A non-superuser absolutely can be GRANT'd membership in a superuser role
and then SET ROLE to that user thus becoming a superuser.  Giving users
a regular role to log in as and then membership in a role that can
become a superuser is akin to having a sudoers group in Unix and is good
practice, not something that everyone should have to be super-dooper
careful to not do, lest a CREATEROLE user be able to leverage that.

> SET allow_system_table_mods TO true;
> DROP pg_catalog.pg_class;

I don't equate these in the least.

> In short, having a CREATEROLE user issuing:
> GRANT pg_read_all_stats TO davidj;
> should result in the same outcome as them issuing:
> GRANT postgres TO davidj;
> -- ERROR:  must be superuser to alter superusers

No, what should matter is if the role doing the GRANT has admin rights
on pg_read_all_stats, or on the postgres role.  That also happens to be
what the spec says.

Re: role self-revocation

2022-03-10 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 11:05 AM Stephen Frost  wrote:
> > * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > > On Thu, Mar 10, 2022 at 9:19 AM Stephen Frost  wrote:
> > > > * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > > > > On Thu, Mar 10, 2022 at 7:46 AM Robert Haas  
> > > > > wrote:
> >
> > > The only indirect way for a role to become superuser is to have been
> > > granted membership in a superuser group, then SET ROLE.  Non-superusers
> > > cannot do this.  If a superuser does this I consider the outcome to be no
> > > different than if they go and do:
> >
> > A non-superuser absolutely can be GRANT'd membership in a superuser role
> > and then SET ROLE to that user thus becoming a superuser.
> 
> A non-superuser cannot grant a non-superuser membership in a superuser
> role.  A superuser granting a user membership in a superuser role makes
> that user a superuser.  This seems sane.
> 
> If a superuser grants a non-superuser membership in a superuser role then
> today a non-superuser can grant a user membership in that intermediate
> role, thus having a non-superuser make another user a superuser.  This is
> arguably a bug that needs to be fixed.
> 
> My desired fix is to just require the superuser to mark (or have it marked
> by default ideally) the role inheriting superuser and put the
> responsibility on the superuser.  I agree this is not ideal, but it is
> probably quick and low risk.
> 
> I'll let someone else describe the details of the alternative option.  I
> suspect it will end up being a better option in terms of design.  But
> depending on time and risk even knowing that we want the better design
> eventually doesn't preclude getting the easier fix in now.
> 
> > No, what should matter is if the role doing the GRANT has admin rights
> > on pg_read_all_stats, or on the postgres role.  That also happens to be
> > what the spec says.
> 
> Yes, and superusers implicitly have that right, while CREATEROLE users
> implicitly have that right on the pg_* role but not on superuser roles.  I
> just want to plug that hole and include the pg_* roles (or any role for
> that matter) in being able to be denied implied ADMIN rights for
> non-superusers.

CREATEROLE users implicitly have that right on *all non-superuser
roles*.  Not just the pg_* ones, which is why the pg_* ones aren't any
different in this regard.

> > Today a non-superuser cannot "grant postgres to someuser;"
> >
> > No, but a role can be created like 'admin', which a superuser GRANT's
> > 'postgres' to and then that role can be GRANT'd to anyone by anyone who
> > has CREATEROLE rights.  That's not sane.
> 
> I agree.  And I've suggested a minimal fix, adding an attribute to the role
> that prohibits non-superusers from granting it to others, that removes the
> insane behavior.

I disagree that this is a minimal fix as I don't see it as a fix to the
actual issue, which is the ability for CREATEROLE users to GRANT role
membership to all non-superuser roles on the system.  CREATEROLE
shouldn't be allowing that.

> I'm on board for a hard-coded fix as well - if a superuser is in the
> membership chain of a role then non-superusers cannot grant membership in
> that role to others.

Why not just look at the admin_option field of pg_auth_members...?  I
don't get why that isn't an even more minimal fix than this idea you
have of adding a column to pg_authid and then propagating around "this
user could become a superuser" or writing code that has to go check "is
there some way for this role to become a superuser, either directly or
through some subset of pg_* roles?"

> Neither of those really solves the pg_* roles problem.  We still need to
> indicate that they are somehow special.  Whether it is a nice matrix or
> roles and permissions or a simple attribute that makes them behave like
> they are superuser roles.

I disagree that they should be considered special when it comes to role
membership and management.  They're just roles, like any other.

> > I disagree entirely with the idea that we must have some roles who can
> > only ever be administered by a superuser.
> 
> I don't think this is a must have.  I think that since we do have it today
> that fixes that leverage the status quo in order to be done more easily are
> perfectly valid solutions.

We have a half-way-implemented attempt at this, not something that's
actually effective, and therefore I don't agree that we really have it
today or that we should keep it.

Re: role self-revocation

2022-03-10 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> But that's not to say that we couldn't decide to do something else
> instead, and that other thing might well be better. Do you want to
> sketch out a full proposal, even just what the syntax would look like,
> and share that here? And if you could explain how I could use it to
> create the mini-superusers that I'm trying to get out of this thing,
> even better.

It'd be useful to have a better definition of exactly what a
'mini-superuser' is, but at least for the moment when it comes to roles,
let's look at what the spec says:

CREATE ROLE
  - Who is allowed to run CREATE ROLE is implementation-defined
  - After creation, this is effectively run:
GRANT new_role TO creator_role WITH ADMIN, GRANTOR "_SYSTEM"

DROP ROLE
  - Any user who has been GRANT'd a role with ADMIN option is able to
DROP that role.

GRANT ROLE
  - No cycles allowed
  - A role must have ADMIN rights on the role to be able to GRANT it to
another role.

ALTER ROLE
  - Doesn't exist
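The GRANT ROLE rules above (ADMIN required on the role being granted, no
membership cycles) are simple to model.  A small Python sketch, with a
toy set of (role, member, admin_option) tuples standing in for
pg_auth_members (the names and shape here are invented for illustration,
not PostgreSQL's actual data structures):

```python
def is_member(memberships, parent, target):
    """True if `target` is a direct or indirect member of `parent`."""
    stack, seen = [parent], set()
    while stack:
        p = stack.pop()
        for (par, mem, _admin) in memberships:
            if par == p:
                if mem == target:
                    return True
                if mem not in seen:
                    seen.add(mem)
                    stack.append(mem)
    return False

def can_grant(memberships, grantor, role, member):
    """Apply the two spec rules: the grantor needs ADMIN on `role`,
    and the grant must not create a membership cycle."""
    has_admin = any(par == role and mem == grantor and admin
                    for (par, mem, admin) in memberships)
    if not has_admin:
        return (False, "grantor lacks ADMIN on the role")
    if role == member or is_member(memberships, member, role):
        return (False, "would create a membership cycle")
    return (True, "ok")

# alice holds 'admin' WITH ADMIN OPTION; 'admin' is a member of 'staff'.
m = {("admin", "alice", True), ("staff", "admin", False)}
print(can_grant(m, "alice", "admin", "bob"))    # allowed
print(can_grant(m, "alice", "admin", "staff"))  # cycle: staff already contains admin
print(can_grant(m, "bob", "admin", "carol"))    # bob lacks ADMIN
```

The point of the sketch is that neither check needs CREATEROLE at all:
everything hangs off the ADMIN option recorded with the membership.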

This actually looks to me like more-or-less what you're looking for, it
just isn't what we have today because CREATEROLE brings along with it a
bunch of other stuff, some of which we want and some that we don't, and
some things that the SQL spec says ADMIN should be allowed to do (DROP
ROLE) we don't allow today.

It's also not quite what I want because it requires that membership and
ADMIN go together where I'd like to be able to have those be
independently GRANT'able- and then some.

I don't think we're that far from having all of these though.  To start
with, we remove from CREATEROLE the random things that it does which go
beyond what folks tend to expect- remove the whole 'grant any role to
any other' stuff, remove the 'drop role' exception, remove the
'alter role' stuff.  Do make it so that when you create a role, however,
the above GRANT is effectively done.  Now, for the items above where we
removed the checks against have_createrole_privilege() we go back and
add in checks using is_admin_of_role().  Of course, also remove the role
self-administration bug.

That's step #1, but it gets us more-or-less what you're looking for, I
think, and brings us a lot closer to what the spec has.

Step #2 is also in-line with the spec: track GRANTORs and care about
them, for everything.  We really should have been doing this all along.
Note that I'm not saying that an owner of a table can't REVOKE some
right that was GRANT'd on that table, but rather that a user who was
GRANT'd ADMIN rights on a table and then GRANT'd that right to some
other user shouldn't have some other user who only has ADMIN rights on
the table be able to remove that GRANT.  Same goes for roles, meaning
that you could GRANT rights in a role with ADMIN option and not have to
be afraid that the role you just gave that to will be able to remove
*your* ADMIN rights on that role.  In general, I don't think this
would actually have a very large impact on users because most users
don't, today, use the ADMIN option much.
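Tracking GRANTORs and caring about them amounts to keying each grant by
who made it and consulting that at REVOKE time.  A hedged Python sketch
of just that bookkeeping (class and field names invented for
illustration; this is the scheme described above, not existing
PostgreSQL behavior):

```python
class Grants:
    """Each grant records its grantor; only that grantor (or the
    object's owner) may revoke it."""
    def __init__(self, owner):
        self.owner = owner
        self.grants = set()          # (privilege, grantee, grantor)

    def grant(self, privilege, grantee, grantor):
        self.grants.add((privilege, grantee, grantor))

    def revoke(self, privilege, grantee, revoker):
        entry = next((g for g in self.grants
                      if g[0] == privilege and g[1] == grantee), None)
        if entry is None:
            return "no such grant"
        if revoker != entry[2] and revoker != self.owner:
            # An ADMIN peer cannot remove a grant someone else made.
            return "not the grantor"
        self.grants.discard(entry)
        return "revoked"

g = Grants(owner="table_owner")
g.grant("ADMIN", "bob", grantor="alice")
print(g.revoke("ADMIN", "bob", revoker="carol"))        # not the grantor
print(g.revoke("ADMIN", "bob", revoker="table_owner"))  # owner may revoke
```

So carol, even if she also holds ADMIN, can't strip the grant alice
made, while the owner retains the override described above.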

Step #3 starts going in the direction of what I'd like to see, which
would be to break out membership in a role as a separate thing from
admin rights on that role.  This is also what would help with the 'bot'
use-case that Joshua (not David Steele, btw) brought up.

Step #4 then breaks the 'admin' option on roles into pieces- a 'drop
role' right, a 'reset password' right, maybe separate rights for
different role attributes, etc.  We would likely still keep the
'admin_option' column in pg_auth_members and just check that first
and then check the individual rights (similar to table-level vs.
column-level privileges) so that we stay in line with the spec's
expectation here and with what users are used to.

In some hypothetical world, there's even a later step #5 which allows
us to define user profiles and then grant the ability for a user to
create a role with a certain profile (but not any arbitrary profile),
thus making things like the 'bot' even more constrained in terms of
what it's able to do (maybe it can then create a role that's a member of
a role without itself being a member of that role or explicitly having
admin rights in that role, as an example).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-10 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 2:58 PM Stephen Frost  wrote:
> > It'd be useful to have a better definition of exactly what a
> > 'mini-superuser' is, but at least for the moment when it comes to roles,
> > let's look at what the spec says:
> 
> Gosh, I feel like I've spelled that out approximately 463,121 times
> already. That estimate might be slightly off though; I've been known
> to make mistakes from time to time

If there's a specific message that details it closely on the lists
somewhere, I'm happy to go review it.  I admit that I didn't go back and
look for such.

> > CREATE ROLE
> >   - Who is allowed to run CREATE ROLE is implementation-defined
> >   - After creation, this is effectively run:
> > GRANT new_role TO creator_role WITH ADMIN, GRANTOR "_SYSTEM"
> >
> > DROP ROLE
> >   - Any user who has been GRANT'd a role with ADMIN option is able to
> > DROP that role.
> >
> > GRANT ROLE
> >   - No cycles allowed
> >   - A role must have ADMIN rights on the role to be able to GRANT it to
> > another role.
> >
> > ALTER ROLE
> >   - Doesn't exist
> >
> > This actually looks to me like more-or-less what you're looking for, it
> > just isn't what we have today because CREATEROLE brings along with it a
> > bunch of other stuff, some of which we want and some that we don't, and
> > some things that the SQL spec says ADMIN should be allowed to do (DROP
> > ROLE) we don't allow today.
> 
> The above is mostly fine with me, except for the part about ALTER ROLE
> not existing. I think it's always good to be able to change your mind
> post-CREATE.

Errr, just to be clear, ALTER ROLE doesn't exist *in the spec*.  I
wasn't suggesting that we get rid of it, just that it doesn't exist in
the spec and therefore the spec doesn't have anything to say about it.

> Basically, in this sketch, ADMIN OPTION on a role involves the ability
> to DROP it, which means we don't need a separate role owner concept.

Right.  The above doesn't include any specifics about what to do with
ALTER ROLE, but my thought would be to have it also be under ADMIN
OPTION rather than under CREATEROLE, as I tried to outline (though not
very well, I'll admit) below.

> It also involves membership, meaning that you can freely exercise the
> privileges of the role without SET ROLE. While I'm totally down with
> having other possible behaviors as options, that particular behavior
> seems very useful to me, so, sounds great.

Well, yes and no- by default you're right, presuming everything is set
as inherited, but I'd wish for us to keep the option of creating roles
which are noinherit and having that work just as it does today.

> > It's also not quite what I want because it requires that membership and
> > ADMIN go together where I'd like to be able to have those be
> > independently GRANT'able- and then some.
> >
> > I don't think we're that far from having all of these though.  To start
> > with, we remove from CREATEROLE the random things that it does which go
> > beyond what folks tend to expect- remove the whole 'grant any role to
> > any other' stuff, remove the 'drop role' exception, remove the
> > 'alter role' stuff.  Do make it so that when you create a role, however,
> > the above GRANT is effectively done.  Now, for the items above where we
> > removed the checks against have_createrole_privilege() we go back and
> > add in checks using is_admin_of_role().  Of course, also remove the role
> > self-administration bug.
> 
> What do you mean by the 'drop role' exception?

'ability' was probably a better word there.  What I'm talking about is
changing in DropRole:

if (!have_createrole_privilege())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 errmsg("permission denied to drop role")));

to be, more or less:

if (!is_admin_of_role(GetUserId(), role))
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 errmsg("permission denied to drop role")));

> I don't like removing 'alter role'.

Ditto above but for AlterRole.  Taking it away from users with
CREATEROLE being able to run those commands on $anyrole and instead
making it so that the role running DROP ROLE or ALTER ROLE needs to have
ADMIN on the role they're messing with.  I do think we may also need to
make some adjustments in terms of what a regular user WITH ADMIN on a
given role is able to do.

Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-10 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 3:22 AM Jeff Davis  wrote:
> > * Can we mark this extension 'trusted'? I'm not 100% clear on the
> > standards for that marker, but it seems reasonable for a database owner
> > with the right privileges might want to install it.
> 
> I'm not clear on the standard either, exactly, but might not that
> allow the database owner to get a peek at what's happening in other
> databases?

The standard is basically that all of the functions it brings are
written to enforce the PG privilege system and you aren't able to use
the extension to bypass those privileges.  In some cases that means that
the C-language functions installed have if(!superuser) ereport() calls
throughout them- that's a fine answer, but it's perhaps not very helpful
to mark those as trusted.  In other cases, the C-language functions
installed don't directly provide access to data, such as the PostGIS
functions.

I've not looked back on this thread, but I'd expect pg_walinspect to
need those superuser checks and with those it *could* be marked as
trusted, but that again brings into question how useful it is to mark it
thusly.

In an ideal world, we might have a pg_readwal predefined role which
allows a role which was GRANT'd that role to be able to read WAL
traffic, and then the pg_walinspect extension could check that the
calling role has that predefined role, and other functions and
extensions could also check that rather than any existing superuser
checks.  A cloud provider or such could then include in their setup of a
new instance something like:

GRANT pg_readwal TO admin_user WITH ADMIN OPTION;

Presuming that there isn't anything that ends up in the WAL that's an
issue for the admin_user to have access to.

I certainly don't think we should allow either database owners or
regular users on a system the ability to access the WAL traffic of the
entire system.  More forcefully- we should *not* be throwing more access
rights towards $owners in general and should be thinking about how we
can allow admins, providers, whomever, the ability to control what
rights users are given.  If they're all lumped under 'owner' then
there's no way for people to provide granular access to just those
things they wish and intend to.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 5:14 PM Tom Lane  wrote:
> > This seems reasonable in isolation, but
> >
> > (1) it implies a persistent relationship between creating and created
> > roles.  Whether you want to call that ownership or not, it sure walks
> > and quacks like ownership.

I agree that there would be a recorded relationship (that is, one that
we write into the catalog and keep around until and unless it's removed
by an admin) between creating and created roles and that's probably the
default when CREATE ROLE is run but, unlike tables or such objects in
the system, I don't agree that we should require this to exist at
absolutely all times for every role (what would it be for the bootstrap
superuser..?).  At least today, that's distinct from how ownership in
the system works.  I also don't believe that this is necessarily an
issue for Robert's use-case, as long as there are appropriate
restrictions around who is allowed to remove or modify these
relationships.

> I agree. It's been obvious to me from the beginning that we needed
> such a persistent relationship, and also that it needed to be a
> relationship from which the created role couldn't simply walk away.
> Yet, more than six months after the first discussions of this topic,
> we still don't have any kind of agreement on what that thing should be
> called. I like my TENANT idea best, but I'm perfectly willing to call
> it ownership as you seem to prefer or WITH ADMIN OPTION as Stephen
> seems to prefer if one of those ideas gains consensus. But we've
> managed to waste all hope of making any significant progress here for
> an entire release cycle for lack of ability to agree on spelling. I
> think that's unfair to Mark, who put a lot of work into this area and
> got nothing out of it, and I think it sucks for users of PostgreSQL,
> too.

Well ... one of those actually already exists and also happens to be in
the SQL spec.  I don't necessarily agree that we should absolutely
require that the system always enforce that this relationship exist (I'd
like a superuser to be able to get rid of it and to be able to change it
too if they want) and that seems a bit saner than having the bootstrap
superuser be special in some way here as would seem to otherwise be
required.  I also feel that it would be generally useful to have more
than one of these relationships, if the user wishes, and that's
something that ownership doesn't (directly) support today.  Further,
that's supported and expected by the SQL spec too.  Even if we invented
some concept of ownership of roles, it seems like we should make most of
the other changes discussed here to bring us closer to what the spec
says about CREATE ROLE, DROP ROLE, GRANT, REVOKE, etc.  At that point
though, what's the point of having ownership?

> > (2) it seems exactly contradictory to your later point that
> >
> > > Agree. I also think that it would be a good idea to attribute grants
> > > performed by any superuser to the bootstrap superuser, or leave them
> > > unattributed somehow. Because otherwise dropping superusers becomes a
> > > pain in the tail for no good reason.
> >
> > Either there's a persistent relationship or there's not.  I don't
> > think it's sensible to treat superusers differently here.
> >
> > I think that this argument about the difficulty of dropping superusers
> > may in fact be the motivation for the existing behavior that object-
> > permissions GRANTs done by superusers are attributed to the object
> > owner; something you were unhappy about upthread.
> >
> > In the end these requirements seem mutually contradictory.  Either
> > we can have a persistent ownership relationship or not, but I don't
> > think we can have it apply in some cases and not others without
> > creating worse problems than we solve.  I'm inclined to toss overboard
> > the requirement that superusers need to be an easy thing to drop.
> > Why is that important, anyway?
> 
> Well, I think you're looking at it the wrong way. Compared to getting
> useful functionality, the relative ease of dropping users is
> completely unimportant. I'm happy to surrender it in exchange for
> something else. I just don't see why we should give it up for nothing.
> If Alice creates non-superusers Bob and Charlie, and Charlie creates
> Doug, we need the persistent relationship to know that Charlie is
> allowed to drop Doug and Bob is not. But if Charlie is a superuser
> anyway, then the persistent relationship is of no use. I don't see the
> point of cluttering up the system with such dependencies. Will I do it
> that way, if that's what it takes to get the patch accepted? Sure. But
> I can't imagine any end-user actually liking it.

We need to know that Charlie is allowed to drop Doug and Bob isn't but
that doesn't make it absolutely required that this be tracked
permanently or that Alice can't decide later to make it such that Doug
can't be dropped by Charlie for whatever reason 

Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 3:01 PM Robert Haas  wrote:
> 
> > On Thu, Mar 10, 2022 at 4:00 PM David G. Johnston
> >  wrote:
> > > I dislike changing the documented behavior of CREATEROLE to the degree
> > suggested here.  However, there are three choices here, only one of which
> > can be chosen:
> > >
> > > 1. Leave CREATEROLE alone entirely
> > > 2. Make it so CREATEROLE cannot assign membership to the predefined
> > roles or superuser (inheritance included), but leave the rest alone.  This
> > would be the hard-coded version, not the role attribute one.
> > > 3. Make it so CREATEROLE can only assign membership to roles for which
> > it has been made an admin; as well as the other things mentioned
> > >
> > > Moving forward I'd prefer options 1 or 2, leaving the ability to
> > create/alter/drop a role to be vested via predefined roles.
> >
> > It sounds like you prefer a behavior where CREATEROLE gives power over
> > all non-superusers, but that seems pretty limiting to me.
> 
> Doh!  I edited out the part where I made clear I considered options 1 and 2
> as basically being done for a limited period of time while deprecating the
> CREATEROLE attribute altogether in favor of the fine-grained and predefined
> role based permission granting.  I don't want to nerf CREATEROLE as part of
> adding this new feature, instead leave it as close to status quo as
> reasonable so as not to mess up existing setups that make use of it.  We
> can note in the release notes and documentation that we consider CREATEROLE
> to be deprecated and that the new predefined role should be used to give a
> user the ability to create/alter/drop roles, etc...  DBAs should consider
> revoking CREATEROLE from their users and granting them proper memberships
> in the predefined roles and the groups those roles should be administering.

I disagree entirely with the idea that we should push this out for however
many years it'd take to get through some deprecation period.  We are
absolutely terrible when it comes to that and what we're talking about
here, at this point anyway, is making changes that get us closer to what
the spec says.  I agree that we can't back-patch these changes, but I
don't think we need a deprecation period.  If we were just getting rid
of CREATEROLE, I don't think we'd have a deprecation period.  If we need
to get rid of CREATEROLE and introduce something new that more-or-less
means the same thing, to make it so that people's scripts break in a
more obvious way, maybe we can consider that, but I don't really think
that's actually the case here.  Such scripts as will break will still
break in a pretty clear way with a clear answer as to how to fix them
and I don't think there's some kind of data corruption or something that
would happen.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Robert Haas  writes:
> > If we implement the link between the creating role and the created
> > role as role ownership, then we are surely just going to add a
> > rolowner column to pg_authid, and when the role is owned by nobody, I
> > think we should always just store a valid OID in it, rather than
> > sometimes storing 0. It just seems simpler. Any time we would store 0,
> > store the bootstrap superuser's pg_authid.oid value instead. That way
> > the OID is always valid, which probably lets us get by with fewer
> > special cases in the code.

We haven't got any particularly special cases in the code today for what
happens if we run up the role hierarchy to a point that it ends and so
I'm not sure why adding in a whole new concept around role ownership,
which doesn't exist in the spec, would somehow leave us with fewer such
special cases.

> +1.
> 
> Note that either case would also involve making entries in pg_shdepend;
> although for the case of roles owned by/granted to the bootstrap
> superuser, we could omit those on the usual grounds that we don't need
> to record dependencies on pinned objects.

That we aren't discussing the issues with the current GRANT ... WITH
ADMIN OPTION and how we deviate from what the spec calls for when it
comes to DROP ROLE, which seems to be the largest thing that's
'solved' with this ownership concept, is concerning to me.

If we go down the route of adding role ownership, are we going to
document that we explicitly go against the SQL standard when it comes to
how DROP ROLE works?  Or are we going to fix DROP ROLE?  I'd much prefer
the latter, but doing so then largely negates the point of this role
ownership concept.  I don't see how it makes sense to do both.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Mar 11, 2022 at 11:12 AM Mark Dilger
>  wrote:
> > This issue of how much backwards compatibility breakage we're willing to 
> > tolerate is just as important as questions about how we would want roles to 
> > work in a green-field development project.  The sense I got a year ago, on 
> > this list, was that changing CREATEROLE was acceptable, but changing other 
> > parts of the system, such as how ADMIN OPTION works, would go too far.

That we deviate as far as we do when it comes to the SQL spec is
something that I don't feel like I had a good handle on when discussing
this previously (that the spec doesn't talk about 'admin option' really
but rather 'grantable authorization identifiers' or whatever it is
doesn't help... but still, that's on me, sorry about that).

> > Role ownership did not yet exist, and that was a big motivation in 
> > introducing that concept, because you couldn't credibly say it broke other 
> > existing features.  It introduces the new notion that when a superuser 
> > creates a role, the superuser owns it, which is identical to how things 
> > implicitly work today; and when a CREATEROLE non-superuser creates a role, 
> > that role owns the new role, which is different from how it works today, 
> > arguably breaking CREATEROLE's prior behavior.  *But it doesn't break 
> > anything else*.
> >
> > If we're going to change how ADMIN OPTION works, or how role membership 
> > works, or how inherit/noinherit works, let's first be clear that we are 
> > willing to accept whatever backwards incompatibility that entails.  This is 
> > not a green-field development project.  The constant spinning around with 
> > regard to how much compatibility we need to preserve is giving me vertigo.

I agree that it would have an impact on backwards compatibility to
change how WITH ADMIN works- but it would also get us more in line with
what the SQL standard says for how WITH ADMIN is supposed to work and
that seems worth the change to me.

> On the other hand, changing ADMIN OPTION does have compatibility and
> spec-compliance ramifications. I think Stephen is arguing that we can
> solve this problem while coming closer to the spec, and I think we
> usually consider getting closer to the spec to be a sufficient reason
> for breaking backward compatibility (cf. standard_conforming_strings).

Indeed.

> But I don't know whether he is correct when he argues that the spec
> makes admin option on a role sufficient to drop the role. I've never
> had any luck understanding what the SQL specification is saying about
> any topic.

I'm happy to point you to what the spec says and to discuss it further
if that would be helpful, or to get other folks to comment on it.  I
agree that it's definitely hard to grok at times.  In this particular
case what I'm looking at is, under DROP ROLE / Access Rules, there's
only one sentence:

There shall exist at least one grantable role authorization descriptor
whose role name is R and whose grantee is an enabled authorization
identifier.

A bit of decoding: 'grantable role authorization descriptor' is a GRANT
of a role WITH ADMIN OPTION.  The role name 'R' is the role specified.
The 'grantee' is who that role R was GRANT'd to, and 'enabled
authorization identifier' is basically "has_privs_of_role()" (note that
under the spec you can have roles that you're a member of but which are
*not* currently enabled).

Hopefully that helps.
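To make that decoding concrete, here is a toy Python model (not PostgreSQL
source, and the role names are made up) of the quoted access rule: DROP ROLE R
is allowed only if some grant of R WITH ADMIN OPTION names a grantee whose
privileges the current user holds.  "Enabled authorization identifier" is
simplified here to the full transitive closure of memberships, which glosses
over SET ROLE / enabled-role subtleties in the spec:

```python
# Toy model of the SQL-spec DROP ROLE access rule:
# "There shall exist at least one grantable role authorization descriptor
#  whose role name is R and whose grantee is an enabled authorization
#  identifier."
# A grant is (role, grantee, admin_option); "enabled" is approximated by
# the transitive closure of the user's role memberships.

grants = [
    # (role, grantee, admin_option)
    ("doug", "charlie", True),   # GRANT doug TO charlie WITH ADMIN OPTION
    ("doug", "bob", False),      # GRANT doug TO bob (plain membership)
    ("charlie", "alice", True),  # GRANT charlie TO alice WITH ADMIN OPTION
]

def privileges_of(user):
    """Roles whose privileges `user` holds: itself plus memberships,
    followed transitively (a rough has_privs_of_role())."""
    held, frontier = {user}, [user]
    while frontier:
        u = frontier.pop()
        for role, grantee, _ in grants:
            if grantee == u and role not in held:
                held.add(role)
                frontier.append(role)
    return held

def may_drop_role(user, target):
    """DROP ROLE target: need a WITH ADMIN OPTION grant of target to a
    role whose privileges `user` holds."""
    return any(role == target and admin and grantee in privileges_of(user)
               for role, grantee, admin in grants)

print(may_drop_role("charlie", "doug"))  # True: charlie has ADMIN on doug
print(may_drop_role("bob", "doug"))      # False: plain membership only
print(may_drop_role("alice", "doug"))    # True: alice holds charlie's privs
```

Note how this matches the earlier Alice/Bob/Charlie/Doug example: Charlie may
drop Doug, Bob may not, and no separate "ownership" catalog entry is needed to
decide it.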

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

On Fri, Mar 11, 2022 at 12:32 Mark Dilger 
wrote:

>
>
> > On Mar 11, 2022, at 8:48 AM, Stephen Frost  wrote:
> >
> > I agree that it would have an impact on backwards compatibility to
> > change how WITH ADMIN works- but it would also get us more in line with
> > what the SQL standard says for how WITH ADMIN is supposed to work and
> > that seems worth the change to me.
>
> I'm fine with giving up some backwards compatibility to get some SQL
> standard compatibility, as long as we're clear that is what we're doing.
> What you say about the SQL spec isn't great, though, because too much power
> is vested in "ADMIN".  I see "ADMIN" as at least three separate privileges
> together.  Maybe it would be spec compliant to implement "ADMIN" as a
> synonym for a set of separate privileges?


I do think that’s reasonable … and believe I suggested it about 3 messages
ago in this thread. ;)  (step #3 I think it was?  Or maybe 4).

> On Mar 11, 2022, at 8:41 AM, Stephen Frost  wrote:
> >
> > That we aren't discussing the issues with the current GRANT ... WITH
> > ADMIN OPTION and how we deviate from what the spec calls for when it
> > comes to DROP ROLE, which seems to be the largest thing that's
> > 'solved' with this ownership concept, is concerning to me.
>
> Sure, let's discuss that a bit more.  Here is my best interpretation of
> your post about the spec, when applied to postgres with an eye towards not
> doing any more damage than necessary:
>
> > On Mar 10, 2022, at 11:58 AM, Stephen Frost  wrote:
> >
> > let's look at what the spec says:
> >
> > CREATE ROLE
> >  - Who is allowed to run CREATE ROLE is implementation-defined
>
> This should be anyone with membership in pg_create_role.


That could be the case if we wished to go that route. I’d think in such a
case we’d then also remove CREATEROLE, as otherwise the documentation feels
like it’d be quite confusing.

>  - After creation, this is effectively run:
> >GRANT new_role TO creator_role WITH ADMIN, GRANTOR "_SYSTEM"
>
> This should internally be implemented as three separate privileges, one
> which means you can grant the role, another which means you can drop the
> role, and a third that means you're a member of the role.  That way, they
> can be independently granted and revoked.  We could make "WITH ADMIN" a
> short-hand for "WITH G, D, M" where G, D, and M are whatever we name the
> independent privileges Grant, Drop, and Member-of.


I mean, sure, we can get there, and possibly add more privileges, such as
whether you’re allowed to change or reset that role’s password, but I
don’t see that this piece is required as part of the very first change in
this area.  Further, WITH ADMIN already gives grant and member today, so
you’re saying the only thing this change does that makes “WITH ADMIN” too
powerful is adding DROP to it, yet that’s explicitly what the spec calls
for.  In short, I disagree that it's a big issue to move the DROP ROLE
right from CREATEROLE roles (which today hold it across the entire system,
excluding superusers) to WITH ADMIN, given that a role with WITH ADMIN
can already:

A) become that role and drop all of its objects
B) GRANT that role to some other role

> Splitting G and D helps with backwards compatibility, because it gives
> people who want the traditional postgres "admin" a way to get there, by
> granting "G+M".  Splitting M from G and D makes it simpler to implement the
> "bot" idea, since the bot shouldn't have M.  But it does raise a question
> about always granting G+D+M to the creator, since the bot is the creator
> and we don't want the bot to have M.  This isn't a problem I've invented
> from thin air, mind you, as G+D+M is just the definition of ADMIN per the
> SQL spec, if I've understood you correctly.  So we need to think a bit more
> about the pg_create_role built-in role and whether that needs to be further
> refined to distinguish those who can get membership in roles they create
> vs. those who cannot.  This line of reasoning takes me in the direction of
> what I think you were calling #5 upthread, but you'd have to elaborate on
> that, and how it interacts with the spec, for us to have a useful
> conversation about it.


All that said, as I said before, I’m in favor of splitting things up and so
if you want to do that as part of this initial work, sure. Idk that it’s
absolutely required as part of this but I’m not going to complain if it’s
included either.  I agree that would allow folks to get something similar
to what they could get today if they want.
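The G/D/M split being discussed can be sketched in a toy Python model (again,
not PostgreSQL source; the bit names G, D, M and the role names are
hypothetical, taken from the proposal above).  The point it illustrates is
that "WITH ADMIN" becomes shorthand for all three bits, while a provisioning
"bot" can hold grant and drop rights without membership:

```python
# Toy sketch of splitting ADMIN into three independently grantable bits:
#   g = may GRANT the role, d = may DROP the role, m = member of the role.
from dataclasses import dataclass

@dataclass
class RoleGrant:
    role: str
    grantee: str
    g: bool = False  # may grant the role to others
    d: bool = False  # may drop the role
    m: bool = False  # membership (may become / inherit from the role)

grants = [
    # spec-style WITH ADMIN OPTION == G+D+M
    RoleGrant("app_user", "alice", g=True, d=True, m=True),
    # the "bot" that provisions roles: may grant and drop app_user,
    # but cannot itself become app_user
    RoleGrant("app_user", "provision_bot", g=True, d=True),
]

def has_bit(user, role, bit):
    """Does `user` hold the given bit (g/d/m) on `role`?"""
    return any(gr.role == role and gr.grantee == user and getattr(gr, bit)
               for gr in grants)

print(has_bit("alice", "app_user", "m"))          # True: full ADMIN
print(has_bit("provision_bot", "app_user", "m"))  # False: bot is not a member
print(has_bit("provision_bot", "app_user", "d"))  # True: bot may drop the role
```

Under this sketch, granting "G+M" recovers something close to the traditional
Postgres admin behavior, while "G+D" without M covers the bot case.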

I agree that the split up helps with the “bot” idea, as 

Re: Kerberos delegation support in libpq and postgres_fdw

2022-03-11 Thread Stephen Frost
Greetings,

On Fri, Mar 11, 2022 at 18:55 Jacob Champion  wrote:

> On Mon, 2022-02-28 at 20:28 -0500, Stephen Frost wrote:
> > Will add to the CF for consideration.
>
> GSSAPI newbie here, so caveat lector.


No worries, thanks for your interest!

> diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
> > index efc53f3135..6f820a34f1 100644
> > --- a/src/backend/libpq/auth.c
> > +++ b/src/backend/libpq/auth.c
> > @@ -920,6 +920,7 @@ pg_GSS_recvauth(Port *port)
> >   int mtype;
> >   StringInfoData buf;
> >   gss_buffer_desc gbuf;
> > + gss_cred_id_t proxy;
> >
> >   /*
> >* Use the configured keytab, if there is one.  Unfortunately, Heimdal
> > @@ -949,6 +950,9 @@ pg_GSS_recvauth(Port *port)
> >*/
> >   port->gss->ctx = GSS_C_NO_CONTEXT;
> >
> > + proxy = NULL;
> > + port->gss->proxy_creds = false;
> > +
> >   /*
> >* Loop through GSSAPI message exchange. This exchange can consist of
> >* multiple messages sent in both directions. First message is always from
> > @@ -999,7 +1003,7 @@ pg_GSS_recvauth(Port *port)
> >                                          &port->gss->outbuf,
> >                                          &gflags,
> >                                          NULL,
> > -                                        NULL);
> > +                                        &proxy);
> >
> >   /* gbuf no longer used */
> >   pfree(buf.data);
> > @@ -1011,6 +1015,12 @@ pg_GSS_recvauth(Port *port)
> >
> >   CHECK_FOR_INTERRUPTS();
> >
> > + if (proxy != NULL)
> > + {
> > + pg_store_proxy_credential(proxy);
> > + port->gss->proxy_creds = true;
> > + }
> > +
>
> Some implementation docs [1] imply that a delegated_cred_handle is only
> valid if the ret_flags include GSS_C_DELEG_FLAG. The C-binding RFC [2],
> though, says that we can rely on it being set to GSS_C_NO_CREDENTIAL if
> no handle was sent...
>
> I don't know if there are any implementation differences here, but in
> any case I think it'd be more clear to use the GSS_C_NO_CREDENTIAL
> spelling (instead of NULL) here, if we do decide not to check
> ret_flags.


Hmmm, yeah, that seems like it might be better and is something I’ll take a
look at.

> [5] says we have to free the proxy credential with GSS_Release_cred();
> I don't see that happening anywhere, but I may have missed it.


I’m not sure that it’s really necessary or worthwhile to do that at process
end since … the process is about to end. I suppose we could provide a
function that a user could call to ask for it to be released sooner if we
really wanted..?

> >   maj_stat = gss_init_sec_context(&min_stat,
> > -                                  GSS_C_NO_CREDENTIAL,
> > +                                  proxy,
> >                                    &conn->gctx,
> >                                    conn->gtarg_nam,
> >                                    GSS_C_NO_OID,
> > -                                  GSS_C_MUTUAL_FLAG,
> > +                                  GSS_C_MUTUAL_FLAG | GSS_C_DELEG_FLAG,
> >                                    0,
> >                                    GSS_C_NO_CHANNEL_BINDINGS,
> >                                    (ginbuf.value == NULL) ? GSS_C_NO_BUFFER : &ginbuf,
> > diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
> > index 6ea52ed866..566c89f52f 100644
> > --- a/src/interfaces/libpq/fe-secure-gssapi.c
> > +++ b/src/interfaces/libpq/fe-secure-gssapi.c
> > @@ -631,7 +631,7 @@ pqsecure_open_gss(PGconn *conn)
> >*/
> >   major = gss_init_sec_context(&minor, conn->gcred, &conn->gctx,
> >                                conn->gtarg_nam, GSS_C_NO_OID,
> > -                              GSS_REQUIRED_FLAGS, 0, 0, &input, NULL,
> > +                              GSS_REQUIRED_FLAGS | GSS_C_DELEG_FLAG, 0, 0, &input, NULL,
>
> It seems like there should be significant security implications to
> allowing delegation across the board. Especially since one postgres_fdw
> might delegate to another server, and then another... Should this be
> opt-in, maybe via a connection parameter?


This is already opt-in- at kinit time a user can decide if they’d like a
proxy-able ticket or not.  I don’t know that we really need to have our own
option for it … tho I’m not really against adding such an option either.

> (It also looks like there are some mechanisms for further constraining
> delegation scope, either by administrator policy or otherwise [3, 4].
> Might be a good thing for a v2 of this feature to have.)


Yes, constrained delegation is a pretty neat extension to Kerberos and one
I’d like to look at later as a future enhancement but I don’t think it
needs to be in the initial version.

Similarly, it f

Re: role self-revocation

2022-03-11 Thread Stephen Frost
Greetings,

On Fri, Mar 11, 2022 at 19:03 Mark Dilger 
wrote:

> > On Mar 11, 2022, at 2:46 PM, Stephen Frost  wrote:
> >
> > I do think that’s reasonable … and believe I suggested it about 3
> messages ago in this thread. ;)  (step #3 I think it was?  Or maybe 4).
>
> Yes, and you mentioned it to me off-list.


Indeed.

> I'm soliciting a more concrete specification for what you are proposing.
> To me, that means understanding how the SQL spec behavior that you champion
> translates into specific changes.  You specified some of this in steps #1
> through #5, but I'd like a clearer indication of how many of those (#1
> alone, both #1 and #2, or what?) constitute a competing idea to the idea of
> role ownership, and greater detail about how each of those steps translate
> into specific behavior changes in postgres.  Your initial five-step email
> seems to be claiming that #1 by itself is competitive, but to me it seems
> at least #1 and #2 would be required.


First … I outlined a fair bit of further description in the message you
just responded to but neglected to include in your response, which strikes
me as odd that you’re now asking for further explanation.  When it comes to
completing the idea of role ownership- I didn’t come up with that idea nor
champion it and therefore I’m not really sure how many of the steps are
required to fully support that concept..?  For my part, I would think that
those steps necessary to satisfy the spec would get us pretty darn close to
what true folks advocating for role ownership are asking for, but that
doesn’t include the superuser-only alter role attributes piece.  Is that
included in role ownership?  I wouldn’t think so, but some might argue
otherwise, and I don’t know that it is actually useful to divert into a
discussion about what is or isn’t in that.

If we agree that the role attribute bits are independent then I think I
agree that we need 1 and 2 to get the capabilities that the folks asking
for role ownership want, as 2 is where we make sure that one admin of a
role can’t revoke another admin’s rights over that role.  Perhaps 2 isn’t
strictly necessary in a carefully managed environment where no one else is
given admin rights over the mini-superuser roles, but I’d rather not have
folks depending on that.  I’d still push back though and ask those
advocating for this if it meets what they’re asking for.  I got the
impression that it did but maybe I misunderstood.

In terms of exactly how things would work with these changes… I thought I
explained it pretty clearly, so it’s kind of hard to answer that further
without a specific question to answer.  Did you have something specific in
mind?  Perhaps I could answer a specific question and provide more clarity
that way.

Thanks,

Stephen



Re: role self-revocation

2022-03-14 Thread Stephen Frost
Greetings,

* Mark Dilger (mark.dil...@enterprisedb.com) wrote:
> > On Mar 11, 2022, at 4:56 PM, Stephen Frost  wrote:
> > First … I outlined a fair bit of further description in the message you 
> > just responded to but neglected to include in your response, which strikes 
> > me as odd that you’re now asking for further explanation.
> 
> >  When it comes to completing the idea of role ownership- I didn’t come up 
> > with that idea nor champion it
> 
> Sorry, not "completing", but "competing".  It seems we're discussing 
> different ways to fix how roles and CREATEROLE work, and we have several 
> ideas competing against each other.  (This differs from *people* competing 
> against each other, as I don't necessarily like the patch I wrote better than 
> I like your idea.)
> 
> > and therefore I’m not really sure how many of the steps are required to 
> > fully support that concept..?
> 
> There are problems that the ownership concepts solve.  I strongly suspect 
> that your proposal could also solve those same problems, and just trying to 
> identify the specific portions of your proposal necessary to do so.

I'm happy to help try to identify those, but it seems we'd need to have
the exact problems that ownership solves defined first.  Robert defined
what he's looking for as:

Robert Haas  wrote:
> Probably easier to just say it again: I want to have users that can
> create roles and then have superuser-like powers with respect to those
> roles. They can freely exercise the privileges of those roles, and
> they can do all the things that a superuser can do but only with
> respect to those roles. They cannot break out to the OS. I think it's
> pretty similar to what you are describing, with a couple of possible
> exceptions. For example, would you imagine that being an admin of a
> login role would let you change that user's password? Because that
> would be desirable behavior from where I sit.

Which sure sounds like it's just about covered in step #1 of what I
outlined before, except that the above description implies that one
can't "get away" from the user who created their role, in which case we
do need step #2 also.

> >  For my part, I would think that those steps necessary to satisfy the spec 
> > would get us pretty darn close to what true folks advocating for role 
> > ownership are asking for
> 
> I have little idea what "true folks" means in this context.  As for 
> "advocating for role ownership", I'm not in that group.  Whether role 
> ownership or something else, I just want some solution to a set of problems, 
> mostly to do with needing superuser to do role management tasks.

... I'm not entirely sure what I meant there either, my hunch is that
'true' was actually just a leftover word from some other framing of that
sentence and I had meant to remove it.  Apologies for that.  What I was
trying to get at there is that steps #1 & #2 are the ones that I view as
getting us closer to spec compliance and that doing so would get us to
where Robert's ask above would be answered.

> > , but that doesn’t include the superuser-only alter role attributes piece.  
> > Is that included in role ownership?  I wouldn’t think so, but some might 
> > argue otherwise, and I don’t know that it is actually useful to divert into 
> > a discussion about what is or isn’t in that.
> 
> Introducing the idea of role ownership doesn't fix that.  But a patch which 
> introduces role ownership is useless unless CREATEROLE is also fixed.  There 
> isn't any point having non-superusers create and own roles if, to do so, they 
> need a privilege which can break into superuser.  But that argument is no 
> different with a patch along the lines of what you are proposing.  CREATEROLE 
> needs fixing either way.

There are a few ways the 'CREATEROLE' role attribute could be fixed:

- Remove it entirely (replacing with pg_create_role or such)
- Remove its ability to GRANT out rights that the role running it
  doesn't have
- Make it superfluous (leave it as-is, but add in pg_create_role which
  allows a role to create another role but doesn't include the magic
  GRANT whatever-role TO whatever-role that CREATEROLE has)

I agree that we need to do something here to allow roles to create other
roles while not having or being able to trivially get superuser
themselves.
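The second option above (CREATEROLE no longer able to hand out rights the
grantor doesn't hold) can be modeled in a few lines of toy Python (not
PostgreSQL source; the role names are made up for illustration):

```python
# Toy sketch of the proposed rule: a CREATEROLE user may still run GRANT,
# but only for roles whose privileges the grantor itself already holds --
# unlike today's CREATEROLE, which can grant any non-superuser role.

memberships = {
    # role -> roles it is directly a member of
    "helpdesk_admin": {"helpdesk"},
}

def roles_held(user):
    """`user` itself plus all roles reachable via membership."""
    held, frontier = {user}, [user]
    while frontier:
        u = frontier.pop()
        for r in memberships.get(u, ()):
            if r not in held:
                held.add(r)
                frontier.append(r)
    return held

def may_grant(grantor, role, has_createrole=True):
    # proposed rule: CREATEROLE alone is not enough; the grantor must
    # already hold the role's privileges in order to hand them out
    return has_createrole and role in roles_held(grantor)

print(may_grant("helpdesk_admin", "helpdesk"))                   # True
print(may_grant("helpdesk_admin", "pg_execute_server_program"))  # False
```

Under this rule a CREATEROLE user can no longer escalate by granting itself
(or a confederate) membership in a privileged role it doesn't already hold.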

> > If we agree that the role attribute bits are independent
> 
> Yes.

Great.

> > then I think I agree that we need 1 and 2 to get the capabilities that the 
> > folks asking for role ownership want
> 
> Yes.

Ok.

> > as 2 is where we make sure that one admin of a role can’t revoke 

Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-14 Thread Stephen Frost
Greetings,

* Jeff Davis (pg...@j-davis.com) wrote:
> On Thu, 2022-03-10 at 15:54 -0500, Stephen Frost wrote:
> > The standard is basically that all of the functions it brings are
> > written to enforce the PG privilege system and you aren't able to use
> > the extension to bypass those privileges.  In some cases that means
> > that
> 
> Every extension should follow that standard, right? If it doesn't (e.g.
> creating dangerous functions and granting them to public), then even
> superuser should not install it.

Every extension that's intended to be installed on a multi-user PG
system where the admin cares about security in the database, sure.  I
disagree that this applies universally to every system or every
extension.  Those are standards that modules we distribute in contrib
should meet but I don't know that we necessarily have to have them for,
say, modules in test.

> > the C-language functions installed have if(!superuser) ereport()
> > calls
> 
> I'm curious why not rely on the grant system where possible? I thought
> we were trying to get away from explicit superuser checks.

We don't yet have capabilities for everything.  I agree that we should
be getting away from explicit superuser checks and explained below how
we might be able to in this case.

> > I've not looked back on this thread, but I'd expect pg_walinspect to
> > need those superuser checks and with those it *could* be marked as
> > trusted, but that again brings into question how useful it is to mark
> > it
> > thusly.
> 
> As long as any functions are safely accessible to public or a
> predefined role, there is some utility for the 'trusted' marker.

I'm not sure that I agree, though I'm also not sure that it's a useful
thing to debate.  Still, if all of the functions in a particular
extension have explicit if (!superuser) ereport() checks in them, then
while installing it is 'safe' and it could be marked as 'trusted'
there's very little point in doing so as the only person who can get
anything useful from those functions is a superuser, and a superuser can
install non-trusted extensions anyway.  How is it useful to mark such an
extension as 'trusted'?

> As this patch is currently written, pg_monitor has access to these
> functions, though I don't think that's the right privilege level at
> least for pg_get_raw_wal_record().

Yeah, I agree that pg_monitor isn't the right thing for such a function
to be checking.

> > I certainly don't think we should allow either database owners or
> > regular users on a system the ability to access the WAL traffic of
> > the
> > entire system.
> 
> Agreed. That was not what I intended by asking if it should be marked
> 'trusted'. The marker only allows the non-superuser to run the CREATE
> EXTENSION command; it's up to the extension script to decide whether
> any non-superusers can do anything at all with the extension.

Sure.

> >   More forcefully- we should *not* be throwing more access
> > rights towards $owners in general and should be thinking about how we
> > can allow admins, providers, whomever, the ability to control what
> > rights users are given.  If they're all lumped under 'owner' then
> > there's no way for people to provide granular access to just those
> > things they wish and intend to.
> 
> Not sure I understand, but that sounds like a larger discussion.

The point I was trying to make is that it's better to move in the
direction of things like pg_read_all_data rather than just declaring
that the owner of a database can read all of the tables in that
database, as an example.  Maybe we want to implicitly have such
privilege for the owner of the database too, but we should first make it
something that's able to be GRANT'd out to non-owners so that it's not
required that all of those privileges be given out together at once.
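As a minimal sketch of that direction (the role names below are illustrative, not from the thread):

```sql
-- Grant granular read access without conveying ownership:
CREATE ROLE reporting_reader LOGIN;
GRANT pg_read_all_data TO reporting_reader;

-- Contrast with membership in the owning role, which also conveys the
-- ability to SET ROLE to the owner and act as it -- far broader than
-- just being able to read the data:
-- GRANT app_owner TO reporting_reader;
```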

Note that to be considered an 'owner' of an object you have to be a
member of the role that owns the object, which means that said role is
necessarily able to also become the owning role too.  Lumping lots of
privileges together- all the rights that being an 'owner' of the object
conveys, plus the ability to also become that role directly and do
things as that role, works actively against the general idea of 'least
privilege'.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2022-03-16 Thread Stephen Frost
Greetings,

* Bharath Rupireddy (bharath.rupireddyforpostg...@gmail.com) wrote:
> On Tue, Mar 15, 2022 at 7:21 AM Bharath Rupireddy
>  wrote:
> >
> > On Mon, Mar 14, 2022 at 8:25 PM Stephen Frost  wrote:
> > >
> > > > As this patch is currently written, pg_monitor has access to these
> > > > functions, though I don't think that's the right privilege level at
> > > > least for pg_get_raw_wal_record().
> > >
> > > Yeah, I agree that pg_monitor isn't the right thing for such a function
> > > to be checking.
> >
> > On Thu, Mar 10, 2022 at 1:52 PM Jeff Davis  wrote:
> > >
> > > * pg_get_raw_wal_record() seems too powerful for pg_monitor. Maybe that
> > > function should require pg_read_server_files? Or at least
> > > pg_read_all_data?
> >
> > The v9 patch set posted at [1] grants execution on
> > pg_get_raw_wal_record() to the pg_monitor role.
> >
> > pg_read_all_data may not be the right choice, but pg_read_server_files
> > is as these functions do read the WAL files on the server. If okay,
> > I'm happy to grant execution on pg_get_raw_wal_record() to the
> > pg_read_server_files role.
> >
> > Thoughts?
> >
> > [1] 
> > https://www.postgresql.org/message-id/CALj2ACVRH-z8mZLyFkpLvY4qRhxQCqU_BLkFTtwt%2BTPZNhfEVg%40mail.gmail.com
> 
> Attaching v10 patch set which allows pg_get_raw_wal_record to be
> executed by either superuser or users with pg_read_server_files role,
> no other change from v9 patch set.

In a quick look, that seems reasonable to me.  If folks want to give out
access to this function individually they're also able to do so, which
is good.  Doesn't seem worthwhile to introduce a new predefined role for
this one function.
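Giving out access individually would look roughly like the following; this is a sketch, and the exact function signature in pg_walinspect may differ:

```sql
-- Assuming the extension script revokes execution from PUBLIC, a
-- superuser can then hand access out piecemeal to a specific role:
GRANT EXECUTE ON FUNCTION pg_get_raw_wal_record(pg_lsn) TO wal_debugger;

-- ...or simply grant membership in the predefined role the function
-- checks for:
GRANT pg_read_server_files TO wal_debugger;
```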

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-16 Thread Stephen Frost
Greetings,

* samay sharma (smilingsa...@gmail.com) wrote:
> On Fri, Mar 4, 2022 at 11:15 AM Jacob Champion  wrote:
> > On Thu, 2022-03-03 at 11:12 +0100, Peter Eisentraut wrote:
> > > At the moment, it is not possible to judge whether the hook interface
> > > you have chosen is appropriate.
> > >
> > > I suggest you actually implement the Azure provider, then make the hook
> > > interface, and then show us both and we can see what to do with it.
> >
> > To add a data point here, I've rebased my OAUTHBEARER experiment [1] on
> > top of this patchset. (That should work with Azure's OIDC provider, and
> > if it doesn't, I'd like to know why.)
> 
> Firstly, thanks for doing this. It helps to have another data point and the
> feedback you provided is very valuable. I've looked to address it with the
> patchset attached to this email.
> 
> This patch-set adds the following:
> 
> * Allow multiple custom auth providers to be registered (Addressing
> feedback from Aleksander and Andrew)
> * Modify the test extension to use SCRAM to exchange secrets (Based on
> Andres's suggestion)
> * Add support for custom auth options to configure provider's behavior (by
> exposing a new hook) (Required by OAUTHBEARER)
> * Allow custom auth methods to use usermaps. (Required by OAUTHBEARER)

That we're having to extend this quite a bit to work for the proposed
OAUTH patches and that it still doesn't do anything for the client side
(note that the OAUTHBEARER patches are still patching libpq to add
support directly and surely will still be even with this latest patch
set ...) makes me seriously doubt that this is the right direction to be
going in.

> > After the port, here are the changes I still needed to carry in the
> > backend to get the tests passing:
> >
> > - I needed to add custom HBA options to configure the provider.
> 
> Could you try to rebase your patch to use the options hook and let me know
> if it satisfies your requirements?
> 
> Please let me know if there's any other feedback.

How about- if we just added OAUTH support directly into libpq and the
backend, would that work with Azure's OIDC provider?  If not, why not?
If it does, then what's the justification for trying to allow custom
backend-only authentication methods?

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-16 Thread Stephen Frost
Greetings,

* Jeff Davis (pg...@j-davis.com) wrote:
> On Wed, 2022-03-16 at 11:02 -0400, Stephen Frost wrote:
> > That we're having to extend this quite a bit to work for the proposed
> > OAUTH patches and that it still doesn't do anything for the client side
> > (note that the OAUTHBEARER patches are still patching libpq to add
> > support directly and surely will still be even with this latest patch
> > set ...) makes me seriously doubt that this is the right direction to be
> > going in.
> 
> I don't follow this concern. I assume you're referring to the last two
> bullet points, which I repeat below:
> 
> * Add support for custom auth options to configure provider's
> behavior: 
> 
> Even without OAUTH I think this would have been requested.
> Configuration of some kind is going to be necessary. Without custom
> options, I guess the provider would need to define its own config file?
> Not the end of the world, but not ideal.

Yes, configuration of some kind will be needed and getting that
configuration correct is going to be essential to being able to have a
secure authentication system.

> > How about- if we just added OAUTH support directly into libpq and the
> > backend, would that work with Azure's OIDC provider?  If not, why
> > not?
> > If it does, then what's the justification for trying to allow custom
> > backend-only authentication methods?
> 
> I understand the point, but it also has a flip side: if custom auth
> works, why do we need to block waiting for a complete core OAUTH
> feature?

First, let's be clear- we aren't actually talking about custom or
pluggable authentication here, at least when it comes to PG as a
project.  For it to actually be pluggable, it needs to be supported on
both the client side and the server side, not just the server side.

That this keeps getting swept under the carpet makes me feel like this
isn't actually an argument about the best way to move the PG project
forward but rather has another aim.  I don't think we should even
consider accepting a patch to make this something that works only on the
server side as, in such a case, we wouldn't even be able to have an
extension of our own that leverages it or tests it without bespoke code
for that purpose.  I certainly don't think we should add code to libpq
for some auth method that isn't even available in a default install of
PG, as would happen if we accepted both this patch and the patch to add
OAUTH support to libpq.  What exactly is the thinking on how this would
move forward?  We'd commit this custom support that requires an external
extension to actually work and then add hacked-in code to libpq in order
for folks to be able to use OAUTH?  What happens when, and not if,
something changes in that extension that requires a change on the client
side..?  Are we going to be dealing with CVEs for the libpq side of
this?

> Authentication seems like a good candidate for allowing custom methods.

It's not and this is clear from looking at history.  We have seen this
time and time again- PAM, SASL, RADIUS could be included, EAP, etc.
You'll note that we already support some of these, yet you'd like to now
add some set of hooks in addition to these pluggable options that
already exist because, presumably, they don't actually solve what you're
looking for.  That's actually rather typical when it comes to
authentication methods- everyone has this idea that it can be made
pluggable, or that a few hooks to allow a custom method will work for
the next great thing, but that's not what actually happens.

> New ones are always coming along, being used in new ways, updating to
> newer crypto, or falling out of favor. We've accumulated quite a few.

I agree that we need to get rid of some of the ones that we've got, but
even so, at least the ones we have are maintained and updated to the
extent possible for us to fix issues with them.  The idea that
externally maintained custom auth methods would be taken care of in such
a way is quite far fetched in my view.

I also disagree that they're coming along so quickly that it's actually
difficult for us to include or support them; we've covered quite a few,
after all, as you seem to agree.

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-16 Thread Stephen Frost
Greetings,

* samay sharma (smilingsa...@gmail.com) wrote:
> On Wed, Mar 16, 2022 at 8:02 AM Stephen Frost  wrote:
> > How about- if we just added OAUTH support directly into libpq and the
> > backend, would that work with Azure's OIDC provider?  If not, why not?
> 
> Overall, Azure AD implements OIDC, so this could be doable. However, we'll
> have to explore further to see if the current implementation would satisfy
> all the requirements.

Great, would be useful to know what's discovered here and if OIDC
doesn't do what you're looking for, how you're intending to work on
improving it to make it so that it does, in an open and transparent
manner through the RFC process.

> > If it does, then what's the justification for trying to allow custom
> > backend-only authentication methods?
> 
> The only goal of this patch wasn't to enable support for Azure AD. That's
> just one client. Users might have a need to add or change auth methods in
> the future and providing that extensibility so we don't need to have core
> changes for each one of them would be useful. I know there isn't alignment
> on this yet, but if we'd like to move certain auth methods out of core into
> extensions, then this might provide a good framework for that.

Hand-waving at this and saying that other people want it is not
justification for it- who else, exactly, has come along and asked for
it?  I also disagree that we actually want to move authentication
methods out of core- that was an argument brought up in this thread as
justification for why these hooks should exist but I don't see anyone
other than the folks that want the hooks actually saying that's
something they want.  There are some folks who agree that we should drop
support for certain authentication methods but that's not at all the
same thing (and no, moving them to contrib isn't somehow deprecating or
removing them from what we have to support or making it so that we don't
have to deal with them).

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-17 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-17 12:10:51 +0100, Peter Eisentraut wrote:
> > Looking at the existing authentication methods
> > 
> > # METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
> > # "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".
> > 
> > how many of these could have been implemented using a plugin mechanism that
> > was designed before the new method was considered?  Probably not many.
> 
> trust, reject, md5, password, ident, peer, pam, ldap, radius look trivially
> possible. I think sspi is doable as well, but I don't know it well enough to
> be confident. gss without transport encryption could have as well, I
> think. Even scram-sha-256 looks possible, although that'd have been a good bit
> harder.  Where do you see the problems?

I agree with Peter and don't feel like the above actually answered the
question in any fashion.  The question wasn't "which auth methods could
be implemented using these new set of hooks?" but rather- which could
have been implemented using a plugin mechanism that was designed
*before* the new method was considered?  The answer to that is
'basically none'.  There even existed a plugin mechanism (PAM) before
many of them, and they weren't able to be implemented using it.  That's
entirely the point- while this might be an interesting way to redesign
what we have now, should we feel that's useful to do (I don't, and think
it would be an actively bad thing for the project to try and do...)
there's no reason to believe that it'll actually be useful for the
*next* auth method that comes along.

> Only stuff tying into transport encryption is clearly not doable via the
> proposed API, but that's hardly the fault of an authentication API?

This is absolutely the wrong way to look at it.  The authentication
methods that are coming along today are designed to tie into the
transport encryption because that's *needed* to avoid MITM attacks.  I'd
much rather we generally avoid including ones that don't support that
and we certainly shouldn't be trying to build a generic system which
explicitly doesn't support it.

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-17 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-17 14:03:31 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > On 2022-03-17 12:10:51 +0100, Peter Eisentraut wrote:
> > > > Looking at the existing authentication methods
> > > > 
> > > > # METHOD can be "trust", "reject", "md5", "password", "scram-sha-256",
> > > > # "gss", "sspi", "ident", "peer", "pam", "ldap", "radius" or "cert".
> > > > 
> > > > how many of these could have been implemented using a plugin mechanism that
> > > > was designed before the new method was considered?  Probably not many.
> > > 
> > > trust, reject, md5, password, ident, peer, pam, ldap, radius look trivially
> > > possible. I think sspi is doable as well, but I don't know it well enough to
> > > be confident. gss without transport encryption could have as well, I
> > > think. Even scram-sha-256 looks possible, although that'd have been a good bit
> > > harder.  Where do you see the problems?
> > 
> > I agree with Peter and don't feel like the above actually answered the
> > question in any fashion.  The question wasn't "which auth methods could
> > be implemented using these new set of hooks?" but rather- which could
> > have been implemented using a plugin mechanism that was designed
> > *before* the new method was considered?  The answer to that is
> > 'basically none'.
> 
> I fail to see how you come to that conclusion.

...

> > There even existed a plugin mechanism (PAM) before many of them, and they
> > weren't able implemented using it.
> 
> That's nonsensical. PAM is a platform specific API to start with - so it
> doesn't make sense to implement additional postgres authentication mechanism
> with it. It also wasn't added to postgres as a base mechanism for extensible
> authentication. And it's fundamentally restricted in the kind of secret
> exchanges that can be implemented with it.

Exactly because of this.  The folks who came up with PAM didn't foresee
the direction that authentication methods were going to go in and I
don't see any particular reason why we're smarter than they were.

> > That's entirely the point- while this might be an interesting way to
> > redesign what we have now, should we feel that's useful to do (I don't and
> > think it would be an actively bad thing for the project to try and do...)
> > there's no reason to believe that it'll actually be useful for the *next*
> > auth method that comes along.
> 
> What concrete limitation do you see in the API? It basically gives you the
> same powers as writing something in auth.c directly. And due to AUTH_REQ_SASL*
> we have fairly extensible ways of exchanging authentication data.

I'm not one to predict what the next authentication method to come down
the road is but I doubt that an API we come up with today will actually
work to perfectly implement it.  GSSAPI was supposed to be an extensible
authentication method too.

Also- how is this going to work client-side in libpq?

> > > Only stuff tying into transport encryption is clearly not doable via the
> > > proposed API, but that's hardly the fault of an authentication API?
> > 
> > This is absolutely the wrong way to look at it.  The authentication
> > methods that are coming along today are designed to tie into the
> > transport encryption because that's *needed* to avoid MITM attacks.  I'd
> > much rather we generally avoid including ones that don't support that
> > and we certainly shouldn't be trying to build a generic system which
> > explicitly doesn't support it.
> 
> So you're saying that any authentication API needs to come together with a
> runtime extendable secure transport mechanism? That feels like raising the bar
> ridiculously into the sky. If we want to add more transport mechanisms - and
> I'm not sure we do - that's a very sizable project on its own. And
> independent.

No, that's not what I'm saying.  Authentication mechanisms should have a
way to tie themselves to the secure transport layer is what I was
getting at, which you seemed to be saying wasn't possible with this API.

However, it's certainly a relevant point that something like encrypted
GSSAPI still wouldn't be able to be implemented with this.  I don't know
that it's required for a new auth method to have that, but not being
able to do it with the proposed API is another point against this
approach.  That is, even the existing methods that we've got today
wouldn't all be able to work with this.

> Note that you *can* combine existing secure transport mechanisms with the
> proposed API. You can use Port->{ssl,gss,peer_cn,...} to query information and
> do further openssl etc calls.

If it's possible then that's good, though I do have concerns about how
this plays into support for different transport layers or even libraries
once we get support for an alternative to openssl.

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-17 Thread Stephen Frost
Greetings,

On Thu, Mar 17, 2022 at 17:02 Andres Freund  wrote:

> On 2022-03-16 18:50:23 -0400, Stephen Frost wrote:
> > First, let's be clear- we aren't actually talking about custom or
> > pluggable authentication here, at least when it comes to PG as a
> > project.  For it to actually be pluggable, it needs to be supported on
> > both the client side and the server side, not just the server side.
> >
> > That this keeps getting swept under the carpet makes me feel like this
> > isn't actually an argument about the best way to move the PG project
> > forward but rather has another aim.
>
> This is insulting and unjustified. IMO completely inappropriate for the list /
> community. I've also brought this up privately, but I thought it important to
> state so publically.


I am concerned.

I don’t intend to insult you or anyone else on this thread.  I’m sorry.

This isn’t the first time I asked about this on this thread, yet the
question about why this is only being discussed as backend changes, and
with the only goal seeming to be to have a backend loadable module, without
anything on the client side for something that’s clearly both a server and
client side concern, seems to just be ignored, which make me feel like my
comments and the concerns that I raise aren’t being appreciated.

I had drafted a response to your private email to me but hadn’t wanted to
send it without going over it again after taking time to be sure I had
cooled down and was being level-headed in my response.

I am sorry.  I am still concerned.

Thanks,

Stephen



Re: Proposal: Support custom authentication methods using hooks

2022-03-17 Thread Stephen Frost
Greetings,

On Thu, Mar 17, 2022 at 23:25 Andres Freund  wrote:

> On 2022-03-17 22:13:27 -0400, Stephen Frost wrote:
> > This isn’t the first time I asked about this on this thread, yet the
> > question about why this is only being discussed as backend changes, and
> > with the only goal seeming to be to have a backend loadable module, without
> > anything on the client side for something that’s clearly both a server and
> > client side concern, seems to just be ignored, which makes me feel like my
> > comments and the concerns that I raise aren’t being appreciated.
>
> It's imo a separate project to make the client side extensible. There's plenty
> of authentication methods that don't need any further client side support than
> the existing SASL (or password if OK for some use case) messages, which most
> clients (including libpq) already know.
>
> Of course the fact that libpq "only" speaks SCRAM-SHA-256 is a limit (unless
> you have server side access to clear text passwords, but brrr). But there's
> plenty that can be done with that. Proxying auth via a central postgres server
> with the secrets, sharing secrets with other data stores also understanding
> SCRAM-SHA-256, ...
>
> There definitely *also* are important authentication methods that can't be
> implemented without further client side support. Some of those could plausibly
> be tackled on the protocol / libpq level in a way that they could be used by
> multiple authentication methods. Other authentication methods definitely need
> dedicated code in the client (although the protocol likely can be fairly
> generic).
>
> Given that there's benefit from the server side extensibility alone, I don't
> see much benefit in tying the server side extensibility to the client side
> extensibility.


How are we going to reasonably test this then?

I also don’t think that I agree that it’s acceptable to only have the
ability to extend the authentication on the server side as that implies a
whole bunch of client-side work that goes above and beyond just
implementing an authentication system if it’s not possible to leverage
libpq (which nearly all languages out there use..).  Not addressing that
side of it also makes me concerned that whatever we do here won’t be
right/enough/etc.

This is not an area, in my view, where flexibility for the sake of it is
good.  Misunderstandings between the client and the server or between how
the core code and the hooks interact seem very likely and could easily lead
to insecure setups and a good chance for CVEs.

Much like encryption, authentication isn’t easy to get right.  We should be
working to implement standard that experts, through RFCs and similar
well-defined protocols, have defined in the core code with lots of eyes
looking at it.

So, very much a -1 from me for trying to extend the backend in this way.  I
see a lot of risk and not much, if any, benefit.  I’d much rather see us
add support directly in the core code, on the client and sever side, for
OAUTH and other well defined authentication methods, and we even have an
active patch for that could make progress on that with a bit of review.

Thanks,

Stephen



Re: Proposal: Support custom authentication methods using hooks

2022-03-17 Thread Stephen Frost
Greetings,

On Fri, Mar 18, 2022 at 00:24 Andres Freund  wrote:

> Hi,
>
> On 2022-03-18 00:03:37 -0400, Stephen Frost wrote:
> > On Thu, Mar 17, 2022 at 23:25 Andres Freund  wrote:
> > > It's imo a separate project to make the client side extensible. There's plenty
> > > of authentication methods that don't need any further client side support than
> > > the existing SASL (or password if OK for some use case) messages, which most
> > > clients (including libpq) already know.
> > >
> > > Of course the fact that libpq "only" speaks SCRAM-SHA-256 is a limit (unless
> > > you have server side access to clear text passwords, but brrr). But there's
> > > plenty that can be done with that. Proxying auth via a central postgres server
> > > with the secrets, sharing secrets with other data stores also understanding
> > > SCRAM-SHA-256, ...
> > >
> > > There definitely *also* are important authentication methods that can't be
> > > implemented without further client side support. Some of those could plausibly
> > > be tackled on the protocol / libpq level in a way that they could be used by
> > > multiple authentication methods. Other authentication methods definitely need
> > > dedicated code in the client (although the protocol likely can be fairly
> > > generic).
> > >
> > > Given that there's benefit from the server side extensibility alone, I don't
> > > see much benefit in tying the server side extensibility to the client side
> > > extensibility.
> >
> >
> > How are we going to reasonably test this then?
>
> The test module in the patch is a good starting point.


Without a complete, independent, “this is how it will really be used”
set of tests on both the server and client side, I’m not sure that it is.

> > I also don’t think that I agree that it’s acceptable to only have the
> > ability to extend the authentication on the server side as that implies a
> > whole bunch of client-side work that goes above and beyond just
> > implementing an authentication system if it’s not possible to leverage
> > libpq (which nearly all languages out there use..).  Not addressing that
> > side of it also makes me concerned that whatever we do here won’t be
> > right/enough/etc.
>
> You're skipping over my point of everything that can be made to use
> SASL-SCRAM-256 just working with the existing libpq? Again?


The plan is to make scram pluggable on the client side?  That isn’t what’s
been proposed here that I’ve seen.  Or is the idea that we are going to
have built-in support for some subset of “custom” things, but only on the
client side because it’s hard to do custom things there, but they’re custom
and have to be loaded through shared preload libraries on the server side?

> If you want to argue that somehow communicating via SASL-SCRAM-256 from a
> plugin is not a sufficient use-case, or that the API should be shaped more
> specifically to just use SASL-SCRAM-256, fine. Argue it. Otherwise I don't
> know what to do except ignore you.


One thrust of my argument is that if we are going to support custom
authentication methods then we need to consider both sides of that, not
just the server side.  Another is- it’s highly unlikely that the next
authentication method will actually be able to be implemented using these
custom hooks, based on the history we have seen of pluggable authentication
systems, and therefore I don’t agree that they make sense or will be what
we need in the future, which seems to be a large thrust of the argument
here.  I’m also concerned about the risks that this presents to the project
in that there will be arguments about where the fault lies between the
hooks and the core code, as this is just inherently security-related bits,
yet that doesn’t seem to be addressed.  Either way, it strikes me as
likely to be leaving our users in a bad position.

>
I also wasn’t aware that we felt that it’s acceptable to just ignore other
committers.

Thanks,

Stephen



Re: Proposal: Support custom authentication methods using hooks

2022-03-18 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-18 00:45:49 -0400, Stephen Frost wrote:
> > > I also don’t think that I agree that it’s acceptable to only have the
> > > > ability to extend the authentication on the server side as that implies a
> > > > whole bunch of client-side work that goes above and beyond just
> > > > implementing an authentication system if it’s not possible to leverage
> > > > libpq (which nearly all languages out there use..).  Not addressing that
> > > > side of it also makes me concerned that whatever we do here won’t be
> > > > right/enough/etc.
> > >
> > > You're skipping over my point of everything that can be made to use
> > > SASL-SCRAM-256 just working with the existing libpq? Again?
> 
> > The plan is to make scram pluggable on the client side?  That isn’t what’s
> > been proposed here that I’ve seen.
> 
> Libpq and many other drivers already speaks SASL-SCRAM-256. If you're
> implementing an authentication method that wants to store secrets outside of
> postgres (centrally in another postgres instance, ldap, mysql or whatnot) you
> don't need to make scram pluggable client-side. From the perspective of the
> client it'd look *exactly* like the normal auth flow with the server.

Then the only way to get support for something like, say, OAUTH, is to
modify the core code.

That strikes me as entirely defeating the point of having this be
extensible, since you still have to modify the core code to get support
for it.

> What's proposed here is not really about adding new authentication protocols
> between FE/BE (although some *limited* forms of that might be possible). It's
> primarily about using the existing FE/BE authentication exchanges
> (AUTH_REQ_{PASSWORD,SASL*,...}) to validate against something other than the
> builtin pg_authid.rolpassword.  Because drivers already know those
> authentication exchanges, it doesn't need to learn new tricks.

The OAUTH patch was specifically changed to try and use this and yet it
still had to modify the core code on the libpq side (at least, maybe
other things or maybe not depending on later changes to this patch).
I don't get why such an exercise would have been done if the goal is to
just allow what's in rolpassword to be pulled from somewhere else or why
we would be talking about different auth methods in the first place if
that's the goal here.  That the patch was also brought up in the context
of wanting to add support for another auth method also doesn't seem to
support the argument that this is just about changing where the value in
rolpassword comes from.

> As I said in
> https://www.postgresql.org/message-id/20220318032520.t2bcwrnterg6lq6g%40alap3.anarazel.de
> when describing the above, that's not enough to implement every desireable
> authentication method - but there's real use-cases where using the existing
> SASL-SCRAM-256 is sufficient.

That this apparently isn't actually about adding new authentication
protocols or methods sure strikes me as odd when folks were adjusting
proposed patches specificially to use this for new authentication
methods and the subject is "Proposal: Support custom authentication
methods using hooks".  If what's happening is still actually SCRAM then
I also don't get why we would change pg_hba.conf to say 'custom' instead
of scram-sha-256 w/ some option to say how to go get the actual token,
or why there isn't anything done to deal with passwords being changed.
Basically, I don't agree with more-or-less anything that the patch is
doing if that's the goal.  If the goal is to actually change where the
token in rolpassword comes from for existing authentication methods and
specifically isn't to actually try to use this to support new
authentication methods, then I'd suggest laying it out that way and
having it be an option for scram-sha-256 or whatever other methods
(though that seems like the only one that should really be changed in
this way, if you want my 2c, as the rest that would work this way are
basically broken, that being md5) and then make sure to have a way to
handle password changes too.
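To illustrate the pg_hba.conf point, the two shapes being contrasted would look roughly like this; both lines are hypothetical sketches of syntax, not something PostgreSQL actually accepts today:

```
# As the patch proposes: an opaque method name with a provider option,
# hiding what the client will actually be asked to do:
host  all  all  0.0.0.0/0  custom  provider=my_auth_provider

# As suggested above: keep the real method name and add an option
# saying where the stored secret comes from:
host  all  all  0.0.0.0/0  scram-sha-256  secret_source=central_store
```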

> > I also wasn’t aware that we felt that it’s acceptable to just ignore other
> > committers.
> 
> I'm not talking about going ahead and committing. Just not continuing
> discussing this topic and spending my time more fruitfully on something else.

I'm still a -1 on this patch.  Maybe there's a way to get a shared
storage for rolpassword and maybe that could even be extensible if we
want it to be (though I'd be more inclined to just build it in...) and
if that's actually the goal but this doesn't seem li

Re: [PATCH v2] use has_privs_for_role for predefined roles

2022-03-20 Thread Stephen Frost
Greetings,

On Sun, Mar 20, 2022 at 18:31 Joshua Brindle 
wrote:

> On Sun, Mar 20, 2022 at 12:27 PM Joe Conway  wrote:
> >
> > On 3/3/22 11:26, Joshua Brindle wrote:
> > > On Thu, Feb 10, 2022 at 2:37 PM Joe Conway  wrote:
> > >>
> > >> On 2/10/22 14:28, Nathan Bossart wrote:
> > >> > On Wed, Feb 09, 2022 at 04:39:11PM -0500, Joe Conway wrote:
> > >> >> On 2/9/22 13:13, Nathan Bossart wrote:
> > >> >>> I do wonder if users find the differences between predefined
> roles and role
> > >> >>> attributes confusing.  INHERIT doesn't govern role attributes,
> but it will
> > >> >>> govern predefined roles when this patch is applied.  Maybe the
> role
> > >> >>> attribute system should eventually be deprecated in favor of using
> > >> >>> predefined roles for everything.  Or perhaps the predefined roles
> should be
> > >> >>> converted to role attributes.
> > >> >>
> > >> >> Yep, I was suggesting that the latter would have been preferable
> to me while
> > >> >> Robert seemed to prefer the former. Honestly I could be happy with
> either of
> > >> >> those solutions, but as I alluded to that is probably a discussion
> for the
> > >> >> next development cycle since I don't see us doing that big a
> change in this
> > >> >> one.
> > >> >
> > >> > I agree.  I still think Joshua's proposed patch is a worthwhile
> improvement
> > >> > for v15.
> > >>
> > >> +1
> > >>
> > >> I am planning to get into it in detail this weekend. So far I have
> > >> really just ensured it merges cleanly and passes make world.
> > >
> > > Rebased patch to apply to master attached.
> >
> > Well longer than I planned, but finally took a closer look.
> >
> > I made one minor editorial fix to Joshua's patch, rebased to current
> > master, and added two missing call sites that presumably are related to
> > recent commits for pg_basebackup.
>
> FWIW I pinged Stephen when I saw the basebackup changes included
> is_member_of and he didn't think they necessarily needed to be changed
> since they aren't accessible to human and you can't SET ROLE on a
> replication connection in order to access the role where inheritance
> was blocked.


Yeah, though that should really be very clearly explained in comments
around that code as anything that uses is_member should really be viewed as
an exception.  I also wouldn’t be against using has_privs there anyway and
saying that you shouldn’t be trying to connect as one role on a replication
connection and using the privs of another when you don’t automatically
inherit those rights, but on the assumption that the committer intended
is_member there because SET ROLE isn’t something one does on replication
connections, I’m alright with it being as is.

Kind Regards,

Stephen P Frost

>


Re: role self-revocation

2022-03-28 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Mar 24, 2022 at 1:10 PM Tom Lane  wrote:
> > > However, it might. And if it does, I think it would be best if
> > > removing that exception were the *only* change in this area made by
> > > that release.
> >
> > Good idea, especially since it's getting to be too late to consider
> > anything more invasive anyway.
> 
> I'd say it's definitely too late at this point.

Agreed.

> > > So I propose to commit something like what I posted here:
> > > http://postgr.es/m/ca+tgmobgek0jraowqvpqhsxcfbdfitxsomoebhmmmhmj4gl...@mail.gmail.com
> >
> > +1, although the comments might need some more work.  In particular,
> > I'm not sure that this bit is well stated:

Also +1 on this.

Thanks,

Stephen




Re: Proposal: Support custom authentication methods using hooks

2022-03-28 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2022-03-18 15:23:21 -0400, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > On 2022-03-18 00:45:49 -0400, Stephen Frost wrote:
> > > > > I also don’t think that I agree that it’s acceptable to only have the
> > > > > > ability to extend the authentication on the server side as that 
> > > > > > implies a
> > > > > > whole bunch of client-side work that goes above and beyond just
> > > > > > implementing an authentication system if it’s not possible to 
> > > > > > leverage
> > > > > > libpq (which nearly all languages out there use..).  Not addressing 
> > > > > > that
> > > > > > side of it also makes me concerned that whatever we do here won’t be
> > > > > > right/enough/etc.
> > > > >
> > > > > You're skipping over my point of everything that can be made to use
> > > > > SASL-SCRAM-256 just working with the existing libpq? Again?
> > > 
> > > > The plan is to make scram pluggable on the client side?  That isn’t 
> > > > what’s
> > > > been proposed here that I’ve seen.
> > > 
> > > Libpq and many other drivers already speaks SASL-SCRAM-256. If you're
> > > implementing an authentication method that wants to store secrets outside 
> > > of
> > > postgres (centrally in another postgres instance, ldap, mysql or whatnot) 
> > > you
> > > don't need to make scram pluggable client-side. From the perspective of 
> > > the
> > > client it'd look *exactly* like the normal auth flow with the server.
> > 
> > Then the only way to get support for something like, say, OAUTH, is to
> > modify the core code.
> 
> It wasn't the goal of the patch to add oauth support. I think it's a good sign
> that the patch allowed Jacob to move the server side code in an extension, but
> it's a benefit not a requirement.

I disagree that it's actually at all useful when considering this
change, specifically because it doesn't move the actual goalposts at
all when it comes to adding support for new authentication methods for
PG overall as the core code still has to be modified through the regular
process.  I get that the idea of this patch isn't to add oauth, but I
don't think it's sensible to claim that adjusting the oauth patch to use
these hooks on just the server side, while still having a bunch of oauth
code in libpq, shows that this set of hooks makes sense.  I
definitely don't think we should ever even consider adding support for
something on the libpq side which require a 'custom' auth module on the
server side that isn't included in core.  Putting it into contrib would
just ultimately make it really painful for users to deal with without
any benefit that I can see either as we'd still have to maintain it and
update it through the regular PG process.  Trying hard to give the
benefit of the doubt here, maybe you could argue that by having the
module in contrib and therefore not loaded unless requested that the
system might be overall 'more secure', but given that we already require
admins to specify what the auth method is with pg_hba and that hasn't
been a source of a lot of CVEs around somehow getting the wrong code in
there, I just don't see it as a sensible justification.

> I think we might be able to build ontop of this to make things even more
> extensible and it's worth having the discussion whether the proposal can be
> made more extensible with a reasonable amount of complexity. But that's
> different from faulting the patch for not achieving something it didn't set
> out to do.

That it wasn't clear just what the idea of this patch was strikes me as
another point against it, really.  The thread talked about custom
authentication methods and the modification to pg_hba clearly made it
out to be a top-level alternative to SCRAM and the other methods even
though it sounds like that wasn't the intent, or maybe was but was only
part of it (but then, what's the other part..?  That's still unclear to
me even after reading these latest emails..).

> > That strikes me as entirely defeating the point of having this be
> > extensible, since you still have to modify the core code to get support
> > for it.
> 
> Since it wasn't the goal to add oauth...

But.. a patch was specifically proposed on this thread to use this to
add oauth, with encouragement from the folks working on this patch, and
that was then used, even just above, as a reason why this was a useful
change to make

Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-03-28 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Thu, Mar 10, 2022 at 07:13:14PM -0500, Chapman Flack wrote:
> > Looks like this change to an example in func.sgml is not quite right:
> > 
> > -postgres=# SELECT * FROM pg_walfile_name_offset(pg_stop_backup());
> > +postgres=# SELECT * FROM pg_walfile_name_offset(pg_backup_stop());
> > 
> > pg_backup_stop returns a record now, not just lsn. So this works for me:
> > 
> > +postgres=# SELECT * FROM pg_walfile_name_offset((pg_backup_stop()).lsn);
> 
> Ah, good catch.  I made this change in v7.  I considered doing something
> like this
> 
>   SELECT w.* FROM pg_backup_stop() b, pg_walfile_name_offset(b.lsn) w;
> 
> but I think your suggestion is simpler.

I tend to agree.  More below.

> Subject: [PATCH v7 1/1] remove exclusive backup mode
> diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
> index 0d69851bb1..c8b914c1aa 100644
> --- a/doc/src/sgml/backup.sgml
> +++ b/doc/src/sgml/backup.sgml
> @@ -881,19 +873,19 @@ test ! -f 
> /mnt/server/archivedir/000100A90065 && cp pg_wal/0
> 
>  
>   Connect to the server (it does not matter which database) as a user with
> - rights to run pg_start_backup (superuser, or a user who has been granted
> + rights to run pg_backup_start (superuser, or a user who has been granted
>   EXECUTE on the function) and issue the command:
>  
> -SELECT pg_start_backup('label', false, false);
> +SELECT pg_backup_start(label => 'label', fast => false);
>  
>   where label is any string you want to use to uniquely
>   identify this backup operation. The connection
> - calling pg_start_backup must be maintained until 
> the end of
> + calling pg_backup_start must be maintained until 
> the end of
>   the backup, or the backup will be automatically aborted.
>  
>  
>  
> - By default, pg_start_backup can take a long time 
> to finish.
> + By default, pg_backup_start can take a long time 
> to finish.
>   This is because it performs a checkpoint, and the I/O
>   required for the checkpoint will be spread out over a significant
>   period of time, by default half your inter-checkpoint interval

Hrmpf.  Not this patch's fault, but the default isn't 'half your
inter-checkpoint interval' anymore (checkpoint_completion_target was
changed to 0.9 ... let's not discuss who did that and missed fixing
this).  Overall though, maybe we should reword this to avoid having to
remember to change it again if we ever change the completion target
again?  Also it seems to imply that pg_backup_start is going to run its
own independent checkpoint or something, which isn't the typical case.
I generally explain what is going on here like this:

By default, pg_backup_start will wait for the next
checkpoint to complete (see ref: checkpoint_timeout ... maybe also
wal-configuration.html).

> @@ -937,7 +925,7 @@ SELECT * FROM pg_stop_backup(false, true);
>   ready to archive.
>  
>  
> - The pg_stop_backup will return one row with three
> + The pg_backup_stop will return one row with three
>   values. The second of these fields should be written to a file named
>   backup_label in the root directory of the backup. 
> The
>   third field should be written to a file named

I get that we have  and , but those are just
formatting and don't show up in the actual text, and so it ends up
being:

The pg_backup_stop will return

Which doesn't sound quite right to me.  I'd say we either drop 'The' or
add 'function' after.  I realize that this patch is mostly doing a
s/pg_stop_backup/pg_backup_stop/, but, still.

> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> index 8a802fb225..73096708cc 100644
> --- a/doc/src/sgml/func.sgml
> +++ b/doc/src/sgml/func.sgml
> @@ -25732,24 +25715,19 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 
> 622360 free (88 chunks); 1029560
>  spcmapfile text )
> 
> 
> -Finishes performing an exclusive or non-exclusive on-line backup.
> -The exclusive parameter must match the
> -previous pg_start_backup call.
> -In an exclusive backup, pg_stop_backup removes
> -the backup label file and, if it exists, the tablespace map file
> -created by pg_start_backup.  In a non-exclusive
> -backup, the desired contents of these files are returned as part of
> +Finishes performing an on-line backup.  The desired contents of the
> +backup label file and the tablespace map file are returned as part of
>  the result of the function, and should be written to files in the
>  backup area (not in the data directory).
> 

Given that this is a crucial part, which the exclusive mode has wrong,
I'd be a bit more forceful about it, eg:

The contents of the backup label file and the tablespace map file must
be stored as part of the backup and must NOT be stored in the live data
directory (doing so 

Re: Corruption during WAL replay

2022-03-29 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Mar 25, 2022 at 10:34 AM Tom Lane  wrote:
> > I dunno.  Compatibility and speed concerns aside, that seems like an awful
> > lot of bits to be expending on every page compared to the value.
> 
> I dunno either, but over on the TDE thread people seemed quite willing
> to expend like 16-32 *bytes* for page verifiers and nonces and things.

Absolutely.

> For compatibility and speed reasons, I doubt we could ever get by with
> doing that in every cluster, but I do have some hope of introducing
> something like that someday at least as an optional feature. It's not
> like a 16-bit checksum was state-of-the-art even when we introduced
> it. We just did it because we had 2 bytes that we could repurpose
> relatively painlessly, and not any larger number. And that's still the
> case today, so at least in the short term we will have to choose some
> other solution to this problem.

I agree that this would be great as an optional feature.  Those patches
to allow the system to be built with reserved space for $whatever would
allow us to have a larger checksum for those who want it and perhaps a
nonce with TDE for those who wish that in the future.  I mentioned
before that I thought it might be a good way to introduce page-level
epochs for 64bit xids too though it never seemed to get much traction.

Anyhow, this whole thread has struck me as a good reason to polish those
patches off and add on top of them an extended checksum ability, first,
independent of TDE, and remove the dependency of those patches from the
TDE effort and instead allow it to just leverage that ability.  I still
suspect we'll have some folks who will want TDE w/o a per-page nonce and
that could be an option but we'd be able to support TDE w/ integrity
pretty easily, which would be fantastic.

Thanks,

Stephen




Re: GSoC: Improve PostgreSQL Regression Test Coverage

2022-03-29 Thread Stephen Frost
Greetings,

* Christos Maris (christos.c.ma...@gmail.com) wrote:
> This is Christos Maris <https://www.linkedin.com/in/christosmaris/>, and
> I'd like to declare my interest in the GSoC project mentioned above.
> I'm an experienced Fullstack Developer, but still an open-source beginner.
> That, combined with the fact that I'd really like to contribute to Postgres
> and learn Perl, makes this particular project ideal for me.

Glad to hear that you're interested!

> I want to ask if the mentor of this project (Stephen Frost) could help me
> with my application (if that's allowed, of course)?

You're welcome to send it to me off-list to take a look at and I can
provide comments on it, but note that I can't provide excessive help in
crafting it.  This is to be your proposal after all, not mine.

Be sure to review this:

https://google.github.io/gsocguides/student/writing-a-proposal

when working on your proposal.

Thanks,

Stephen




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-04-05 Thread Stephen Frost
Greetings,

* David Steele (da...@pgmasters.net) wrote:
> On 4/4/22 11:42 AM, Nathan Bossart wrote:
> >I noticed a couple of other things that can be removed.  Since we no longer
> >wait on exclusive backup mode during smart shutdown, we can change
> >connsAllowed (in postmaster.c) to a boolean and remove CAC_SUPERUSER.  We
> >can also remove a couple of related notes in the documentation.  I've done
> >all this in the attached patch.
> 
> These changes look good to me. IMV it is a real bonus how much the state
> machine has been simplified.

Yeah, agreed.

> I've also run this patch through the pgbackrest regression tests without any
> problems.

Fantastic.

Please find attached an updated patch + commit message.  Mostly, I just
went through and did a bit more in terms of updating the documentation
and improving the comments (there were some places that were still
worrying about the chance of a 'stray' backup_label file existing, which
isn't possible any longer), along with some additional testing and
review.  This is looking pretty good to me, but other thoughts are
certainly welcome.  Otherwise, I'm hoping to commit this tomorrow.

Thanks!

Stephen
From 4e53c8252fb1ee32bf9688b553cf1f5876ab234b Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 4 Apr 2022 13:05:13 -0400
Subject: [PATCH] Remove exclusive backup mode

Exclusive-mode backups have been deprecated since 9.6 (when
non-exclusive backups were introduced) due to the issues
they can cause should the system crash while one is running and
generally because non-exclusive provides a much better interface.
Further, exclusive backup mode wasn't really being tested (nor was most
of the related code- like being able to log in just to stop an exclusive
backup and the bits of the state machine related to that) and having to
possibly deal with an exclusive backup and the backup_label file
existing during pg_basebackup, pg_rewind, etc, added other complexities
that we are better off without.

This patch removes the exclusive backup mode, the various special cases
for dealing with it, and greatly simplifies the online backup code and
documentation.

Authors: David Steele, Nathan Bossart
Reviewed-by: Chapman Flack
Discussion: https://postgr.es/m/ac7339ca-3718-3c93-929f-99e725d11...@pgmasters.net
https://postgr.es/m/CAHg+QDfiM+WU61tF6=npzocmzvhdzck47kneyb0zrulyzv5...@mail.gmail.com
---
 doc/src/sgml/backup.sgml  | 241 +---
 doc/src/sgml/func.sgml| 111 +---
 doc/src/sgml/high-availability.sgml   |   6 +-
 doc/src/sgml/monitoring.sgml  |   4 +-
 doc/src/sgml/ref/pg_ctl-ref.sgml  |   6 +-
 doc/src/sgml/ref/pgupgrade.sgml   |   2 +-
 doc/src/sgml/runtime.sgml |   8 +-
 src/backend/access/transam/xlog.c | 519 +++---
 src/backend/access/transam/xlogfuncs.c| 253 ++---
 src/backend/access/transam/xlogrecovery.c |  20 +-
 src/backend/catalog/system_functions.sql  |  18 +-
 src/backend/postmaster/postmaster.c   |  75 +--
 src/backend/replication/basebackup.c  |  20 +-
 src/backend/utils/init/postinit.c |  18 -
 src/bin/pg_basebackup/t/010_pg_basebackup.pl  |   4 +
 src/bin/pg_ctl/pg_ctl.c   |  30 -
 src/bin/pg_rewind/filemap.c   |   6 +-
 src/include/access/xlog.h |   7 +-
 src/include/catalog/pg_control.h  |   4 +-
 src/include/catalog/pg_proc.dat   |  28 +-
 src/include/libpq/libpq-be.h  |   3 +-
 src/include/miscadmin.h   |   4 -
 src/test/perl/PostgreSQL/Test/Cluster.pm  |  56 +-
 .../t/010_logical_decoding_timelines.pl   |   4 +-
 24 files changed, 247 insertions(+), 1200 deletions(-)

diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 0d69851bb1..fc2ec68758 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -857,18 +857,9 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_wal/0
 sequence, and that the success of a step is verified before
 proceeding to the next step.

-   
-Low level base backups can be made in a non-exclusive or an exclusive
-way. The non-exclusive method is recommended and the exclusive one is
-deprecated and will eventually be removed.
-   
-
-   
-Making a Non-Exclusive Low-Level Backup
 
- A non-exclusive low level backup is one that allows other
- concurrent backups to be running (both those started using
- the same backup API and those started using
+ Multiple backups are able to be run concurrently (both those
+ started using this backup API and those started using
  ).
 
 
@@ -881,34 +872,30 @@ test ! -f /mnt/server/archivedir/000100A90065 && cp pg_wal/0

 
  Connect to the server (it 

Re: SQL/JSON: JSON_TABLE

2022-04-06 Thread Stephen Frost
Greetings,

* Andrew Dunstan (and...@dunslane.net) wrote:
> On 4/6/22 09:20, Andrew Dunstan wrote:
> > On 4/5/22 22:21, Andres Freund wrote:
> >> On 2022-03-27 16:53:57 -0400, Andrew Dunstan wrote:
> >>> I'm therefore going to commit this series
> >> The new jsonb_sqljson test is, on my machine, the slowest test in the main
> >> regression tests:
> >>
> >> 4639 ms jsonb_sqljson
> >> 2401 ms btree_index
> >> 2166 ms stats_ext
> >> 2027 ms alter_table
> >> 1616 ms triggers
> >> 1498 ms brin
> >> 1489 ms join_hash
> >> 1367 ms foreign_key
> >> 1345 ms tuplesort
> >> 1202 ms plpgsql
> >>
> >> Any chance the slowness isn't required slowness?
> >
> >
> > I'll take a look.
> 
> I've committed a change that should reduce it substantially, but there
> might be more work to do.

All for improving the speed, but this broke the recovery tests (as
noticed by the buildfarm).  Maybe we should add
--no-unlogged-table-data to those pg_dumpall runs?

Thanks,

Stephen




Re: Kerberos delegation support in libpq and postgres_fdw

2022-04-06 Thread Stephen Frost
Greetings,

* Jacob Champion (pchamp...@vmware.com) wrote:
> On Fri, 2022-03-11 at 19:39 -0500, Stephen Frost wrote:
> > On Fri, Mar 11, 2022 at 18:55 Jacob Champion  wrote:
> > > [5] says we have to free the proxy credential with GSS_Release_cred();
> > > I don't see that happening anywhere, but I may have missed it.
> > 
> > I’m not sure that it’s really necessary or worthwhile to do that at
> > process end since … the process is about to end. I suppose we could
> > provide a function that a user could call to ask for it to be
> > released sooner if we really wanted..?
> 
> Do we have to keep the credential handle around once we've stored it
> into the MEMORY: cache, though? Just seems like a leak that someone
> will have to plug eventually, even if it doesn't really impact things
> now.

We don't, so I've fixed that in the attached.  Not sure it's that big a
deal but I don't think it hurts anything either.

> > > It seems like there should be significant security implications to
> > > allowing delegation across the board. Especially since one postgres_fdw
> > > might delegate to another server, and then another... Should this be
> > > opt-in, maybe via a connection parameter?
> > 
> > This is already opt-in- at kinit time a user can decide if they’d
> > like a proxy-able ticket or not.  I don’t know that we really need to
> > have our own option for it … tho I’m not really against adding such
> > an option either.
> 
> I don't really have experience with the use case. Is it normal for
> kinit users to have to decide once, globally, whether they want
> everything they interact with to be able to proxy their credentials? It
> just seems like you'd want more fine-grained control over who gets to
> masquerade as you.

Yes, that's pretty typical for kinit users- they usually go with
whatever the org policy is.  Now, you're not wrong about wanting more
fine-grained control, which is what's known as 'constrained delegation'.
That's something that Kerberos in general supports these days though
it's more complicated and requires additional code to do.  That's
something that I think we could certainly add later on.

> > > Similarly, it feels a little strange that the server would allow the
> > > client to unilaterally force the use of a delegated credential. I think
> > > that should be opt-in on the server side too, unless there's some
> > > context I'm missing around why that's safe.
> > 
> > Perhaps you could explain what isn’t safe about accepting a delegated
> > credential from a client..?  I am not aware of a risk to accepting
> > such a delegated credential.
> 
> My initial impression is that this is effectively modifying the USER
> MAPPING that the admin has set up. I'd be worried about an open
> credential proxy being used to bypass firewall or HBA restrictions, for
> instance -- you might not be able to connect as an admin from your
> machine, but you might be able to connect by bouncing through a proxy.
> (What damage you can do is going to be limited by what the server
> extensions can do, of course.)

I'm not sure that I really see the concern here.  Also, in order for
this to work, the user mapping would have to be created with "password
required = false".  Maybe that's something we revisit later, but it
seems like a good way to allow an admin to have control over this.

> Another danger might be disclosure/compromise of middlebox secrets? Is
> it possible for someone who has one half of the credentials to snoop on
> a gssenc connection between the proxy Postgres and the backend
> Postgres?

A compromised middlebox would, of course, be an issue- for any kind of
delegated credentials (which certainly goes for cleartext passwords
being passed along, and that's currently the only thing we support..).
One nice thing about GSSAPI is that the client and the server validate
each other, so it wouldn't just be 'any' middle-box but would have to be
one that was actually a trusted system in the infrastructure which has
somehow been compromised and was still trusted.

> > Even so, I’m not against adding an option… but exactly how would that
> > option be configured?  Server level?  On the HBA line?  role level..?
> 
> In the OPTIONS for CREATE SERVER, maybe? At least for the FDW case.

I'm a bit confused on this.  The option to allow or not allow delegated
credentials couldn't be something that's in the CREATE SERVER for FDWs
as it applies to more than just FDWs but also dblink and anything else
where we reach out from PG to contact some other system.

> > > If I'm

Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-04-06 Thread Stephen Frost
Greetings,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> On Tue, 2022-04-05 at 13:06 -0700, Nathan Bossart wrote:
> > On Tue, Apr 05, 2022 at 11:25:36AM -0400, Stephen Frost wrote:
> > > Please find attached an updated patch + commit message.  Mostly, I just
> > > went through and did a bit more in terms of updating the documentation
> > > and improving the comments (there were some places that were still
> > > worrying about the chance of a 'stray' backup_label file existing, which
> > > isn't possible any longer), along with some additional testing and
> > > review.  This is looking pretty good to me, but other thoughts are
> > > certainly welcome.  Otherwise, I'm hoping to commit this tomorrow.
> > 
> > LGTM!
> 
> Cassandra (not the software) from the sidelines predicts that we will
> get some fire from users for this, although I concede the theoretical
> sanity of the change.

Great, thanks for that.

This has now been committed- thanks again to everyone for their help!

Stephen




Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Apr 7, 2022 at 3:27 PM Peter Geoghegan  wrote:
> > I just meant that it wouldn't be reasonable to impose a fixed cost on
> > every user, even those not using the feature. Which you said yourself.
> 
> Unfortunately, I think there's bound to be some cost. We can avoid
> using the space in the page for every user, but anything that makes
> the page layout variable is going to cost some number of CPU cycles
> someplace. We have to measure that overhead and see if it's small
> enough that we're OK with it.

Agreed that there may be some cost in CPU cycles for folks who don't
initialize the database with an option that needs this, but hopefully
that'll be quite small.

> > I got that much, of course. That will work, I suppose, but it'll be
> > the first and last time that anybody gets to do that (unless we accept
> > it being incompatible with encryption).
> 
> Yeah.

I don't know that I agree with this being the 'first and last time'..?
If we have two options that could work together and each need some
amount of per-page space, such as a nonce or authentication tag, we'd
just need to be able to track which of those are enabled (eg: through
pg_control) and then know which gets what space.  I don't see why we
couldn't add something today and then add something else later on.  I do
think there'll likely be cases where we have two options that don't make
sense together and we wouldn't wish to allow that (such as page-level
strong checksums and authenticated encryption, as the latter provides
the former inherently) but there could certainly be things that do work
together too.

> > > If we *didn't* put the nonce at the end of the page, where else would
> > > we put it? It has to be at a fixed offset, because otherwise you can't
> > > find it without decrypting the page first, which would be circular.
> >
> > Immediately before the special area proper (say BTOpaque), which would
> > "logically come after" the special space under this scheme. You
> > wouldn't have a simple constant offset into the page, but you'd have
> > something not too far removed from such a constant. It could work as a
> > constant with minimal context (just the AM type). Just like with
> > Matthias' patch.
> 
> I don't think this would work, because I don't think it would be
> practical to always know the AM type. Think about applying an XLOG_FPI
> record, for example.

I'm also doubtful about how well this would work, but the other question
is- what would be the advantage to doing it this way?  If we could have
it be run-time instead of initdb-time, that'd be great (imagine a
database that's encrypted while another isn't in the same cluster, or
even individual tables, which would all be very cool), but I don't think
this approach would make that possible..?

(sorry for only just seeing this thread getting traction, have been a
bit busy with other things ...)

Thanks,

Stephen




Re: Preventing indirection for IndexPageGetOpaque for known-size page special areas

2022-04-07 Thread Stephen Frost
Greetings,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Thu, Apr 7, 2022 at 12:37 PM Robert Haas  wrote:
> > On Thu, Apr 7, 2022 at 3:27 PM Peter Geoghegan  wrote:
> > > I just meant that it wouldn't be reasonable to impose a fixed cost on
> > > every user, even those not using the feature. Which you said yourself.
> >
> > Unfortunately, I think there's bound to be some cost. We can avoid
> > using the space in the page for every user, but anything that makes
> > the page layout variable is going to cost some number of CPU cycles
> > someplace. We have to measure that overhead and see if it's small
> > enough that we're OK with it.
> 
> I wouldn't go so far as to say that any non-zero overhead is not okay
> (that sounds really extreme). I would only say this much: wasting
> non-trivial amounts of space on every page must not happen. If the
> user opts-in to the feature then it's not just waste, so that's
> perfectly okay. (I imagine that we agree on this much already.)

Right, the *space* wouldn't ever be wasted- either the user opts-in and
the space is used for what the user asked it to be used for, or the
user doesn't select that option and we don't reserve that space.
Robert's point was that by allowing this to happen at initdb time, there
may be some CPU cycles that end up getting spent, even if the user
didn't opt-in, that would be saved if this decision was made at compile
time.  For my 2c, I'd think that would be our main concern with this,
but I also am hopeful and strongly suspect that the extra CPU cycles
aren't likely to be much and therefore would be acceptable.

A thought that's further down the line would be the question of if we
could get those CPU cycles back (and perhaps more..) by using JIT.

> > > Immediately before the special area proper (say BTOpaque), which would
> > > "logically come after" the special space under this scheme. You
> > > wouldn't have a simple constant offset into the page, but you'd have
> > > something not too far removed from such a constant. It could work as a
> > > constant with minimal context (just the AM type). Just like with
> > > Matthias' patch.
> >
> > I don't think this would work, because I don't think it would be
> > practical to always know the AM type. Think about applying an XLOG_FPI
> > record, for example.
> 
> There are already some pretty shaky heuristics that are used by tools
> like pg_filedump for this exact purpose. But they work! And they're
> even supported by Postgres, quasi-officially -- grep for "pg_filedump"
> to see what I mean.

Shaky heuristics feels like something appropriate for pg_filedump.. less
so for things like TDE.

> There are numerous reasons why we might want to put that on a formal
> footing, so that we can reliably detect the AM type starting from only
> a page image. I suspect that you're going to end up needing to account
> for this index AMs anyway, so going this way isn't necessarily making
> your life all that much harder. (I am not really sure about that,
> though, since I don't have enough background information.)

This seems like it'd be pretty difficult given that AMs can be loaded as
extensions..?  Also seems like a much bigger change and I'm still not
sure what the advantage is, at least in terms of what this particular
patch is doing.

Thanks,

Stephen




Re: Kerberos delegation support in libpq and postgres_fdw

2022-04-07 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Jacob Champion (pchamp...@vmware.com) wrote:
> > On Fri, 2022-03-11 at 19:39 -0500, Stephen Frost wrote:
> > > Even so, I’m not against adding an option… but exactly how would that
> > > option be configured?  Server level?  On the HBA line?  role level..?
> > 
> > In the OPTIONS for CREATE SERVER, maybe? At least for the FDW case.
> 
> I'm a bit confused on this.  The option to allow or not allow delegated
> credentials couldn't be something that's in the CREATE SERVER for FDWs
> as it applies to more than just FDWs but also dblink and anything else
> where we reach out from PG to contact some other system.

Thinking it through further, it seems like the right place to allow an
administrator to control if credentials are allowed to be delegated is
through a pg_hba option.  Attached patch adds such an option.

> > > > Also, we're
> > > > globally ignoring whatever ccache was set by an administrator. Can't
> > > > two postgres_fdw connections from the same backend process require
> > > > different settings?
> > > 
> > > Settings..? Perhaps, but delegated credentials aren’t really 
> > > settings, so not really sure what you’re suggesting here.
> > 
> > I mean that one backend server might require delegated credentials, and
> > another might require whatever the admin has already set up in the
> > ccache, and the user might want to use tables from both servers in the
> > same session.
> 
> That an admin might have a credential cache that's picked up and used
> for connections from a regular user backend to another system strikes me
> as an altogether concerning idea.  Even so, in such a case, the admin
> would have had to set up the user mapping with 'password required =
> false' or it wouldn't have worked for a non-superuser anyway, so I'm not
> sure that I'm too worried about this case.

To address this, I also added a new GUC which allows an administrator to
control what the credential cache is set to for user-authenticated
backends, with a default of MEMORY:, which should generally be safe and
won't cause a user backend to pick up on a file-based credential cache
which might exist on the server somewhere.  This gives the administrator
the option to set it to more-or-less whatever they'd like though, so if
they want to set it to a file-based credential cache, then they can do
so (I did put some caveats about doing that into the documentation as I
don't think it's generally a good idea to do...).
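For illustration, the effect of such a default could look roughly like the sketch below. The function name is an assumption made up for this sketch, not the patch's actual code; the only real ingredients are that MIT Kerberos resolves the default cache from KRB5CCNAME and that a MEMORY: cache is process-private:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: point a user-authenticated backend at a
 * process-private credential cache, in the spirit of a krb_user_ccache
 * GUC defaulting to "MEMORY:".
 */
static void
set_backend_ccache(const char *ccache)
{
	/*
	 * A "MEMORY:" cache is private to this process and disappears when
	 * it exits, so a user backend can't pick up a file-based cache that
	 * happens to exist for the 'postgres' unix account on the server.
	 */
	setenv("KRB5CCNAME", ccache, 1);
}
```

An administrator who really wants a shared, file-based cache could still configure one, with the caveats noted above.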

> > > > I notice that gss_store_cred_into() has a companion,
> > > > gss_acquire_cred_from(). Is it possible to use that to pull out our
> > > > delegated credential explicitly by name, instead of stomping on the
> > > > global setup?
> > > 
> > > Not really sure what is meant here by global setup..?  Feeling like
> > > this is a follow on confusion from maybe mixing server vs client
> > > libpq?
> > 
> > By my reading, the gss_store_cred_into() call followed by
> > the setenv("KRB5CCNAME", ...) is effectively performing global
> > configuration for the process. Any KRB5CCNAME already set up by the
> > server admin is going to be ignored from that point onward. Is that
> > accurate?
> 
> The process, yes, but I guess I disagree on that being 'global'- it's
> just for that PG backend process.

The new krb_user_ccache is a lot closer to 'global', though it's
specifically for user-authenticated backends (allowing the postmaster
and other things like replication connections to use whatever the
credential cache is set to by the administrator on startup), but that
seems like it makes sense to me- generally you're not going to want
regular user backends to be accessing the credential cache of the
'postgres' unix account on the server.

> Attached is an updated patch which adds the gss_release_creds call, a
> function in libpq to allow checking if the libpq connection was made
> using GSS, changes to dblink to have it check for password-or-gss when
> connecting to a remote system, and tests for dblink and postgres_fdw to
> make sure that this all works correctly.

I've added a couple more tests to address the new options too, along
with documentation for them.  This is starting to feel reasonably decent
to me, at least as a first pass at supporting kerberos credential
delegation, which is definitely a feature I've been hoping we would get
into PG for quite a while.  Would certainly appreciate some feedback on
this (from anyone who'd like to comment), though I know we're getting
into the last few 

Re: Weird failure with latches in curculio on v15

2023-03-01 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Sat, Feb 25, 2023 at 11:00:31AM -0800, Andres Freund wrote:
> > TBH, I think the current archive and restore module APIs aren't useful. I
> > think it was a mistake to add archive modules without having demonstrated
> > that one can do something useful with them that the restore_command didn't
> > already do. If anything, archive modules have made it harder to improve
> > archiving performance via concurrency.
> 
> I must respectfully disagree that this work is useless.  Besides the
> performance and security benefits of not shelling out for every WAL file,
> I've found it very useful to be able to use the standard module framework
> to develop archive modules.  It's relatively easy to make use of GUCs,
> background workers, compression, etc.  Of course, there is room for
> improvement in areas like concurrency support as you rightly point out, but
> I don't think that makes the current state worthless.

Would be great to see these archive modules, perhaps it would help show
how this functionality is useful and what could be done in core to make
things easier for the archive module.

Thanks,

Stephen




Re: Non-superuser subscription owners

2023-03-01 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Feb 28, 2023 at 4:01 PM Jeff Davis  wrote:
> > Or default expressions, I presume. If we at least agree on this point,
> > then I think we should try to find a way to treat these other hunks of
> > code in a secure way (which I think is what Andres was suggesting).
> 
> Yeah, or any other expressions. Basically impose restrictions when the
> user running the code is not the same as the user who provided the
> code.

Would this have carve-outs for things like "except if the user providing
the code is trusted/superuser"?  Seems like that would be necessary for
the function to be able to do more-or-less anything, but then I worry
that there's superuser-owned code which could leak information or be
used by a malicious owner as that code would still be running as the
invoking user..  Perhaps we could say that the function also has to be
leakproof, but that isn't quite the same issue and therefore it seems
like we'd need to decorate all of the functions with another flag that's
allowed to be run in this manner.

Random thought- what if the function has a NOTIFY in it with a payload
of some kind of sensitive information?

> > Unless the trusted roles defaults to '*', then I think it will still
> > break some things.
> 
> Definitely. IMHO, it's OK to break some things, certainly in a major
> release and maybe even in a minor release. But we don't want to break
> more things that we really need to break. And as you say, we want the
> restrictions to be comprehensible.

Really hard to say if whatever this is is OK for back-patching and
breaking minor releases without knowing exactly what is getting broken
... but it'd have to be a very clear edge case of what gets broken for
it to be sensible for breaking in a minor release without a very clear
vulnerability or such that's being fixed with a simple work-around.
Just making all auditing triggers break in a minor release certainly
wouldn't be acceptable, as an example that I imagine we all agree with.

> > Each of these ideas and sub-ideas affect the semantics, and should be
> > documented. But how do we document that some code runs as you, some as
> > the person who wrote it, sometimes we obey SECURITY INVOKER and
> > sometimes we ignore it and use DEFINER semantics, some code is outside
> > a function and always executes as the invoker, some code has some
> > security flags, and some code has more security flags, code can change
> > between the time you look at it and the time it runs, and it's all
> > filtered through GUCs with their own privilege sub-language?
> >
> > OK, let's assume that we have all of that documented, then how do we
> > guide users on what reasonable best practices are for the GUC settings,
> > etc.? Or do we just say "this is mechanically how all these parts work,
> > good luck assembling it into a secure system!". [ Note: I feel like
> > this is the state we are in now. Even if technically we don't have live
> > security bugs that I'm aware of, we are setting users up for security
> > problems. ]
> >
> > On the other hand, if we focus on executing code as the user who wrote
> > it in most places, then the documentation will be something like: "you
> > defined the table, you wrote the code, it runs as you, here are some
> > best practices for writing secure code". And we have some different
> > documentation for writing a cool SECURITY INVOKER function and how to
> > get other users to trust you enough to run it. That sounds a LOT more
> > understandable for users.
> 
> What I was imagining is that we would document something like: A table
> can have executable code associated with it in a variety of ways. For
> example, it can have triggers, default expressions, check constraints,
> or row-level security filters. In most cases, these expressions are
> executed with the privileges of the user performing the operation on
> the table, except when SECURITY DEFINER functions are used. Because
> these expressions are set by the table owner and executed by the users
> accessing the table, there is a risk that the table owner could
> include malicious code that usurps the privileges of the user
> accessing the table. For this reason, these expressions are, by
> default, restricted from doing . If you want to allow those
> operations, you can .

Well, one possible answer to 'something' might be 'use SECURITY DEFINER
functions which are owned by a role allowed to do '.  Note that
that doesn't have to be the table owner though, it could be a much more
constrained role.  That approach would allow us to leverage the existing
GRANT/RLS/et al system for what's allowed and avoid having to create new
things like a complex permission system inside of a GUC for users to
have to understand.

> I agree that running code as the table owner is helpful in a bunch of
> scenarios, but I also don't think it fixes everything. You earlier
> mentioned that switching to the table owner seems to be just a way

Re: a very minor bug and a couple of comment changes for basebackup.c

2023-03-06 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> Thanks for the review. I have committed the patches.

No objections to what was committed.

> On Thu, Mar 2, 2023 at 2:59 AM Michael Paquier  wrote:
> > There is more to it: the page LSN is checked before its checksum.
> > Hence, if the page's LSN is corrupted in a way that it is higher than
> > sink->bbs_state->startptr, the checksum verification is just skipped
> > while the page is broken but not reported as such.  (Spoiler: this has
> > been mentioned in the past, and maybe we'd better remove this stuff in
> > its entirety.)
> 
> Yep. It's pretty embarrassing that we have a checksum verification
> feature that has both known ways of producing false positives and
> known ways of producing false negatives and we have no plan to ever
> fix that, we're just going to keep shipping what we've got. I think
> it's pretty clear that the feature shouldn't have been committed like
> this; valid criticisms of the design were offered and simply not
> addressed, not even by updating the comments or documentation with
> disclaimers. I find the code in sendFile() pretty ugly, too. For all
> of that, I'm a bit uncertain whether ripping it out is the right thing
> to do. It might be (I'm not sure) that it tends to work decently well
> in practice. Like, yes, it could produce false checksum warnings, but
> does that actually happen to people? It's probably not too likely that
> a read() or write() of 8kB gets updated after doing only part of the
> I/O, so the concurrent read or write is fairly likely to be on-CPU, I
> would guess, and therefore this algorithm might kind of work OK in
> practice despite begin garbage on a theoretical level. Similarly, the
> problems with how the LSN is vetted make it likely that a page
> replaced with random garbage will go undetected, but a page where a
> few bytes get flipped in a random position within the page is likely
> to get caught, and maybe the latter is actually a bigger risk than the
> former. I don't really know. I'd be interested to hear from anyone
> with a lot of practical experience using the feature. A few anecdotes
> of the form "this routine fails to tell us about problems" or "this
> often complains about problems that are not real" or "this has really
> helped us out on a number of occasions and we have had no problems
> with it" would be really helpful.

The concern about the LSN is certainly a valid one and we ended up
ripping that check out of pgbackrest in favor of taking a different
approach- simply re-read the page and see if it changed.  If it changed,
then we punt and figure that it was a hot page that PG was actively
writing to and so it'll be in the WAL and we don't have to worry about
it.  We're generally concerned more about latent on-disk corruption
that goes missed than about some kind of in-memory corruption; the page
changing in the filesystem cache without some other process writing to
it just seems awfully unlikely.

https://github.com/pgbackrest/pgbackrest/commit/9eec98c61302121134d2067326dbd2cd0f2f0b9c

From a practical perspective, while this has the aforementioned risk
of the loop running fast enough that it re-reads the same page before
the I/O on the page has progressed at all, I'm not aware of even one
report of that actually happening.  We absolutely have seen
cases where the first read picks up a torn page and that's not even
uncommon in busy environments.
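The re-read strategy described above can be sketched roughly as follows. This is a simplified illustration rather than pgbackrest's actual implementation, and the simulated storage with its trivial "checksum" exists only so the demo is self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BLCKSZ 8192

typedef bool (*read_page_fn) (void *buf, long blkno);
typedef bool (*verify_page_fn) (const void *page);

/*
 * If a page fails checksum verification, read it a second time; if the
 * two reads differ, assume it was a hot page being concurrently written
 * (its changes will be in the WAL) rather than latent corruption.  Only
 * a page that is both stable and invalid gets reported.
 */
static bool
page_is_corrupt(read_page_fn read_page, verify_page_fn verify_checksum,
				long blkno)
{
	char		first[BLCKSZ];
	char		second[BLCKSZ];

	if (!read_page(first, blkno))
		return false;			/* read failures handled elsewhere */
	if (verify_checksum(first))
		return false;			/* checksum OK */

	/* Checksum failed: re-read and compare the raw bytes. */
	if (!read_page(second, blkno))
		return false;
	if (memcmp(first, second, BLCKSZ) != 0)
		return false;			/* page changed under us: torn, not corrupt */

	return true;
}

/* --- tiny simulated storage, purely for demonstration --- */
static char sim_disk[2][BLCKSZ];

static bool
sim_read(void *buf, long blkno)
{
	memcpy(buf, sim_disk[blkno], BLCKSZ);
	return true;
}

static bool
sim_verify(const void *page)
{
	/* stand-in "checksum": valid iff the first byte is zero */
	return ((const char *) page)[0] == 0;
}
```

The residual risk is exactly the one noted: both reads landing inside the same in-progress write, which in practice hasn't been reported.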

One of the ideas I've had around how to address all of this is to not
depend on inferring things from what's been written out but instead to
just ask PG and that's one of the (relatively few...) benefits that I
see to an archive_library- the ability to check if a given page is in
shared buffers and, if so, what its LSN is to see if it's past the start
of the backup.  Given that pg_basebackup is already working with a lot
of server-side code.. perhaps this could be a direction to go in for it.

Another idea we played around with was keeping track of the LSNs of the
pages with invalid checksums and checking that they fall within the
range of start LSN-end LSN of the backup; while that wouldn't
completely prevent corrupted LSNs from skipping detection using the LSN
approach, it'd sure make it a whole lot less likely.
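That check might look something like this minimal sketch (types simplified; not actual backup-tool code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;	/* simplified stand-in for the real type */

/*
 * A page with a bad checksum is only excused on LSN grounds if its
 * claimed LSN falls within the backup's start/end window; a corrupted
 * LSN field is very unlikely to land in that narrow range by accident.
 */
static bool
lsn_excuses_bad_checksum(XLogRecPtr page_lsn,
						 XLogRecPtr backup_start_lsn,
						 XLogRecPtr backup_end_lsn)
{
	return page_lsn >= backup_start_lsn && page_lsn <= backup_end_lsn;
}
```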

Lastly, I've also wondered about trying to pull (clean) pages from
shared_buffers directly instead of reading them off of disk at all,
perhaps using a ring buffer in shared_buffers to read them in if
they're not already there, and letting all the usual checksum validation
and such happening with PG handling it.  Dirty pages would have to be
handled though, presumably by locking them and then reading the page
from disk as I don't think folks would be pleased if we decided to
forcibly write them out.  For an incremental backup, if you have
reliable knowledge that the page is in the WAL from PG, perhaps you
could even skip the page entirely.

Independently of verifying PG checksums, we've also conside

Re: Moving forward with TDE

2023-03-08 Thread Stephen Frost
Greetings,

* Chris Travers (chris.trav...@gmail.com) wrote:
> From the documentation, the primary threat model of TDE is to prevent 
> decryption of data from archived wal segments (and data files), for example 
> on a backup system.  While there are other methods around this problem to 
> date, I think that this feature is worth pursuing for that reason.  I want to 
> address a couple of reasons for this and then go into some reservations I 
> have about how some of this is documented.

Agreed, though the latest efforts include an option for *authenticated*
encryption as well as unauthenticated.  That makes it much more
difficult to make undetected changes to the data that's protected by
the authenticated encryption being used.

> There are current workarounds to ensuring encryption at rest, but these have 
> a number of problems.  Encryption passphrases end up lying around the system 
> in various places.  Key rotation is often difficult.  And one mistake can 
> easily render all efforts ineffective.  TDE solves these problems.  The 
> overall design from the internal docs looks solid.  This definitely is 
> something I would recommend for many users.

There's clearly user demand for it as there's a number of organizations
who have forks which are providing it in one shape or another.  This
kind of splintering of the community is actually an actively bad thing
for the project and is part of what killed Unix, by at least some pretty
reputable accounts, in my view.

> I have a couple small caveats though.  Encryption of data is a large topic 
> and there isn't a one-size-fits-all solution to industrial or state 
> requirements.  Having all this key management available in PostgreSQL is a 
> very good thing.  Long run it is likely to end up being extensible, and 
> therefore both more powerful and offering a wider range of choices for 
> solution architects.  Implementing encryption is also something that is easy 
> to mess up.  For this reason I think it would be great if we had a 
> standardized format for discussing encryption options that we could use going 
> forward.  I don't think that should be held against this patch but I think we 
> need to start discussing it now because it will be a bigger problem later.

Do you have a suggestion as to the format to use?

> A second caveat I have is that key management is a topic where you really 
> need a good overview of internals in order to implement effectively.  If you 
> don't know how an SSL handshake works or what is in a certificate, you can 
> easily make mistakes in setting up SSL.  I can see the same thing happening 
> here.  For example, I don't think it would be safe to leave the KEK on an 
> encrypted filesystem that is decrypted at runtime (or at least I wouldn't 
> consider that safe -- your appetite for risk may vary).

Agreed that we should document this and make clear that the KEK is
necessary for server start but absolutely should be kept as safe as
possible and certainly not stored on disk somewhere nearby the encrypted
cluster.

> My proposal would be to have build a template for encryption options in the 
> documentation.  This could include topics like SSL as well.  In such a 
> template we'd have sections like "Threat model,"  "How it works," 
> "Implementation Requirements" and so forth.  Again I don't think this needs 
> to be part of the current patch but I think it is something we need to start 
> thinking about now.  Maybe after this goes in, I can present a proposed 
> documentation patch.

I'm not entirely sure that it makes sense to lump this and TLS in the
same place as they end up being rather independent at the end of the
day.  If you have ideas for how to improve the documentation, I'd
certainly encourage you to go ahead and work on that and submit it as a
patch rather than waiting for this to actually land in core.  Having
good and solid documentation is something that will help this get in,
after all, and to the extent that it's covering existing topics like
TLS, those could likely be included independently and that would be of
benefit to everyone.

> I will also note that I don't consider myself to be very qualified on topics 
> like encryption.  I can reason about key management to some extent but some 
> implementation details may be beyond me.  I would hope we could get some 
> extra review on this patch set soon.

Certainly agree with you there though there's an overall trajectory of
patches involved in all of this that's a bit deep.  The plan is to
discuss that at PGCon (On the Road to TDE) and at the PGCon
Unconference after.  I certainly hope those interested will be there.
I'm also happy to have a call with anyone interested in this effort
independent of that, of course.

Thanks!

Stephen




Re: Improve logging when using Huge Pages

2023-03-09 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote:
> > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote:
> >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote:
> >>> I'm curious why you chose to make this a string instead of an enum.  There
> >>> might be little practical difference, but since there are only three
> >>> possible values, I wonder if it'd be better form to make it an enum.
> >> 
> >> It takes more code to write as an enum - see 002.txt.  I'm not convinced
> >> this is better.
> >> 
> >> But your comment made me fix its , and reconsider the strings,
> >> which I changed to active={unknown/true/false} rather than {unk/on/off}.
> >> It could also be active={unknown/yes/no}...
> > 
> > I think unknown/true/false is fine.  I'm okay with using a string if no one
> > else thinks it should be an enum (or a bool).
> 
> There's been no response for this, so I guess we can proceed with a string
> GUC.

Just happened to see this and I'm not really a fan of this being a
string when it's pretty clear that's not what it actually is.

> +Reports whether huge pages are in use by the current instance.
> +See  for more information.
> 
> I still think we should say "server" in place of "current instance" here.

We certainly use 'server' a lot more in config.sgml than we do
'instance'.  "currently running server" might be closer to how we
describe a running PG system in other parts (we talk about "currently
running server processes", "while the server is running", "When running
a standby server", "when the server is running"; "instance" is used much
less and seems to more typically refer to 'state of files on disk' in my
reading vs. 'actively running process' though there's some of each).

> + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS,
> + gettext_noop("Indicates whether huge pages are in use."),
> + NULL,
> + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
> + },
> 
> I don't think we need to use GUC_RUNTIME_COMPUTED.  'postgres -C' seems to
> always report "unknown" for this GUC, so all this would do is cause that
> command to error unnecessarily when the server is running.

... or we could consider adjusting things to actually try the mmap() and
find out if we'd end up with huge pages or not.  Naturally we'd only
want to do that if the server isn't running though and erroring if it is
would be perfectly correct.  Either that or just refusing to provide it
by an error or other approach (see below) seems entirely reasonable.
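A minimal sketch of that "just try the mmap()" check, assuming Linux's MAP_HUGETLB flag. This is illustrative only; the server's real huge-page logic (in sysv_shmem.c) also handles huge page size selection and fallback:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Attempt an anonymous huge-page mapping of the requested size and
 * report whether it succeeded.  Note the result can change from one
 * moment to the next as free huge pages come and go.
 */
static bool
huge_pages_would_work(size_t size)
{
#ifdef MAP_HUGETLB
	void	   *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
						 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (p == MAP_FAILED)
		return false;
	munmap(p, size);
	return true;
#else
	(void) size;
	return false;				/* no huge-page support on this platform */
#endif
}
```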

> It might be worth documenting exactly what "unknown" means.  IIUC you'll
> only ever see "on" or "off" via SHOW or pg_settings, which doesn't seem
> tremendously obvious.

If we could get rid of that case and just make this a boolean, that
seems like it'd really be the best answer.

To that end- perhaps this isn't appropriate as a GUC at all?  Why not
have this just be a system information function?  Functionally it really
provides the same info- if the server is running then you get back
either true or false, if it's not then you can't call it but that's
hardly different from an unknown or error result.

Thanks,

Stephen




Re: [PATCH] Align GSS and TLS error handling in PQconnectPoll()

2023-03-09 Thread Stephen Frost
Greetings,

* Michael Paquier (mich...@paquier.xyz) wrote:
> On Fri, Feb 17, 2023 at 09:01:43AM -0800, Jacob Champion wrote:
> > On Thu, Feb 16, 2023 at 10:59 PM Michael Paquier wrote:
> >> I am adding Stephen Frost
> >> in CC to see if he has any comments about all this part of the logic
> >> with gssencmode.
> > 
> > Sounds good.
> 
> Hearing nothing on this part, perhaps we should just move on and
> adjust the behavior on HEAD?  Thats seems like one step in the good
> direction.  If this brews right, we could always discuss a backpatch
> at some point, if necessary.
> 
> Thoughts from others?

I agree with matching how SSL is handled here and in a review of the
patch proposed didn't see any issues with it.  Seems like it's probably
something that should also be back-patched and it doesn't look terribly
risky to do so, is there a specific risk that you see?

Thanks,

Stephen




Re: Disable rdns for Kerberos tests

2023-03-09 Thread Stephen Frost
Greetings,

* Heikki Linnakangas (hlinn...@iki.fi) wrote:
> On 25 February 2023 00:50:30 EET, Stephen Frost  wrote:
> >Thanks for reviewing!  Comments added and updated the commit message.
> >
> >Unless there's anything else, I'll push this early next week.
> 
> s/capture portal/captive portal/. Other than that, looks good to me.

Pushed, thanks again!

Stephen




Re: Disable rdns for Kerberos tests

2023-03-09 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > Push, thanks again!
> 
> Why'd you only change HEAD?  Isn't the test equally fragile in the
> back branches?

We hadn't had any complaints about it and so I wasn't sure if it was
useful to back-patch it.  I'm happy to do so though.

Thanks,

Stephen




Re: Improve logging when using Huge Pages

2023-03-09 Thread Stephen Frost
Greetings,

* Justin Pryzby (pry...@telsasoft.com) wrote:
> On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote:
> > * Nathan Bossart (nathandboss...@gmail.com) wrote:
> > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote:
> > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote:
> > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote:
> > > >>> I'm curious why you chose to make this a string instead of an
> > > >>> enum.  There might be little practical difference, but since there
> > > >>> are only three possible values, I wonder if it'd be better form to
> > > >>> make it an enum.
> > > >> 
> > > >> It takes more code to write as an enum - see 002.txt.  I'm not
> > > >> convinced this is better.
> > > >> 
> > > >> But your comment made me fix its , and reconsider the strings,
> > > >> which I changed to active={unknown/true/false} rather than
> > > >> {unk/on/off}.  It could also be active={unknown/yes/no}...
> > > > 
> > > > I think unknown/true/false is fine.  I'm okay with using a string if
> > > > no one else thinks it should be an enum (or a bool).
> > > 
> > > There's been no response for this, so I guess we can proceed with a string
> > > GUC.
> > 
> > Just happened to see this and I'm not really a fan of this being a
> > string when it's pretty clear that's not what it actually is.
> 
> I don't understand what you mean by that.
> Why do you say it isn't a string ?

Because it's clearly a yes/no, either the server is currently running
with huge pages, or it isn't.  That's the definition of a boolean.
Sure, anything can be cast to text but when there's a data type that
fits better, that's almost uniformly better to use.

> > > +Reports whether huge pages are in use by the current instance.
> > > +See  for more information.
> > > 
> > > I still think we should say "server" in place of "current instance" here.
> > 
> > We certainly use 'server' a lot more in config.sgml than we do
> > 'instance'.  "currently running server" might be closer to how we
> > describe a running PG system in other parts (we talk about "currently
> > running server processes", "while the server is running", "When running
> > a standby server", "when the server is running"; "instance" is used much
> > less and seems to more typically refer to 'state of files on disk' in my
> > reading vs. 'actively running process' though there's some of each).
> 
> I called it "instance" since the GUC has no meaning when it's not
> running.  I'm fine to rename it to "running server".

Great, I do think that would match better with the rest of the
documentation.

> > > + {"huge_pages_active", PGC_INTERNAL, PRESET_OPTIONS,
> > > + gettext_noop("Indicates whether huge pages are in use."),
> > > + NULL,
> > > + GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE | GUC_RUNTIME_COMPUTED
> > > + },
> > > 
> > > I don't think we need to use GUC_RUNTIME_COMPUTED.  'postgres -C' seems to
> > > always report "unknown" for this GUC, so all this would do is cause that
> > > command to error unnecessarily when the server is running.
> > 
> > ... or we could consider adjusting things to actually try the mmap() and
> > find out if we'd end up with huge pages or not.
> 
> That seems like a bad idea, since it might work one moment and fail one
> moment later.  If we could tell in advance whether it was going to work,
> we wouldn't be here, and probably also wouldn't have invented
> huge_pages=true.

Sure it might ... but I tend to disagree that it's actually all that
likely for it to end up being as inconsistent as that and it'd be nice
to be able to see if the server will end up successfully starting (for
this part, at least), or not, when configured with huge pages set to on,
or if starting with 'try' is likely to result in it actually using huge
pages, or not.

> > > It might be worth documenting exactly what "unknown" means.  IIUC you'll
> > > only ever see "on" or "off".

Re: Improve logging when using Huge Pages

2023-03-09 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> On 2023-Mar-09, Justin Pryzby wrote:
> > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote:
> > > > +Reports whether huge pages are in use by the current instance.
> > > > +See  for more information.
> > > > 
> > > > I still think we should say "server" in place of "current instance" 
> > > > here.
> > > 
> > > We certainly use 'server' a lot more in config.sgml than we do
> > > 'instance'.  "currently running server" might be closer to how we
> > > describe a running PG system in other parts (we talk about "currently
> > > running server processes", "while the server is running", "When running
> > > a standby server", "when the server is running"; "instance" is used much
> > > less and seems to more typically refer to 'state of files on disk' in my
> > > reading vs. 'actively running process' though there's some of each).
> > 
> > I called it "instance" since the GUC has no meaning when it's not
> > running.  I'm fine to rename it to "running server".
> 
> I'd rather make all these other places use "instance" instead.  We used
> to consider these terms interchangeable, but since we introduced the
> glossary to unify the terminology, they are no longer supposed to be.
> A server (== a machine) can contain many instances, and each individual
> instance in the server could be using huge pages or not.

Perhaps, but then the references to instance should probably also be
changed since there's certainly some that are referring to a set of data
files and not to backend processes, eg:

##
The shutdown setting is useful to have the instance
ready at the exact replay point desired.  The instance will still be
able to replay more WAL records (and in fact will have to replay WAL
records since the last checkpoint next time it is started).
##

Not sure about just changing one thing at a time though or using the
'right' term when introducing things while having the 'wrong' term be
used next to it.  Also not saying that this patch should be responsible
for fixing everything either.  'Server' in the glossary does explicitly
say it could be used when referring to an 'instance' too though, so
'running server' doesn't seem to be entirely wrong.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Sub-millisecond [autovacuum_]vacuum_cost_delay broken

2023-03-09 Thread Stephen Frost
Greetings,

* Thomas Munro (thomas.mu...@gmail.com) wrote:
> On Fri, Mar 10, 2023 at 10:26 AM Melanie Plageman
>  wrote:
> > I think that 4753ef37e0ed undid the work caf626b2c did to support
> > sub-millisecond delays for vacuum and autovacuum.
> >
> > After 4753ef37e0ed, vacuum_delay_point()'s local variable msec is a
> > double which, after being passed to WaitLatch() as timeout, which is a
> > long, ends up being 0, so we don't end up waiting AFAICT.
> >
> > When I set [autovacuum_]vacuum_cost_delay to 0.5, SHOW will report that
> > it is 500us, but WaitLatch() is still getting 0 as timeout.
> 
> Given that some of the clunkier underlying kernel primitives have
> milliseconds in their interface, I don't think it would be possible to
> make a usec-based variant of WaitEventSetWait() that works everywhere.
> Could it possibly make sense to do something that accumulates the
> error, so if you're using 0.5 then every second vacuum_delay_point()
> waits for 1ms?

Hmm.  That generally makes sense to me.. though isn't exactly the same.
Still, I wouldn't want to go back to purely pg_usleep() as that has the
other downsides mentioned.

Perhaps if the delay is sub-millisecond, explicitly do the WaitLatch()
with zero but also do the pg_usleep()?  That's doing a fair bit of work
beyond just sleeping, but it also means we shouldn't miss out on the
postmaster going away or similar..

Thanks,

Stephen




Re: Improve logging when using Huge Pages

2023-03-13 Thread Stephen Frost
Greetings,

On Mon, Mar 13, 2023 at 21:03 Justin Pryzby  wrote:

> On Thu, Mar 09, 2023 at 03:02:29PM -0500, Stephen Frost wrote:
> > * Justin Pryzby (pry...@telsasoft.com) wrote:
> > > On Thu, Mar 09, 2023 at 09:34:10AM -0500, Stephen Frost wrote:
> > > > * Nathan Bossart (nathandboss...@gmail.com) wrote:
> > > > > On Wed, Feb 15, 2023 at 10:13:17AM -0800, Nathan Bossart wrote:
> > > > > > On Tue, Feb 14, 2023 at 07:32:56PM -0600, Justin Pryzby wrote:
> > > > > >> On Mon, Feb 13, 2023 at 08:18:52PM -0800, Nathan Bossart wrote:
> > > > > >>> I'm curious why you chose to make this a string instead of an
> enum.  There
> > > > > >>> might be little practical difference, but since there are only
> three
> > > > > >>> possible values, I wonder if it'd be better form to make it an
> enum.
> > > > > >>
> > > > > >> It takes more code to write as an enum - see 002.txt.  I'm not
> convinced
> > > > > >> this is better.
> > > > > >>
> > > > > >> But your comment made me fix its , and reconsider the
> strings,
> > > > > >> which I changed to active={unknown/true/false} rather than
> {unk/on/off}.
> > > > > >> It could also be active={unknown/yes/no}...
> > > > > >
> > > > > > I think unknown/true/false is fine.  I'm okay with using a
> string if no one
> > > > > > else thinks it should be an enum (or a bool).
> > > > >
> > > > > There's been no response for this, so I guess we can proceed with
> a string
> > > > > GUC.
> > > >
> > > > Just happened to see this and I'm not really a fan of this being a
> > > > string when it's pretty clear that's not what it actually is.
> > >
> > > I don't understand what you mean by that.
> > > Why do you say it isn't a string ?
> >
> > Because it's clearly a yes/no, either the server is currently running
> > with huge pages, or it isn't.  That's the definition of a boolean.
>
> I originally implemented it as a boolean, and I changed it in response
> to the feedback that postgres -C huge_pages_active should return
> "unknown".


I really don’t see how that’s at all useful.

> > Is there an agreement to use a function, instead ?
>
> Alvaro was -1 on using a function


I don’t entirely get that argument (select thisfunc(); is much worse than
show thisguc; ..?   Also, the former is easier to use with other functions
and such, as you don’t have to do current_setting(‘thisguc’)…).

> Andres is +0 (?)


Kinda felt like this was closer to +0.5 or more, for my part anyway.

> Nathan is +1
> Stephen is +1
>
> I'm -0.5,


Why..?

> but I reimplemented it as a function.


Thanks!

> I hope that helps it to
> progress.  Please include a suggestion if there's better place for the
> function or global var.


Will try to give it a look tomorrow.

Thanks again!

Stephen



Re: Kerberos delegation support in libpq and postgres_fdw

2023-03-20 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Michael Paquier (mich...@paquier.xyz) wrote:
> > On Mon, Sep 19, 2022 at 02:05:39PM -0700, Jacob Champion wrote:
> > > It's not prevented, because a password is being used. In my tests I'm
> > > connecting as an unprivileged user.
> > > 
> > > You're claiming that the middlebox shouldn't be doing this. If this new
> > > default behavior were the historical behavior, then I would have agreed.
> > > But the cat's already out of the bag on that, right? It's safe today.
> > > And if it's not safe today for some other reason, please share why, and
> > > maybe I can work on a patch to try to prevent people from doing it.
> > 
> > Please note that this has been marked as returned with feedback in the
> > current CF, as this has remained unanswered for a bit more than three
> > weeks.
> 
> There's some ongoing discussion about how to handle outbound connections
> from the server ending up picking up credentials from the server's
> environment (that really shouldn't be allowed unless specifically asked
> for..), that's ultimately an independent change from what this patch is
> doing.

That got committed, which is great, though it didn't go quite as far as
I had been hoping regarding dealing with outbound connections from the
server- perhaps we should make it clear at least for postgres_fdw that
it might be good for administrators to explicitly say which options are
allowed for a given user-map when it comes to how authentication is
done to the remote server?  Seems like mostly a documentation
improvement, I think?  Or should we have some special handling around
that option for postgres_fdw/dblink?

> Here's an updated version which does address Robert's concerns around
> having this disabled by default and having options on both the server
> and client side saying if it is to be enabled or not.  Also added to
> pg_stat_gssapi a field that indicates if credentials were proxied or not
> and made some other improvements and added additional regression tests
> to test out various combinations.

I've done some self-review and also reworked how the security checks are
done to be sure that we're not ending up pulling credentials from the
environment (with added regression tests to check for it too).  If
there's remaining concerns around that, please let me know.  Of course,
other review would be great also.  Presently though:

- Rebased up to today
- Requires explicitly being enabled on client and server
- Authentication to a remote server via dblink or postgres_fdw with
  GSSAPI requires that credentials were proxied by the client to the
  server, except if the superuser set 'password_required' to false on
  the postgres_fdw (which has lots of caveats around it in the
  documentation because it's inherently un-safe to do).
- Includes updated documentation
- Quite a few additional regression tests to check for unrelated
  credentials coming from the environment in either cases where
  credentials have been proxied and in cases where they haven't.
- Only changes to existing regression tests for dblink/postgres_fdw are
  in the error message wording updates.

Thanks!

Stephen
From 03a799699b42d3f9d9ed017bd448f606b7a2761d Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 28 Feb 2022 20:17:55 -0500
Subject: [PATCH] Add support for Kerberos credential delegation

Support GSSAPI/Kerberos credentials being delegated to the server by a
client.  With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.

Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
proxied to the server by the client and GSSAPI is used to authenticate
to the remote system.

Authors: Stephen Frost, Peifeng Qiu
Discussion: https://postgr.es/m/co1pr05mb8023cc2cb575e0faad7df4f8a8...@co1pr05mb8023.namprd05.prod.outlook.com
---
 contrib/dblink/dblink.c   | 118 +++---
 contrib/dblink/expected/dblink.out|   4 +-
 contrib/postgres_fdw/connection.c |  68 +++-
 .../postgres_fdw/expected/postgres_fdw.out|  19 +-
 contrib/postgres_fdw/option.c |   3 +
 contrib/postgres_fdw/sql/postgres_fdw.sql |   3 +-
 doc/src/sgml/config.sgml  |  17 +
 doc/src/sgml/dblink.sgml  |   5 +-
 doc/src/sgml/libpq.sgml   |  41 +++
 

Re: Kerberos delegation support in libpq and postgres_fdw

2023-03-26 Thread Stephen Frost
Greetings,

* Greg Stark (st...@mit.edu) wrote:
> The CFBot says there's a function be_gssapi_get_proxy() which is
> undefined. Presumably this is a missing #ifdef or a definition that
> should be outside an #ifdef.

Yup, just a couple of missing #ifdef's.

Updated and rebased patch attached.

Thanks!

Stephen
From 450a8749d04af54e8214a2ab357fbec7849a485b Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 28 Feb 2022 20:17:55 -0500
Subject: [PATCH] Add support for Kerberos credential delegation

Support GSSAPI/Kerberos credentials being delegated to the server by a
client.  With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.

Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
proxied to the server by the client and GSSAPI is used to authenticate
to the remote system.

Authors: Stephen Frost, Peifeng Qiu
Discussion: https://postgr.es/m/co1pr05mb8023cc2cb575e0faad7df4f8a8...@co1pr05mb8023.namprd05.prod.outlook.com
---
 contrib/dblink/dblink.c   | 120 ---
 contrib/dblink/expected/dblink.out|   4 +-
 contrib/postgres_fdw/connection.c |  70 +++-
 .../postgres_fdw/expected/postgres_fdw.out|  19 +-
 contrib/postgres_fdw/option.c |   3 +
 contrib/postgres_fdw/sql/postgres_fdw.sql |   3 +-
 doc/src/sgml/config.sgml  |  17 +
 doc/src/sgml/dblink.sgml  |   5 +-
 doc/src/sgml/libpq.sgml   |  41 +++
 doc/src/sgml/monitoring.sgml  |   9 +
 doc/src/sgml/postgres-fdw.sgml|   5 +-
 src/backend/catalog/system_views.sql  |   3 +-
 src/backend/foreign/foreign.c |   1 +
 src/backend/libpq/auth.c  |  13 +-
 src/backend/libpq/be-gssapi-common.c  |  51 +++
 src/backend/libpq/be-secure-gssapi.c  |  26 +-
 src/backend/utils/activity/backend_status.c   |   1 +
 src/backend/utils/adt/pgstatfuncs.c   |  20 +-
 src/backend/utils/init/postinit.c |   8 +-
 src/backend/utils/misc/guc_tables.c   |  10 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/libpq/auth.h  |   1 +
 src/include/libpq/be-gssapi-common.h  |   3 +
 src/include/libpq/libpq-be.h  |   2 +
 src/include/utils/backend_status.h|   1 +
 src/interfaces/libpq/exports.txt  |   1 +
 src/interfaces/libpq/fe-auth.c|  15 +-
 src/interfaces/libpq/fe-connect.c |  17 +
 src/interfaces/libpq/fe-secure-gssapi.c   |  23 +-
 src/interfaces/libpq/libpq-fe.h   |   1 +
 src/interfaces/libpq/libpq-int.h  |   2 +
 src/test/kerberos/Makefile|   3 +
 src/test/kerberos/t/001_auth.pl   | 335 --
 src/test/perl/PostgreSQL/Test/Utils.pm|  27 ++
 src/test/regress/expected/rules.out   |  11 +-
 36 files changed, 743 insertions(+), 135 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 78a8bcee6e..292377566b 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -48,6 +48,7 @@
 #include "funcapi.h"
 #include "lib/stringinfo.h"
 #include "libpq-fe.h"
+#include "libpq/libpq-be.h"
 #include "libpq/libpq-be-fe-helpers.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
@@ -111,7 +112,8 @@ static HeapTuple get_tuple_of_interest(Relation rel, int *pkattnums, int pknumat
 static Relation get_rel_from_relname(text *relname_text, LOCKMODE lockmode, AclMode aclmode);
 static char *generate_relation_name(Relation rel);
 static void dblink_connstr_check(const char *connstr);
-static void dblink_security_check(PGconn *conn, remoteConn *rconn);
+static bool dblink_connstr_has_pw(const char *connstr);
+static void dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr);
 static void dblink_res_error(PGconn *conn, const char *conname, PGresult *res,
 			 bool fail, const char *fmt,...) pg_attribute_printf(5, 6);
 static char *get_connect_string(const char *servername);
@@ -213,7 +215,7 @@ dblink_get_conn(char *conname_or_str,
 	 errmsg("could not establish connection"),
 	 errdetail_internal("%s", msg)));
 		}
-		dblink_security_check(conn, rconn);
+		dblink_security_check(conn, rconn, connstr);
 		if (PQclientEncoding(conn) != GetDatabaseEncoding())
 			PQsetClientEncoding(conn, GetDatabaseEncodi

Re: Moving forward with TDE

2023-03-27 Thread Stephen Frost
Greetings,

On Mon, Mar 27, 2023 at 12:38 Bruce Momjian  wrote:

> On Wed, Mar  8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
> > Agreed, though the latest efforts include an option for *authenticated*
> > encryption as well as unauthenticated.  That makes it much more
> > difficult to make undetected changes to the data that's protected by
> > the authenticated encryption being used.
>
> I thought some more about this.  GCM-style authentication of encrypted
> data has value because it assumes the two end points are secure but that
> a malicious actor could modify data during transfer.  In the Postgres
> case, it seems the two end points and the transfer are all in the same
> place.  Therefore, it is unclear to me the value of using GCM-style
> authentication because if the GCM-level can be modified, so can the end
> points, and the encryption key exposed.


What are the two end points you are referring to and why don’t you feel
there is an opportunity between them for a malicious actor to attack the
system?

There are simpler cases to consider than an online attack on a single
independent system where an attacker having access to modify the data in
transit between PG and the storage would imply the attacker also having
access to read keys out of PG’s memory.

As specific examples, consider:

An attack against the database system where the database server is shut
down, or a backup, and the encryption key isn’t available on the system.

The backup system itself, not running as the PG user (an option supported
by PG and at least pgbackrest) being compromised, thus allowing for
injection of changes into a backup or into a restore.

The beginning of this discussion also very clearly had individuals voicing
strong opinions that unauthenticated encryption methods were not acceptable
as an end-state for PG due to the clear issue of there then being no
protection against modification of data.  The approach we are working
towards provides both the unauthenticated option, which clearly has value
to a large number of our collective user base considering the number of
commercial implementations which have now arisen, and the authenticated
solution which goes further and provides the level clearly expected of the
PG community. This gets us a win-win situation.

> There's clearly user demand for it as there's a number of organizations
> > who have forks which are providing it in one shape or another.  This
> > kind of splintering of the community is actually an actively bad thing
> > for the project and is part of what killed Unix, by at least some pretty
> > reputable accounts, in my view.
>
> Yes, the number of commercial implementations of this is a concern.  Of
> course, it is also possible that those commercial implementations are
> meeting checkbox requirements rather than technical ones, and the
> community has been hostile to check box-only features.


I’ve grown weary of this argument as the other major piece of work it was
routinely applied to was RLS and yet that has certainly been seen broadly
as a beneficial feature with users clearly leveraging it and in more than
some “checkbox” way.

Indeed, it’s similar also in that commercial implementations were done of
RLS while there were arguments made about it being a checkbox feature which
were used to discourage it from being implemented in core.  Were it truly
checkbox, I don’t feel we would have the regular and ongoing discussion
about it on the lists that we do, nor see other tools built on top of PG
which specifically leverage it. Perhaps there are truly checkbox features
out there which we will never implement, but I’m (perhaps due to what my
dad would call selective listening on my part, perhaps not) having trouble
coming up with any presently. Features that exist in other systems that we
don’t want?  Certainly. We don’t characterize those as simply “checkbox”
though. Perhaps that’s in part because we provide alternatives- but that’s
not the case here. We have no comparable way to have this capability as
part of the core system.

We, as a community, are clearly losing value by lack of this capability, if
by no other measure than simply the numerous users of the commercial
implementations feeling that they simply can’t use PG without this feature,
for whatever their reasoning.

Thanks,

Stephen


Re: Moving forward with TDE

2023-03-27 Thread Stephen Frost
Greetings,

On Mon, Mar 27, 2023 at 18:17 Bruce Momjian  wrote:

> On Tue, Mar 28, 2023 at 12:01:56AM +0200, Stephen Frost wrote:
> > Greetings,
> >
> > On Mon, Mar 27, 2023 at 12:38 Bruce Momjian  wrote:
> >
> > On Wed, Mar  8, 2023 at 04:25:04PM -0500, Stephen Frost wrote:
> > > Agreed, though the latest efforts include an option for
> *authenticated*
> > > encryption as well as unauthenticated.  That makes it much more
> > > difficult to make undetected changes to the data that's protected
> by
> > > the authenticated encryption being used.
> >
> > I thought some more about this.  GCM-style authentication of
> encrypted
> > data has value because it assumes the two end points are secure but
> that
> > a malicious actor could modify data during transfer.  In the Postgres
> > case, it seems the two end points and the transfer are all in the
> same
> > place.  Therefore, it is unclear to me the value of using GCM-style
> > authentication because if the GCM-level can be modified, so can the
> end
> > points, and the encryption key exposed.
> >
> >
> > What are the two end points you are referring to and why don’t you feel
> there
> > is an opportunity between them for a malicious actor to attack the
> system?
>
> Uh, TLS can use GCM and in this case you assume the sender and receiver
> are secure, no?


TLS does use GCM.. pretty much exclusively as far as I can recall. So do a
lot of other things though..

> There are simpler cases to consider than an online attack on a single
> > independent system where an attacker having access to modify the data in
> > transit between PG and the storage would imply the attacker also having
> access
> > to read keys out of PG’s memory.
>
> I consider the operating system and its processes as much more of a
> single entity than TLS over a network.


This may be the case sometimes but there’s absolutely no shortage of other
cases and it’s almost more the rule these days, that there is some kind of
network between the OS processes and the storage- a SAN, an iSCSI network,
NFS, are all quite common.

> As specific examples, consider:
> >
> > An attack against the database system where the database server is shut
> down,
> > or a backup, and  the encryption key isn’t available on the system.
> >
> > The backup system itself, not running as the PG user (an option
> supported by PG
> > and at least pgbackrest) being compromised, thus allowing for injection
> of
> > changes into a backup or into a restore.
>
> I then question why we are not adding encryption to pg_basebackup or
> pgbackrest rather than the database system.


Pgbackrest has encryption and authentication of it … but that doesn’t
actually address the attack vector that I outlined. If the backup user is
compromised then they can change the data before it gets to the storage.
If the backup user is compromised then they have access to whatever key is
used to encrypt and authenticate the backup and therefore can trivially
manipulate the data.

Encryption of backups by the backup tool serves to protect the data after
it leaves the backup system and is stored in cloud storage or in whatever
format the repository takes.  This is beneficial, particularly when the
data itself offers no protection, but simply not the same.

> The beginning of this discussion also very clearly had individuals voicing
> > strong opinions that unauthenticated encryption methods were not
> acceptable as
> > an end-state for PG due to the clear issue of there then being no
> protection
> > against modification of data.  The approach we are working towards
> provides
>
> What were the _technical_ reasons for those objections?


I believe largely the ones I’m bringing up here and which I outline above…
I don’t mean to pretend that any of this is of my own independent
construction. I don’t believe it is and my apologies if it came across that
way.

> both the unauthenticated option, which clearly has value to a large
> number of
> > our collective user base considering the number of commercial
> implementations
> > which have now arisen, and the authenticated solution which goes further
> and
> > provides the level clearly expected of the PG community. This gets us a
> win-win
> > situation.
> >
> > > There's clearly user demand for it as there's a number of
> organizations
> > > who have forks which are providing it in one shape or another.
> This
> > > kind of splintering of the community is actually an actively bad
> thing
> > > for the project and is part of what killed Unix, by

Re: Moving forward with TDE

2023-03-27 Thread Stephen Frost
Greetings,

On Mon, Mar 27, 2023 at 19:19 Bruce Momjian  wrote:

> On Tue, Mar 28, 2023 at 12:57:42AM +0200, Stephen Frost wrote:
> > I consider the operating system and its processes as much more of a
> > single entity than TLS over a network.
> >
> > This may be the case sometimes but there’s absolutely no shortage of
> other
> > cases and it’s almost more the rule these days, that there is some kind
> of
> > network between the OS processes and the storage- a SAN, an iSCSI
> network, NFS,
> > are all quite common.
>
> Yes, but consider that the database cluster is having to get its data
> from that remote storage --- the remote storage is not an independent
> entity that can be corrupted without the database server being
> compromised. If everything in PGDATA was GCM-verified, it would be
> secure, but because some parts are not, I don't think it would be.


The remote storage is certainly an independent system. Multi-mount LUNs are
entirely possible in a SAN (and absolutely with NFS, or just the NFS server
itself is compromised..), so while the attacker may not have any access to
the database server itself, they may have access to these other systems,
and that’s not even considering in-transit attacks which are also
absolutely possible, especially with iSCSI or NFS.

I don’t understand what is being claimed that the remote storage is “not an
independent system” based on my understanding of, eg, NFS. With NFS, a
directory on the NFS server is exported and the client mounts that
directory as NFS locally, all over a network which may or may not be
secured against manipulation.  A user on the NFS server with root access is
absolutely able to access and modify files on the NFS server trivially,
even if they have no access to the PG server.  Would you explain what you
mean?

I do agree that the ideal case would be that everything we can encrypt
(not everything can be, for various reasons, but we don’t actually need to
either) in the PGDATA directory is encrypted and authenticated, just like
it would be ideal if everything was checksum’d and isn’t today. We are
progressing in that direction thanks to efforts such as reworking the other
subsystems to use shared buffers and a consistent page format, but just
like with checksums we do not need to have the perfect solution for us to
provide a lot of value here- and our users know that as the same is true of
the unauthenticated encryption approaches being offered by the commercial
solutions.

> > As specific examples, consider:
> > >
> > > An attack against the database system where the database server is
> shut
> > down,
> > > or a backup, and  the encryption key isn’t available on the system.
> > >
> > > The backup system itself, not running as the PG user (an option
> supported
> > by PG
> > > and at least pgbackrest) being compromised, thus allowing for
> injection
> > of
> > > changes into a backup or into a restore.
> >
> > I then question why we are not adding encryption to pg_basebackup or
> > pgbackrest rather than the database system.
> >
> > Pgbackrest has encryption and authentication of it … but that doesn’t
> actually
> > address the attack vector that I outlined. If the backup user is
> compromised
> > then they can change the data before it gets to the storage.  If the
> backup
> > user is compromised then they have access to whatever key is used to
> encrypt
> > and authenticate the backup and therefore can trivially manipulate the
> data.
>
> So the idea is that the backup user can be compromised without the data
> being vulnerable --- makes sense, though that use-case seems narrow.


That’s perhaps a fair consideration- but it’s clearly of enough value that
many of our users are asking for it and not using PG because we don’t have
it today. Ultimately though, this clearly makes it more than a “checkbox”
feature. I hope we are able to agree on that now.

> What were the _technical_ reasons for those objections?
> >
> > I believe largely the ones I’m bringing up here and which I outline
> above… I
> > don’t mean to pretend that any of this is of my own independent
> construction. I
> > don’t believe it is and my apologies if it came across that way.
>
> Yes, there is value beyond the check-box, but in most cases those
> values are limited considering the complexity of the features, and the
> check-box is what most people are asking for, I think.


For the users who ask on the lists for this feature, regularly, how many
don’t ask because they google or find prior responses on the list to the
question of if we have this capability?  How do we know that their cases
are “checkbox”?  Consi

Re: Moving forward with TDE

2023-03-27 Thread Stephen Frost
Greetings,

On Mon, Mar 27, 2023 at 21:35 Bruce Momjian  wrote:

> On Tue, Mar 28, 2023 at 02:03:50AM +0200, Stephen Frost wrote:
> > The remote storage is certainly an independent system. Multi-mount LUNs
> are
> > entirely possible in a SAN (and absolutely with NFS, or just the NFS
> server
> > itself is compromised..), so while the attacker may not have any access
> to the
> > database server itself, they may have access to these other systems, and
> that’s
> > not even considering in-transit attacks which are also absolutely
> possible,
> > especially with iSCSI or NFS.
> >
> > I don’t understand what is being claimed that the remote storage is “not
> an
> > independent system” based on my understanding of, eg, NFS. With NFS, a
> > directory on the NFS server is exported and the client mounts that
> directory as
> > NFS locally, all over a network which may or may not be secured against
> > manipulation.  A user on the NFS server with root access is absolutely
> able to
> > access and modify files on the NFS server trivially, even if they have no
> > access to the PG server.  Would you explain what you mean?
>
> The point is that someone could change values in the storage, pg_xact,
> encryption settings, binaries, that would allow the attacker to learn
> the encryption key.  This is not possible for two secure endpoints and
> someone changing data in transit.  Yeah, it took me a while to
> understand these boundaries too.


This depends on the specific configuration of the systems, clearly. Being
able to change values in other parts of the system isn’t great and we
should work to improve on that, but clearly that isn’t so much of an issue
that people aren’t willing to accept a partial solution or existing
commercial solutions wouldn’t be accepted or considered viable. Indeed,
using GCM is objectively an improvement over what’s being offered commonly
today.

I also generally object to the idea that being able to manipulate the
PGDATA directory necessarily means being able to gain access to the KEK. In
trivial solutions, sure, it’s possible, but the NFS server should never be
asking some external KMS for the key to a given DB server and a reasonable
implementation won’t allow this, and instead would flag and log such an
attempt for someone to review, leading to a much faster realization of a
compromised system.

Certainly it’s much simpler to reason about an attacker with no knowledge
of either system and only network access to see if they can penetrate the
communications between the two end-points, but that is not the only case
where authenticated encryption is useful.

> So the idea is that the backup user can be compromised without the
> data
> > being vulnerable --- makes sense, though that use-case seems narrow.
> >
> > That’s perhaps a fair consideration- but it’s clearly of enough value
> that many
> > of our users are asking for it and not using PG because we don’t have it
> today.
> > Ultimately though, this clearly makes it more than a “checkbox” feature.
> I hope
> > we are able to agree on that now.
>
> It is more than a check box feature, yes, but I am guessing few people
> are wanting this for the actual features beyond check box.


As I explained previously, perhaps the people asking are doing so for only
the “checkbox”, but that doesn’t mean it isn’t a useful feature or that it
isn’t valuable in its own right. Those checklists were compiled and
enforced for a reason, which the end users might not understand but which is
still absolutely valuable.  Sad to say, this is becoming more
and more common, but we shouldn't fault the users asking for it- if it
were truly useless then eventually it would be removed from the standard,
but it hasn’t and it won’t be because, while not every end user has a depth
of understanding to explain it, it is actually a useful and important
capability to have and one that is important to implement.

> Yes, there is value beyond the check-box, but in most cases those
> > values are limited considering the complexity of the features, and
> the
> > check-box is what most people are asking for, I think.
> >
> > For the users who ask on the lists for this feature, regularly, how many
> don’t
> > ask because they google or find prior responses on the list to the
> question of
> > if we have this capability?  How do we know that their cases are
> “checkbox”?
>
> Because I have rarely heard people articulate the value beyond check
> box.


Have I done so sufficiently then that we can agree that calling it
“checkbox” is inappropriate and detrimental to our user base?

> Consider that there are standards groups which explicitly consider these
> attack
> > vectors and consider them

Re: Kerberos delegation support in libpq and postgres_fdw

2023-03-28 Thread Stephen Frost
Greetings,

* Stephen Frost (sfr...@snowman.net) wrote:
> * Greg Stark (st...@mit.edu) wrote:
> > The CFBot says there's a function be_gssapi_get_proxy() which is
> > undefined. Presumably this is a missing #ifdef or a definition that
> > should be outside an #ifdef.
> 
> Yup, just a couple of missing #ifdef's.
> 
> Updated and rebased patch attached.

... and a few more.  Apparently hacking on a plane without enough sleep
leads to changing ... and unchanging configure flags before testing.

Thanks,

Stephen
From 9808fd4eb4920e40468709af7325a27b18066cec Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Mon, 28 Feb 2022 20:17:55 -0500
Subject: [PATCH] Add support for Kerberos credential delegation

Support GSSAPI/Kerberos credentials being delegated to the server by a
client.  With this, a user authenticating to PostgreSQL using Kerberos
(GSSAPI) credentials can choose to delegate their credentials to the
PostgreSQL server (which can choose to accept them, or not), allowing
the server to then use those delegated credentials to connect to
another service, such as with postgres_fdw or dblink or theoretically
any other service which is able to be authenticated using Kerberos.

Both postgres_fdw and dblink are changed to allow non-superuser
password-less connections but only when GSSAPI credentials have been
proxied to the server by the client and GSSAPI is used to authenticate
to the remote system.

Authors: Stephen Frost, Peifeng Qiu
Discussion: https://postgr.es/m/co1pr05mb8023cc2cb575e0faad7df4f8a8...@co1pr05mb8023.namprd05.prod.outlook.com
---
 contrib/dblink/dblink.c   | 127 ---
 contrib/dblink/expected/dblink.out|   4 +-
 contrib/postgres_fdw/connection.c |  72 +++-
 .../postgres_fdw/expected/postgres_fdw.out|  19 +-
 contrib/postgres_fdw/option.c |   3 +
 contrib/postgres_fdw/sql/postgres_fdw.sql |   3 +-
 doc/src/sgml/config.sgml  |  17 +
 doc/src/sgml/dblink.sgml  |   5 +-
 doc/src/sgml/libpq.sgml   |  41 +++
 doc/src/sgml/monitoring.sgml  |   9 +
 doc/src/sgml/postgres-fdw.sgml|   5 +-
 src/backend/catalog/system_views.sql  |   3 +-
 src/backend/foreign/foreign.c |   1 +
 src/backend/libpq/auth.c  |  13 +-
 src/backend/libpq/be-gssapi-common.c  |  51 +++
 src/backend/libpq/be-secure-gssapi.c  |  26 +-
 src/backend/utils/activity/backend_status.c   |   1 +
 src/backend/utils/adt/pgstatfuncs.c   |  20 +-
 src/backend/utils/init/postinit.c |   8 +-
 src/backend/utils/misc/guc_tables.c   |  10 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/catalog/pg_proc.dat   |   6 +-
 src/include/libpq/auth.h  |   1 +
 src/include/libpq/be-gssapi-common.h  |   3 +
 src/include/libpq/libpq-be.h  |   2 +
 src/include/utils/backend_status.h|   1 +
 src/interfaces/libpq/exports.txt  |   1 +
 src/interfaces/libpq/fe-auth.c|  15 +-
 src/interfaces/libpq/fe-connect.c |  17 +
 src/interfaces/libpq/fe-secure-gssapi.c   |  23 +-
 src/interfaces/libpq/libpq-fe.h   |   1 +
 src/interfaces/libpq/libpq-int.h  |   2 +
 src/test/kerberos/Makefile|   3 +
 src/test/kerberos/t/001_auth.pl   | 335 --
 src/test/perl/PostgreSQL/Test/Utils.pm|  27 ++
 src/test/regress/expected/rules.out   |  11 +-
 36 files changed, 752 insertions(+), 135 deletions(-)

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index 78a8bcee6e..d4dd338055 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -48,6 +48,7 @@
 #include "funcapi.h"
 #include "lib/stringinfo.h"
 #include "libpq-fe.h"
+#include "libpq/libpq-be.h"
 #include "libpq/libpq-be-fe-helpers.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
@@ -111,7 +112,8 @@ static HeapTuple get_tuple_of_interest(Relation rel, int *pkattnums, int pknumat
 static Relation get_rel_from_relname(text *relname_text, LOCKMODE lockmode, AclMode aclmode);
 static char *generate_relation_name(Relation rel);
 static void dblink_connstr_check(const char *connstr);
-static void dblink_security_check(PGconn *conn, remoteConn *rconn);
+static bool dblink_connstr_has_pw(const char *connstr);
+static void dblink_security_check(PGconn *conn, remoteConn *rconn, const char *connstr);
 static void dblink_res_error(PGconn *conn, const char *conname, PGresult *res,
 			 bool fail, const char *fmt,...) pg_attribute_printf(5, 6);
 static char *get_connect_string(const char *servername);
@@ -213,7 +215,7 @@ dblink_get_conn(char *conname_or_str,
 	 errmsg("could not establish connection"),
 	 errdeta

Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

2023-03-30 Thread Stephen Frost
Greetings,

* Jacob Champion (jchamp...@timescale.com) wrote:
> On 3/20/23 09:32, Robert Haas wrote:
> > I think this is the root of our disagreement.
> 
> Agreed.

I've read all the way back to the $SUBJECT change to try to get an
understanding of the questions here, and it's not been easy; in part, I
think, due to the verbiage, but also due to the lack of concrete
examples in favor of references to other systems and protocols.

> > My understanding of the
> > previous discussion is that people think that the major problem here
> > is the wraparound-to-superuser attack. That is, in general, we expect
> > that when we connect to a database over the network, we expect it to
> > do some kind of active authentication, like asking us for a password,
> > or asking us for an SSL certificate that isn't just lying around for
> > anyone to use. However, in the specific case of a local connection, we
> > have a reliable way of knowing who the remote user is without any kind
> > of active authentication, namely 'peer' authentication or perhaps even
> > 'trust' if we trust all the local users, and so we don't judge it
> > unreasonable to allow local connections without any form of active
> > authentication. There can be some scenarios where even over a network
> > we can know the identity of the person connecting with complete
> > certainty, e.g. if endpoints are locked down such that the source IP
> > address is a reliable indicator of who is initiating the connection,
> > but in general when there's a network involved you don't know who the
> > person making the connection is and need to do something extra to
> > figure it out.
> 
> Okay, but this is walking back from the network example you just
> described upthread. Do you still consider that in scope, or...?

The concern around the network certainly needs to be in-scope overall.

> > If you accept this characterization of the problem,
> 
> I'm not going to say yes or no just yet, because I don't understand your
> rationale for where to draw the lines.
> 
> If you just want the bare minimum thing that will solve the localhost
> case, require_auth landed this week. Login triggers are not yet a thing,
> so `require_auth=password,md5,scram-sha-256` ensures active
> authentication. You don't even have to disallow localhost connections,
> as far as I can tell; they'll work as intended.

I do think require_auth helps us move in a positive direction.  As I
mentioned elsewhere, I don't think we highlight it nearly enough in the
postgres_fdw documentation.  Let's look at that in a bit more depth with
concrete examples and perhaps everyone will be able to get a bit more
understanding of the issues.

Client is psql
Proxy is some PG server that's got postgres_fdw
Target is another PG server, that is being connected to from Proxy
Authentication is via GSS/Kerberos with proxied credentials

What do we want to require the user to configure to make this secure?

Proxy's pg_hba configured to require GSS auth from Client.
Target's pg_hba configured to require GSS auth from Proxy.

Who are we trusting with what?  In particular, I'd argue that the user
who is able to install the postgres_fdw extension and the user who is
able to issue the CREATE SERVER are largely trusted; at least in so far
as the user doing CREATE SERVER is allowed to create the server and
through that allowed to make outbound connections from the Proxy.

Therefore, the Proxy is configured with postgres_fdw and with a trusted
user performing the CREATE SERVER.

What doesn't this handle today?  Connection side-effects are one
problem- once the CREATE SERVER is done, any user with USAGE rights on
the server can create a USER MAPPING for themselves, either with a
password or without one (if they're able to proxy GSS credentials to the
system).  They aren't able to set password_required though, which
defaults to true.  However, without having require_auth set, they're
able to cause the Proxy to reach an authentication stage with the Target
that might not match what credentials they're supposed to be providing.

We attempt to address this by checking post-auth to Target that we used
the credentials to connect that we expected to- if GSS credentials were
proxied, then we expect to use those.  If a password was provided then
we expect to use a password to auth (only checked after we see if GSS
credentials were proxied and used).  The issue here is the 'post-auth' bit;
we'd prefer to fail the connection pre-auth if it isn't what we're
expecting.  Should we then explicitly set require_auth=gss when GSS
credentials are proxied?  Also, if a password is provided, then
explicitly set require_auth=scram-sha-256?  Or default to these, at
least, and allow the CREATE SERVER user to override our choices?  Or
should it be a USER MAPPING option that's restricted?  Or not?
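
To make that last set of questions concrete, a rough sketch of one of the
options (hypothetical; whether require_auth is best placed on the SERVER,
forced as a default, or exposed as a restricted USER MAPPING option is
exactly what's being asked above):

```sql
-- Hypothetical sketch: pin the expected authentication method at CREATE
-- SERVER time so the Proxy fails pre-auth on any mismatch with Target.
CREATE SERVER target_srv FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'target.example.com', dbname 'appdb',
             require_auth 'gss');       -- only accept GSS from Target

-- No password stored: this mapping relies on proxied GSS credentials.
CREATE USER MAPPING FOR app_user SERVER target_srv;
```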

> > I think that what you're proposing is that B and C can just be allowed
> > to proxy to A and A can say "hey, by the way, I'm just gonna let you
> > in without asking 

Re: Transparent column encryption

2023-03-30 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2023-03-30 16:01:46 +0200, Peter Eisentraut wrote:
> > On 30.03.23 03:29, Andres Freund wrote:
> > > > One might think that, but the precedent in other equivalent systems is 
> > > > that
> > > > you reference the key and the algorithm separately.  There is some
> > > > (admittedly not very conclusive) discussion about this near [0].
> > > > 
> > > > [0]: 
> > > > https://www.postgresql.org/message-id/flat/00b0c4f3-0d9f-dcfd-2ba0-eee5109b4963%40enterprisedb.com#147ad6faafe8cdd2c0d2fd56ec972a40
> > > 
> > > I'm very much not convinced by that. Either way, there at least there 
> > > should
> > > be a comment mentioning that we intentionally try to allow that.
> > > 
> > > Even if this feature is something we want (why?), ISTM that this should 
> > > not be
> > > implemented by having multiple fields in pg_attribute, but instead by a 
> > > table
> > > referenced by pg_attribute.attcek.
> > 
> > I don't know if it is clear to everyone here, but the key data model and the
> > surrounding DDL are exact copies of the equivalent MS SQL Server feature.
> > When I was first studying it, I had the exact same doubts about this.  But
> > as I was learning more about it, it does make sense, because this matches a
> > common pattern in key management systems, which is relevant because these
> > keys ultimately map into KMS-managed keys in a deployment.  Moreover, 1) it
> > is plausible that those people knew what they were doing, and 2) it would be
> > preferable to maintain alignment and not create something that looks the
> > same but is different in some small but important details.

I was wondering about this- is it really exactly the same, down to the
point that there's zero checking of what the data returned actually is
after it's decrypted and given to the application, and if it actually
matches the claimed data type?

> I find it very hard to believe that details of the catalog representation like
> this will matter to users. How would would it conceivably affect users that we
> store (key, encryption method) in pg_attribute vs storing an oid that's
> effectively a foreign key reference to (key, encryption method)?

I do agree with this.

> > > > With the proposed removal of usertypmod, it's only two fields: the link 
> > > > to
> > > > the key and the user-facing type.
> > > 
> > > That feels far less clean. I think losing the ability to set the 
> > > precision of
> > > a numeric, or the SRID for postgis datums won't be received very well?
> > 
> > In my mind, and I probably wasn't explicit about this, I'm thinking about
> > what can be done now versus later.
> > 
> > The feature is arguably useful without typmod support, e.g., for text. We
> > could ship it like that, then do some work to reorganize pg_attribute and
> > tuple descriptors to relieve some pressure on each byte, and then add the
> > typmod support back in in a future release.  I think that is a workable
> > compromise.
> 
> I doubt that shipping a version of column encryption that breaks our type
> system is a good idea.

And this.

I do feel that column encryption is a useful capability and there's
large parts of this approach that I agree with, but I dislike the idea
of having our clients be able to depend on what gets returned for
non-encrypted columns while not being able to trust what encrypted
column results are and then trying to say it's 'transparent'.  To that
end, it seems like just saying they get back a bytea and making it clear
that they have to provide the validation would be clear, while keeping
much of the rest.  Expanding out from that I'd imagine, pie-in-the-sky
and in some far off land, having our data type in/out validation
functions moved to the common library and then adding client-side
validation of the data going in/out of the encrypted columns would allow
application developers to be able to trust what we're returning (as long
as they're using libpq- and we'd have to document that independent
implementations of the protocol have to provide this or just continue to
return bytea's).

Not sure how we'd manage to provide support for extensions though.

Thanks,

Stephen




Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Sep 8, 2022 at 1:06 PM  wrote:
> > In theory, I could also inherit that privilege, but that's not how the
> > system works today. By using is_member_of_role, the decision was already
> > made that this should not depend on inheritance. What is left, is the
> > ability to do it via SET ROLE only.
> 
> I do not accept the argument that we've already made the decision that
> this should not depend on inheritance. It's pretty clear that we
> haven't thought carefully enough about which checks should depend only
> on membership, and which ones should depend on inheritance. The patch
> I committed just now to fix ALTER DEFAULT PRIVILEGES is one clear
> example of where we've gotten that wrong. We also changed the way
> predefined roles worked with inheritance not too long ago, so that
> they started using has_privs_of_role() rather than
> is_member_of_role(). Our past thinking on this topic has been fuzzy
> enough that we can't really conclude that because something uses
> is_member_of_role() now that's what it should continue to do in the
> future. We are working to get from a messy situation where the rules
> aren't consistent or understandable to one where they are, and that
> may mean changing some things.

Agreed that we haven't been good about the distinction between these,
but that the recent work by Joshua and yourself has been moving us in
the right direction.

> One could take the view that the issue here is that
> pg_read_all_settings shouldn't have the right to create objects in the
> first place, and that this INHERIT vs. SET ROLE distinction is just a
> distraction. However, that would require accepting the idea that it's
> possible for a role to lack privileges granted to PUBLIC, which also
> sounds pretty unsatisfying. On the whole, I'm inclined to think it's
> reasonable to suppose that if you want to grant a role to someone
> without letting them create objects owned by that role, it should be a
> role that doesn't own any existing objects either. Essentially, that's
> legislating that predefined roles should be minimally privileged: they
> should hold the ability to do whatever it is that they are there to do
> (like read all settings) but not have any other privileges (like the
> ability to do stuff to objects they own).

Predefined roles are special in that they should GRANT just the
privileges that the role is described to GRANT and that users really
shouldn't be able to SET ROLE to them nor should they be allowed to own
objects, or at least that's my general feeling on them.

If an administrator doesn't wish for a user to have the privileges
provided by the predefined role by default, they should be able to set
that up by creating another role who has that privilege which the user
is able to SET ROLE to.  That is:

CREATE ROLE admin WITH INHERIT FALSE;
GRANT pg_read_all_settings TO admin;
GRANT admin TO alice;

Would allow 'alice' to log in without the privileges associated with
pg_read_all_settings but 'alice' is able to SET ROLE admin; and gain
those privileges.  It wasn't intended that 'alice' be able to SET ROLE
to pg_read_all_settings itself though.

* Robert Haas (robertmh...@gmail.com) wrote:
> Yeah, my statement before wasn't correct. It appears that alice can't
> just usurp the privileges of pg_read_all_settings trivially, but she
> can create a trigger on any preexisting table owned by
> pg_read_all_settings and then anyone who performs an operation that
> causes that trigger to fire is at risk:

Triggers aren't the only thing to be worried about in this area-
functions defined inside of views are also run with the privileges of
the user running the SELECT and not as the owner of the view.  The same
is true of running SELECT against tables with RLS too, of course.
Generally speaking, it's always been very risky to access the objects of
users who you don't trust in any way and we don't currently provide any
particularly easy way to make that kind of access safe.  RLS at least
provides an escape by allowing a user to turn it off, but the same isn't
available for setting a search_path and then running queries or
accessing views or running DML against tables with triggers.
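
(For reference, the RLS escape mentioned above is a plain GUC; the table
name below is hypothetical:)

```sql
-- The RLS escape hatch: with row_security = off, a query that would
-- require applying someone else's policies fails with an error instead
-- of silently executing them (unless you own the table or have BYPASSRLS).
SET row_security = off;
SELECT * FROM other_role_table;   -- hypothetical table owned by another role
```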

> I'm slightly skeptical of that conclusion because the whole thing just
> feels a bit flimsy. Like, the whole idea that you can compromise your
> account by inserting a row into somebody else's table feels a little
> nuts to me. Triggers and row-level security policies make it easy to
> do things that look safe and are actually very dangerous. I think
> anyone would reasonably expect that calling a function owned by some
> other user might be risky, because who knows what that function might
> do, but it seems less obvious that accessing a table could execute
> arbitrary code, yet it can. And it is even less obvious that creating
> a table owned by one role might give some other role who inherits that
> user's privileges to booby-trap that table i

Re: has_privs_of_role vs. is_member_of_role, redux

2022-09-26 Thread Stephen Frost
Greetings,

* Wolfgang Walther (walt...@technowledgy.de) wrote:
> Robert Haas:
> > I don't think we're going to be very happy if we redefine inheriting
> > the privileges of another role to mean inheriting only some of them.
> > That seems pretty counterintuitive to me. I also think that this
> > particular definition is pretty fuzzy.
> 
> Scratch my previous suggestion. A new, less fuzzy definition would be:
> Ownership is not a privilege itself and as such not inheritable.

One of the reasons the role system was brought into being was explicitly
to allow other roles to have ownership-level rights on objects that they
didn't directly own.

I don't see us changing that.

Thanks,

Stephen




Re: predefined role(s) for VACUUM and ANALYZE

2022-09-28 Thread Stephen Frost
Greetings,

On Wed, Sep 28, 2022 at 14:50 Nathan Bossart 
wrote:

> On Tue, Sep 20, 2022 at 09:31:26PM -0700, Nathan Bossart wrote:
> > I bet a more pressing concern is the calls to aclmask() since checking
> > privileges is probably done more frequently than updating them.  That
> > appears to use a linear search, too, so maybe sorting the aclitem arrays
> is
> > actually worth exploring.  I still doubt there will be much noticeable
> > impact from expanding AclMode outside of the most extreme cases.
>
> I've been testing aclmask() with long aclitem arrays (2,000 entries is
> close to the limit for pg_class entries), and I haven't found any
> significant impact from bumping AclMode to 64 bits.


The max is the same regardless of the size..?  Considering the size is
capped since pg_class doesn’t (and isn’t likely to..) have a toast table,
that seems unlikely, so I’m asking for clarification on that. We may be
able to get consensus that the difference isn’t material since no one is
likely to have such long lists, but we should at least be aware.

Thanks,

Stephen



Re: kerberos/001_auth test fails on arm CPU darwin

2022-09-29 Thread Stephen Frost
Greetings,

* Peter Eisentraut (peter.eisentr...@enterprisedb.com) wrote:
> On 27.09.22 03:37, Andres Freund wrote:
> > Maybe we should rely on PATH, rather than hardcoding OS dependent locations?
> > Or at least fall back to searching for binaries in PATH? Seems pretty odd to 
> > hardcode
> > all these locations without a way to influence it from outside the test.
> 
> Homebrew intentionally does not install the krb5 and openldap packages into
> the path, because they conflict with macOS-provided software. However, those
> macOS-provided variants don't provide all the pieces we need for the tests.

The macOS-provided versions are also old and broken, or at least that
was the case when I looked into them last.

> Also, on Linux you need /usr/sbin, which is often not in the path.
> 
> So I think there is no good way around hardcoding a lot of these paths.

Yeah, not sure what else to do.

Thanks,

Stephen




Re: use has_privs_of_role() for pg_hba.conf

2022-10-16 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Sat, Oct 08, 2022 at 11:46:50AM -0400, Robert Haas wrote:
> > Now there may be some other scenario in which the patch is going in
> > exactly the right direction, and if I knew what it was, maybe I'd
> > agree that the patch was a great idea. But I haven't seen anything
> > like that on the thread. Basically, the argument is just that the
> > change would make things more consistent. However, it might be an
> > abuse of the term. If you go out and buy blue curtains because you
> > have a blue couch, that's consistent interior decor. If you go out and
> > buy a blue car because you have a blue couch, that's not really
> > consistent anything, it's just two fairly-unrelated things that are
> > both blue.
> 
> I believe I started this thread after reviewing the remaining uses of
> is_member_of_role() after 6198420 was committed and wondering whether this
> case was an oversight.  If upon closer inspection we think that mere
> membership is appropriate for pg_hba.conf, I'm fully prepared to go and
> mark this commitfest entry as Rejected.  It obviously does not seem as
> clear-cut as 6198420.  And I'll admit I don't have a concrete use-case in
> hand to justify the behavior change.

Looks like we've already ended up there, but my recollection of this is
that it was very much intentional to use is_member_of_role() here.
Perhaps it should have been better commented (as all uses of
is_member_of_role() instead of has_privs_of_role() really should have
lots of comments as to exactly why it makes sense in those cases).

Thanks,

Stephen




Re: allowing for control over SET ROLE

2022-10-16 Thread Stephen Frost
Greetings,

* Nathan Bossart (nathandboss...@gmail.com) wrote:
> On Fri, Sep 30, 2022 at 04:34:32PM -0400, Robert Haas wrote:
> > That thread has not reached an entirely satisfying conclusion.
> > However, the behavior that was deemed outright buggy over there has
> > been fixed. The remaining question is what to do about commands that
> > allow you to give objects to other users (like ALTER  ..
> > OWNER TO) or commands that allow you to create objects owned by other
> > users (like CREATE DATABASE ... OWNER). I have, in this version,
> > adopted the proposal by Wolfgang Walther on the other thread to make
> > this controlled by the new SET option. This essentially takes the view
> > that the ability to create objects owned by another user is not
> > precisely a privilege, and is thus not inherited just because the
> > INHERIT option is set on the GRANT, but it is something you can do if
> > you could SET ROLE to that role, so we make it dependent on the SET
> > option. This logic is certainly debatable, but it does have the
> > practical advantage of making INHERIT TRUE, SET FALSE a useful
> > combination of settings for predefined roles. It's also 100%
> > backward-compatible, whereas if we made the behavior dependent on the
> > INHERIT option, users could potentially notice behavior changes after
> > upgrading to v16.
> 
> I'm not sure about tying the ownership stuff to this new SET privilege.
> While you noted some practical advantages, I'd expect users to find it kind
> of surprising.  Also, for predefined roles, I think you need to be careful
> about distributing ADMIN, as anyone with ADMIN on a predefined role can
> just GRANT SET to work around the restrictions.  I don't have a better
> idea, though, so perhaps neither of these things is a deal-breaker.  I was
> tempted to suggest using ADMIN instead of SET for the ownership stuff, but
> that wouldn't be backward-compatible, and you'd still be able to work
> around it to some extent with SET (e.g., SET ROLE followed by CREATE
> DATABASE).

As we work through splitting up the privileges and managing them in a
more fine-grained way, it seems clear that we'll need to have a similar
split for ADMIN rights on roles- that is, we'll need to be able to
say "role X is allowed to GRANT INHERIT for role Y to other roles, but
not SET".

I'm still half-tempted to say that predefined roles should just be dealt
with as a special case.. but if we split ADMIN in the manner as
described above then maybe we could get away with not having to, but it
would depend a great deal on people actually reading the documentation
and I'm concerned that's a bit too much to ask in this case.

That is- the first person who is likely to GRANT out ADMIN rights in a
predefined role is going to be a superuser.  To avoid breaking backwards
compatibility, GRANT'ing of ADMIN needs to GRANT all the partial-ADMIN
rights that exist, or at least exist today, which includes both SET and
INHERIT.  Unless we put some kind of special case for predefined roles
where we throw an error or at least a warning when a superuser
(presumably) inadvertently does a simple GRANT ADMIN for $predefined
role, we're going to end up in the situation where folks can SET ROLE to
a predefined role and do things that they really shouldn't be allowed
to.

We could, of course, very clearly document that the way to GRANT ADMIN
rights for a predefined role is to always make sure to *only* GRANT
ADMIN/INHERIT, but again I worry that it simply wouldn't be followed in
many cases.  Perhaps we could arrange for the bootstrap superuser to
only be GRANT'd ADMIN/INHERIT for predefined roles and then not have an
explicit cut-out for superuser doing a GRANT on predefined roles or
perhaps having such be protected under allow_system_table_mods under the
general consideration that modifying of predefined roles isn't something
that folks should be doing post-initdb.

Just a few thoughts on this, not sure any of these ideas are great but
perhaps this helps move us forward.

Thanks,

Stephen




Re: Checking for missing heap/index files

2022-10-18 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Tue, Oct 18, 2022 at 12:59 PM Tom Lane  wrote:
> > There is no text suggesting that it's okay to miss, or to double-return,
> > an entry that is present throughout the scan.  So I'd interpret the case
> > you're worried about as "forbidden by POSIX".  Of course, it's known that
> > NFS fails to provide POSIX semantics in all cases --- but I don't know
> > if this is one of them.
> 
> Yeah, me neither. One problem I see is that, even if the behavior is
> forbidden by POSIX, if it happens in practice on systems people
> actually use, then it's an issue. We even have documentation saying
> that it's OK to use NFS, and a lot of people do -- which IMHO is
> unfortunate, but it's also not clear what the realistic alternatives
> are. It's pretty hard to tell people in 2022 that they are only
> allowed to use PostgreSQL with local storage.
> 
> But to put my cards on the table, it's not so much that I am worried
> about this problem myself as that I want to know whether we're going
> to do anything about it as a project, and if so, what, because it
> intersects a patch that I'm working on. So if we want to readdir() in
> one fell swoop and cache the results, I'm going to go write a patch
> for that. If we don't, then I'd like to know whether (a) we think that
> would be theoretically acceptable but not justified by the evidence
> presently available or (b) would be unacceptable due to (b1) the
> potential for increased memory usage or (b2) some other reason.

While I don't think it's really something that should be happening, it's
definitely something that's been seen with some networked filesystems,
as reported.  I also strongly suspect that on local filesystems there's
something that prevents this from happening but as mentioned that
doesn't cover all PG use cases.

In pgbackrest, we moved to doing a scan and caching all of the results
in memory to reduce the risk when reading from the PG data dir.  We also
reworked our expire code (which removes an older backup from the backup
repository) to do a complete scan before removing files.

Arranging to not add or remove files while the scan is happening would
presumably eliminate the risk entirely, though I don't see that as
likely to be acceptable.  We've not seen this issue recur in the expire
command since the change to first completely scan the directory and then
go and remove the files from it.  Perhaps just not removing files during
the scan would be sufficient, which might be more reasonable to do.
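A minimal sketch of that approach (not pgbackrest's actual code, which is
written in C): complete the directory scan and cache every entry in memory
first, and only then remove files, so nothing is removed while a
readdir()-style iteration is still in progress:

```python
import os
import tempfile

def remove_all_files(path):
    # Phase 1: finish the scan entirely, caching every entry in memory.
    entries = sorted(e.name for e in os.scandir(path) if e.is_file())
    # Phase 2: only after the scan has completed, remove the files.
    for name in entries:
        os.unlink(os.path.join(path, name))
    return entries

# Demonstration against a throwaway directory with three files.
d = tempfile.mkdtemp()
for name in ("a", "b", "c"):
    open(os.path.join(d, name), "w").close()

removed = remove_all_files(d)
```

The point is simply that the directory is never mutated concurrently with
its own traversal, sidestepping the missed/double-returned entry question.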

Thanks,

Stephen




Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-17 Thread Stephen Frost
Greetings,

* Jacob Champion (jchamp...@timescale.com) wrote:
> Maybe something like the attached?

> - I used the phrasing "connection not authenticated" in the hopes that
> it's a bit more greppable than just "connection", especially in
> combination with the existing "connection authenticated" lines.

That doesn't seem quite right ... admittedly, 'trust' isn't performing
authentication, but there can certainly be an argument made that the
basic 'matched a line in pg_hba.conf' is a form of authentication.
Worse, really, saying 'not authenticated' would seem to imply that we
didn't allow the connection when, really, we did, and that could be
confusing to someone.

Maybe 'connection allowed' instead..?

Thanks,

Stephen




Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-17 Thread Stephen Frost
Greetings,

* Jacob Champion (jchamp...@timescale.com) wrote:
> On Thu, Aug 17, 2023 at 9:01 AM Stephen Frost  wrote:
> > Maybe 'connection allowed' instead..?
> 
> Hm. It hasn't really been allowed yet, either. To illustrate what I mean:
> 
> LOG:  connection received: host=[local]
> LOG:  connection allowed: user="jacob" method=trust
> (/home/jacob/src/data/pg16/pg_hba.conf:117)
> LOG:  connection authorized: user=jacob database=postgres
> application_name=psql
> 
> Maybe "unauthenticated connection:"? "connection without
> authentication:"? "connection skipped authentication:"?

Don't like 'skipped' but that feels closer.

How about 'connection bypassed authentication'?

Thanks,

Stephen




Re: Logging of matching pg_hba.conf entry during auth skips trust auth, potential security issue

2023-08-17 Thread Stephen Frost
Greetings,

On Thu, Aug 17, 2023 at 15:23 Robert Haas  wrote:

> On Thu, Aug 17, 2023 at 12:54 PM Jacob Champion 
> wrote:
> > On Thu, Aug 17, 2023 at 9:46 AM Stephen Frost 
> wrote:
> > > Don't like 'skipped' but that feels closer.
> > >
> > > How about 'connection bypassed authentication'?
> >
> > Works for me; see v2.
>
> For what it's worth, my vote would be for "connection authenticated:
> ... method=trust".


I don’t have any particular objection to this language and agree that it’s
actually closer to how we talk about the trust auth method in our
documentation.

Maybe if we decided to rework the documentation … or perhaps just ripped
“trust” out entirely … but those are whole different things from what we
are trying to accomplish here.

Thanks,

Stephen


Re: SLRUs in the main buffer pool - Page Header definitions

2023-08-24 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Aug 18, 2023 at 12:15 PM Nathan Bossart
>  wrote:
> > I think I agree with Stephen.  We routinely make changes that require
> > updates to extensions, and I doubt anyone is terribly wild about
> > maintaining two SLRU systems for several years.
> 
> Yeah, maintaining two systems doesn't sound like a good idea.
> 
> However, this would be a big change. I'm not sure how we validate a
> change of this magnitude. There are both correctness and performance
> considerations. I saw there had been a few performance results on the
> thread from Thomas that spawned this thread; but I guess we'd want to
> do more research. One question is: how do you decide how many buffers
> to use for each SLRU, and how many to leave available for regular
> data?

Agreed that we'd certainly want to make sure it's all correct and to do
performance testing, but in terms of how many buffers... isn't much of
the point of this that we have data in memory because it's being used,
and if it's not then we evict it?  That is, I wouldn't think we'd have
parts of the buffer pool set aside for SLRUs vs. regular data, but would
instead let actual demand drive which pages are in the cache.  I thought
that was part of this proposal and part of the reasoning behind making
this change.

There's certainly an argument to be made that our current cache
management strategy doesn't account very well for the value of pages
(eg: btree root pages vs. random heap pages, perhaps) and that'd
certainly be a good thing to improve on, but that's independent of this.
If it's not, then that's certainly moving the goal posts a very long way
in terms of this effort.
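As a rough illustration of the demand-driven idea (a toy model only,
nothing like the real buffer manager or its clock-sweep algorithm): a
single LRU pool shared by SLRU and regular pages, with no capacity
reserved per page class, so access patterns alone decide what stays
cached:

```python
from collections import OrderedDict

class SharedBufferPool:
    """Toy unified cache: SLRU and heap pages compete for the same slots."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = OrderedDict()  # (kind, page_no) -> page data

    def access(self, kind, page_no):
        key = (kind, page_no)
        if key in self.buffers:
            self.buffers.move_to_end(key)  # mark as recently used
            return "hit"
        if len(self.buffers) >= self.capacity:
            self.buffers.popitem(last=False)  # evict least recently used
        self.buffers[key] = object()          # stand-in for a page
        return "miss"

pool = SharedBufferPool(capacity=3)
pool.access("slru", 1)   # miss: SLRU page loaded on demand
pool.access("heap", 10)  # miss
pool.access("slru", 1)   # hit: the hot SLRU page stayed resident
pool.access("heap", 11)  # miss; pool now full
pool.access("heap", 12)  # miss; evicts cold heap page 10, not the SLRU page
```

No slots are dedicated to either class; the cold heap page is what gets
evicted, because it was the least recently used.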

Thanks,

Stephen




Re: Let's make PostgreSQL multi-threaded

2023-08-25 Thread Stephen Frost
Greetings,

* David Geier (geidav...@gmail.com) wrote:
> On 8/11/23 14:05, Merlin Moncure wrote:
> > Hm, noted this upthread, but asking again: does this
> > help/benefit interactions with the operating system, making oom kill
> > situations less likely?   These things are the bane of my existence, and
> > I'm having a hard time finding a solution that prevents them other than
> > running pgbouncer and lowering max_connections, which adds complexity. 
> > I suspect I'm not the only one dealing with this.   What's really scary
> > about these situations is they come without warning.  Here's a pretty
> > typical example per sar -r.
> > 
> > The conjecture here is that lots of idle connections make the server
> > appear to have less memory available than it looks, and sudden transient
> > demands can cause it to destabilize.
> 
> It does in the sense that your server will have more memory available in
> case you have many long-lived connections around. Every connection has less
> kernel memory overhead, if you will. Of course even then a runaway query will
> be able to invoke the OOM killer. The unfortunate thing with the OOM killer
> is that, in my experience, it often kills the checkpointer. That's because
> the checkpointer will touch all of shared buffers over time which makes it
> likely to get selected by the OOM killer. Have you tried disabling memory
> overcommit?

This is getting a bit far afield in terms of this specific thread, but
there's an ongoing effort to give PG administrators knobs to be able to
control how much actual memory is used rather than depending on the
kernel to actually tell us when we're "out" of memory.  There'll be new
patches for the September commitfest posted soon.  If you're interested
in this issue, it'd be great to get more folks involved in review and
testing.

Thanks!

Stephen



