[HACKERS] Alter table drop column and background vacuum?
Any idea whether ALTER TABLE DROP COLUMN and background vacuum will be implemented by 7.3? They're really critical for large applications that must run 24/7.

Stephen

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum
I think adding a tunable per-page delay to VACUUM's scan loop would help keep the system responsive at all times. In many cases, especially for mostly read-only tables, plain VACUUM does not need to complete immediately (VACUUM FULL, on the other hand, should finish as fast as possible). I'd rather VACUUM take its sweet time, as long as it doesn't disrupt other queries. See my other post, "VACUUM degrades performance significantly. Database becomes unusable!", on the pgsql-general mailing list.

Regards, Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Greg Stark <[EMAIL PROTECTED]> writes:
> > ... vacuum could throttle its own disk accesses by, say, reading 64k
> > at a time then sleeping for a fraction of a second.
> > ...
> > Personally I think i/o priorities give much better leverage.
>
> Pie in the sky is great too ;-).  But there is no such thing as i/o
> priorities, at least not in any portable sense.
>
> OTOH I was just musing to myself earlier today that putting a tunable
> delay into VACUUM's per-page loop might make it more friendly to
> competing processes.  I dunno if it'd work or just be a waste of time,
> but it does seem worth experimenting with.
>
> Want to try it out and report back?
>
> regards, tom lane
Re: [HACKERS] Some thoughts about i/o priorities and throttling vacuum
Is it possible to have an optional delay in plain VACUUM per invocation, rather than database-wide? Something along the lines of an optional THROTTLE or DELAY parameter for the VACUUM command, where THROTTLE is ignored when FULL or FREEZE is selected:

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ THROTTLE ] [ ANALYZE ] [ table [ (column [, ...] ) ] ]

This way autovacuum can still throttle VACUUM as needed in the future (either in contrib or the backend), and administrators can apply different delays to different tables depending on usage.

Regards, Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Of course, this makes VACUUM run longer, and if you are waiting for it
> > to finish, it would be worse, like if you are running it at night or
> > something.  I think the delay has to take into account the number of
> > active transactions or something.
>
> I was just thinking of a GUC parameter: wait N milliseconds between
> pages, where N defaults to zero probably.  A user who wants to run his
> vacuum as a background process could set N larger than zero.  I don't
> believe we are anywhere near being able to automatically adjust the
> delay based on load, and even if we could, this would ignore the point
> you make above --- the user's intent has to matter as much as anything
> else.
>
> regards, tom lane
[HACKERS] Broken links in postgreSQL.org ads
Dear webmaster,

I tried contacting [EMAIL PROTECTED] twice about the broken links in the top-corner square ads on the http://www.postgresql.org web site, but no one seems to have fixed them for a very long time. Hopefully this post will reach the right channel.

Regards, Stephen
Re: [HACKERS] Broken links in postgreSQL.org ads
I'm using IE6 and it redirects back to http://www.postgresql.org. Mozilla 1.4 works fine.

Regards, Stephen

""Marc G. Fournier"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> 'K, just tried Konqueror, and I get the same behaviour ... Firebird 0.7,
> though, works fine for me ...
>
> Just looked in Konqueror's settings for cookies, and the default is to
> accept from the originating server ... IE6 has similar 'defaults', but you
> can set up P3P to get around it; do you know if Konqueror supports the
> same thing?
>
> Does anyone know anything about P3P? We set up P3P for IE6, but it seems
> to still fail ... based on checks that Dave made, it apparently has to do
> with a 'short form' of P3P that needs to be sent out, but I'm not sure
> where/how to send it out ...
>
> On Fri, 24 Oct 2003, Richard Huxton wrote:
> > On Friday 24 October 2003 16:45, Marc G. Fournier wrote:
> > > how broken? I just tested it from here, using Mozilla Firebird, and
> > > they work fine, no errors ... there are issues with IE6 that we are
> > > aware of, but again, nothing that should generate error messages ...
> >
> > I always assumed it was my settings, but they've never worked for me.
> > It just seems to redirect back to www.postgresql.org
> >
> > Firebird 0.7 (allow unrequested popups from postgresql.org)
> > Konqueror 3.1.something
> >
> > --
> > Richard Huxton
> > Archonet Ltd
Re: [HACKERS] Experimental patch for inter-page delay in VACUUM
Great! I haven't tried it yet, but I love the thought of it already :-) I've been waiting for something like this for the past two years, and now it's going to make my multi-gigabyte PostgreSQL installation more usable and responsive. Will the delay be tunable per VACUUM invocation? This is needed for tables that require different VACUUM priorities (e.g. for small, rarely used tables I'd rather vacuum with zero delay, while for big tables I'd set a reasonable delay and let the vacuum run through the day and night).

Regards, Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> "Matthew T. O'Connor" <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> 2. I only bothered to insert delays in the processing loops of plain
> >> VACUUM and btree index cleanup.  VACUUM FULL and cleanup of non-btree
> >> indexes aren't done yet.
> >
> > I thought we didn't want the delay in vacuum full since it locks things
> > down; we want vacuum full to finish ASAP, as opposed to normal vacuum,
> > which would be fired by the autovacuum daemon.
>
> My thought was that it'd be up to the user to set vacuum_page_delay
> appropriately for what he is doing.  It might or might not ever make
> sense to use a nonzero delay in VACUUM FULL, but the facility should be
> there.  (Since plain and full VACUUM share the same index cleanup code,
> it would take some klugery to implement a policy of "no delays for
> VACUUM FULL" anyway.)
>
> Best practice would likely be to leave the default vacuum_page_delay at
> zero, and have the autovacuum daemon set a nonzero value for vacuums it
> issues.
>
> regards, tom lane
Re: [HACKERS] Experimental patch for inter-page delay in VACUUM
I tried Tom Lane's patch on PostgreSQL 7.4-BETA-5 and it works fantastically! A few short tests show a significant improvement in responsiveness on my Red Hat 9 box, Linux 2.4.20-8 (IDE 120 GB 7200 RPM UDMA5). I didn't feel any noticeable delay with vacuum_page_delay set to 5 ms or 10 ms. VACUUM takes 15 to 24 times longer to complete (as expected), but I don't mind at all. The vmstat bi/bo load is reduced by a factor of 5 with vacuum_page_delay = 1 ms. Load average is reduced significantly as well, since there are fewer processes waiting to complete. I find a value of 1 ms to 5 ms quite good; it keeps the system responsive. Going from 10 ms down to 1 ms didn't seem to reduce the total vacuum time by much, and I'm not sure why.

Any chance we can get this patched into 7.4 permanently? I cannot say how well it would work under heavy load, but under light load this patch is highly recommended for 24/7 large DB systems.

The database is mostly read-only. There are 133,000 rows and each row is about 2.5 kB in size (mostly due to a bytea column holding a binary image). The long rows cause the system to TOAST the table.
I repeatedly ran the following tests while the system was idle:

Normal operation with no VACUUM
===
tsdb=# explain analyze select * from table1 where id = '0078997ac809877c1a0d1f76af753608';
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=2 width=344) (actual time=19.030..19.036 rows=1 loops=1)
   Index Cond: ((id)::text = '0078997ac809877c1a0d1f76af753608'::text)
 Total runtime: 19.206 ms
(3 rows)

VACUUM at vacuum_page_delay = 0
===
-bash-2.05b$ vmstat 1
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  1  0 176844   3960  17748 146704   0   0  1408     0  296   556   0   1  99
 0  1  0 176844   3960  17748 146264   0   0  1536     0  285   546   0   2  98

tsdb=# explain analyze select * from table1 where id = '00e5ae5f4fddab371f7847f7da65eebb';
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=2 width=344) (actual time=298.028..298.047 rows=1 loops=1)
   Index Cond: ((id)::text = '0036edc4a92b6afd41304c6c8b76bc3c'::text)
 Total runtime: 298.275 ms
(3 rows)

tsdb=# explain analyze select * from table1 where id = '0046751ac3ec290b9f66ea1d66431923';
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=2 width=344) (actual time=454.727..454.746 rows=1 loops=1)
   Index Cond: ((id)::text = '0046751ac3ec290b9f66ea1d66431923'::text)
 Total runtime: 454.970 ms
(3 rows)

tsdb=# explain analyze select * from table1 where id = '00a74e6885579a2d50487f5a1dceba22';
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=2 width=344) (actual time=344.483..344.501 rows=1 loops=1)
   Index Cond: ((id)::text = '00a74e6885579a2d50487f5a1dceba22'::text)
 Total runtime: 344.700 ms
(3 rows)

VACUUM at vacuum_page_delay = 1
===
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0 176840   4292  23700 137416   0   0   384     0  127   302   0   0 100
 0  0  0 176840   4220  23700 137116   0   0   512     0  118   286   0   0 100
 1  0  0 176840   4220  23700 136656   0   0   384     0  132   303   0   1  99

tsdb=# explain analyze select * from table1 where id = '003d5966f8b9a06e4b0fff9fa8e93be0';
 Index Scan using table1_pkey on table1  (cost=0.00..6.01 rows=2 width=344) (actual time=74.575..74.584 rows=1 loops=1)
   Index Cond: ((id)::text = '003d5966f8b9a06e4b0fff9fa8e93be0'::text)
 Total runtime: 74.761 ms
(3 rows)

tsdb=# explain analyze select * from table1 where id = '00677fe46cd0af3d98564068f34db1cf';
Re: [HACKERS] Experimental patch for inter-page delay in VACUUM
As it turns out, with vacuum_page_delay = 0 VACUUM took 1m20s (80 s) to complete, while with vacuum_page_delay = 1 and vacuum_page_delay = 10 both VACUUMs completed in 18m3s (1083 s), a factor of 13. This is for a single 350 MB table.

Apparently, the upcoming Linux kernel 2.6 will have a smaller quantum: http://go.jitbot.com/linux2.6-quantum

There is also mention of a user-space tweak to get a more accurate time slice of near 1 ms on Linux, but I'm not sure how it works or whether it applies to other Unixes: http://go.jitbot.com/linux-devrtc-quantum

Regards, Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> "Stephen" <[EMAIL PROTECTED]> writes:
> > also as there are fewer processes waiting to complete. I find a value
> > of 1 ms to 5 ms is quite good and will keep the system responsive.
> > Going from 10 ms to 1 ms didn't seem to reduce the total vacuum time
> > by much and I'm not sure why.
>
> On most Unixen, the effective resolution of sleep requests is whatever
> the scheduler time quantum is --- and 10ms is the standard quantum in
> most cases.  So any delay less than 10ms is going to be interpreted as
> 10ms.
>
> I think on recent Linuxen it's possible to adjust the time quantum, but
> whether this would be a net win isn't clear; presumably a shorter
> quantum would result in more scheduler overhead and more process-swap
> penalties.
>
> regards, tom lane
Re: [HACKERS] Experimental patch for inter-page delay in VACUUM
I don't mind the long delay, as long as we have a choice, which we clearly do in this case by setting vacuum_page_delay=WHATEVER. Of course, if VACUUM can be improved with better code placement for the delays or better buffer replacement policies, then I'm all for it. Right now, I'm pretty satisfied with the responsiveness on large DBs using a vacuum_page_delay of 10 ms. Any idea whether this patch will be included in 7.4 before the final release?

Stephen

"Andrew Dunstan" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Not surprising, I should have thought. Why would you care that much?
> The idea as I understand it is to improve the responsiveness of things
> happening alongside vacuum ("real work"). I normally run vacuum when I
> don't expect anything else much to be happening - but I don't care how
> long it takes (within reason), especially if it isn't going to interfere
> with other uses.
>
> cheers
>
> andrew
Re: [HACKERS] Performance features the 4th
Yes, I would like to see the vacuum delay patch go into 7.4.1 if possible. It's really useful. I don't think there is any major risk in adding the delay patch to a minor revision, given the small amount of code change.

Stephen

""Matthew T. O'Connor"" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Tom Lane wrote:
> > Jan Wieck <[EMAIL PROTECTED]> writes:
> >> As a matter of fact, people who have performance problems are likely
> >> to be the same who have upgrade problems. And as Gaetano pointed out
> >> correctly, we will see wildforms with one or the other feature applied.
> >
> > I'd believe that for patches of the size of my original VACUUM-delay
> > hack (or even a production-grade version of same, which'd probably be
> > 10x larger).  The kind of wholesale rewrite you are currently proposing
> > is much too large to consider folding back into 7.4.*, IMHO.
>
> Do people think that the VACUUM-delay patch by itself would be useful
> enough on its own to consider working into 7.4.1 or something?  From
> the little feedback I have read on the VACUUM-delay patch used in
> isolation, it certainly does help.  I would love to see it put into 7.4
> somehow.
>
> The far more rigorous changes that Jan is working on will be welcome
> improvements for 7.5.
Re: [HACKERS] Experimental patch for inter-page delay in VACUUM
The delay patch worked so well, I couldn't resist asking whether a similar patch could be added for the COPY command (pg_dump). It's just an extension of the same idea. On a large DB, backups can take very long while consuming a lot of I/O, slowing down other select and write operations. We operate on a backup window during the low-traffic period at night. It would be nice to be able to run pg_dump *anytime* and no longer worry about the backup window. Backups would take longer to run, but as with VACUUM, it's a win for many people to be able to let them run in the background through the whole day. The delay should be optional and default to zero, so those who wish to back up immediately can still do so. The way I see it, routine backups and vacuums should be ubiquitous once properly configured.

Regards, Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Jan Wieck <[EMAIL PROTECTED]> writes:
> > I am currently looking at implementing ARC as a replacement strategy. I
> > don't have anything that works yet, so I can't really tell what the
> > result would be and it might turn out that we want both features.
>
> It's likely that we would.  As someone (you?) already pointed out,
> VACUUM has bad side-effects both in terms of cache flushing and in
> terms of sheer I/O load.  Those effects require different fixes AFAICS.
>
> One thing that bothers me here is that I don't see how adjusting our
> own buffer replacement strategy is going to do much of anything when
> we cannot control the kernel's buffer replacement strategy.  To get any
> real traction we'd have to go back to the "take over most of RAM for
> shared buffers" approach, which we already know to have a bunch of
> severe disadvantages.
>
> regards, tom lane
Re: [HACKERS] What's planned for 7.5?
Any chance we'll see the VACUUM delay patch (throttle) get into 7.5? I only had the chance to try the first patch by Tom Lane, and it was already very good. I was hoping it would get into 7.4.1, but it didn't. :-( I really need the VACUUM delay patch because my servers are begging to die every time VACUUM runs. Please let it be in 7.5. Thanks.

Stephen

"Tom Lane" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> ow <[EMAIL PROTECTED]> writes:
> > Is this all that's planned for 7.5? (based on current TODO list)
>
> If you think the TODO list has *anything* to do with release planning,
> you fundamentally misunderstand the way things are done around here.
>
> The TODO list is a list of things that are widely agreed to be problems
> (though sometimes it only means "Bruce thinks this is a problem") and
> therefore will probably get worked on at some point in the future.
>
> What actually gets done for 7.5 will depend on which itches bother which
> developers enough to get scratched in the next few months.  We do not
> have any central planning.
>
> It is a safe bet that 7.5 will include some things mentioned in the
> present TODO, as well as many things not mentioned in it; and that
> many items mentioned in the present TODO will still remain undone.
>
> regards, tom lane
Re: [HACKERS] VACUUM delay (was Re: What's planned for 7.5?)
The vacuum delay patch is not the ideal solution, but it worked like a charm on my servers. I really need the vacuum delay patch, or a better solution, in 7.5. I'm getting millions of requests a month, and running VACUUM without the patch makes PostgreSQL useless for many consecutive hours. Not quite the 24/7 system I was hoping for. :-( Unfortunately, it's rather difficult to patch so many machines, since my entire system runs on Red Hat RPMs. I'm really hoping to see a solution to this VACUUM problem in 7.5. I've been waiting for this fix for over 3 years and now it's almost there. Will this problem get addressed in the not-so-official TODO list?

Thanks and keep up the good work!

Stephen

"Jan Wieck" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Tom Lane wrote:
> > Christopher Browne <[EMAIL PROTECTED]> writes:
> >> "Stephen" <[EMAIL PROTECTED]> writes:
> >>> Any chance we'll see the VACUUM delay patch (throttle) get into 7.5?
> >
> >> The hope, in 7.5, is to have ARC, which is the "super-duper-duper"
> >> version, working.
> >
> > Actually, I'm not sure that ARC should be considered to supersede the
> > usefulness of a per-page delay in VACUUM.  ARC should prevent VACUUM
> > from trashing the contents of Postgres' shared buffer arena, but it
> > won't do much of anything to prevent VACUUM from trashing the kernel
> > buffer contents.  And it definitely won't do anything to help if the
> > real problem is that you're short of disk bandwidth and VACUUM's extra
> > I/O demand pushes your total load over the knee of the response-time
> > curve.  What you need then is a throttle.
> >
> > The original patch I posted was incomplete for a number of reasons,
> > but I think it may still be worth working on.  Jan, any comments?
>
> I agree that there is considerable value in IO distribution. As such I
> already have the basics of the Background Writer in there.
>
> I left out the vacuum delay since I thought it was good enough to prove
> that there is low-hanging fruit, but that it was far from what I'd call
> a solution. Ideally VACUUM would coordinate its IO not only against
> some GUC variables, but also against the BGWriter+BufStrategy combo.
>
> Jan
>
> --
> #==#
> # It's easier to get forgiveness for being wrong than for being right. #
> # Let's break this rule - forgive me.                                  #
> #== [EMAIL PROTECTED] #
[HACKERS] Avoid MVCC using exclusive lock possible?
Hi,

Recently, I ran a huge update on an integer column affecting 100 million rows in my database. What happened was that my disk usage grew and my I/O load was very high. It appears that MVCC wants to rewrite each row (each row was about 5 kB due to a bytea column). In addition, VACUUM needs to run to recover the space, eating up even more I/O bandwidth.

It occurred to me that there could be a mechanism to allow overwriting portions of the same row *whenever possible*, instead of creating a new row as MVCC would require. This would work well for timestamp, char, integer, float, boolean columns, etc. A user must explicitly call:

EXCLUSIVE LOCK ON TABLE
UPDATE ROWs
RELEASE LOCK ON TABLE

It basically imitates the behavior of MySQL. Surely, this would be faster than creating a new row and marking the old one invalid, at the expense of locking the table. MySQL users could then use Postgres and get similar performance simply by locking the table first. It probably works well when the transaction volume is low, when you need a quick counter, when your I/O bandwidth is saturated, or when you want to avoid VACUUMing after a massive update. Any thoughts?
[HACKERS] pg_dump dump catalog ACLs
All,

Per discussion about the best approach to reduce the amount of superuser-only capabilities, this patch modifies pg_dump to dump out all ACLs which exist on objects in the pg_catalog schema. With this change, follow-on trivial patches will remove explicit superuser() checks from functions and replace them with 'REVOKE EXECUTE FROM public' commands, allowing users to then control which roles are allowed to execute those functions.

Started as a new thread to hopefully gain more interest. Will be registered in the March commitfest shortly.

Thanks!

Stephen

From b2b01b498f3d9fede2e876785effd48f00feee34 Mon Sep 17 00:00:00 2001
From: Stephen Frost
Date: Mon, 29 Feb 2016 21:11:46 -0500
Subject: [PATCH] Make pg_dump dump ACLs for pg_catalog objects

Historically, we've avoided dumping anything about the objects in the pg_catalog schema. Unfortunately, this has meant that any changes or even initial ACLs set for objects in pg_catalog are not preserved.

Instead, dump out the ACLs which are set on objects in pg_catalog in the same way we dump ACLs for user objects. This is implemented by adding the notion of a 'dump component' (such as an ACL, or comments, or policies) which can be requested to be dumped out rather than everything.
---
 src/bin/pg_dump/common.c  |    2 +-
 src/bin/pg_dump/pg_dump.c | 1576 -
 src/bin/pg_dump/pg_dump.h |   14 +-
 3 files changed, 865 insertions(+), 727 deletions(-)

diff --git a/src/bin/pg_dump/common.c b/src/bin/pg_dump/common.c
index f798b15..507b0bf 100644
--- a/src/bin/pg_dump/common.c
+++ b/src/bin/pg_dump/common.c
@@ -437,7 +437,7 @@ AssignDumpId(DumpableObject *dobj)
 	dobj->dumpId = ++lastDumpId;
 	dobj->name = NULL;			/* must be set later */
 	dobj->namespace = NULL;		/* may be set later */
-	dobj->dump = true;			/* default assumption */
+	dobj->dump = DUMP_COMPONENT_ALL;	/* default assumption */
 	dobj->ext_member = false;	/* default assumption */
 	dobj->dependencies = NULL;
 	dobj->nDeps = 0;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 64c2673..416e6a7 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1284,7 +1284,7 @@ checkExtensionMembership(DumpableObject *dobj, DumpOptions *dopt)
 	 * extension contents with something different.
 	 */
 	if (!dopt->binary_upgrade)
-		dobj->dump = false;
+		dobj->dump = DUMP_COMPONENT_NONE;
 	else
 		dobj->dump = ext->dobj.dump;
@@ -1306,16 +1306,20 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, DumpOptions *dopt)
 	 * namespaces.  If specific namespaces are being dumped, dump just those
 	 * namespaces.  Otherwise, dump all non-system namespaces.
 	 */
+
 	if (table_include_oids.head != NULL)
-		nsinfo->dobj.dump = false;
+		nsinfo->dobj.dump = DUMP_COMPONENT_NONE;
 	else if (schema_include_oids.head != NULL)
 		nsinfo->dobj.dump = simple_oid_list_member(&schema_include_oids,
-												   nsinfo->dobj.catId.oid);
+												   nsinfo->dobj.catId.oid) ?
+			DUMP_COMPONENT_ALL : DUMP_COMPONENT_NONE;
+	else if (strncmp(nsinfo->dobj.name, "pg_catalog", 10) == 0)
+		nsinfo->dobj.dump = DUMP_COMPONENT_ACL;
 	else if (strncmp(nsinfo->dobj.name, "pg_", 3) == 0 ||
 			 strcmp(nsinfo->dobj.name, "information_schema") == 0)
-		nsinfo->dobj.dump = false;
+		nsinfo->dobj.dump = DUMP_COMPONENT_NONE;
 	else
-		nsinfo->dobj.dump = true;
+		nsinfo->dobj.dump = DUMP_COMPONENT_ALL;
 
 	/*
 	 * In any case, a namespace can be excluded by an exclusion switch
@@ -1323,7 +1327,7 @@ selectDumpableNamespace(NamespaceInfo *nsinfo, DumpOptions *dopt)
 	if (nsinfo->dobj.dump &&
 		simple_oid_list_member(&schema_exclude_oids,
 							   nsinfo->dobj.catId.oid))
-		nsinfo->dobj.dump = false;
+		nsinfo->dobj.dump = DUMP_COMPONENT_NONE;
 }
 
 /*
@@ -1342,7 +1346,8 @@ selectDumpableTable(TableInfo *tbinfo, DumpOptions *dopt)
 	 */
 	if (table_include_oids.head != NULL)
 		tbinfo->dobj.dump = simple_oid_list_member(&table_include_oids,
-												   tbinfo->dobj.catId.oid);
+												   tbinfo->dobj.catId.oid) ?
+			DUMP_COMPONENT_ALL : DUMP_COMPONENT_NONE;
 	else
 		tbinfo->dobj.dump = tbinfo->dobj.namespace->dobj.dump;
@@ -1352,7 +1357,7 @@ selectDumpableTable(TableInfo *tbinfo, DumpOptions *dopt)
 	if (tbinfo->dobj.dump &&
 		simple_oid_list_member(&table_exclude_oids,
 							   tbinfo->dobj.catId.oid))
-		tbinfo->dobj.dump = false;
+		tbinfo->dobj.dump = DUMP_COMPONENT_NONE;
 }
 
 /*
@@ -1381,7 +1386,7 @@ selectDumpableType(TypeInfo *tyinfo, DumpOptions *dopt)
 		if (tytable != NULL)
 			tyinfo->dobj.dump = tytable->dobj.dump;
 		else
-			tyinfo->dobj.dump = false;
+			tyinfo->dobj.dump = DUMP_COMPONENT_NONE;
 		return;
 	}
@@ -1401,11 +1406,7 @@ selectDumpableType(TypeInfo *tyinfo, DumpOptions *dopt)
 	if (checkExt
[HACKERS] Default Roles
All,

Attached is a stripped-down version of the default roles patch which includes only the 'pg_signal_backend' default role. This provides the framework and structure for other default roles to be added and formally reserves the 'pg_' role namespace. It is split into two patches: the first formally reserves 'pg_', the second adds the new default role.

Will add to the March commitfest for review.

Thanks!

Stephen

From 4a14522ec7ec7d25c3ce9d07f6525b76f6bab598 Mon Sep 17 00:00:00 2001
From: Stephen Frost
Date: Mon, 29 Feb 2016 21:27:46 -0500
Subject: [PATCH 1/2] Reserve the "pg_" namespace for roles

This will prevent users from creating roles which begin with "pg_" and will check for those roles before allowing an upgrade using pg_upgrade. This will allow for default roles to be provided at initdb time.
---
 doc/src/sgml/ref/psql-ref.sgml          |  8 +--
 src/backend/catalog/catalog.c           |  5 ++--
 src/backend/commands/user.c             | 40 
 src/backend/utils/adt/acl.c             | 41 +
 src/bin/pg_dump/pg_dumpall.c            |  2 ++
 src/bin/pg_upgrade/check.c              | 40 ++--
 src/bin/psql/command.c                  |  4 ++--
 src/bin/psql/describe.c                 |  5 +++-
 src/bin/psql/describe.h                 |  2 +-
 src/bin/psql/help.c                     |  4 ++--
 src/include/utils/acl.h                 |  1 +
 src/test/regress/expected/rolenames.out | 18 +++
 src/test/regress/sql/rolenames.sql      |  8 +++
 13 files changed, 166 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 6d0cb3d..76bb642 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1365,13 +1365,15 @@ testdb=>
 
-\dg[+] [ pattern ]
+\dg[S+] [ pattern ]
 
 
 Lists database roles.
 (Since the concepts of users and groups have been
 unified into roles, this command is now equivalent to
 \du.)
+By default, only user-created roles are shown; supply the
+S modifier to include system roles.
 If pattern is specified,
 only those roles whose names match the pattern are listed.
 If the form \dg+ is used, additional information
@@ -1525,13 +1527,15 @@
 
-\du[+] [ pattern ]
+\du[S+] [ pattern ]
 
 
 Lists database roles.
 (Since the concepts of users and groups have been
 unified into roles, this command is now equivalent to
 \dg.)
+By default, only user-created roles are shown; supply the
+S modifier to include system roles.
 If pattern is specified,
 only those roles whose names match the pattern are listed.
 If the form \du+ is used, additional information
diff --git a/src/backend/catalog/catalog.c b/src/backend/catalog/catalog.c
index bead2c1..d1cf45b 100644
--- a/src/backend/catalog/catalog.c
+++ b/src/backend/catalog/catalog.c
@@ -184,8 +184,9 @@ IsToastNamespace(Oid namespaceId)
  * True iff name starts with the pg_ prefix.
  *
  * For some classes of objects, the prefix pg_ is reserved for
- * system objects only.  As of 8.0, this is only true for
- * schema and tablespace names.
+ * system objects only.  As of 8.0, this was only true for
+ * schema and tablespace names.  With 9.6, this is also true
+ * for roles.
  */
 bool
 IsReservedName(const char *name)
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 4baeaa2..ee37397 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -17,6 +17,7 @@
 #include "access/htup_details.h"
 #include "access/xact.h"
 #include "catalog/binary_upgrade.h"
+#include "catalog/catalog.h"
 #include "catalog/dependency.h"
 #include "catalog/indexing.h"
 #include "catalog/objectaccess.h"
@@ -312,6 +313,17 @@ CreateRole(CreateRoleStmt *stmt)
 	}
 
 	/*
+	 * Check that the user is not trying to create a role in the reserved
+	 * "pg_" namespace.
+	 */
+	if (IsReservedName(stmt->role))
+		ereport(ERROR,
+				(errcode(ERRCODE_RESERVED_NAME),
+				 errmsg("role name \"%s\" is reserved",
+						stmt->role),
+				 errdetail("Role names starting with \"pg_\" are reserved.")));
+
+	/*
 	 * Check the pg_authid relation to be certain the role doesn't already
 	 * exist.
 	 */
@@ -1117,6 +1129,7 @@ RenameRole(const char *oldname, const char *newname)
 	int			i;
 	Oid			roleid;
 	ObjectAddress address;
+	Form_pg_authid authform;
 
 	rel = heap_open(AuthIdRelationId, RowExclusiveLock);
 	dsc = RelationGetDescr(rel);
@@ -1136,6 +1149,7 @@ RenameRole(const char *oldname, const char *newname)
 	 */
Re: [HACKERS] pg_dump dump catalog ACLs
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost writes:
> > Per discussion about the best approach to reduce the amount of
> > superuser-only capabilities, this patch modifies pg_dump to dump out
> > all ACLs which exist on objects in the pg_catalog schema.
>
> Um ... surely there are some of those that are installed by default?

There are a few, but not terribly many currently.

> To make this work, you'd need a way to distinguish privileges installed
> by initdb from those changed later.

To replicate whatever the current ACL is, we don't actually need to make such a differentiation. I'm not against doing so, but the only point of it would be to eliminate a few extra lines being dumped out which, on restore, re-run the same commands initdb runs. The downside of doing so would be having to keep track of the exact ACLs set for every object in pg_catalog which has a non-NULL ACL at initdb time, for every version of PG that the latest version of pg_dump supports, and making sure that any changes to those get updated in pg_dump in addition to the relevant system_views.sql change. That's possible, but I wasn't sure it was worth it.

Thanks!

Stephen
Re: [HACKERS] pg_dump dump catalog ACLs
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> To make this work, you'd need a way to distinguish privileges installed > >> by initdb from those changed later. > > > To replicate whatever the current ACL is, we don't actually need to > > make such a differentiation. I'm not against doing so, but the only > > point of it would be to eliminate a few extra lines being dumped out > > which re-run those commands that initdb runs on restore. > > No, the point of it would be to not have pg_dump scripts overriding > installed-by-default ACLs. A newer PG version might have different > ideas about what those should be. I don't think this is exactly an > academic concern, either: wouldn't a likely outcome of your default-roles > work be that some built-in functions have different initial ACLs than > they do today? Good luck with that, if pg_upgrade overwrites those > ACLs with the previous-version values. As it turns out, there isn't such an issue as the default for functions is to allow PUBLIC to EXECUTE and therefore we don't dump out ACLs for most functions. The follow-on change to this patch is to modify those functions to *not* have the default/NULL ACL (and also drop the explicit if (!superuser()) ereport() checks in those functions), which will work just fine and won't be overwritten during pg_upgrade because those functions currently just have the default ACL, which we don't dump out. Of course, it's a different story if the user changes the ACL on objects in pg_catalog and then we change what we think the default ACL should be, but in such a case, I'm guessing we should probably go with what the user explicitly asked for anyway and if there's a serious enough change in the permissions of the function then perhaps we should have a different function instead of re-defining the existing one. 
We do have some fun issues with pg_upgrade by going with this approach of having pg_dump dump out ACLs- what happens when there's a function or column which goes away? If there's a non-NULL ACL on them, the restore will just outright fail, if we don't do something more. I'm not a huge fan of coding into pg_dump the knowledge of every object which exists for every version of PG we support for pg_dump though. Regarding the default roles patch, now that it's down to only one default role, based on the assumption that this approach with pg_dump will solve all the other concerns, there isn't really much overlap between the default roles and the function ACLs. Those functions which will be able to work with just ACLs won't have a default role and the functions which we need a default role for will have the default NULL ACL. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] [NOVICE] WHERE clause not used when index is used
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Tobias Florek writes: > > When creating an index to use for an ORDER BY clause, a simple query > > starts to return more results than expected. See the following detailed > > log. > > Ugh. That is *badly* broken. I thought maybe it had something to do with > the "abbreviated keys" work, but the same thing happens if you change the > numeric column to integer, so I'm not very sure where to look. Who's > touched btree key comparison logic lately? > > (Problem is reproducible in 9.5 and HEAD, but not 9.4.) Looks to have been introduced in 2ed5b87f. Reverting that gets us back to results which look correct. > > Create enough test data for planer to use an index (if exists) for the > > condition. > > > CREATE TABLE "index_cond_test" AS > > SELECT > > (10 + random() * 10)::int AS "final_score", > > round((10 + random() * 10)::numeric, 5) "time_taken" > > FROM generate_series(1, 1) s; > > > > Run control query without an index (will be less than 1 rows). Pay > > attention to tuples of (20,a) with a > 11. > > > SELECT * > > FROM "index_cond_test" > > WHERE (final_score, time_taken) < (20, 11) > > ORDER BY final_score DESC, time_taken ASC; > > > > Or wrapped in count(*), to make it even more obvious > > > SELECT count(*) FROM ( SELECT * > >FROM "index_cond_test" > >WHERE (final_score, time_taken) < (20, 11) > >ORDER BY final_score DESC, time_taken ASC) q; > > > Create the index > > > CREATE INDEX "index_cond_test_ranking" ON "index_cond_test" USING btree > > (final_score DESC, time_taken ASC); > > > Run test query (will return all 1 rows) > > > SELECT * > > FROM "index_cond_test" > > WHERE (final_score, time_taken) < (20, 11) > > ORDER BY final_score DESC, time_taken ASC; > > > or wrapped > > > SELECT count(*) FROM ( SELECT * > >FROM "index_cond_test" > >WHERE (final_score, time_taken) < (20, 11) > >ORDER BY final_score DESC, time_taken ASC) q; Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] [NOVICE] WHERE clause not used when index is used
* Tom Lane (t...@sss.pgh.pa.us) wrote: > [ trimming -novice from the cc list ] > > Jeff Janes writes: > > On Tue, Mar 1, 2016 at 7:40 AM, Tom Lane wrote: > >> (Problem is reproducible in 9.5 and HEAD, but not 9.4.) > > > Bisects down to: > > > 606c0123d627b37d5ac3f7c2c97cd715dde7842f is the first bad commit > > commit 606c0123d627b37d5ac3f7c2c97cd715dde7842f > > Author: Simon Riggs > > Date: Tue Nov 18 10:24:55 2014 + > > > Reduce btree scan overhead for < and > strategies > > Hmm ... just from the commit message, this seems much more likely to be > the cause than does the buffer locking patch Stephen fingered. Stephen, > how'd you identify 2ed5b87f as being the problem? Badly. :) I didn't expect it to be something that far back and was just going backwards through reverting/testing patches that looked like candidates, but forgot to create the index when I tested the revert of that, which made it look like reverting it 'fixed' it. Apologies for the noise. Thanks! Stephen signature.asc Description: Digital signature
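The bisection Jeff ran to pin down commit 606c0123 is, at its core, a binary search over a linear history for the first "bad" commit. A minimal sketch of that logic, with a made-up history and predicate rather than the actual PostgreSQL commits (real `git bisect` tests each candidate by building and running the reproducer):

```python
# Binary search for the first "bad" commit, assuming the history is all
# "good" up to some point and all "bad" after it -- the invariant git
# bisect relies on.  History and predicate here are purely illustrative.

def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is true."""
    lo, hi = 0, len(commits) - 1   # commits[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # first bad commit is at mid or earlier
        else:
            lo = mid + 1           # still good; culprit is later
    return commits[lo]

history = [f"c{i}" for i in range(20)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 13)  # pretend c13 broke it
print(culprit)  # c13
```

This also shows why a botched test at one step (like forgetting to create the index) sends the search down the wrong half entirely: the invariant that good commits precede bad ones is violated by a false answer.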
Re: [HACKERS] pg_dump dump catalog ACLs
* Joe Conway (m...@joeconway.com) wrote: > On 03/01/2016 08:00 AM, Tom Lane wrote: > > Joe Conway writes: > >> Would it be a terrible idea to add some attribute to ACLs which can be > >> used to indicate they should not be dumped (and supporting syntax)? > > > > Yes, we'd need some way to mark non-null ACLs as being "built-in > > defaults". I do not see the need to have SQL syntax supporting that > > though. > > I was thinking the supporting syntax might be used by extensions, for > example. I tend to agree with Tom that we don't really need SQL syntax for this. > > Actually, wouldn't you need to mark individual aclitems as built-in > > or not? Consider a situation where we have some function foo() that > > by default has EXECUTE permission granted to some built-in "pg_admin" > > role. If a given installation then also grants EXECUTE to "joe", > > what you really want to have happen is for pg_dump to dump only the > > grant to "joe". Mentioning pg_admin's grant would tie the dump to > > a particular major PG version's idea of what the built-in roles are, > > which is what I'm arguing we need to avoid. > > Yes, I guess it would need to be a per aclitem attribute. Agreed. > > I guess this could also be addressed by having two separate aclitem[] > > columns, one that is expected to be frozen after initdb and one for > > user-added grants. > > Yeah, that would work, but seems kind of ugly. Rather than have two aclitem[] columns in every catalog, since this information is only used by pg_dump and not during normal operation, we could use the approach that pg_description took and have an independent catalog table which just contains all non-NULL "system" ACLs. We could populate it at the bottom of system_views.sql, so that we don't have to explicitly think about updating that table whenever there's a change to what the default ACLs are. 
I don't see any reason it couldn't be used by extensions also, though we'd have to do a bit more work on pg_dump to make it actually dump out any non-default ACLs for extension-owned objects. Thoughts? Thanks! Stephen signature.asc Description: Digital signature
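The "dump only what changed since initdb" idea can be sketched as a set difference between the recorded initial ACL and the current one; everything below is illustrative modeling (ACLs as sets of `(grantee, privilege)` pairs), not pg_dump's actual data structures:

```python
# Sketch of the delta idea: compare an object's current ACL against the
# initial ACL recorded at initdb time, and emit GRANT/REVOKE statements
# only for the difference.  Names and shapes here are hypothetical.

def acl_delta(init_acl, current_acl):
    """Return (to_grant, to_revoke) relative to the initial ACL."""
    to_grant = current_acl - init_acl    # added after initdb -> dump GRANTs
    to_revoke = init_acl - current_acl   # removed after initdb -> dump REVOKEs
    return to_grant, to_revoke

# Initial ACL recorded for a hypothetical pg_catalog function:
init = {("PUBLIC", "EXECUTE")}
# An administrator later revoked PUBLIC and granted a specific role:
current = {("joe", "EXECUTE")}

grants, revokes = acl_delta(init, current)
print(sorted(grants))   # [('joe', 'EXECUTE')]
print(sorted(revokes))  # [('PUBLIC', 'EXECUTE')]
```

An unchanged ACL produces an empty delta, which is exactly how a dump avoids overriding a newer version's installed-by-default privileges.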
Re: [HACKERS] pg_dump dump catalog ACLs
* David G. Johnston (david.g.johns...@gmail.com) wrote: > On Wed, Mar 2, 2016 at 1:54 PM, Stephen Frost wrote: > > Rather than have two aclitem[] columns in every catalog, since this > > information is only used by pg_dump and not during normal operation, we > > could use the approach that pg_description took and have an independent > > catalog table which just contains all non-NULL "system" ACLs. We could > > populate it at the bottom of system_views.sql, so that we don't have to > > explicitly think about updating that table whenever there's a change to > > what the default ACLs are. > > > > I don't see any reason it couldn't be used by extensions also, though > > we'd have to do a bit more work on pg_dump to make it actually dump > > out any non-default ACLs for extension-owned objects. > > > > It sounds like this train of thought would resolve this complaint? > > > http://www.postgresql.org/message-id/CADmxfmmz-ATwptaidTSAF0XE=cpeikmyc00sj6t9xf6kcv5...@mail.gmail.com > > Namely allowing users to edit permissions on extension objects and have > those changes dumped and then restored after the dependent CREATE EXTENSION > command is executed during pg_restore. > > Did I interpret that right? Yes, I was following that thread also (as was Joe, I imagine) and that's the idea. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_dump dump catalog ACLs
* Joe Conway (m...@joeconway.com) wrote: > On 03/02/2016 12:54 PM, Stephen Frost wrote: > > * Joe Conway (m...@joeconway.com) wrote: > >> On 03/01/2016 08:00 AM, Tom Lane wrote: > >>> Yes, we'd need some way to mark non-null ACLs as being "built-in > >>> defaults". I do not see the need to have SQL syntax supporting that > >>> though. > >> > >> I was thinking the supporting syntax might be used by extensions, for > >> example. > > > > I tend to agree with Tom that we don't really need SQL syntax for this. > > > I don't see any reason it couldn't be used by extensions also, though > > we'd have to do a bit more work on pg_dump to make it actually dump > > out any non-default ACLs for extension-owned objects. > > Without any syntax, what does the extension do, directly insert into > this special catalog table? Perhaps nothing- we already are tracking everything they've created, at the end of the extension's script, we could simply go over all of the objects which are part of the extension and save off the non-NULL ACLs which exist. * David G. Johnston (david.g.johns...@gmail.com) wrote: > On Wed, Mar 2, 2016 at 2:44 PM, Joe Conway wrote: > > > On 03/02/2016 12:54 PM, Stephen Frost wrote: > > > * Joe Conway (m...@joeconway.com) wrote: > > >> On 03/01/2016 08:00 AM, Tom Lane wrote: > > >>> Yes, we'd need some way to mark non-null ACLs as being "built-in > > >>> defaults". I do not see the need to have SQL syntax supporting that > > >>> though. > > >> > > >> I was thinking the supporting syntax might be used by extensions, for > > >> example. > > > > > > I tend to agree with Tom that we don't really need SQL syntax for this. > > > > > I don't see any reason it couldn't be used by extensions also, though > > > we'd have to do a bit more work on pg_dump to make it actually dump > > > out any non-default ACLs for extension-owned objects. > > > > Without any syntax, what does the extension do, directly insert into > > this special catalog table? 
> > > > > The desire in the thread I linked was for the user, not the extension, to > alter the permissions. In that context (and possibly here as well) > PostgreSQL would (somehow?) recognize the target as being special and thus > requiring adding or updating an entry into the supplemental catalog table > when the usual GRANT/REVOKE SQL command is issued. Not quite. The idea here is for us to track in the catalog what the ACLs were at the "start" (being "initdb" time for the database as a whole, and "CREATE EXTENSION" time for the extension) and then dump out any ACLs which have been changed since then. That strikes me as much simpler than having to track every GRANT/REVOKE done against some special set of objects.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_basebackup compression TODO item
* Andres Freund (and...@anarazel.de) wrote: > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote: > > I think we want it at protocol level rather than pg_basebackup level. > > I think we may want both eventually, but I do agree that protocol level > has a lot higher "priority" than that. Something like protocol level > compression has a bit of different tradeofs than compressing base > backups, and it's nice not to compress, uncompress, compress again. +1, the whole compress-uncompress-compress thing was why I was trying to add support to COPY to do zlib compression, which could have then been used to compress server-side and then just write the results out to a file for -Fc/-Fd style dumps. We ended up implementing the 'PROGRAM' thing for COPY, which is nice, but isn't the same. > > If SSL compression is busted on base backups, it's equally busted on > > regular connection and replication streams. People do ask for > > compression on that (in particular I've had a lot of requests when it > > comes to replication), and our response there has traditionally been > > "ssl compression"... > > Agreed. I think our answer there was always a bit of a cop out... Agreed on this also. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] New competition from Microsoft?
* Joe Conway (m...@joeconway.com) wrote: > On 03/07/2016 01:43 PM, Josh berkus wrote: > > http://blogs.microsoft.com/?p=67248 > > > > Once SQL Server is available on Linux, we're going to see more people > > using it as an alternative to PostgreSQL. Especially since they're > > picking up a lot of our better features, like R support. > > IANAL, but I wonder how they can have R support given that libR.so is > GPL licensed, not LPGL? Have they open sourced SQL Server? I thought they had purchased/partnered with an R implementor that didn't use the GPL one... Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Odd warning from pg_dump
* Tom Lane (t...@sss.pgh.pa.us) wrote: > In connection with a question on -general, I tried this: > > $ pg_dump -n '*' regression >regression.dump > pg_dump: WARNING: typtype of data type "any" appears to be invalid > pg_dump: WARNING: typtype of data type "anyarray" appears to be invalid > pg_dump: WARNING: typtype of data type "anyelement" appears to be invalid > pg_dump: WARNING: typtype of data type "anyenum" appears to be invalid > pg_dump: WARNING: typtype of data type "anynonarray" appears to be invalid > pg_dump: WARNING: typtype of data type "anyrange" appears to be invalid > pg_dump: WARNING: typtype of data type "cstring" appears to be invalid > pg_dump: WARNING: typtype of data type "event_trigger" appears to be invalid > pg_dump: WARNING: typtype of data type "fdw_handler" appears to be invalid > pg_dump: WARNING: typtype of data type "index_am_handler" appears to be > invalid > pg_dump: WARNING: typtype of data type "internal" appears to be invalid > pg_dump: WARNING: typtype of data type "language_handler" appears to be > invalid > pg_dump: WARNING: typtype of data type "opaque" appears to be invalid > pg_dump: WARNING: typtype of data type "pg_ddl_command" appears to be invalid > pg_dump: WARNING: typtype of data type "record" appears to be invalid > pg_dump: WARNING: typtype of data type "trigger" appears to be invalid > pg_dump: WARNING: typtype of data type "tsm_handler" appears to be invalid > pg_dump: WARNING: typtype of data type "void" appears to be invalid > $ > > The datatypes being complained of are evidently all the pseudo-types. > I haven't looked into the code to figure out why this happens. > The dump is produced anyway, so it's only a cosmetic issue, but seems > probably worth fixing. This is fixed in my changes to pg_dump, though I didn't expect you'd be able to hit them in released versions and so hadn't been planning to break it out. 
This is the hunk for that particular change (line numbers are probably off due to my other changes in the overall patch):

*** dumpType(Archive *fout, TypeInfo *tyinfo
*** 8722,8727 
--- 8727,8743 
  		dumpRangeType(fout, tyinfo);
  	else if (tyinfo->typtype == TYPTYPE_PSEUDO && !tyinfo->isDefined)
  		dumpUndefinedType(fout, tyinfo);
+ 	else if (tyinfo->typtype == TYPTYPE_PSEUDO && tyinfo->isDefined &&
+ 			 strncmp(tyinfo->dobj.namespace->dobj.name, "pg_catalog", 10) == 0)
+ 		/*
+ 		 * skip defined pseudo-types in pg_catalog, they are special cases in
+ 		 * the type system which are defined at initdb time only.
+ 		 *
+ 		 * Should a user manage to create one (which would require hand hacking
+ 		 * the catalog, currently), throwing the below error seems entirely
+ 		 * reasonable.
+ 		 */
+ 		return;
  	else
  		write_msg(NULL, "WARNING: typtype of data type \"%s\" appears to be invalid\n",
  				  tyinfo->dobj.name);

I can certainly look at committing that independently from the other pg_dump changes that I'm working on. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Odd warning from pg_dump
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> pg_dump: WARNING: typtype of data type "any" appears to be invalid > > > This is fixed in my changes to pg_dump, though I didn't expect you'd be > > able to hit them in released versions and so hadn't been planning to > > break it out. > > Oh, this was with HEAD; I've not checked it in released versions. Using your "-n '*'", I suspect you'd be able to hit it in released versions also. I think the real question is if "-n '*'" should still exclude 'pg_catalog'. Fixing the issue with defined pseudo types is wonderful, but aren't you going to end up with a dump you can't restore, regardless? Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Odd warning from pg_dump
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > I think the real question is if "-n '*'" should still exclude > > 'pg_catalog'. Fixing the issue with defined pseudo types is wonderful, > > but aren't you going to end up with a dump you can't restore, > > regardless? > > Yeah, perhaps so. The thread on -general has also produced the > information that pg_dump -t '*' tries to dump system catalogs as if > they were user tables, which is another pretty useless bit of behavior. Indeed. > So maybe we should drop the hunk you've got there (which frankly seems a > bit of a kluge) and instead hot-wire things so that stuff in pg_catalog > is excluded even if it would otherwise match the inclusion lists. An idealistic viewpoint might be that we should provide a way to actually create defined pseudo-types and then make pg_dump work with them, but I have a hard time seeing why that would be worth the effort. One might also argue that we should have a way of dealing with every type of object which exists and defined pseudo-types seem to be the only thing left out. I agree that it seems like the best idea is to just hot-wire pg_dump to exclude pg_catalog, though I'm inclined to suggest that we just exclude it from pattern matches. We know that objects sometimes end up in pg_catalog which aren't really supposed to be there and there might be other reasons to want to dump it out, so having 'pg_dump -n pg_catalog' still do its best to dump out what's been asked for seems reasonable. Thanks! Stephen signature.asc Description: Digital signature
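The behavior proposed above, wildcard patterns skipping pg_catalog while an explicit request still selects it, can be sketched as follows. This is a loose model only; it is not pg_dump's actual `-n` handling, and the schema list is made up:

```python
import fnmatch

# Sketch: schema patterns like '*' silently skip pg_catalog, but naming
# it explicitly still selects it.  Hypothetical schema list; real pg_dump
# does this matching in SQL against pg_namespace.
SCHEMAS = ["public", "accounting", "pg_catalog"]

def match_schemas(pattern):
    if pattern == "pg_catalog":
        return ["pg_catalog"]          # explicit request: honor it
    return [s for s in fnmatch.filter(SCHEMAS, pattern)
            if s != "pg_catalog"]      # wildcard matches exclude it

print(match_schemas("*"))              # ['public', 'accounting']
print(match_schemas("pg_catalog"))     # ['pg_catalog']
print(match_schemas("acc*"))           # ['accounting']
```

This gives `-n '*'` a restorable result while keeping an escape hatch for the cases where objects really have ended up in pg_catalog.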
Re: [HACKERS] Floating point timestamps
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Thomas Munro writes: > > Is the plan to remove support for floating point timestamps at some > > stage? If so, what is that waiting on, and would it provide > > sufficient warning if (say) 9.6 were documented as the last major > > release to support that build option? > > AFAIK there is no particular plan to do that. It's not like leaving > that code in place is costing us huge amounts of maintenance effort. Agreed, and I have little doubt that it's still used in the field given how long it was the default for some distributions. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > Simon Riggs wrote: > > On 10 March 2016 at 06:53, Michael Paquier > > wrote: > > > > > On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera > > > wrote: > > > > Robert Haas wrote: > > > >> I'm pretty meh about the whole idea of this function, though, > > > >> actually, and I don't see a single clear +1 vote for this > > > >> functionality upthread. (Apologies if I've missed one.) In the > > > >> absence of a few of those, I recommend we reject this. > > > > > > > > +1 > > > > > > I'm meh for this patch. > > > > "meh" == +1 > > > > I thought it meant -1 > > I would say that "meh" is +0, actually. So the tally is a small > positive number -- not great, but seems enough to press forward unless > we get more -1 votes. I'm +1 on this also, for my part. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_dump dump catalog ACLs
Tom, all, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Joe Conway writes: > > Would it be a terrible idea to add some attribute to ACLs which can be > > used to indicate they should not be dumped (and supporting syntax)? > > Yes, we'd need some way to mark non-null ACLs as being "built-in > defaults". I do not see the need to have SQL syntax supporting that > though. The attached patch does this by adding a 'pg_init_privs' catalog and populating that catalog at the end of initdb, after all of the initial privileges have been set up. > Actually, wouldn't you need to mark individual aclitems as built-in > or not? Consider a situation where we have some function foo() that > by default has EXECUTE permission granted to some built-in "pg_admin" > role. If a given installation then also grants EXECUTE to "joe", > what you really want to have happen is for pg_dump to dump only the > grant to "joe". Mentioning pg_admin's grant would tie the dump to > a particular major PG version's idea of what the built-in roles are, > which is what I'm arguing we need to avoid. What I was thinking about for this was to have pg_dump simply not dump out any GRANTs made to default roles (those starting with 'pg_'), as part of the default roles patch. Another option would be to have pg_dump figure out the delta between what the initial privileges were, as recorded in pg_init_privs, as what the current rights are. I was thinking that the former was simpler, but I don't think it'd be too difficult to do the latter if the consensus is that it's better to do so. > I guess this could also be addressed by having two separate aclitem[] > columns, one that is expected to be frozen after initdb and one for > user-added grants. This is along the lines of what I've done, but I've used a new catalog instead, which is quite small and doesn't complicate or change the regular catalogs. 
* Joe Conway (m...@joeconway.com) wrote: > On 03/02/2016 12:54 PM, Stephen Frost wrote: > > * Joe Conway (m...@joeconway.com) wrote: > >> On 03/01/2016 08:00 AM, Tom Lane wrote: > >>> Yes, we'd need some way to mark non-null ACLs as being "built-in > >>> defaults". I do not see the need to have SQL syntax supporting that > >>> though. > >> > >> I was thinking the supporting syntax might be used by extensions, for > >> example. > > > > I tend to agree with Tom that we don't really need SQL syntax for this. > > > I don't see any reason it couldn't be used by extensions also, though > > we'd have to do a bit more work on pg_dump to make it actually dump > > out any non-default ACLs for extension-owned objects. > > Without any syntax, what does the extension do, directly insert into > this special catalog table? I've not included it in this patch, but my thinking here would be to add a check to the GRANT functions which checks 'if (creating_extension)' and records/updates the entries in pg_init_privs for those objects. This is similar to how we handle dependencies for extensions currently. That way, extensions don't have to do anything and if the user changes the permissions on an extension's object, the permissions for that object would then be dumped out. This still requires more testing, documentation, and I'll see about making it work completely for extensions also, but wanted to provide an update and solicit for review/comments. The patches have been broken up in what I hope is a logical way for those who wish to review/comment: #1 - Add the pg_init_privs catalog #2 - Change pg_dump to use a bitmap instead of boolean for 'dump' #3 - Split 'dump' into 'dump' and 'dump_contains' #4 - Make pg_dump include changed ACLs in dump of 'pg_catalog' #5 - Remove 'if (!superuser()) ereport()' calls and REVOKE EXECUTE Thanks! 
Stephen

From 31f4fafeaa06ad7c72ee7c6699253a65ea542d11 Mon Sep 17 00:00:00 2001
From: Stephen Frost
Date: Thu, 10 Mar 2016 20:56:09 -0500
Subject: [PATCH 1/5] Add new catalog called pg_init_privs

This new catalog will hold the privileges which the system is
initialized with at initdb time.  This will allow pg_dump (and any
other similar use-cases) to detect when the privileges set on objects
in pg_catalog have been changed and handle those changes
appropriately.

This could be extended to work for extensions also, but that seems
like an independent enough change that it should be in a different
commit.
---
 src/backend/catalog/Makefile   |   2 +-
 src/bin/initdb/initdb.c        | 121 +
 src/include/catalog/indexing.h |   3 +
 src/include/catalog/pg_init_p
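The 'if (creating_extension)' idea described in this message can be modeled in a few lines: while an extension's script runs, GRANTs fold into the recorded baseline rather than counting as user changes. Everything here is a hypothetical sketch in plain Python, not the backend implementation:

```python
# Sketch: GRANTs issued during CREATE EXTENSION update the recorded
# initial privileges (the pg_init_privs baseline); GRANTs issued later
# are user changes that a dump would have to reproduce.

creating_extension = False
current_acls = {}     # object name -> set of (grantee, privilege)
init_privs = {}       # object name -> ACL snapshot taken "at creation"

def grant(obj, grantee, priv):
    current_acls.setdefault(obj, set()).add((grantee, priv))
    if creating_extension:
        # still inside the extension script: fold into the baseline
        init_privs[obj] = set(current_acls[obj])

def changed_since_init(obj):
    return current_acls.get(obj, set()) != init_privs.get(obj, set())

creating_extension = True
grant("myext.func()", "PUBLIC", "EXECUTE")   # part of the extension script
creating_extension = False
assert not changed_since_init("myext.func()")  # nothing for pg_dump to emit

grant("myext.func()", "joe", "EXECUTE")      # user change after creation
assert changed_since_init("myext.func()")    # this ACL would be dumped
```

The appeal over tracking individual GRANT/REVOKE statements is that only the snapshot and the final state matter, mirroring how extension dependencies are already handled.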
Re: [HACKERS] Minor bug affecting ON CONFLICT lock wait log messages
* Julien Rouhaud (julien.rouh...@dalibo.com) wrote: > On 15/03/2016 03:30, Peter Geoghegan wrote: > > On Mon, Mar 7, 2016 at 1:46 PM, Peter Geoghegan wrote: > >> Attached patch fixes a bug reported privately by Stephen this morning. > > > > Bump. > > > > I would like to see this in the next point release. It shouldn't be > > hard to review. > > > > + reason_wait = indexInfo->ii_ExclusionOps ? > + XLTW_RecheckExclusionConstr : XLTW_InsertIndex; > > Shouldn't it be set to XLTW_InsertIndexUnique instead? Actually, no, though I had the same question for Peter when I was first reviewing this. XLTW_InsertIndexUnique is used when building a unique index, but this is just a check, and more to the point, it's actually a re-check of what we're doing in nbtinsert.c where we're already using XLTW_InsertIndex. We wouldn't want to end up returning different error messages for the same command under the same conditions, which is what we'd potentially end up doing if we used XLTW_InsertIndexUnique here. > Otherwise the patch seems ok to me. Agreed. I'm going to play with it a bit more but barring objections, I'll commit and back-patch Peter's patch. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Minor bug affecting ON CONFLICT lock wait log messages
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: > Stephen Frost wrote: > > * Julien Rouhaud (julien.rouh...@dalibo.com) wrote: > > > XLTW_InsertIndexUnique is used when building a unique index, but this is > > just a check, and more to the point, it's actually a re-check of what > > we're doing in nbinsert.c where we're already using XLTW_InsertIndex. > > > > We wouldn't want to end up returning different error messages for the > > same command under the same conditions just based, which is what we'd > > potentially end up doing if we used XLTW_InsertIndexUnique here. > > Perhaps we need a new value in that enum, so that the context message is > specific to the situation at hand? Maybe, but that would require a new message and new translation and just generally doesn't seem like something we'd want to back-patch for a bugfix. As such, if someone's interested, they could certainly propose it to go into HEAD, but I'm not going to look at making those kinds of changes for this bugfix. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Default Roles
* Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Feb 29, 2016 at 10:02 PM, Stephen Frost wrote: > > Attached is a stripped-down version of the default roles patch which > > includes only the 'pg_signal_backend' default role. This provides the > > framework and structure for other default roles to be added and formally > > reserves the 'pg_' role namespace. This is split into two patches, the > > first to formally reserve 'pg_', the second to add the new default role. > > > > Will add to the March commitfest for review. > > Here is a review of the first patch: > > + if (!IsA(node, RoleSpec)) > + elog(ERROR, "invalid node type %d", node->type); > > That looks strange. Why not just Assert(IsA(node, RoleSpec))? Sure, that works, done. > + > + return; > > Useless return. Removed. > @@ -673,6 +673,7 @@ dumpRoles(PGconn *conn) > "pg_catalog.shobj_description(oid, > 'pg_authid') as rolcomment, " > "rolname = > current_user AS is_current_user " > "FROM pg_authid " > + "WHERE rolname !~ '^pg_' " > "ORDER BY 2"); > else if (server_version >= 90100) > printfPQExpBuffer(buf, > @@ -895,6 +896,7 @@ dumpRoleMembership(PGconn *conn) >"LEFT JOIN pg_authid ur on > ur.oid = a.roleid " >"LEFT JOIN pg_authid um on > um.oid = a.member " >"LEFT JOIN pg_authid ug on > ug.oid = a.grantor " > + "WHERE NOT (ur.rolname ~ > '^pg_' AND um.rolname ~ '^pg_')" >"ORDER BY 1,2,3"); > > If I'm reading this correctly, when dumping a 9.5 server, we'll > silently drop any roles whose names start with pg_ from the dump, and > hope that doesn't break anything. When dumping a 9.4 or older server, > we'll include those roles and hope that they miraculously restore on > the new server. I'm thinking neither of those approaches is going to > work out, and the difference between them seems totally unprincipled. That needed to be updated to be 9.6 and 9.5, of course. 
Further, you make a good point that 9.6's pg_dumpall should really always exclude any roles which start with 'pg_', throwing a warning if it finds them. Note that pg_upgrade won't proceed with an upgrade of a system that has any 'pg_' roles. > @@ -631,7 +637,8 @@ check_is_install_user(ClusterInfo *cluster) > res = executeQueryOrDie(conn, > "SELECT rolsuper, oid > " > "FROM > pg_catalog.pg_roles " > - "WHERE rolname > = current_user"); > + "WHERE rolname > = current_user " > + "AND rolname > !~ '^pg_'"); > > /* > * We only allow the install user in the new cluster (see comment > below) > @@ -647,7 +654,8 @@ check_is_install_user(ClusterInfo *cluster) > > res = executeQueryOrDie(conn, > "SELECT COUNT(*) " > - "FROM > pg_catalog.pg_roles "); > + "FROM > pg_catalog.pg_roles " > + "WHERE rolname > !~ '^pg_'"); > > if (PQntuples(res) != 1) > pg_fatal("could not determine the number of users\n"); > > What bad thing would happen without these changes that would be > avoided with these changes? I can't think of anything. This function ("check_is_install_user") is simply checking that the user we are logged in as is the 'install' user and that there aren't any other users in the destination cluster. The initial check is perhaps a bit paranoid- it shouldn't be possible for a 'pg_' role to be the one which pg_upgrade has logged into the cluster with as none of the 'pg_' roles have the '
Re: [HACKERS] [PATCH v6] GSSAPI encryption support
Robbie, * Robbie Harwood (rharw...@redhat.com) wrote: > Michael Paquier writes: > > - maj_stat = gss_accept_sec_context( > > - &min_stat, > > + maj_stat = gss_accept_sec_context(&min_stat, > > > > This is just noise. > > You're not wrong, though I do think it makes the code more readable by > enforcing style and this is a "cleanup" commit. I'll take it out if it > bothers you. First, thanks much for working on this, I've been following along the discussions and hope to be able to help move it to commit, once it's ready. Secondly, generally, speaking, we prefer that 'cleanup' type changes (particularly whitespace-only ones) are in independent commits which are marked as just whitespace/indentation changes. We have a number of organizations which follow our code changes and it makes it more difficult on them to include whitespace/indentation changes with code changes. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Password identifiers, protocol aging and SCRAM protocol
Robert, all, * Robert Haas (robertmh...@gmail.com) wrote: > On Fri, Mar 18, 2016 at 9:31 AM, Michael Paquier > wrote: > > That's not an issue for me to rebase this set of patches. The only > > conflicts that I anticipate are on 0009, but I don't have high hopes > > to get this portion integrating into core for 9.6, the rest of the > > patches is complicated enough, and everyone bandwidth is limited. > > I really think we ought to consider pushing this whole thing out to > 9.7. I don't see how we're going to get all of this into 9.6, and > these are big, user-facing changes that I don't think we should rush > into under time pressure. I think it'd be better to do this early in > the 9.7 cycle so that it has time to settle before the time crunch at > the end. I predict this is going to have a lot of loose ends that are > going to take months to settle, and we don't have that time right now. I'm not sure that I agree with the above. This patch has been through the wringer multiple times regarding the user-facing bits and, by and large, the results appear reasonable. Further, getting a better auth method into PG is something which I do view as a priority considering the concerns and complaints that have been, justifiably, raised against our current password-based authentication support. This isn't a new patch set either, it was submitted initially over the summer after it was pointed out, over a year ago, that people actually do care about the problems with our current implementation (amusingly, I recall having pointed out the same 5+ years ago, but only did so to this list). I've been following along on this patch set and asked David to spend time reviewing it as I feel that it's still got a chance for 9.6, since it's been through multiple CF rounds and has had a fair bit of discussion, review, and consideration. > And I'd rather see all of the changes in one release than split them > across two releases. I agree with this. 
If we aren't going to get SCRAM into 9.6 then the rest is just breaking things with little benefit. I'm optimistic that we will be able to include SCRAM support in 9.6, but if that ends up not being feasible then we need to push all of the changes off to the next release. I do think that if we push this off to 9.7 then we're going to have SCRAM *plus* a bunch of other changes around password policies in that release, and it'd be better to introduce SCRAM independently of the other changes. All that said, this is just my voice from having followed this thread and discussing it with David and I'm not trying to force anything. It'd certainly be nice to have and to be able to tell people that we do have a strong and recognized approach to password-based authentication in PG, but I've long been telling everyone that they should be using GSSAPI and/or SSL and can continue to do so for another year if necessary. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Minor bug affecting ON CONFLICT lock wait log messages
Peter, * Peter Geoghegan (p...@heroku.com) wrote: > Thinking about this again, I think we should use > XLTW_InsertIndexUnique after all. The resemblance of the > check_exclusion_or_unique_constraint() code to the nbtinsert.c code > seems only superficial on second thought. So, I propose fixing the fix > by changing XLTW_InsertIndex to XLTW_InsertIndexUnique. I'm not against changing it, but... > Basically, unlike with the similar nbtinsert.c code, we're checking > someone else's tuple in the speculative insertion > check_exclusion_or_unique_constraint() case that was changed (or it's > an exclusion constraint, where even the check for our own tuple > happens only after insertion; no change there in any case). Whereas, > in the nbtinsert.c case that I incorrectly duplicated, we're > specifically indicating that we're waiting on *our own* already > physically inserted heap tuple, and say as much in the > XLTW_InsertIndex message that makes it into the log. I don't quite follow how we're indicating that we're waiting on our own already physically inserted heap tuple in the log? > So, the > fd658dbb300456b393536802d1145a9cea7b25d6 fix is wrong in that we now > indicate that the wait was on our own already-inserted tuple, and not > *someone else's* already-inserted tuple, as it should (we haven't > inserting anything in the first phase of speculative insertion, this > precheck). This code is not a do-over of the check in nbtinsert.c -- > rather, the code in nbtinsert.c is a second phase do-over of this code > (where we've physically inserted a heap tuple + index tuple -- > "speculative" though that is). One of the reasons I figured the XLTW_InsertIndex message made sense in this case is that it'd provide a consistent result for users. Specifically, I don't think it makes sense to produce different messages based only on who got there first. 
I don't mind having a different message when there's something different happening (there's exclusion constraints involved, or you're building an index, etc). > It seems fine to characterize a wait here as "checking the uniqueness > of [somebody else's] tuple", even though technically we're checking > the would-be uniqueness were we to (speculatively, or actually) > insert. However, it does not seem fine to claim ctid_wait as a tuple > we ourselves inserted. I don't think either message really fits here, unfortunately. We're not actually checking the uniqueness of someone else's tuple here either, after all, we're waiting to see what happens with their tuple because ours won't be unique if it goes in with that other tuple in place, if I'm following along correctly. One thing I can say is that the XLTW_InsertIndex at least matches the *action* we're taking, which is that we're trying to INSERT. While having the ctid included in the message may be useful for debugging, I've got doubts about including it in regular messages like this; we certainly don't do that for the 'normal' INSERT case when we discover a duplicate key, but that ship has sailed. Ultimately, I guess I'm inclined to leave it as-committed. If you understand enough to realize what that pair of numbers after 'tuple' means, you've probably found this thread and followed it enough to understand what's happening. I don't feel terribly strongly about that position and so if others feel the XLTW_InsertIndexUnique message really would be better, I'd be happy to commit the change. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] NOT EXIST for PREPARE
* Merlin Moncure (mmonc...@gmail.com) wrote: > On Tue, Mar 22, 2016 at 8:01 AM, Andres Freund wrote: > > On 2016-03-22 12:41:43 +0300, Yury Zhuravlev wrote: > >> Do I understand correctly the only way know availability PREPARE it will > >> appeal to pg_prepared_statements? > >> I think this is not a good practice. In some cases, we may not be aware of > >> the PREPARE made (pgpool). Moreover, it seems popular question in the > >> Internet: > >> http://stackoverflow.com/questions/1193020/php-postgresql-check-if-a-prepared-statement-already-exists > >> > >> What do you think about adding NOT EXIST functionality to PREPARE? > > > > Not very much. If you're not in in control of the prepared statements, you > > can't be sure it's not an entirely different statement. So NOT EXISTS > > doesn't really buy you anything, you'd still need to compare the > > statement somehow. > > Strongly disagree! A typical use case of this feature would be in > connection pooler scenarios where you *are* in control of the > statement but it's a race to see who creates it first. This feature > should be immediately be incorporated by the JDBC driver so that we'd > no longer have to disable server side prepared statements when using > pgbounder (for example). Indeed, and we already said we're not going to verify that the object that already exists in a 'IF NOT EXISTS' case matches exactly whatever you're trying to create, see CREATE TABLE. I agree that PREPARE IF NOT EXISTS would be nice to have, but only if we can keep it fast somehow, which is the part that makes me wonder a bit. Having PREPARE IF NOT EXISTS would imply that application authors would be expected to run a set of PREPAREs at the start of each transaction (if you want to support transaction pooling mode in, say, pgbouncer), for each prepared statement they want to use in that transaction. That doesn't seem completely unreasonable, but it'd need to be fast. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] NOT EXIST for PREPARE
* Andres Freund (and...@anarazel.de) wrote: > On 2016-03-22 09:37:15 -0500, Merlin Moncure wrote: > > On Tue, Mar 22, 2016 at 8:01 AM, Andres Freund wrote: > > > Hi, > > > On 2016-03-22 12:41:43 +0300, Yury Zhuravlev wrote: > > >> Do I understand correctly the only way know availability PREPARE it will > > >> appeal to pg_prepared_statements? > > >> I think this is not a good practice. In some cases, we may not be aware > > >> of > > >> the PREPARE made (pgpool). Moreover, it seems popular question in the > > >> Internet: > > >> http://stackoverflow.com/questions/1193020/php-postgresql-check-if-a-prepared-statement-already-exists > > >> > > >> What do you think about adding NOT EXIST functionality to PREPARE? > > > > > > Not very much. If you're not in in control of the prepared statements, you > > > can't be sure it's not an entirely different statement. So NOT EXISTS > > > doesn't really buy you anything, you'd still need to compare the > > > statement somehow. > > > > Strongly disagree! A typical use case of this feature would be in > > connection pooler scenarios where you *are* in control of the > > statement but it's a race to see who creates it first. This feature > > should be immediately be incorporated by the JDBC driver so that we'd > > no longer have to disable server side prepared statements when using > > pgbounder (for example). > > Uh. JDBC precisely is a scenario where that's *NOT* applicable? You're > not in control of the precise prepared statement names it generates, so > you have no guarantee that one prepared statement identified by its name > means the same in another connection. Clearly, you'd need to be able to control the prepared statement name to use such a feature. Given that we're talking about what sounds like a new feature in the JDBC driver, I don't see why you wouldn't also make that a requirement of the feature..? Or have the JDBC driver calculate a unique ID for each statement using a good hash, perhaps? 
Note: I don't pretend to have any clue as to the internals of the JDBC driver, but it hardly seems far-fetched to have this be supported in a way that works. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] NOT EXIST for PREPARE
Merlin, * Merlin Moncure (mmonc...@gmail.com) wrote: > No one is arguing that that you should send it any every time (at > least -- I hope not). I'm not sure I follow how you can avoid that though? pgbouncer in transaction pooling mode may let a particular connection die off and then, when you issue a new request, create a new one, which won't have any prepared queries in it, even though you never lost your connection to pgbouncer. That's why I was saying you'd have to send it at the start of every transaction, which does add to network load and requires parsing, etc. Would be nice to avoid that, if possible, but I'm not quite sure how. One thought might be to have the server somehow have a pre-canned set of queries already set up and ready for you to use when you connect, without any need to explicitly prepare them, etc. Just a thought. I do still like the general idea of INE support for PREPARE, but perhaps there's a better option. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*
* Merlin Moncure (mmonc...@gmail.com) wrote: > On Wed, Mar 23, 2016 at 12:37 PM, Tom Lane wrote: > > which is both SQL-standard semantics and much more efficient than > > SRF-in-tlist. We've more or less deprecated SRF-in-tlist since we > > introduced LATERAL in 9.3. How much are we willing to do to stay > > bug-compatible with old behaviors here? > > I think we should, and the fact this was caught so early on the > release cycle underscores that. One of the problems is that there are > reasonable cases (note, not impacted by this bug) of this usage that > are still commonplace, for example: > > ysanalysis=# select unnest(current_schemas(true)); >unnest > > pg_catalog > public > > I'm something of a backwards compatibility zealot, but I've become one > for very good reasons. Personally, I'd rather we'd define precisely > the usages that are deprecated (I guess SRF-tlist in the presence of > FROM) and force them to error out with an appropriate HINT rather than > give a different answer than they used to. The problem here is that > LATERAL is still fairly new and there is a huge body of code out there > leveraging the 'bad' way, as it was for years and years the only way > to do a number of useful things. I have to side with what I believe is Tom's position on this one. I do like the notion of throwing an error in cases where someone sent us something that we're pretty sure is wrong, but I don't agree that we should continue to carry on bug-compatibility with things that are already one foot in the grave and really just need to be shoved all the way in. This isn't the only break in backwards compatibility we've had over the years and is pretty far from the largest (string escaping, anyone? or removing implicit casts?) and I'd argue we're better off for it. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] PostgreSQL 9.6 behavior change with set returning (funct).*
* David G. Johnston (david.g.johns...@gmail.com) wrote: > On Wed, Mar 23, 2016 at 2:11 PM, Tom Lane wrote: > > In the meantime I suppose there's a case to be made for preserving > > bug compatibility as much as possible. > > > > So anyway the question is whether to commit the attached or not. > > +1 for commit - I'll trust Tom on the quality of the patch :) I'm not going to object to it. All-in-all, I suppose '+0' from me. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Show dropped users' backends in pg_stat_activity
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Robert Haas writes: > > I am not really in favor of half-fixing this. If we can't > > conveniently wait until a dropped role is completely out of the > > system, then I don't see a lot of point in trying to do it in the > > limited cases where we can. If LEFT JOIN is the way to go, then, > > blech, but, so be it. > > I concur. Let's put the left join(s) into those views and call it > good. I'd suggest we also add some notes to the documentation that the correct approach to dropping users is to disallow access first, then kill any existing backends, and then drop the user. That, plus the left joins, seems like it's good enough. > BTW, I think we would need the left joins even if we had interlocking > in DROP, just to protect ourselves against race conditions. Remember > that what pg_stat_activity shows is a snapshot, which might be more or > less out of date compared to the catalog contents. True, though that would likely be a much smaller set of cases that might also be short lived. Might be good to also note in the docs how to kill off sessions which are regular users but which no longer have a username, for folks who end up in this situation that they managed to drop a role which still had connections to the system. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Dealing with collation and strcoll/strxfrm/etc
All, Changed the thread name (we're no longer talking about release notes...). * Tom Lane (t...@sss.pgh.pa.us) wrote: > Oleg Bartunov writes: > > Should we start thinking about ICU ? > > Isn't it still true that ICU fails to meet our minimum requirements? > That would include (a) working with the full Unicode character range > (not only UTF16) and (b) working with non-Unicode encodings. No doubt > we could deal with (b) by inserting a conversion, but that would take > a lot of shine off the performance numbers you mention. > > I'm also not exactly convinced by your implicit assumption that ICU is > bug-free. We have a wiki page about ICU. I'm not sure that it's current, but if it isn't and people are interested then perhaps we should update it: https://wiki.postgresql.org/wiki/Todo:ICU If we're going to talk about minimum requirements, I'd like to argue that we require whatever system we're using to have versioning (which glibc currently lacks, as I understand it...) to avoid the risk that indexes will become corrupt when whatever we're using for collation changes. I'm pretty sure that's already bitten us on at least some RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues with strcoll vs. strxfrm. Regarding key abbreviation and performance, if we are confident that strcoll and strxfrm are at least independently internally consistent then we could consider offering an option to choose between them. We'd need to identify what each index was built with to do so, however, as they would need to be rebuilt if the choice changes, at least until/unless they're made to reliably agree. Even using only one or the other doesn't address the versioning problem though, which is a problem for all currently released versions of PG and is just going to continue to be an issue. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Parallel Queries and PostGIS
Paul, * Paul Ramsey (pram...@cleverelephant.ca) wrote: > I spent some time over the weekend trying out the different modes of > parallel query (seq scan, aggregate, join) in combination with PostGIS > and have written up the results here: > > http://blog.cleverelephant.ca/2016/03/parallel-postgis.html Neat! Regarding aggregate parallelism and the cascaded union approach, though I imagine in other cases as well, it seems like having a "final-per-worker" function for aggregates would be useful. Without actually looking at the code at all, it seems like that wouldn't be terribly difficult to add. Would you agree that it'd be helpful to have for making the st_union() work better in parallel? Though I do wonder if you would end up wanting to have a different final() function in that case.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Dealing with collation and strcoll/strxfrm/etc
* Peter Geoghegan (p...@heroku.com) wrote: > On Mon, Mar 28, 2016 at 7:57 AM, Stephen Frost wrote: > > If we're going to talk about minimum requirements, I'd like to argue > > that we require whatever system we're using to have versioning (which > > glibc currently lacks, as I understand it...) to avoid the risk that > > indexes will become corrupt when whatever we're using for collation > > changes. I'm pretty sure that's already bitten us on at least some > > RHEL6 -> RHEL7 migrations in some locales, even forgetting the issues > > with strcoll vs. strxfrm. > > I totally agree that anything we should adopt should support > versioning. Glibc does have a non-standard versioning scheme, but we > don't use it. Other stdlibs may do versioning another way, or not at > all. A world in which ICU is the defacto standard for Postgres (i.e. > the actual standard on all major platforms), we mostly just have one > thing to target, which seems like something to aim for. Having to figure out how each and every stdlib does versioning doesn't sound fun, I certainly agree with you there, but it hardly seems impossible. What we need, even if we look to move to ICU, is a place to remember that version information and a way to do something when we discover that we're now using a different version. I'm not quite sure what the best way to do that is, but I imagine it involves changes to existing catalogs or perhaps even a new one. I don't have any particularly great ideas for existing releases (maybe stash information in the index somewhere when it's rebuilt and then check it and throw an ERROR if they don't match?) > The question is only how we deal with this when it happens. One thing > that's attractive about ICU is that it makes this explicit, both for > the logical behavior of a collation, as well as the stability of > binary sort keys (Glibc's versioning seemingly just does the former). 
> So the equivalent of strxfrm() output has license to change for > technical reasons that are orthogonal to the practical concerns of > end-users about how text sorts in their locale. ICU is clear on what > it takes to make binary sort keys in indexes work. And various major > database systems rely on this being right. There seems to be some disagreement about if ICU provides the information we'd need to make a decision or not. It seems like it would, given its usage in other database systems, but if so, we need to very clearly understand exactly how it works and how we can depend on it. > > Regarding key abbreviation and performance, if we are confident that > > strcoll and strxfrm are at least independently internally consistent > > then we could consider offering an option to choose between them. > > I think they just need to match, per the standard. After all, > abbreviation will sometimes require strcoll() tie-breakers. Ok, I didn't see that in the man-pages. If that's the case then it seems like there isn't much hope of just using strxfrm(). Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] extend pgbench expressions with functions
* Michael Paquier (michael.paqu...@gmail.com) wrote: > On Tue, Mar 29, 2016 at 9:54 AM, Robert Haas wrote: > > On Sun, Mar 27, 2016 at 5:01 AM, Fabien COELHO wrote: > >> v40 is yet another rebase. > > > > Thanks. Committed after removing an unnecessary parameter from the > > coerceToXXX() functions. > > > > I guess the question here is whether we want the part-c patch, which > > removes the historical \setrandom syntax. That's technically no > > longer needed since we now can use random() as a function directly > > inside \set commands, but we might want it anyway for backward > > compatibility. > > > > Anybody have an opinion on that? > > +1 for nuking it. That's not worth the trouble maintaining it. If we don't nuke it, it'll never die. See also: pg_shadow Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] [CommitFest App] Feature request -- review e-mail additions
* José Luis Tallón (jltal...@adv-solutions.net) wrote: > * Prepend a [review] tag to the e-mail subject > ... so that e-mails sent to -hackers will read " [HACKERS] > [review] " Eh, I'm not against it but not sure it's all that necessary either. > * Auto-CC the patch author on this e-mail > I guess this should speed up reactions / make communication a > bit more fluid. This is almost a requirement, imv, as subsequent comments on the review are frequent and it's a crime if the author misses all of that. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Multi-tenancy with RLS
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Joe Conway writes: > > As Stephen mentioned, yes, I am very interested in at least some aspects > > of this patch. The ability to apply RLS to system tables could be useful > > to solve a number of problems we don't have a good story for today, > > multi-tenancy only being one of them. > > FWIW, it seems offhand like we might not have that much trouble with > applying RLS to system catalogs as long as it's understood that RLS > only has anything to do with SQL queries issued against the catalogs. Right, that's what this patch set is about. > If we imagine that RLS should somehow filter a backend's own operations on > the catalogs, then I agree with Robert that the entire thing is deeply > scary and probably incapable of being made to work robustly. Personally, I like the idea of the capability, but I also agree that it'd be a great deal more challenging to do and would require a lot of pretty invasive and scary changes. Hence, my thinking was that we'd define our own set of policies which mimic what we already do through the permissions system (thus, only impacting SQL queries against the catalog and not anything about how the backend accesses the catalogs). I'm on the fence about if we'd allow those policies to be modified by users or not. > However, by "not that much trouble" I only mean getting an implementation > that works and doesn't create more security problems than it fixes. > Usability is still likely to be a huge problem. In particular it seems > likely that any attempt to actually put RLS policies on the catalogs would > completely destroy the ability to run pg_dump except as a BYPASSRLS role. > That would be an unpleasant consequence. I don't follow how this would destroy the ability to run pg_dump. Ideally, we'd have a result where a user could run pg_dump without having to apply any filters of their own and they'd get a dump of all objects they're allowed to see. Thanks! 
Stephen signature.asc Description: Digital signature
Re: [HACKERS] Multi-tenancy with RLS
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> However, by "not that much trouble" I only mean getting an implementation > >> that works and doesn't create more security problems than it fixes. > >> Usability is still likely to be a huge problem. In particular it seems > >> likely that any attempt to actually put RLS policies on the catalogs would > >> completely destroy the ability to run pg_dump except as a BYPASSRLS role. > >> That would be an unpleasant consequence. > > > I don't follow how this would destroy the ability to run pg_dump. > > Ideally, we'd have a result where a user could run pg_dump without > > having to apply any filters of their own and they'd get a dump of all > > objects they're allowed to see. > > You mean, other than the fact that pg_dump sets row_security = off > to ensure that what it's seeing *isn't* filtered. There's a specific option to turn it back on already though. This wouldn't change that. > The bigger picture here is that I do not think that you can just > arbitrarily exclude non-owned objects from its view and still expect to > get a valid dump; that will break dependency chains for example, possibly > leading to stuff getting output in an order that doesn't restore. We already have that issue when users select to dump out specific schemas, I don't see this as being any different. The idea behind multi-tenancy is, generally speaking, you don't see or have any references or dependencies with what other people have. In those cases, there won't be any dependencies to objects that you can't see. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Multi-tenancy with RLS
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > * Tom Lane (t...@sss.pgh.pa.us) wrote: > >> Stephen Frost writes: > >>> I don't follow how this would destroy the ability to run pg_dump. > >>> Ideally, we'd have a result where a user could run pg_dump without > >>> having to apply any filters of their own and they'd get a dump of all > >>> objects they're allowed to see. > > >> You mean, other than the fact that pg_dump sets row_security = off > >> to ensure that what it's seeing *isn't* filtered. > > > There's a specific option to turn it back on already though. > > Whereupon you'd have no certainty that what you got represented a > complete dump of your own data. It would be a dump of what you're allowed to see, rather than an error saying you couldn't dump something you couldn't see, which is the alternative we're talking about here. Even if you've got a dependency to something-or-other, if you don't have access to it, then you're going to get an error. In practice, you have to make sure to remember to include all of your schemas when you pg_dump, and don't get it wrong or you'll get an error (you don't have access to some schema referenced) or a subset of what you intended (you forgot to include one you meant to). That is not a better user experience than being able to say "dump out everything I've got access to." In many, many use-cases that's exactly what you want. pg_dump is more than just a whole-database backup tool, and when it's used as a whole-database backup tool, you'll need to make sure it has BYPASSRLS or is a superuser or you could end up getting errors. I don't see any issue with that. If the policies are incorrect then that'd be a problem, but I'm certainly hopeful that we'd be able to get that right. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Log operating system user connecting via unix socket
José, * José Arthur Benetasso Villanova (jose.art...@gmail.com) wrote: > Here in my work, we have about 100 PostgreSQL machines and about 20 users > with superuser privileges. Sounds pretty common. What kind of superuser rights are they using? What is the minimum set of rights that are required for these users (which may break out into different groups, naturally)? We're looking at ways to provide access for certain operations to non-superusers, to reduce the number of superuser accounts required. > This group of 20 people change constantly, so it's cumbersome create a role > for each. Instead, we map all of then in pg_ident.conf. Do these 20 individuals have 'regular' accounts also? Have you considered granting them a superuser role which they could 'set role' to when they need to perform a superuser operation? > The problem is: with current postgres log, I just know that a postgres user > connect, but I don't know which one is in case that more than one user is > logged in the server. Understood, that's unfortunate. > This simple log line can create the relations needed for an audit. > > Feel free to comment and criticize. What I think we really want here is logging of the general 'system user' for all auth methods instead of only for the 'peer' method. Knowing who connected via Kerberos is quite valuable also, for example. My thinking would be to add 'system_user' to the Port struct and then log the system user in PerformAuthentication. Another thought might be to replace peer_cn with 'peer_user' (it's the same thing for client cert SSL connections, after all..) and then log it and also make it available in pg_stat_activity. These are a bit off-the-cuff comments, but hopefully make sense and provide the right direction to be looking in. The other thing to consider is how this information is reflected in the CSV log and/or log_line_prefix.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Log operating system user connecting via unix socket
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > What I think we really want here is logging of the general 'system > > user' for all auth methods instead of only for the 'peer' method. > > Well, we don't really know that except in a small subset of auth > methods. I agree that when we do know it, it's useful info to log. Right. > My big beef with the proposed patch is that the log message is emitted > unconditionally. There are lots and lots of users who feel that during > normal operation, *zero* log messages should get emitted. Those villagers > would be on our doorsteps with pitchforks if we shipped this patch as-is. Agreed. > I would propose that this information should be emitted only when > log_connections is enabled, and indeed that it should be part of the > log_connections message not a separate message. So this leads to > thinking that somehow, the code for individual auth methods should > be able to return an "additional info" field for inclusion in > log_connections. We already have such a concept for auth failures, > cf commit 5e0b5dcab. Apologies if it wasn't clear, but that's exactly what I was suggesting by saying to add it to PerformAuthentication, which is where we emit the connection info when log_connections is enabled. > > ... and also make it available in pg_stat_activity. > > That's moving the goalposts quite a bit, and I'm not sure it's necessary > or even desirable. Let's just get this added to log_connections output, > and then see if there's field demand for more. This was in context of peer_cn, which is just a specific "system user" value and which we're already showing in pg_stat_* info tables. I'd love to have the Kerberos principal available, but I don't think it'd make sense to have a 'pg_stat_kerberos' just for that. 
I agree that it's moving the goalposts for this patch and could be an independent patch, but I don't see it as any different, from a desirability and requirements perspective, than what we're doing for SSL connections. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Mon, Jan 4, 2016 at 12:55:16PM -0500, Stephen Frost wrote: > > I'd like to be able to include, in both of those, a simple set of > > instructions for granting the necessary rights to the user who is > > running those processes. A set of rights which an administrator can go > > look up and easily read and understand the result of those grants. For > > example: > > > ... > > pgbackrest: > > > > To run pgbackrest as a non-superuser and not the 'postgres' system > > user, grant the pg_backup role to the backrest user and ensure the > > backrest system user has read access to the database files (eg: by > > having the system user be a member of the 'postgres' group): > -- > > Just to clarify, the 'postgres' OS user group cannot read the data > directory, e.g. > > drwx------ 19 postgres staff 4096 Jan 17 12:19 data/ > ^^^group > > I assume we don't want to change that. This is going to be distribution dependent, unfortunately. On Debian-based distributions, the group is 'postgres' and it'd be perfectly reasonable to allow that group to read the data directory. I don't recall offhand if that means we'd have to make changes to allow that, but, for my 2c, I don't see why we wouldn't allow it to be an option. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Sun, Jan 17, 2016 at 01:49:19PM -0500, Stephen Frost wrote: > > * Bruce Momjian (br...@momjian.us) wrote: > > > > pgbackrest: > > > > > > > > To run pgbackrest as a non-superuser and not the 'postgres' system > > > > user, grant the pg_backup role to the backrest user and ensure the > > > > backrest system user has read access to the database files (eg: by > > > > having the system user be a member of the 'postgres' group): > > > -- > > > > > > Just to clarify, the 'postgres' OS user group cannot read the data > > > directory, e.g. > > > > > > drwx------ 19 postgres staff 4096 Jan 17 12:19 data/ > > > ^^^group > > > > > > I assume we don't want to change that. > > > > This is going to be distribution dependent, unfortunately. On > > Debian-based distributions, the group is 'postgres' and it'd be > > perfectly reasonable to allow that group to read the data directory. > > Well, while the group name would be OS-dependent, the lack of any group > permissions is not OS-dependent and is forced by initdb: > > umask(S_IRWXG | S_IRWXO); > > create_data_directory(); Right, we also check in the backend on startup for certain permissions. I don't recall offhand if that's forced to 700 or if we allow 750. > > I don't recall offhand if that means we'd have to make changes to allow > > that, but, for my 2c, I don't see why we wouldn't allow it to be an > > option. > > OK, that would be an initdb change then. It would need to be optional, so distributions and users could choose which makes sense for their systems. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Wed, Jan 6, 2016 at 12:29:14PM -0500, Robert Haas wrote: > > The point is that with the GRANT EXECUTE ON FUNCTION proposal, authors > > of monitoring tools enjoy various really noteworthy advantages. They > > can have monitoring roles which have *exactly* the privileges that > > their tool needs, not whatever set of permissions (larger or smaller) > > the core project has decided the pg_monitor role should have. They can > > have optional features requiring extra permissions and those extra > > permissions can be granted in precisely those shops where those extra > > features are in use. They can deploy a new version of their > > monitoring tool that requires an extra privilege on an existing > > PostgreSQL release without requiring any core modifications, which > > shaves years of time off the deployment schedule and avoids > > contentious arguments with the lovable folks who populate this mailing > > list. That sounds *terrific* to me compared to the alternative you > > are proposing. > > I assume backup tools would either document the functions they want > access to via SQL commands, or supply a script. I assume they would > create a non-login role (group) with the desired permissions, and then > have users inherit from that. They would also need to be able to allow > upgrades where they would (conditionally?) add the role and then > add/revoke permissions as needed, e.g. they might not need all > permissions they needed in a previous release, or they might need new > ones. > > That all seems very straight-forward to me. I'm not against that idea, though I continue to feel that there are common sets of privileges which backup tools could leverage. The other issue that I'm running into, again, while considering how to move back to ACL-based permissions for these objects is that we can't grant out the actual permissions which currently exist.
That means we either need to break backwards compatibility, which would be pretty ugly, in my view, or come up with new functions and then users will have to know which functions to use when. As I don't think we really want to break backwards compatibility or remove existing functionality, the only approach which is going to make sense is to add additional functions in some cases. In particular, we will need alternate versions of pg_terminate_backend and pg_cancel_backend. One thought I had was to make that 'pg_signal_backend', but that sounds like we'd allow any signal sent by a user with that right, which seems a bit much to me... Perhaps that's what I'll do though, barring other suggestions. Thanks! Stephen
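As a sketch of the workaround available today (not part of the proposal above), a superuser-owned SECURITY DEFINER wrapper can hand a specific role the broader termination right; the function and role names here are illustrative:

```sql
-- Runs with its owner's (a superuser's) privileges, so members of
-- ops_role can terminate backends they do not own.
CREATE FUNCTION terminate_any_backend(pid integer)
RETURNS boolean AS $$
  SELECT pg_terminate_backend($1);
$$ LANGUAGE sql SECURITY DEFINER;

-- Functions are executable by PUBLIC by default, so revoke that first,
-- then grant EXECUTE to just the intended role.
REVOKE EXECUTE ON FUNCTION terminate_any_backend(integer) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION terminate_any_backend(integer) TO ops_role;
```

This is exactly the kind of per-site boilerplate the proposed built-in role or GRANT-on-catalog-function support would make unnecessary.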
Re: [HACKERS] Additional role attributes && superuser review
Bruce, * Bruce Momjian (br...@momjian.us) wrote: > On Sun, Jan 17, 2016 at 01:57:22PM -0500, Stephen Frost wrote: > > Right, we also check in the backend on startup for certain permissions. > > I don't recall offhand if that's forced to 700 or if we allow 750. > > > > > > I don't recall offhand if that means we'd have to make changes to allow > > > > that, but, for my 2c, I don't see why we wouldn't allow it to be an > > > > option. > > > > > > OK, that would be an initdb change then. > > > > It would need to be optional, so distributions and users could choose > > which makes sense for their systems. > > While the group owner of the directory is a distribution question, the > permissions are usually a backup-method-specific requirement. I can see > us creating an SQL function that opens up group permissions on the data > directory for specific backup tools that need it, then granting > permissions on that function to the backup role. This is another > example where different backup tools need different permissions. I don't believe we can really consider group ownership and group permissions independently. They really go hand-in-hand. On RedHat-based systems, where the group is set as 'staff', you probably don't want group permissions to be allowed. On Debian-based systems, where there is a dedicated 'postgres' group, group permissions are fine to allow. Group ownership and permissions aren't a backup-method-specific requirement either, in my view. I'm happy to chat with Marco (who has said he would be weighing in on this thread when he is able to) regarding barman, and whomever would be appropriate for BART (perhaps you could let me know..?), but if it's possible to do a backup without being a superuser and with only read access to the data directory, I would expect every backup solution to view that as a feature which they want to support, as there are environments which will find it desirable, at a minimum, and even some which will require it. 
Lastly, I'm pretty sure this would have to be a postgresql.conf option as we check the permissions on the data directory on startup. I don't see how having an SQL function would work there as I certainly wouldn't want to have the permissions changing on a running system. That strikes me as being risky without any real benefit. Either it's safe and acceptable to allow those rights to the group, or it isn't. A temporary grant really isn't helping with anything that I can see; surely there are numerous ways to exploit such a time-based grant. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Sun, Jan 17, 2016 at 06:58:25PM -0500, Stephen Frost wrote: > > I'm not against that idea, though I continue to feel that there are > > common sets of privileges which backup tools could leverage. > > > > The other issue that I'm running into, again, while considering how to > > move back to ACL-based permissions for these objects is that we can't > > grant out the actual permissions which currently exist. That means we > > Is that because many of them are complex, e.g. you can kill only your > own sessions? Right. > > either need to break backwards compatibility, which would be pretty > > ugly, in my view, or come up with new functions and then users will have > > to know which functions to use when. > > > > As I don't think we really want to break backwards compatibility or > > remove existing functionality, the only approach which is going to make > > sense is to add additional functions in some cases. In particular, we > > will need alternate versions of pg_terminate_backend and > > pg_cancel_backend. One thought I had was to make that > > Like these? Could we define own-user-type rights? Interesting idea but I don't really see that being general enough that we would want to burn a GRANT bit for it... Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Sun, Jan 17, 2016 at 09:10:23PM -0500, Stephen Frost wrote: > > > While the group owner of the directory is a distribution question, the > > > permissions are usually a backup-method-specific requirement. I can see > > > us creating an SQL function that opens up group permissions on the data > > > directory for specific backup tools that need it, then granting > > > permissions on that function to the backup role. This is another > > > example where different backup tools need different permissions. > > > > I don't believe we can really consider group ownership and group > > permissions independently. They really go hand-in-hand. On > > RedHat-based systems, where the group is set as 'staff', you probably > > don't want group permissions to be allowed. On Debian-based systems, > > where there is a dedicated 'postgres' group, group permissions are fine > > to allow. > > Yes, I can see that as problematic. Seems it would have to be something > done by the administrator from the command-line. initdb on both RedHat and Debian-based systems is run, generally speaking, from the packaging scripts. They would be able to pass the correct options to initdb (or PG itself, if we decide that's necessary..). > > Group ownership and permissions aren't a backup-method-specific > > requirement either, in my view. I'm happy to chat with Marco (who has > > said he would be weighing in on this thread when he is able to) > > regarding barman, and whomever would be appropriate for BART (perhaps > > you could let me know..?), but if it's possible to do a backup without > > being a superuser and with only read access to the data directory, I > > would expect every backup solution to view that as a feature which they > > want to support, as there are environments which will find it desirable, > > at a minimum, and even some which will require it. 
> > pg_dump doesn't need to read the PGDATA directory, and I thought this > permission was to be used by pg_dump users as well. No. That has been a source of confusion, though I'm not quite sure how or why, beyond the general assumption that anything 'backup' must include 'pg_dump' (I don't generally consider that to be the case, myself, but it seems others do...). This is only for file-based backups. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Bruce Momjian (br...@momjian.us) wrote: > On Sun, Jan 17, 2016 at 09:23:14PM -0500, Stephen Frost wrote: > > > > Group ownership and permissions aren't a backup-method-specific > > > > requirement either, in my view. I'm happy to chat with Marco (who has > > > > said he would be weighing in on this thread when he is able to) > > > > regarding barman, and whomever would be appropriate for BART (perhaps > > > > you could let me know..?), but if it's possible to do a backup without > > > > being a superuser and with only read access to the data directory, I > > > > would expect every backup solution to view that as a feature which they > > > > want to support, as there are environments which will find it desirable, > > > > at a minimum, and even some which will require it. > > > > > > pg_dump doesn't need to read the PGDATA directory, and I thought this > > > permission was to be used by pg_dump users as well. > > > > No. That has been a source of confusion, though I'm not quite sure how > > or why, beyond the general assumption that anything 'backup' must > > include 'pg_dump' (I don't generally consider that to be the case, > > myself, but it seems others do...). > > I think the source of that is that many people have asked for > backup-only uses, and I thought running pg_dump or pg_dumpall was one of > those cases. I'd want that to be a separate permission which would really be something along the lines of "allowed to read everything through a DB connection", which is what pg_dump/pg_dumpall need. Thanks! Stephen
Re: [HACKERS] exposing pg_controldata and pg_config as functions
* Robert Haas (robertmh...@gmail.com) wrote: > On Mon, Jan 18, 2016 at 4:43 AM, Andres Freund wrote: > > Meh, that seems pretty far into pseudo security arguments. > > Yeah, I really don't see anything in the pg_controldata output that > looks sensitive. The WAL locations are the closest of anything, > AFAICS. Heikki already showed how the WAL location information could be exploited if compression is enabled. I believe that's something we should care about and fix in one way or another (my initial approach was using default roles, but using the ACL system and starting out w/ no rights granted to that function also works). Thanks! Stephen
Re: [HACKERS] Buildfarm server move
Tom, all, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Andrew Dunstan writes: > > Tomorrow, January 18th, at 4.00 pm US East Coast time (UT - 5.0) we will > > be moving the buildfarm server from its current home at CommandPrompt, > > Um, this message is postmarked 18 Jan 17:20, an hour later than the > stated move time. Did you mean the move will be Tue 19 Jan? Yes. It'll be tomorrow. Thanks! Stephen
Re: [HACKERS] Log operating system user connecting via unix socket
José, * José Arthur Benetasso Villanova (jose.art...@gmail.com) wrote: > I wrote 2 possible patches, both issuing a detail message only if > log_connections is enabled. > > The first one using the Stephen Frost suggestion, inside the Port struct (I > guess that this is the one, I couldn't find the Peer struct) Ah, yes, apologies for the typo. > The second one following the same approach of cf commit 5e0b5dcab, as > pointed by Tom Lane. This really isn't quite following the approach used in 5e0b5dcab, from my viewing of it. I believe Tom was suggesting that an essentially opaque value be returned to be included rather than what you've done which codifies it as 'system_user'. I'm not a fan of that approach though, as the mapping system we have in pg_ident is generalized and this should be implemented generally by all authentication methods which support mappings. That's also why I was suggesting to get rid of peer_cn in the Port structure in favor of having the 'system_user' or similar variable and using it in all of these cases where we provide mapping support - then all of the auth methods which support mappings would set that value, including the existing SSL code. You might need to check and see if there's anything which depends on peer_cn being NULL for non-SSL connections and adjust that logic, but hopefully that's not what we're relying on. I don't see anything like that on a quick glance through the peer_cn usage. Also, it looks like you missed one of the exit cases from pg_SSPI_recvauth(), no? You may need to refactor that code a bit to provide easy access to the system username that was used, or simply make sure to set the port->system_user value in both paths. Lastly, are you sure that you have the right memory context for the pstrdup() calls that you're making? 
be-secure-openssl.c goes to some effort to ensure that the memory allocated for peer_cn is in the TopMemoryContext, but I don't see anything like that in the code proposed, which makes me worried you didn't consider which memory context you were allocating in. Thanks! Stephen
Re: [HACKERS] Template for commit messages
* Joshua D. Drake (j...@commandprompt.com) wrote: > On 01/28/2016 06:57 AM, Robert Haas wrote: > > >>I'm on board with Bruce's template as being a checklist of points to be > >>sure to cover when composing a commit message. I'm not sure we need > >>fixed-format rules. > > > >Well, I think what people are asking for is precisely a fixed format, > >and I do think there is value in that. It's nice to capture the > >nuance, but the nuance is going to get flattened out anyway when the > >release notes are created. If we all agree to use a fixed format, > >then a bunch of work there that probably has to be done manually can > >be automated. However, if we all agree to use a fixed format except > >for you, we might as well just forget the whole thing, because the > >percentage of commits that are done by you is quite high. > > Yes, we are either all in or we may as well forgo this. I don't have a particular issue with it, but would like whatever template is decided upon to be included in our git repo and then we should be able to make it the template that's opened up when people go to commit pretty easily. Thanks! Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Robert Haas (robertmh...@gmail.com) wrote: > On Sun, Jan 17, 2016 at 6:58 PM, Stephen Frost wrote: > > I'm not against that idea, though I continue to feel that there are > > common sets of privileges which backup tools could leverage. > > > > The other issue that I'm running into, again, while considering how to > > move back to ACL-based permissions for these objects is that we can't > > grant out the actual permissions which currently exist. That means we > > either need to break backwards compatibility, which would be pretty > > ugly, in my view, or come up with new functions and then users will have > > to know which functions to use when. > > > > As I don't think we really want to break backwards compatibility or > > remove existing functionality, the only approach which is going to make > > sense is to add additional functions in some cases. In particular, we > > will need alternate versions of pg_terminate_backend and > > pg_cancel_backend. One thought I had was to make that > > 'pg_signal_backend', but that sounds like we'd allow any signal sent by > > a user with that right, which seems a bit much to me... > > So, this seems like a case where a built-in role would be > well-justified. I don't really believe in built-in roles as a way of > bundling related permissions; I know you do, but I don't. I'd rather > see the individual function permissions granted individually. But > here you are talking about a variable level of access to the same > function, depending on role. And for that it seems to me that a > built-in role has a lot more to recommend it in that case than it does > in the other. If you have been granted pg_whack, you can signal any > process on the system; otherwise just your own. Those checks are > internal to pg_terminate_backend/pg_cancel_backend so GRANT is not a > substitute. 
If we extend this into the future then it seems to potentially fall afoul of Noah's concern that we're going to end up with two different ways of saying GRANT EXECUTE X ON Y. Consider the more complicated case of pg_stat_activity, where values are shown or hidden based on who the current role is. The policy system only supports whole-row, today, but the question has already come up, both on the lists and off, of support for hiding individual column values for certain rows, exactly as we do today for pg_stat_activity. Once we reach that point, we'll have a way to GRANT out these rights and a default role which just has them. Personally, I don't have any particular issue having both, but the desire was stated that it would be better to have the regular GRANT EXECUTE ON catalog_func() working before we consider having default roles for same. That moves the goal posts awful far though, if we're to stick with that and consider how we might extend the GRANT system in the future. What got me thinking along these lines was a question posed by Bruce (Bruce, feel free to chime in if I've misunderstood) to me at SCALE where we were chatting a bit about this, which was: how could we extend GRANT to support the permissions that we actually wish pg_terminate_backend() and pg_cancel_backend() to have, instead of using a default role? I didn't think too much on it at the time as it strikes me as a pretty narrow use-case and requiring quite a bit of complexity to get right, but there again, I'd certainly view a system where the user is allowed to have pg_start_backup() rights but not pg_stop_backup() as being quite a small use-case also, yet that's the direction we're largely going in with this discussion. Thanks! Stephen
Re: [HACKERS] Template for commit messages
* Bruce Momjian (br...@momjian.us) wrote: > On Thu, Jan 28, 2016 at 10:40:09AM -0500, Stephen Frost wrote: > > * Joshua D. Drake (j...@commandprompt.com) wrote: > > > On 01/28/2016 06:57 AM, Robert Haas wrote: > > > > > > >>I'm on board with Bruce's template as being a checklist of points to be > > > >>sure to cover when composing a commit message. I'm not sure we need > > > >>fixed-format rules. > > > > > > > >Well, I think what people are asking for is precisely a fixed format, > > > >and I do think there is value in that. It's nice to capture the > > > >nuance, but the nuance is going to get flattened out anyway when the > > > >release notes are created. If we all agree to use a fixed format, > > > >then a bunch of work there that probably has to be done manually can > > > >be automated. However, if we all agree to use a fixed format except > > > >for you, we might as well just forget the whole thing, because the > > > >percentage of commits that are done by you is quite high. > > > > > > Yes, we are either all in or we may as well forgo this. > > > > I don't have a particular issue with it, but would like whatever > > template is decided upon to be included in our git repo and then we > > should be able to make it the template that's opened up when people go > > to commit pretty easily. > > OK, but keep in mind whatever script committers use should remove tags > that are empty after exiting the editor. I can provide the grep regex > in git somewhere too: > > egrep -v > "^(Author|Reported-by|Bug|Reviewed-by|Tested-by|Backpatch-through): *$" If the template is there then, for my part at least, I wouldn't mind killing the empty lines. Having a decent script would work too, of course... but I have to admit that I've not tried having a script modify my commit messages right before they're committed and, well, it'd take a bit for me to be comfortable with it. No one wants garbled commit messages from a script that went awry. ;) Thanks! 
Stephen
Re: [HACKERS] Additional role attributes && superuser review
* Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Jan 28, 2016 at 11:04 AM, Stephen Frost wrote: > >> So, this seems like a case where a built-in role would be > >> well-justified. I don't really believe in built-in roles as a way of > >> bundling related permissions; I know you do, but I don't. I'd rather > >> see the individual function permissions granted individually. But > >> here you are talking about a variable level of access to the same > >> function, depending on role. And for that it seems to me that a > >> built-in role has a lot more to recommend it in that case than it does > >> in the other. If you have been granted pg_whack, you can signal any > >> process on the system; otherwise just your own. Those checks are > >> internal to pg_terminate_backend/pg_cancel_backend so GRANT is not a > >> substitute. > > > > If we extend this into the future then it seems to potentially fall > > afoul of Noah's concern that we're going to end up with two different > > ways of saying GRANT EXECUTE X ON Y. Consider the more complicated case > > of pg_stat_activity, where values are shown or hidden based on who the > > current role is. The policy system only supports whole-row, today, but > > the question has already come up, both on the lists and off, of support > > for hiding individual column values for certain rows, exactly as we do > > today for pg_stat_activity. Once we reach that point, we'll have a way > > to GRANT out these rights and a default role which just has them. > > Well, I'm not saying that predefined roles are the *only* way to solve > a problem like this, but I think they're one way and I don't clearly > see that something else is better. However, my precise point is that > we should *not* have predefined roles that precisely duplicate the > result of GRANT EXECUTE X ON Y. Having predefined roles that change > the behavior of the system in other ways is a different thing. So I > don't see how this falls afoul of Noah's concern. 
> (If it does, perhaps he can clarify.) Apologies if it seems like what I'm getting at is obtuse, but I'm trying to generalize this, to provide guidance on how to handle the larger set of privileges. If I'm following correctly, having default roles for cases where the role is specifically for something more than 'GRANT EXECUTE X ON Y' (or a set of such commands..?) makes sense. Going back to the list of roles, that would mean that the following default roles make sense, as these are cases where we can't simply write GRANT statements and get the same result:

pg_monitor - Allows roles granted it to see more information in pg_stat_activity. Can't be just a regular non-default-role right as we don't, currently, have a way to say "filter out the values of certain columns on certain rows, but not on others." (There's a question here though - for the privileges which will be directly GRANT'able, should we GRANT those to pg_monitor, or have pg_monitor only provide unfiltered access to pg_stat_activity, or..? Further, if it's only for pg_stat_activity, should we name it something else?)

pg_signal_backend - Allows roles to signal other backend processes beyond those backends which are owned by a user they are a role member of. Can't be a regular non-default-role right as we don't, currently, have any way to GRANT rights to send signals only to backends you own or are a member of.

pg_replication - Allows roles to use the various replication functions. Can't be a regular non-default-role right as the REPLICATION role attribute allows access to those functions and the GRANT system has no way of saying "allow access to these functions if they have role attribute X."

But we don't need these (or at least, shouldn't have them without supporting GRANT on catalog objects and agreement on what they're intended for):

pg_backup - pg_start_backup(), pg_stop_backup(), pg_switch_xlog(), and pg_create_restore_point() will all be managed by the normal GRANT system and therefore we don't need a default role for those use-cases. 
pg_file_settings - The pg_file_settings() function and pg_file_settings view will be managed by the normal GRANT system and therefore we don't need a default role for those use-cases.

pg_replay - pg_xlog_replay_pause() and pg_xlog_replay_resume() will be managed by the normal GRANT system and therefore we don't need a default role for those use-cases.

pg_rotate_logfile - pg_rotate_logfile() will be managed by the normal GRANT system and therefore we don't need a default role for that use-case.

> > Personally, I don't have any particular issue having both, but the desire was stated that it would be better to have the regular GRANT EXECUTE ON catalog_func() working before we consider having default roles for same.
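If grants on catalog functions become dumpable as discussed above, the non-default-role cases would reduce to ordinary GRANT statements; a hypothetical sketch for the pg_backup case (the role name is illustrative, and the function signatures assume the 9.5-era catalogs):

```sql
-- A non-login group role for file-based backup tools.
CREATE ROLE backup_role NOLOGIN;

-- Grant just the backup-related catalog functions, nothing more.
GRANT EXECUTE ON FUNCTION pg_start_backup(text, boolean) TO backup_role;
GRANT EXECUTE ON FUNCTION pg_stop_backup() TO backup_role;
GRANT EXECUTE ON FUNCTION pg_switch_xlog() TO backup_role;
GRANT EXECUTE ON FUNCTION pg_create_restore_point(text) TO backup_role;
```

A backup tool's install script could then simply grant backup_role to its service account, which is the upgrade story Bruce describes earlier in the thread.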
Re: [HACKERS] Additional role attributes && superuser review
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: > On Fri, Jan 29, 2016 at 6:37 AM, Stephen Frost wrote: > > * Robert Haas (robertmh...@gmail.com) wrote: > >> On Thu, Jan 28, 2016 at 11:04 AM, Stephen Frost > wrote: > >> > Personally, I don't have any particular issue having both, but the > >> > desire was stated that it would be better to have the regular > >> > GRANT EXECUTE ON catalog_func() working before we consider having > >> > default roles for same. That moves the goal posts awful far though, if > >> > we're to stick with that and consider how we might extend the GRANT > >> > system in the future. > >> > >> I don't think it moves the goal posts all that far. Convincing > >> pg_dump to dump grants on system functions shouldn't be a crazy large > >> patch. > > > > I wasn't clear as to what I was referring to here. I've already written > > a patch to pg_dump to support grants on system objects and agree that > > it's at least reasonable. > > Is it already posted somewhere? I don't recall seeing it. Robert and Noah > have a point that this would be useful for users who would like to dump > GRANT/REVOKE rights on system functions & all, using a new option in > pg_dumpall, say --with-system-acl or --with-system-privileges. Multiple versions were posted on this thread. I don't fault you for not finding it, this thread is a bit long in the tooth. The one I'm currently working from is: http://www.postgresql.org/message-id/attachment/38049/catalog_function_acls_v4.patch I'm going to split up the pg_dump changes and the backend changes, as they can logically go in independently (though without both, we're not moving the needle very far with regards to what administrators can do). > If at least > the three of you are agreeing here I think that we should try to move at > least toward this goal first. That seems a largely doable goal for 9.6. 
For > the set of default roles, there is clearly no consensus regarding > what each role should do or not, and under which limitation it should > operate. I'm trying to work towards a consensus on the default roles, hence the questions and discussion posed in the email you replied to. Thanks! Stephen
Re: [HACKERS] PostgreSQL Audit Extension
All, * Robert Haas (robertmh...@gmail.com) wrote: > OK, I'll bite: I'm worried that this patch will be a maintenance > burden. It's easy to imagine that changes to core will result in the > necessity or at least desirability of changes to pgaudit, but I'm > definitely not prepared to insist that future authors try to insist > that future patch submitters have to understand this code and update > it as things change. I agree with this concern in general, and have since pgaudit was first proposed for inclusion by Simon two years ago. Having auditing as an extension is what makes this into an issue though. Were auditing baked into core, it'd be far more straight-forward, much easier to maintain, and would be updated as we add new core capabilities naturally. David's comments about how pgaudit has to work are entirely accurate- everything ends up having to be reconstructed in a very painful way. > The set of things that the patch can audit is pretty arbitrary and not > well tied into the core code. This also speaks to the difficulties of having auditing implemented as an extension. * Tom Lane (t...@sss.pgh.pa.us) wrote: > Yeah. Auditing strikes me as a fine example of something for which there > is no *technical* reason to need to put it in core. It might need some > more hooks than we have now, but that's no big deal. I disagree with this, quite strongly. There are very good, technical, reasons why auditing should be a core capability. Adding more hooks, exposing various internal functions for use by extensions, etc, doesn't change that any extension would have to be constantly updated as new capabilities are added to core and auditing, as a capability, would continue to be limited by virtue of being implemented as an extension. 
I don't mean to detract anything from what 2ndQ and David have done with pgaudit (which is the only auditing solution of its kind that I'm aware of, so I don't understand the "competing implementations" argument which has been brought up at all: there's really only one). They've done just as much as each version of PG has allowed them to do. It's a much better solution than what we have today (which is basically just log_statement = 'all', which no one should ever consider to be a real auditing solution). I've already stated that I'd be willing to take on responsibility for maintaining it as an extension (included with PG or not), but it's not the end-all, be-all of auditing for PG and we should be continuing to look at implementing a full in-core auditing solution. I'm still of the opinion that we should include pgaudit for the reason that it's a good incremental step towards proper in-core auditing, but there's clearly no consensus on that and it doesn't appear that further discussion is likely to change that. An in-core auditing solution would provide us with proper grammar support, ability to directly mark objects for auditing in the catalog, allow us to much more easily maintain auditing capability over time as a small incremental bit of work for each new feature (with proper in-core infrastructure for it) and generally be a far better technical solution. Leveraging the GRANT system is quite cute, and does work, but it's certainly non-intuitive and is only because we've got no better way, due to it being implemented as an extension. 
Looking at pgaudit and the other approaches to auditing which have been developed (eg: applications which sit in front of PG and essentially have to reimplement large bits of PG to then audit the commands sent before passing them to PG, or hacks which try to make sense out of log files full of SQL statements) make it quite clear, in my view, that attempts to bolt-on auditing to PG result in a poorer solution, from a technical perspective, than what this project is known for and capable of. To make true progress towards that, however, we need to get past the thinking that auditing doesn't need to be in-core or that it should be a second-class citizen feature or that we don't need it in PG. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] PostgreSQL Audit Extension
* Joe Conway (m...@joeconway.com) wrote: > On 02/05/2016 10:16 AM, Stephen Frost wrote: > > An in-core auditing solution would provide us with proper grammar > > support, ability to directly mark objects for auditing in the catalog, > > allow us to much more easily maintain auditing capability over time as > > a small incremental bit of work for each new feature (with proper > > in-core infrastructure for it) and generally be a far better technical > > solution. Leveraging the GRANT system is quite cute, and does work, but > > it's certainly non-intuitive and is only because we've got no better > > way, due to it being implemented as an extension. > > I think one additional item needed would be the ability for the audit > logs to be sent to a different location than the standard logs. Indeed, reworking the logging to be supportive of multiple destinations with tagging of the source, etc, has long been a desire of mine (and others), though that's largely independent of auditing itself. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Explanation for bug #13908: hash joins are badly broken
* Tom Lane (t...@sss.pgh.pa.us) wrote: > I'm of the opinion that this is a stop-ship bug for 9.5.1. Barring > somebody producing a fix over the weekend, I will deal with it by > reverting the aforementioned commit. Agreed. Thanks! Stephen
Re: [HACKERS] [ADMIN] 9.5 new setting "cluster name" and logging
Thomas, * Thomas Munro (thomas.mu...@enterprisedb.com) wrote: > On Tue, Feb 9, 2016 at 5:30 AM, Joe Conway wrote: > > On 02/08/2016 06:24 AM, Andres Freund wrote: > >> On 2016-01-29 22:19:45 -0800, Evan Rempel wrote: > >>> Now that there is a setting to give a cluster a "name", it would be nice > >>> to > >>> have an escape sequence in the log_line_prefix setting that could > >>> reference > >>> the cluster_name. > >> > >> I've argued[1][2] for this when cluster_name was introduced, but back > >> then I seemed to have been the only one arguing for it. Josh later > >> jumped that train. > >> > >> Given that we now had a number of people wishing for this, can we maybe > >> reconsider? > > > > Seems like a reasonable idea to me. But if we add a log_line_prefix > > setting, shouldn't we also add it to csvlog output too? > > Here's a tiny patch adding support for %C to log_line_prefix (this was > part of the cluster_name patch that didn't go it). > > Given that csvlog's output format is hardcoded in write_csvlog, how is > it supposed to evolve without upsetting consumers of this data? > Wouldn't we first need to add a GUC that lets you control the columns > it outputs? Not sure if you really want to go there, but I do agree with you and there's a thread from a few years back about something similar: http://www.postgresql.org/message-id/flat/20110112142345.ga4...@tamriel.snowman.net Included in that thread is a patch, which likely requires some dusting off, to add exactly that ability. Thanks! Stephen signature.asc Description: Digital signature
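To make the idea concrete, here's a minimal sketch of how the escape might be used, assuming Thomas's patch adding %C is applied (the cluster name and prefix format below are purely illustrative, not part of any released server):

```
# postgresql.conf -- sketch only; %C is the escape proposed in the patch
cluster_name = 'prod-main'
log_line_prefix = '%m [%p] %C: '   # timestamp, pid, then the cluster name
```

With several clusters logging to a shared syslog destination, that prefix would let log consumers tell the lines apart without per-cluster parsing rules.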
Re: [HACKERS] Multi-tenancy with RLS
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Feb 9, 2016 at 3:01 PM, Joe Conway wrote: > > On 02/09/2016 11:47 AM, Robert Haas wrote: > >> On Fri, Jan 15, 2016 at 11:53 AM, Stephen Frost wrote: > >>>> Whereupon you'd have no certainty that what you got represented a > >>>> complete dump of your own data. > >>> > >>> It would be a dump of what you're allowed to see, rather than an error > >>> saying you couldn't dump something you couldn't see, which is the > >>> alternative we're talking about here. Even if you've got a dependency > >>> to something-or-other, if you don't have access to it, then you're > >>> going to get an error. > >> > >> I think you're dismissing Tom's concerns far too lightly. The > >> row_security=off mode, which is the default, becomes unusable for > >> non-superusers under this proposal. That's bad. And if you switch to > >> the other mode, then you might accidentally fail to get all of the > >> data in some table you're trying to back up. That's bad too: that's > >> why it isn't the default. There's a big difference between saying > >> "I'm OK with not dumping the tables I can't see" and "I'm OK with not > >> dumping all of the data in some table I *can* see". > > > > I don't grok what you're saying here. If I, as a non-superuser could > > somehow see all the rows in a table just by running pg_dump, including > > rows that I could not normally see due to RLS policies, *that* would be > > bad. I should have no expectation of being able to dump rows I can't > > normally see. > > That's true. But I should also have an expectation that running > pg_dump won't trigger arbitrary code execution, which is why by > default, pg_dump sets row_security to OFF. That way, if a row > security policy applies, I get an error rather than an incomplete, > possibly-invalid dump (and arbitrary code execution on the server > side). If I'm OK with doing the dump subject to row security, I can > rerun with --enable-row-security. 
But this proposal would force > non-superusers to always use that option, and that's not a good idea. Arbitrary code execution is quite a different concern from the prior concern regarding incomplete dumps. To the extent that untrusted code execution is an issue (and my experience with environments which would deploy RLS tells me that it isn't a practical concern), an option could be created which would cause an error to be thrown on non-catalog RLS being run. When it comes to multi-tenancy environments, as this thread is about, chances are the only tables you can see are ones which you own or are owned by a trusted user, which is why I don't view this as a practical concern, but I'm not against having a solution to address the issue raised regarding arbitrary code execution, provided it doesn't create more problems than it purports to solve. Thanks! Stephen
Re: [HACKERS] Multi-tenancy with RLS
JD, * Joshua D. Drake (j...@commandprompt.com) wrote: > pg_dump -U $non-super_user > > Should just work, period. That ship has sailed already, where you're running a pg_dump against objects you don't own and which have RLS enabled on them. Thanks! Stephen
Re: [HACKERS] Multi-tenancy with RLS
* Robert Haas (robertmh...@gmail.com) wrote: > On Tue, Feb 9, 2016 at 3:26 PM, Stephen Frost wrote: > > To the extent that untrusted code execution is an issue (and my > > experience with environments which would deploy RLS tells me that it > > isn't a practical concern), an option could be created which would cause > > an error to be thrown on non-catalog RLS being run. > > There's a major release already in the wild that doesn't behave that > way. I'm at a loss as to what you're getting at there. We don't have any catalog RLS, and when it comes to non-catalog RLS, we do have an option to throw an error when it's going to be run (and it's the default, as you pointed out), in the one major version which supports RLS. > And anyway I think that's missing the point: it's true that > features that are turned off don't cause problems, but features that > are turned on shouldn't break things either. I don't, generally, disagree with that statement, but we have to agree on what's on vs. off and what is broken vs. working correctly. See nearby comments from JD about how non-superuser pg_dump could be seen as broken when running against an environment where RLS is in use. > > When it comes to multi-tenancy environments, as this thread is about, > > chances are the only tables you can see are ones which you own or are > > owned by a trusted user, which is why I don't view this as a practical > > concern, but I'm not against having a solution to address the issue > > raised regarding arbitrary code execution, provided it doesn't create > > more problems than it purports to solve. > > Well, I'm against accepting this patch without such a solution. That's at least something which can be built upon then to help this progress. Thanks! Stephen
Re: [HACKERS] Multi-tenancy with RLS
JD, * Joshua D. Drake (j...@commandprompt.com) wrote: > On 02/09/2016 12:28 PM, Stephen Frost wrote: > >* Joshua D. Drake (j...@commandprompt.com) wrote: > >>pg_dump -U $non-super_user > >> > >>Should just work, period. > > > >That ship has sailed already, where you're running a pg_dump against > >objects you don't own and which have RLS enabled on them. > > Just to be clear, what I was suggesting is that pg_dump would just > work (and RLS would properly execute) or in other words, I shouldn't > have to tell pg_dump to enable row security else throw an error. If > RLS is enabled, then the backup just runs with appropriate > permissions. > > Or am I missing something? You do have to tell pg_dump to enable RLS if you want it to be enabled when performing a pg_dump. There are multiple reasons for this, the first being that, otherwise, you might get an incomplete dump, and secondly, you might execute a function that some untrusted user wrote and included in their RLS policy. We want to avoid both of those, unless you've specifically asked for it to be done. That's why row_security is set to 'off' by pg_dump by default. Thanks! Stephen
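To sketch the behavior being described (the table name below is hypothetical), the row_security GUC is what pg_dump's --enable-row-security option controls:

```sql
-- As a non-superuser querying a table you don't own that has RLS enabled:
SET row_security = off;      -- pg_dump's default setting
SELECT * FROM tenant_data;   -- fails with an error rather than silently
                             -- returning a filtered (incomplete) result

SET row_security = on;       -- what pg_dump --enable-row-security does
SELECT * FROM tenant_data;   -- returns only the rows the policies allow,
                             -- and the policy expressions (owner-defined
                             -- code) are executed under your rights
```

The error in the first case is the safety net: the dump either contains everything you asked for, or it fails loudly; it never silently produces a partial dump.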
Re: [HACKERS] Multi-tenancy with RLS
* Dean Rasheed (dean.a.rash...@gmail.com) wrote: > On 9 February 2016 at 19:47, Robert Haas wrote: > > I think you're dismissing Tom's concerns far too lightly. The > > row_security=off mode, which is the default, becomes unusable for > > non-superusers under this proposal. That's bad. And if you switch to > > the other mode, then you might accidentally fail to get all of the > > data in some table you're trying to back up. That's bad too: that's > > why it isn't the default. There's a big difference between saying > > "I'm OK with not dumping the tables I can't see" and "I'm OK with not > > dumping all of the data in some table I *can* see". > > > > It seems to me that there's a big difference between policies we ship > > out of the box and policies that are created be users: specifically, > > the former can be assumed benign, while the latter can't. I think > > that difference matters here, although I'm not sure exactly where to > > go with it. > > It sounds like there needs to be a separate system_row_security > setting that controls RLS on the system catalogs, and that it should > be on by default in pg_dump. Right, that's what I had been thinking also. Thanks (and congrats, btw!), Stephen signature.asc Description: Digital signature
Re: [HACKERS] Multi-tenancy with RLS
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Joe Conway writes: > > Personally I don't buy that the current situation is a good thing. I > > know that the "ship has sailed" and regret not having participated in > > the earlier discussions, but I agree with JD here -- the unprivileged > > user should not have to even think about whether RLS exists, they should > > only see what they have been allowed to see by the privileged users (and > > in the context of their own objects, owners are privileged). I don't > > think an unprivileged user should get to decide what code runs in order > > to make that happen. > > Part of the problem here is that we have *not* created any hard and fast > distinction between "privileged" and "unprivileged" users; I think that > even speaking in those terms about RLS risks errors in your thinking. I agree that where, exactly, that line is drawn is part of the issue, and where's it's drawn is really system-dependent. In many environments, I view Joe's comments as entirely accurate- only privileged users are allowed to create objects *at all*. Of course, there are a lot of environments where everyone is allowed to create objects and, in those environments, all those users would be viewed as "unprivileged", generally speaking. > In particular, the code-execution issue arises from the fact that a table > owner can now cause code to execute *with the permissions of someone else* > if the someone else is foolish enough to select from his table. No > special privileges required, just the ability to create a table. If we > make pg_dump run with RLS enabled, then the "foolish" part doesn't need to > be any more foolish than forgetting a -t switch when using pg_dump. That distinction is really only relevant when it comes to pg_dump, as those same users could use views to cause their code to be executed by other users who are selecting from their view, and they could change if it's a table or a view quite easily. 
From a practical standpoint, we're making a huge distinction between our client tools- pg_dump must be protected from X, but we don't have any such qualms or concerns regarding queries sent from psql. Perhaps that's the right distinction to make, or perhaps we should come up with a better answer for psql than what we have now, but I don't agree that RLS is seriously moving the goalposts, overall, here, particularly since you're not going to have any RLS policies being executed by pg_dump when run as a superuser anyway, given Noah's change to how BYPASSRLS works. > Maybe we need to restrict that somehow, or maybe some better solution > exists that we've not thought of yet. But in its current state, RLS > is at least as much a security hazard as it is a security aid. > I do not want to see it extended in ways that make pg_dump unsafe to > use. I'm not against coming up with an approach which restricts cases where user A can write code that will be run under another user's rights, provided it doesn't make the system overly painful to use. I don't see RLS as changing the security risks all that much when you're talking about regular user queries through psql, and the concern regarding pg_dump has been addressed through the default of row_security being off. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Multi-tenancy with RLS
* Joe Conway (m...@joeconway.com) wrote: > On 02/09/2016 01:22 PM, Tom Lane wrote: > > Maybe we need to restrict that somehow, or maybe some better solution > > exists that we've not thought of yet. But in its current state, RLS > > is at least as much a security hazard as it is a security aid. > > I do not want to see it extended in ways that make pg_dump unsafe to > > use. > > Ok, I can see that. Maybe we should have a specific GRANT for CREATE > POLICY which is distinct from the privilege to CREATE TABLE? Well, the only privilege we have now is "CREATE", which allows creation of any kind of object inside a schema. I'm generally in favor of providing more granular 'create table', 'create view', etc privileges that can be granted out at the schema level, and 'create policy' would be appropriate to include in such a set of object-creation permissions. I don't have any particularly genius ideas about where we'd get the bits to implement such a grant system though. We could modify the existing grant system to use larger bits, but given that this would only be applicable for schemas, perhaps it'd make sense to have another field in pg_namespace instead? Not sure, just brainstorming here. Thanks! Stephen
Re: [HACKERS] Updated backup APIs for non-exclusive backups
* Andres Freund (and...@anarazel.de) wrote: > On 2016-02-10 13:46:05 +0100, Magnus Hagander wrote: > > * If the client disconnects with a non-exclusive backup running, the backup > > is automatically aborted. This is the same thing that pg_basebackup does. > > To use these non-exclusive backups the backup software will have to > > maintain a persistent connection to the database -- something that should > > not be a problem for any of the modern ones out there (but would've been > > slightly trickier back in the days when we suggested shellscripts) > > I think we might want to make this one optional, but I'm not going to > fight super hard for it. I was thinking along the same lines. > > * A new version of pg_stop_backup is created, taking an argument specifying > > if it's exclusive or not. The current version of pg_stop_backup() continues > > to work for exclusive backups, with no changes to behavior. The new > > pg_stop_backup will return a record of three columns instead of just the > > value -- the LSN (pglsn), the backup label file (text) and the tablespace > > map file (text). > > I wonder if we shouldn't change the API a bit more aggressively. Right > now we return the label and the map - but what if there's more files at > some later point? One way would be to make it a SETOF function returning > 'filename', 'content' or such. Makes it less clear what to do with the > lsn admittedly. If we make the 'client disconnect means abort' optional then we'd also need to modify the API of stop backup to specify which backup to stop, I'd think. Thanks! Stephen signature.asc Description: Digital signature
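As a sketch of the API shape under discussion (the signatures below follow Magnus's proposal, not any released server, and could still change):

```sql
-- Start a non-exclusive base backup; the connection must stay open
-- for the duration (disconnect aborts the backup)
SELECT pg_start_backup('nightly', false, false);

-- ... copy the data directory with an external tool ...

-- Stop it; the non-exclusive form returns a record rather than a bare LSN
SELECT lsn, labelfile, spcmapfile
  FROM pg_stop_backup(false);
```

The caller is then responsible for writing labelfile out as backup_label (and spcmapfile as tablespace_map) inside the backup itself, instead of the server leaving those files in the live data directory as the exclusive API does.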
Re: [HACKERS] Updated backup APIs for non-exclusive backups
* Magnus Hagander (mag...@hagander.net) wrote: > On Wed, Feb 10, 2016 at 2:46 PM, Andres Freund wrote: > > On 2016-02-10 13:46:05 +0100, Magnus Hagander wrote: > > > Per discussion at the developer meeting in Brussels, here's a patch that > > > makes some updates to the backup APIs, to support non-exclusive backups > > > without using pg_basebackup. > > > > Thanks for following through with this! > > > > > * If the client disconnects with a non-exclusive backup running, the > > backup > > > is automatically aborted. This is the same thing that pg_basebackup does. > > > To use these non-exclusive backups the backup software will have to > > > maintain a persistent connection to the database -- something that should > > > not be a problem for any of the modern ones out there (but would've been > > > slightly trickier back in the days when we suggested shellscripts) > > > > I think we might want to make this one optional, but I'm not going to > > fight super hard for it. > > Not sure what you're referring to here. Do you mean being able to make a > non-exclusive backup while not maintaining a connection? That's going to > make things a *lot* more complicated, as we'd have to invent something like > "backup slots" similar to what we're doing with replication slots. I doubt > it's worth all that work and complexity. Hrmmm. If that's the case then perhaps you're right. I liked the general idea of not having to maintain a TCP connection during the entire backup (TCP connections can be annoyingly finicky in certain environments...), but I'm not sure it's worth a lot of additional complexity. Thanks! Stephen
Re: [HACKERS] Updated backup APIs for non-exclusive backups
* David Steele (da...@pgmasters.net) wrote: > On 2/10/16 9:44 AM, Stephen Frost wrote: > > Hrmmm. If that's the case then perhaps you're right. I liked the > > general idea of not having to maintain a TCP connection during the > > entire backup (TCP connections can be annoyingly finicky in certain > > environments...), but I'm not sure it's worth a lot of additional > > complexity. > > Well, pgBackRest holds a connection to PostgreSQL through the entire > backup and will abort the backup if it is severed. The connection is > always held locally, though, even if the master backup process is on a > different server. I haven't gotten any reports of aborted backups due > to the connection failing. Yeah, I know, I had been thinking it might be nice to not do that at some point in the future, but thinking on it further, we've already got a "pick up where you left off" capability with pgbackrest, so it's really not a huge deal if the backup fails and has to be restarted, and this does remove the need (or at least deprecate) to use the "stop an already-running backup if you find one" option that pgbackrest has. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Improve docs wrt catalog object ACLs
Greetings, The way permissions on catalog objects are handled isn't discussed at all in the documentation. Barring objections, I'll commit and back-patch the attached to improve that situation in the next day or so. Thanks! Stephen

From ad8e663893ea906238a9c0346bf8791eafe3d333 Mon Sep 17 00:00:00 2001
From: Stephen Frost
Date: Wed, 10 Feb 2016 13:28:11 -0500
Subject: [PATCH] Add note regarding permissions in pg_catalog

Add a note to the system catalog section pointing out that while
modifying the permissions on catalog tables is possible, it's
unlikely to have the desired effect.
---
 doc/src/sgml/catalogs.sgml | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 412c845..3e8ebee 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -21,6 +21,17 @@
 particularly esoteric operations, such as adding index access methods.
+
+
+Changing the permissions on objects in the system catalogs, while
+possible, is unlikely to have the desired effect as the internal
+lookup functions use a cache and do not check the permissions nor
+policies of tables in the system catalog. Further, permission
+changes to objects in the system catalogs are not preserved by
+pg_dump or across upgrades.
+
+
+
 Overview
-- 
2.5.0
Re: [HACKERS] Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl
* Craig Ringer (cr...@2ndquadrant.com) wrote: > On 14 February 2016 at 08:05, Robert Haas wrote: > > First, the overall concept here is that processes can either be a > > member of a lock group or a member of no lock group. The concept of a > > lock group is formally separate from the concept of a parallel group > > created by a ParallelContext, but it is not clear that there will ever > > be any other context in which a lock group will be a good idea. It is > > not impossible to imagine: for example, suppose you had a group of > > backends that all had TCP client connections, and those processes all > > wanted to ingest data and stuff it in a table but without allowing any > > unrelated process to touch the table, say because it was going to be > > inconsistent during the operation and until indexes were afterwards > > rebuilt. > > The case that comes to mind for me is in logical decoding, for decoding > prepared xacts. Being able to make the prepared xact a member of a "lock > group" along with the decoding session's xact may provide a solution to the > locking-related challenges there. I was thinking this would be the way to address the current issues with parallel pg_dump and having a shared snapshot. > I haven't looked closely at what's involved in the decoding prepared xact > locking issues yet, just an idea. Yeah, I don't have much more than it being an idea to use this for the pg_dump case. I do know that's a case which has been brought up a couple of times before. > To do this it'd have to be possible to add an existing session/xact to a > lock group (or make it the leader of a new lock group then join that > group). Do you think that's practical with your design? Seems like the same question applies to the pg_dump case. Thanks! Stephen
Re: [HACKERS] WIP: SCRAM authentication
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: > On Sat, Feb 13, 2016 at 3:05 AM, David Steele wrote: > > On 11/16/15 8:53 AM, Michael Paquier wrote: > >> On Sat, Sep 5, 2015 at 9:31 AM, Bruce Momjian wrote: > >>> On Fri, Sep 4, 2015 at 04:51:33PM -0400, Stephen Frost wrote: > >>>>> Coming in late, but can you explain how multiple passwords allow for > >>>>> easier automated credential rotation? If you have five applications > >>>>> with stored passwords, I imagine you can't change them all at once, so > >>>>> with multiples you could change it on one, then go to the others and > >>>>> change it there, and finally, remove the old password. Is that the > >>>>> process? I am not realizing that without multiple plasswords, this is a > >>>>> hard problem. > >>>> That's exactly the process if multiple passwords can be used. If > >>>> there's only one account and one password supported then you have to > >>>> change all the systems all at once and that certainly can be a hard > >>>> problem. > >>>> > >>>> One way to deal with this is to have a bunch of different accounts, but > >>>> that's certainly not simple either and can get quite painful. > >>> OK, for me, if we can explain the benefit for users, it seems worth > >>> doing just to allow that. > >> Reviving an old thread for a patch still registered in this commit > >> fest to make the arguing move on. > > > > I was wondering if this patch was going to be submitted for the 2016-03 CF? > > For 9.6 I am afraid this is too late, per the rule that there cannot > be huge patches for the last CF of a development cycle. But I have > plans for this set of features afterwards with the first CF of 9.7 and > I was planning to talk about it at PgCon's unconference if I can get > there to gather some feedback. There is still cruel need for it on my > side.. There's a lot of good reason to get SCRAM added as a protocol, considering our current password-based implementation is rather.. lacking. 
Regarding the specific comment about the timing, that rule is specifically to prevent large patches from landing in the last CF without any prior discussion or review, as I recall, so I'm not sure it really applies here as there's been quite a bit of discussion and review already. That said, per various discussions, we'd really want a more-or-less fully formed patch to land prior to the last CF, for this to have any chance. Perhaps that means it's not going to happen, which would be unfortunate, but it's not beyond the possible, in my view, at least. > > In addition, I would prefer to maintain the current structure of the > > pg_authid table and use the rolpassword and rolvaliduntil columns to > > store only the md5 verifier which will also be stored in > > pg_auth_verifiers. This would provide a smoother migration path with > > the idea that rolpassword and rolvaliduntil will be removed from > > pg_authid in a future version of Postgres. > > Actually, I am of the opinion that both rolpassword and rolvaliduntil > should be directly part of pg_auth_verifiers. Being able to handle > multiple verifiers for the same protocol is a feature that is being > asked for with a given password handling policy (was pinged again > about that in Moscow last week). Rolling in new verifiers needs now > extra roles to be created. I'm on Michael's side here. I don't believe it makes sense to try and maintain the exact structure of pg_authid. We are certainly happy to whack around the other catalogs, and I'm unimpressed with my prior efforts to provide backwards-compatible catalog (see pg_user, et al) for just a few releases- we seem unable to get rid of them now, even though we should have years ago, really. Indeed, I'd be just as happy if we got rid of them during this work.. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: SCRAM authentication
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: > On Mon, Feb 15, 2016 at 9:17 AM, Stephen Frost wrote: > > That said, per various discussions, we'd really want a more-or-less > > fully formed patch to land prior to the last CF, for this to have any > > chance. Perhaps that means it's not going to happen, which would be > > unfortunate, but it's not beyond the possible, in my view, at least. > > Well, I could send a rebased patch with the new things proposed > upthread, and with things split in as many patches as I can get out, > roughly: > 1) One patch for the catalog split > 2) One for the GUC param controlling recommended protocols > 3) One for the pg_upgrade function cleaning up automatically the > entries of unsupported protocols > 4) SCRAM on top of the rest, which is at more or less 75% something > that Heikki produced. That sounds like a pretty reasonable split, to me at least. > >> > In addition, I would prefer to maintain the current structure of the > >> > pg_authid table and use the rolpassword and rolvaliduntil columns to > >> > store only the md5 verifier which will also be stored in > >> > pg_auth_verifiers. This would provide a smoother migration path with > >> > the idea that rolpassword and rolvaliduntil will be removed from > >> > pg_authid in a future version of Postgres. > >> > >> Actually, I am of the opinion that both rolpassword and rolvaliduntil > >> should be directly part of pg_auth_verifiers. Being able to handle > >> multiple verifiers for the same protocol is a feature that is being > >> asked for with a given password handling policy (was pinged again > >> about that in Moscow last week). Rolling in new verifiers needs now > >> extra roles to be created. > > > > I'm on Michael's side here. I don't believe it makes sense to try and > > maintain the exact structure of pg_authid. 
We are certainly happy to > > whack around the other catalogs, and I'm unimpressed with my prior > > efforts to provide backwards-compatible catalog (see pg_user, et al) for > > just a few releases- we seem unable to get rid of them now, even though > > we should have years ago, really. Indeed, I'd be just as happy if we > > got rid of them during this work.. > > We'd need as well to switch pg_shadow to have an array of elements > made of protocol:identifier instead of a single password field. There > can be only one valid identifier per protocol, user and valid_until > for a single point in time, and I can't believe that we should > recommend only one authentication protocol per single major version of > Postgres. Ugh, that sounds pretty grotty to me. Applications which consider these fields will need to be updated, one way or the other, and I'd much rather they be updated to work with reasonable structures rather than something we've hacked together in some faint hope that it'd be useful. An array in pg_shadow for a field which used to be a text field does *not* sound like a simpler solution to me, and I'd rather simply do away with those views entirely, or at least nuke the fields which are at issue, than try to come up with something between wholesale change and no change that ends up being worse than both. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: SCRAM authentication
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: > It seems to me that applications are going to need a refresh anyway... Indeed. > Among the other possibilities I can foresee: > - Add a column "protocol" to pg_shadow and produce one row per > protocol, so one user will be listed for all the protocol it has. Any > application could then filter out things with an additional WHERE > clause. > - Nuke passwd from pg_shadow and have a new view pg_shadow_verifiers > made of the user OID, the protocol and the verifier. This maps quite > well with pg_auth_verifiers. > - Give up and nuke pg_shadow, which is here for compatibility down to > 8.1, and add a protocol column to pg_user, or even better create a new > view pg_user_verifiers that has all the data of all the protocols. If > we care a lot about backward-compatibility, pg_user could as well map > with pg_auth_verifiers with the md5 protocol. > I would go with the last one. I would start by pointing out that pg_user currently uses pg_shadow.. Why do we need pg_shadow or pg_user or related views at all..? Applications will need to be updated, we might as well simply nuke them and expect applications to use the new catalogs. Perhaps there is a useful view or two which we can provide over the new catalogs, but I'd rather consider how to create brand new, useful views over the new catalogs than consider any kind of way to provide backwards-compatible-ish views. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: SCRAM authentication
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > Why do we need pg_shadow or pg_user or related views at all..? > > A lot of code looks at those just to get usernames. I am not in favor of > breaking such stuff without need. Alright. > How about we just say that the password in these old views always reads > out as '' even when there is a password, and we invent new views > that carry real auth information? (Hopefully in an extensible way.) I'd be alright with that approach, I'd just rather that any clients which actually want to read the password field be updated to look at the extensible and sensible base catalogs, and not some hacked up array that we shoved into that field. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] WIP: SCRAM authentication
* Michael Paquier (michael.paqu...@gmail.com) wrote: > On Mon, Feb 15, 2016 at 10:23 AM, Stephen Frost wrote: > > I would start by pointing out that pg_user currently uses pg_shadow.. > > Why do we need pg_shadow or pg_user or related views at all..? > > pg_user/pg_shadow have the advantage to be world-readable and mask > password values. New views would have that same advantage, should we implement them that way. Tom's approach is also workable though, where we give the existing views a reduced charter, mainly around providing user lists, and simply don't include any info about password verifiers or the like. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] GetExistingLocalJoinPath() vs. the docs
Greetings, While getting back into the thread Re: Optimization for updating foreign tables in Postgres FDW, I noticed some issues with the docs and GetExistingLocalJoinPath(): GetExistingLocalJoinPath() exists in src/backend/foreign/foreign.c, but the docs include discussion of it under 54.2 - Foreign Data Wrapper Callback Routines. Shouldn't this be under 54.3. Foreign Data Wrapper Helper Functions? Also, the prototype is incorrect in the docs (it doesn't return a void) and the paragraph about what it's for could do with some wordsmithing. A link from 54.2 to 54.3 which mentions it would be fine, of course. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Relaxing SSL key permission checks
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Andres Freund writes: > > ... We don't prevent the user from making the > > configuration file world-writable either, > > Maybe we should. It wasn't an issue originally, because the config files > were necessarily inside $PGDATA which we restrict permissions on. But > these days you can place the config files in places where untrustworthy > people could get at them. No, we should be improving our support of systems which provide more specific groups, not destroying it. Being able to run backups as a user who is not able to modify the database would be great too, and that case isn't covered by your approach to "allow group rights if the file is owned by root." Further, the notion that *this* is the footgun is completely off the reservation- if the files have been changed to allow untrusted users to have access to them, there isn't diddly we can do about it. All we're doing with this is imposing our own idea of what the system policy should be, even though there are clear examples where that's just blatantly wrong. If we really want to force these checks to happen (and I'm not convinced that they're actually useful at all), then we need to provide a way for users and distributions to control the specifics of the checks as they choose. Maybe that's a command-line switch instead of a GUC, or it's something else, but there clearly isn't "one true way" here and we should be flexible. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Relaxing SSL key permission checks
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Stephen Frost writes: > > Further, the notion that *this* is the footgun is completely off the > > reservation- if the files have been changed to allow untrusted users to > > have access to them, there isn't diddly we can do about it. > > I completely disagree that those file-permissions checks are useless. > What they accomplish is, if you accidentally set up an insecure key file, > you'll get told about it fairly promptly, and have the opportunity to > either fix the permissions or generate a new key, depending on your > opinion of how likely it is that somebody stole the key already. If we > made no checks then, more than likely, an insecure key file would just > sit there indefinitely, waiting for some passing blackhat to grab it. If that's something you're concerned with then the right answer is to monitor the file permissions- and there are tools which do exactly that. I disagree that it's PG's charter to do that and, frankly, you *won't* be told, in most cases, promptly about such a change. We don't check the permissions on every file or even every directory on every query or even every connection. With uptimes of days, weeks and months quite often seen, this idea that PG is protecting you with these checks in any way is instead creating a false sense of security. > We can certainly discuss whether we need more than one model of what > appropriate permissions are, but I do not believe that "rip out the > check" is a helpful answer. We're over-claiming what these protections are actually providing. At best, they help a new user out when they're first setting up their database, if they're doing it from source and fully understand how to correctly set up pg_hba.conf and friends to secure *those*. Users installing packages from distributions will have the benefit that the distribution will get the permissions correct initially and not have to worry about them. 
Permission changes to these files after the system is set up and running indicate that it's already too late because PG won't tell the user there's an issue until a DB restart a month or two later, and by then, if there's an untrusted user who was able to gain access thanks to such changes, you can be sure they will have. Thanks! Stephen signature.asc Description: Digital signature
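The check under debate boils down to a single test of the file's mode bits at startup. A minimal sketch of the idea, in Python for illustration only (`key_file_is_private` is a made-up name; the actual server-side check is C code and differs in detail):

```python
import os
import stat
import tempfile

def key_file_is_private(path):
    # Reject a key file that is group- or world-accessible, in the spirit
    # of the startup-time check discussed above (simplified sketch, not
    # the real server code).
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Demonstration on a throwaway file: 0600 passes, 0640 (group-readable,
# the Debian ssl-cert style layout) fails this strict check.
fd, key = tempfile.mkstemp()
os.close(fd)
os.chmod(key, 0o600)
print(key_file_is_private(key))  # True
os.chmod(key, 0o640)
print(key_file_is_private(key))  # False
os.unlink(key)
```

Note that the sketch also illustrates the objection above: the check only runs when it is invoked, so a permissions change made after startup goes unnoticed until the next restart.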
Re: [HACKERS] Relaxing SSL key permission checks
Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > Now, I have heard it argued that the OpenSSH/L authors are a bunch of > idiots who know nothing about security. But it's not like insisting > on restrictive permissions on key files is something we invented out > of the blue. It's pretty standard practice, AFAICT. I certainly was not intending to imply that anyone was an 'idiot'. Nor am I arguing that we must remove all such checks from every part of the system. I do contend that relying on such checks that happen on a relatively infrequent basis in daemon processes creates a false sense of security and does not practically improve security, today. As I mentioned previously, perhaps before distributions were as cognizant about security concerns or about considering how a particular daemon should be set up, or when users were still frequently installing from source, such checks were more valuable. However, when they get in the way of entirely reasonable system policies and require distributions to patch the source code, they're a problem. That doesn't mean we necessarily have to remove them, but we should be flexible. Similar checks in client utilities, such as the ssh example, or in psql, are more useful and, from a practical standpoint, haven't been an issue for system policies. Further, we check more than just key files: we also check permissions on the data directory, and we don't provide any way for users to configure the permissions on new files, which could be seen as akin to sshd requiring user home directories to be 700 and forcing umask to 077. Note that all of this only actually applies to OpenSSH, not to OpenSSL. Certainly, as evidenced by the question which sparked this discussion, the packages which are configured to use the local snakeoil cert on Debian-based systems (which also includes postfix and Apache, offhand) do not have a problem with the group read permissions that are being asked for. 
I don't find that to be offensive or unacceptable in the least, nor do I feel that Debian is flawed for taking this approach, or that OpenSSL is flawed for not having such a check. Thanks! Stephen signature.asc Description: Digital signature
[HACKERS] Re: [COMMITTERS] pgsql: Cosmetic improvements in new config_info code.
Joe, all, * Joe Conway (m...@joeconway.com) wrote: > On 02/21/2016 08:38 AM, Tom Lane wrote: > > Coverity griped about use of unchecked strcpy() into a local variable. > > There's unlikely to be any actual bug there, since no caller would be > > passing a path longer than MAXPGPATH, but nonetheless use of strlcpy() > > seems preferable. > > FWIW, strcpy() was being used in src/bin/pg_config/pg_config.c that I > started with -- does that mean we are not getting Coverity coverage of > src/bin? Coverity does run against src/bin also. It's possible this was identified as an issue in pg_config.c, but, as Tom notes, it may not be an actual bug and might have been marked as a non-bug in Coverity. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Relaxing SSL key permission checks
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Christoph Berg writes: > > The attached patch has successfully been included in the 9.6 Debian > > package, passed the regression tests there, and I've also done some > > chmod/chown tests on the filesystem to verify it indeed catches the > > cases laid out by Tom. > > Please add this to the upcoming commitfest. We don't seem to have > enough consensus to just apply it, but perhaps the review process > will produce some agreement. Just to be clear, I'm not really against this patch as-is, but it shouldn't be a precedent or limit us from supporting more permissive permissions in other areas (or even here) if there are sensible use-cases for more permissive permissions. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Re: [COMMITTERS] pgsql: Add some isolation tests for deadlock detection and resolution.
* Tom Lane (t...@sss.pgh.pa.us) wrote: > Robert Haas writes: > > On Wed, Feb 17, 2016 at 9:48 PM, Tom Lane wrote: > >> I just had a rather disturbing thought to the effect that this entire > >> design --- ie, parallel workers taking out locks for themselves --- is > >> fundamentally flawed. As far as I can tell from README.parallel, > >> parallel workers are supposed to exit (and, presumably, release their > >> locks) before the leader's transaction commits. Releasing locks before > >> commit is wrong. Do I need to rehearse why? > > > No, you don't. I've spent a good deal of time thinking about that problem. > > [ much snipped ] > > Unless I'm missing something, though, this is a fairly obscure > > problem. Early release of catalog locks is desirable, and locks on > > scanned tables should be the same locks (or weaker) than already held > > by the master. Other cases are rare. I think. It would be good to > > know if you think otherwise. > > After further thought, what I think about this is that it's safe so long > as parallel workers are strictly read-only. Given that, early lock > release after user table access is okay for the same reasons it's okay > after catalog accesses. However, this is one of the big problems that > we'd have to have a solution for before we ever consider allowing > read-write parallelism. Having such a blocker for read-write parallelism would be unfortunate, though perhaps there isn't much help for it, and having read-only query parallelism is certainly far better than nothing. > So what distresses me about the current situation is that this is a large > stumbling block that I don't see documented anywhere. It'd be good if you > transcribed some of this conversation into README.parallel. Agreed. > (BTW, I don't believe your statement that transferring locks back to the > master would be deadlock-prone. 
If the lock system treats locks held by > a lock group as effectively all belonging to one entity, then granting a > lock identical to one already held by another group member should never > fail. I concur that it might be expensive performance-wise, though it > hardly seems like this would be a dominant factor compared to all the > other costs of setting up and tearing down parallel workers.) This is only when a parallel worker is finished, no? Isn't there already some indication of when a parallel worker is done that the master handles, where it could also check the shared lock table and see if any locks were transferred to it on worker exit? Only following this thread from afar, so take my suggestions with a grain of salt. Thanks! Stephen signature.asc Description: Digital signature
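Tom's parenthetical can be illustrated with a toy lock table that tracks holders per lock group rather than per process: a member requesting a lock its own group already holds is simply granted it, so the transfer-back-to-the-master step cannot deadlock on that lock. This is an illustrative sketch with made-up names, not the actual lock manager:

```python
# Toy "lock group" table: locks are attributed to a group, not a process.
class GroupLockTable:
    def __init__(self):
        self.holders = {}  # lock_tag -> group_id of the holding group

    def acquire(self, lock_tag, group_id):
        holder = self.holders.get(lock_tag)
        if holder is None or holder == group_id:
            # Granted immediately, including when another member of the
            # same group already holds an identical lock.
            self.holders[lock_tag] = group_id
            return True
        return False  # held by a different group: the request would block

table = GroupLockTable()
assert table.acquire("rel:16384", group_id=1)      # leader takes the lock
assert table.acquire("rel:16384", group_id=1)      # same group: granted again
assert not table.acquire("rel:16384", group_id=2)  # other backend would block
```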
Re: [HACKERS] patch proposal
Venkata, * Venkata B Nagothi (nag1...@gmail.com) wrote: > Agreed. An additional option like "pause" would help. As long as there is an option > to ensure the following happens if the recovery target is not reached - > > a) PG pauses the recovery at the end of the WAL > b) Generates a warning in the log file saying that the recovery target point > is not reached (there is a patch being worked on by Thom for this) > c) Does not open up the database by exiting from the recovery process, > giving room to resume the replay of WALs One thing to consider is just how different this is from simply bringing PG up as a warm standby instead, with the warning added to indicate if the recovery point wasn't reached. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Add -c to rsync commands on SR tutorial wiki page
* Jim Nasby (jim.na...@bluetreble.com) wrote: > https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial does > not specify -c for any of the rsync commands. That's maybe safe for > WAL, but I don't think it's safe for any of the other uses, right? > I'd like someone to confirm before I just change the page... my > intention is to just stick -c in all the commands. -c is only relevant when you are doing an incremental copy, but on a quick look, all those rsync commands appear to be doing full copies? You would want -c if you were taking a backup and then doing an update of it using rsync, or something along those lines, as you can't really trust rsync's time/size-based comparison, since it only has 1-second granularity. Thanks! Stephen signature.asc Description: Digital signature
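The granularity problem described above is easy to demonstrate: two files with identical size and identical (second-resolution) mtime but different content look "unchanged" to a time/size comparison, while a checksum comparison, which is what `rsync -c` performs, tells them apart. A small self-contained sketch:

```python
import hashlib
import os
import tempfile

def write(path, data, mtime):
    # Write the file and pin its mtime to a fixed timestamp.
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (mtime, mtime))

def md5sum(path):
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()

with tempfile.TemporaryDirectory() as d:
    a, b = os.path.join(d, "a"), os.path.join(d, "b")
    # Same length (20 bytes), same mtime, different content.
    write(a, b"heap page version 1!", 1_000_000_000)
    write(b, b"heap page version 2!", 1_000_000_000)

    same_size_mtime = (os.path.getsize(a) == os.path.getsize(b)
                       and int(os.path.getmtime(a)) == int(os.path.getmtime(b)))
    same_checksum = md5sum(a) == md5sum(b)
    print(same_size_mtime, same_checksum)  # True False
```

A time/size-based sync would skip file `b` here even though it differs from `a`; a checksum-based comparison would not.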
Re: [HACKERS] patch proposal
* Venkata B Nagothi (nag1...@gmail.com) wrote: > On Wed, Aug 17, 2016 at 11:27 PM, Stephen Frost wrote: > > * Venkata B Nagothi (nag1...@gmail.com) wrote: > > > Agreed. An additional option like "pause" would help. As long as there is an > > option > > > to ensure the following happens if the recovery target is not reached - > > > > > > a) PG pauses the recovery at the end of the WAL > > > b) Generates a warning in the log file saying that the recovery target point > > > is not reached (there is a patch being worked on by Thom for this) > > > c) Does not open up the database by exiting from the recovery process, > > > giving room to resume the replay of WALs > > > > One thing to consider is just how different this is from simply bringing > > PG up as a warm standby instead, with the warning added to indicate if > > the recovery point wasn't reached. > > I am referring to a specific scenario (performing point-in-time recovery) > wherein a DBA attempts to bring up a standalone PG instance by restoring > the backup and performing recovery to a particular recovery target (XID, > time or a named restore point) in the past by replaying all the available > WAL archives, which is not quite possible through a warm-standby setup. > > Warm standby is more of a high-availability solution, and I do not think > it addresses the PITR kind of situation. No, a warm standby is one which just plays through the WAL but doesn't bring the database up or start its own set of WAL, which is what you're asking about. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Add -c to rsync commands on SR tutorial wiki page
* Jim Nasby (jim.na...@bluetreble.com) wrote: > On 8/17/16 9:46 PM, Stephen Frost wrote: > >* Jim Nasby (jim.na...@bluetreble.com) wrote: > >>> https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial does > >>> not specify -c for any of the rsync commands. That's maybe safe for > >>> WAL, but I don't think it's safe for any of the other uses, right? > >>> I'd like someone to confirm before I just change the page... my > >>> intention is to just stick -c in all the commands. > >-c is only relevant when you are doing an incremental copy, but on a > >quick look, all those rsync commands appear to be doing full copies? > > > >You would want -c if you were taking a backup and then doing an update > >of it using rsync, or something along those lines, as you can't really > >trust rsync's time/size-based comparison, since it only has 1-second > >granularity. > > I don't think it's any great leap for someone to think they can use > those commands incrementally. It's certainly one of the first things > you think of when using rsync. AFAIK there's no downside at all to > using -c when it is a brand new copy, so I'm thinking we should just > put it in there, especially considering what the potential downside > is. To have proper incremental backups done requires a lot more than just throwing "-c" into the rsyncs. For my 2c, I'm at the point where I'd prefer we discourage people from using rsync, cp, or generally trying to set up their own hand-rolled backup system with PG. Those examples simply encourage poor setups that risk losing data and introducing corruption. Thanks! Stephen signature.asc Description: Digital signature