Re: [HACKERS] Change in "policy" on dump ordering?
Michael Banck writes: > Am Donnerstag, den 27.07.2017, 15:52 -0400 schrieb Tom Lane: >> So I'm thinking we should consider the multi-pass patch I posted as >> a short-term, backpatchable workaround, which we could hope to >> replace with dependency logic later. > +1, it would be really nice if this could be fixed in the back-branches > before v11. Done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Am Donnerstag, den 27.07.2017, 15:52 -0400 schrieb Tom Lane: > So I'm thinking we should consider the multi-pass patch I posted as > a short-term, backpatchable workaround, which we could hope to > replace with dependency logic later. +1, it would be really nice if this could be fixed in the back-branches before v11. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Robert Haas writes: > On Wed, Jul 26, 2017 at 2:41 PM, Tom Lane wrote: >> The bigger issue is whether there's some failure case this would cause >> that I'm missing altogether. Thoughts? > I think dependencies are fundamentally the right model for this sort > of problem. I can't imagine what could go wrong that wouldn't amount > to a failure to insert all of the right dependencies, and thus be > fixable by inserting whatever was missing. I tend to agree. One fairly obvious risk factor is that we end up with circular dependencies, but in that case we have conflicting requirements and we're gonna have to decide what to do about them no matter what. Another potential issue is pg_dump performance; this could result in a large increase in the number of DumpableObjects and dependency links. The topological sort is O(N log N), so we shouldn't hit any serious big-O problems, but even a more-or-less-linear slowdown might be problematic for some people. On the whole though, I believe pg_dump is mostly limited by its server queries, and that would probably still be true. But the big point: even if we had the code for this ready-to-go, I'd be hesitant to inject it into v10 at this point, let alone back-patch. If we go down this path we won't be seeing a fix for the matview ordering problem before v11 (at best). Maybe that's acceptable considering it's been there for several years already, but it's annoying. So I'm thinking we should consider the multi-pass patch I posted as a short-term, backpatchable workaround, which we could hope to replace with dependency logic later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On Wed, Jul 26, 2017 at 2:41 PM, Tom Lane wrote: > The bigger issue is whether there's some failure case this would cause > that I'm missing altogether. Thoughts? I think dependencies are fundamentally the right model for this sort of problem. I can't imagine what could go wrong that wouldn't amount to a failure to insert all of the right dependencies, and thus be fixable by inserting whatever was missing. It's possible I am lacking in imagination, so take that opinion for what it's worth. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
I thought of a quite different way that we might attack this problem, but I'm not sure if it's worth pursuing or not. The idea is basically that we should get rid of the existing kluge for pushing ACLs to the end altogether, and instead rely on dependency sorting to make things work. This'd require some significant surgery on pg_dump, but it seems doable: * Make ACLs have actual DumpableObject representations that participate in the topological sort. * Add more explicit dependencies between these objects and other ones. For example, we'd fix the matview-permissions problem by adding dependencies not only on the tables a matview references, but on their ACLs. Another case is that we'd want to ensure that a table's ACL comes out after the table data, so as to solve the original problem that the current behavior was meant to deal with, ie allowing restore of tables even if they've been made read-only by revoking the owner's INSERT privilege. But I think that case would best be dealt with by forcing table ACLs to be post-data objects. (Otherwise they might come out in the middle "data" section of a restore, which is likely to break some client assumptions somewhere.) Another variant of that is that table ACLs probably need to come out after CREATE TRIGGER and foreign-key constraints, in case an owner has revoked their own TRIGGER or REFERENCES privilege. This seems potentially doable, but I sure don't see it coming out small enough to be a back-patchable fix; nor would it make things work for existing archive files, only new dumps. In fact, if we don't want to break parallel restore for existing dump files, we'd probably still have to implement the postpone-all-ACLs rule when dealing with an old dump file. (But maybe we could blow off that case, and just say you might have to not use parallel restore with old dumps that contain self-revoke ACLs?) The bigger issue is whether there's some failure case this would cause that I'm missing altogether. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On Wed, Jul 26, 2017 at 9:30 AM, Tom Lane wrote: > Meanwhile, we have problems that bite people who aren't doing anything > stranger than having a matview owned by a non-superuser. How do you > propose to fix that without reordering pg_dump's actions? Obviously changing the order is essential. What I wasn't sure about was whether a hard division into phases was a good idea. The advantage of the dependency mechanism is that, at least in theory, you can get things into any order you need by sticking the right dependencies in there. Your description made it sound like you'd hard-coded matview entries to the end rather than relying on dependencies, which could be a problem if something later turns up where we don't want them all the way at the end. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Jordan Gigov writes: > But why should a superuser need the ACL to be applied before being allowed > access? If you make the permission-checking function check if the user is a > superuser before looking for per-user grants, wouldn't that solve the issue? The superuser's permissions are not relevant, because the materialized view is run with the permissions of its owner, not the superuser. We are not going to consider changing that, either, because it would open trivial-to-exploit security holes (any user could set up a trojan horse matview and just wait for the next pg_upgrade or dump/restore). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
But why should a superuser need the ACL to be applied before being allowed access? If you make the permission-checking function check if the user is a superuser before looking for per-user grants, wouldn't that solve the issue? On 26 July 2017 at 16:30, Tom Lane wrote: > Robert Haas writes: > > On Tue, Jul 25, 2017 at 10:45 PM, Tom Lane wrote: > >> Instead, I've prepared the attached draft patch, which addresses the > >> problem by teaching pg_backup_archiver.c to process TOC entries in > >> three separate passes, "main" then ACLs then matview refreshes. > > > What worries me a bit is the possibility that something might depend > > on a matview having already been refreshed. I cannot immediately > > think of a case whether such a thing happens that you won't dismiss as > > wrong-headed, but how sure are we that none such will arise? > > Um, there are already precisely zero guarantees about that. pg_dump > has always been free to order these actions in any way that satisfies > the dependencies it knows about. > > It's certainly possible to break it by introducing hidden dependencies > within CHECK conditions. But that's always been true, with or without > materialized views, and we've always dismissed complaints about it with > "sorry, we won't support that". (I don't think the trigger case is > such a problem, because we restore data before creating any triggers.) > > Meanwhile, we have problems that bite people who aren't doing anything > stranger than having a matview owned by a non-superuser. How do you > propose to fix that without reordering pg_dump's actions? > > Lastly, the proposed patch has the advantage that it acts at the restore > stage, not when a dump is being prepared. Since it isn't affecting what > goes into dump archives, it doesn't tie our hands if we think of some > better idea later. > > regards, tom lane >
Re: [HACKERS] Change in "policy" on dump ordering?
Robert Haas writes: > On Tue, Jul 25, 2017 at 10:45 PM, Tom Lane wrote: >> Instead, I've prepared the attached draft patch, which addresses the >> problem by teaching pg_backup_archiver.c to process TOC entries in >> three separate passes, "main" then ACLs then matview refreshes. > What worries me a bit is the possibility that something might depend > on a matview having already been refreshed. I cannot immediately > think of a case whether such a thing happens that you won't dismiss as > wrong-headed, but how sure are we that none such will arise? Um, there are already precisely zero guarantees about that. pg_dump has always been free to order these actions in any way that satisfies the dependencies it knows about. It's certainly possible to break it by introducing hidden dependencies within CHECK conditions. But that's always been true, with or without materialized views, and we've always dismissed complaints about it with "sorry, we won't support that". (I don't think the trigger case is such a problem, because we restore data before creating any triggers.) Meanwhile, we have problems that bite people who aren't doing anything stranger than having a matview owned by a non-superuser. How do you propose to fix that without reordering pg_dump's actions? Lastly, the proposed patch has the advantage that it acts at the restore stage, not when a dump is being prepared. Since it isn't affecting what goes into dump archives, it doesn't tie our hands if we think of some better idea later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On Tue, Jul 25, 2017 at 10:45 PM, Tom Lane wrote: > Instead, I've prepared the attached draft patch, which addresses the > problem by teaching pg_backup_archiver.c to process TOC entries in > three separate passes, "main" then ACLs then matview refreshes. > It's survived light testing but could doubtless use further review. What worries me a bit is the possibility that something might depend on a matview having already been refreshed. I cannot immediately think of a case whether such a thing happens that you won't dismiss as wrong-headed, but how sure are we that none such will arise? I mean, a case that would actually break is if you had a CHECK constraint or a functional index that contained a function which referenced a materialized view for some validation or transformation purpose. Then it wouldn't be formally immutable, of course. But maybe we can imagine that some other case not involving lying could exist, or come to exist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
I wrote: > The main problem with Kevin's fix, after thinking about it more, is that > it shoves matview refresh commands into the same final processing phase > where ACLs are done, which means that in a parallel restore they will not > be done in parallel. That seems like a pretty serious objection, although > maybe not so serious that we'd be willing to accept a major rewrite in the > back branches to avoid it. > I'm wondering at this point about having restore create a fake DO_ACLS > object (fake in the sense that it isn't in the dump file) that would > participate normally in the dependency sort, and which we'd give a > priority before matview refreshes but after everything else. "Restore" > of that object would perform the same operation we do now of running > through the whole TOC and emitting grants/revokes. So it couldn't be > parallelized in itself (at least not without an additional batch of work) > but it could be treated as an indivisible parallelized task, and then the > matview refreshes could be parallelizable tasks after that. > There's also Peter's proposal of splitting up GRANTs from REVOKEs and > putting only the latter at the end. I'm not quite convinced that that's > a good idea but it certainly merits consideration. After studying things for awhile, I've concluded that that last option is probably not workable. ACL items contain a blob of SQL that would be tricky to pull apart, and is both version and options dependent, and contains ordering dependencies that seem likely to defeat any desire to put the REVOKEs last anyway. Instead, I've prepared the attached draft patch, which addresses the problem by teaching pg_backup_archiver.c to process TOC entries in three separate passes, "main" then ACLs then matview refreshes. It's survived light testing but could doubtless use further review. Another way we could attack this is to adopt something similar to the PRE_DATA_BOUNDARY/POST_DATA_BOUNDARY mechanism; that is, invent more dummy section boundary objects, add dependencies sufficient to constrain all TOC objects to be before or after the appropriate boundaries, and then let the dependency sort go at it. But I think that way is probably more expensive than this one, and it doesn't have any real advantage if there's not a potential for circular dependencies that need to be broken. If somebody else wants to try drafting a patch like that, I won't stand in the way, but I don't wanna do so. Not clear where we want to go from here. Should we try to get this into next month's minor releases, or review it in September's commitfest and back-patch after that? regards, tom lane diff --git a/src/bin/pg_dump/pg_backup_archiver.h b/src/bin/pg_dump/pg_backup_archiver.h index 6123859..3687687 100644 *** a/src/bin/pg_dump/pg_backup_archiver.h --- b/src/bin/pg_dump/pg_backup_archiver.h *** typedef enum *** 203,208 --- 203,230 OUTPUT_OTHERDATA /* writing data as INSERT commands */ } ArchiverOutput; + /* + * For historical reasons, ACL items are interspersed with everything else in + * a dump file's TOC; typically they're right after the object they're for. + * However, we need to restore data before ACLs, as otherwise a read-only + * table (ie one where the owner has revoked her own INSERT privilege) causes + * data restore failures. On the other hand, matview REFRESH commands should + * come out after ACLs, as otherwise non-superuser-owned matviews might not + * be able to execute. (If the permissions at the time of dumping would not + * allow a REFRESH, too bad; we won't fix that for you.) These considerations + * force us to make three passes over the TOC, restoring the appropriate + * subset of items in each pass. We assume that the dependency sort resulted + * in an appropriate ordering of items within each subset. + */ + typedef enum + { + RESTORE_PASS_MAIN = 0, /* Main pass (most TOC item types) */ + RESTORE_PASS_ACL, /* ACL item types */ + RESTORE_PASS_REFRESH /* Matview REFRESH items */ + + #define RESTORE_PASS_LAST RESTORE_PASS_REFRESH + } RestorePass; + typedef enum { REQ_SCHEMA = 0x01, /* want schema */ *** struct _archiveHandle *** 329,334 --- 351,357 int noTocComments; ArchiverStage stage; ArchiverStage lastErrorStage; + RestorePass restorePass; /* used only during parallel restore */ struct _tocEntry *currentTE; struct _tocEntry *lastErrorTE; }; diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c index f461692..4cfb71c 100644 *** a/src/bin/pg_dump/pg_backup_archiver.c --- b/src/bin/pg_dump/pg_backup_archiver.c *** static ArchiveHandle *_allocAH(const cha *** 58,64 SetupWorkerPtrType setupWorkerPtr); static void _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH); ! static void _printTocEntry(ArchiveHandle *AH, TocEntry *te, bool isData, bool acl_pass); stati
Re: [HACKERS] Change in "policy" on dump ordering?
Stephen Frost writes: > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: >> On 3/6/17 03:33, Michael Banck wrote: >>> Would this be a candidate for backpatching, or is the behaviour change >>> in pg_dump trumping the issues it solves? >> Unless someone literally has a materialized view on pg_policy, it >> wouldn't make a difference, so I'm not very keen on bothering to >> backpatch this. > Agreed. So actually, the problem with Jim's patch is that it doesn't fix the problem. pg_dump's attempts to REFRESH matviews will still fail in common cases, because they still come out before GRANTs, because pg_dump treats ACLs as a completely independent thing to be done last. This was noted as far back as 2015 (in a thread previously linked from this thread), and it's also the cause of Jordan Gigov's current complaint at https://www.postgresql.org/message-id/CA%2BnBocAmQ%2BOPNSKUzaaLa-6eGYVw5KqexWJaRoGvrvLyDir9gg%40mail.gmail.com Digging around in the archives, I find that Kevin had already proposed a fix in https://www.postgresql.org/message-id/flat/20160202161407.2778.24659%40wrigleys.postgresql.org which I didn't particularly care for, and apparently nobody else did either. But we really oughta do *something*. The main problem with Kevin's fix, after thinking about it more, is that it shoves matview refresh commands into the same final processing phase where ACLs are done, which means that in a parallel restore they will not be done in parallel. That seems like a pretty serious objection, although maybe not so serious that we'd be willing to accept a major rewrite in the back branches to avoid it. I'm wondering at this point about having restore create a fake DO_ACLS object (fake in the sense that it isn't in the dump file) that would participate normally in the dependency sort, and which we'd give a priority before matview refreshes but after everything else. "Restore" of that object would perform the same operation we do now of running through the whole TOC and emitting grants/revokes. So it couldn't be parallelized in itself (at least not without an additional batch of work) but it could be treated as an indivisible parallelized task, and then the matview refreshes could be parallelizable tasks after that. There's also Peter's proposal of splitting up GRANTs from REVOKEs and putting only the latter at the end. I'm not quite convinced that that's a good idea but it certainly merits consideration. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote: > On 3/6/17 03:33, Michael Banck wrote: > > Would this be a candidate for backpatching, or is the behaviour change > > in pg_dump trumping the issues it solves? > > Unless someone literally has a materialized view on pg_policy, it > wouldn't make a difference, so I'm not very keen on bothering to > backpatch this. Agreed. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Change in "policy" on dump ordering?
On 3/4/17 11:49 AM, Peter Eisentraut wrote: I wonder whether we should emphasize this change by assigning DO_REFRESH_MATVIEW a higher number, like 100? Since there wasn't any interest in that idea, I have committed Jim's patch as is. Thanks. Something else that seems somewhat useful would be to have the sort defined by an array of the ENUM values in the correct order, and then have the code do the mechanical map generation. I'm guessing the only reasonable way to make that work would be to have some kind of a last item indicator value, so you know how many values were in the ENUM. Maybe there's a better way to do that... -- Jim Nasby, Chief Data Architect, OpenSCG -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 3/6/17 03:33, Michael Banck wrote: > Would this be a candidate for backpatching, or is the behaviour change > in pg_dump trumping the issues it solves? Unless someone literally has a materialized view on pg_policy, it wouldn't make a difference, so I'm not very keen on bothering to backpatch this. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Hi, On Sat, Mar 04, 2017 at 02:49:36PM -0500, Peter Eisentraut wrote: > On 3/1/17 08:36, Peter Eisentraut wrote: > > On 2/22/17 18:24, Jim Nasby wrote: > >>> Yes, by that logic matview refresh should always be last. > >> > >> Patches for head attached. > >> > >> RLS was the first item added after DO_REFRESH_MATVIEW, which was added > >> in 9.5. So if we want to treat this as a bug, they'd need to be patched > >> as well, which is a simple matter of swapping 33 and 34. > > > > I wonder whether we should emphasize this change by assigning > > DO_REFRESH_MATVIEW a higher number, like 100? > > Since there wasn't any interest in that idea, I have committed Jim's > patch as is. Would this be a candidate for backpatching, or is the behaviour change in pg_dump trumping the issues it solves? Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 3/1/17 08:36, Peter Eisentraut wrote: > On 2/22/17 18:24, Jim Nasby wrote: >>> Yes, by that logic matview refresh should always be last. >> >> Patches for head attached. >> >> RLS was the first item added after DO_REFRESH_MATVIEW, which was added >> in 9.5. So if we want to treat this as a bug, they'd need to be patched >> as well, which is a simple matter of swapping 33 and 34. > > I wonder whether we should emphasize this change by assigning > DO_REFRESH_MATVIEW a higher number, like 100? Since there wasn't any interest in that idea, I have committed Jim's patch as is. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 18:24, Jim Nasby wrote: > On 2/22/17 12:29 PM, Peter Eisentraut wrote: >> On 2/22/17 10:14, Jim Nasby wrote: >>> CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; >>> SELECT 0 >>> >>> IOW, you can create matviews that depend on any other >>> table/view/matview, but right now if the matview includes certain items >>> it will mysteriously end up empty post-restore. >> >> Yes, by that logic matview refresh should always be last. > > Patches for head attached. > > RLS was the first item added after DO_REFRESH_MATVIEW, which was added > in 9.5. So if we want to treat this as a bug, they'd need to be patched > as well, which is a simple matter of swapping 33 and 34. I wonder whether we should emphasize this change by assigning DO_REFRESH_MATVIEW a higher number, like 100? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Hi, I've found the (AIUI) previous discussion about this, it's Bug #13907: https://www.postgresql.org/message-id/flat/20160202161407.2778.24659%40wrigleys.postgresql.org#20160202161407.2778.24...@wrigleys.postgresql.org On Wed, Feb 22, 2017 at 05:24:49PM -0600, Jim Nasby wrote: > On 2/22/17 12:29 PM, Peter Eisentraut wrote: > >On 2/22/17 10:14, Jim Nasby wrote: > >>CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; > >>SELECT 0 > >> > >>IOW, you can create matviews that depend on any other > >>table/view/matview, but right now if the matview includes certain items > >>it will mysteriously end up empty post-restore. > > > >Yes, by that logic matview refresh should always be last. In https://www.postgresql.org/message-id/9af4bc32-7e55-a21d-47e7-608582a8c48d%402ndquadrant.com you (Peter) wrote: "The reason that ACLs are restored last is that they could contain owner self-revokes. So anything that you run after the ACLs could fail because of that. I think a more complete fix would be to split up the ACL restores into the general part, which you would run right after the object is restored, and the owner revokes, which you would run last." > Patches for head attached. FWIW, Keven Grittner had proposed a more involved patch in https://www.postgresql.org/message-id/CACjxUsNmpQDL58zRm3EFS9atqGT8%2BX_2%2BFOCXpYBwWZw5wgi-A%40mail.gmail.com Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 5:38 PM, Michael Banck wrote: diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index ea643397ba..708a47f3cb 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -26,6 +26,9 @@ static const char *modulename = gettext_noop("sorter"); * Sort priority for database object types. * Objects are sorted by type, and within a type by name. * + * Because materialized views can potentially reference system views, + * DO_REFRESH_MATVIEW should always be the last thing on the list. + * I think this comment is overly specific: any materialized view that references a view or table in a different schema (pg_catalog or not) will likely not refresh on pg_restore AIUI, so singling out system views doesn't look right to me. This isn't a matter of excluded schemas. The problem is that if you had a matview that referenced a system view for something that was restored after DO_REFRESH_MATVIEW (such as subscriptions) then the view would be inaccurate after the restore. Stephen, hopefully that answers your question as well. :) -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Hi, On Wed, Feb 22, 2017 at 05:24:49PM -0600, Jim Nasby wrote: > On 2/22/17 12:29 PM, Peter Eisentraut wrote: > >On 2/22/17 10:14, Jim Nasby wrote: > >>CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; > >>SELECT 0 > >> > >>IOW, you can create matviews that depend on any other > >>table/view/matview, but right now if the matview includes certain items > >>it will mysteriously end up empty post-restore. > > > >Yes, by that logic matview refresh should always be last. Glad to hear - I vaguely remember this coming up in a different thread some time ago, and I though you (Peter) had reservations about moving things past after the ACL step, but I couldn't find this thread now anymore, only https://www.postgresql.org/message-id/11166.1424357659%40sss.pgh.pa.us > Patches for head attached. Yay. > diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c > index ea643397ba..708a47f3cb 100644 > --- a/src/bin/pg_dump/pg_dump_sort.c > +++ b/src/bin/pg_dump/pg_dump_sort.c > @@ -26,6 +26,9 @@ static const char *modulename = gettext_noop("sorter"); > * Sort priority for database object types. > * Objects are sorted by type, and within a type by name. > * > + * Because materialized views can potentially reference system views, > + * DO_REFRESH_MATVIEW should always be the last thing on the list. > + * I think this comment is overly specific: any materialized view that references a view or table in a different schema (pg_catalog or not) will likely not refresh on pg_restore AIUI, so singling out system views doesn't look right to me. Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
Jim, * Jim Nasby (jim.na...@bluetreble.com) wrote: > On 2/22/17 12:29 PM, Peter Eisentraut wrote: > >On 2/22/17 10:14, Jim Nasby wrote: > >>CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; > >>SELECT 0 > >> > >>IOW, you can create matviews that depend on any other > >>table/view/matview, but right now if the matview includes certain items > >>it will mysteriously end up empty post-restore. > > > >Yes, by that logic matview refresh should always be last. > > Patches for head attached. > > RLS was the first item added after DO_REFRESH_MATVIEW, which was > added in 9.5. So if we want to treat this as a bug, they'd need to > be patched as well, which is a simple matter of swapping 33 and 34. Can you clarify what misbehavior there is with RLS that would warrent this being a bug..? I did consider where in the dump I thought policies should go, though I may certainly have overlooked something. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 12:29 PM, Peter Eisentraut wrote: On 2/22/17 10:14, Jim Nasby wrote: CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; SELECT 0 IOW, you can create matviews that depend on any other table/view/matview, but right now if the matview includes certain items it will mysteriously end up empty post-restore. Yes, by that logic matview refresh should always be last. Patches for head attached. RLS was the first item added after DO_REFRESH_MATVIEW, which was added in 9.5. So if we want to treat this as a bug, they'd need to be patched as well, which is a simple matter of swapping 33 and 34. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index ea643397ba..708a47f3cb 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -26,6 +26,9 @@ static const char *modulename = gettext_noop("sorter"); * Sort priority for database object types. * Objects are sorted by type, and within a type by name. * + * Because materialized views can potentially reference system views, + * DO_REFRESH_MATVIEW should always be the last thing on the list. + * * NOTE: object-type priorities must match the section assignments made in * pg_dump.c; that is, PRE_DATA objects must sort before DO_PRE_DATA_BOUNDARY, * POST_DATA objects must sort after DO_POST_DATA_BOUNDARY, and DATA objects @@ -70,11 +73,11 @@ static const int dbObjectTypePriority[] = 22, /* DO_PRE_DATA_BOUNDARY */ 26, /* DO_POST_DATA_BOUNDARY */ 33, /* DO_EVENT_TRIGGER */ - 34, /* DO_REFRESH_MATVIEW */ - 35, /* DO_POLICY */ - 36, /* DO_PUBLICATION */ - 37, /* DO_PUBLICATION_REL */ - 38 /* DO_SUBSCRIPTION */ + 38, /* DO_REFRESH_MATVIEW */ + 34, /* DO_POLICY */ + 35, /* DO_PUBLICATION */ + 36, /* DO_PUBLICATION_REL */ + 37 /* DO_SUBSCRIPTION */ }; static DumpId preDataBoundId; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 10:14, Jim Nasby wrote: > CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; > SELECT 0 > > IOW, you can create matviews that depend on any other > table/view/matview, but right now if the matview includes certain items > it will mysteriously end up empty post-restore. Yes, by that logic matview refresh should always be last. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 8:00 AM, Peter Eisentraut wrote: Actually, I think matviews really need to be the absolute last thing. What if you had a matview that referenced publications or subscriptions? I'm guessing that would be broken right now. I'm not sure what you have in mind here. Publications and subscriptions don't interact with materialized views, so the relative order doesn't really matter. CREATE MATERIALIZED VIEW tmv AS SELECT * FROM pg_subscription; SELECT 0 IOW, you can create matviews that depend on any other table/view/matview, but right now if the matview includes certain items it will mysteriously end up empty post-restore. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/22/17 00:55, Jim Nasby wrote: > Originally, no, but reviewing the list again I'm kindof wondering about > DO_DEFAULT_ACL, especially since the acl code in pg_dump looks at > defaults as part of what removes the need to explicitly dump > permissions. I'm also wondering if DO_POLICY could potentially affect > matviews? I'm not sure about the details of these, but I know that there are reasons why the permissions stuff is pretty late in the dump in general. > Actually, I think matviews really need to be the absolute last thing. > What if you had a matview that referenced publications or subscriptions? > I'm guessing that would be broken right now. I'm not sure what you have in mind here. Publications and subscriptions don't interact with materialized views, so the relative order doesn't really matter. > Not really; it just makes reference to needing to be in-sync with > pg_dump.c. My concern is that clearly people went to lengths in the past > to put everything possible before DO_PRE_DATA_BOUNDARY (ie, text search > and FDW) but most recently added stuff has gone after > DO_POST_DATA_BOUNDARY, even though there's no reason it couldn't be > pre-data. That's certainly a change, and I suspect it's not intentional I think the recent additions actually were intentional, although one could debate the intentions. ;-) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/21/17 4:25 PM, Peter Eisentraut wrote: On 2/21/17 14:58, Jim Nasby wrote: AFAICT in older versions only object types that absolutely had to wait for DO_POST_DATA_BOUNDARY would do so. More recently though, objects are being added after that (presumably because it's easier than renumbering everything in dbObjectTypePriority). Is there any specific assignment that you have concerns about? Originally, no, but reviewing the list again I'm kindof wondering about DO_DEFAULT_ACL, especially since the acl code in pg_dump looks at defaults as part of what removes the need to explicitly dump permissions. I'm also wondering if DO_POLICY could potentially affect matviews? Actually, I think matviews really need to be the absolute last thing. What if you had a matview that referenced publications or subscriptions? I'm guessing that would be broken right now. Is this change a good or bad idea? Should there be an official guide for where new things go? The comment above dbObjectTypePriority explains it, doesn't it? Not really; it just makes reference to needing to be in-sync with pg_dump.c. My concern is that clearly people went to lengths in the past to put everything possible before DO_PRE_DATA_BOUNDARY (ie, text search and FDW) but most recently added stuff has gone after DO_POST_DATA_BOUNDARY, even though there's no reason it couldn't be pre-data. That's certainly a change, and I suspect it's not intentional (other than it's obviously less work to stick stuff at the end, but that could be fixed by having an array of the actual enum values and just having pg_dump sort that when it starts). -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change in "policy" on dump ordering?
On 2/21/17 14:58, Jim Nasby wrote: > AFAICT in older versions only object types that absolutely had to wait > for DO_POST_DATA_BOUNDARY would do so. More recently though, objects are > being added after that (presumably because it's easier than renumbering > everything in dbObjectTypePriority). Is there any specific assignment that you have concerns about? > Is this change a good or bad idea? Should there be an official guide for > where new things go? The comment above dbObjectTypePriority explains it, doesn't it? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Change in "policy" on dump ordering?
AFAICT in older versions only object types that absolutely had to wait for DO_POST_DATA_BOUNDARY would do so. More recently though, objects are being added after that (presumably because it's easier than renumbering everything in dbObjectTypePriority). Is this change a good or bad idea? Should there be an official guide for where new things go? -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com 855-TREBLE2 (855-873-2532) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers