Re: [HACKERS] Optimizing numeric SUM() aggregate
>I think we could do carry every 0x7FFFFFFF / 10000 accumulation, couldn't we?

I feel that I have to elaborate a bit. Probably my calculations are wrong.

Let's assume we have already accumulated INT_MAX worth of digits in the previous-place accumulator. That's almost an overflow, but it is not an overflow. Carrying that accumulator into the current one gives us an INT_MAX / 10000 carried sum. So in the current-place accumulator we can still accumulate ( INT_MAX - INT_MAX / 10000 ) / 9999 additions, where 9999 is the maximum value dropped into the current-place accumulator on each addition. That is INT_MAX * (9999 / 10000) / 9999, or simply INT_MAX / 10000. If we use an unsigned 32-bit integer, that is 429496, which is about 43 times less frequent carrying. As a bonus, we get rid of the 9999 const in the code (:

Please correct me if I'm wrong.

Best regards, Andrey Borodin, Octonica & Ural Federal University.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost wrote:
> That'd be great. It's definitely on my list of things to look into, but
> I'm extremely busy this week. I hope to look into it on Friday, would
> be great to see what you find.

Sequences that are directly defined in extensions do not get dumped, while sequences that are part of a serial column in an extension are getting dumped. Looking into this problem, getOwnedSeqs() is visibly making an incorrect assumption: sequences owned by table columns are dumped unconditionally, but this should not be true for sequences that are part of extensions. More precisely, dobj->dump is being forced to DUMP_COMPONENT_ALL, which causes the sequence definition to be dumped. Oops.

The attached patch fixes the problem for me. I have also added tests in test_pg_dump, in the shape of sequences defined in an extension and sequences that are part of a serial column. This patch also handles the case where a sequence is created as part of a serial column and gets removed afterwards, say with ALTER EXTENSION DROP SEQUENCE. The behavior for sequences and serial columns that are not part of extensions is unchanged.

Stephen, it would be good if you could check the correctness of this patch, as you did all the recent refactoring of pg_dump to support catalog ACLs. I am sure, by the way, that checking for (owning_tab->dobj.dump & DUMP_COMPONENT_DEFINITION) != 0 is not good, because of, for example, the case of a serial column created in an extension where the sequence is dropped from the extension afterwards.
-- Michael

diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 08c2b0c..0278995 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -6037,6 +6037,8 @@ getOwnedSeqs(Archive *fout, TableInfo tblinfo[], int numTables)
 			continue;			/* not an owned sequence */
 		if (seqinfo->dobj.dump & DUMP_COMPONENT_DEFINITION)
 			continue;			/* no need to search */
+		if (seqinfo->dobj.ext_member)
+			continue;			/* member of an extension */
 		owning_tab = findTableByOid(seqinfo->owning_tab);
 		if (owning_tab && owning_tab->dobj.dump)
 		{
diff --git a/src/test/modules/test_pg_dump/t/001_base.pl b/src/test/modules/test_pg_dump/t/001_base.pl
index fd9c37f..9caee93 100644
--- a/src/test/modules/test_pg_dump/t/001_base.pl
+++ b/src/test/modules/test_pg_dump/t/001_base.pl
@@ -226,7 +226,7 @@ my %tests = (
 	'CREATE TABLE regress_pg_dump_table' => {
 		regexp => qr/^
 			\QCREATE TABLE regress_pg_dump_table (\E
-			\n\s+\Qcol1 integer,\E
+			\n\s+\Qcol1 integer NOT NULL,\E
 			\n\s+\Qcol2 integer\E
 			\n\);$/xm,
 		like => { binary_upgrade => 1, },
@@ -241,6 +241,48 @@ my %tests = (
 			schema_only=> 1,
 			section_pre_data => 1,
 			section_post_data => 1, }, },
+	'CREATE SEQUENCE regress_pg_dump_table_col1_seq' => {
+		regexp => qr/^
+			\QCREATE SEQUENCE regress_pg_dump_table_col1_seq\E
+			\n\s+\QSTART WITH 1\E
+			\n\s+\QINCREMENT BY 1\E
+			\n\s+\QNO MINVALUE\E
+			\n\s+\QNO MAXVALUE\E
+			\n\s+\QCACHE 1;\E
+			$/xm,
+		like => { binary_upgrade => 1, },
+		unlike => {
+			clean => 1,
+			clean_if_exists=> 1,
+			createdb => 1,
+			defaults => 1,
+			no_privs => 1,
+			no_owner => 1,
+			pg_dumpall_globals => 1,
+			schema_only=> 1,
+			section_pre_data => 1,
+			section_post_data => 1, }, },
+	'CREATE SEQUENCE regress_pg_dump_seq' => {
+		regexp => qr/^
+			\QCREATE SEQUENCE regress_pg_dump_seq\E
+			\n\s+\QSTART WITH 1\E
+			\n\s+\QINCREMENT BY 1\E
+			\n\s+\QNO MINVALUE\E
+			\n\s+\QNO MAXVALUE\E
+			\n\s+\QCACHE 1;\E
+			$/xm,
+		like => { binary_upgrade => 1, },
+		unlike => {
+			clean => 1,
+			clean_if_exists=> 1,
+			createdb => 1,
+			defaults => 1,
+			no_privs => 1,
+			no_owner => 1,
+			pg_dumpall_globals => 1,
+			schema_only=> 1,
+			section_pre_data => 1,
+			section_post_data => 1, }, },
 	'CREATE ACCESS METHOD regress_test_am' => {
 		regexp => qr/^
 			\QCREATE ACCESS METHOD regress_test_am TYPE INDEX HANDLER bthandler;\E
diff --git a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
index 5fe6063..93de2c5 100644
--- a/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
+++ b/src/test/modules/test_pg_dump/test_pg_dump--1.0.sql
@@ -4,10 +4,12 @@
 \echo Use "CREATE EXTENSION test_pg_dump" to load this file. \quit
 
 CREATE TABLE regress_pg_dump_table (
-	col1 int,
+	col1 serial,
 	col2 int
 );
 
+CREATE SEQUENCE regress_pg_dump_seq;
+
 GRANT SELECT ON regress_pg_dump_table TO regress_dump_test_role;
 
 GRANT SELECT(col1) ON regress_pg_dump_table TO public;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copyParamList
On Wed, Jul 27, 2016 at 2:03 AM, Robert Haas wrote:
> On Fri, Jul 8, 2016 at 2:28 AM, Amit Kapila wrote:
>> On Tue, May 31, 2016 at 10:10 PM, Robert Haas wrote:
>>> On Fri, May 27, 2016 at 6:07 PM, Andrew Gierth wrote:
>>>> copyParamList does not respect from->paramMask, in what looks to me
>>>> like an obvious oversight:
>>>>
>>>> retval->paramMask = NULL;
>>>> [...]
>>>> /* Ignore parameters we don't need, to save cycles and space. */
>>>> if (retval->paramMask != NULL &&
>>>>     !bms_is_member(i, retval->paramMask))
>>>>
>>>> retval->paramMask is never set to anything not NULL in this function,
>>>> so surely that should either be initializing it to from->paramMask, or
>>>> checking from->paramMask in the conditional?
>>>
>>> Oh, dear. I think you are right. I'm kind of surprised this didn't
>>> provoke a test failure somewhere.
>>
>> The reason why it didn't provoke any test failure is that it doesn't
>> seem to be possible for from->paramMask in copyParamList to ever be
>> non-NULL. Params formed will always have paramMask set to NULL in the
>> code paths where copyParamList is used. I think we can just assign
>> from->paramMask to retval->paramMask to make sure that even if it gets
>> used in future, the code works as expected. Alternatively, one might
>> think of adding an Assert there, but that doesn't seem to be
>> future-proof.
>
> Hmm, I don't think this is the right fix. The point of the
> bms_is_member test is that we don't want to copy any parameters that
> aren't needed; but if we avoid copying parameters that aren't needed,
> then it's fine for the new ParamListInfo to have a paramMask of NULL.
> So I think it's actually correct that we set it that way, and I think
> it's better than saving a pointer to the original paramMask,
> because (1) it's probably slightly faster to have it as NULL and (2)
> the old paramMask might not be allocated in the same memory context as
> the new ParamListInfo. So I think we instead ought to fix it as in
> the attached.
Okay, that makes sense to me, apart from the minor issue reported by Andrew. I think we might also want to slightly rephrase the comment on top of copyParamList(), which currently talks only about dynamic parameter hooks.

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Wed, Jul 27, 2016 at 3:24 AM, Robert Haas wrote:
> On Sat, Jul 23, 2016 at 3:55 AM, Amit Kapila wrote:
>> Attached patch fixes the problem for me. Note, I have not tried to
>> reproduce the problem for heap_lock_updated_tuple_rec(), but I think
>> if you are convinced with above cases, then we should have a similar
>> check in it as well.
>
> I don't think this hunk is correct:
>
> +        /*
> +         * If we didn't pin the visibility map page and the page has become
> +         * all visible, we'll have to unlock and re-lock.  See heap_lock_tuple
> +         * for details.
> +         */
> +        if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buf)))
> +        {
> +            LockBuffer(buf, BUFFER_LOCK_UNLOCK);
> +            visibilitymap_pin(rel, block, &vmbuffer);
> +            LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
> +            goto l4;
> +        }
>
> The code beginning at label l4 assumes that the buffer is unlocked,
> but this code leaves the buffer locked. Also, I don't see the point
> of doing this test so far down in the function. Why not just recheck
> *immediately* after taking the buffer lock?

Right, in this case we can recheck immediately after taking the buffer lock; updated patch attached. In passing, I have noticed that heap_delete() doesn't do this unlocking, pinning of the vm and re-locking at the appropriate place. It just checks immediately after taking the lock, whereas in the code further down it does unlock and re-lock the buffer. I think we should do it as in the attached patch (pin_vm_heap_delete-v1.patch).

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

pin_vm_lock_tuple-v2.patch
Description: Binary data

pin_vm_heap_delete-v1.patch
Description: Binary data

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans
> I noticed that currently the core doesn't show any information on the
> target relations involved in a foreign/custom join in EXPLAIN, by
> itself. Here is an example:
>
> -- join two tables
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY
> t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>                               QUERY PLAN
> --------------------------------------------------------------------------
>  Limit
>    Output: t1.c1, t2.c1, t1.c3
>    ->  Foreign Scan
>          Output: t1.c1, t2.c1, t1.c3
>          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
>          Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
> (6 rows)
>
> postgres_fdw shows the target relations in the Relations line, as shown
> above, but I think that the core should show such information
> independently of FDWs; in the above example replace "Foreign Scan" with
> "Foreign Join on public.ft1 t1, public.ft2 t2". Please find attached a
> patch for that. Comments welcome!

This output is, at least, not incorrect. This ForeignScan actually scans a relation that consists of two tables joined on the remote side. So it is not incorrect, but it may not be convenient for users who don't have deep knowledge of the internals.

On the other hand, I cannot be happy with your patch, because it assumes that the role of a ForeignScan/CustomScan with scanrelid==0 is always a join. However, that does not cover all the cases. When postgres_fdw gets support for remote partial aggregates, we can implement the feature using a ForeignScan with scanrelid==0. Is it a join? No. The core probably cannot know the exact purpose of a ForeignScan/CustomScan with scanrelid==0. It can be a remote partial aggregation. It can be an alternative sort logic by GPU. It can be something else.

So, I think the extension needs to give the core a more appropriate label to show, including a flag to control whether the relation(s) name shall be attached next to the ForeignScan/CustomScan line. I'd like to suggest adding the two fields below:
- An alternative label the extension wants to show instead of the default one
- A flag to turn on/off printing the relation(s) name

EXPLAIN can then print the proper information according to these requirements.

Thanks,
-- NEC Business Creation Division / PG-Strom Project KaiGai Kohei

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copyParamList
> "Robert" == Robert Haas writes: Robert> So I think we instead ought to fix it as in the attached. Robert>if (retval->paramMask != NULL && Robert> - !bms_is_member(i, retval->paramMask)) Robert> + !bms_is_member(i, from->paramMask)) Need to change both references to retval->paramMask, not just one. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Wed, Jul 27, 2016 at 03:24:28AM +0200, Vik Fearing wrote: > On 27/07/16 03:15, Peter Eisentraut wrote: > > On 7/26/16 6:14 PM, Vik Fearing wrote: > >> As mentioned elsewhere in the thread, you can just do WHERE true > >> to get around it, so why on Earth have it PGC_SUSET? > > > > I'm not sure whether it's supposed to guard against typos and > > possibly buggy SQL string concatenation in application code. So > > it would help against accidental mistakes, whereas putting a WHERE > > TRUE in there would be an intentional override. > > If buggy SQL string concatenation in application code is your > argument, quite a lot of them add "WHERE true" so that they can just > append a bunch of "AND ..." clauses without worrying if it's the > first (or last, whatever), so I'm not sure this is protecting > anything. I am sure that I'm not the only one who's been asked for this feature because people other than me have piped up on this thread to that very effect. I understand that there may well be lots of really meticulous people on this list, people who would never accidentally do an unqualified DELETE on a table in production, but I can't claim to be one of them because I have, and not just once. It's under once a decade, but even that's too many. I'm not proposing to make this feature default, or even available by default, but I am totally certain that this is the kind of feature people would really appreciate, even if it doesn't prevent every catastrophe. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans
On Wed, Jul 27, 2016 at 8:50 AM, Etsuro Fujita wrote:
> Hi,
>
> I noticed that currently the core doesn't show any information on the
> target relations involved in a foreign/custom join in EXPLAIN, by itself.
> Here is an example:
>
> -- join two tables
> EXPLAIN (COSTS false, VERBOSE)
> SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY
> t1.c3, t1.c1 OFFSET 100 LIMIT 10;
>                               QUERY PLAN
> --------------------------------------------------------------------------
>  Limit
>    Output: t1.c1, t2.c1, t1.c3
>    ->  Foreign Scan
>          Output: t1.c1, t2.c1, t1.c3
>          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
>          Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
> (6 rows)
>
> postgres_fdw shows the target relations in the Relations line, as shown
> above, but I think that the core should show such information independently
> of FDWs; in the above example replace "Foreign Scan" with "Foreign Join on
> public.ft1 t1, public.ft2 t2". Please find attached a patch for that.
> Comments welcome!

The patch always prints ForeignJoin when scanrelid <= 0, which would be odd considering that FDWs can now push down post-join operations. We need to devise a better way to convey post-join operations. Maybe something like

Foreign Grouping, aggregation on ...
Foreign Join on ...

But then the question is that a foreign scan node can be pushing down many operations together, e.g. join, aggregation and sort, OR join, aggregation and windowing, OR join and insert. How would we clearly convey this? Maybe we say

Foreign Scan operations: join on ..., aggregation, ...

That wouldn't be so great and might be clumsy for many operations. Any better ideas?

-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
Re: [HACKERS] Why we lost Uber as a user
On Tue, Jul 26, 2016 at 8:27 PM, Stephen Frost wrote:
> * Joshua D. Drake (j...@commandprompt.com) wrote:
>> Hello,
>>
>> The following article is a very good look at some of our limitations
>> and highlights some of the pains many of us have been working
>> "around" since we started using the software.
>>
>> https://eng.uber.com/mysql-migration/
>>
>> Specifically:
>>
>> * Inefficient architecture for writes
>> * Inefficient data replication
>
> The above are related and there are serious downsides to having an extra
> mapping in the middle between the indexes and the heap.
>
> What makes me doubt just how well they understood the issues or what is
> happening is the lack of any mention of hint bits or tuple freezing
> (requiring additional writes).

Yeah. A surprising amount of that post seemed to be devoted to describing how our MVCC architecture works rather than what problem they had with it. I'm not saying we shouldn't take their bad experience seriously - we clearly should - but I don't feel like it's as clear as it could be about exactly where the breakdowns happened. That's why I found Josh's restatement useful - I am assuming without proof that his restatement is accurate.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Oddity in EXPLAIN for foreign/custom join pushdown plans
Hi,

I noticed that currently the core doesn't show any information on the target relations involved in a foreign/custom join in EXPLAIN, by itself. Here is an example:

-- join two tables
EXPLAIN (COSTS false, VERBOSE)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY t1.c3, t1.c1 OFFSET 100 LIMIT 10;
                              QUERY PLAN
--------------------------------------------------------------------------
 Limit
   Output: t1.c1, t2.c1, t1.c3
   ->  Foreign Scan
         Output: t1.c1, t2.c1, t1.c3
         Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
         Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY r1.c3 ASC NULLS LAST, r1."C 1" ASC NULLS LAST
(6 rows)

postgres_fdw shows the target relations in the Relations line, as shown above, but I think that the core should show such information independently of FDWs; in the above example replace "Foreign Scan" with "Foreign Join on public.ft1 t1, public.ft2 t2". Please find attached a patch for that. Comments welcome!

Best regards, Etsuro Fujita

explain-for-foreign-custom-join-pushdown.patch
Description: binary/octet-stream

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Constraint merge and not valid status
At Tue, 26 Jul 2016 13:51:53 +0900, Amit Langote wrote in
>> So, how about splitting the original equalTupleDescs into
>> equalTupleDescs and equalTupleConstraints?
>
> Actually TupleConstr is *part* of the TupleDesc struct, which
> relcache.c tries to preserve in *whole* across a relcache flush, so it
> seems perhaps cleaner to keep the decision centralized in one function:

The word "logical" is the cause of the ambiguity. It could be thought that the relation cache maintains a "physical" tuple descriptor, which is different from the "logical" one.

> keep_tupdesc = equalTupleDescs(relation->rd_att, newrel->rd_att);
> ...
> /* preserve old tupledesc and rules if no logical change */
> if (keep_tupdesc)
>     SWAPFIELD(TupleDesc, rd_att);
>
> Also:
>
> /*
>  * This struct is passed around within the backend to describe the
>  * structure of tuples. For tuples coming from on-disk relations, the
>  * information is collected from the pg_attribute, pg_attrdef, and
>  * pg_constraint catalogs.
>  ...
> typedef struct tupleDesc
> {
>
> It would perhaps have been clearer if there were an rd_constr field that
> relcache.c independently managed.
>
> OTOH, I tend to agree that other places don't quite need the whole thing
> (given also that most other callers except acquire_inherit_sample_rows
> compare transient row types)

Yes, constraints apparently don't affect the shape of generated tuples. On the other hand, the relcache should be aware of changes of constraints. Furthermore, the transient row types have utterly no relation with the constraints of even the underlying relations. So, almost as in your proposition, what do you think of giving equalTupleDescs an extra parameter, but named "logical", which makes it ignore constraints when true? (Obviously the opposite would do as well.) What I felt uneasy about was the name "compare_constr".
regards, -- Kyotaro Horiguchi NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] to_date_valid()
On 07/26/2016 06:25 PM, Peter Eisentraut wrote: On 7/5/16 4:24 AM, Albe Laurenz wrote: But notwithstanding your feeling that you would like your application to break if it makes use of this behaviour, it is a change that might make some people pretty unhappy - nobody can tell how many. What is the use of the existing behavior? You get back an arbitrary implementation dependent value. We don't even guarantee what the value will be. If we changed it to return a different implementation dependent value, would users get upset? No they would not get upset because they wouldn't know. Can we just do the right thing? JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Peter Eisentraut writes:
>> On 7/25/16 3:26 PM, Andrew Gierth wrote:
>>> The issue I ran into was the exact same one as in the JDBC thread I
>>> linked to earlier: correctly interpreting pg_index.indoption (to get the
>>> ASC / DESC and NULLS FIRST/LAST settings), which requires knowing
>>> whether amcanorder is true to determine whether to look at the bits at
>>> all.
>>
>> Maybe we should provide a facility to decode those bits then?
>
> Yeah. I'm not very impressed by the underlying assumption that it's
> okay for client-side code to hard-wire knowledge about what indoption
> bits mean, but not okay for it to hard-wire knowledge about which index
> AMs use which indoption bits. There's something fundamentally wrong
> in that. We don't let psql or pg_dump look directly at indoption, so
> why would we think that third-party client-side code should do so?
>
> Andrew complained upthread that pg_get_indexdef() was too heavyweight
> for his purposes, but it's not clear to me what he wants instead.

I guess I'm missing something, because it seems quite clear to me. He wants to know if the index was built with ASC or DESC, and if it was built with NULLS FIRST or NULLS LAST, just like the JDBC driver. pg_get_indexdef() will return that information, but as an SQL statement with a lot of other information that isn't relevant and is difficult to deal with when all you're trying to do is write an SQL query (no, I don't believe the solution here is to use pg_get_indexdef() ~ 'DESC').

For my 2c, I'd like to see pg_dump able to use the catalog tables to derive the index definition, just as it manages to figure out table definitions without (for the most part) using functions.
More generally, I believe we should be working to reach a point where we can reconstruct all objects in the database using just the catalog, without any SQL bits being provided from special functions which access information that isn't available at the SQL level. I don't see any problem with what Andrew has proposed as the information returned informs the creation of the DDL statement, but does not provide a textual "drop-in"/black-box component to include in the statement to recreate the object, the way pg_get_indexdef() does. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
On Wed, Jul 27, 2016 at 12:22 AM, Robbie Harwood wrote:
> Michael Paquier writes:
>> On Tue, Jul 26, 2016 at 5:58 AM, Robbie Harwood wrote:
>>> Robbie Harwood writes:
>>
>> Sorry for my late reply.
>
> Thanks for the feedback!
>
>>>> If I were to continue as I have been - using the plaintext connection
>>>> and auth negotiation path - then at the time of startup the client has
>>>> no way of knowing whether to send connection parameters or not.
>>>> Personally, I would be in favor of not frontloading these connection
>>>> parameters over insecure connections, but it is my impression that the
>>>> project does not want to go this way (which is fine).
>>
>> Per the discussion upthread I got the opposite impression: the startup
>> packet should be sent after the connection has been established. SSL
>> does so: the SSL negotiation goes first, and then the startup packet
>> is sent. That's the flow with the status changing from
>> CONNECTION_SSL_START -> CONNECTION_MADE.
>
> We are in agreement, I think. I have structured the referenced
> paragraph badly: for this design, I'm suggesting a separate GSS startup
> path (like SSL), and once a tunnel is established we send the startup
> packet. I probably should have just left this paragraph out.

OK, we're good then.

>>>> On the server, I'll need to implement `hostgss` (by analogy to
>>>> `hostssl`), and we'll want to lock authentication on those connections
>>>> to GSSAPI-only.
>>
>> As well as hostnogss, but I guess that's part of the plan.
>
> Sure, `hostnogss` should be fine. This isn't quadratic, right? We don't
> need hostnogssnossl (or thereabouts)?

We don't need to go that far; users could still do the same with two different lines in pg_hba.conf.

> So there's a connection setting `sslmode` that we'll want something
> similar to here (`gssapimode` or so). `sslmode` has six settings, but I
> think we only need three for GSSAPI: "disable", "allow", and "prefer"
> (which presumably would be the default).
Seeing the debate regarding sslmode these days, I would not say that "prefer" would be the default, but that's an implementation detail.

> Let's suppose we're working with "prefer". GSSAPI will itself check two
> places for credentials: client keytab and ccache. But if we don't find
> credentials there, we as the client have two options on how to proceed.
>
> - First, we could prompt for a password (and then call
>   gss_acquire_cred_with_password() to get credentials), presumably with
>   an empty password meaning to skip GSSAPI. My memory is that the
>   current behavior for GSSAPI auth-only is to prompt for a password if we
>   don't find credentials (and if it isn't, there's no reason not to
>   unless we're opposed to handling the password).
>
> - Second, we could skip GSSAPI and proceed with the next connection
>   method. This might be confusing if the user is then prompted for a
>   password and expects it to be for GSSAPI, but we could probably make
>   it sensible. I think I prefer the first option.

Ah, right. I completely forgot that GSSAPI had its own handling of passwords for users registered to it... Isn't this distinction a good argument for not implementing "prefer", "allow" or any equivalents? By that I mean that we should not have any GSS connection mode that falls back to something else if the first attempt fails. So we would live with the two following modes:
- "disable", to only try a non-GSS connection
- "enable", or "require", to only try a GSS connection

Just having that seems quite acceptable to me as a first implementation.
-- Michael

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
Peter Eisentraut writes: > On 7/25/16 3:26 PM, Andrew Gierth wrote: >> The issue I ran into was the exact same one as in the JDBC thread I >> linked to earlier: correctly interpreting pg_index.indoption (to get the >> ASC / DESC and NULLS FIRST/LAST settings), which requires knowing >> whether amcanorder is true to determine whether to look at the bits at >> all. > Maybe we should provide a facility to decode those bits then? Yeah. I'm not very impressed by the underlying assumption that it's okay for client-side code to hard-wire knowledge about what indoption bits mean, but not okay for it to hard-wire knowledge about which index AMs use which indoption bits. There's something fundamentally wrong in that. We don't let psql or pg_dump look directly at indoption, so why would we think that third-party client-side code should do so? Andrew complained upthread that pg_get_indexdef() was too heavyweight for his purposes, but it's not clear to me what he wants instead. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MSVC pl-perl error message is not verbose enough
On Wed, Jul 27, 2016 at 12:41 AM, John Harvey wrote:
> Because of this, I've submitted a small patch which fixes the verbosity of
> the error message to actually explain what's missing. I hope that this
> patch will be considered for the community, and it would be nice if it was
> back-patched.

Improving this error message a bit looks like a good idea to me. Looking at the code to figure out what build.pl is looking for is a bit of a pain for users just willing to compile the code, so if we can avoid such lookups in a cheap way, let's do it. Instead of your patch, I'd suggest saving $solution->{options}->{perl} . '\lib\CORE\perl*.lib' in a variable and then raising an error based on that. See attached.
-- Michael

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index fe905d3..5fad939 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -578,16 +578,17 @@ sub mkvcbuild
 		}
 	}
 	$plperl->AddReference($postgres);
+	my $perl_path = $solution->{options}->{perl} . '\lib\CORE\perl*.lib';
 	my @perl_libs =
 	  grep { /perl\d+.lib$/ }
-	  glob($solution->{options}->{perl} . '\lib\CORE\perl*.lib');
+	  glob($perl_path);
 	if (@perl_libs == 1)
 	{
 		$plperl->AddLibrary($perl_libs[0]);
 	}
 	else
 	{
-		die "could not identify perl library version";
+		die "could not identify perl library version matching pattern $perl_path\n";
 	}
 
 	# Add transform module dependent on plperl

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
Peter Eisentraut writes: > On 7/26/16 6:14 PM, Vik Fearing wrote: >> As mentioned elsewhere in the thread, you can just do WHERE true to get >> around it, so why on Earth have it PGC_SUSET? > I'm not sure whether it's supposed to guard against typos and possibly > buggy SQL string concatenation in application code. So it would help > against accidental mistakes, whereas putting a WHERE TRUE in there would > be an intentional override. Maybe I misunderstood Vik's point; I thought he was complaining that it's silly to make this SUSET rather than USERSET. I tend to agree. We have a rough consensus that GUCs that change query semantics are bad, but if it simply throws an error (or not) then it's not likely to cause any surprising application behaviors. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
On 7/25/16 3:26 PM, Andrew Gierth wrote: > The issue I ran into was the exact same one as in the JDBC thread I > linked to earlier: correctly interpreting pg_index.indoption (to get the > ASC / DESC and NULLS FIRST/LAST settings), which requires knowing > whether amcanorder is true to determine whether to look at the bits at > all. Maybe we should provide a facility to decode those bits then? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
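If such a decoding facility were provided, the decoding itself is straightforward bit-testing. A minimal sketch follows; describe_indoption is a hypothetical helper, not an existing backend function, though the INDOPTION_* values are the ones currently defined in src/include/catalog/pg_index.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as found in src/include/catalog/pg_index.h */
#define INDOPTION_DESC          0x0001  /* values are in reverse (descending) order */
#define INDOPTION_NULLS_FIRST   0x0002  /* NULLs come first instead of last */

/*
 * Decode one per-column pg_index.indoption value into the ORDER BY wording
 * it represents.  As noted above, this is only meaningful when the index's
 * access method has amcanorder = true; for other AMs the bits are
 * AM-specific (or unused), which is exactly the trap a decoding facility
 * would have to guard against.
 */
static const char *
describe_indoption(int16_t indoption, bool amcanorder)
{
    if (!amcanorder)
        return "(unordered AM: bits not applicable)";
    if (indoption & INDOPTION_DESC)
        return (indoption & INDOPTION_NULLS_FIRST)
            ? "DESC NULLS FIRST" : "DESC NULLS LAST";
    return (indoption & INDOPTION_NULLS_FIRST)
        ? "ASC NULLS FIRST" : "ASC NULLS LAST";
}
```

The amcanorder guard is the crux of the JDBC complaint: without that extra piece of catalog knowledge, a client cannot tell whether the bits mean anything at all.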
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
Peter Eisentraut writes: > On 7/26/16 7:46 PM, Thomas Munro wrote: >> By the way, our documentation says that NOT NULL constraints are >> equivalent to CHECK (column_name IS NOT NULL). That is what the SQL >> standard says, but in fact our NOT NULL constraints are equivalent to >> CHECK (column_name IS DISTINCT FROM NULL). Should we update the >> documentation with something like the attached? > Couldn't we just fix that instead? For NOT NULL constraints on > composite type columns, create a full CHECK (column_name IS NOT NULL) > constraint instead, foregoing the attnotnull optimization. Maybe. There's a patch floating around that expands attnotnull into CHECK constraints, which'd provide the infrastructure needed to consider changing this behavior. But that's not going to be back-patchable, and as I noted in <10682.1469566...@sss.pgh.pa.us>, we have a problem right now that the planner's constraint exclusion logic gets these semantics wrong. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Wed, Jul 27, 2016 at 5:11 AM, Robert Haas wrote: > Committed with minor kibitzing: you don't need an "else" after a > statement that transfers control out of the function. Thanks. Right, I forgot that. > Shouldn't > pg_get_function_arguments, pg_get_function_identity_arguments, > pg_get_function_result, and pg_get_function_arg_default get the same > treatment? Changing all of them makes sense. Please see attached. While looking at the series of functions pg_get_*, I have noticed as well that pg_get_userbyid() returns "unknown (OID=%u)" when it does not know a user. Perhaps we'd want to change that to NULL for consistency with the rest? -- Michael diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 2915e21..51d0c23 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -2238,11 +2238,11 @@ pg_get_function_arguments(PG_FUNCTION_ARGS) StringInfoData buf; HeapTuple proctup; - initStringInfo(&buf); - proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid)); if (!HeapTupleIsValid(proctup)) - elog(ERROR, "cache lookup failed for function %u", funcid); + PG_RETURN_NULL(); + + initStringInfo(&buf); (void) print_function_arguments(&buf, proctup, false, true); @@ -2264,11 +2264,11 @@ pg_get_function_identity_arguments(PG_FUNCTION_ARGS) StringInfoData buf; HeapTuple proctup; - initStringInfo(&buf); - proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid)); if (!HeapTupleIsValid(proctup)) - elog(ERROR, "cache lookup failed for function %u", funcid); + PG_RETURN_NULL(); + + initStringInfo(&buf); (void) print_function_arguments(&buf, proctup, false, false); @@ -2289,11 +2289,11 @@ pg_get_function_result(PG_FUNCTION_ARGS) StringInfoData buf; HeapTuple proctup; - initStringInfo(&buf); - proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid)); if (!HeapTupleIsValid(proctup)) - elog(ERROR, "cache lookup failed for function %u", funcid); + PG_RETURN_NULL(); + + initStringInfo(&buf); 
print_function_rettype(&buf, proctup); @@ -2547,7 +2547,7 @@ pg_get_function_arg_default(PG_FUNCTION_ARGS) proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid)); if (!HeapTupleIsValid(proctup)) - elog(ERROR, "cache lookup failed for function %u", funcid); + PG_RETURN_NULL(); numargs = get_func_arg_info(proctup, &argtypes, &argnames, &argmodes); if (nth_arg < 1 || nth_arg > numargs || !is_input_argument(nth_arg - 1, argmodes)) diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out index 7c633ac..c5ff318 100644 --- a/src/test/regress/expected/rules.out +++ b/src/test/regress/expected/rules.out @@ -3094,3 +3094,33 @@ SELECT pg_get_viewdef(0); (1 row) +SELECT pg_get_function_arguments(0); + pg_get_function_arguments +--- + +(1 row) + +SELECT pg_get_function_identity_arguments(0); + pg_get_function_identity_arguments + + +(1 row) + +SELECT pg_get_function_result(0); + pg_get_function_result + + +(1 row) + +SELECT pg_get_function_arg_default(0, 0); + pg_get_function_arg_default +- + +(1 row) + +SELECT pg_get_function_arg_default('pg_class'::regclass, 0); + pg_get_function_arg_default +- + +(1 row) + diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql index 111c9ba..835945f 100644 --- a/src/test/regress/sql/rules.sql +++ b/src/test/regress/sql/rules.sql @@ -1152,3 +1152,8 @@ SELECT pg_get_indexdef(0); SELECT pg_get_ruledef(0); SELECT pg_get_triggerdef(0); SELECT pg_get_viewdef(0); +SELECT pg_get_function_arguments(0); +SELECT pg_get_function_identity_arguments(0); +SELECT pg_get_function_result(0); +SELECT pg_get_function_arg_default(0, 0); +SELECT pg_get_function_arg_default('pg_class'::regclass, 0); -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] to_date_valid()
On 7/5/16 4:24 AM, Albe Laurenz wrote: > But notwithstanding your feeling that you would like your application > to break if it makes use of this behaviour, it is a change that might > make some people pretty unhappy - nobody can tell how many. What is the use of the existing behavior? You get back an arbitrary implementation dependent value. We don't even guarantee what the value will be. If we changed it to return a different implementation dependent value, would users get upset? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On 27/07/16 03:15, Peter Eisentraut wrote: > On 7/26/16 6:14 PM, Vik Fearing wrote: >> As mentioned elsewhere in the thread, you can just do WHERE true to get >> around it, so why on Earth have it PGC_SUSET? > > I'm not sure whether it's supposed to guard against typos and possibly > buggy SQL string concatenation in application code. So it would help > against accidental mistakes, whereas putting a WHERE TRUE in there would > be an intentional override. If buggy SQL string concatenation in application code is your argument, quite a lot of them add "WHERE true" so that they can just append a bunch of "AND ..." clauses without worrying if it's the first (or last, whatever), so I'm not sure this is protecting anything. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On 7/26/16 6:14 PM, Vik Fearing wrote: > As mentioned elsewhere in the thread, you can just do WHERE true to get > around it, so why on Earth have it PGC_SUSET? I'm not sure whether it's supposed to guard against typos and possibly buggy SQL string concatenation in application code. So it would help against accidental mistakes, whereas putting a WHERE TRUE in there would be an intentional override. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
On 7/26/16 7:46 PM, Thomas Munro wrote: > By the way, our documentation says that NOT NULL constraints are > equivalent to CHECK (column_name IS NOT NULL). That is what the SQL > standard says, but in fact our NOT NULL constraints are equivalent to > CHECK (column_name IS DISTINCT FROM NULL). Should we update the > documentation with something like the attached? Couldn't we just fix that instead? For NOT NULL constraints on composite type columns, create a full CHECK (column_name IS NOT NULL) constraint instead, foregoing the attnotnull optimization. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Increasing timeout of poll_query_until for TAP tests
On Mon, Jul 25, 2016 at 10:05 PM, Michael Paquier wrote: > On Mon, Jul 25, 2016 at 2:52 PM, Michael Paquier > wrote: >> Ah, yes, and that's a stupid mistake. We had better use >> replay_location instead of write_location. There is a risk that >> records have not been replayed yet even if they have been written on >> the standby, so it is possible that the query looking at tab_int may >> not see this relation. > > Or in short, the attached fixes 2) and will help provide input for 1). Increasing the timeout had zero effect for test 003: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2016-07-26%2016%3A00%3A07 So we're facing something else. I'll dig into that more deeply by running hamster manually. -- Michael
[HACKERS] Macro customizable hashtable / bitmapscan & aggregation perf
Hi, As previously mentioned, dynahash isn't particularly fast. In many cases that doesn't particularly matter. But e.g. hash aggregation and bitmap index scans are quite sensitive to hash performance. The biggest problems with dynahash (also discussed at [1]) are a) two level structure doubling the amount of indirect lookups b) indirect function calls c) using separate chaining based conflict resolution d) being too general. In the post referenced above I'd implemented an open-coded hashtable addressing these issues for the tidbitmap.c case, with quite some success. It's not particularly desirable to have various slightly differing hash-table implementations across the backend though. The only solution I could think of, that preserves the efficiency, is to have a hash-table implementation which is customizable to the appropriate types etc., via macros. The attached patch adds simplehash.h, which can be customized by a bunch of macros before being inlined. There's also a patch using this for tidbitmap.c and nodeAgg/nodeSubplan/... via execGrouping.c. 
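To illustrate the macro-customization technique in miniature (this is a toy sketch, not the actual simplehash.h API or its macro names): the caller supplies a name prefix, a key type, and hash/equality operations, and the template expands into concrete, fully inlinable functions, avoiding dynahash's indirect function calls and two-level indirection.

```c
#include <stdint.h>

#define SH_CAPACITY 256         /* fixed power of two for brevity */

#define SH_MAKE_NAME_(pre, name) pre##_##name
#define SH_MAKE_NAME(pre, name) SH_MAKE_NAME_(pre, name)

/* The "template": expands to a table type plus insert/lookup functions
 * specialized for the given key type and operations.  Toy caveat: no
 * growth, so it assumes the table is never filled completely. */
#define SH_DECLARE(SH_PREFIX, SH_KEYTYPE, SH_HASH, SH_EQUAL)                \
typedef struct SH_MAKE_NAME(SH_PREFIX, table)                               \
{                                                                           \
    SH_KEYTYPE keys[SH_CAPACITY];                                           \
    uint8_t    used[SH_CAPACITY];                                           \
} SH_MAKE_NAME(SH_PREFIX, table);                                           \
                                                                            \
static int                                                                  \
SH_MAKE_NAME(SH_PREFIX, insert)(SH_MAKE_NAME(SH_PREFIX, table) *tb,         \
                                SH_KEYTYPE key)                             \
{                                                                           \
    uint32_t pos = SH_HASH(key) & (SH_CAPACITY - 1);                        \
    while (tb->used[pos])                                                   \
    {                                                                       \
        if (SH_EQUAL(tb->keys[pos], key))                                   \
            return 0;           /* already present */                       \
        pos = (pos + 1) & (SH_CAPACITY - 1);    /* open addressing */       \
    }                                                                       \
    tb->keys[pos] = key;                                                    \
    tb->used[pos] = 1;                                                      \
    return 1;                   /* newly inserted */                        \
}                                                                           \
                                                                            \
static int                                                                  \
SH_MAKE_NAME(SH_PREFIX, contains)(SH_MAKE_NAME(SH_PREFIX, table) *tb,       \
                                  SH_KEYTYPE key)                           \
{                                                                           \
    uint32_t pos = SH_HASH(key) & (SH_CAPACITY - 1);                        \
    while (tb->used[pos])                                                   \
    {                                                                       \
        if (SH_EQUAL(tb->keys[pos], key))                                   \
            return 1;                                                       \
        pos = (pos + 1) & (SH_CAPACITY - 1);                                \
    }                                                                       \
    return 0;                                                               \
}

/* Specialize the template for uint32 keys: the compiler sees the hash and
 * comparison inline, with no function pointers anywhere. */
#define u32_hash(k)  ((uint32_t) ((k) * 2654435761u))   /* Knuth multiplicative */
#define u32_equal(a, b) ((a) == (b))
SH_DECLARE(u32, uint32_t, u32_hash, u32_equal)
```

Note also the linear probing: unlike dynahash's separate chaining, conflict resolution stays within one cache-friendly array.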
To show the motivation: Bitmapscan: before: tpch[22005][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date); ┌──┐ │QUERY PLAN │ ├──┤ │ Aggregate (cost=942330.27..942330.28 rows=1 width=8) (actual time=5283.026..5283.026 rows=1 loops=1) │ │ -> Bitmap Heap Scan on lineitem (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=3041.072..4436.569 rows=9113219 loops=1) │ │ Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))│ │ Heap Blocks: exact=585542 │ │ -> Bitmap Index Scan on i_l_shipdate (cost=0.00..191388.13 rows=9127557 width=0) (actual time=2807.246..2807.246 rows=9113219 loops=1) │ │ Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date))│ │ Planning time: 0.077 ms │ │ Execution time: 5284.038 ms │ └──┘ (8 rows) after: tpch[21953][1]=# EXPLAIN ANALYZE SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date); ┌┐ │ QUERY PLAN │ ├┤ │ Aggregate (cost=942330.27..942330.28 rows=1 width=8) (actual time=3431.602..3431.602 rows=1 loops=1) │ │ -> Bitmap Heap Scan on lineitem (cost=193670.02..919511.38 rows=9127557 width=8) (actual time=1158.783..2588.357 rows=9113219 loops=1) │ │ Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Heap Blocks: exact=585542 │ │ -> Bitmap Index Scan on i_l_shipdate (cost=0.00..191388.13 rows=9127557 width=0) (actual time=958.341..958.341 rows=9113219 loops=1) │ │ Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date)) │ │ Planning time: 0.070 ms │ │ Execution time: 3435.259 ms │ └─
Re: [HACKERS] Why we lost Uber as a user
* Joshua D. Drake (j...@commandprompt.com) wrote: > Hello, > > The following article is a very good look at some of our limitations > and highlights some of the pains many of us have been working > "around" since we started using the software. > > https://eng.uber.com/mysql-migration/ > > Specifically: > > * Inefficient architecture for writes > * Inefficient data replication The above are related and there are serious downsides to having an extra mapping in the middle between the indexes and the heap. What makes me doubt just how well they understood the issues or what is happening is the lack of any mention of hint bits or tuple freezing (both of which require additional writes). > * Issues with table corruption That was a bug that was fixed quite quickly once it was detected. The implication that MySQL doesn't have similar bugs is entirely incorrect, as is the idea that logical replication would avoid data corruption issues (in practice, it actually tends to be quite a bit worse). > * Poor replica MVCC support Solved through the hot standby feedback system. > * Difficulty upgrading to newer releases Their specific issue with these upgrades was solved, years ago, by me (and it wasn't particularly difficult to do...) through the use of pg_upgrade's --link option and rsync's ability to construct hard link trees. Making major release upgrades easier with less downtime is certainly a good goal, but there's been a solution to the specific issue they had here for quite a while. Thanks! Stephen
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Tue, Jul 26, 2016 at 04:39:14PM -0400, Robert Haas wrote: > On Mon, Jul 25, 2016 at 11:38 PM, David Fetter wrote: > > On Mon, Jul 25, 2016 at 11:12:24PM -0400, Robert Haas wrote: > >> On Fri, Jul 22, 2016 at 2:38 AM, David Fetter wrote: > >> > I've renamed it to require_where and contrib-ified. > >> > >> I'm not sure that the Authors section is entirely complete. > > > > Does this suit? > > YFTATP. Oops. I'd done it on the commitfest app, but not in the patch. I've also made this PGC_USERSET. Best, David. -- David Fetter http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate diff --git a/contrib/Makefile b/contrib/Makefile index 25263c0..4bd456f 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -40,6 +40,7 @@ SUBDIRS = \ pgstattuple \ pg_visibility \ postgres_fdw\ + require_where \ seg \ spi \ tablefunc \ diff --git a/contrib/require_where/Makefile b/contrib/require_where/Makefile new file mode 100644 index 000..731f9fb --- /dev/null +++ b/contrib/require_where/Makefile @@ -0,0 +1,15 @@ +MODULE_big = require_where +OBJS = require_where.o + +PGFILEDESC = 'require_where - require DELETE and/or UPDATE to have a WHERE clause' + +ifdef USE_PGXS + PG_CONFIG = pg_config + PGXS = $(shell $(PG_CONFIG) --pgxs) + include $(PGXS) +else + subdir = contrib/require_where + top_builddir = ../.. 
+ include $(top_builddir)/src/Makefile.global + include $(top_builddir)/contrib/contrib-global.mk +endif diff --git a/contrib/require_where/require_where.c b/contrib/require_where/require_where.c new file mode 100644 index 000..556101a --- /dev/null +++ b/contrib/require_where/require_where.c @@ -0,0 +1,81 @@ +/* + * -- + * + * require_where.c + * + * Copyright (C) 2016, PostgreSQL Global Development Group + * + * IDENTIFICATION + * contrib/require_where/require_where.c + * + * -- + */ +#include "postgres.h" + +#include "fmgr.h" + +#include "parser/analyze.h" + +#include "utils/elog.h" +#include "utils/guc.h" + +PG_MODULE_MAGIC; + +void _PG_init(void); + +static post_parse_analyze_hook_type original_post_parse_analyze_hook = NULL; +static bool delete_requires_where = false; +static bool update_requires_where = false; + +static void +requires_where_check(ParseState *pstate, Query *query) +{ + + if (delete_requires_where && query->commandType == CMD_DELETE) + { + Assert(query->jointree != NULL); + if (query->jointree->quals == NULL) + ereport(ERROR, + (errcode(ERRCODE_CARDINALITY_VIOLATION), +errmsg("DELETE requires a WHERE clause"), +errhint("To delete all rows, use \"WHERE true\" or similar"))); + } + + if (update_requires_where && query->commandType == CMD_UPDATE) + { + Assert(query->jointree != NULL); + if (query->jointree->quals == NULL) + ereport(ERROR, + (errcode(ERRCODE_CARDINALITY_VIOLATION), +errmsg("UPDATE requires a WHERE clause"), +errhint("To update all rows, use \"WHERE true\" or similar"))); + } + + if (original_post_parse_analyze_hook != NULL) + (*original_post_parse_analyze_hook) (pstate, query); +} + +void +_PG_init(void) +{ + DefineCustomBoolVariable("requires_where.delete", + "Require every DELETE statement to have a WHERE clause.", + NULL, + &delete_requires_where, + false, + PGC_USERSET, + false, + NULL, NULL, NULL); + + DefineCustomBoolVariable("requires_where.update", + "Require every UPDATE statement to have a WHERE clause.", + NULL, + 
&update_requires_where, + false, +
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On 07/26/16 20:01, Michael Paquier wrote: > On Tue, Jul 26, 2016 at 9:48 PM, Amit Kapila wrote: >> Does anybody else see the use case >> reported by Chapman as important enough that we try to have some solution >> for it in-core? > > The lack of updates in the pg_lesslog project is a sign that it is not > that much used. It does not seem like a good idea to bring in-core a tool > not used that much by users. In effect, it already was brought in-core in commit 9a20a9b. Only, that change had an unintended consequence that *limits* compressibility - and it would not have that consequence, if it were changed to simply set xlp_pageaddr to InvalidXLogRecPtr in the dummy zero pages written to fill out a segment. -Chap
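To make the compressibility point concrete, here is a small stand-alone model (a fake page layout, not the real XLogPageHeaderData, which has several more fields): dummy zero pages that each carry their real xlp_pageaddr differ from one another in a few bytes, so the tail of the segment never repeats exactly; with a constant sentinel such as InvalidXLogRecPtr (which is 0 in PostgreSQL), every dummy page is byte-identical and a dictionary or run-length compressor can collapse the whole run.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 8192
#define NPAGES  4

/* Simplified stand-in for a WAL page: an 8-byte "page address" at the
 * front, zeros elsewhere. */
static void
fill_page(unsigned char *page, uint64_t pageaddr)
{
    memset(page, 0, PAGE_SZ);
    memcpy(page, &pageaddr, sizeof(pageaddr));
}

/* Returns 1 if every dummy page in the run is byte-identical to the
 * first, which is the property that lets a compressor collapse the
 * zero-filled tail of a segment. */
static int
tail_pages_identical(uint64_t start_addr, int use_invalid_sentinel)
{
    static unsigned char pages[NPAGES][PAGE_SZ];
    for (int i = 0; i < NPAGES; i++)
    {
        /* InvalidXLogRecPtr is 0 in PostgreSQL */
        uint64_t addr = use_invalid_sentinel
            ? 0
            : start_addr + (uint64_t) i * PAGE_SZ;
        fill_page(pages[i], addr);
    }
    for (int i = 1; i < NPAGES; i++)
        if (memcmp(pages[0], pages[i], PAGE_SZ) != 0)
            return 0;
    return 1;
}
```

With real addresses the function reports non-identical pages; with the sentinel it reports an identical run, which is exactly the effect the proposed change would have on the dummy pages.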
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On Tue, Jul 26, 2016 at 9:48 PM, Amit Kapila wrote: > Does anybody else see the use case > reported by Chapman as important enough that we try to have some solution > for it in-core? The lack of updates in the pg_lesslog project is a sign that it is not that much used. It does not seem like a good idea to bring in-core a tool not used that much by users. -- Michael
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
On Wed, Jul 27, 2016 at 7:35 AM, Tom Lane wrote: > "David G. Johnston" writes: >> The concept embodied by "NULL" in the operator "IS [NOT] NULL" is distinct >> from the concept embodied by "NULL" in the operator "IS [NOT] DISTINCT >> FROM". > >> In short, the former smooths out the differences between composite and >> non-composite types while the later maintains their differences. While a >> bit confusing I don't see that there is much to be done about it - aside >> from making the distinction more clear at: >> https://www.postgresql.org/docs/devel/static/functions-comparison.html > >> Does spec support or refute this distinction in treatment? > > AFAICS, the IS [NOT] DISTINCT FROM operator indeed is specified to do the > "obvious" thing when one operand is NULL: you get a simple nullness check > on the other operand. So I went ahead and documented that it could be > used for that purpose. By the way, our documentation says that NOT NULL constraints are equivalent to CHECK (column_name IS NOT NULL). That is what the SQL standard says, but in fact our NOT NULL constraints are equivalent to CHECK (column_name IS DISTINCT FROM NULL). Should we update the documentation with something like the attached? -- Thomas Munro http://www.enterprisedb.com not-null-does-not-mean-check-is-not-null.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
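A small model of the two semantics under discussion (an illustrative sketch, not PostgreSQL code) makes the difference concrete: attnotnull rejects only a genuinely null row datum, which is the IS DISTINCT FROM NULL test, while CHECK (column_name IS NOT NULL) additionally rejects any row value containing a null field.

```c
#include <stdbool.h>

/* Model of a two-field composite value; row_is_null marks a genuinely
 * NULL row datum (e.g. NULL::rowtype), as opposed to a non-null row
 * whose fields happen to be null. */
typedef struct
{
    bool row_is_null;
    bool field_is_null[2];
} Row;

/* ROW IS NULL: true iff every field is null (SQL-spec semantics). */
static bool
is_null(Row r)
{
    return r.row_is_null || (r.field_is_null[0] && r.field_is_null[1]);
}

/* ROW IS NOT NULL: true iff every field is non-null.  Note this is NOT
 * the negation of is_null(): both are false for a row mixing null and
 * non-null fields. */
static bool
is_not_null(Row r)
{
    return !r.row_is_null && !r.field_is_null[0] && !r.field_is_null[1];
}

/* ROW IS DISTINCT FROM NULL: false only for a genuinely null row datum.
 * This is the check that attnotnull effectively enforces. */
static bool
is_distinct_from_null(Row r)
{
    return !r.row_is_null;
}
```

So a value like ROW(1, NULL) satisfies the DISTINCT FROM test (and hence an attnotnull constraint) while failing IS NOT NULL, which is the gap the documentation patch describes.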
Re: [HACKERS] to_date_valid()
On Tue, Jul 5, 2016 at 07:41:15AM -0400, David G. Johnston wrote: > Surely these beta testers would test against the RC before putting it into > production so I don't see an issue. I tend to agree generally but the point > of > beta is to find bugs and solicit suggestions for improvement to features. > Having found a bug it doesn't make sense to avoid patching our current > unstable > release. This applies moreso because of our annual release cycle. On the > topic of whether this becomes an exception to the rule I'm largely ambivalent. We don't assume users re-test everything when the RC1 arrives --- we assume the testing is cumulative from when we started beta. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: > On Tue, Jul 26, 2016 at 4:50 PM, Noah Misch wrote: > > [Action required within 72 hours. This is a generic notification.] > > > > The above-described topic is currently a PostgreSQL 9.6 open item. Stephen, > > since you committed the patch believed to have created it, you own this open > > item. If some other commit is more relevant or if this does not belong as a > > 9.6 open item, please let us know. Otherwise, please observe the policy on > > open item ownership[1] and send a status update within 72 hours of this > > message. Include a date for your subsequent status update. Testers may > > discover new open items at any time, and I want to plan to get them all > > fixed > > well in advance of shipping 9.6rc1. Consequently, I will appreciate your > > efforts toward speedy resolution. Thanks. > > > > [1] > > http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com > > I am not sure what's Stephen's status on this item, but I am planning > to look at it within the next 24 hours. That'd be great. It's definitely on my list of things to look into, but I'm extremely busy this week. I hope to look into it on Friday, would be great to see what you find. Thanks! Stephen
Re: [HACKERS] Why we lost Uber as a user
On Wed, Jul 27, 2016 at 7:19 AM, Robert Haas wrote: > I've seen multiple cases where this kind of thing causes a > sufficiently large performance regression that the system just can't > keep up. Things are OK when the table is freshly-loaded, but as soon > as somebody runs a query on any table in the cluster that lasts for a > minute or two, so much bloat accumulates that the performance drops to > an unacceptable level. This kind of thing certainly doesn't happen to > everybody, but equally certainly, this isn't the first time I've heard > of it being a problem. Sometimes, with careful tending and a very > aggressive autovacuum configuration, you can live with it, but it's > never a lot of fun. Yes, that's not fun at all. And it takes days to do this kind of tuning properly when testing a product that is supposed to work the way its spec says, so as to ease the customer experience. As much as this post is interesting, the comments on HN are a good read as well: https://news.ycombinator.com/item?id=12166585 Some points raised are that the "flaws" mentioned in this post are actually advantages. But I guess this depends on how you want to run your business via your application layer. -- Michael
Re: [HACKERS] Why we lost Uber as a user
On Tue, Jul 26, 2016 at 6:07 PM, Tom Lane wrote: > Josh Berkus writes: >> To explain this in concrete terms, which the blog post does not: > >> 1. Create a small table, but one with enough rows that indexes make >> sense (say 50,000 rows). > >> 2. Make this table used in JOINs all over your database. > >> 3. To support these JOINs, index most of the columns in the small table. > >> 4. Now, update that small table 500 times per second. > >> That's a recipe for runaway table bloat; VACUUM can't do much because >> there's always some minutes-old transaction hanging around (and SNAPSHOT >> TOO OLD doesn't really help, we're talking about minutes here), and >> because of all of the indexes HOT isn't effective. > > Hm, I'm not following why this is a disaster. OK, you have circa 100% > turnover of the table in the lifespan of the slower transactions, but I'd > still expect vacuuming to be able to hold the bloat to some small integer > multiple of the minimum possible table size. (And if the table is small, > that's still small.) I suppose really long transactions (pg_dump?) could > be pretty disastrous, but there are ways around that, like doing pg_dump > on a slave. > > Or in short, this seems like an annoyance, not a time-for-a-new-database > kind of problem. I've seen multiple cases where this kind of thing causes a sufficiently large performance regression that the system just can't keep up. Things are OK when the table is freshly-loaded, but as soon as somebody runs a query on any table in the cluster that lasts for a minute or two, so much bloat accumulates that the performance drops to an unacceptable level. This kind of thing certainly doesn't happen to everybody, but equally certainly, this isn't the first time I've heard of it being a problem. Sometimes, with careful tending and a very aggressive autovacuum configuration, you can live with it, but it's never a lot of fun. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Why we lost Uber as a user
On 07/26/2016 03:07 PM, Tom Lane wrote: > Josh Berkus writes: >> That's a recipe for runaway table bloat; VACUUM can't do much because >> there's always some minutes-old transaction hanging around (and SNAPSHOT >> TOO OLD doesn't really help, we're talking about minutes here), and >> because of all of the indexes HOT isn't effective. > > Hm, I'm not following why this is a disaster. OK, you have circa 100% > turnover of the table in the lifespan of the slower transactions, but I'd > still expect vacuuming to be able to hold the bloat to some small integer > multiple of the minimum possible table size. Not in practice. Don't forget that the indexes bloat as well. I encountered multiple cases of this particular failure case, and often bloat ended up at something like 100X of the clean table/index size, with no stable size (that is, it always kept growing). This was the original impetus for wanting REINDEX CONCURRENTLY, but really that's kind of a workaround. > (And if the table is small, > that's still small.) I suppose really long transactions (pg_dump?) could > be pretty disastrous, but there are ways around that, like doing pg_dump > on a slave. You'd need a dedicated slave for the pg_dump, otherwise you'd hit query cancel. > Or in short, this seems like an annoyance, not a time-for-a-new-database > kind of problem. It's considerably more than an annoyance for the people who suffer from it; for some databases I dealt with, this one issue was responsible for 80% of administrative overhead (cron jobs, reindexing, timeouts ...). But no, it's not a database-switcher *by itself*. But it is a chronic, and serious, problem. I don't have even a suggestion of a real solution for it without breaking something else, though. -- -- Josh Berkus Red Hat OSAS (any opinions are my own)
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On 21/07/16 06:57, David Fetter wrote: > Folks, > > Please find attached a patch which makes it possible to disallow > UPDATEs and DELETEs which lack a WHERE clause. As this changes query > behavior, I've made the new GUCs PGC_SUSET. > > What say? I say I don't like this at all. As mentioned elsewhere in the thread, you can just do WHERE true to get around it, so why on Earth have it PGC_SUSET? I would rather, if we need nannying at all, have a threshold of number of rows affected. So if your update or delete exceeds that threshold, the query will be canceled. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
Josh Berkus writes: > To explain this in concrete terms, which the blog post does not: > 1. Create a small table, but one with enough rows that indexes make > sense (say 50,000 rows). > 2. Make this table used in JOINs all over your database. > 3. To support these JOINs, index most of the columns in the small table. > 4. Now, update that small table 500 times per second. > That's a recipe for runaway table bloat; VACUUM can't do much because > there's always some minutes-old transaction hanging around (and SNAPSHOT > TOO OLD doesn't really help, we're talking about minutes here), and > because of all of the indexes HOT isn't effective. Hm, I'm not following why this is a disaster. OK, you have circa 100% turnover of the table in the lifespan of the slower transactions, but I'd still expect vacuuming to be able to hold the bloat to some small integer multiple of the minimum possible table size. (And if the table is small, that's still small.) I suppose really long transactions (pg_dump?) could be pretty disastrous, but there are ways around that, like doing pg_dump on a slave. Or in short, this seems like an annoyance, not a time-for-a-new-database kind of problem. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
On Tue, Jul 26, 2016 at 5:26 PM, Josh Berkus wrote: > On 07/26/2016 01:53 PM, Josh Berkus wrote: >> The write amplification issue, and its corollary in VACUUM, certainly >> continues to plague some users, and doesn't have any easy solutions. > > To explain this in concrete terms, which the blog post does not: > > 1. Create a small table, but one with enough rows that indexes make > sense (say 50,000 rows). > > 2. Make this table used in JOINs all over your database. > > 3. To support these JOINs, index most of the columns in the small table. > > 4. Now, update that small table 500 times per second. > > That's a recipe for runaway table bloat; VACUUM can't do much because > there's always some minutes-old transaction hanging around (and SNAPSHOT > TOO OLD doesn't really help, we're talking about minutes here), and > because of all of the indexes HOT isn't effective. Removing the indexes > is equally painful because it means less efficient JOINs. > > The Uber guy is right that InnoDB handles this better as long as you > don't touch the primary key (primary key updates in InnoDB are really bad). > > This is a common problem case we don't have an answer for yet. This is why I think we need a pluggable heap storage layer, which could be done either by rebranding foreign data wrappers as data wrappers (as I have previously proposed) or by using the access method interface (as proposed by Alexander Korotkov at PGCon). We're reaching the limits of what can be done using our current heap format, and we need to enable developers to experiment with new things. Aside from the possibility of eventually coming up with something that's good enough to completely (or mostly) replace our current heap storage format, we need to support specialized data storage formats that are optimized for particular use cases (columnar, memory-optimized, WORM). 
I know that people are worried about ending up with too many heap storage formats, but I think we should be a lot more worried about not having enough heap storage formats. Anybody who thinks that the current design is working for all of our users is not paying very close attention. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
On Tue, Jul 26, 2016 at 02:26:57PM -0700, Josh Berkus wrote: > On 07/26/2016 01:53 PM, Josh Berkus wrote: > > The write amplification issue, and its correllary in VACUUM, certainly > > continues to plague some users, and doesn't have any easy solutions. > > To explain this in concrete terms, which the blog post does not: > > 1. Create a small table, but one with enough rows that indexes make > sense (say 50,000 rows). > > 2. Make this table used in JOINs all over your database. > > 3. To support these JOINs, index most of the columns in the small table. > > 4. Now, update that small table 500 times per second. > > That's a recipe for runaway table bloat; VACUUM can't do much because > there's always some minutes-old transaction hanging around (and SNAPSHOT > TOO OLD doesn't really help, we're talking about minutes here), and > because of all of the indexes HOT isn't effective. Removing the indexes > is equally painful because it means less efficient JOINs. > > The Uber guy is right that InnoDB handles this better as long as you > don't touch the primary key (primary key updates in InnoDB are really bad). > > This is a common problem case we don't have an answer for yet. Or, basically, we don't have an answer to without making something else worse. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Reviewing freeze map code
On Sat, Jul 23, 2016 at 3:55 AM, Amit Kapila wrote: > Attached patch fixes the problem for me. Note, I have not tried to > reproduce the problem for heap_lock_updated_tuple_rec(), but I think > if you are convinced with above cases, then we should have a similar > check in it as well.

I don't think this hunk is correct:

+	/*
+	 * If we didn't pin the visibility map page and the page has become
+	 * all visible, we'll have to unlock and re-lock.  See heap_lock_tuple
+	 * for details.
+	 */
+	if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buf)))
+	{
+		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
+		visibilitymap_pin(rel, block, &vmbuffer);
+		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
+		goto l4;
+	}

The code beginning at label l4 assumes that the buffer is unlocked, but this code leaves the buffer locked. Also, I don't see the point of doing this test so far down in the function. Why not just recheck *immediately* after taking the buffer lock? If you find out that you need the pin after all, then

		LockBuffer(buf, BUFFER_LOCK_UNLOCK);
		visibilitymap_pin(rel, block, &vmbuffer);
		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);

but *do not* go back to l4. Unless I'm missing something, putting this block further down, as you have it, buys nothing, because none of that intervening code can release the buffer lock without using goto to jump back to l4.

+	/*
+	 * If we didn't pin the visibility map page and the page has become all
+	 * visible while we were busy locking the buffer, or during some
+	 * subsequent window during which we had it unlocked, we'll have to unlock
+	 * and re-lock, to avoid holding the buffer lock across an I/O.  That's a
+	 * bit unfortunate, especially since we'll now have to recheck whether the
+	 * tuple has been locked or updated under us, but hopefully it won't
+	 * happen very often.
+	 */
+	if (vmbuffer == InvalidBuffer && PageIsAllVisible(page))
+	{
+		LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+		visibilitymap_pin(relation, block, &vmbuffer);
+		LockBuffer(*buffer, BUFFER_LOCK_EXCLUSIVE);
+		goto l3;
+	}

In contrast, this looks correct: l3 expects the buffer to be locked already, and the code between l3 and this point can unlock and re-lock the buffer, potentially multiple times. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On 07/26/2016 04:21 PM, Robert Haas wrote: > I'm kind of curious WHY you are using archiving and forcing regular > segment switches rather than just using streaming replication. > ... AFAIK, streaming replication > essentially obsoleted that use case. You can just dribble the > individual bytes over the wire a few at a time to the standby or, with > pg_receivexlog, to an archive location. If it takes 6 months to fill > up a WAL segment, you don't care: you'll always have all the bytes Part of it is just the legacy situation: at the moment, the offsite host is of a different architecture and hasn't got PostgreSQL installed (but it's easily ssh'd to for delivering compressed WAL segments). We could change that down the road, and pg_receivexlog would work for getting the bytes over there. My focus for the moment was just on migrating a cluster to 9.5 without changing the surrounding arrangements all at once. Seeing how much worse our compression ratio will be, though, maybe I need to revisit that plan. Even so, I'd be curious whether it would break anything to have xlp_pageaddr simply set to InvalidXLogRecPtr in the dummy zero pages written to fill out a segment. At least until it's felt that archive_timeout has been so decidedly obsoleted by streaming replication that it is removed, and the log-tail zeroing code with it. That at least would eliminate the risk of anyone else repeating my astonishment. :) I had read that 9.4 added built-in log-zeroing code, and my first reaction was "cool! that may make the compression technique we're using unnecessary, but certainly can't make it worse" only to discover that it did, by ~ 300x, becoming now 3x *worse* than plain gzip, which itself is ~ 100x worse than what we had. -Chap -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
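To illustrate the compressibility problem described in the message above, here is a toy sketch in plain Python. Nothing here is PostgreSQL code: the trim-trailing-zeros step merely stands in for the kind of archive-command technique being discussed, the sizes are the usual 16 MB segment and 8 kB page, and the 8-byte per-page stamps only mimic the effect of xlp_pageaddr being written into every dummy page.

```python
# Toy illustration: why stamping a page address into every otherwise-empty
# WAL page ruins a "strip the zeroed tail" compression scheme, while a tail
# of genuine zeros (as xlp_pageaddr = InvalidXLogRecPtr would give) does not.
SEG_SIZE = 16 * 1024 * 1024   # one 16 MB WAL segment
PAGE_SIZE = 8192              # one 8 kB WAL page
USED = 64 * 1024              # pretend only 64 kB of real WAL was written

def trimmed_size(seg) -> int:
    """Archive size if we simply strip the trailing run of zero bytes."""
    return len(seg.rstrip(b"\x00"))

# Tail left as pure zeros: everything past the used portion trims away.
zero_tail = b"\x01" * USED + bytes(SEG_SIZE - USED)

# Tail with a fake header at the start of every dummy page: a few nonzero
# bytes appear every 8 kB, all the way to the end of the segment.
headered = bytearray(zero_tail)
for off in range(USED, SEG_SIZE, PAGE_SIZE):
    headered[off:off + 8] = off.to_bytes(8, "little")

print(trimmed_size(zero_tail))  # 65536: the zeroed tail trims away entirely
print(trimmed_size(headered))   # ~16.77 million: almost nothing can be trimmed
```

Under these assumptions the zero-filled tail shrinks to the used portion, while the headered tail stays essentially the full segment size, which matches the orders-of-magnitude regression reported above.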
Re: [HACKERS] Why we lost Uber as a user
On 07/26/2016 01:53 PM, Josh Berkus wrote: > The write amplification issue, and its correllary in VACUUM, certainly > continues to plague some users, and doesn't have any easy solutions. To explain this in concrete terms, which the blog post does not: 1. Create a small table, but one with enough rows that indexes make sense (say 50,000 rows). 2. Make this table used in JOINs all over your database. 3. To support these JOINs, index most of the columns in the small table. 4. Now, update that small table 500 times per second. That's a recipe for runaway table bloat; VACUUM can't do much because there's always some minutes-old transaction hanging around (and SNAPSHOT TOO OLD doesn't really help, we're talking about minutes here), and because of all of the indexes HOT isn't effective. Removing the indexes is equally painful because it means less efficient JOINs. The Uber guy is right that InnoDB handles this better as long as you don't touch the primary key (primary key updates in InnoDB are really bad). This is a common problem case we don't have an answer for yet. -- -- Josh Berkus Red Hat OSAS (any opinions are my own) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why we lost Uber as a user
On 07/26/2016 09:54 AM, Joshua D. Drake wrote: > Hello, > > The following article is a very good look at some of our limitations and > highlights some of the pains many of us have been working "around" since > we started using the software. They also had other reasons to switch to MySQL, particularly around changes of staffing (the switch happened after they got a new CTO). And they encountered that 9.2 bug literally the week we released a fix, per one of the mailing lists. Even if they switched off, it's still a nice testimonial that they once ran their entire worldwide fleet off a single Postgres cluster. However, the issues they cite as limitations of our current replication system are real, or we wouldn't have so many people working on alternatives. We could really use pglogical in 10.0, as well as OLTP-friendly MM replication. The write amplification issue, and its corollary in VACUUM, certainly continues to plague some users, and doesn't have any easy solutions. I do find it interesting that they mention schema changes in passing, without actually saying anything about them -- given that schema changes have been one of MySQL's major limitations. I'll also note that they don't mention any of MySQL's corresponding weak spots, such as limitations on table size due to primary key sorting. One wonders what would have happened if they'd adopted a sharding model on top of Postgres? I would like to see someone blog about our testing for replication corruption issues now, in response to this. -- -- Josh Berkus Red Hat OSAS (any opinions are my own) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PoC: Make it possible to disallow WHERE-less UPDATE and DELETE
On Mon, Jul 25, 2016 at 11:38 PM, David Fetter wrote: > On Mon, Jul 25, 2016 at 11:12:24PM -0400, Robert Haas wrote: >> On Fri, Jul 22, 2016 at 2:38 AM, David Fetter wrote: >> > I've renamed it to require_where and contrib-ified. >> >> I'm not sure that the Authors section is entirely complete. > > Does this suit? YFTATP. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copyParamList
On Fri, Jul 8, 2016 at 2:28 AM, Amit Kapila wrote: > On Tue, May 31, 2016 at 10:10 PM, Robert Haas wrote: >> On Fri, May 27, 2016 at 6:07 PM, Andrew Gierth >> wrote: >>> copyParamList does not respect from->paramMask, in what looks to me like >>> an obvious oversight: >>> >>> retval->paramMask = NULL; >>> [...] >>> /* Ignore parameters we don't need, to save cycles and space. */ >>> if (retval->paramMask != NULL && >>> !bms_is_member(i, retval->paramMask)) >>> >>> retval->paramMask is never set to anything not NULL in this function, >>> so surely that should either be initializing it to from->paramMask, or >>> checking from->paramMask in the conditional? >> >> Oh, dear. I think you are right. I'm kind of surprised this didn't >> provoke a test failure somewhere. >> > > The reason why it didn't provoked any test failure is that it doesn't > seem to be possible that from->paramMask in copyParamList can ever be > non-NULL. Params formed will always have paramMask set as NULL in the > code paths where copyParamList is used. I think we can just assign > from->paramMask to retval->paramMask to make sure that even if it gets > used in future, the code works as expected. Alternatively, one might > think of adding an Assert there, but that doesn't seem to be > future-proof. Hmm, I don't think this is the right fix. The point of the bms_is_member test is that we don't want to copy any parameters that aren't needed; but if we avoid copying parameters that aren't needed, then it's fine for the new ParamListInfo to have a paramMask of NULL. So I think it's actually correct that we set it that way, and I think it's better than saving a pointer to the original paramMask, because (1) it's probably slightly faster to have it as NULL and (2) the old paramMask might not be allocated in the same memory context as the new ParamListInfo. So I think we instead ought to fix it as in the attached. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company fix-copyparamlist-rmh.patch Description: application/download -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
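To make the mask semantics in the message above concrete, here is a hypothetical Python model (invented names; not the C code or the attached patch) of the intended behavior: consult the *source* list's paramMask when deciding which parameters to copy, and leave the copy's own mask NULL, since the unwanted slots were simply never copied.

```python
# Hypothetical model of copyParamList's mask handling, per the discussion:
# the bug was testing the destination's mask (always None at that point),
# which made the mask check a no-op; the fix is to test the source's mask.
from dataclasses import dataclass
from typing import List, Optional, Set

@dataclass
class ParamList:
    values: List[object]                   # parameter values; None = unset
    param_mask: Optional[Set[int]] = None  # None means "all params needed"

def copy_param_list(src: ParamList) -> ParamList:
    dst = ParamList(values=[None] * len(src.values), param_mask=None)
    for i, v in enumerate(src.values):
        # Ignore parameters we don't need, to save cycles and space --
        # checking src.param_mask, not dst.param_mask.
        if src.param_mask is not None and i not in src.param_mask:
            continue
        dst.values[i] = v
    return dst

src = ParamList(values=["a", "b", "c"], param_mask={0, 2})
dst = copy_param_list(src)
print(dst.values)      # ['a', None, 'c']
print(dst.param_mask)  # None: the copy holds only needed params, so no mask
```

The copy ends up with a NULL mask yet omits the masked-out parameter, which is the point Robert makes: once unneeded parameters are skipped during the copy, retaining the old mask buys nothing.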
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On Fri, Jul 22, 2016 at 6:02 PM, Chapman Flack wrote: > At $work, we have a usually-low-activity PG database, so that almost > always the used fraction of each 16 MB WAL segment is far smaller > than 16 MB, and so it's a big win for archived-WAL storage space > if an archive-command can be written that compresses those files > effectively. I'm kind of curious WHY you are using archiving and forcing regular segment switches rather than just using streaming replication. Pre-9.0, use of archive_timeout was routine, since there was no other way to ensure that the data ended up someplace other than your primary with reasonable regularity. But, AFAIK, streaming replication essentially obsoleted that use case. You can just dribble the individual bytes over the wire a few at a time to the standby or, with pg_receivexlog, to an archive location. If it takes 6 months to fill up a WAL segment, you don't care: you'll always have all the bytes that were generated more than a fraction of a second before the master melted into a heap of slag. I'm not saying you don't have a good reason for doing what you are doing, just that I cannot think of one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
On Mon, Jun 6, 2016 at 3:30 AM, Michael Paquier wrote: > On Sun, Jun 5, 2016 at 11:16 PM, Tom Lane wrote: >> Michael Paquier writes: >>> Still, on back branches I think that it would be a good idea to have a >>> better error message for the ones that already throw an error, like >>> "trigger with OID %u does not exist". Switching from elog to ereport >>> would be a good idea, but wouldn't it be a problem to change what is >>> user-visible? >> >> If we're going to touch this behavior in the back branches, I would >> vote for changing it the same as we do in HEAD. I do not think that >> making a user-visible behavior change in a minor release and then a >> different change in the next major is good strategy. > > So, I have finished with the patch attached that changes all the > *def() functions to return NULL when an object does not exist. Some > callers of the index and constraint definitions still expect a cache > lookup error if the object does not exist, and this patch is careful > about that. I think that it would be nice to get that in 9.6, but I > won't fight hard for it either. So I am parking it for now in the > upcoming CF. > >> But, given the relative shortage of complaints from the field, I'd >> be more inclined to just do nothing in back branches. There might >> be people out there with code depending on the current behavior, >> and they'd be annoyed if we change it in a minor release. > > Sure. That's the safest position. Thinking about the existing behavior > for some of the index and constraint callers, even just changing the > error message does not bring much as some of them really expect to > fail with a cache lookup error. Committed with minor kibitzing: you don't need an "else" after a statement that transfers control out of the function. Shouldn't pg_get_function_arguments, pg_get_function_identity_arguments, pg_get_function_result, and pg_get_function_arg_default get the same treatment? 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in citext's upgrade script for parallel aggregates
On Tue, Jul 26, 2016 at 3:48 AM, Noah Misch wrote: > On Thu, Jul 14, 2016 at 02:00:59AM +0200, Andreas Karlsson wrote: >> On 07/09/2016 05:42 AM, David Rowley wrote: >> >On 30 June 2016 at 03:49, Robert Haas wrote: >> >>On Sat, Jun 25, 2016 at 3:44 AM, Andreas Karlsson >> >>wrote: >> >>>On 06/24/2016 01:31 PM, David Rowley wrote: >> Seems there's a small error in the upgrade script for citext for 1.1 >> to 1.2 which will cause min(citext) not to be parallel enabled. >> >> max(citext)'s combinefunc is first set incorrectly, but then updated >> to the correct value. I assume it was meant to set the combine >> function for min(citext) instead. >> >> Fix attached. I've assumed that because we're still in beta that we >> can get away with this fix rather than making a 1.3 version to fix the >> issue. >> >>> >> >>>Yes, this is indeed a bug. >> >> >> >>Since we've already released beta2, I think we need to do a whole new >> >>extension version. We treated beta1 as a sufficiently-significant >> >>event to mandate a version bump, so we should do the same here. >> > >> >Ok, good point. Patch attached. >> >> Thanks! >> >> I tested the patch and it looks good. > > [Action required within 72 hours. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 9.6 open item. Robert, > since you committed the patch believed to have created it, you own this open > item. If some other commit is more relevant or if this does not belong as a > 9.6 open item, please let us know. Otherwise, please observe the policy on > open item ownership[1] and send a status update within 72 hours of this > message. Include a date for your subsequent status update. Testers may > discover new open items at any time, and I want to plan to get them all fixed > well in advance of shipping 9.6rc1. Consequently, I will appreciate your > efforts toward speedy resolution. Thanks. Committed the patch. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
"David G. Johnston" writes: > The concept embodied by "NULL" in the operator "IS [NOT] NULL" is distinct > from the concept embodied by "NULL" in the operator "IS [NOT] DISTINCT > FROM". > In short, the former smooths out the differences between composite and > non-composite types while the latter maintains their differences. While a > bit confusing I don't see that there is much to be done about it - aside > from making the distinction more clear at: > https://www.postgresql.org/docs/devel/static/functions-comparison.html > Does spec support or refute this distinction in treatment? AFAICS, the IS [NOT] DISTINCT FROM operator indeed is specified to do the "obvious" thing when one operand is NULL: you get a simple nullness check on the other operand. So I went ahead and documented that it could be used for that purpose. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] old_snapshot_threshold allows heap:toast disagreement
On Thu, Jul 21, 2016 at 12:06 PM, Robert Haas wrote: > On Wed, Jul 20, 2016 at 9:15 PM, Noah Misch wrote: >> This PostgreSQL 9.6 open item now needs a permanent owner. Would any other >> committer like to take ownership? If this role interests you, please read >> this thread and the policy linked above, then send an initial status update >> bearing a date for your subsequent status update. If the item does not have >> a >> permanent owner by 2016-07-24 02:00 UTC, I will resolve the item by reverting >> commit 848ef42 and followups. > > I will adopt this item. I will provide an initial patch for this > issue, or convince someone else to do so, within one week. Therefore, > expect a further status update from me on or before July 28th. I > expect that the patch will be based on ideas from these emails: > > https://www.postgresql.org/message-id/1ab8f80a-d16e-4154-9497-98fbb1642...@anarazel.de > https://www.postgresql.org/message-id/20160720181213.f4io7gc6lyc37...@alap3.anarazel.de Here is a patch. Please review. I'm not suffering from an overwhelming excess of confidence that this is correct. In particular, I'm concerned about the fact that there isn't always a registered snapshot - if you make it ERROR out when that happens instead of doing what it actually does, the regression tests fail in CLUSTER and CREATE VIEW. Also, I wonder why it's right to use pairingheap_first() instead of looking at the oldest snapshot on the active snapshot stack, or conversely why that is right and this is not. Or maybe we need to check both. The basic principle of testing both MVCC and TOAST snapshots for snapshot-too-old problems seems sound. Furthermore, it seems clear enough that if we're ever looking up a datum in the TOAST table whose xmin is no longer included in our MyPgXact->xmin, then we're vulnerable to having TOAST chunks vacuumed away under us even without snapshot_too_old enabled. 
But there's a bit of spooky action at a distance: if we don't see any snapshots, then we have to assume that the scan is being performed with a non-MVCC snapshot and therefore we don't need to worry. But there's no way to verify that via an assertion, because the connection between the relevant MVCC snapshot and the corresponding TOAST snapshot is pretty indirect, mediated only by snapmgr.c. It's a bit like a 9-1-1 operator (or whatever your local emergency number is) saying "hey, thanks for calling. if I don't hear back from you about this issue again, I'll just assume everything's OK." That might actually be the correct approach in some cases, but it certainly comes with a bit of risk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company ost-heap-toast-disagreement-v1.patch Description: application/download -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Why we lost Uber as a user
Hello, The following article is a very good look at some of our limitations and highlights some of the pains many of us have been working "around" since we started using the software. https://eng.uber.com/mysql-migration/ Specifically: * Inefficient architecture for writes * Inefficient data replication * Issues with table corruption * Poor replica MVCC support * Difficulty upgrading to newer releases It is a very good read and I encourage our hackers to do so with an open mind. Sincerely, JD -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
Tom Lane writes: > Robbie Harwood writes: >> So there's a connection setting `sslmode` that we'll want something >> similar to here (`gssapimode` or so). `sslmode` has six settings, but I >> think we only need three for GSSAPI: "disable", "allow", and "prefer" >> (which presumably would be the default). > > FWIW, there is quite a bit of unhappiness around sslmode=prefer, see > for example this thread: > https://www.postgresql.org/message-id/flat/2A5EFBDC-41C6-42A8-8B6D-E69DA60E9962%40eggerapps.at > > I do not know if we can come up with a better answer, but I'd caution > you against thinking that that's a problem-free model to emulate. Understood. We have the slight simplification for GSSAPI of having mutual authentication always (i.e., no need to worry about unauthenticated-but-encrypted connections). My personal view is that we want to try for as much security as we can without breaking anything [0]. If a user knows that they want a specific security, they can set "require"; if they don't want it, they can set "disable". Setting "require" as the default breaks one class of users; setting "disable" another. And I don't think we can punt the problem to the user and make it a mandatory parameter, either. I'm absolutely open to suggestions for how we could do better here, especially since we're adding support for something new, but having read the thread you mention I don't immediately see a superior design. 0: For what it's worth, I also don't agree with the assertion that having the ability to fallback to plaintext from tampering makes the attempt at encryption useless; rather, it still foils a passive adversary, even if it doesn't do anything against an active one. signature.asc Description: PGP signature
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
On Tue, Jul 26, 2016 at 11:10 AM, Tom Lane wrote: > > 3. Andrew also revived the bug #7808 thread in which it was complained > that ExecMakeTableFunctionResult should not fail on null results from > functions returning SETOF composite. That's related only in that the > proposed fix is to translate a plain NULL into ROW(NULL,NULL,...). > I do not much like the implementation details of his proposed patch, > but I think making that translation is probably better than failing > altogether. Given the infrequency of field complaints, my recommendation > here is to fix it in HEAD but not back-patch. > Andrew's response also introduces an area for documentation improvement. The concept embodied by "NULL" in the operator "IS [NOT] NULL" is distinct from the concept embodied by "NULL" in the operator "IS [NOT] DISTINCT FROM". In short, the former smooths out the differences between composite and non-composite types while the latter maintains their differences. While a bit confusing I don't see that there is much to be done about it - aside from making the distinction more clear at: https://www.postgresql.org/docs/devel/static/functions-comparison.html Does spec support or refute this distinction in treatment? CREATE TYPE twocol AS (col1 text, col2 int); CREATE TYPE twocolalt AS (col1 text, col2 int); SELECT row(null, null)::twocol IS NULL, null::twocol IS NULL, null::twocol IS NOT DISTINCT FROM row(null, null)::twocol It's at least worth considering whether to note that when comparing two composite values using IS DISTINCT FROM that the comparison is unaware of the composite type itself but performs an iterative comparison of each pair of columns. SELECT row(null, null)::twocol IS NOT DISTINCT FROM row(null, null)::twocolalt This is likely to matter little in practice given low odds of someone accidentally comparing two physically identical but semantically different composite types. David J.
[HACKERS] MSVC pl-perl error message is not verbose enough
Hello folks, I've got a small patch I'd like to submit for consideration. The scenario involves MSVC builds, where a user can do the following: 1) Install ActivePerl's latest (5.22 or 5.24). 2) Add this line to their config.pl file (in src/tools/msvc): $config->{perl} = "C:\\Perl64"; This will enable pl-perl as part of the build configuration 3) Try to run "build" using MSVC However, users that do this will end up with a curious error: "Could not identify perl library version at src/tools/msvc/Mkvcbuild.pm line 583." This is a confusing error to see, especially since ActivePerl was just installed. It makes debugging a little difficult (path issue? Perl installation issue? Code issue?). Other folks have seen this error before and been similarly confused: https://www.postgresql.org/message-id/dub126-w931dea250b21c3a48dcabad1...@phx.gbl I've actually followed up with that submitter, and he had said that after a couple of days, they gave up on finding the solution for this error and tried a different path. The problem is that a default installation of ActivePerl installs perl524.dll, but not perl524.lib. I'm not sure if there's any way to get ActivePerl to install the lib file as part of its installation process, so I resorted to generating my own .lib file from the .dll that was installed. After doing that, and placing the .lib file in C:\Perl64\lib\CORE, the error goes away. The user does at this point have to modify the ActivePerl config file so that it will use MSVC compiler formats instead of gcc by modifying C:\Perl64\lib\CORE\config.h from: #define PERL_STATIC_INLINE static __inline__ to #define PERL_STATIC_INLINE static __inline After that, it works. I understand there's a lot of stuff out of the postgres community's hands here; we can't expect to solve ActivePerl problems. However, I think that the error message in the MSVC scripts isn't verbose enough to tell people what to look for. 
If their perl skills aren't very good, it may take some learning just to figure out what the problem is. Because of this, I've submitted a small patch which fixes the verbosity of the error message to actually explain what's missing. I hope that this patch will be considered for the community, and it would be nice if it was back-patched. Attached is my patch for review. Thank you, -John Harvey msvc_pl_perl_error_verbose.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
Robbie Harwood writes: > So there's a connection setting `sslmode` that we'll want something > similar to here (`gssapimode` or so). `sslmode` has six settings, but I > think we only need three for GSSAPI: "disable", "allow", and "prefer" > (which presumably would be the default). FWIW, there is quite a bit of unhappiness around sslmode=prefer, see for example this thread: https://www.postgresql.org/message-id/flat/2A5EFBDC-41C6-42A8-8B6D-E69DA60E9962%40eggerapps.at I do not know if we can come up with a better answer, but I'd caution you against thinking that that's a problem-free model to emulate. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
Robbie Harwood writes: > So there's a connection setting `sslmode` that we'll want something > similar to here (`gssapimode` or so). `sslmode` has six settings, but I > think we only need three for GSSAPI: "disable", "allow", and "prefer" > (which presumably would be the default). Apologies, this should say four; I neglected "require". signature.asc Description: PGP signature
Re: [HACKERS] [PATCH v12] GSSAPI encryption support
Michael Paquier writes: > On Tue, Jul 26, 2016 at 5:58 AM, Robbie Harwood wrote: >> Robbie Harwood writes: > > Sorry for my late reply. Thanks for the feedback! >>> If I were to continue as I have been - using the plaintext connection >>> and auth negotiation path - then at the time of startup the client has >>> no way of knowing whether to send connection parameters or not. >>> Personally, I would be in favor of not frontloading these connection >>> parameters over insecure connections, but it is my impression that the >>> project does not want to go this way (which is fine). > > Per the discussion upthread I got the opposite impression, the startup > packet should be sent after the connection has been established. SSL > does so: the SSL negotiation goes first, and then the startup packet > is sent. That's the flow with the status changing from > CONNECTION_SSL_START -> CONNECTION_MADE. We are in agreement, I think. I have structured the referenced paragraph badly: for this design, I'm suggesting separate GSS startup path (like SSL) and once a tunnel is established we send the startup packet. I probably should have just left this paragraph out. >>> On the server, I'll need to implement `hostgss` (by analogy to >>> `hostssl`), and we'll want to lock authentication on those connections >>> to GSSAPI-only. > > As well as hostnogss, but I guess that's part of the plan. Sure, `hostnogss` should be fine. This isn't quadratic, right? We don't need hostnogssnossl (or thereabouts)? >>> Clients will explicitly probe for GSSAPI support as they do for TLS >>> support (I look forward to the bikeshed on the order of these) and >>> should have a parameter to require said support. One thing I'm not >>> clear on is what our behavior should be when the user doesn't >>> explicitly request GSSAPI and doesn't have a ccache - do we prompt? >>> Skip probing? I'm not sure what the best option there is. > > I am not sure I get what you mean here. Okay. 
Let me try again: So there's a connection setting `sslmode` that we'll want something similar to here (`gssapimode` or so). `sslmode` has six settings, but I think we only need three for GSSAPI: "disable", "allow", and "prefer" (which presumably would be the default). Let's suppose we're working with "prefer". GSSAPI will itself check two places for credentials: client keytab and ccache. But if we don't find credentials there, we as the client have two options on how to proceed. - First, we could prompt for a password (and then call gss_acquire_cred_with_password() to get credentials), presumably with an empty password meaning to skip GSSAPI. My memory is that the current behavior for GSSAPI auth-only is to prompt for a password if we don't find credentials (and if it isn't, there's no reason not to unless we're opposed to handling the password). - Second, we could skip GSSAPI and proceed with the next connection method. This might be confusing if the user is then prompted for a password and expects it to be for GSSAPI, but we could probably make it sensible. I think I prefer the first option. Thanks, --Robbie
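The two options can be sketched as a small decision function. This is a hypothetical model of the behavior discussed in the thread, not libpq code; `gssapimode` and the action names are invented for illustration:

```python
# Hypothetical sketch of the client-side decision for a "prefer"-style
# gssapimode, modeling the two options discussed above.  Nothing here is
# real libpq code; the names are assumptions from this thread.

def choose_gss_action(mode, have_credentials, prompted_password=None):
    """Return what the client would do next.

    mode: 'disable' | 'allow' | 'prefer'  (a subset of sslmode's six values)
    have_credentials: True if GSSAPI found a client keytab or a ccache
    prompted_password: what the user typed, if prompted ('' = skip GSSAPI)
    """
    if mode == 'disable':
        return 'plain-startup'          # never try GSSAPI
    if have_credentials:
        return 'gss-negotiate'          # SSL-like: tunnel first, then startup packet
    # No keytab/ccache.  Option 1 from the mail: prompt, and feed the
    # password to gss_acquire_cred_with_password(); empty password skips.
    if prompted_password:
        return 'gss-acquire-cred-with-password'
    return 'plain-startup'              # option 2: fall through to next method

print(choose_gss_action('prefer', True))              # gss-negotiate
print(choose_gss_action('prefer', False, 'secret'))   # gss-acquire-cred-with-password
print(choose_gss_action('prefer', False, ''))         # plain-startup
```

Under "allow" the same table applies; only "disable" short-circuits before checking credentials.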
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
Peter Eisentraut writes: > On 7/22/16 7:01 PM, Andrew Gierth wrote: >> In light of the fact that it is an endless cause of bugs both in pg and >> potentially to applications, I propose that we cease attempting to >> conform to the spec's definition of IS NULL in favour of the following >> rules: > I can't see how we would incompatibly change existing > standards-conforming behavior merely because users are occasionally > confused and there are some bugs in corner cases. It seems clear that Andrew's proposal to reject the spec's definition that ROW(NULL,NULL,...) IS NULL is true has failed. So ExecEvalNullTest() should keep on doing what it's doing. There are several related issues however: 1. As per bug #14235, eval_const_expressions() behaves differently from ExecEvalNullTest() for nested-composite-types cases. It is inarguably a bug that they don't give the same answers. So far, no one has spoken against the conclusion reached in that thread that ExecEvalNullTest() correctly implements the spec's semantics and eval_const_expressions() does not. Therefore I propose to apply and back-patch Andrew's fix from that thread. 2. Our (or at least my) previous understanding was that ROW(NULL,NULL,...) ought to be considered NULL for all purposes, and that our failure to do so anywhere except ExecEvalNullTest was a spec violation we should do something about someday. But the bug #14235 thread makes it clear that that's not so, and that only in very limited cases is there an argument for applying IS [NOT] NULL's behavior to other constructs. COALESCE() was mentioned as something that maybe should change. I'm inclined to vote "no, let's keep COALESCE as-is". The spec's use of, essentially, macro expansion to define COALESCE is just stupid, not least because it's impossible to specify the expected at-most-once evaluation of each argument expression that way. (They appear to try to dodge that question by forbidding argument expressions with side-effects, which is just lame.) 
But had they written out a definition of COALESCE's behavior in words, they would almost certainly have written "V is not the null value" not "V IS NOT NULL". Anyone who stopped to think about it would surely think that the result of COALESCE(ROW(1,NULL), NULL) should not be NULL. So I think the apparent mandate to use IS NOT NULL's semantics for rowtype values in COALESCE is just an artifact of careless drafting. Between that and the backwards-compatibility hazards of changing, it's not worth it. 3. Andrew also revived the bug #7808 thread in which it was complained that ExecMakeTableFunctionResult should not fail on null results from functions returning SETOF composite. That's related only in that the proposed fix is to translate a plain NULL into ROW(NULL,NULL,...). I do not much like the implementation details of his proposed patch, but I think making that translation is probably better than failing altogether. Given the infrequency of field complaints, my recommendation here is to fix it in HEAD but not back-patch. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
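The corner case is easy to model. Below is a rough Python sketch of the spec's asymmetric null tests for row values (tuples stand in for rows, `None` for SQL NULL) and of the spec's macro expansion of COALESCE; it shows why the macro definition yields the surprising answer for ROW(1, NULL):

```python
# Model of the spec's null-test semantics for row values.
# A row IS NULL iff every field is null; it IS NOT NULL iff every field
# is non-null -- so for ROW(1, NULL) *both* tests are false.

def is_null(v):
    if isinstance(v, tuple):
        return all(f is None for f in v)
    return v is None

def is_not_null(v):
    # Deliberately NOT "not is_null(v)" -- that is the spec's asymmetry.
    if isinstance(v, tuple):
        return all(f is not None for f in v)
    return v is not None

def coalesce_per_spec(a, b):
    # The spec expands COALESCE(a, b) to:
    #   CASE WHEN a IS NOT NULL THEN a ELSE b END
    return a if is_not_null(a) else b

row = (1, None)                            # ROW(1, NULL)
print(is_null(row), is_not_null(row))      # False False
print(coalesce_per_spec(row, None))        # None -- the surprising result
```

Because ROW(1, NULL) fails the IS NOT NULL test, the macro expansion falls through to the second argument, which is why a literal reading of the spec makes COALESCE(ROW(1,NULL), NULL) return NULL.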
Re: [HACKERS] ispell/hunspell imprecision in error message
Artur Zakirov writes: > On 25.07.2016 19:53, Peter Eisentraut wrote: >> But I found that this actually talks about Hunspell format dictionaries. >> (So the man page is hunspell(5) as opposed to ispell(5).) While as far >> as the tsearch interfaces are concerned, those two are lumped together, >> should we not change the error message to refer to Hunspell? > As I understand, this error message refers to the Ispell dictionary > template. This template can use various dictionary formats. Which is > confusing. Maybe would be better to change dictionary template name. But > it can break backward compatibility... I think the error message has to refer to the ispell dictionary type by its assigned name, otherwise you'll just create even more confusion. (In other words, I read the message as talking about PG's ispell dictionary code, not about somebody else's ispell man file.) The right place to explain that ispell accepts hunspell-formatted files, or to link to specifications of what that means, is in the documentation. Section 12.6.5 already says that, but maybe it could be expanded/rewritten a little. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On 07/26/2016 08:48 AM, Amit Kapila wrote: > In general, if you have very low WAL activity, then the final size of > compressed WAL shouldn't be much even if you use gzip. From a 9.5 pg_xlog on a low-activity test cluster (segment switches forced only by checkpoint timeouts), compressed with gzip -9: $ for i in 0*; do echo -n "$i " && gzip -9 <$i | wc -c; done 000100010042 27072 000100010043 27075 000100010044 27077 000100010045 27073 000100010046 27075 Log from a live pre-9.4 cluster, low-activity time of day, delta compression using rsync: 2016-07-26 03:54:02 EDT (walship) INFO: using 2.39s user, 0.4s system, 9.11s on wall: 231 byte 000100460029_000100460021_fwd ... 2016-07-26 04:54:01 EDT (walship) INFO: using 2.47s user, 0.4s system, 8.43s on wall: 232 byte 00010046002A_000100460022_fwd ... 2016-07-26 05:54:02 EDT (walship) INFO: using 2.56s user, 0.29s system, 9.44s on wall: 230 byte 00010046002B_000100460023_fwd So when I say "factor of 100", I'm understating slightly. (Those timings, for the curious, include sending a copy offsite via ssh.) > everything zero. Now, it might be possible to selectively initialize > the fields that don't harm the archiving methodology you are using, > considering there is no other impact of same in code. However, it Indeed, it is only the one header field that duplicates the low-order part of the (hex) file name that breaks delta compression, because it has always been incremented even when nothing else is different, and it's scattered 2000 times through the file. Would it break anything for *that* to be zero in dummy blocks? -Chap -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
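The effect of that one incrementing field is easy to reproduce in miniature: two otherwise-identical segments whose only difference is a per-page address field touch every 8 KB page, so a block-wise delta sees no long identical runs. This is a hedged sketch; the layout (an 8-byte "page address" at the start of each page, rest zeros) is invented for illustration, not the real XLogPageHeaderData:

```python
# Sketch: why one incrementing header field per page defeats delta
# compression of mostly-empty segments.

PAGE = 8192

def make_segment(segno, npages=64):
    buf = bytearray(npages * PAGE)      # all-zero "payload", like zero-padded tail pages
    for p in range(npages):
        addr = (segno * npages + p) * PAGE   # encodes the segment number in every page
        buf[p * PAGE:p * PAGE + 8] = addr.to_bytes(8, "little")
    return bytes(buf)

a, b = make_segment(0x42), make_segment(0x43)
dirty_pages = sum(
    a[p * PAGE:(p + 1) * PAGE] != b[p * PAGE:(p + 1) * PAGE]
    for p in range(len(a) // PAGE)
)
print(dirty_pages)   # 64: every page differs, though the payload is identical
```

A real 16 MB segment holds 2048 such pages, matching the "scattered 2000 times through the file" observation; zeroing that field in dummy pages would leave the tails byte-identical across segments.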
Re: [HACKERS] Proposal: revert behavior of IS NULL on row types
On Mon, Jul 25, 2016 at 8:28 PM, Peter Eisentraut wrote: > On 7/22/16 7:01 PM, Andrew Gierth wrote: >> In light of the fact that it is an endless cause of bugs both in pg and >> potentially to applications, I propose that we cease attempting to >> conform to the spec's definition of IS NULL in favour of the following >> rules: > > I can't see how we would incompatibly change existing > standards-conforming behavior merely because users are occasionally > confused and there are some bugs in corner cases. +1 Anywhere we have standard-conforming behavior, changing the semantics of the standard syntax seems insane. We can create new syntax for extensions where we are convinced we have a better idea. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] AdvanceXLInsertBuffer vs. WAL segment compressibility
On Tue, Jul 26, 2016 at 9:02 AM, Chapman Flack wrote: > On 07/25/16 22:09, Michael Paquier wrote: > >> This is over-complicating things for little gain. The new behavior of >> filling in with zeros the tail of a segment makes things far better >> when using gzip in archive_command. > > Then how about filling with actual zeros, instead of with mostly-zeros > as is currently done? That would work just as well for gzip, and would > not sacrifice the ability to do 100x better than gzip. > There is a flag XLP_BKP_REMOVABLE for the purpose of ignoring empty blocks; any external tools relying on it could break if we make everything zero. Now, it might be possible to selectively initialize the fields that don't harm the archiving methodology you are using, considering there is no other impact of same in code. However, it doesn't look to be a neat way to implement the requirement. In general, if you have very low WAL activity, then the final size of compressed WAL shouldn't be much even if you use gzip. It seems your main concern is that the size of the WAL, even though not high, is more than what you were earlier getting for your archived data. I think that is a legitimate concern, but I don't see many options apart from providing some selective way to not initialize everything in WAL page headers, or having some tool like pg_lesslog that can be shipped as a contrib module. I am not sure whether your use case is important enough to proceed with one of those options, or whether to consider some other approach. Does anybody else see the use case reported by Chapman as important enough that we try to have some solution for it in-core? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Optimizing numeric SUM() aggregate
Hi! I like the idea and implementation, but I have one suggestion. > Instead of propagating carry after each new value, it's done only every > 9999 values (or at the end). I think we could do carry every 0x7FFFFFFF / 10000 accumulations, couldn't we? Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
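The arithmetic behind the suggestion (elaborated later in the thread) can be checked directly. A sketch, assuming NBASE = 10000 as in PostgreSQL's numeric code and one base-NBASE digit per 32-bit accumulator lane; the variable names are invented for illustration:

```python
# Each addition drops at most NBASE - 1 = 9999 into a digit accumulator,
# so a 32-bit lane can absorb roughly MAX // NBASE additions before it
# could overflow -- versus propagating carries every 9999 values.

NBASE = 10000
MAX_DIGIT = NBASE - 1
INT_MAX = 0x7FFFFFFF
UINT_MAX = 0xFFFFFFFF

print(INT_MAX // NBASE)     # 214748 (signed lanes)
carry_every = UINT_MAX // NBASE
print(carry_every)          # 429496 (unsigned lanes, as in the thread)

# Worst case: a lane already holds a carried-in value <= carry_every and
# then absorbs carry_every more max-valued digits -- still no overflow:
assert carry_every + carry_every * MAX_DIGIT <= UINT_MAX

print(carry_every // 9999)  # 42 -- i.e. roughly 43x fewer carry passes
```

So with unsigned 32-bit accumulators, carry propagation every ~429496 values is safe by this bound, about 43 times less frequent than every 9999 values.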
Re: [HACKERS] Confusing TAP tests readme file
On Mon, Jul 25, 2016 at 9:53 PM, Michael Paquier wrote: > On Mon, Jul 25, 2016 at 7:42 PM, Ildar Musin wrote: >> I was checking out the TAP tests documentation, and I found this part >> of the src/test/perl/README file confusing: >> >> my $ret = $node->psql('postgres', 'SELECT 1'); >> is($ret, '1', 'SELECT 1 returns 1'); > > Good catch. > >> The return value of the psql() function is the exit code of psql. Hence >> this test will never pass, since psql returns 0 if the query was successfully >> executed. Probably it was meant to be the safe_psql() function instead, which >> returns stdout: >> >> my $ret = $node->safe_psql('postgres', 'SELECT 1'); >> is($ret, '1', 'SELECT 1 returns 1'); >> >> or else: >> >> my ($ret, $stdout, $stderr) = >> $node->psql('postgres', 'SELECT 1'); >> is($stdout, '1', 'SELECT 1 returns 1'); >> >> The attached patch fixes this. > > Just using safe_psql looks fine to me. Pushed. Thanks! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Design for In-Core Logical Replication
On 26/07/16 00:05, Hannu Krosing wrote: > CREATE PUBLICATION mypub; ALTER PUBLICATION mypub ADD TABLE users, departments; > Would a subscription just be a logical grouping, or would it be something stronger, like meaning atomic subscriptions and/or a dedicated replication slot? Not sure what you mean by atomic subscription, but subscription creation adds a replication slot to the provider node. Other than that, the subscription lives on the downstream node only. > Can you subscribe to multiple publications through a single SUBSCRIPTION? Yes. > What is supposed to happen if table A is in two subscriptions S1 and S2, and you subscribe to both? Will you get table A only once (both initial copy and events)? Yes, only once; the replication works with tables, publication is really just grouping/filtering, and what you get is the union of tables in the publications. > Would a subscription of "mypub" pop up on the subscriber side atomically, or will subscribed tables appear one by one when they are ready (initial copy + catchup event replay completed)? Yes, that's my plan, as it makes it easier to parallelize and recover from crashes during the initialization (it also makes this faster, as tables that are already done don't need to be copied again). It also makes it easier to reuse the table initialization code for adding new tables at a later time. > CREATE SUBSCRIPTION mysub WITH CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub; > For the pgq-like version which considers a PUBLICATION just as a list of tables to subscribe, I would add: CREATE SUBSCRIPTION mysub WITH CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub, mypub1; Yes, that works as well. > ALTER SUBSCRIPTION mysub DROP PUBLICATION mypub1; ALTER SUBSCRIPTION mysub ADD PUBLICATION mypub2; This does not work yet, but I agree we should have it. 
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
On Tue, Jul 26, 2016 at 4:50 PM, Noah Misch wrote: > [Action required within 72 hours. This is a generic notification.] > > The above-described topic is currently a PostgreSQL 9.6 open item. Stephen, > since you committed the patch believed to have created it, you own this open > item. If some other commit is more relevant or if this does not belong as a > 9.6 open item, please let us know. Otherwise, please observe the policy on > open item ownership[1] and send a status update within 72 hours of this > message. Include a date for your subsequent status update. Testers may > discover new open items at any time, and I want to plan to get them all fixed > well in advance of shipping 9.6rc1. Consequently, I will appreciate your > efforts toward speedy resolution. Thanks. > > [1] > http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com I am not sure what's Stephen's status on this item, but I am planning to look at it within the next 24 hours. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] No longer possible to query catalogs for index capabilities?
And a doc patch to go with it: -- Andrew (irc:RhodiumToad) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index 0689cc9..3e13e38 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -577,6 +577,89 @@ + +Capability information formerly stored in pg_am +is now available via the functions +pg_indexam_has_capability and +pg_index_has_capability +(see ). The following +boolean-valued capability names are currently supported: + + + + Capabilities + + + + + Name + Description + + + + + amcanorder + Does the access method support ordered scans sorted by the + indexed column's value? + + + amcanorderbyop + Does the access method support ordered scans sorted by the result + of an operator on the indexed column? + + + amcanbackward + Does the access method support backward scanning? + + + amcanunique + Does the access method support unique indexes? + + + amcanmulticol + Does the access method support multicolumn indexes? + + + amoptionalkey + Does the access method support a scan without any constraint + for the first index column? + + + amsearcharray + Does the access method support ScalarArrayOpExpr searches? + + + amsearchnulls + Does the access method support IS NULL/NOT NULL searches? + + + amstorage + Can index storage data type differ from column data type? + + + amclusterable + Can an index of this type be clustered on? + + + ampredlocks + Does an index of this type manage fine-grained predicate locks? + + + amgettuple + Does the access method provide an amgettuple function? + + + amgetbitmap + Does the access method provide an amgetbitmap function? + + + amcanreturn + Does the access method support index-only scans? 
+ + + + + diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index baef80d..fd6b983 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -16193,6 +16193,14 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); +pg_indexam_has_capability + + + +pg_index_has_capability + + + pg_options_to_table @@ -16380,6 +16388,16 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); number of columns, pretty-printing is implied + pg_indexam_has_capability(am_oid, cap_name) + boolean + Test whether an index access method has a specified capability + + + pg_index_has_capability(index_oid, cap_name) + boolean + Test whether the access method for the specified index has a specified capability + + pg_options_to_table(reloptions) setof record get the set of storage option name/value pairs @@ -16523,6 +16541,16 @@ SELECT pg_type_is_visible('myschema.widget'::regtype); + pg_indexam_has_capability and + pg_index_has_capability return whether the specified + access method, or the access method for the specified index, advertises the + named capability. NULL is returned if the capability + name is not known; true if the capability is advertised, + false if it is not. Refer + to for capability names and their meanings. + + + pg_options_to_table returns the set of storage option name/value pairs (option_name/option_value) when passed -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]
>Can you please patch BRIN to use this new function too? On my machine, replacement of both BRIN update cases (see https://github.com/x4m/pggistopt/commit/a6d301ff79104b977619339d53aebf748045418a ) showed no performance change on the following update query (6 seconds of updates avg): create table dataTable(x int, y int); insert into dataTable(x,y) select x,y from generate_series(1,1e3,1) x,generate_series(1,1e3,1) y; create index idx on dataTable using brin(x,y); update datatable set x = random()*1024, y = random()*1024; https://gist.github.com/x4m/7e69fd924b9ffd2fdc9c6100e741171d Probably I was looking in the wrong place. I do not see other cases where PageIndexTupleOverwrite can improve the performance of BRIN. Though I'll make PageIndexTupleOverwrite BRIN-compatible in the forthcoming patch version: BRIN tuples have no length in the header, so it must be taken as a parameter, just as PageAddItem does. Best regards, Andrey Borodin, Octonica & Ural Federal University. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] fixes for the Danish locale
On 21/07 08.42, Jeff Janes wrote: > In Danish, the sequence 'aa' is sometimes treated as a single letter > which collates after 'z'. For the record: this is also true for Norwegian; in both locales it collates equal to 'å', which is the 29th letter of the alphabet. But 'aa' is no longer used in ordinary words, only names (in Norwegian only personal names, in Danish also place names). - Bjorn Munch -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ispell/hunspell imprecision in error message
On 25.07.2016 19:53, Peter Eisentraut wrote: > I was trying to look up the background of this error message: "Ispell dictionary supports only \"default\", \"long\", and \"num\" flag value" (src/backend/tsearch/spell.c) But I found that this actually talks about Hunspell format dictionaries. (So the man page is hunspell(5) as opposed to ispell(5).) While as far as the tsearch interfaces are concerned, those two are lumped together, should we not change the error message to refer to Hunspell? Hello, As I understand it, this error message refers to the Ispell dictionary template. This template can use various dictionary formats, which is confusing. Maybe it would be better to change the dictionary template name, but that could break backward compatibility... If we want to change the error message, maybe change it to the following? "Hunspell dictionary format supports only \"default\", \"long\", and \"num\" flag value" -- Artur Zakirov Postgres Professional: http://www.postgrespro.com Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_dumping extensions having sequences with 9.6beta3
On Sat, Jul 23, 2016 at 01:40:01PM +0900, Michael Paquier wrote: > On Fri, Jul 22, 2016 at 6:27 PM, Philippe BEAUDOIN wrote: > > I am currently playing with extensions. And I found a strange behaviour > > change with 9.6beta2 and 3 when pg_dumping a database with an extension > > having sequences. This looks like a bug, ... unless I did something wrong. > > [...] > > => as expected, with latest minor versions of postgres 9.1 to 9.5, the > > sequences associated to the t1.c1 and t1.c3 columns are not dumped, > >while the sequence associated to t2.c1 is dumped. > > => with 9.6beta3 (as with beta2), the 3 sequences are dumped. > > Thanks for the report! I haven't looked at the problem in details yet, > but my guess is that this is owned by Stephen Frost. test_pg_dump does > not cover sequences yet, it would be visibly good to get coverage for > that. I am adding an open item as well. [Action required within 72 hours. This is a generic notification.] The above-described topic is currently a PostgreSQL 9.6 open item. Stephen, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a 9.6 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within 72 hours of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping 9.6rc1. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. [1] http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bug in citext's upgrade script for parallel aggregates
On Thu, Jul 14, 2016 at 02:00:59AM +0200, Andreas Karlsson wrote: > On 07/09/2016 05:42 AM, David Rowley wrote: > >On 30 June 2016 at 03:49, Robert Haas wrote: > >>On Sat, Jun 25, 2016 at 3:44 AM, Andreas Karlsson wrote: > >>>On 06/24/2016 01:31 PM, David Rowley wrote: > Seems there's a small error in the upgrade script for citext for 1.1 > to 1.2 which will cause min(citext) not to be parallel enabled. > > max(citext)'s combinefunc is first set incorrectly, but then updated > to the correct value. I assume it was meant to set the combine > function for min(citext) instead. > > Fix attached. I've assumed that because we're still in beta that we > can get away with this fix rather than making a 1.3 version to fix the > issue. > >>> > >>>Yes, this is indeed a bug. > >> > >>Since we've already released beta2, I think we need to do a whole new > >>extension version. We treated beta1 as a sufficiently-significant > >>event to mandate a version bump, so we should do the same here. > > > >Ok, good point. Patch attached. > > Thanks! > > I tested the patch and it looks good. [Action required within 72 hours. This is a generic notification.] The above-described topic is currently a PostgreSQL 9.6 open item. Robert, since you committed the patch believed to have created it, you own this open item. If some other commit is more relevant or if this does not belong as a 9.6 open item, please let us know. Otherwise, please observe the policy on open item ownership[1] and send a status update within 72 hours of this message. Include a date for your subsequent status update. Testers may discover new open items at any time, and I want to plan to get them all fixed well in advance of shipping 9.6rc1. Consequently, I will appreciate your efforts toward speedy resolution. Thanks. 
[1] http://www.postgresql.org/message-id/20160527025039.ga447...@tornado.leadboat.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers