Re: [HACKERS] UPDATE of partition key
On 9 November 2017 at 09:27, Thomas Munro wrote: > On Wed, Nov 8, 2017 at 5:57 PM, Amit Khandekar wrote: >> On 8 November 2017 at 07:55, Thomas Munro >> wrote: >>> On Tue, Nov 7, 2017 at 8:03 AM, Robert Haas wrote: The changes to trigger.c still make me super-nervous. Hey Thomas Munro, any chance you could review that part? > > At first, it seemed quite strange to me that row triggers and > statement triggers fire different events for the same modification. > Row triggers see DELETE + INSERT (necessarily because different > tables are involved), but this fact is hidden from the target table's > statement triggers. > > The alternative would be for all triggers to see consistent events and > transitions. Instead of having your special case code in ExecInsert > and ExecDelete that creates the two halves of a 'synthetic' UPDATE for > the transition tables, you'd just let the existing ExecInsert and > ExecDelete code do its thing, and you'd need a flag to record that you > should also fire INSERT/DELETE after statement triggers if any rows > moved. Yeah, I had also thought about that, but felt that change would be too invasive; for example, letting ExecARInsertTriggers() do the transition capture even when transition_capture->tcs_update_new_table is set. I was also thinking of having a separate function to *only* add the transition table rows, and calling that in ExecInsert instead of ExecARUpdateTriggers(). But I realized that the existing ExecARUpdateTriggers() looks like a better, more robust interface with all its checks. It's just that calling ExecARUpdateTriggers() sounds like we are also firing a trigger; we are not firing any trigger or saving any event, we are just adding the transition row. > > After sleeping on this question, I am coming around to the view that > the way you have it is right. The distinction isn't really between > row triggers and statement triggers, it's between triggers at > different levels in the hierarchy.
It just so happens that we > currently only fire target table statement triggers and leaf table row > triggers. Yes. And rows are there only in leaf partitions, so we have to simulate as though the target table has these rows. Like you mentioned, the user has to get the impression of a normal table, so we have to do something extra to capture the rows. > Future development ideas that seem consistent with your choice: > > 1. If we ever allow row triggers with transition tables on child > tables, then I think *their* transition tables should certainly see > the deletes and inserts, otherwise OLD TABLE and NEW TABLE would be > inconsistent with the OLD and NEW variables in a single trigger > invocation. (These were prohibited mainly due to lack of time and > (AFAIK) limited usefulness; I think they would probably need > their own separate tuplestores, or possibly some kind of filtering.) As we know, for row triggers on leaf partitions, we treat them as normal tables, so a trigger written on a leaf partition sees only the local changes. The trigger is unaware whether the insert is part of an UPDATE row movement. Similarly, the transition table referenced by that row trigger function should see only the NEW table, not the OLD table. > > 2. If we ever allow row triggers on partitioned tables (ie that fire > when its children are modified), then I think their UPDATE trigger > should probably fire when a row moves between any two (grand-)*child > tables, just as you have it for target table statement triggers. Yes, I agree. > It doesn't matter that the view from parent tables' triggers is > inconsistent with the view from leaf table triggers: it's a feature > that we 'hide' partitioning from the user to the extent we can so that > you can treat the partitioned table just like a table. > > Any other views?
I think that because there is no provision for a row trigger on a partitioned table, users who want a common trigger on a partition subtree have no choice but to create the same trigger individually on the leaf partitions. And that's the reason we cannot handle update row movement with triggers without anomalies. Thanks -Amit Khandekar -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] path toward faster partition pruning
On 10 November 2017 at 16:30, Kyotaro HORIGUCHI wrote: > In 0002, bms_add_range has a bit naive-looking loop > > + while (wordnum <= uwordnum) > + { > + bitmapword mask = (bitmapword) ~0; > + > + /* If working on the lower word, zero out bits below 'lower'. > */ > + if (wordnum == lwordnum) > + { > + int lbitnum = BITNUM(lower); > + mask >>= lbitnum; > + mask <<= lbitnum; > + } > + > + /* Likewise, if working on the upper word, zero bits above > 'upper' */ > + if (wordnum == uwordnum) > + { > + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) > + 1); > + mask <<= ushiftbits; > + mask >>= ushiftbits; > + } > + > + a->words[wordnum++] |= mask; > + } > > Without some aggressive optimization, the loop takes most of the > time to check-and-jump for nothing especially with many > partitions and somewhat unintuitive. > > The following uses a bit tricky bitmap operation but > is straightforward as a whole. > > = > /* fill the bits upper from BITNUM(lower) (0-based) of the first word */ > a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1); > > /* fill up intermediate words */ > while (wordnum < uwordnum) >a->words[wordnum++] = ~(bitmapword) 0; > > /* fill up to BITNUM(upper) bit (0-based) of the last word */ > a->workds[wordnum++] |= > (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1)); > = No objections here for making bms_add_range() perform better, but this is not going to work when lwordnum == uwordnum. You'd need to special case that. I didn't think it was worth the trouble, but maybe it is... I assume the += should be |=. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
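For what it's worth, the lwordnum == uwordnum overlap can be handled by computing both boundary masks and ANDing them, which keeps the per-word fill branch-free for the intermediate words. A standalone sketch, not the actual bitmapset.c code; bitmapword is assumed to be 32 bits here purely for illustration:

```c
#include <stdint.h>

typedef uint32_t bitmapword;            /* illustration only; real width varies */
#define BITS_PER_BITMAPWORD 32
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x)  ((x) % BITS_PER_BITMAPWORD)

/*
 * Mask for the bits of word 'wordnum' that fall inside [lower, upper].
 * When lower and upper land in the same word, both boundary masks
 * apply and must be ANDed together -- exactly the special case the
 * proposed three-step fast path would otherwise miss.
 */
static bitmapword
range_mask(int wordnum, int lower, int upper)
{
	bitmapword	mask = ~(bitmapword) 0;

	if (wordnum == WORDNUM(lower))		/* zero out bits below 'lower' */
		mask &= (~(bitmapword) 0) << BITNUM(lower);
	if (wordnum == WORDNUM(upper))		/* zero out bits above 'upper' */
		mask &= (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - BITNUM(upper) - 1);
	return mask;
}
```

Intermediate words (strictly between the boundary words) still get ~0 assigned directly, so the fast path's middle loop is unaffected.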
[HACKERS] pg audit requirements
Hi

I am sending some notes based on experience with pgAudit. pgAudit provides the basic functionality and usually it is good enough, but not for some applications in financial services. The requirements:

1. Structured output - the attached query alone is not good enough; column name, table name, schema, database, and role should be separate fields.
2. A separate log (log file) with guaranteed writes - fsync after every line means a significant performance issue, but fsync every 1 sec (or a defined interval) is acceptable.
3. Security issues - insufficient access rights to a database object should be processed and logged in the audit log too.

Regards Pavel
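To make requirement 2 concrete, here is a minimal sketch (my own illustration, not pgAudit code; the AuditLog struct and audit_write() names are invented for the example) of a writer that appends lines to a dedicated log file but issues fsync() at most once per configured interval:

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>

typedef struct AuditLog
{
	FILE   *fp;				/* dedicated audit log file */
	time_t	last_sync;		/* time of the last fsync */
	double	sync_interval;	/* seconds between guaranteed writes */
} AuditLog;

/*
 * Append one audit line.  The line always reaches the kernel (fflush),
 * but fsync runs only when the interval has elapsed, trading a bounded
 * window of potential loss for per-line performance.
 * Returns 1 if this call fsynced, 0 otherwise.
 */
static int
audit_write(AuditLog *log, const char *line)
{
	time_t	now = time(NULL);

	fputs(line, log->fp);
	fputc('\n', log->fp);
	fflush(log->fp);

	if (difftime(now, log->last_sync) >= log->sync_interval)
	{
		fsync(fileno(log->fp));
		log->last_sync = now;
		return 1;
	}
	return 0;
}
```

With sync_interval = 1.0 this matches the "fsync every 1 sec" compromise; setting it to 0 degenerates to fsync-per-line.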
Re: [HACKERS] proposal: psql command \graw
2017-11-10 8:12 GMT+01:00 Fabien COELHO : > > ISTM that you can remove "force_column_header" and just set "tuple_only" >>> to what you need, that is you do not need to change anything in function >>> "print_unaligned_text". >>> >> >> Last point is not possible - I do not want to break the original tuples-only >> mode. >> > > Hmmm... I do not understand. I can see only one use of force_column_header > in the function: > > - if (!opt_tuples_only) > + if (!opt_tuples_only || opt_force_column_header) > > So I would basically suggest to do: > > my_popt.topt.tuples_only = !pset.g_raw_header; > > in the driver. Looking at the detailed code in that function, probably you > need to set start_table to on when headers are needed and stop_table to off > for the raw mode anyway? > > Maybe I'm missing something, but it looks that it could be made to work > without adding another boolean. > Tuples-only mode cannot simply be disabled, because then other parts would print the row count: postgres=# \pset format unaligned Output format is unaligned. postgres=# select 10 as a, 20 as b; a|b 10|20 (1 row) > -- > Fabien. >
Re: [HACKERS] proposal: psql command \graw
ISTM that you can remove "force_column_header" and just set "tuple_only" to what you need, that is you do not need to change anything in function "print_unaligned_text". Last point is not possible - I would not to break original tuple only mode. Hmmm... I do not understand. I can see only one use of force_column_header in the function: - if (!opt_tuples_only) + if (!opt_tuples_only || opt_force_column_header) So I would basically suggest to do: my_popt.topt.tuples_only = !pset.g_raw_header; in the driver. Looking at the detailed code in that function, probably you need to set start_table to on when headers are needed and stop_table to off for the raw mode anyway? Maybe I'm missing something, but it looks that it could be made to work without adding another boolean. -- Fabien.
Re: [HACKERS] path toward faster partition pruning
Hello, At Fri, 10 Nov 2017 14:44:55 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.144455.117208639.horiguchi.kyot...@lab.ntt.co.jp> > > Those two conditions are not orthogonal. Maybe something like > > the following seems more understandable. > > > > > if (!constfalse) > > > { > > > /* No constraints on the keys, so, return *all* partitions. */ > > > if (nkeys == 0) > > > return bms_add_range(result, 0, partdesc->nparts - 1); > > > > > > result = get_partitions_for_keys(relation, &keys); > > > } So, the condition (!constfalse && nkeys == 0) cannot return there. I'm badly confused by the variable name. I couldn't find another reasonable structure using the current classify_p_b_keys(), but could you add a comment like the following as an example? + /* + * This function processes expressions other than OR expressions and returns + * the excluded OR expressions in or_clauses + */ > nkeys = classify_partition_bounding_keys(relation, clauses, >&keys, &constfalse, >&or_clauses); > /* >* Only look up in the partition descriptor if the query provides >* constraints on the keys at all. >*/ > if (!constfalse) > { > if (nkeys > 0) > result = get_partitions_for_keys(relation, &keys); > else -+ /* No constraints on the keys, so, all partitions are passed. */ > result = bms_add_range(result, 0, partdesc->nparts - 1); > } > + /* +* We have a partition set for clauses not returned in or_clauses +* here. Conjoin the result of each OR clause. +*/ > foreach(lc, or_clauses) > { > BoolExpr *or = (BoolExpr *) lfirst(lc); > ListCell *lc1; > Bitmapset *or_partset = NULL; > + Assert(or_clause(or)); regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] proposal: psql command \graw
2017-11-09 21:12 GMT+01:00 Pavel Stehule : > > > 2017-11-09 21:03 GMT+01:00 Fabien COELHO : > >> >> Hello Pavel, >> >> I hope so I fixed all mentioned issues. >>> >> >> Patch applies with a warning: >> >> > git apply ~/psql-graw-2.patch >> /home/fabien/psql-graw-2.patch:192: new blank line at EOF. >> + >> warning: 1 line adds whitespace errors. >> >> Otherwise it compiles. "make check" ok. doc gen ok. >> >> Two spurious empty lines are added before StoreQueryTuple. >> >> Doc: "If + is appended to the command name, a column >> names are displayed." >> >> I suggest instead: "When + is appended, column names >> are also displayed." >> >> ISTM that you can remove "force_column_header" and just set "tuple_only" >> to what you need, that is you do not need to change anything in function >> "print_unaligned_text". >> > > Last point is not possible - I would not to break original tuple only > mode. > > updated patch > Pavel > >> >> -- >> Fabien. >> > > diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index e520cdf3ba..457a59eeab 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -2020,6 +2020,19 @@ CREATE INDEX + +\graw[+] [ filename ] +\graw[+] [ |command ] + + +\graw is equivalent to \g, but +forces unaligned output mode for this query. When + +is appended, column names are also displayed. 
+ + + + + \gset [ prefix ] diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 8cc4de3878..b3461291eb 100644 --- a/src/bin/psql/command.c +++ b/src/bin/psql/command.c @@ -332,7 +332,8 @@ exec_command(const char *cmd, status = exec_command_errverbose(scan_state, active_branch); else if (strcmp(cmd, "f") == 0) status = exec_command_f(scan_state, active_branch); - else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0) + else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0 || + strcmp(cmd, "graw") == 0 || strcmp(cmd, "graw+") == 0) status = exec_command_g(scan_state, active_branch, cmd); else if (strcmp(cmd, "gdesc") == 0) status = exec_command_gdesc(scan_state, active_branch); @@ -1232,6 +1233,7 @@ exec_command_f(PsqlScanState scan_state, bool active_branch) /* * \g [filename] -- send query, optionally with output to file/pipe + * \graw [filename] -- same as \g with raw format * \gx [filename] -- same as \g, with expanded mode forced */ static backslashResult @@ -1254,6 +1256,10 @@ exec_command_g(PsqlScanState scan_state, bool active_branch, const char *cmd) free(fname); if (strcmp(cmd, "gx") == 0) pset.g_expanded = true; + else if (strcmp(cmd, "graw") == 0) + pset.g_raw = true; + else if (strcmp(cmd, "graw+") == 0) + pset.g_raw_header = true; status = PSQL_CMD_SEND; } else diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c index 7a91a44b2b..9f7ef51dfb 100644 --- a/src/bin/psql/common.c +++ b/src/bin/psql/common.c @@ -865,6 +865,14 @@ PrintQueryTuples(const PGresult *results) if (pset.g_expanded) my_popt.topt.expanded = 1; + /* one-shot raw output requested by \raw and \graw+ */ + else if (pset.g_raw || pset.g_raw_header) + { + my_popt.topt.format = PRINT_UNALIGNED; + my_popt.topt.tuples_only = true; + my_popt.topt.force_column_header = pset.g_raw_header; + } + /* write output to \g argument, if any */ if (pset.gfname) { @@ -1517,6 +1525,10 @@ sendquery_cleanup: /* reset \gx's expanded-mode flag */ pset.g_expanded = false; + /* reset 
\graw flags */ + pset.g_raw = false; + pset.g_raw_header = false; + /* reset \gset trigger */ if (pset.gset_prefix) { diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c index a926c40b9b..e573711434 100644 --- a/src/bin/psql/help.c +++ b/src/bin/psql/help.c @@ -167,7 +167,7 @@ slashUsage(unsigned short int pager) * Use "psql --help=commands | wc" to count correctly. It's okay to count * the USE_READLINE line even in builds without that. */ - output = PageOutput(125, pager ? &(pset.popt.topt) : NULL); + output = PageOutput(126, pager ? &(pset.popt.topt) : NULL); fprintf(output, _("General\n")); fprintf(output, _(" \\copyright show PostgreSQL usage and distribution terms\n")); @@ -176,6 +176,7 @@ slashUsage(unsigned short int pager) fprintf(output, _(" \\g [FILE] or ; execute query (and send results to file or |pipe)\n")); fprintf(output, _(" \\gdesc describe result of query, without executing it\n")); fprintf(output, _(" \\gexec execute query, then execute each value in its result\n")); + fprintf(output, _(" \\graw[+] [FILE]as \\g, but forces unaligned raw output mode\n")); fprintf(output, _(" \\gset [PREFIX] execute query and store results in psql variables\n")); fprintf(output, _(" \\gx [FILE] as \\g, but forces expanded output mode\n")); fprintf(output, _(" \\q quit psql\
Re: [HACKERS] path toward faster partition pruning
Ooops! The following comment is wrong. Please ignore it. At Fri, 10 Nov 2017 14:38:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.143811.97616847.horiguchi.kyot...@lab.ntt.co.jp> > Those two conditions are not orthogonal. Maybe something like > following seems more understantable. > > > if (!constfalse) > > { > > /* No constraints on the keys, so, return *all* partitions. */ > > if (nkeys == 0) > > return bms_add_range(result, 0, partdesc->nparts - 1); > > > > result = get_partitions_for_keys(relation, &keys); > > } > > I'm not sure what is meant to be (formally) done here, but it > seems to me that OrExpr is assumed to be only at the top level of > the caluses. So the following (just an example, but meaningful > expression in this shpape must exists.) expression is perhaps > wrongly processed here. > > CREATE TABLE p (a int) PARITION BY (a); > CREATE TABLE c1 PARTITION OF p FOR VALUES FROM (0) TO (10); > CREATE TABLE c2 PARTITION OF p FOR VALUES FROM (10) TO (20); > > SELECT * FROM p WHERE a = 15 AND (a = 15 OR a = 5); > > get_partitions_for_keys() returns both c1 and c2 and still > or_clauses here holds (a = 15 OR a = 5) and the function tries to > *add* partitions for a = 15 and a = 5 separetely. This is working fine. Sorry for the bogus comment. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] [POC] hash partitioning
On Fri, Nov 10, 2017 at 4:41 AM, Robert Haas wrote: > On Wed, Nov 1, 2017 at 6:16 AM, amul sul wrote: >> Fixed in the 0003 patch. > > I have committed this patch set with the attached adjustments. > Thanks a lot for your support & a ton of thanks to all reviewers. Regards, Amul
Re: [HACKERS] path toward faster partition pruning
Hello, this is the second part of the review. At Fri, 10 Nov 2017 12:30:00 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171110.123000.151902771.horiguchi.kyot...@lab.ntt.co.jp> > In 0002, bms_add_range has a bit naive-looking loop > In 0003, In 0004, The name get_partitions_from_clauses_guts(): it seems to me that we usually use _internal for the recursive part of some function. (I have the same comment as David about the comment for get_partition_from_clause()) About the same function: Couldn't we get out in the fast path when clauses == NIL? +/* No constraints on the keys, so, return *all* partitions. */ +result = bms_add_range(result, 0, partdesc->nparts - 1); This allows us to return immediately here. And just above this, + if (nkeys > 0 && !constfalse) + result = get_partitions_for_keys(relation, &keys); + else if (!constfalse) Those two conditions are not orthogonal. Maybe something like the following seems more understandable. > if (!constfalse) > { > /* No constraints on the keys, so, return *all* partitions. */ > if (nkeys == 0) > return bms_add_range(result, 0, partdesc->nparts - 1); > > result = get_partitions_for_keys(relation, &keys); > } I'm not sure what is meant to be (formally) done here, but it seems to me that OrExpr is assumed to be only at the top level of the clauses. So the following expression (just an example, but a meaningful expression in this shape must exist) is perhaps wrongly processed here. CREATE TABLE p (a int) PARTITION BY (a); CREATE TABLE c1 PARTITION OF p FOR VALUES FROM (0) TO (10); CREATE TABLE c2 PARTITION OF p FOR VALUES FROM (10) TO (20); SELECT * FROM p WHERE a = 15 AND (a = 15 OR a = 5); get_partitions_for_keys() returns both c1 and c2 and still or_clauses here holds (a = 15 OR a = 5) and the function tries to *add* partitions for a = 15 and a = 5 separately. I'd like to pause here.
regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] [POC] Faster processing at Gather node
On Fri, Nov 10, 2017 at 8:36 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: >> Have you set force_parallel_mode=regress; before running the >> statement? > > Yes, I tried that first. > >> If so, then why you need to tune other parallel query >> related parameters? > > Because I couldn't get it to break the other way, I then tried this. > > Instead of asking me what I did, can you tell me what I need to do? > Maybe a self-contained reproducible test case including exactly what > goes wrong on your end? > I think we are missing something very basic, because you should see the failure by executing that statement with force_parallel_mode=regress even in a freshly created database. I guess the missing point is that I am using an assertions-enabled build and probably you are not (if this is the reason, it should have struck me the first time). Anyway, I am writing the steps to reproduce the issue. 1. initdb 2. start server 3. connect using psql 4. set force_parallel_mode=regress; 5. Create Table as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] path toward faster partition pruning
Hello, At Fri, 10 Nov 2017 09:34:57 +0900, Amit Langote wrote in <5fcb1a9f-b4ad-119d-14c7-282c30c7f...@lab.ntt.co.jp> > Hi Amul. > > On 2017/11/09 20:05, amul sul wrote: > > On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote > > wrote: > >> On 2017/11/06 14:32, David Rowley wrote: > >>> On 6 November 2017 at 17:30, Amit Langote wrote: > On 2017/11/03 13:32, David Rowley wrote: > > On 31 October 2017 at 21:43, Amit Langote wrote: > > [] > >> > >> Attached updated set of patches, including the fix to make the new pruning > >> code handle Boolean partitioning. > >> > > > > I am getting following warning on mac os x: > > Thanks for the review. > > > partition.c:1800:24: warning: comparison of constant -1 with > > expression of type 'NullTestType' > > (aka 'enum NullTestType') is always false > > [-Wtautological-constant-out-of-range-compare] > > if (keynullness[i] == -1) > > ~~ ^ ~~ > > partition.c:1932:25: warning: comparison of constant -1 with > > expression of type 'NullTestType' > > (aka 'enum NullTestType') is always false > > [-Wtautological-constant-out-of-range-compare] > > if (keynullness[i] == -1) > > ~~ ^ ~~ > > 2 warnings generated. > > > > > > Comment for 0004 patch: > > 270 + /* -1 represents an invalid value of NullTestType. */ > > 271 + memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType)); > > > > I think we should not use memset to set a value other than 0 or true/false. > > This will work for -1 on the system where values are stored in the 2's > > complement but I am afraid of other architecture. > > OK, I will remove all instances of comparing and setting variables of type > NullTestType to a value of -1. In 0002, bms_add_range has a bit naive-looking loop + while (wordnum <= uwordnum) + { + bitmapword mask = (bitmapword) ~0; + + /* If working on the lower word, zero out bits below 'lower'. 
*/ + if (wordnum == lwordnum) + { + int lbitnum = BITNUM(lower); + mask >>= lbitnum; + mask <<= lbitnum; + } + + /* Likewise, if working on the upper word, zero bits above 'upper' */ + if (wordnum == uwordnum) + { + int ushiftbits = BITS_PER_BITMAPWORD - (BITNUM(upper) + 1); + mask <<= ushiftbits; + mask >>= ushiftbits; + } + + a->words[wordnum++] |= mask; + } Without some aggressive optimization, the loop takes most of the time to check-and-jump for nothing especially with many partitions and somewhat unintuitive. The following uses a bit tricky bitmap operation but is straightforward as a whole. = /* fill the bits upper from BITNUM(lower) (0-based) of the first word */ a->workds[wordnum++] += ~(bitmapword)((1 << BITNUM(lower)) - 1); /* fill up intermediate words */ while (wordnum < uwordnum) a->words[wordnum++] = ~(bitmapword) 0; /* fill up to BITNUM(upper) bit (0-based) of the last word */ a->workds[wordnum++] |= (~(bitmapword) 0) >> (BITS_PER_BITMAPWORD - (BITNUM(upper) - 1)); = In 0003, +match_clauses_to_partkey(RelOptInfo *rel, ... + if (rinfo->pseudoconstant && +(IsA(clause, Const) && + Const *) clause)->constisnull) || + !DatumGetBool(((Const *) clause)->constvalue + { +*constfalse = true; +continue; If we call this function in both conjunction and disjunction context (the latter is only in recursive case). constfalse == true means no need of any clauses for the former case. Since (I think) just a list of RestrictInfo is expected to be treated as a conjunction (it's quite doubious, though..), we might be better call this for each subnodes of a disjunction. Or, like match_clauses_to_index, we might be better provide match_clause_to_partkey(rel, rinfo, contains_const), which returns NULL if constfalse. (I'm not self-confident on this..) + /* + * If no commutator exists, cannot flip the qual's args, + * so give up. 
+ */ + if (!OidIsValid(expr_op)) +continue; I suppose it's better to leftop and rightop as is rather than flipping over so that var is placed left-side. Does that make things so complex? + * It the operator happens to be '<>', which is never listed If? +if (!op_in_opfamily(expr_op, partopfamily)) +{ + Oidnegator = get_negator(expr_op); + + if (!OidIsValid(negator) || +!op_in_opfamily(negator, partopfamily)) +continue; classify_partition_bounding_keys() checks the same thing by checking whether the negator's strategy is BTEquealStrategyNumber. (I'm not sure the
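On the memset point raised above: memset with -1 writes 0xFF into every byte, which only reads back as -1 on two's-complement machines, and comparing an enum variable against -1 is what draws the out-of-range warning. A sketch of a warning-free, portable alternative (the NULLNESS_UNKNOWN sentinel and init_keynullness() helper are my invention for illustration, not the patch's code):

```c
#include <stddef.h>

typedef enum NullTestType
{
	IS_NULL,
	IS_NOT_NULL,
	NULLNESS_UNKNOWN			/* explicit sentinel; avoids abusing -1 */
} NullTestType;

#define PARTITION_MAX_KEYS 32

/*
 * Initialize the per-key nullness array with a plain loop.  Unlike
 * memset(keynullness, -1, ...), this neither assumes two's complement
 * nor requires comparing an enum against an out-of-range constant later.
 */
static void
init_keynullness(NullTestType *keynullness, size_t n)
{
	for (size_t i = 0; i < n; i++)
		keynullness[i] = NULLNESS_UNKNOWN;
}
```

The later checks then compare against NULLNESS_UNKNOWN instead of -1, which is also self-documenting.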
Re: [HACKERS] [POC] Faster processing at Gather node
On Thu, Nov 9, 2017 at 9:31 PM, Amit Kapila wrote: > Have you set force_parallel_mode=regress; before running the > statement? Yes, I tried that first. > If so, then why you need to tune other parallel query > related parameters? Because I couldn't get it to break the other way, I then tried this. Instead of asking me what I did, can you tell me what I need to do? Maybe a self-contained reproducible test case including exactly what goes wrong on your end? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Fri, Nov 10, 2017 at 10:00 AM, Tom Lane wrote: > Stephen Frost writes: >> I'm guessing no, which essentially means that *we* consider access to >> lo_import/lo_export to be equivalent to superuser and therefore we're >> not going to implement anything to try and prevent the user who has >> access to those functions from becoming superuser. If we aren't willing >> to do that, then how can we really say that there's some difference >> between access to these functions and being a superuser? > > We seem to be talking past each other. Yes, if a user has malicious > intentions, it's possible to parlay lo_export into obtaining a superuser > login (I'm less sure that that's necessarily true for lo_import). > That does NOT make it "equivalent", except perhaps in the view of someone > who is only considering blocking malevolent actors. It does not mean that > there's no value in preventing a task that needs to run lo_export from > being able to accidentally destroy any data in the database. There's a > range of situations where you are concerned about accidents and errors, > not malicious intent; but your argument ignores those use-cases. That will not sound like much of a surprise since I spawned the original thread, but like Robert I understand that getting rid of these superuser checks is a goal we are trying to reach, to allow admins more flexibility in handing out permissions on a subset of objects. Forcing an admin to give full superuser rights to a user who only needs to import and export large objects is the wrong concept. -- Michael
Re: [HACKERS] [POC] Faster processing at Gather node
On Fri, Nov 10, 2017 at 12:05 AM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote: >> This change looks suspicious to me. I think here we can't use the >> tupDesc constructed from targetlist. One problem, I could see is that >> the check for hasOid setting in tlist_matches_tupdesc won't give the >> correct answer. In case of the scan, we use the tuple descriptor >> stored in relation descriptor which will allow us to take the right >> decision in tlist_matches_tupdesc. If you try the statement CREATE >> TABLE as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; in >> force_parallel_mode=regress, then you can reproduce the problem I am >> trying to highlight. > > I tried this, but nothing seemed to be obviously broken. Then I > realized that the CREATE TABLE command wasn't using parallelism, so I > retried with parallel_setup_cost = 0, parallel_tuple_cost = 0, and > min_parallel_table_scan_size = 0. That got it to use parallel query, > but I still don't see anything broken. Can you clarify further? > Have you set force_parallel_mode=regress; before running the statement? If so, then why do you need to tune the other parallel query related parameters? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Runtime Partition Pruning
On Thu, Nov 9, 2017 at 9:01 PM, Robert Haas wrote: > On Thu, Nov 9, 2017 at 6:18 AM, Beena Emerson wrote: >> The code still chooses the custom plan instead of the generic plan for >> the prepared statements. I am working on it. > > I don't think it's really the job of this patch to do anything about > that problem. > +1. I think if we really want to do something about plan choice when partitions are involved that should be done as a separate patch. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:53 PM, Andres Freund wrote: > Primarily because it's not an anti-corruption tool. I'd be surprised if > there weren't ways to corrupt the page using these corruptions that > aren't detected by it. It's very hard to assess the risk of missing something that's actually detectable with total confidence, but I think that the check is actually very thorough. > But even if it were, I don't think there's > enough information to do so in the general case. You very well can end > up with pages where subsequent hot pruning has removed a good bit of the > direct evidence of this bug. Sure, but maybe those are cases that can't get any worse anyway. So the question of avoiding making it worse doesn't arise. > But I'm not really sure why the error detection capabilities matter > much for the principal point I raised, which is how much work we need to > do to not further worsen the corruption. You're right. Just trying to put the risk in context, and to understand the extent of the concern that you have. -- Peter Geoghegan
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Stephen Frost writes: > I'm guessing no, which essentially means that *we* consider access to > lo_import/lo_export to be equivalent to superuser and therefore we're > not going to implement anything to try and prevent the user who has > access to those functions from becoming superuser. If we aren't willing > to do that, then how can we really say that there's some difference > between access to these functions and being a superuser? We seem to be talking past each other. Yes, if a user has malicious intentions, it's possible to parlay lo_export into obtaining a superuser login (I'm less sure that that's necessarily true for lo_import). That does NOT make it "equivalent", except perhaps in the view of someone who is only considering blocking malevolent actors. It does not mean that there's no value in preventing a task that needs to run lo_export from being able to accidentally destroy any data in the database. There's a range of situations where you are concerned about accidents and errors, not malicious intent; but your argument ignores those use-cases. To put it more plainly: your argument is much like saying that a person who knows a sudo password might as well do everything they ever do as root. regards, tom lane
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:45:07 -0800, Peter Geoghegan wrote:
> On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote:
> >> Actually, on second thought, I take that back -- I don't think that
> >> REINDEXing will even finish once a HOT chain is broken by the bug.
> >> IndexBuildHeapScan() actually does quite a good job of making sure
> >> that HOT chains are sane, which is how the enhanced amcheck notices
> >> the bug here in practice.
> >
> > I think that's too optimistic.
>
> Why? Because the "find the TID of the root" logic in
> IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the
> actual root (it might be some other HOT chain root following TID
> recycling by VACUUM)?

Primarily because it's not an anti-corruption tool. I'd be surprised if there weren't ways to corrupt the page using these corruptions that aren't detected by it. But even if it were, I don't think there's enough information to do so in the general case. You very well can end up with pages where subsequent hot pruning has removed a good bit of the direct evidence of this bug.

But I'm not really sure why the error detection capabilities matter much for the principal point I raised, which is how much work we need to do to not further worsen the corruption.

Greetings, Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fix bloom WAL tap test
On Thu, Nov 9, 2017 at 7:51 PM, Alexander Korotkov wrote: > On Wed, Nov 8, 2017 at 5:46 AM, Masahiko Sawada > wrote: >> > So I think >> > that you should instead do something like that: >> > >> > --- a/contrib/bloom/Makefile >> > +++ b/contrib/bloom/Makefile >> > @@ -20,5 +20,12 @@ include $(top_builddir)/src/Makefile.global >> > include $(top_srcdir)/contrib/contrib-global.mk >> > endif >> > >> > +installcheck: wal-installcheck >> > + >> > +check: wal-check >> > + >> > +wal-installcheck: >> > + $(prove_installcheck) >> > + >> > wal-check: temp-install >> > $(prove_check) >> > >> > This works for me as I would expect it should. >> >> Looks good to me too. > > OK, then so be it :) Thanks for the new version. This one, as well as the switch to psql_safe in https://www.postgresql.org/message-id/CAPpHfduxgEYF_0BTs-mxGC4=w5sw8rnUbq9BSTp1Wq7=nwr...@mail.gmail.com are good for a committer lookup IMO. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote: >> Actually, on second thought, I take that back -- I don't think that >> REINDEXing will even finish once a HOT chain is broken by the bug. >> IndexBuildHeapScan() actually does quite a good job of making sure >> that HOT chains are sane, which is how the enhanced amcheck notices >> the bug here in practice. > > I think that's too optimistic. Why? Because the "find the TID of the root" logic in IndexBuildHeapScan()/heap_get_root_tuples() won't reliably find the actual root (it might be some other HOT chain root following TID recycling by VACUUM)? Assuming that's what you meant: I would have thought that the xmin/xmax matching within heap_get_root_tuples() makes the sanity checking fairly reliable in practice. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] User defined data types in Logical Replication
Hi,

We are getting the below error while trying to use logical replication with user-defined data types in a C program (when calling the elog function).

ERROR: XX000: cache lookup failed for type X  # X is the remote type's OID

It occurs in the slot_store_error_callback function in worker.c, when remotetypoid does not exist in the local pg_type. I have tried to write a patch (attached). I think it is not friendly to change the typename to the OID's one, but I could not find an easy way to get the typename from the OID on the remote host.

---
Thanks and best regards,
Dang Minh Huong
NEC Solution Innovators, Ltd.
http://www.nec-solutioninnovators.co.jp/en/

logicalrep_typmap.patch
Description: logicalrep_typmap.patch

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] path toward faster partition pruning
Hi Amul.

On 2017/11/09 20:05, amul sul wrote:
> On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote wrote:
>> On 2017/11/06 14:32, David Rowley wrote:
>>> On 6 November 2017 at 17:30, Amit Langote wrote:
>>>> On 2017/11/03 13:32, David Rowley wrote:
>>>>> On 31 October 2017 at 21:43, Amit Langote wrote:
>>>>> []
>>
>> Attached updated set of patches, including the fix to make the new pruning
>> code handle Boolean partitioning.
>
> I am getting the following warnings on Mac OS X:

Thanks for the review.

> partition.c:1800:24: warning: comparison of constant -1 with expression of
> type 'NullTestType' (aka 'enum NullTestType') is always false
> [-Wtautological-constant-out-of-range-compare]
>         if (keynullness[i] == -1)
>             ~~~~~~~~~~~~~~ ^  ~~
> partition.c:1932:25: warning: comparison of constant -1 with expression of
> type 'NullTestType' (aka 'enum NullTestType') is always false
> [-Wtautological-constant-out-of-range-compare]
>         if (keynullness[i] == -1)
>             ~~~~~~~~~~~~~~ ^  ~~
> 2 warnings generated.
>
> Comment for 0004 patch:
> 270 +   /* -1 represents an invalid value of NullTestType. */
> 271 +   memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType));
>
> I think we should not use memset to set a value other than 0 or true/false.
> This will work for -1 on systems where values are stored in two's
> complement, but I am not sure about other architectures.

OK, I will remove all instances of comparing and setting variables of type NullTestType to a value of -1.

Thanks, Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 4:17 PM, Andres Freund wrote: >> I don't follow you here. Why would REINDEXing make the rows that >> should be dead disappear again, even for a short period of time? > > It's not the REINDEX that makes them reappear. Of course. I was just trying to make sense of what you said. > It's the second > vacuum. The reindex part was about $user trying to fix the problem... > As you need two vacuums with appropriate cutoffs to hit the "rows > revive" problem, that'll often in practice not happen immediately. This explanation clears things up, though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-09 16:02:17 -0800, Peter Geoghegan wrote: > > What I'm currently wondering about is how much we need to harden > > postgres against such existing corruption. If e.g. the hot chains are > > broken somebody might have reindexed thinking the problem is fixed - but > > if they then later vacuum everything goes to shit again, with dead rows > > reappearing. > > I don't follow you here. Why would REINDEXing make the rows that > should be dead disappear again, even for a short period of time? It's not the REINDEX that makes them reappear. It's the second vacuum. The reindex part was about $user trying to fix the problem... As you need two vacuums with appropriate cutoffs to hit the "rows revive" problem, that'll often in practice not happen immediately. > Actually, on second thought, I take that back -- I don't think that > REINDEXing will even finish once a HOT chain is broken by the bug. > IndexBuildHeapScan() actually does quite a good job of making sure > that HOT chains are sane, which is how the enhanced amcheck notices > the bug here in practice. I think that's too optimistic. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert, * Robert Haas (robertmh...@gmail.com) wrote: > On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: > > Further, I agree entirely that we > > shouldn't be deciding that certain capabilities are never allowed to be > > given to a user- but that's why superuser *exists* and why it's possible > > to give superuser access to multiple users. The question here is if > > it's actually sensible for us to make certain actions be grantable to > > non-superusers which allow that role to become, or to essentially be, a > > superuser. What that does, just as it does in the table case, from my > > viewpoint at least, is make it *look* to admins like they're grant'ing > > something less than superuser when, really, they're actually giving the > > user superuser-level access. That violates the POLA because the admin > > didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT > > EXECUTE ON lo_import() TO joe;'. > > This is exactly the kind of thing that I'm talking about. Forcing an > administrator to hand out full superuser access instead of being able > to give just lo_import() makes life difficult for users for no good > reason. The existence of a theoretically-exploitable security > vulnerability is not tantamount to really having access, especially on > a system with a secured logging facility. This is where I disagree. The administrator *is* giving the user superuser-level access, they're just doing it in a different way from explicitly saying 'ALTER USER .. WITH SUPERUSER' and that's exactly the issue that I have. This isn't a theoretical exploit- the user with lo_import/lo_export access is able to read and write any file on the filesystem which the PG OS has access to, including things like pg_hba.conf in most configurations, the file backing pg_authid, the OS user's .bashrc, the Kerberos principal, the CA public key, the SSL keys, tables owned by other users that the in-database role doesn't have access to, et al. 
Further, will we consider these *actual*, non-theoretical, methods to gain superuser access, or to bypass the database authorization system, to be security issues or holes to plug? I'm guessing no, which essentially means that *we* consider access to lo_import/lo_export to be equivalent to superuser and therefore we're not going to implement anything to try and prevent the user who has access to those functions from becoming superuser. If we aren't willing to do that, then how can we really say that there's some difference between access to these functions and being a superuser?

Thanks!

Stephen

signature.asc
Description: Digital signature
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On Thu, Nov 9, 2017 at 2:24 PM, Andres Freund wrote: > Attached is a version of the already existing regression test that both > reproduces the broken hot chain (and thus failing index lookups) and > then also the tuple reviving. I don't see any need for letting this run > with arbitrary permutations. I thought that the use of every possible permutation was excessive, myself. It left us with an isolation test that didn't precisely describe the behavior that is tested. What you came up with seems far, far better, especially because of the comments you included. The mail message-id references seem to add a lot, too. > What I'm currently wondering about is how much we need to harden > postgres against such existing corruption. If e.g. the hot chains are > broken somebody might have reindexed thinking the problem is fixed - but > if they then later vacuum everything goes to shit again, with dead rows > reappearing. I don't follow you here. Why would REINDEXing make the rows that should be dead disappear again, even for a short period of time? It might do so for index scans, I suppose, but not for sequential scans. Are you concerned about a risk of somebody not noticing that sequential scans are still broken? Actually, on second thought, I take that back -- I don't think that REINDEXing will even finish once a HOT chain is broken by the bug. IndexBuildHeapScan() actually does quite a good job of making sure that HOT chains are sane, which is how the enhanced amcheck notices the bug here in practice. (Before this bug was discovered, I would have expected amcheck to catch problems like it slightly later, during the Bloom filter probe for that HOT chain...but, in fact, it never gets there with corruption from this bug in practice, AFAIK.) -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup --progress output for batch execution
On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques < martin.marq...@2ndquadrant.com> wrote:
> Hi,
>
> Some time ago I had to work on a system where I was cloning a standby
> using pg_basebackup, that didn't have screen or tmux. For that reason I
> redirected the output to a file and ran it with nohup.
>
> I normally (always actually ;) ) run pg_basebackup with --progress and
> --verbose so I can follow how much has been done. When done on a tty you
> get a nice progress bar with the percentage that has been cloned.
>
> The problem came with the execution and redirection of the output, as
> the --progress option will write a *very* long line!
>
> Back then I thought of writing a patch (actually someone suggested I do
> so) to add a --batch-mode option which would change the behavior
> pg_basebackup has when printing the output messages.

While separate lines in the output file are better than one very long line, it still doesn't seem so useful. If you aren't watching it in real time, then you really need to have a timestamp on each line so that you can interpret the output. The lines are about one second apart, but I don't know how robust that timing is under all conditions.

I think I agree with Arthur that I'd rather have the decision made by inspecting whether output is going to a tty, rather than by adding another command line option. But maybe that is not detected robustly enough across all platforms and circumstances?

Cheers, Jeff
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On 11/09/2017 03:17 PM, Michael Paquier wrote: > On Fri, Nov 10, 2017 at 2:53 AM, Joe Conway wrote: >> On 11/09/2017 03:27 AM, Graham Leggett wrote: >>> Is there a parameter or mechanism for setting the required ssl cipher list >>> from the client side? >> >> I don't believe so. That is controlled by ssl_ciphers, which requires a >> restart in order to change. >> >> https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS > > Since commit de41869 present in v10, SSL parameters can be reloaded. Oh, cool, I must have missed that -- thanks! Joe -- Crunchy Data - http://crunchydata.com PostgreSQL Support for Secure Enterprises Consulting, Training, & Open Source Development signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Michael Paquier writes: > On Fri, Nov 10, 2017 at 7:05 AM, Tom Lane wrote: >> I did miss the need to fix the docs, and am happy to put in some strong >> wording about the security hazards of these functions while fixing the >> docs. But I do not think that leaving them with hardwired superuser >> checks is an improvement over being able to control them with GRANT. > Sorry about that. lobj.sgml indeed mentions superusers. Do you need a patch? No, I can write it. But I'm going to wait to see where this debate settles before expending effort on the docs. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On Fri, Nov 10, 2017 at 2:53 AM, Joe Conway wrote:
> On 11/09/2017 03:27 AM, Graham Leggett wrote:
>> Is there a parameter or mechanism for setting the required ssl cipher list
>> from the client side?
>
> I don't believe so. That is controlled by ssl_ciphers, which requires a
> restart in order to change.
>
> https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS

Since commit de41869, present in v10, SSL parameters can be reloaded. On the libpq side there is only an API to look at the ciphers set by the server, via PQsslAttribute().

--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Fri, Nov 10, 2017 at 7:05 AM, Tom Lane wrote: > I did miss the need to fix the docs, and am happy to put in some strong > wording about the security hazards of these functions while fixing the > docs. But I do not think that leaving them with hardwired superuser > checks is an improvement over being able to control them with GRANT. Sorry about that. lobj.sgml indeed mentions superusers. Do you need a patch? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [POC] hash partitioning
On Wed, Nov 1, 2017 at 6:16 AM, amul sul wrote: > Fixed in the 0003 patch. I have committed this patch set with the attached adjustments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company hash-adjustments.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] A hook for session start
On Fri, Nov 10, 2017 at 2:32 AM, Fabrízio de Royes Mello wrote: > On Thu, Nov 9, 2017 at 12:09 AM, Michael Paquier > wrote: >> +++ b/src/test/modules/test_session_hooks/session_hooks.conf >> @@ -0,0 +1 @@ >> +shared_preload_libraries = 'test_session_hooks' >> Don't you think that this should be a GUC? My previous comment >> outlined that. I won't fight hard on that point in any case, don't >> worry. I just want to make things clear :) > > Ooops... my fault... fixed! > > Thanks again!! +/* GUC variables */ +static char *session_hook_username = NULL; This causes the module to crash if test_session_hooks.username is not set. I would recommend just assigning a default value, say "postgres". -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple
On 2017-11-04 06:15:00 -0700, Andres Freund wrote:
> The reason for that is that I hadn't yet quite figured out how the bug I
> described in the commit message (and the previously committed testcase)
> would cause that. I was planning to diagnose / experiment more with this
> and write an email if I couldn't come up with an explanation. The
> committed test does *not* actually trigger that.
>
> The reason I couldn't quite figure out how the problem triggers is that
> [ long explanation ]

Attached is a version of the already existing regression test that both reproduces the broken hot chain (and thus failing index lookups) and then also the tuple reviving. I don't see any need for letting this run with arbitrary permutations.

Thanks to whoever allowed isolationtester permutations to go over multiple lines and allow comments. I was wondering about adding that as a feature just to discover it's already there ;)

What I'm currently wondering about is how much we need to harden postgres against such existing corruption. If e.g. the hot chains are broken somebody might have reindexed thinking the problem is fixed - but if they then later vacuum everything goes to shit again, with dead rows reappearing. There's no way we can fix hot chains after the fact, but preventing dead rows from reappearing seems important.

A minimal version of that is fairly easy - we slap a bunch of if (!TransactionIdDidCommit()) elog(ERROR) checks at various code paths. But that'll often trigger clog access errors when the problem occurred - if we want to do better we need to pass down relfrozenxid/relminmxid to a few functions. I'm inclined to do so, but it'll make the patch larger...

Comments?

Greetings, Andres Freund

# Test for interactions of tuple freezing with dead, as well as recently-dead
# tuples using multixacts via FOR KEY SHARE.
setup
{
  DROP TABLE IF EXISTS tab_freeze;
  CREATE TABLE tab_freeze (
    id int PRIMARY KEY,
    name char(3),
    x int);
  INSERT INTO tab_freeze VALUES (1, '111', 0);
  INSERT INTO tab_freeze VALUES (3, '333', 0);
}

teardown
{
  DROP TABLE tab_freeze;
}

session "s1"
step "s1_begin"     { BEGIN; }
step "s1_update"    { UPDATE tab_freeze SET x = x + 1 WHERE id = 3; }
step "s1_commit"    { COMMIT; }
step "s1_vacuum"    { VACUUM FREEZE tab_freeze; }
step "s1_selectone" {
    BEGIN;
    SET LOCAL enable_seqscan = false;
    SET LOCAL enable_bitmapscan = false;
    SELECT * FROM tab_freeze WHERE id = 3;
    COMMIT;
}
step "s1_selectall" { SELECT * FROM tab_freeze ORDER BY name, id; }

session "s2"
step "s2_begin"     { BEGIN; }
step "s2_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; }
step "s2_commit"    { COMMIT; }
step "s2_vacuum"    { VACUUM FREEZE tab_freeze; }

session "s3"
step "s3_begin"     { BEGIN; }
step "s3_key_share" { SELECT id FROM tab_freeze WHERE id = 3 FOR KEY SHARE; }
step "s3_commit"    { COMMIT; }
step "s3_vacuum"    { VACUUM FREEZE tab_freeze; }

# This permutation verifies that a previous bug
# https://postgr.es/m/e5711e62-8fdf-4dca-a888-c200bf6b5...@amazon.com
# https://postgr.es/m/20171102112019.33wb7g5wp4zpj...@alap3.anarazel.de
# is not reintroduced. We used to make wrong pruning / freezing
# decisions for multixacts, which could lead to a) broken hot chains b)
# dead rows being revived.
permutation "s1_begin" "s2_begin" "s3_begin"    # start transactions
   "s1_update" "s2_key_share" "s3_key_share"    # have xmax be a multi with an updater, updater being oldest xid
   "s1_update"      # create additional row version that has multis
   "s1_commit" "s2_commit" # commit both updater and share locker
   "s2_vacuum"      # due to bug in freezing logic, we used to *not* prune updated row, and then froze it
   "s1_selectone"   # if hot chain is broken, the row can't be found via index scan
   "s3_commit"      # commit remaining open xact
   "s2_vacuum"      # pruning / freezing in broken hot chains would unset xmax, reviving rows
   "s1_selectall"   # show borkedness

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On 2017-11-09 17:14:11 -0500, Tom Lane wrote: > If we do this, I'd suggest exposing it as a separate SQL function > get_raw_page_unlocked() rather than as an option to get_raw_page(). > > The reasoning is that if we ever allow these functions to be controlled > via GRANT instead of hardwired superuser checks (cf nearby debate about > lo_import/lo_export), one might reasonably consider the unlocked form as > more risky than the locked form, and hence not wish to have to give out > privileges to both at once. Good idea! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
Robert Haas writes: > On Thu, Nov 9, 2017 at 12:58 PM, Andres Freund wrote: >> You can already pass arbitrary byteas to heap_page_items(), so I don't >> see how this'd change anything exposure-wise? Or are you thinking that >> users would continually do this with actual page contents and would be >> more likely to hit edge cases than if using pg_read_binary_file() or >> such (which obviously sees an out of date page)? > More the latter. It's not really an increase in terms of security > exposure, but it might lead to more trouble in practice. If we do this, I'd suggest exposing it as a separate SQL function get_raw_page_unlocked() rather than as an option to get_raw_page(). The reasoning is that if we ever allow these functions to be controlled via GRANT instead of hardwired superuser checks (cf nearby debate about lo_import/lo_export), one might reasonably consider the unlocked form as more risky than the locked form, and hence not wish to have to give out privileges to both at once. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert Haas writes: > On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: >> Further, I agree entirely that we >> shouldn't be deciding that certain capabilities are never allowed to be >> given to a user- but that's why superuser *exists* and why it's possible >> to give superuser access to multiple users. The question here is if >> it's actually sensible for us to make certain actions be grantable to >> non-superusers which allow that role to become, or to essentially be, a >> superuser. What that does, just as it does in the table case, from my >> viewpoint at least, is make it *look* to admins like they're grant'ing >> something less than superuser when, really, they're actually giving the >> user superuser-level access. That violates the POLA because the admin >> didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT >> EXECUTE ON lo_import() TO joe;'. > This is exactly the kind of thing that I'm talking about. Forcing an > administrator to hand out full superuser access instead of being able > to give just lo_import() makes life difficult for users for no good > reason. The existence of a theoretically-exploitable security > vulnerability is not tantamount to really having access, especially on > a system with a secured logging facility. Exactly. I think that Stephen's argument depends on what a black-hat privilege recipient could theoretically do, but fails to consider what's useful for white-hat users. One of the standard rules for careful admins is to do as little as possible as root/superuser. If you have a situation where it's necessary to use, say, lo_import as part of a routine data import task, then the only alternative previously was to do that task as superuser. That is not an improvement, by any stretch of the imagination, over granting lo_import privileges to some otherwise-vanilla role that can run the data import task. 
We've previously discussed workarounds such as creating SECURITY DEFINER wrapper functions, but I don't think that approach does much to change the terms of the debate: it'd still be better if the wrapper function itself didn't need full superuser. I did miss the need to fix the docs, and am happy to put in some strong wording about the security hazards of these functions while fixing the docs. But I do not think that leaving them with hardwired superuser checks is an improvement over being able to control them with GRANT. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] different content of pg_depend after pg_upgrade
Hi

We have a check query based on some operations on the pg_depend table. This query gave different results depending on whether the database was migrated with pg_dump or with pg_upgrade. I found that the query itself was broken, but I also found an interesting thing. The count is 1 for any objid:

select distinct count(distinct classid), objid from pg_depend group by objid;

when the system was loaded from a dump; but when we used pg_upgrade, the previous rule no longer held. Is this expected behavior?

Regards

Pavel
[HACKERS] OpenSSL - PostgreSQL
Hi All,

I am using PostgreSQL version *9.5.7* on Red Hat Enterprise Linux *7.2*.

*OpenSSL version:* OpenSSL 1.0.1e-fips 11 Feb 2013.

I have a requirement to enable SSL in my environment with specific cipher suites; we want to exclude weak cipher suites from the OpenSSL default list. We have a list of cipher suites which are authorized for use in my environment, so client applications use one of the authorized cipher suites when configuring their application servers.

Is it required to install a different version of OpenSSL instead of the default OpenSSL on Linux? How can PostgreSQL be configured to allow specific cipher suites from different client applications?

Thanks, Chiru
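For reference, the server-side knob this question is after is the ssl_ciphers setting in postgresql.conf, which takes an OpenSSL cipher-list string; no separate OpenSSL build is needed just to narrow the allowed suites. A minimal sketch (the suite names below are purely illustrative, not a recommendation; verify any list against your OpenSSL build with `openssl ciphers -v '<list>'`):

```
# postgresql.conf -- restrict the server to an authorized cipher list.
# On 9.5 changing this requires a server restart.
ssl = on
# The 9.5 default is 'HIGH:MEDIUM:+3DES:!aNULL'; replace it with the
# authorized suites, e.g. (illustrative names only):
ssl_ciphers = 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256'
```

Clients then negotiate within whatever the server permits; as noted elsewhere in this digest, libpq itself has no client-side cipher-list parameter.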
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost wrote: > I agree that it may not be obvious which cases lead to a relatively easy > way to obtain superuser and which don't, and that's actually why I'd > much rather it be something that we're considering and looking out for > because, frankly, we're in a much better position generally to realize > those cases than our users are. I disagree. It's flattering to imagine that PostgreSQL developers, as a class, are smarter than PostgreSQL users, but it doesn't match my observations. > Further, I agree entirely that we > shouldn't be deciding that certain capabilities are never allowed to be > given to a user- but that's why superuser *exists* and why it's possible > to give superuser access to multiple users. The question here is if > it's actually sensible for us to make certain actions be grantable to > non-superusers which allow that role to become, or to essentially be, a > superuser. What that does, just as it does in the table case, from my > viewpoint at least, is make it *look* to admins like they're grant'ing > something less than superuser when, really, they're actually giving the > user superuser-level access. That violates the POLA because the admin > didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT > EXECUTE ON lo_import() TO joe;'. This is exactly the kind of thing that I'm talking about. Forcing an administrator to hand out full superuser access instead of being able to give just lo_import() makes life difficult for users for no good reason. The existence of a theoretically-exploitable security vulnerability is not tantamount to really having access, especially on a system with a secured logging facility. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: psql command \graw
2017-11-09 21:03 GMT+01:00 Fabien COELHO :
> Hello Pavel,
>
>> I hope I fixed all the mentioned issues.
>
> Patch applies with a warning:
>
> git apply ~/psql-graw-2.patch
> /home/fabien/psql-graw-2.patch:192: new blank line at EOF.
> +
> warning: 1 line adds whitespace errors.
>
> Otherwise it compiles. "make check" ok. doc gen ok.
>
> Two spurious empty lines are added before StoreQueryTuple.
>
> Doc: "If + is appended to the command name, a column names are displayed."
>
> I suggest instead: "When + is appended, column names are also displayed."
>
> ISTM that you can remove "force_column_header" and just set "tuple_only"
> to what you need, that is you do not need to change anything in function
> "print_unaligned_text".

The last point is not possible - I do not want to break the original tuple-only mode.

Pavel

> --
> Fabien.
[HACKERS] Inlining functions with "expensive" parameters
All,

As we try and make PostGIS more "parallel sensitive" we have been adding
costs to our functions, so that their relative CPU cost is more accurately
reflected in parallel plans. This has resulted in an odd side effect: some
of our "wrapper" functions stop giving index scans in plans [1]. This is a
problem!

An example of a "wrapper" is ST_Intersects(geom1, geom2). It combines an
index operation (geom1 && geom2) with an exact spatial test
(_ST_Intersects(geom1, geom2)). This is primarily for user convenience,
and has worked well for us for a decade and more. Having this construct
stop working is definitely a problem.

As we add costs to our functions, the odds increase that one of the
parameters to a wrapper might be a costed function. It's not uncommon to
see:

  ST_Intersects(geom, ST_SetSRID('POLYGON(...)', 4326))

It's fair to say that we really do depend on our wrappers getting inlined
basically all the time. They are simple functions; they do nothing other
than 'SELECT func1() AND func2() AND arg1 && arg2'. However, once costs
are added to the parameters, the inlining can be turned off relatively
quickly. Here's a PgSQL native example:

-- Create data table and index. Analyze.
DROP TABLE IF EXISTS boxen;
CREATE TABLE boxen AS
SELECT row_number() OVER() As gid,
       box(point(x, y), point(x+1, y+1)) AS b,
       x, y
FROM generate_series(-100,100) As y,
     generate_series(-100,100) As x;
CREATE INDEX idx_b_geom_gist ON boxen USING gist(b);
ANALYZE boxen;

-- An inlined function.
-- When set 'STRICT' it breaks index access.
-- However 'IMMUTABLE' doesn't seem to bother it.
CREATE OR REPLACE FUNCTION good_box(box, box)
RETURNS boolean
AS 'SELECT $1 OPERATOR(&&) $2 AND length(lseg(point($1),point($2))) < 3'
LANGUAGE 'sql';

-- Start with a low cost circle()
ALTER FUNCTION circle(point, double precision) COST 1;

-- [A] Query plan hits index
EXPLAIN SELECT gid FROM boxen
WHERE good_box(boxen.b, box(circle(point(20.5, 20.5), 2)));

-- [B] Query plan hits index
EXPLAIN SELECT gid
FROM boxen,
     (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
WHERE good_box(boxen.b, box(circle(point(c.x, c.y), 2)));

-- Increase cost of circle
ALTER FUNCTION circle(point, double precision) COST 100;

-- [B] Query plan does not hit index!
EXPLAIN SELECT gid
FROM boxen,
     (SELECT x, y FROM boxen WHERE x < 0 and y < 0) AS c
WHERE good_box(boxen.b, box(circle(point(c.x, c.y), 2)));

The inlining is getting tossed out on a test of how expensive the function
parameters are [2]. As a result, we lose what is really the correct plan,
and get a sequence scan instead of an index scan. The test of parameter
cost seems quite old (15+ years) and perhaps didn't anticipate highly
variable individual function costs (or maybe it did).

As it stands though, PostGIS is currently stuck choosing between having
costs on our functions or having our inlined wrappers, because we cannot
have both at the same time.

Any thoughts? Thanks!

P.

[1] https://trac.osgeo.org/postgis/ticket/3675#comment:18
[2] https://github.com/postgres/postgres/blob/ae20b23a9e7029f31ee902da08a464d968319f56/src/backend/optimizer/util/clauses.c#L4581-L4584
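The parameter-cost check at [2] can be modeled roughly as follows. This is a simplified Python paraphrase for illustration only, not the planner's actual C code: the constant mirrors PostgreSQL's default cpu_operator_cost and the 10x threshold visible at that link, but the real check also considers things like volatility and strictness.

```python
# Rough model of the SQL-function inlining heuristic at [2]: a parameter
# referenced more than once blocks inlining when evaluating it costs more
# than 10 * cpu_operator_cost, since inlining would duplicate that cost
# at every use site. Simplified paraphrase, not real planner code.

CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost


def inlining_allowed(param_cost: float, use_count: int) -> bool:
    """Would the wrapper be inlined, given a parameter's evaluation cost
    and the number of times the body references it?"""
    if use_count <= 1:
        # Evaluated at most once either way: no duplication, so inline.
        return True
    # Used multiple times: refuse if the parameter is at all expensive.
    return param_cost <= 10 * CPU_OPERATOR_COST


# good_box() references each argument twice ($1/$2 appear in both the &&
# test and the lseg distance test), so a COST 100 argument kills inlining:
assert inlining_allowed(1 * CPU_OPERATOR_COST, 2)        # COST 1: inlined
assert not inlining_allowed(100 * CPU_OPERATOR_COST, 2)  # COST 100: not
```

This matches the behavior in example [B] above: raising circle()'s cost from 1 to 100 pushes the argument past the threshold and the wrapper stops being inlined.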
Re: [HACKERS] proposal: psql command \graw
Hello Pavel,

>> I hope I fixed all mentioned issues.

Patch applies with a warning:

  git apply ~/psql-graw-2.patch
  /home/fabien/psql-graw-2.patch:192: new blank line at EOF.
  +
  warning: 1 line adds whitespace errors.

Otherwise it compiles. "make check" ok. doc gen ok.

Two spurious empty lines are added before StoreQueryTuple.

Doc: "If + is appended to the command name, a column names are displayed."

I suggest instead: "When + is appended, column names are also displayed."

ISTM that you can remove "force_column_header" and just set "tuple_only"
to what you need, that is you do not need to change anything in function
"print_unaligned_text".

-- 
Fabien.
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:52 PM, Stephen Frost wrote:
>> This is not unlike the discussions we've had in the past around allowing
>> non-owners of a table to modify properties of a table, where the
>> argument has been successfully and clearly made that if you make the
>> ability to change the table a GRANT'able right, then that can be used to
>> become the role which owns the table without much difficulty, and so
>> there isn't really a point in making that right independently
>> GRANT'able. We have some of that explicitly GRANT'able today due to
>> requirements of the spec, but that's outside of our control.
>
> I don't think it's quite the same thing. I wouldn't go out of my way
> to invent grantable table rights that just let you usurp the table
> owner's permissions, but I still think it's better to have a uniform
> rule that functions we ship don't contain hard-coded superuser checks.

Just to be clear, we should certainly be thinking about more than just
functions but also about things like publications and subscriptions and
other bits of the system. I haven't done a detailed analysis quite yet,
but I'm reasonably confident that a number of things in this last release
cycle have resulted in quite a few additional explicit superuser checks,
which I do think is an issue to be addressed.

> One problem is that which functions allow an escalation to superuser
> that is easy enough or reliable enough may not actually be a very
> bright line in all cases. But more than that, I think it's a
> legitimate decision to grant to a user a right that COULD lead to a
> superuser escalation and trust them not to use it that way, or prevent
> them from using it that way by some means not known to the database
> system (e.g. route all queries through pgbouncer and send them to a
> teletype; if a breach is detected, go to the teletype room, read the
> paper contained therein, and decide who to fire/imprison/execute at
> gunpoint). It shouldn't be our job to decide that granting a certain
> right is NEVER ok.

I agree that it may not be obvious which cases lead to a relatively easy
way to obtain superuser and which don't, and that's actually why I'd much
rather it be something that we're considering and looking out for
because, frankly, we're in a much better position generally to realize
those cases than our users are.

Further, I agree entirely that we shouldn't be deciding that certain
capabilities are never allowed to be given to a user- but that's why
superuser *exists* and why it's possible to give superuser access to
multiple users. The question here is if it's actually sensible for us to
make certain actions be grantable to non-superusers which allow that role
to become, or to essentially be, a superuser. What that does, just as it
does in the table case, from my viewpoint at least, is make it *look* to
admins like they're grant'ing something less than superuser when, really,
they're actually giving the user superuser-level access. That violates
the POLA because the admin didn't write 'ALTER USER joe WITH SUPERUSER',
they just wrote 'GRANT EXECUTE ON lo_import() TO joe;'.

We can certainly argue about whether the admin should have just known
that lo_import()/lo_export() or similar functions were equivalent to
grant'ing a user superuser access, but it's not entirely clear to me that
it's actually advantageous to go out of our way to provide multiple ways
for superuser-level access to be grant'ed out to users. Let's provide one
very clear way to do that and strive to avoid building in others, either
intentionally or unintentionally.

Thanks!

Stephen
Re: [HACKERS] [PATCH] Improve geometric types
>> This is also affecting the lseg ## box operator.
>
> Mmm.. It returns (1.5, 1.5) with the 0004 patch. It is surely a
> point on the second operand but I'm not sure it's right that the
> operator returns a specific point for two parallel segments.

I am changing it to return NULL when they are parallel.

> I'd like to put comments on 0001 and 0004 only now:
>
> - Adding [LR]DELIM_L looks good but they should be described in
>   the comment just above.

I will mention it in the new version.

> - I'm not sure the change of box_construct is good but currently
>   all callers fit the new interface (giving two points, not
>   four coordinates).

I tried to make things more consistent. The other constructors take
points.

> - close_lseg seems to be forgetting the case where the two given
>   segments are crossing (and parallel).

I am re-implementing it to cover those cases.

> - make_bound_box operates directly on poly->boundbox. I'm
>   afraid that such coding hinders compilers from using registers.

I am changing it back.

> This is a bit apart from this patch, but it would be better if we
> could eliminate internal calls using DirectFunctionCall.

We don't seem to be able to fix all the issues without doing that. I will
incorporate the change.

Thank you for your review. I will address your other email before posting
new versions.
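To make the "return NULL for parallel segments" point concrete, here is a toy Python model. This is purely illustrative, not the patch's C code, and the non-parallel branch is reduced to the line-crossing case: the point is simply that the operator yields no result (None, modeling SQL NULL) when the segments' direction vectors are parallel, because no single closest point is well defined.

```python
# Toy model of the behavior discussed above for the closest-point
# operator: return None (modeling SQL NULL) when the two segments are
# parallel, instead of picking an arbitrary point on the second operand.
# Illustrative only; not the patch's C implementation.

def is_parallel(s1, s2):
    """Segments are ((x1, y1), (x2, y2)); parallel iff the cross
    product of their direction vectors is zero."""
    (a, b), (c, d) = s1, s2
    dx1, dy1 = b[0] - a[0], b[1] - a[1]
    dx2, dy2 = d[0] - c[0], d[1] - c[1]
    return dx1 * dy2 - dy1 * dx2 == 0

def closest_point(s1, s2):
    """Point of s2 closest to s1, or None when that is ill-defined."""
    if is_parallel(s1, s2):
        return None  # no unique answer for two parallel segments
    # Non-parallel case: intersect the supporting lines (sufficient for
    # this sketch when the segments actually cross).
    (a, b), (c, d) = s1, s2
    dx1, dy1 = b[0] - a[0], b[1] - a[1]
    dx2, dy2 = d[0] - c[0], d[1] - c[1]
    denom = dx1 * dy2 - dy1 * dx2
    t = ((c[0] - a[0]) * dy2 - (c[1] - a[1]) * dx2) / denom
    return (a[0] + t * dx1, a[1] + t * dy1)

# Two parallel segments: no specific point is returned.
assert closest_point(((0, 0), (1, 1)), ((0, 1), (1, 2))) is None
```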
Re: [HACKERS] why not parallel seq scan for slow functions
On Thu, Nov 9, 2017 at 3:47 AM, Amit Kapila wrote:
> I think I understood your concern after some offlist discussion and it
> is primarily due to the inheritance related check which can skip the
> generation of gather paths when it shouldn't. So what might fit
> better here is a straight check on the number of base rels such that
> allow generating gather path in set_rel_pathlist, if there are
> multiple baserels involved. I have used all_baserels which I think
> will work better for this purpose.

Yes, that looks a lot more likely to be correct. Let's see what Tom
thinks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 1:52 PM, Stephen Frost wrote:
>> I disagree that that is, or should be, a guiding principle.
>
> It was what I was using as the basis of the work which I did in this
> area and, at least at that time, there seemed to be little issue with
> that.

Well, there actually kind of was. It came up here:

http://postgr.es/m/ca+tgmoy6ne5n4jc5awxser2g2gdgr4omf7edcexamvpf_jq...@mail.gmail.com

I mis-remembered who was on which side of the debate, though, hence the
comment about employers. But never mind about that, since I was wrong.
Sorry for not checking that more carefully before.

> This is not unlike the discussions we've had in the past around allowing
> non-owners of a table to modify properties of a table, where the
> argument has been successfully and clearly made that if you make the
> ability to change the table a GRANT'able right, then that can be used to
> become the role which owns the table without much difficulty, and so
> there isn't really a point in making that right independently
> GRANT'able. We have some of that explicitly GRANT'able today due to
> requirements of the spec, but that's outside of our control.

I don't think it's quite the same thing. I wouldn't go out of my way
to invent grantable table rights that just let you usurp the table
owner's permissions, but I still think it's better to have a uniform
rule that functions we ship don't contain hard-coded superuser checks.

One problem is that which functions allow an escalation to superuser
that is easy enough or reliable enough may not actually be a very bright
line in all cases. But more than that, I think it's a legitimate
decision to grant to a user a right that COULD lead to a superuser
escalation and trust them not to use it that way, or prevent them from
using it that way by some means not known to the database system (e.g.
route all queries through pgbouncer and send them to a teletype; if a
breach is detected, go to the teletype room, read the paper contained
therein, and decide who to fire/imprison/execute at gunpoint). It
shouldn't be our job to decide that granting a certain right is NEVER ok.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:16 PM, Stephen Frost wrote:
>> While we have been working to reduce the number of superuser() checks in
>> the backend in favor of having the ability to GRANT explicit rights, one
>> of the guiding principles has always been that capabilities which can
>> be used to gain superuser rights with little effort should remain
>> restricted to the superuser, which is why the lo_import/lo_export hadn't
>> been under consideration for superuser-check removal in the analysis I
>> provided previously.
>
> I disagree that that is, or should be, a guiding principle.

It was what I was using as the basis of the work which I did in this
area and, at least at that time, there seemed to be little issue with
that.

> I'm not
> sure that anyone other than you and your fellow employees at Crunchy
> has ever agreed that this is some kind of principle. You make it
> sound like there's a consensus about this, but I think there isn't.

It's unclear to me why you're bringing up employers in this discussion,
particularly since Tom is the one who just moved things in the direction
that you're evidently advocating for. Clearly there isn't consensus if
you and others disagree. Things certainly can change over time as well,
but if we're going to make a change here then we should make it
willfully, plainly, and consistently.

> I think our guiding principle should be to get rid of ALL of the
> hard-coded superuser checks and let people GRANT what they want. If
> they grant a permission that results in somebody escalating to
> superuser, they get to keep both pieces. Such risks might be worth
> documenting, but we shouldn't obstruct people from doing it.

I would certainly like to see the many additional hard-coded superuser
checks introduced into PG10 removed and replaced with more fine-grained
GRANT-based privileges (either as additional GRANT rights or perhaps
through the default roles system).

What I dislike about allowing GRANTs of a privilege which allows someone
to be superuser is that it makes it *look* like you're only GRANT'ing
some subset of reasonable rights when, in reality, you're actually
GRANT'ing a great deal more.

This is not unlike the discussions we've had in the past around allowing
non-owners of a table to modify properties of a table, where the
argument has been successfully and clearly made that if you make the
ability to change the table a GRANT'able right, then that can be used to
become the role which owns the table without much difficulty, and so
there isn't really a point in making that right independently
GRANT'able. We have some of that explicitly GRANT'able today due to
requirements of the spec, but that's outside of our control.

> In the same way, Linux will not prevent you from making a binary
> setuid regardless of what the binary does. If you make a binary
> setuid root that lets someone hack root, that's your fault, not the
> operating system.

This is actually one of the things that SELinux is intended to address,
so I don't agree that it's quite this cut-and-dry.

Thanks!

Stephen
Re: [HACKERS] Aggregates push-down to partitions
Hi Konstantin!

09.11.17 20:14, Konstantin Knizhnik wrote:

> It is still far from the ideal plan, because each worker is working with
> all partitions, instead of splitting partitions between workers and
> calculating partial aggregates for each partition. But if we add an FDW
> as a child of the parent table, then parallel scan can not be used and
> we get the worst possible plan:
>
> postgres=# create foreign table derived_fdw() inherits(base)
>            server pg_fdw options (table_name 'derived1');
> CREATE FOREIGN TABLE
> postgres=# explain select sum(x) from base;
>                                  QUERY PLAN
> --------------------------------------------------------------------------
>  Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
>    ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
>          ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
>          ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
>          ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
>          ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
> (6 rows)
>
> So we sequentially pull all data to this node and compute aggregates
> locally. The ideal plan would calculate partial aggregates in parallel
> at all nodes and then combine the partial results. It requires two
> changes:
>
> 1. Replace Aggregate->Append with
>    Finalize_Aggregate->Append->Partial_Aggregate
> 2. Concurrent execution of Append.
>
> This can be done in two different ways: we can try to use the existing
> parallel-worker infrastructure and replace Append with Gather. That
> seems to be the best approach for local partitioning. In the case of
> remote (FDW) partitions, it is enough to split the starting of execution
> (PQsendQuery in postgres_fdw) from the getting of results, so it
> requires some changes in the FDW protocol.
>
> I wonder if somebody has already investigated this problem or is working
> in this direction. Maybe there are already some patches proposed? I have
> searched the hackers archive, but didn't find anything relevant... Are
> there any suggestions about the best approach to implement this feature?

Maybe the problem you describe is solved in this thread [1] through the
introduction of a Parallel Append node?

1. https://www.postgresql.org/message-id/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1%2BS%2BvRuUQ%40mail.gmail.com

-- 
Regards,
Maksim Milyutin
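The Finalize_Aggregate -> Append -> Partial_Aggregate shape described above can be sketched as follows. This is a toy Python model of the data flow only, not PostgreSQL code; the partition names simply echo the example plan.

```python
# Toy model of the two-phase aggregate plan discussed above
# (Finalize_Aggregate -> Append -> Partial_Aggregate): each partition
# computes its partial state independently, possibly in a parallel
# worker or on a remote FDW node, and only the small partial states
# travel to the coordinating node for the finalize step.

partitions = {
    "derived1":    [1, 2, 3],
    "derived2":    [10, 20, 30],
    "derived_fdw": [100, 200],  # imagine these rows live on a remote server
}

# Partial aggregate, per partition: the parallelizable part. For an FDW
# child this is where the query would be dispatched without waiting.
partial_sums = {name: sum(rows) for name, rows in partitions.items()}

# Finalize: combine the partial states on the coordinating node.
total = sum(partial_sums.values())

# Same answer as pulling every row to one node, but only one value per
# partition crossed the Append boundary.
assert total == sum(sum(rows) for rows in partitions.values())
```

For sum() the partial and combine functions are both addition; aggregates like avg() instead carry a (sum, count) pair as their partial state.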
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
On Thu, Nov 9, 2017 at 1:16 PM, Stephen Frost wrote:
> While we have been working to reduce the number of superuser() checks in
> the backend in favor of having the ability to GRANT explicit rights, one
> of the guiding principles has always been that capabilities which can
> be used to gain superuser rights with little effort should remain
> restricted to the superuser, which is why the lo_import/lo_export hadn't
> been under consideration for superuser-check removal in the analysis I
> provided previously.

I disagree that that is, or should be, a guiding principle. I'm not
sure that anyone other than you and your fellow employees at Crunchy
has ever agreed that this is some kind of principle. You make it
sound like there's a consensus about this, but I think there isn't.

I think our guiding principle should be to get rid of ALL of the
hard-coded superuser checks and let people GRANT what they want. If
they grant a permission that results in somebody escalating to
superuser, they get to keep both pieces. Such risks might be worth
documenting, but we shouldn't obstruct people from doing it.

In the same way, Linux will not prevent you from making a binary
setuid regardless of what the binary does. If you make a binary
setuid root that lets someone hack root, that's your fault, not the
operating system's.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] [POC] Faster processing at Gather node
On Thu, Nov 9, 2017 at 12:08 AM, Amit Kapila wrote:
> This change looks suspicious to me. I think here we can't use the
> tupDesc constructed from the targetlist. One problem I can see is that
> the check for the hasOid setting in tlist_matches_tupdesc won't give
> the correct answer. In the case of a scan, we use the tuple descriptor
> stored in the relation descriptor, which allows us to take the right
> decision in tlist_matches_tupdesc. If you try the statement CREATE
> TABLE as_select1 AS SELECT * FROM pg_class WHERE relkind = 'r'; with
> force_parallel_mode=regress, then you can reproduce the problem I am
> trying to highlight.

I tried this, but nothing seemed to be obviously broken. Then I
realized that the CREATE TABLE command wasn't using parallelism, so I
retried with parallel_setup_cost = 0, parallel_tuple_cost = 0, and
min_parallel_table_scan_size = 0. That got it to use parallel query,
but I still don't see anything broken. Can you clarify further?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: psql command \graw
Hi

2017-08-24 5:50 GMT+02:00 Fabien COELHO:

> Hello Pavel,
>
>> I have added the patch to the next commitfest.
>
> Patch applies, compiles, works.
>
> I'm okay with the names graw/graw+, and for having such short-hands.
>
> Missing break in switch, even if last item and useless, because other
> items do it... Also should be added at its place in alphabetical order?
>
> "column_header" is somehow redundant with "tuples_only". Use the
> existing one instead of adding a new one?
>
> More generally, ISTM that the same effect could be achieved without
> adding a new print function, but by setting more options (separator,
> ...) and calling an existing print function. If so, I think it would
> reduce the code size.
>
> Missing help entry.
>
> Missing non regression tests.
>
> Missing documentation.

I hope I fixed all mentioned issues.

Regards

Pavel

> --
> Fabien.

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e520cdf3ba..9e7030f247 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2020,6 +2020,19 @@ CREATE INDEX
+
+\graw[+] [ filename ]
+\graw[+] [ |command ]
+
+
+\graw is equivalent to \g, but
+forces unaligned output mode for this query.  If +
+is appended to the command name, a column names are displayed.
+
+
+
+
 \gset [ prefix ]
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 8cc4de3878..b3461291eb 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -332,7 +332,8 @@ exec_command(const char *cmd,
 		status = exec_command_errverbose(scan_state, active_branch);
 	else if (strcmp(cmd, "f") == 0)
 		status = exec_command_f(scan_state, active_branch);
-	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0)
+	else if (strcmp(cmd, "g") == 0 || strcmp(cmd, "gx") == 0 ||
+			 strcmp(cmd, "graw") == 0 || strcmp(cmd, "graw+") == 0)
 		status = exec_command_g(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "gdesc") == 0)
 		status = exec_command_gdesc(scan_state, active_branch);
@@ -1232,6 +1233,7 @@ exec_command_f(PsqlScanState scan_state, bool active_branch)
 
 /*
  * \g [filename] -- send query, optionally with output to file/pipe
+ * \graw [filename] -- same as \g with raw format
  * \gx [filename] -- same as \g, with expanded mode forced
  */
 static backslashResult
@@ -1254,6 +1256,10 @@ exec_command_g(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			free(fname);
 		if (strcmp(cmd, "gx") == 0)
 			pset.g_expanded = true;
+		else if (strcmp(cmd, "graw") == 0)
+			pset.g_raw = true;
+		else if (strcmp(cmd, "graw+") == 0)
+			pset.g_raw_header = true;
 		status = PSQL_CMD_SEND;
 	}
 	else
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 7a91a44b2b..aeec302eae 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -865,6 +865,14 @@ PrintQueryTuples(const PGresult *results)
 	if (pset.g_expanded)
 		my_popt.topt.expanded = 1;
+	/* one-shot raw output requested by \raw and \graw+ */
+	else if (pset.g_raw || pset.g_raw_header)
+	{
+		my_popt.topt.format = PRINT_UNALIGNED;
+		my_popt.topt.tuples_only = true;
+		my_popt.topt.force_column_header = pset.g_raw_header;
+	}
+
 	/* write output to \g argument, if any */
 	if (pset.gfname)
 	{
@@ -893,6 +901,8 @@ PrintQueryTuples(const PGresult *results)
 	}
 
+
+
 /*
  * StoreQueryTuple: assuming query result is OK, save data into variables
  *
@@ -1517,6 +1527,10 @@ sendquery_cleanup:
 	/* reset \gx's expanded-mode flag */
 	pset.g_expanded = false;
 
+	/* reset \graw flags */
+	pset.g_raw = false;
+	pset.g_raw_header = false;
+
 	/* reset \gset trigger */
 	if (pset.gset_prefix)
 	{
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index a926c40b9b..e573711434 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -167,7 +167,7 @@ slashUsage(unsigned short int pager)
 	 * Use "psql --help=commands | wc" to count correctly. It's okay to count
 	 * the USE_READLINE line even in builds without that.
 	 */
-	output = PageOutput(125, pager ? &(pset.popt.topt) : NULL);
+	output = PageOutput(126, pager ? &(pset.popt.topt) : NULL);
 
 	fprintf(output, _("General\n"));
 	fprintf(output, _("  \\copyright             show PostgreSQL usage and distribution terms\n"));
@@ -176,6 +176,7 @@ slashUsage(unsigned short int pager)
 	fprintf(output, _("  \\g [FILE] or ;         execute query (and send results to file or |pipe)\n"));
 	fprintf(output, _("  \\gdesc                 describe result of query, without executing it\n"));
 	fprintf(output, _("  \\gexec                 execute query, then execute each value in its result\n"));
+	fprintf(output, _("  \\graw[+] [FILE]        as \\g, but forces unaligned raw output mode\n"));
 	fprintf(output, _("  \\gset [PREFIX]         execute query and store results in psql variables\n"));
 	fprintf(output, _("  \\gx [FILE]             as \\g, but f
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Tom, Michael,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Paquier writes:
>> On Thu, Nov 9, 2017 at 6:05 AM, Tom Lane wrote:
>>> Another idea would be to invent a new external flag bit "INV_WRITE_ONLY",
>>> so that people who wanted true write-only could get it, without breaking
>>> backwards-compatible behavior. But I'm inclined to wait for some field
>>> demand to show up before adding even that little bit of complication.
>>
>> Demand that may never show up, and the current behavior on HEAD does
>> not receive any complaints either. I am keeping the patch simple for
>> now. That's less aspirin needed for everybody.
>
> Looks good to me, pushed.

While we have been working to reduce the number of superuser() checks in
the backend in favor of having the ability to GRANT explicit rights, one
of the guiding principles has always been that capabilities which can be
used to gain superuser rights with little effort should remain restricted
to the superuser, which is why lo_import/lo_export hadn't been under
consideration for superuser-check removal in the analysis I provided
previously. As such, I'm not particularly thrilled to see those checks
simply removed.

If we are going to say that it's acceptable to allow non-superusers
arbitrary access to files on the filesystem as the OS user, then we would
also be removing similar checks from adminpack, something I've also been
against quite clearly in past discussions.

This also didn't update the documentation regarding these functions,
which clearly states that superuser is required. If we are going to allow
non-superusers access to these functions, we certainly need to remove the
claim stating that we don't do that, and further we must make it
abundantly clear that these functions are extremely dangerous and could
be trivially used to make someone who has access to them into a
superuser.

I continue to feel that this is something worthy of serious consideration
and discussion, and not something to be done because we happen to be
modifying the code in this area. I'm tempted to suggest we should have
another role attribute or some other additional check when it comes to
functions like these.

The right way to allow users to be able to pull in data off the
filesystem, imv, would be by providing a way to define specific locations
on the filesystem which users can be granted access to import data from,
as I once wrote a patch to do. That's certainly quite tricky to get
correct, but it's much better than allowing non-superusers arbitrary
access, which is what this does and what users may start using if we
continue to remove these restrictions without providing a better option.

Thanks!

Stephen
Re: [HACKERS] Parallel Append implementation
On Sat, Oct 28, 2017 at 5:50 AM, Robert Haas wrote:
> No, because the Append node is *NOT* getting copied into shared
> memory. I have pushed a comment update to the existing functions; you
> can use the same comment for this patch.

I spent the last several days working on this patch, which had a number
of problems, both cosmetic and functional. I think the attached is in
better shape now, but it could certainly use some more review and
testing since I only just finished modifying it, and I modified it
pretty heavily. Changes:

- I fixed the "morerows" entries in the documentation. If you had built
  the documentation the way you had it and loaded it up in a web
  browser, you would have seen that the way you had it was not correct.

- I moved T_AppendState to a different position in the switch inside
  ExecParallelReInitializeDSM, so as to keep that switch in the same
  order as all of the other switch statements in that file.

- I rewrote the comment for pa_finished. It previously began with
  "workers currently executing the subplan", which is not an accurate
  description. I suspect this was a holdover from a previous version of
  the patch in which this was an array of integers rather than an array
  of type bool. I also fixed the comment in ExecAppendEstimate and
  added, removed, or rewrote various other comments as well.

- I renamed PA_INVALID_PLAN to INVALID_SUBPLAN_INDEX, which I think is
  more clear and allows for the possibility that this sentinel value
  might someday be used for non-parallel-aware Append plans.

- I largely rewrote the code for picking the next subplan. A
  superficial problem with the way that you had it is that you had
  renamed exec_append_initialize_next to exec_append_seq_next but not
  updated the header comment to match. Also, the logic was spread out
  all over the file. There are three cases: not parallel aware, leader,
  worker. You had the code for the first case at the top of the file
  and the other two cases at the bottom of the file, and used multiple
  "if" statements to pick the right one in each case. I replaced all
  that with a function pointer stored in the AppendState, moved the code
  so it's all together, and rewrote it in a way that I find easier to
  understand. I also changed the naming convention.

- I renamed pappend_len to pstate_len and ParallelAppendDescData to
  ParallelAppendState. I think the use of the word "descriptor" is a
  carryover from the concept of a scan descriptor. There's nothing
  really wrong with inventing the concept of an "append descriptor",
  but it seems more clear to just refer to shared state.

- I fixed ExecAppendReInitializeDSM not to reset node->as_whichplan.
  Per commit 41b0dd987d44089dc48e9c70024277e253b396b7, that's wrong;
  instead, local state should be reset in ExecReScanAppend. I installed
  what I believe to be the correct logic in that function instead.

- I fixed list_qsort() so that it copies the type of the old list into
  the new list. Otherwise, sorting a list of type T_IntList or
  T_OidList would turn it into just plain T_List, which is wrong.

- I removed get_append_num_workers and integrated the logic into the
  callers. This function was coded quite strangely: it assigned the
  return value of fls() to a double and then eventually rounded the
  result back to an integer. But fls() returns an integer, so this
  doesn't make much sense. On a related note, I made it use
  fls(# of subpaths) instead of fls(# of subpaths) + 1. Adding 1
  doesn't make sense to me here because it leads to a decision to use 2
  workers for a single, non-partial subpath. I suspect both of these
  mistakes stem from thinking that fls() returns the base-2 logarithm,
  but in fact it doesn't, quite: log2(1) = 0.0 but fls(1) = 1.

- In the process of making the changes described in the previous point,
  I added a couple of assertions, one of which promptly failed. It
  turns out the reason is that your patch didn't update
  accumulate_append_subpaths(), which can result in flattening
  non-partial paths from a Parallel Append into a parent Append's list
  of partial paths, which is bad. The easiest way to fix that would be
  to just teach accumulate_append_subpaths() not to flatten a Parallel
  Append into a parent Append or MergeAppend node, but it seemed to me
  that there was a fair amount of duplication between
  accumulate_partialappend_subpath() and accumulate_append_subpaths(),
  so what I did instead is folded all of the necessary logic directly
  into accumulate_append_subpaths(). This approach also avoids some
  assumptions that your code made, such as the assumption that we will
  never have a partial MergeAppend path.

- I changed create_append_path() so that it uses the parallel_aware
  argument as the only determinant of whether the output path is
  flagged as parallel-aware. Your version also considered whether
  parallel_workers > 0, but I think that's not a good idea; the caller
  should pass the correct value for parallel_aware rather than relying
  on this function to fix it. Possibly you indirectly
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 12:58 PM, Andres Freund wrote: > You can already pass arbitrary byteas to heap_page_items(), so I don't > see how this'd change anything exposure-wise? Or are you thinking that > users would continually do this with actual page contents and would be > more likely to hit edge cases than if using pg_read_binary_file() or > such (which obviously sees an out of date page)? More the latter. It's not really an increase in terms of security exposure, but it might lead to more trouble in practice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On 2017-11-09 12:55:30 -0500, Robert Haas wrote: > On Thu, Nov 9, 2017 at 12:49 PM, Andres Freund wrote: > > Occasionally, when debugging issues, I find it quite useful to be able > > to do a heap_page_items() on a page that's actually locked exclusively > > concurrently. E.g. investigating the recent multixact vacuuming issues, > > it was very useful to attach a debugger to one backend to step through > > freezing, and display the page in another session. > > > > Currently the locking in get_raw_page_internal() prevents that. If it's > > an option defaulting to off, I don't see why we couldn't allow that to > > skip locking the page's contents. Obviously you can get corrupted > > contents that way, but we already allow to pass arbitrary stuff to > > heap_page_items(). Since pinning wouldn't be changed, there's no danger > > of the page being moved out from under us. > > heap_page_items() is, if I remember correctly, not necessarily going > to tolerate malformed input very well - I think that's why we restrict > all of these functions to superusers. So using it in this way would > seem to increase the risk of a server crash or other horrible > misbehavior. Of course if we're just dev-testing that doesn't really > matter. You can already pass arbitrary byteas to heap_page_items(), so I don't see how this'd change anything exposure-wise? Or are you thinking that users would continually do this with actual page contents and would be more likely to hit edge cases than if using pg_read_binary_file() or such (which obviously sees an out of date page)? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 12:49 PM, Andres Freund wrote: > Occasionally, when debugging issues, I find it quite useful to be able > to do a heap_page_items() on a page that's actually locked exclusively > concurrently. E.g. investigating the recent multixact vacuuming issues, > it was very useful to attach a debugger to one backend to step through > freezing, and display the page in another session. > > Currently the locking in get_raw_page_internal() prevents that. If it's > an option defaulting to off, I don't see why we couldn't allow that to > skip locking the page's contents. Obviously you can get corrupted > contents that way, but we already allow to pass arbitrary stuff to > heap_page_items(). Since pinning wouldn't be changed, there's no danger > of the page being moved out from under us. heap_page_items() is, if I remember correctly, not necessarily going to tolerate malformed input very well - I think that's why we restrict all of these functions to superusers. So using it in this way would seem to increase the risk of a server crash or other horrible misbehavior. Of course if we're just dev-testing that doesn't really matter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks
Michael Paquier writes: > On Thu, Nov 9, 2017 at 6:05 AM, Tom Lane wrote: >> Another idea would be to invent a new external flag bit "INV_WRITE_ONLY", >> so that people who wanted true write-only could get it, without breaking >> backwards-compatible behavior. But I'm inclined to wait for some field >> demand to show up before adding even that little bit of complication. > Demand that may never show up, and the current behavior on HEAD does > not receive any complains either. I am keeping the patch simple for > now. That's less aspirin needed for everybody. Looks good to me, pushed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] libpq connection strings: control over the cipher suites?
On 11/09/2017 03:27 AM, Graham Leggett wrote:
> Is there a parameter or mechanism for setting the required ssl cipher list
> from the client side?

I don't believe so. That is controlled by ssl_ciphers, which requires a restart in order to change.

https://www.postgresql.org/docs/10/static/runtime-config-connection.html#GUC-SSL-CIPHERS

select name,setting,context from pg_settings where name like '%ssl%';
           name            |         setting          |  context
---------------------------+--------------------------+------------
 ssl                       | off                      | postmaster
 ssl_ca_file               |                          | postmaster
 ssl_cert_file             | server.crt               | postmaster
 ssl_ciphers               | HIGH:MEDIUM:+3DES:!aNULL | postmaster
 ssl_crl_file              |                          | postmaster
 ssl_ecdh_curve            | prime256v1               | postmaster
 ssl_key_file              | server.key               | postmaster
 ssl_prefer_server_ciphers | on                       | postmaster
(8 rows)

HTH,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
Re: [HACKERS] pageinspect option to forgo buffer locking?
On Thu, Nov 9, 2017 at 9:49 AM, Andres Freund wrote: > Currently the locking in get_raw_page_internal() prevents that. If it's > an option defaulting to off, I don't see why we couldn't allow that to > skip locking the page's contents. Obviously you can get corrupted > contents that way, but we already allow to pass arbitrary stuff to > heap_page_items(). Since pinning wouldn't be changed, there's no danger > of the page being moved out from under us. +1. I've done things like this before myself. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pageinspect option to forgo buffer locking?
Hi, Occasionally, when debugging issues, I find it quite useful to be able to do a heap_page_items() on a page that's actually locked exclusively concurrently. E.g. investigating the recent multixact vacuuming issues, it was very useful to attach a debugger to one backend to step through freezing, and display the page in another session. Currently the locking in get_raw_page_internal() prevents that. If it's an option defaulting to off, I don't see why we couldn't allow that to skip locking the page's contents. Obviously you can get corrupted contents that way, but we already allow to pass arbitrary stuff to heap_page_items(). Since pinning wouldn't be changed, there's no danger of the page being moved out from under us. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] A hook for session start
On Thu, Nov 9, 2017 at 12:09 AM, Michael Paquier wrote: > > On Thu, Nov 9, 2017 at 2:42 AM, Fabrízio de Royes Mello > wrote: > > On Wed, Nov 8, 2017 at 12:47 AM, Michael Paquier < michael.paqu...@gmail.com> > > wrote: > >> - Let's restrict the logging to a role name instead of a database > >> name, and let's parametrize it with a setting in the temporary > >> configuration file. Let's not bother about multiple role support with > >> a list, for the sake of tests and simplicity only defining one role > >> looks fine to me. Comments in the code should be clear about the > >> dependency. > > > > Makes sense and simplify the test code. Fixed. > > + if (!strcmp(username, "regress_sess_hook_usr2")) > + { > + const char *dbname; > [...] > +++ b/src/test/modules/test_session_hooks/session_hooks.conf > @@ -0,0 +1 @@ > +shared_preload_libraries = 'test_session_hooks' > Don't you think that this should be a GUC? My previous comment > outlined that. I won't fight hard on that point in any case, don't > worry. I just want to make things clear :) > Ooops... my fault... fixed! Thanks again!! 
-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 05c5c19..d3156ad 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -169,6 +169,9 @@ static ProcSignalReason RecoveryConflictReason;
 static MemoryContext row_description_context = NULL;
 static StringInfoData row_description_buf;
 
+/* Hook for plugins to get control at start of session */
+session_start_hook_type session_start_hook = NULL;
+
 /*
  * decls for routines only used in this file
  */
@@ -3857,6 +3860,9 @@ PostgresMain(int argc, char *argv[],
 	if (!IsUnderPostmaster)
 		PgStartTime = GetCurrentTimestamp();
 
+	if (session_start_hook)
+		(*session_start_hook) ();
+
 	/*
 	 * POSTGRES main processing loop begins here
 	 */
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 20f1d27..16ec376 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -76,6 +76,8 @@ static bool ThereIsAtLeastOneRole(void);
 static void process_startup_options(Port *port, bool am_superuser);
 static void process_settings(Oid databaseid, Oid roleid);
 
+/* Hook for plugins to get control at end of session */
+session_end_hook_type session_end_hook = NULL;
 
 /*** InitPostgres support ***/
@@ -1154,6 +1156,10 @@ ShutdownPostgres(int code, Datum arg)
 	 * them explicitly.
 	 */
 	LockReleaseAll(USER_LOCKMETHOD, true);
+
+	/* Hook at session end */
+	if (session_end_hook)
+		(*session_end_hook) ();
 }
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index f8c535c..9f05bfb 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -35,6 +35,13 @@ extern PGDLLIMPORT const char *debug_query_string;
 extern int	max_stack_depth;
 extern int	PostAuthDelay;
 
+/* Hook for plugins to get control at start and end of session */
+typedef void (*session_start_hook_type) (void);
+typedef void (*session_end_hook_type) (void);
+
+extern PGDLLIMPORT session_start_hook_type session_start_hook;
+extern PGDLLIMPORT session_end_hook_type session_end_hook;
+
 /* GUC-configurable parameters */
 
 typedef enum
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index b7ed0af..a3c8c1e 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -11,6 +11,7 @@ SUBDIRS = \
 	snapshot_too_old \
 	test_ddl_deparse \
 	test_extensions \
+	test_session_hooks \
 	test_parser \
 	test_pg_dump \
 	test_rbtree \
diff --git a/src/test/modules/test_session_hooks/.gitignore b/src/test/modules/test_session_hooks/.gitignore
new file mode 100644
index 000..5dcb3ff
--- /dev/null
+++ b/src/test/modules/test_session_hooks/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/test_session_hooks/Makefile b/src/test/modules/test_session_hooks/Makefile
new file mode 100644
index 000..c5c3860
--- /dev/null
+++ b/src/test/modules/test_session_hooks/Makefile
@@ -0,0 +1,21 @@
+# src/test/modules/test_session_hooks/Makefile
+
+MODULES = test_session_hooks
+PGFILEDESC = "test_session_hooks - Test session hooks with an extension"
+
+EXTENSION = test_session_hooks
+DATA = test_session_hooks--1.0.sql
+
+REGRESS = test_session_hooks
+REGRESS_OPTS = --temp-config=$(top_srcdir)/src/test/modules/test_session_hooks/session_hooks.conf
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/
[HACKERS] Aggregates push-down to partitions
There is a huge thread concerning pushing-down aggregates to FDW:

https://www.postgresql.org/message-id/flat/CAFjFpRcnueviDpngJ3QSVvj7oyukr9NkSiCspqd4N%2BdCEdvYvg%40mail.gmail.com#cafjfprcnuevidpngj3qsvvj7oyukr9nksicspqd4n+dcedv...@mail.gmail.com

but as far as I understand nothing is done for efficient calculation of aggregates for a partitioned table. In the case of local partitions it is somehow compensated by the parallel query plan:

postgres=# create table base(x integer);
CREATE TABLE
postgres=# create table derived1() inherits (base);
CREATE TABLE
postgres=# create table derived2() inherits (base);
CREATE TABLE
postgres=# insert into derived1 values (generate_series(1,100));
INSERT 0 100
postgres=# insert into derived2 values (generate_series(1,100));
INSERT 0 100
postgres=# explain select sum(x) from base;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=12176.63..12176.64 rows=1 width=8)
   ->  Gather  (cost=12176.59..12176.61 rows=8 width=8)
         Workers Planned: 8
         ->  Partial Aggregate  (cost=12175.59..12175.60 rows=1 width=8)
               ->  Append  (cost=0.00..11510.47 rows=266048 width=4)
                     ->  Parallel Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
                     ->  Parallel Seq Scan on derived1  (cost=0.00..5675.00 rows=125000 width=4)
                     ->  Parallel Seq Scan on derived2  (cost=0.00..5835.47 rows=141047 width=4)
(8 rows)

It is still far from the ideal plan, because each worker is working with all partitions, instead of splitting partitions between workers and calculating partial aggregates for each partition.
But if we add an FDW as a child of the parent table, then parallel scan can not be used and we get the worst possible plan:

postgres=# create foreign table derived_fdw() inherits(base) server pg_fdw options (table_name 'derived1');
CREATE FOREIGN TABLE
postgres=# explain select sum(x) from base;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Aggregate  (cost=34055.07..34055.08 rows=1 width=8)
   ->  Append  (cost=0.00..29047.75 rows=2002926 width=4)
         ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=4)
         ->  Seq Scan on derived1  (cost=0.00..14425.00 rows=100 width=4)
         ->  Seq Scan on derived2  (cost=0.00..14425.00 rows=100 width=4)
         ->  Foreign Scan on derived_fdw  (cost=100.00..197.75 rows=2925 width=4)
(6 rows)

So we sequentially pull all data to this node and compute the aggregates locally. The ideal plan would calculate partial aggregates in parallel at all nodes and then combine the partial results. It requires two changes:

1. Replace Aggregate->Append with Finalize_Aggregate->Append->Partial_Aggregate
2. Concurrent execution of Append.

It can also be done in two different ways: we can try to use the existing parallel-worker infrastructure and replace Append with Gather. That seems to be the best approach for local partitioning. In the case of remote (FDW) partitions, it is enough to separate starting the execution (PQsendQuery in postgres_fdw) from fetching the results, so it requires some changes in the FDW protocol.

I wonder if somebody has already investigated this problem or is working in this direction. Maybe there are already some patches proposed? I have searched the hackers archive, but didn't find anything relevant... Are there any suggestions about the best approach to implement this feature?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= writes: > Em qui, 9 de nov de 2017 às 06:15, Feike Steenbergen < > feikesteenber...@gmail.com> escreveu: >> Attached a patch that ensures the header of postgresql.auto.conf is >> consistent, whether created by initdb or recreated when ALTER SYSTEM >> is issued. > Interesting... IMHO this typo should be backpatched to 9.4 when ALTER > SYSTEM was introduced. Agreed, and done. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
Alvaro Herrera writes:
> I think odd coding this was introduced recently because of the
> pg_resetxlog -> pg_resetwal renaming.

Dunno about that, but certainly somebody fat-fingered a refactoring there. The 9.6 code looks quite different but doesn't seem to be doing anything silly.

regards, tom lane
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
a.akent...@postgrespro.ru writes: > I've came across a weird bit in pg_upgrade/exec.c > We have a function check_bin_dir() which goes like this (old_cluster and > new_cluster are global variables): > void check_bin_dir(ClusterInfo *cluster) > { > ... > get_bin_version(&old_cluster); > get_bin_version(&new_cluster); > ... > } > This function has two calls: > check_bin_dir(&old_cluster); > check_bin_dir(&new_cluster); > I'd like to substitute these last two lines with this: > get_bin_version(cluster); Yeah, the way it is now seems outright broken. It will try to do get_bin_version on the new cluster before having done validate_exec there, violating its own comment. I think we should change this as a bug fix, independently of whatever else you had in mind to do here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A weird bit in pg_upgrade/exec.c
a.akent...@postgrespro.ru wrote: > This function has two calls: > check_bin_dir(&old_cluster); > check_bin_dir(&new_cluster); > > I'd like to substitute these last two lines with this: > get_bin_version(cluster); Odd indeed. One would think that if a cluster variable is passed as parameter, the global vars should not be used. +1 for fixing it, and your proposal sounds as good as any. > Doing it would simplify the patch I'm writing, but I'm worried I might break > something that's been there for a long time and has been working fine. I think odd coding this was introduced recently because of the pg_resetxlog -> pg_resetwal renaming. -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Pg V10: Patch for bug in bonjour support
Thomas Munro writes: > On Thu, Nov 9, 2017 at 6:27 PM, Tom Lane wrote: >> Is there really much interest in Bonjour support on non-macOS platforms? >> I hadn't heard that anybody but Apple was invested in it. > Not from me. My only interest here was to pipe up because I knew that > what was originally proposed would have broken stuff on macOS, and > after that piqued curiosity. I won't mind at all if you revert the > commit to prevent confusion. If the intersection of FreeBSD, > PostgreSQL and Bonjour users is a non-empty set, [s]he might at least > find this archived discussion useful... Not hearing anyone else speaking up for this, I'll go revert the configure change, and instead put in a comment pointing out that Avahi support would require a lot more than an extra -l switch. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
Robert Haas writes: > No, that's not right. Now that you mention it, I realize that tuple > locks can definitely cause deadlocks. Example: Yeah. Foreign-key-related tuple locks are another rich source of examples. > ... So I don't > think we can remove speculative insertion locks from the deadlock > detector either. That scares me too. I think that relation extension can safely be transferred to some lower-level mechanism, because what has to be done while holding the lock is circumscribed and below the level of database operations (which might need other locks). These other ideas seem a lot riskier. (But see recent conversation where I discouraged Alvaro from holding extension locks across BRIN summarization activity. We'll need to look and make sure that nobody else has had creative ideas like that.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Runtime Partition Pruning
On Thu, Nov 9, 2017 at 6:18 AM, Beena Emerson wrote:
> The code still chooses the custom plan instead of the generic plan for
> the prepared statements. I am working on it.

I don't think it's really the job of this patch to do anything about that problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager
On Wed, Nov 8, 2017 at 9:40 AM, Masahiko Sawada wrote:
> Speaking of the acquiring these four lock types and heavy weight lock,
> there obviously is a call path to acquire any of four lock types while
> holding a heavy weight lock. In reverse, there also is a call path
> that we acquire a heavy weight lock while holding any of four lock
> types. The call path I found is that in heap_delete we acquire a tuple
> lock and call XactLockTableWait or MultiXactIdWait which eventually
> could acquire LOCKTAG_TRANSACTION in order to wait for the concurrent
> transactions finish. But IIUC since these functions acquire the lock
> for the concurrent transaction's transaction id, deadlocks doesn't
> happen.

No, that's not right. Now that you mention it, I realize that tuple locks can definitely cause deadlocks. Example:

setup:

rhaas=# create table foo (a int, b text);
CREATE TABLE
rhaas=# create table bar (a int, b text);
CREATE TABLE
rhaas=# insert into foo values (1, 'hoge');
INSERT 0 1

session 1:

rhaas=# begin;
BEGIN
rhaas=# update foo set b = 'hogehoge' where a = 1;
UPDATE 1

session 2:

rhaas=# begin;
BEGIN
rhaas=# update foo set b = 'quux' where a = 1;

session 3:

rhaas=# begin;
BEGIN
rhaas=# lock bar;
LOCK TABLE
rhaas=# update foo set b = 'blarfle' where a = 1;

back to session 1:

rhaas=# select * from bar;
ERROR:  deadlock detected
LINE 1: select * from bar;
                      ^
DETAIL:  Process 88868 waits for AccessShareLock on relation 16391 of database 16384; blocked by process 88845.
Process 88845 waits for ExclusiveLock on tuple (0,1) of relation 16385 of database 16384; blocked by process 88840.
Process 88840 waits for ShareLock on transaction 1193; blocked by process 88868.
HINT:  See server log for query details.

So what I said before was wrong: we definitely cannot exclude tuple locks from deadlock detection.
However, we might be able to handle the problem in another way: introduce a separate, parallel-query specific mechanism to avoid having two participants try to update and/or delete the same tuple at the same time - e.g. advertise the BufferTag + offset within the page in DSM, and if somebody else already has that same combination advertised, wait until they no longer do. That shouldn't ever deadlock, because the other worker shouldn't be able to find itself waiting for us while it's busy updating a tuple. After some further study, speculative insertion locks look problematic too. I'm worried about the code path ExecInsert() [taking speculative insertion locking] -> heap_insert -> heap_prepare_insert -> toast_insert_or_update -> toast_save_datum -> heap_open(rel->rd_rel->reltoastrelid, RowExclusiveLock). That sure looks like we can end up waiting for a relation lock while holding a speculative insertion lock, which seems to mean that speculative insertion locks are subject to at least theoretical deadlock hazards as well. Note that even if we were guaranteed to be holding the lock on the toast relation already at this point, it wouldn't fix the problem, because we might still have to build or refresh a relcache entry at this point, which could end up scanning (and thus locking) system catalogs. Any syscache lookup can theoretically take a lock, even though most of the time it doesn't, and thus taking a lock that has been removed from the deadlock detector (or, say, an lwlock) and then performing a syscache lookup with it held is not OK. So I don't think we can remove speculative insertion locks from the deadlock detector either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] A weird bit in pg_upgrade/exec.c
Hello!

I've come across a weird bit in pg_upgrade/exec.c. We have a function check_bin_dir() which goes like this (old_cluster and new_cluster are global variables):

void
check_bin_dir(ClusterInfo *cluster)
{
    ...
    get_bin_version(&old_cluster);
    get_bin_version(&new_cluster);
    ...
}

This function has two calls:

check_bin_dir(&old_cluster);
check_bin_dir(&new_cluster);

I'd like to substitute these last two lines with this:

get_bin_version(cluster);

Doing it would simplify the patch I'm writing, but I'm worried I might break something that's been there for a long time and has been working fine. Is there maybe a reason for it to be this way that I don't happen to understand?

Also, there's the exact same situation with the get_bin_version() function in the same file.
[HACKERS] Reorder header files in alphabetical order
Hi, Attached is a patch to reorder header files in joinrels.c and pathnode.c in alphabetical order, removing unnecessary ones. Best regards, Etsuro Fujita *** a/src/backend/optimizer/path/joinrels.c --- b/src/backend/optimizer/path/joinrels.c *** *** 16,30 #include "miscadmin.h" #include "catalog/partition.h" - #include "nodes/relation.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/prep.h" - #include "optimizer/cost.h" - #include "utils/memutils.h" #include "utils/lsyscache.h" static void make_rels_by_clause_joins(PlannerInfo *root, --- 16,28 #include "miscadmin.h" #include "catalog/partition.h" #include "optimizer/clauses.h" #include "optimizer/joininfo.h" #include "optimizer/pathnode.h" #include "optimizer/paths.h" #include "optimizer/prep.h" #include "utils/lsyscache.h" + #include "utils/memutils.h" static void make_rels_by_clause_joins(PlannerInfo *root, *** a/src/backend/optimizer/util/pathnode.c --- b/src/backend/optimizer/util/pathnode.c *** *** 17,24 #include #include "miscadmin.h" ! #include "nodes/nodeFuncs.h" #include "nodes/extensible.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/pathnode.h" --- 17,25 #include #include "miscadmin.h" ! #include "foreign/fdwapi.h" #include "nodes/extensible.h" + #include "nodes/nodeFuncs.h" #include "optimizer/clauses.h" #include "optimizer/cost.h" #include "optimizer/pathnode.h" *** *** 29,35 #include "optimizer/tlist.h" #include "optimizer/var.h" #include "parser/parsetree.h" - #include "foreign/fdwapi.h" #include "utils/lsyscache.h" #include "utils/memutils.h" #include "utils/selfuncs.h" --- 30,35 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup --progress output for batch execution
Hello,

On Sun, Oct 01, 2017 at 04:49:17PM -0300, Martin Marques wrote:
> Updated patch with documentation of the new option.

I have checked the patch. The patch applies and compiles correctly without any errors. Tests passed. The documentation doesn't have errors either.

I have a little suggestion. Maybe insert the new line without any additional parameters? We can check that stderr is not a terminal using isatty(). The code could become:

    if (!isatty(fileno(stderr)))
        fprintf(stderr, "\n");
    else
        fprintf(stderr, "\r");

Also it could be good to not insert a new line after progress:

    if (showprogress)
    {
        progress_report(PQntuples(res), NULL, true);
        /* if (!batchmode) */
        /* or */
        if (isatty(fileno(stderr)))
            fprintf(stderr, "\n"); /* Need to move to next line */
    }

Otherwise we have an extra empty line:

pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/1C28 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_19635"
     0/183345 kB (0%), 0/1 tablespace (data_repl/backup_label )
183355/183355 kB (100%), 0/1 tablespace (data_repl/global/pg_control)
183355/183355 kB (100%), 1/1 tablespace

pg_basebackup: write-ahead log end point: 0/1CF8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
Re: [HACKERS] Jsonb transform for pl/python
The following review has been posted through the commitfest application:

make installcheck-world:  tested, failed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hello Anthony,

Great job! I decided to take a closer look on your patch. Here are some defects I discovered.

> +    Additional extensions are available that implement transforms for
> +    the jsonb type for the language PL/Python. The
> +    extensions for PL/Perl are called

1. The part regarding PL/Perl is obviously from another patch.

2. jsonb_plpython2u and jsonb_plpythonu are marked as relocatable, while jsonb_plpython3u is not. Is it a mistake? Anyway if an extension is relocatable there should be a test that checks this.

3. Not all json types are test-covered. Tests for 'true' :: jsonb, '3.14' :: jsonb and 'null' :: jsonb are missing.

4. jsonb_plpython.c:133 - "Iterate trhrough Jsonb object." Typo, it should be "through" or probably even "over".

5. It looks like you've implemented transform in two directions Python <-> JSONB, however I see tests only for the Python <- JSONB case.

6. Tests passed on Python 2.7.14 but failed on 3.6.2:

> CREATE EXTENSION jsonb_plpython3u CASCADE;
> + ERROR:  could not access file "$libdir/jsonb_plpython3": No such file or directory

module_pathname in jsonb_plpython3u.control should be $libdir/jsonb_plpython3u, not $libdir/jsonb_plpython3.

Tested on Arch Linux x64, GCC 7.2.0.

The new status of this patch is: Waiting on Author
[HACKERS] libpq connection strings: control over the cipher suites?
Hi all, According to the docs at https://www.postgresql.org/docs/9.5/static/libpq-connect.html#LIBPQ-CONNSTRING there are various parameters that control ssl from the client side, including providing the ssl certs, keys, etc. Is there a parameter or mechanism for setting the required ssl cipher list from the client side? Regards, Graham
Re: [HACKERS] Runtime Partition Pruning
Hello all, Here is the updated patch, rebased over v10 of Amit Langote's "path toward faster partition pruning" patch [1]. It modifies the PartScanKeyInfo struct to hold expressions which are then evaluated by the executor to fetch the correct partitions. The code still chooses the custom plan instead of the generic plan for prepared statements; I am working on it. The following output is after adding a hack in the code forcing selection of the generic plan.

postgres=# EXPLAIN EXECUTE prstmt_select(7);
                           QUERY PLAN
----------------------------------------------------------------
 Append (cost=0.00..1732.25 rows=2 width=8)
   ->  Seq Scan on tprt_1 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_2 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
(5 rows)

postgres=# EXPLAIN EXECUTE prstmt_select(20);
                           QUERY PLAN
----------------------------------------------------------------
 Append (cost=0.00..1732.25 rows=3 width=8)
   ->  Seq Scan on tprt_1 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_2 (cost=0.00..847.00 rows=16667 width=8)
         Filter: ($1 > col1)
   ->  Seq Scan on tprt_3 (cost=0.00..38.25 rows=753 width=8)
         Filter: ($1 > col1)
(7 rows)

[1] https://www.postgresql.org/message-id/b8094e71-2c73-ed8e-d8c3-53f232c8c049%40lab.ntt.co.jp

Tested on commit: 9b9cb3c4534d717c1c95758670198ebbf8a20af2

-- Beena Emerson EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

0001-Implement-runtime-partiton-pruning.patch Description: Binary data
Re: [HACKERS] path toward faster partition pruning
On Mon, Nov 6, 2017 at 3:31 PM, Amit Langote wrote:
> On 2017/11/06 14:32, David Rowley wrote:
>> On 6 November 2017 at 17:30, Amit Langote wrote:
>>> On 2017/11/03 13:32, David Rowley wrote: On 31 October 2017 at 21:43, Amit Langote wrote: []
>
> Attached updated set of patches, including the fix to make the new pruning
> code handle Boolean partitioning.

I am getting the following warnings on Mac OS X:

partition.c:1800:24: warning: comparison of constant -1 with expression of type 'NullTestType' (aka 'enum NullTestType') is always false [-Wtautological-constant-out-of-range-compare]
 if (keynullness[i] == -1)
partition.c:1932:25: warning: comparison of constant -1 with expression of type 'NullTestType' (aka 'enum NullTestType') is always false [-Wtautological-constant-out-of-range-compare]
 if (keynullness[i] == -1)
2 warnings generated.

Comment for the 0004 patch:

270 + /* -1 represents an invalid value of NullTestType. */
271 + memset(keynullness, -1, PARTITION_MAX_KEYS * sizeof(NullTestType));

I think we should not use memset to set a value other than 0 or true/false. This will work for -1 on systems where values are stored in two's complement, but I am afraid of other architectures.

Regards, Amul
Re: [HACKERS] Fix bloom WAL tap test
On Wed, Nov 8, 2017 at 5:46 AM, Masahiko Sawada wrote:
> > So I think that you should instead do something like that:
> >
> > --- a/contrib/bloom/Makefile
> > +++ b/contrib/bloom/Makefile
> > @@ -20,5 +20,12 @@ include $(top_builddir)/src/Makefile.global
> >  include $(top_srcdir)/contrib/contrib-global.mk
> >  endif
> >
> > +installcheck: wal-installcheck
> > +
> > +check: wal-check
> > +
> > +wal-installcheck:
> > +	$(prove_installcheck)
> > +
> >  wal-check: temp-install
> >  	$(prove_check)
> >
> > This works for me as I would expect it should.
>
> Looks good to me too.

OK, then so be it :)

-- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

wal-check-on-bloom-check-3.patch Description: Binary data
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
On Thu, Nov 9, 2017 at 6:25 PM, Fabrízio de Royes Mello wrote:
> Interesting... IMHO this typo should be backpatched to 9.4 when ALTER SYSTEM
> was introduced.

Yeah, that's harmless. -- Michael
Re: [HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
On Thu, 9 Nov 2017 at 06:15, Feike Steenbergen <feikesteenber...@gmail.com> wrote:
> Attached a patch that ensures the header of postgresql.auto.conf is
> consistent, whether created by initdb or recreated when ALTER SYSTEM
> is issued.
>
> The tiny difference caused some false-positives on our configuration
> management identifying changes, which was enough of an itch for me to
> scratch.

Interesting... IMHO this typo should be backpatched to 9.4, when ALTER SYSTEM was introduced.

Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Timbira: http://www.timbira.com.br >> Blog: http://fabriziomello.github.io >> Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello >> Github: http://github.com/fabriziomello
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
Hi 2017-11-06 14:00 GMT+01:00 Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp>: > Thank you for the new patch. > > - The latest patch is missing xpath_parser.h at least since > ns-3. That of the first (not-numbered) version was still > usable. > > - c29c578 conflicts on doc/src/sgml/func.sgml > > > At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule > wrote in b9efon...@mail.gmail.com> > > 2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI < > > horiguchi.kyot...@lab.ntt.co.jp>: > > > > > Hi, thanks for the new patch. > > > > > > # The patch is missing xpath_parser.h. That of the first patch was > usable. > > > > > > At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule < > pavel.steh...@gmail.com> > > > wrote in > > mail.gmail.com> > > > > Hi > > > > > > > > now xpath and xpath_exists supports default namespace too > > > > > > At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule < > pavel.steh...@gmail.com> > > > wrote in > > gmail.com> > > > > > 1. Uniformity among simliar features > > > > > > > > > > As mentioned in the proposal, but it is lack of uniformity that > > > > > the xpath transformer is applied only to xmltable and not for > > > > > other xpath related functions. > > > > > > > > > > > > > I have to fix the XPath function. The SQL/XML function Xmlexists > doesn't > > > > support namespaces/ > > > > > > Sorry, I forgot to care about that. (And the definition of > > > namespace array is of course fabricated by me). I'd like to leave > > > this to committers. Anyway it is working but the syntax (or > > > whether it is acceptable) is still arguable. > > > > > > SELECT xpath('/a/text()', 'http://example.com";> > > > test', > > > ARRAY[ARRAY['', 'http://example.com']]); > > > | xpath > > > | > > > | {test} > > > | (1 row) > > > > > > > > > The internal name is properly rejected, but the current internal > > > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We > > > are preserving some short names and reject them as > > > user-defined. 
Doesn't just 'pgsqlxml' work? > > > > > > > > > Default namespace correctly become to be applied on bare > > > attribute names. > > > > > > > updated doc, > > > > fixed all variants of expected result test file > > > > > > Sorry for one by one comment but I found another misbehavior. > > > > > > create table t1 (id int, doc xml); > > > insert into t1 > > >values > > >(5, 'http://x.y";>50 > > rows>'); > > > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), > > > '/x:rows/x:row' passing t1.doc columns data int PATH > > > 'child::x:a[1][attribute::hoge="haha"]') as x; > > > | data > > > | -- > > > |50 > > > > > > but the following fails. > > > > > > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), > > > '/rows/row' passing t1.doc columns data int PATH > > > 'child::a[1][attribute::hoge="haha"]') as x; > > > | data > > > | -- > > > | > > > | (1 row) > > > > > > Perhaps child::a is not prefixed by the transformation. > > > > > > > the problem was in unwanted attribute modification. The parser didn't > > detect "attribute::hoge" as attribute. Updated parser does it. I reduce > > duplicated code there more. > > It worked as expected. But the comparison of "attribute" is > missing t1.length = 9 so the following expression wrongly passes. > > child::a[1][attributeabcdefg::hoge="haha" > > It is confusing that is_qual_name becomes true when t2 is not a > "qual name", and the way it treats a double-colon is hard to > understand. > > It essentially does inserting the default namespace before > unqualified non-attribute name. I believe we can easily > look-ahead to detect a double colon and it would make things > simpler. Could you consider something like the attached patch? > (applies on top of ns-4 patch.) > > > > XPath might be complex enough so that it's worth switching to > > > yacc/lex based transformer that is formally verifiable and won't > > > need a bunch of cryptic tests that finally cannot prove the > > > completeness. 
synchronous_standy_names is far simpler than XPath > > > but using yacc/lex parser. > > > > > > > > > Anyway the following is nitpicking of the current xpath_parser.c. > > > > > > - NODENAME_FIRSTCHAR allows '-' as the first char but it is > > > excluded from NameStartChar (https://www.w3.org/TR/REC- > > > xml/#NT-NameStartChar) > > > I think characters with high-bit set is okay. > > > Also IS_NODENAME_CHAR should be changed. > > > > > > > fixed > > > > > > > - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category > > > but have different naming schemes. Can these are named in the same > way? > > > > > > > fixed > > > > > > > - The current transoformer seems to using up to one token stack > > > depth. Maybe the stack is needless. (pushed token is always > > > popped just after) > > > > > > > fixed > > Thank you. > > I found another (and should be the last, so sorry..) functional > defect in this. This doesn't add default namespace if the ta
Re: [HACKERS] Proposal: Improve bitmap costing for lossy pages
Hi Dilip,

v6 patch:

42 + /*
43 +  * Estimate number of hashtable entries we can have within maxbytes. This
44 +  * estimates the hash cost as sizeof(PagetableEntry).
45 +  */
46 + nbuckets = maxbytes /
47 +     (sizeof(PagetableEntry) + sizeof(Pointer) + sizeof(Pointer));

It took me a little while to understand this calculation. You have moved this code from tbm_create(), but I think you should move the following comment as well:

tidbitmap.c:
276 /*
277  * Estimate number of hashtable entries we can have within maxbytes. This
278  * estimates the hash cost as sizeof(PagetableEntry), which is good enough
279  * for our purpose. Also count an extra Pointer per entry for the arrays
280  * created during iteration readout.
281  */

Regards, Amul
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Nov 8, 2017 at 6:48 PM, Robert Haas wrote:
> On Wed, Nov 8, 2017 at 7:26 AM, Amit Kapila wrote:
>> We do want to generate it later when there isn't inheritance involved,
>> but only if there is a single rel involved (simple_rel_array_size
>> <=2). The rule is something like this, we will generate the gather
>> paths at this stage only if there are more than two rels involved and
>> there isn't inheritance involved.
>
> Why is that the correct rule?
>
> Sorry if I'm being dense here. I would have thought we'd want to skip
> it for the topmost scan/join rel regardless of the presence or absence
> of inheritance.

I think I understood your concern after some offlist discussion: it is primarily due to the inheritance-related check, which can skip the generation of gather paths when it shouldn't. So what might fit better here is a straight check on the number of base rels, which allows generating a gather path in set_rel_pathlist if there are multiple baserels involved. I have used all_baserels, which I think will work better for this purpose.

-- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

parallel_paths_include_tlist_cost_v6.patch Description: Binary data
Re: [HACKERS] Restricting maximum keep segments by repslots
Oops! The previous patch forgets the default case and crashes.

At Wed, 08 Nov 2017 13:14:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20171108.131431.170534842.horiguchi.kyot...@lab.ntt.co.jp>
> > I don't think 'distance' is a good metric - that's going to continually
> > change. Why not store the LSN that's available and provide a function
> > that computes this? Or just rely on the lsn - lsn operator?
>
> It seems reasonable. The 'secured minimum LSN' is common among
> all slots so showing it in the view may look a bit stupid, but I
> don't find another suitable place for it. distance = 0 meant the
> state that the slot is living but insecured in the previous patch,
> and that information is lost by changing 'distance' to
> 'min_secure_lsn'.
>
> Thus I changed the 'live' column to 'status' and show that status
> in text representation.
>
> status: secured | insecured | broken
>
> So this looks like the following (max_slot_wal_keep_size = 8MB,
> which is a half of the default segment size)
>
> -- slots whose required WAL is surely available
> select restart_lsn, status, min_secure_lsn, pg_current_wal_lsn() from pg_replication_slots;
>  restart_lsn | status  | min_recure_lsn | pg_current_wal_lsn
> -------------+---------+----------------+--------------------
>  0/1A60      | secured | 0/1A00         | 0/1B42BC78
>
> -- slots whose required WAL is still available but insecured
>  restart_lsn | status    | min_recure_lsn | pg_current_wal_lsn
> -------------+-----------+----------------+--------------------
>  0/1A60      | insecured | 0/1C00         | 0/1D76C948
>
> -- slots whose required WAL is lost
> # We should have seen the log 'Some replication slots have lost...'
>
>  restart_lsn | status | min_recure_lsn | pg_current_wal_lsn
> -------------+--------+----------------+--------------------
>  0/1A60      | broken | 0/1C00         | 0/1D76C9F0
>
> I noticed that I had abandoned the segment fragment of
> max_slot_wal_keep_size in the calculations in the routines. The
> current patch honors the fragment part of max_slot_wal_keep_size.

I changed IsLsnStillAvailable to return meaningful values regardless of whether max_slot_wal_keep_size is set or not.
# I had been forgetting to count the version for the latest several
# patches. I give the version '4' - as the next of the last
# numbered patch.

-- Kyotaro Horiguchi NTT Open Source Software Center

>From 109f056e257aba70dddc8d466767ed0a1db371e2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi
Date: Tue, 28 Feb 2017 11:39:48 +0900
Subject: [PATCH 1/2] Add WAL releaf vent for replication slots

Adds a capability to limit the number of segments kept by replication
slots by a GUC variable.
---
 src/backend/access/transam/xlog.c             | 39 +++
 src/backend/utils/misc/guc.c                  | 11
 src/backend/utils/misc/postgresql.conf.sample |  1 +
 src/include/access/xlog.h                     |  1 +
 4 files changed, 52 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index dd028a1..cfdae39 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -105,6 +105,7 @@
 int wal_level = WAL_LEVEL_MINIMAL;
 int CommitDelay = 0;	/* precommit delay in microseconds */
 int CommitSiblings = 5;	/* # concurrent xacts needed to sleep */
 int wal_retrieve_retry_interval = 5000;
+int max_slot_wal_keep_size_mb = 0;

 #ifdef WAL_DEBUG
 bool XLOG_DEBUG = false;
@@ -9432,9 +9433,47 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 	if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
 	{
 		XLogSegNo	slotSegNo;
+		int			slotlimitsegs;
+		uint64		recptroff;
+		uint64		slotlimitbytes;
+		uint64		slotlimitfragment;
+
+		recptroff = XLogSegmentOffset(recptr, wal_segment_size);
+		slotlimitbytes = 1024 * 1024 * max_slot_wal_keep_size_mb;
+		slotlimitfragment = XLogSegmentOffset(slotlimitbytes,
+											  wal_segment_size);
+
+		/* calculate segments to keep by max_slot_wal_keep_size_mb */
+		slotlimitsegs = ConvertToXSegs(max_slot_wal_keep_size_mb,
+									   wal_segment_size);
+		/* honor the fragment */
+		if (recptroff < slotlimitfragment)
+			slotlimitsegs++;

 		XLByteToSeg(keep, slotSegNo, wal_segment_size);

+		/*
+		 * ignore slots if too many wal segments are kept.
+		 * max_slot_wal_keep_size is just accumulated on wal_keep_segments.
+		 */
+		if (max_slot_wal_keep_size_mb > 0 && slotSegNo + slotlimitsegs < segno)
+		{
+			segno = segno - slotlimitsegs; /* must be positive */
+
+			/*
+			 * warn only if the checkpoint flushes the required segment.
+			 * we assume here that *logSegNo is calculated keep location.
+			 */
+			if (slotSegNo < *logSegNo)
+				ereport(WARNING,
+						(errmsg("restart LSN of replication slots is ignored by checkpoint"),
+						 errdetail("Some replication slots have lost required WAL segments to continue by up to %ld segments.",
+								   (segno < *logSegNo ? segno : *logSegNo) - slotS
Re: [HACKERS] [PATCH] Improve geometric types
Hello,

> I'd like to put comments on 0001 and 0004 only now: ...

I don't have a comment on 0002. About 0003:

> @@ -4487,21 +4486,21 @@ circle_in(PG_FUNCTION_ARGS)
> ...
> circle->radius = single_decode(s, &s, "circle", str);
> - if (circle->radius < 0)
> + if (float8_lt(circle->radius, 0.0))
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),

float8_lt and its family of functions are provided to unify the sorting order including NaN. NaN is not rejected by the use of float8_lt in this case, but that is not what the function is expected to be used for. If we wanted to check whether the value is positive, it would unexpectedly throw an exception. (I suppose that NaNs should be silently ignored rather than stopping a query by throwing an exception.) In addition, I don't think it proper to apply EPSILON(!) there. The comparison should be strict regardless of whether EPSILON is considered or not.

Similarly, in circle_overlap for example, float8_le is used to compare the distance and the summed radius. NaN causes a problem in another place.

> PG_RETURN_BOOL(FPle(point_dt(&circle1->center, &circle2->center),
> float8_pl(circle1->radius, circle2->radius)));

If the distance is NaN and the summed radius is not, the function returns true. I think that a reasonable behavior is that an object containing NaN cannot make any meaningful relationship with other objects, as floating-point numbers themselves behave that way. (Any comparison other than != with NaN always returns false.) Just using another series of comparison functions that return false for NaN-containing comparisons is not a solution, since the meaning of the returned false differs by context, just the same as the first problem above. For example, in the fictitious functions below,

| bool circle_overlaps()
|   ret = FPle(distance, radius_sum);

this gives correct results, but

| bool circle_not_overlaps()
|   ret = FPgt(distance, radius_sum);

this gives a wrong result for NaN-containing objects.
Perhaps it is enough to explicitly define behaviors for NaN before comparison. circle_overlap():

> distance = point_dt();
> radius_sum = float8_pl(...);
>
> /* NaN-containing objects don't overlap any other objects */
> if (isnan(distance) || isnan(radius_sum))
>     PG_RETURN_BOOL(false);
>
> /* NaN ordering of FPle() doesn't get into mischief here */
> PG_RETURN_BOOL(FPle(distance, radius_sum));

(End of the comment on 0003)

regards, -- Kyotaro Horiguchi NTT Open Source Software Center
[HACKERS] (spelling) Ensure header of postgresql.auto.conf is consistent
Attached a patch that ensures the header of postgresql.auto.conf is consistent, whether created by initdb or recreated when ALTER SYSTEM is issued. The tiny difference caused some false-positives on our configuration management identifying changes, which was enough of an itch for me to scratch. regards, Feike Steenbergen 0001-Make-header-of-postgresql.conf.auto-consistent.patch Description: Binary data