Re: [HACKERS] security labels on databases are bad for dump restore
On Thu, Jul 30, 2015 at 10:37:33PM -0400, Adam Brightwell wrote:
> On Thu, Jul 30, 2015 at 02:26:34PM -0400, Robert Haas wrote:
> > > 1. pg_dumpall -g
> > > 2. pg_dump --create per database
> >
> > Gah, OK, I see your point.  But we better document this, because if
> > you need a PhD in PostgreSQL-ology to take a backup, we're not in a
> > good place.
>
> Definitely. Agreed. Though, honestly, I find this to be a cumbersome
> approach. I think it just makes things more confusing, even if it is
> well documented. Perhaps it might be necessary as a bridge to get to a
> better place. But my first question as an end user would be, 'why
> can't one tool do this?'.

pg_dumpall (without -g) is that one tool. It has excellent dump
fidelity. It lacks the slicing and dump format options of pg_dump,
which are important to many sites.

> Also, by using 'pg_dumpall -g' aren't you potentially getting things
> that you don't want/need/care about? For instance, if database 'foo'
> is owned by 'user1' and database 'bar' is owned by 'user2' and neither
> have any knowledge/relation of/to the other, then when I dump 'foo',
> in this manner, wouldn't I also be including 'user2'? Said
> differently, a restore of a 'foo'-only dump would also include a
> 'bar'-related role. That seems like a bad idea, IMHO. Maybe it can't
> be avoided, but I'd expect that only relevant information for the
> database being dumped would be included.

Nothing in core PostgreSQL attempts to answer the high-level question
"Is user1 relevant to database bar?"; PostgreSQL has no one concept of
a role's relevance to databases. Some reasonable heuristics come to
mind, but nothing I'd be eager to memorialize in a core tool like
pg_dumpall.

Thanks,
nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Cleaning up missing ERRCODE assignments
I believe we have a project policy that all user-facing error reports
should go through ereport, not elog (so that they're translatable), and
should not have ERRCODE_INTERNAL_ERROR as SQLSTATE. It's sometimes
debatable where the line is between user-facing and not, but surely any
error that is triggered by the standard regression tests has to be
considered user-facing. Yet running the tests with log_error_verbosity
set to verbose turns up quite a few XX000 log entries.

The attached patch fixes all the cases exposed by the regression tests.
I also went through each affected file and adjusted any other elog or
ereport calls that seemed to need it, on the theory that such oversights
probably travel in herds. I don't pretend that this is a complete fix,
but it's at least a down payment on the problem.

Probably the main thing that's worth discussing here is what to do in
plperl/plpython/pltcl when reporting an error thrown from the respective
language interpreter; in most cases we don't have a clear idea what
SQLSTATE should be used, but that doesn't make ERRCODE_INTERNAL_ERROR an
acceptable choice. I used ERRCODE_EXTERNAL_ROUTINE_EXCEPTION where there
was not an obviously better candidate, but I wonder if anyone has a
different idea?

I propose to back-patch this into 9.5, but not further; it's not an
important enough issue to justify changing SQLSTATE behavior in stable
branches.

			regards, tom lane

diff --git a/contrib/tablefunc/tablefunc.c b/contrib/tablefunc/tablefunc.c
index 8a95d47..cb1d029 100644
*** a/contrib/tablefunc/tablefunc.c
--- b/contrib/tablefunc/tablefunc.c
*** crosstab(PG_FUNCTION_ARGS)
*** 432,438 ****
  			break;
  		default:
  			/* result type isn't composite */
! 			elog(ERROR, "return type must be a row type");
  			break;
  	}
--- 432,440 ----
  			break;
  		default:
  			/* result type isn't composite */
! 			ereport(ERROR,
! 					(errcode(ERRCODE_DATATYPE_MISMATCH),
! 					 errmsg("return type must be a row type")));
  			break;
  	}
*** build_tuplestore_recursively(char *key_f
*** 1350,1356 ****
  		appendStringInfo(chk_current_key, "%s%s%s",
  						 branch_delim, current_key, branch_delim);
  		if (strstr(chk_branchstr.data, chk_current_key.data))
! 			elog(ERROR, "infinite recursion detected");
  	}

  	/* OK, extend the branch */
--- 1352,1360 ----
  		appendStringInfo(chk_current_key, "%s%s%s",
  						 branch_delim, current_key, branch_delim);
  		if (strstr(chk_branchstr.data, chk_current_key.data))
! 			ereport(ERROR,
! 					(errcode(ERRCODE_INVALID_RECURSION),
! 					 errmsg("infinite recursion detected")));
  	}

  	/* OK, extend the branch */
*** validateConnectbyTupleDesc(TupleDesc tup
*** 1429,1435 ****
  	{
  		if (tupdesc->natts != (CONNECTBY_NCOLS + serial_column))
  			ereport(ERROR,
! 					(errcode(ERRCODE_SYNTAX_ERROR),
  					 errmsg("invalid return type"),
  					 errdetail("Query-specified return tuple has "
  							   "wrong number of columns.")));
--- 1433,1439 ----
  	{
  		if (tupdesc->natts != (CONNECTBY_NCOLS + serial_column))
  			ereport(ERROR,
! 					(errcode(ERRCODE_DATATYPE_MISMATCH),
  					 errmsg("invalid return type"),
  					 errdetail("Query-specified return tuple has "
  							   "wrong number of columns.")));
*** validateConnectbyTupleDesc(TupleDesc tup
*** 1438,1444 ****
  	{
  		if (tupdesc->natts != CONNECTBY_NCOLS_NOBRANCH + serial_column)
  			ereport(ERROR,
! 					(errcode(ERRCODE_SYNTAX_ERROR),
  					 errmsg("invalid return type"),
  					 errdetail("Query-specified return tuple has "
  							   "wrong number of columns.")));
--- 1442,1448 ----
  	{
  		if (tupdesc->natts != CONNECTBY_NCOLS_NOBRANCH + serial_column)
  			ereport(ERROR,
! 					(errcode(ERRCODE_DATATYPE_MISMATCH),
  					 errmsg("invalid return type"),
  					 errdetail("Query-specified return tuple has "
  							   "wrong number of columns.")));
*** validateConnectbyTupleDesc(TupleDesc tup
*** 1447,1460 ****
  	/* check that the types of the first two columns match */
  	if (tupdesc->attrs[0]->atttypid != tupdesc->attrs[1]->atttypid)
  		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
  				 errmsg("invalid return type"),
  				 errdetail("First two columns must be the same type.")));

  	/* check that the type of the third column is INT4 */
  	if (tupdesc->attrs[2]->atttypid != INT4OID)
  		ereport(ERROR,
! 				(errcode(ERRCODE_SYNTAX_ERROR),
  				 errmsg("invalid return type"),
  				 errdetail("Third column must be type %s.",
  						   format_type_be(INT4OID;
--- 1451,1464 ----
  	/* check that the types of the first two columns match */
  	if (tupdesc->attrs[0]->atttypid != tupdesc->attrs[1]->atttypid)
  		ereport(ERROR,
! 				(errcode(ERRCODE_DATATYPE_MISMATCH),
  				 errmsg("invalid return type"),
  				 errdetail("First two columns must be the same type.")));

  	/* check that the type of the third column is INT4 */
  	if (tupdesc->attrs[2]->atttypid != INT4OID)
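The XX000 entries the patch hunts down come from how PostgreSQL packs a five-character SQLSTATE into an int, six bits per character. The sketch below mirrors the MAKE_SQLSTATE machinery from src/include/utils/elog.h to show why elog()'s default ERRCODE_INTERNAL_ERROR renders as "XX000" in verbose logs, and what the replacement ERRCODE_DATATYPE_MISMATCH renders as; unpack_sql_state() here is a simplified re-implementation for illustration.

```c
#include <assert.h>
#include <string.h>

/* Six-bit packing of SQLSTATE characters, as in src/include/utils/elog.h. */
#define PGSIXBIT(ch)    (((ch) - '0') & 0x3F)
#define PGUNSIXBIT(val) (((val) & 0x3F) + '0')

#define MAKE_SQLSTATE(ch1, ch2, ch3, ch4, ch5) \
	(PGSIXBIT(ch1) + (PGSIXBIT(ch2) << 6) + (PGSIXBIT(ch3) << 12) + \
	 (PGSIXBIT(ch4) << 18) + (PGSIXBIT(ch5) << 24))

/* The default code elog() assigns, and the replacement the patch uses
 * for the tablefunc return-type errors. */
#define ERRCODE_INTERNAL_ERROR     MAKE_SQLSTATE('X', 'X', '0', '0', '0')
#define ERRCODE_DATATYPE_MISMATCH  MAKE_SQLSTATE('4', '2', '8', '0', '4')

/* Decode an errcode back into the five-character SQLSTATE text that
 * log_error_verbosity = verbose displays. */
static void
unpack_sql_state(int sql_state, char *buf)
{
	for (int i = 0; i < 5; i++)
	{
		buf[i] = (char) PGUNSIXBIT(sql_state);
		sql_state >>= 6;
	}
	buf[5] = '\0';
}
```

Scanning the verbose log for codes decoding to "XX000" is exactly how the offending call sites were found.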
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
Tom Lane writes:
> Well, I certainly think all of these represent bugs: [...]

thanks for prioritizing them. I'll try to digest them somewhat before
posting.

> This one's pretty darn odd, because 2619 is pg_statistic and not an
> index at all:
>      4 | ERROR:  cache lookup failed for index 2619

This is actually the one from the README :-). Quoting to spare media
discontinuity:

--8<---------------cut here---------------start------------->8---
Taking a closer look at it reveals that it happens when you query a
certain catalog view like this:

self=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
ERROR:  cache lookup failed for index 2619

This is because the planner then puts pg_get_indexdef(oid) in a context
where it sees non-index OIDs, which causes it to croak:

                              QUERY PLAN
------------------------------------------------------------------------
 Hash Join  (cost=17.60..30.65 rows=9 width=4)
   Hash Cond: (i.oid = x.indexrelid)
   ->  Seq Scan on pg_class i  (cost=0.00..12.52 rows=114 width=8)
         Filter: ((pg_get_indexdef(oid) IS NOT NULL) AND (relkind = 'i'::"char"))
   ->  Hash  (cost=17.31..17.31 rows=23 width=4)
         ->  Hash Join  (cost=12.52..17.31 rows=23 width=4)
               Hash Cond: (x.indrelid = c.oid)
               ->  Seq Scan on pg_index x  (cost=0.00..4.13 rows=113 width=8)
               ->  Hash  (cost=11.76..11.76 rows=61 width=8)
                     ->  Seq Scan on pg_class c  (cost=0.00..11.76 rows=61 width=8)
                           Filter: (relkind = ANY ('{r,m}'::"char"[]))
--8<---------------cut here---------------end--------------->8---

thanks,
Andreas
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On Sat, Aug 1, 2015 at 5:00 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Fabrízio de Royes Mello wrote:
> > In this patch I didn't change all the lockmode comparison places
> > previously pointed out by you, but I can change it, maybe adding
> > another method called LockModeIsValid(lockmode) to do the comparison
> > "lockmode >= NoLock && lockmode < MAX_LOCKMODES" used in many places.
>
> I don't like this. Is it possible to write these comparisons in terms
> of what they conflict with? I think there are two main cases in the
> existing code: 1. is this lock mode valid (sounds reasonable) 2. can
> this be acquired in hot standby (not so much, but makes sense.) and now
> we have your third thing, what is the strongest of these two locks.

The third method already exists in tablecmds.c:

    /*
     * Take the greatest lockmode from any subcommand
     */
    if (cmd_lockmode > lockmode)
        lockmode = cmd_lockmode;

> For instance, if you told me to choose between ShareLock and
> ShareUpdateExclusiveLock I wouldn't know which one is strongest. I
> don't think it's sensible to have the lock mode compare primitive,
> honestly. I don't have any great ideas to offer ATM sadly.

Yes, the thing is that lowering the lock levels is good for concurrency,
but the non-monotonicity of the lock levels makes it impossible to
choose an intermediate state correctly. Hence I think that the one and
only safe answer to this problem is to check whether the locks taken for
each subcommand are all the same, and use that lock. If even one lock is
different, for example one subcommand using ShareLock and a second one
ShareUpdateExclusiveLock (good example of yours), we simply fall back to
AccessExclusiveLock, based on the fact that we are sure that this lock
conflicts with all the other ones. At the same time I think that we
should patch the existing code path of tablecmds.c that already does
those comparisons as well.
Regards,
--
Michael
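The fallback rule described above can be sketched as a small helper. This is a hypothetical function, not existing PostgreSQL code, and the lock mode values are simplified stand-ins for the constants in src/include/storage/lockdefs.h; the point is that it never tries to rank two different modes, sidestepping the non-monotonicity problem.

```c
#include <assert.h>

typedef int LOCKMODE;

/* Simplified stand-ins for PostgreSQL's lock mode constants. */
#define NoLock                    0
#define ShareUpdateExclusiveLock  4
#define ShareLock                 5
#define AccessExclusiveLock       8

/* Hypothetical helper: if every subcommand requests the same lock mode,
 * use it; the moment two subcommands disagree, give up on ranking them
 * and fall back to AccessExclusiveLock, which conflicts with everything. */
static LOCKMODE
choose_lockmode(const LOCKMODE *subcmd_modes, int n)
{
	LOCKMODE result = NoLock;

	for (int i = 0; i < n; i++)
	{
		if (result == NoLock)
			result = subcmd_modes[i];
		else if (subcmd_modes[i] != result)
			return AccessExclusiveLock;	/* modes differ: play it safe */
	}
	return result;
}
```

With this rule, a command mixing ShareLock and ShareUpdateExclusiveLock subcommands simply takes AccessExclusiveLock, and no comparison primitive is needed.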
Re: [HACKERS] [DESIGN] ParallelAppend
On Tue, Jul 28, 2015 at 6:08 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

On Tue, Jul 28, 2015 at 7:59 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Monday, July 27, 2015 11:07 PM
To: Amit Kapila

Is there a real need to have a new node like ParallelAppendPath? Can't
we have a Funnel node beneath the Append node, and then each worker will
be responsible for a SeqScan on each inherited child relation. Something
like:

  Append
   --> Funnel
        --> SeqScan on rel1
        --> SeqScan on rel2

If Funnel can handle both horizontal and vertical parallelism, it is a
great simplification. I never stick to a new node. Once Funnel gets a
capability to have multiple child nodes, probably, the Append node above
will be gone. I expect set_append_rel_pathlist() to add two paths based
on Append and Funnel, then the planner will choose the cheaper one
according to its cost.

In the latest v16 patch, Funnel is declared as follows:

  typedef struct Funnel
  {
      Scan        scan;
      int         num_workers;
  } Funnel;

If we try to add Append capability here, I expect the structure will be
adjusted as follows, for example:

  typedef struct Funnel
  {
      Scan        scan;
      List       *funnel_plans;
      List       *funnel_num_workers;
  } Funnel;

As the names suggest, funnel_plans saves the underlying Plan nodes
instead of the lefttree. Also, funnel_num_workers saves the number of
expected workers to be assigned to individual child plans. Or shall we
have a node like the above and name it FunnelAppend or AppendFunnel?

It is better to have a smaller number of node types that are capable of
kicking background workers, because that simplifies path construction.
Let's assume the case below. When the planner considers a path to append
child scans on rel1, rel2 and rel3, but the cheapest path of rel2 is
Funnel+PartialSeqScan, we cannot put a Funnel here unless we pull up the
Funnel of rel2, can we?

  (Append? or Funnel)
   --> SeqScan on rel1
   --> Funnel
        --> PartialSeqScan on rel2
   --> IndexScan on rel3

I am not sure, but what problem do you see in putting a Funnel node for
one of the relation scans and not for the others?

At this moment, I'm not certain whether a background worker can/ought to
launch other background workers. If a sub-Funnel node is executed by 10
processes and each of those also launches 10 processes, will 100
processes run in total?

If we pull up the Funnel here, I think the plan shall be as follows:

  Funnel
   --> SeqScan on rel1
   --> PartialSeqScan on rel2
   --> IndexScan on rel3

So if we go this route, then Funnel should have the capability to
execute the non-parallel part of the plan as well; in this case it
should be able to execute the non-parallel IndexScan on rel3 too, and
then it might need to distinguish between parallel and non-parallel
parts of plans. I think this could make the Funnel node complex.

It is different from what I plan now. In the above example, the Funnel
node has two non-parallel-aware nodes (rel1 and rel3) and one
parallel-aware node, so three PlannedStmts, one for each, shall be put
on the TOC segment. Even though the background workers pick up a
PlannedStmt from the three, only one worker can pick up the PlannedStmt
for rel1 or for rel3; rel2, however, can be executed by multiple workers
simultaneously. (Note: if the number of workers is less than three in
this case, the PlannedStmt for rel3 shall not be picked up until some
other worker completes its plan on rel1 or rel2.)

From the standpoint of the Funnel, it just kicks background workers
with:
- multiple PlannedStmt nodes
- a maximum number of workers for each plan
in addition to the current form. Then, it continues to fetch records
from the shm_mq. Probably, it does not change the current form so much.
If all we have to pay attention to is the Funnel node, it makes the code
around path construction and pull-up logic much simpler than having
multiple node types that can kick background workers.
Okay, but I think pulling up the Funnel node makes sense only when all
nodes beneath it need to be executed in parallel. I think this decision
should be based on cost, which includes the additional startup_cost to
launch a background worker, as long as a non-parallel node is also
capable of running on the worker side. Even though
create_parallelscan_paths() in v16 sets num_workers no larger than
parallel_seqscan_degree, the total number of concurrent background
workers may exceed this configuration if more than two PartialSeqScan
nodes are underlying. It is a different configuration from
max_worker_processes, so it is not a matter as long as we
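The dispatch rule KaiGai describes above, where a worker may claim a non-parallel-aware child plan only if nobody else has, while any number of workers can join a parallel-aware one, can be sketched as follows. All names here are illustrative, not the actual v16 patch code, and the shared-memory claim counter would of course need atomic updates in a real implementation.

```c
#include <assert.h>

/* One entry per child PlannedStmt stored on the TOC segment. */
typedef struct ChildPlan
{
	int		parallel_aware;	/* can several workers run this plan? */
	int		claimed;		/* how many workers picked it so far */
} ChildPlan;

/* Return the index of a plan this worker may execute, or -1 if nothing
 * is left. Prefer plans nobody has started; otherwise join a
 * parallel-aware plan that is already running. */
static int
pick_child_plan(ChildPlan *plans, int nplans)
{
	for (int i = 0; i < nplans; i++)
	{
		if (plans[i].claimed == 0)
		{
			plans[i].claimed++;
			return i;
		}
	}
	for (int i = 0; i < nplans; i++)
	{
		if (plans[i].parallel_aware)
		{
			plans[i].claimed++;
			return i;
		}
	}
	return -1;
}
```

For the rel1/rel2/rel3 example, three workers each claim one child; a fourth worker cannot touch the non-parallel SeqScan or IndexScan again, so it joins the PartialSeqScan.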
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
Robert Haas robertmh...@gmail.com writes:
> The problem that was bothering us (or at least what was bothering me)
> is that the PlannerInfo provides only a list of SpecialJoinInfo
> structures, which don't directly give you the original join order. In
> fact, min_righthand and min_lefthand are intended to constrain the
> *possible* join orders, and are deliberately designed *not* to specify
> a single join order. If you're sending a query to a remote PostgreSQL
> node, you don't want to know what all the possible join orders are;
> it's the remote side's job to plan the query. You do, however, need an
> easy way to identify one join order that you can use to construct a
> query. It didn't seem easy to do that without duplicating
> make_join_rel(), which seemed like a bad idea.

In principle it seems like you could traverse root->parse->jointree as a
guide to reconstructing the original syntactic structure; though I'm not
sure how hard it would be to ignore the parts of that tree that
correspond to relations you're not shipping. But maybe there's a good
way to do it.

> Tom wasn't crazy about this hook both because of the frequency of
> calls and also because of the long argument list. I think those
> concerns are legitimate; I just couldn't see how to make the other way
> work.

In my vision you probably really only want one call per build_join_rel
event (that is, per construction of a new RelOptInfo), not per
make_join_rel event. It's possible that an FDW that wants to handle
joins but is not talking to a remote query planner would need to grovel
through all the join ordering possibilities individually, and then maybe
hooking at make_join_rel is sensible rather than having to reinvent that
logic. But I'd want to see a concrete use-case first, and I certainly
don't think that that's the main case to design the API around.

			regards, tom lane
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
Andreas Seltenreich seltenre...@gmx.de writes:
> Tom Lane writes:
> > What concerns me more is that what you're finding is only cases that
> > trip an assertion sanity check. It seems likely that you're also
> > managing to trigger other bugs with less drastic consequences, such
> > as "could not devise a query plan" failures or just plain wrong
> > answers.
>
> Ja, some of these are logged as well[1], but most of them are really
> as undrastic as can get, and I was afraid reporting them would be more
> of a nuisance.

Well, I certainly think all of these represent bugs:

     3 | ERROR:  plan should not reference subplan's variable
     2 | ERROR:  failed to assign all NestLoopParams to plan nodes
     1 | ERROR:  could not find pathkey item to sort

This I'm not sure about; it could be that the query gave conflicting
collation specifiers, but on the other hand we've definitely had bugs
with people forgetting to run assign_query_collations on subexpressions:

  4646 | ERROR:  could not determine which collation to use for string comparison

This one's pretty darn odd, because 2619 is pg_statistic and not an
index at all:

     4 | ERROR:  cache lookup failed for index 2619

These seem likely to be bugs as well, though maybe they are race
conditions during a DROP and not worth fixing:

  1171 | ERROR:  cache lookup failed for index 16862
   172 | ERROR:  cache lookup failed for index 257148
    84 | ERROR:  could not find member 1(34520,34520) of opfamily 1976
    55 | ERROR:  missing support function 1(34516,34516) in opfamily 1976
    13 | ERROR:  could not find commutator for operator 34538
     2 | ERROR:  cache lookup failed for index 12322

I would say anything of the sort that is repeatable definitely deserves
investigation, because even if it's an expectable error condition, we
should be throwing a more user-friendly error message.

			regards, tom lane
Re: [HACKERS] brin index vacuum versus transaction snapshots
Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Untested patch attached.

That fixes the installcheck failure on my machine.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Doubt about AccessExclusiveLock in ALTER TABLE .. SET ( .. );
On August 1, 2015 2:17:24 PM GMT+02:00, Michael Paquier michael.paqu...@gmail.com wrote:
> On Sat, Aug 1, 2015 at 5:00 AM, Alvaro Herrera
> alvhe...@2ndquadrant.com wrote: Fabrízio de Royes Mello wrote: In this
> patch I didn't change all the lockmode comparison places previously
> pointed out by you, but I can change it, maybe adding another method
> called LockModeIsValid(lockmode) to do the comparison "lockmode >=
> NoLock && lockmode < MAX_LOCKMODES" used in many places. I don't like
> this. Is it possible to write these comparisons in terms of what they
> conflict with? I think there are two main cases in the existing code:
> 1. is this lock mode valid (sounds reasonable) 2. can this be acquired
> in hot standby (not so much, but makes sense.) and now we have your
> third thing, what is the strongest of these two locks. The third
> method already exists in tablecmds.c:
>     /* Take the greatest lockmode from any subcommand */
>     if (cmd_lockmode > lockmode)
>         lockmode = cmd_lockmode;
> For instance, if you told me to choose between ShareLock and
> ShareUpdateExclusiveLock I wouldn't know which one is strongest. I
> don't think it's sensible to have the lock mode compare primitive,
> honestly. I don't have any great ideas to offer ATM sadly. Yes, the
> thing is that lowering the lock levels is good for concurrency, but
> the non-monotonicity of the lock levels makes it impossible to choose
> an intermediate state correctly.

How about simply acquiring all the locks individually if they're of
different types? These few acquisitions won't matter.

---
Please excuse brevity and formatting - I am writing this on my mobile phone.
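Andres' alternative sidesteps ranking altogether: collect the distinct lock modes the subcommands request and acquire each one. A minimal sketch of the deduplication step, with hypothetical names and simplified mode constants (the real acquisition would call LockRelationOid once per distinct mode):

```c
#include <assert.h>

typedef int LOCKMODE;

/* Simplified stand-ins for PostgreSQL's lock mode constants. */
#define ShareUpdateExclusiveLock  4
#define ShareLock                 5

/* Hypothetical helper: copy the distinct lock modes from in[] to out[]
 * (out must have room for n entries) and return how many there are.
 * The caller would then take each of these locks in turn; duplicates
 * are dropped so no mode is acquired twice. */
static int
collect_distinct_lockmodes(const LOCKMODE *in, int n, LOCKMODE *out)
{
	int		nout = 0;

	for (int i = 0; i < n; i++)
	{
		int		j;

		for (j = 0; j < nout; j++)
			if (out[j] == in[i])
				break;			/* already have this mode */
		if (j == nout)
			out[nout++] = in[i];
	}
	return nout;
}
```

Since ALTER TABLE rarely has more than a handful of subcommands, the few extra lock acquisitions are cheap, which is the point of the suggestion.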
Re: [HACKERS] Transactions involving multiple postgres foreign servers
On Tue, Feb 17, 2015 at 2:56 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote:
> 2. New catalog - This method takes out the need to have a separate
> method for C1, C5 and even C2; also the synchronization will be taken
> care of by row locks, and there will be no limit on the number of
> foreign transactions or on the size of foreign prepared transaction
> information. But the big problem with this approach is that the
> changes to the catalogs are atomic with the local transaction. If a
> foreign prepared transaction can not be aborted while the local
> transaction is rolled back, that entry needs to be retained. But since
> the local transaction is aborting, the corresponding catalog entry
> would become invisible and thus unavailable to the resolver (alas! we
> do not have autonomous transaction support). We may be able to
> overcome this by simulating an autonomous transaction through a
> background worker (which can also act as a resolver). But the amount
> of communication and synchronization might affect the performance.

For Rollback, why can't we do it in the reverse way: first rollback the
transaction in the foreign servers and then rollback the local
transaction? I think for Commit it is essential that we first commit in
the local server, so that we can resolve the transaction status of
prepared transactions on foreign servers after crash recovery. However,
for the Abort case, even if we don't rollback in the local server first,
the outcome can be deduced (any transaction which is not committed
should be rolled back) during crash recovery when resolving the status
of prepared transactions.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
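The recovery rule Amit describes can be stated as a tiny decision function: because the local commit happens first, the fate of every prepared foreign transaction is deducible from the local transaction's state alone; locally committed means COMMIT PREPARED on the foreign server, and anything else (aborted, or still in progress at crash time) means ROLLBACK PREPARED. The enum and function names below are illustrative, not from any actual patch.

```c
#include <assert.h>

typedef enum
{
	LOCAL_IN_PROGRESS,			/* crashed before the local commit */
	LOCAL_COMMITTED,
	LOCAL_ABORTED
} LocalXactState;

typedef enum
{
	RESOLVE_COMMIT_PREPARED,
	RESOLVE_ROLLBACK_PREPARED
} ForeignResolution;

/* What the resolver should do with a prepared foreign transaction
 * after crash recovery, given the local transaction's recorded state.
 * Only a locally committed transaction may commit its foreign parts;
 * an in-progress transaction is treated as aborted. */
static ForeignResolution
resolve_foreign_prepared(LocalXactState local_state)
{
	return (local_state == LOCAL_COMMITTED)
		? RESOLVE_COMMIT_PREPARED
		: RESOLVE_ROLLBACK_PREPARED;
}
```

This is also why the commit ordering matters but the abort ordering does not: the mapping above stays unambiguous no matter where in the abort sequence the crash happens.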
Re: [HACKERS] Re: Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 2015-07-31 13:31:54 -0400, Robert Haas wrote:
> On Fri, Jul 31, 2015 at 7:21 AM, Jeremy Harris j...@wizmail.org wrote:
> > Heapification is O(n) already, whether siftup (existing) or down.
>
> That's not my impression, or what Wikipedia says. Source?

Building a binary heap via successive insertions is O(n log n), but
building it directly is O(n). Looks like wikipedia agrees too:
https://en.wikipedia.org/wiki/Binary_heap#Building_a_heap

I'm pretty sure that there's a bunch of places where we intentionally
build a heap at once instead of successively. At least reorderbuffer.c
does so, and it looks like nodeMergeAppend does as well (that's why they
use binaryheap_add_unordered and then binaryheap_build).

Greetings,

Andres Freund
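The O(n) "build it directly" construction Andres refers to is Floyd's bottom-up heapify: sift down every internal node, starting from the last one. A minimal sketch with plain ints standing in for tuples (the PostgreSQL equivalent is the binaryheap_add_unordered-then-binaryheap_build pattern he mentions):

```c
#include <assert.h>

/* Restore the min-heap property below node i by repeatedly swapping it
 * with its smaller child. */
static void
sift_down(int *a, int n, int i)
{
	for (;;)
	{
		int		smallest = i;
		int		l = 2 * i + 1;
		int		r = 2 * i + 2;

		if (l < n && a[l] < a[smallest])
			smallest = l;
		if (r < n && a[r] < a[smallest])
			smallest = r;
		if (smallest == i)
			return;
		int tmp = a[i];
		a[i] = a[smallest];
		a[smallest] = tmp;
		i = smallest;
	}
}

/* Min-heapify the whole array in O(n): leaves are already trivial
 * heaps, so start at the last internal node and work backwards. */
static void
build_heap(int *a, int n)
{
	for (int i = n / 2 - 1; i >= 0; i--)
		sift_down(a, n, i);
}
```

The O(n) bound comes from most sift-downs starting near the leaves, where they are short; n successive insertions instead pay O(log n) per element in the worst case.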
Re: [HACKERS] Solaris testers wanted for strxfrm() behavior
On Fri, Jul 31, 2015 at 08:29:51PM -0700, Josh Berkus wrote: Where are we with this? Do we feel confident that this bug is only on old versions of Solaris we don't care about? Or does it remain to be resolved? Affected systems either have an available vendor update addressing the problem or are so old that we don't care. Commit be8b06c makes affected systems fail early, which resolves the matter.
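For context, the strxfrm() contract being probed here is the standard two-call pattern: a first call with a NULL destination and size 0 returns the buffer length needed, and a second call with a large-enough buffer must report that same length. The buggy Solaris versions in this thread violated that invariant, which is what a startup sanity check can detect early. A sketch of the pattern (transform_key is an illustrative helper, not PostgreSQL code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Build a collation transform of src whose byte order under memcmp/
 * strcmp matches the locale's string order. Returns a malloc'd buffer,
 * or NULL on allocation failure. */
static char *
transform_key(const char *src)
{
	/* Sizing call: NULL destination with size 0 is permitted by C. */
	size_t	needed = strxfrm(NULL, src, 0);
	char   *buf = malloc(needed + 1);

	if (buf == NULL)
		return NULL;

	size_t	written = strxfrm(buf, src, needed + 1);

	/* The invariant the broken libc versions failed to uphold. */
	assert(written == needed);
	return buf;
}
```

When the second call returns more than the first promised, code following this pattern overruns or truncates its buffer, hence the decision to fail early on affected systems.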
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On Fri, Jul 3, 2015 at 6:25 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:
> Can't FDWs get the join information through the root, which I think we
> would pass to the API as the argument?

This is exactly what Tom suggested originally, and it has some appeal,
but neither KaiGai nor I could see how to make it work. Do you have an
idea? It's not too late to go back and change the API.

The problem that was bothering us (or at least what was bothering me) is
that the PlannerInfo provides only a list of SpecialJoinInfo structures,
which don't directly give you the original join order. In fact,
min_righthand and min_lefthand are intended to constrain the *possible*
join orders, and are deliberately designed *not* to specify a single
join order. If you're sending a query to a remote PostgreSQL node, you
don't want to know what all the possible join orders are; it's the
remote side's job to plan the query. You do, however, need an easy way
to identify one join order that you can use to construct a query. It
didn't seem easy to do that without duplicating make_join_rel(), which
seemed like a bad idea. But maybe there's a good way to do it.

Tom wasn't crazy about this hook both because of the frequency of calls
and also because of the long argument list. I think those concerns are
legitimate; I just couldn't see how to make the other way work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
On Fri, Jul 3, 2015 at 6:25 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote:
> > Can't FDWs get the join information through the root, which I think
> > we would pass to the API as the argument?
>
> This is exactly what Tom suggested originally, and it has some appeal,
> but neither KaiGai nor I could see how to make it work. Do you have an
> idea? It's not too late to go back and change the API.
>
> The problem that was bothering us (or at least what was bothering me)
> is that the PlannerInfo provides only a list of SpecialJoinInfo
> structures, which don't directly give you the original join order. In
> fact, min_righthand and min_lefthand are intended to constrain the
> *possible* join orders, and are deliberately designed *not* to specify
> a single join order. If you're sending a query to a remote PostgreSQL
> node, you don't want to know what all the possible join orders are;
> it's the remote side's job to plan the query. You do, however, need an
> easy way to identify one join order that you can use to construct a
> query. It didn't seem easy to do that without duplicating
> make_join_rel(), which seemed like a bad idea. But maybe there's a
> good way to do it.
>
> Tom wasn't crazy about this hook both because of the frequency of
> calls and also because of the long argument list. I think those
> concerns are legitimate; I just couldn't see how to make the other way
> work.

I had a discussion with Fujita-san about this topic. He has an idea to
tackle this matter that is a little bit tricky, but I didn't have a
clear reason to deny it.

At the line just above set_cheapest() in standard_join_search(), at
least one join path built by the built-in logic has already been added
to the RelOptInfo; thus, the FDW driver can reference the cheapest path
from the built-in logic, and the lefttree and righttree that construct
the joinrel. The assumption is that the best paths chosen by the
built-in logic represent at least a reasonably good join order compared
to the other potential ones.
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] Re: Using quicksort and a merge step to significantly improve on tuplesort's single run external sort
On 31/07/15 18:31, Robert Haas wrote:
> On Fri, Jul 31, 2015 at 7:21 AM, Jeremy Harris j...@wizmail.org wrote:
> > Heapification is O(n) already, whether siftup (existing) or down.
>
> That's not my impression, or what Wikipedia says. Source?

Measurements done last year:

http://www.postgresql.org/message-id/52f35462.3030...@wizmail.org
(spreadsheet attachment)

http://www.postgresql.org/message-id/52f40ce9.1070...@wizmail.org
(measurement procedure and spreadsheet explanation)

--
Cheers,
Jeremy
[HACKERS] No more libedit?! - openssl plans to switch to APL2
Hi,

According to https://www.openssl.org/blog/blog/2015/08/01/cla/ openssl
is planning to relicense to the Apache License 2.0. While APL2 is not
compatible with GPL2, it *is* compatible with GPL3.

Greetings,

Andres Freund
Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
The total number of heap pages is known, and the total number of index
pages is also known, so it's possible to derive a percentage out of this
part. The total number of index pages scanned during the entire vacuum
will depend on the number of index scans that happen. In order to
extrapolate percent complete for phase two (index scan), we need number
of index scans * total index pages. The number of index scans can vary
from 1 to n (n depending on maintenance_work_mem).

Summarizing suggestions in previous mails, the following information can
be reported:

Phase 1. heap pages scanned / total heap pages
Phase 2. index pages scanned / total index pages (across all indexes)
Phase 3. count of heap pages vacuumed

Additional info like the number of index scans so far and the number of
dead tuples being vacuumed in one batch can also be provided. A combined
estimate for vacuum percent complete can be provided by summing up heap
pages scanned and index pages scanned against total heap pages plus
total index pages * number of index scans.

Thank you,
Rahila Syed
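The combined estimate described above reduces to a small formula: work done (heap pages scanned plus index pages scanned) over total expected work (total heap pages plus total index pages times the number of index scans). A sketch, with all names illustrative rather than from any actual patch:

```c
#include <assert.h>

/* Combined percent-complete estimate for a running VACUUM.
 * num_index_scans must be the projected total number of index scan
 * passes, which depends on maintenance_work_mem and is the hard part
 * to predict, as noted above. */
static int
vacuum_percent_done(long heap_scanned, long total_heap_pages,
					long index_scanned, long total_index_pages,
					int num_index_scans)
{
	long	total = total_heap_pages + total_index_pages * num_index_scans;
	long	done = heap_scanned + index_scanned;

	if (total <= 0)
		return 100;				/* nothing to do counts as finished */
	return (int) ((done * 100) / total);
}
```

The weakness is visible in the signature: if a projected single index pass turns into two because dead tuples overflow maintenance_work_mem, the denominator grows mid-run and the reported percentage drops back, which is why the phase-by-phase counters are proposed alongside the single number.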
Re: [HACKERS] pg_rewind failure by file deletion in source server
On 07/17/2015 06:28 AM, Michael Paquier wrote:
> On Wed, Jul 1, 2015 at 9:31 PM, Fujii Masao masao.fu...@gmail.com wrote:
> > On Wed, Jul 1, 2015 at 2:21 AM, Heikki Linnakangas hlinn...@iki.fi wrote:
> > > On 06/29/2015 09:44 AM, Michael Paquier wrote:
> > > > On Mon, Jun 29, 2015 at 4:55 AM, Heikki Linnakangas wrote:
> > > > > But we'll still need to handle the pg_xlog symlink case
> > > > > somehow. Perhaps it would be enough to special-case pg_xlog
> > > > > for now.
> > > >
> > > > Well, sure, pg_rewind does not copy the soft links either way.
> > > > Now it would be nice to have an option to be able to recreate
> > > > the soft link of at least pg_xlog, even if it can be scripted as
> > > > well after a run.
> > >
> > > Hmm. I'm starting to think that pg_rewind should ignore pg_xlog
> > > entirely. In any non-trivial scenarios, just copying all the files
> > > from pg_xlog isn't enough anyway, and you need to set up a
> > > recovery.conf after running pg_rewind that contains a
> > > restore_command or primary_conninfo, to fetch the WAL. So you can
> > > argue that by not copying pg_xlog automatically, we're actually
> > > doing a favour to the DBA, by forcing him to set up the
> > > recovery.conf file correctly. Because if you just test simple
> > > scenarios where not much time has passed between the failover and
> > > running pg_rewind, it might be enough to just copy all the WAL
> > > currently in pg_xlog, but it would not be enough if more time had
> > > passed and not all the required WAL is present in pg_xlog anymore.
> > > And by not copying the WAL, we can avoid some copying, as
> > > restore_command or streaming replication will only copy what's
> > > needed, while pg_rewind would copy all the WAL it can find to the
> > > target's data directory. pg_basebackup also doesn't include any
> > > WAL, unless you pass the --xlog option. It would be nice to also
> > > add an optional --xlog option to pg_rewind, but with pg_rewind
> > > it's possible that all the required WAL isn't present in the
> > > pg_xlog directory anymore, so you wouldn't always achieve the same
> > > effect of making the backup self-contained. So, I propose the
> > > attached.
It makes pg_rewind ignore the pg_xlog directory in both the source and the target.

If pg_xlog is simply ignored, some old WAL files may remain in the target server. Don't these old files cause the subsequent startup of the target server as a new standby to fail? That is, it's the case where a WAL file with the same name but different content exists in both target and source. If that's harmful, pg_rewind should also remove the files in pg_xlog of the target server.

This would reduce usability. The rewound node will replay WAL from the previous checkpoint where WAL forked, up to the minimum recovery point of the source node where pg_rewind has been run. Hence if we completely remove the contents of pg_xlog we'd lose a portion of the logs that need to be replayed until the timeline is switched on the rewound node when recovering it (while streaming from the promoted standby, or whatever). I don't really see why recycled segments would be a problem, as that's perhaps what you are referring to, but perhaps I am missing something.

Hmm. My thinking was that you need to set up restore_command or primary_conninfo anyway, to fetch the old WAL, so there's no need to copy any WAL. But there's a problem with that: you might have WAL files in the source server that haven't been archived yet, and you need them to recover the rewound target node. That's OK for libpq mode, I think, as the server is still running and presumably you can fetch the WAL with streaming replication, but for copy mode, that's not a good assumption. You might be relying on a WAL archive, and the file might not be archived yet. Perhaps it's best if we copy all the WAL files from the source in copy mode, but not in libpq mode. Regarding old WAL files in the target, it's probably best to always leave them alone. They should do no harm, and as a general principle it's best to avoid destroying evidence.
It'd be nice to get some fix for this for alpha2, so I'll commit a fix to do that on Monday, unless we come to a different conclusion before that. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
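Heikki's recovery.conf point can be made concrete. A minimal sketch, assuming a promoted primary named new-primary, a replicator role, and an archive under /mnt/wal-archive (all hypothetical; the file would live in the rewound target's data directory, and /tmp is used here only to keep the sketch self-contained):

```shell
# Hypothetical sketch: host, role, and archive path are invented.
# After pg_rewind, the target needs a recovery.conf so recovery can
# fetch the WAL that pg_rewind deliberately did not copy.
cat > /tmp/recovery.conf <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=new-primary port=5432 user=replicator'
restore_command = 'cp /mnt/wal-archive/%f "%p"'
recovery_target_timeline = 'latest'
EOF
```

In libpq mode the same WAL can instead arrive over streaming replication via primary_conninfo, which is why the thread treats copy mode as the risky case.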
Re: [HACKERS] Parallel Seq Scan
Amit,

Let me ask three more detailed questions.

Why does Funnel have a valid qual of the subplan? The 2nd argument of make_funnel() is the qualifier of the subplan (PartialSeqScan); it is then initialized at ExecInitFunnel, but never executed at run time. Why does the Funnel node have a useless qualifier expression here (even though it is harmless)?

Why is Funnel derived from Scan? Even though it constructs a target-list compatible with the underlying partial-scan node, that does not require the node to also be derived from Scan. For example, Sort and Append don't change the target-list definition from their input, and also don't have their own qualifier. It seems to me the definition below is more suitable...

typedef struct Funnel
{
    Plan    plan;
    int     num_workers;
} Funnel;

Does ExecFunnel() need to have a special code path to handle EvalPlanQual()? Probably, it just calls the underlying node in the local context. ExecScan() of PartialSeqScan will check its qualifier against estate->es_epqTuple[].

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai Sent: Thursday, July 30, 2015 8:45 AM To: Amit Kapila Cc: Robert Haas; Gavin Flower; Jeff Davis; Andres Freund; Amit Langote; Amit Langote; Fabrízio Mello; Thom Brown; Stephen Frost; pgsql-hackers; Haribabu Kommi Subject: Re: [HACKERS] Parallel Seq Scan

On Wed, Jul 29, 2015 at 7:32 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: Hi Amit, Could you tell me the code intention around ExecInitFunnel()? ExecInitFunnel() calls InitFunnel() that opens the relation to be scanned by the underlying PartialSeqScan and sets up ss_ScanTupleSlot of its scanstate.

The main need is for the relation descriptor, which is then required to set the scan tuple's slot. Basically it is required for tuples flowing from the worker, which will use the scan tuple slot of FunnelState.
According to the comment of InitFunnel(), it opens the relation and takes an appropriate lock on it. However, an equivalent initialization is also done in InitPartialScanRelation(). Why does it acquire the relation lock twice?

I think locking twice is not required; it is just that I have used the API ExecOpenScanRelation(), which is used during other nodes' initialisation, due to which it locks twice. I think in general it should be harmless.

Thanks, I understand the reason for the implementation. It looks to me that this design is not problematic even if Funnel gains the capability to have multiple sub-plans and thus is not associated with a particular relation, as long as the target-list and projection-info are appropriately initialized.

Best regards,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com
Re: [HACKERS] TAP tests are badly named
On Thu, Jul 30, 2015 at 07:54:27PM -0400, Andrew Dunstan wrote: On 07/30/2015 12:40 PM, Andrew Dunstan wrote:

We should describe test sets by what they test, not by how they test. TAP

I agree with that philosophy. I also respect the practicality of grouping by test harness as a shorthand. Tests sharing a harness tend to share dependencies, harness bugs, logging mechanisms, etc.

is a testing tool/protocol. The current set of tests we have test the programs in src/bin, and we should really name the test set by a name that reflects that, rather than the fact that we are using TAP tools to run the tests. What if we decide to test something else using TAP? Would we call that set of tests TAP tests too?

Yes. "TAP test" refers to any test in a suite written for the prove harness, including the existing suite outside src/bin:

$ find . -name t
./src/bin/pg_controldata/t
./src/bin/scripts/t
./src/bin/pg_rewind/t
./src/bin/pg_basebackup/t
./src/bin/pg_ctl/t
./src/bin/pg_config/t
./src/bin/initdb/t
./src/test/ssl/t

--enable-tap-tests is a reasonable configuration setting, because it's about whether or not we have a TAP testing framework available, but I think we should stop calling the bin tests TAP tests, and we should change the test name in vcregress.pl to a more appropriate name. In the buildfarm I'm calling the step bin-check: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check Thoughts?

While lack of granularity in vcregress.pl is hostile, with so much about MSVC development being hostile, this one is below my noise floor.

In fact, looking more closely at the changes that have been made to vcregress.pl, I don't really like the way this has been done. I'm putting together some changes to bring things more into line with how the Makefiles work.

The commit history of vcregress.pl shows that doing good here is difficult, and the rewards are low.
If you wish to write tightly-focused patches to align vcregress.pl with the experience of testing with GNU make, I am cautiously optimistic.
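For readers who haven't met the protocol being debated: TAP (Test Anything Protocol) is just a line-oriented text stream that harnesses such as prove consume, which is why any suite written for prove is a "TAP test" regardless of what it exercises. A sketch of the stream a passing two-test run would emit (the test descriptions are invented for illustration):

```shell
# Emit the TAP stream a two-test suite would produce; 'prove' checks the
# plan line (1..2) against the ok/not ok lines that follow.
printf '%s\n' \
  '1..2' \
  'ok 1 - pg_ctl --version exits with status 0' \
  'ok 2 - output matches the expected version banner' \
  | tee /tmp/tap_stream.txt
```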
Re: [HACKERS] Cleaning up missing ERRCODE assignments
On Sat, Aug 1, 2015 at 8:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: I propose to back-patch this into 9.5, but not further; it's not an important enough issue to justify changing SQLSTATE behavior in stable branches. +1. As I've said in the past, I think that making it possible to determine mechanically that a can't happen error has in fact occurred is of huge value to the project. -- Peter Geoghegan
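Peter's "determine mechanically" point is straightforward once the codes are clean: with log_error_verbosity = verbose each log entry carries its SQLSTATE, so class-XX (internal) errors can simply be grepped out. A sketch against a fabricated log excerpt (the messages are invented; XX000, 22023, and 42601 are real SQLSTATE codes):

```shell
# Fabricated log excerpt; only the SQLSTATE codes are real.
cat > /tmp/postgresql.log <<'EOF'
ERROR:  22023: invalid parameter value
ERROR:  XX000: cache lookup failed for relation 12345
ERROR:  42601: syntax error at or near "FROM"
EOF
# Class XX = internal errors: "can't happen" reports worth triaging.
grep -E 'ERROR:  XX[0-9A-Z]{3}' /tmp/postgresql.log
```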
Re: [HACKERS] TAP tests are badly named
On Sat, Aug 01, 2015 at 07:13:04PM -0400, Andrew Dunstan wrote: On 08/01/2015 04:44 PM, Noah Misch wrote:

--enable-tap-tests is a reasonable configuration setting, because it's about whether or not we have a TAP testing framework available, but I think we should stop calling the bin tests TAP tests and we should change the test name in vcregress.pl to a more appropriate name. In the buildfarm I'm calling the step bin-check: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check Thoughts?

While lack of granularity in vcregress.pl is hostile, with so much about MSVC development being hostile, this one is below my noise floor.

Speaking with my buildfarm hat on, it's well above mine. And these changes make the situation worse quite gratuitously. Anyway, I'll fix it.

Sounds good. I'll be interested to read your design.
Re: [HACKERS] upgrade failure from 9.5 to head
On 08/01/2015 07:13 PM, Noah Misch wrote: On Wed, Jul 29, 2015 at 04:42:55PM -0400, Andrew Dunstan wrote: The next hump is this, in restoring contrib_regression_test_ddl_parse: pg_restore: creating FUNCTION public.text_w_default_in(cstring) pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 243; 1255 62534 FUNCTION text_w_default_in(cstring) buildfarm pg_restore: [archiver (db)] could not execute query: ERROR: pg_type OID value not set when in binary upgrade mode Command was: CREATE FUNCTION text_w_default_in(cstring) RETURNS text_w_default LANGUAGE internal STABLE STRICT AS $$texti... Is this worth bothering about, or should I simply remove the database before trying to upgrade? That's a bug. The test_ddl_deparse suite leaves a shell type, which pg_upgrade fails to reproduce. Whether to have pg_upgrade support that or just error out cleanly is another question. Well, for now I'll just drop the database. Thanks for the diagnosis. cheers andrew
Re: [HACKERS] TAP tests are badly named
On 08/01/2015 04:44 PM, Noah Misch wrote:

--enable-tap-tests is a reasonable configuration setting, because it's about whether or not we have a TAP testing framework available, but I think we should stop calling the bin tests TAP tests and we should change the test name in vcregress.pl to a more appropriate name. In the buildfarm I'm calling the step bin-check: http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-30%2012%3A25%3A58&stg=bin-check Thoughts?

While lack of granularity in vcregress.pl is hostile, with so much about MSVC development being hostile, this one is below my noise floor.

Speaking with my buildfarm hat on, it's well above mine. And these changes make the situation worse quite gratuitously. Anyway, I'll fix it.

cheers andrew
Re: [HACKERS] upgrade failure from 9.5 to head
On Wed, Jul 29, 2015 at 04:42:55PM -0400, Andrew Dunstan wrote: The next hump is this, in restoring contrib_regression_test_ddl_parse: pg_restore: creating FUNCTION public.text_w_default_in(cstring) pg_restore: [archiver (db)] Error while PROCESSING TOC: pg_restore: [archiver (db)] Error from TOC entry 243; 1255 62534 FUNCTION text_w_default_in(cstring) buildfarm pg_restore: [archiver (db)] could not execute query: ERROR: pg_type OID value not set when in binary upgrade mode Command was: CREATE FUNCTION text_w_default_in(cstring) RETURNS text_w_default LANGUAGE internal STABLE STRICT AS $$texti... Is this worth bothering about, or should I simply remove the database before trying to upgrade? That's a bug. The test_ddl_deparse suite leaves a shell type, which pg_upgrade fails to reproduce. Whether to have pg_upgrade support that or just error out cleanly is another question.
Re: [HACKERS] 64-bit XIDs again
On 31 July 2015 at 22:46, Heikki Linnakangas hlinn...@iki.fi wrote: On 07/31/2015 12:29 AM, Josh Berkus wrote: On 07/30/2015 07:24 AM, Heikki Linnakangas wrote:

You'd never be forced to do anti-wraparound vacuums, you could just let the clog grow arbitrarily large

When I introduced the same idea a few years back, having the clog get arbitrarily large was cited as a major issue. I was under the impression that clog size had some major performance impacts.

Well, sure, if you don't want the clog to grow arbitrarily large, then you need to freeze.

This statement isn't quite right; things are better than that. We don't need to freeze in order to shrink the clog, we just need to hint and thereby ensure we move forward the lowest unhinted xid. That does involve scanning, but it doesn't need to scan indexes, and that scan won't produce anywhere near as much additional WAL traffic or I/O. In practice, a larger clog would only happen with a higher transaction rate, which means more system resources, so I don't think it's too much of a problem overall.

--
Simon Riggs  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
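To put a rough number on "arbitrarily large": the clog stores 2 bits of commit status per transaction ID, so four transactions fit in a byte and its size scales with the span of XIDs it must cover. A back-of-envelope sketch with an assumed span of 2 billion outstanding transactions:

```shell
# 2 bits per transaction => 4 transactions per byte (assumed span below
# is an illustrative figure, not from the thread).
span=2000000000
bytes=$((span / 4))
echo "clog for ${span} xids: $((bytes / 1024 / 1024)) MiB"
```

So even a pathological backlog of the full ~2-billion-XID range costs on the order of hundreds of megabytes, which is the scale being weighed against forced anti-wraparound vacuums.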
Re: [HACKERS] Encoding of early PG messages
Oh sorry, I think I misunderstood your suggestion - setting lc_messages in the startup packet wouldn't work any more than setting client_encoding, would it? So any solution here would be on the database/backend side, and thus irrelevant for a general-purpose driver...

On Fri, Jul 31, 2015 at 4:28 PM, Shay Rojansky r...@roji.org wrote: Thanks for the suggestions Tom. As I'm developing a general-purpose driver I can't do anything in the PostgreSQL config, but it's a good workaround suggestion for users who encounter this error. Sending lc_messages in the startup packet could work, but if I understand correctly that setting combines both encoding and language. I guess I can look at the user's locale preference on the client machine, try to translate that into a PostgreSQL language/encoding and send that in lc_messages - that seems like it might work. Shay

On Fri, Jul 31, 2015 at 3:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Shay Rojansky r...@roji.org writes: Developing Npgsql I've encountered the problem described in http://www.postgresql.org/message-id/20081223212414.gd3...@merkur.hilbert.loc : a German installation of PostgreSQL seems to respond to an incorrect password with a non-UTF8 encoding of the error messages, even if the startup message contains client_encoding=UTF8.

I wouldn't hold your breath waiting for that to change. A possible workaround is to run the postmaster with lc_messages=C and then switch to your desired message language per-session. It would certainly work to send lc_messages along with client_encoding in the startup packet; or possibly you could set those settings as per-database or per-role settings to avoid needing to teach the application code about it. This would mean that bad-password and similar errors would come out in English, but at least they'd be validly encoded ...

regards, tom lane
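Tom's workaround, sketched as the two pieces of configuration it implies (the role name, locale, and file paths are illustrative, and changing lc_messages may require elevated privileges):

```shell
# Sketch only; /tmp paths stand in for the real config locations.
# 1. Server-wide: pre-authentication errors (e.g. bad password) come out
#    in the C locale, so they are plain ASCII English and validly encoded.
cat > /tmp/postgresql.conf.sketch <<'EOF'
lc_messages = 'C'
EOF
# 2. Per-role override, applied once a session has authenticated
#    (hypothetical role and locale; would be run via psql).
cat > /tmp/per_role.sql.sketch <<'EOF'
ALTER ROLE app_user SET lc_messages = 'de_DE.UTF-8';
EOF
```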
Re: [HACKERS] [sqlsmith] Failed assertion in joinrels.c
Tom Lane writes: What concerns me more is that what you're finding is only cases that trip an assertion sanity check. It seems likely that you're also managing to trigger other bugs with less drastic consequences, such as could not devise a query plan failures or just plain wrong answers.

Yes, some of these are logged as well[1], but most of them are really about as un-drastic as they get, and I was afraid reporting them would be more of a nuisance. I analysed a couple of the cache lookup failures, and they all had a severity similar to the example in the README[2]. The operator ones I analysed seem due to intentionally broken operators in the regression db. The NestLoopParams and subplan reference ones sound interesting, though…

I'm not sure how we could identify wrong answers automatically :-( Csmith isn't doing this either. They do discuss differential testing in their papers, i.e., comparing the results of different products. Maybe a simple metric like the number of rows returned might already be valuable for correctness checks. I also thought about doing some sampling on the data, simulating relational operations, and checking for witness tuples, but it is probably not appropriate to start implementing a mini-rdbms on the client side.

but it might be worth checking for XX000 SQLSTATE responses, since generally that should be a can't-happen case. (Or if it can happen, we need to change the errcode.)

The sqlstate is currently missing in the reports because libpqxx is not putting it in its exceptions :-/.
regards, Andreas

Footnotes:
[1] smith=# select * from report24h;
 count |                                 error
-------+-------------------------------------------------------------------------
 43831 | ERROR:  unsupported XML feature
 39496 | ERROR:  invalid regular expression: quantifier operand invalid
 27261 | ERROR:  canceling statement due to statement timeout
 21386 | ERROR:  operator does not exist: point = point
  8580 | ERROR:  cannot compare arrays of different element types
  5019 | ERROR:  invalid regular expression: brackets [] not balanced
  4646 | ERROR:  could not determine which collation to use for string comparison
  2583 | ERROR:  invalid regular expression: nfa has too many states
  2248 | ERROR:  operator does not exist: xml = xml
  1198 | ERROR:  operator does not exist: polygon = polygon
  1171 | ERROR:  cache lookup failed for index 16862
   677 | ERROR:  invalid regular expression: parentheses () not balanced
   172 | ERROR:  cache lookup failed for index 257148
    84 | ERROR:  could not find member 1(34520,34520) of opfamily 1976
    55 | ERROR:  missing support function 1(34516,34516) in opfamily 1976
    42 | ERROR:  operator does not exist: city_budget = city_budget
    13 | ERROR:  could not find commutator for operator 34538
    10 | ERROR:  could not identify a comparison function for type xid
     4 | Connection to database failed
     4 | ERROR:  cache lookup failed for index 2619
     3 | ERROR:  plan should not reference subplan's variable
     2 | ERROR:  cache lookup failed for index 12322
     2 | ERROR:  failed to assign all NestLoopParams to plan nodes
     2 | ERROR:  invalid regular expression: invalid character range
     1 | ERROR:  could not find pathkey item to sort
(25 rows)

Time: 1158,990 ms

[2] https://github.com/anse1/sqlsmith/blob/master/README.org
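The differential-testing idea Andreas mentions can be sketched cheaply: run each generated query against two servers (or versions) and compare a coarse metric such as row count. The psql command in the comment and the file contents here are invented stand-ins, since the sketch must run without a live server:

```shell
# Sketch of differential testing by row count.  In reality each file
# would come from something like:
#   psql -At -h serverA -c 'SELECT count(*) FROM (...generated query...) q'
echo 42 > /tmp/rows_a   # result from product/version A (fabricated)
echo 42 > /tmp/rows_b   # result from product/version B (fabricated)
if cmp -s /tmp/rows_a /tmp/rows_b; then echo match; else echo MISMATCH; fi
```

A mismatch flags a query worth reducing by hand; equal counts are of course no proof of equal answers, just a cheap first filter.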