[HACKERS] Commitfest: patches Ready for Committer
In addition to the few patches left in Needs Review state, we have six patches marked as Ready for Committer. Committers: Could you please pick a patch, and commit if appropriate? Or if there's a patch there that you think should *not* be committed, please speak up. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On 07/18/2014 10:47 AM, Michael Paquier wrote: On Fri, Jul 18, 2014 at 3:54 AM, Peter Geoghegan p...@heroku.com wrote: I am not opposed to moving the contrib code into core in the manner that you oppose. I don't feel strongly either way. I noticed in passing that your revision says this *within* levenshtein.c:

+ * Guaranteed to work with Name datatype's cstrings.
+ * For full details see levenshtein.c.

Yeah, I looked at what I produced yesterday night again and came across a couple of similar things :) And reworked a couple of things in the version attached, mainly wordsmithing and adding comments here and there, as well as making the naming of the Levenshtein functions in core the same as the ones in fuzzystrmatch 1.0. I imagined that when a committer picked this up, an executive decision would be made one way or the other. I am quite willing to revise the patch to alter this behavior at the request of a committer. Fine for me. I'll move this patch to the next stage then.

There are a bunch of compiler warnings:

parse_relation.c: In function ‘errorMissingColumn’:
parse_relation.c:3114:447: warning: ‘closestcol1’ may be used uninitialized in this function [-Wmaybe-uninitialized]
parse_relation.c:3066:8: note: ‘closestcol1’ was declared here
parse_relation.c:3129:29: warning: ‘closestcol2’ may be used uninitialized in this function [-Wmaybe-uninitialized]
parse_relation.c:3067:8: note: ‘closestcol2’ was declared here
levenshtein.c: In function ‘levenshtein_common’:
levenshtein.c:107:6: warning: unused variable ‘start_column_local’ [-Wunused-variable]

- Heikki
Re: [HACKERS] Add generate_series(numeric, numeric)
On Sun, Oct 5, 2014 at 7:39 PM, Ali Akbar the.ap...@gmail.com wrote: 2014-10-05 15:21 GMT+07:00 Ali Akbar the.ap...@gmail.com:

- I think you can use the fctx->current variable without a temporary variable (there's a comment on the add_var function: "Full version of add functionality on variable level (handling signs). result might point to one of the operands too without danger."). But you _must_ switch the context first, because add_var will allocate a new array for the data, freeing the old one.

Yep.

- numeric can be NaN. We must reject it as first, finish and last parameter.

Makes sense.

- the numeric datatype is large, but there are limitations. According to the docs, the limit is: up to 131072 digits before the decimal point; up to 16383 digits after the decimal point. How can we check if the next step overflows? As a comparison, in int.c, generate_series_step_int4 checks if it overflows, and stops the next call by setting step to 0. Should we do that?

Yes we should.

- while testing the regression test, opr_sanity checks that the generate_series_numeric function is used twice (once for the 2-parameter and once for the 3-parameter function), so I changed the name to generate_series_step_numeric and created a new function generate_series_numeric that calls generate_series_step_numeric.

Yep. It seems to me that this patch is heading in a good direction (haven't tested or tried to break it, just looked at the code). However please be careful of code format, particularly brackets for if blocks. For example this thing:

    if (foo)
    {
        blah;
    }

Should be that:

    if (foo)
        blah;

Then in the case of multiple lines, this thing:

    if (foo) {
        blah;
        blah2;
    }

Should be that:

    if (foo)
    {
        blah;
        blah2;
    }

Code convention is detailed in the docs: http://www.postgresql.org/docs/devel/static/source.html

Regards,
--
Michael
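As an aside for readers following the overflow discussion above: the stop-on-overflow pattern attributed to generate_series_step_int4 (emit the current value, then zero out step once the next increment would wrap) can be sketched in a self-contained way. This is an illustrative standalone sketch under assumed names (series_state, series_next), not the backend code itself:

```c
#include <limits.h>
#include <stdbool.h>

/* Per-series state, loosely mimicking the SRF per-call context. */
typedef struct
{
    int current;
    int finish;
    int step;                   /* set to 0 once the next step would overflow */
} series_state;

/* Store the next value and return true, or return false when done. */
static bool
series_next(series_state *st, int *out)
{
    if (st->step == 0)
        return false;           /* stopped after a detected overflow */
    if ((st->step > 0 && st->current > st->finish) ||
        (st->step < 0 && st->current < st->finish))
        return false;

    *out = st->current;

    /* Check overflow before stepping: stop further calls instead of wrapping. */
    if ((st->step > 0 && st->current > INT_MAX - st->step) ||
        (st->step < 0 && st->current < INT_MIN - st->step))
        st->step = 0;
    else
        st->current += st->step;

    return true;
}
```

The point of the pattern is that the overflow test happens on the *next* increment, so the series can still legally end exactly at the type's maximum value.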
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 5:12 AM, Marti Raudsepp ma...@juffo.org wrote: On Mon, Oct 6, 2014 at 4:17 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: CREATE INDEX ... [ IF NOT EXISTS [ name ] ] ON ... I think this one is wrong now. I see now, I think you meant: CREATE INDEX ... [ [ IF NOT EXISTS ] name ] ON ... If yes, +1 for this, there's no redundancy any more. Regards, Marti
Re: [HACKERS] Add generate_series(numeric, numeric)
Thanks for the review. Attached is the formatted patch according to your suggestions.

> - numeric datatype is large, but there are limitations. According to doc, the limit is: up to 131072 digits before the decimal point; up to 16383 digits after the decimal point. How can we check if the next step overflows? As a comparison, in int.c, generate_series_step_int4 checks if it overflows, and stops the next call by setting step to 0. Should we do that?
>
> Yes we should.

How can we check the overflow after add_var? (In int.c, the code checks for integer calculation overflow, which wraps the result to a negative value.) In the numeric.sql regression test, I've added this query:

    select (i / (10::numeric ^ 131071))::numeric(1,0)
      from generate_series(-9*(10::numeric ^ 131071),
                            9*(10::numeric ^ 131071),
                            (10::numeric ^ 131071)) as a(i);

Because the doc notes that the maximum number of numeric digits before the decimal point is 131072, I hoped this query would cover the overflow case (on the last value it generates, if we add 9 x 10^131071 to 10^131071, add_var will overflow). But in my tests, that isn't the case. The code works without any error and returns the correct rows:

     numeric
    ---------
          -9
          -8
          -7
          -6
          -5
          -4
          -3
          -2
          -1
           0
           1
           2
           3
           4
           5
           6
           7
           8
           9
    (19 rows)

Regards,
--
Ali Akbar

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 14074,14081 ****
       AND
     <tbody>
      <row>
       <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>)</function></literal></entry>
!      <entry><type>int</type> or <type>bigint</type></entry>
!      <entry><type>setof int</type> or <type>setof bigint</type> (same as argument type)</entry>
       <entry>
        Generate a series of values, from <parameter>start</parameter> to
        <parameter>stop</parameter> with a step size of one
--- 14074,14081 ----
     <tbody>
      <row>
       <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>)</function></literal></entry>
!      <entry><type>int</type>, <type>bigint</type> or <type>numeric</type></entry>
!      <entry><type>setof int</type>, <type>setof bigint</type>, or <type>setof numeric</type> (same as argument type)</entry>
       <entry>
        Generate a series of values, from <parameter>start</parameter> to
        <parameter>stop</parameter> with a step size of one
***************
*** 14084,14091 ****
       AND
      <row>
       <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>, <parameter>step</parameter>)</function></literal></entry>
!      <entry><type>int</type> or <type>bigint</type></entry>
!      <entry><type>setof int</type> or <type>setof bigint</type> (same as argument type)</entry>
       <entry>
        Generate a series of values, from <parameter>start</parameter> to
        <parameter>stop</parameter> with a step size of <parameter>step</parameter>
--- 14084,14091 ----
      <row>
       <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>, <parameter>step</parameter>)</function></literal></entry>
!      <entry><type>int</type>, <type>bigint</type> or <type>numeric</type></entry>
!      <entry><type>setof int</type>, <type>setof bigint</type> or <type>setof numeric</type> (same as argument type)</entry>
       <entry>
        Generate a series of values, from <parameter>start</parameter> to
        <parameter>stop</parameter> with a step size of <parameter>step</parameter>

*** a/src/backend/utils/adt/numeric.c
--- b/src/backend/utils/adt/numeric.c
***************
*** 28,33 ****
--- 28,34 ----
  #include "access/hash.h"
  #include "catalog/pg_type.h"
+ #include "funcapi.h"
  #include "libpq/pqformat.h"
  #include "miscadmin.h"
  #include "nodes/nodeFuncs.h"
***************
*** 261,266 **** typedef struct NumericVar
--- 262,279 ----

  /* ----------
+  * Data for generate series
+  * ----------
+  */
+ typedef struct
+ {
+ 	NumericVar	current;
+ 	NumericVar	finish;
+ 	NumericVar	step;
+ } generate_series_numeric_fctx;
+
+
+ /* ----------
   * Some preinitialized constants
   * ----------
   */
***************
*** 1221,1226 **** numeric_floor(PG_FUNCTION_ARGS)
--- 1234,1346 ----
  	PG_RETURN_NUMERIC(res);
  }
+
+ /*
+  * generate_series_numeric() -
+  *
+  *  Generate series of numeric.
+  */
+ Datum
+ generate_series_numeric(PG_FUNCTION_ARGS)
+ {
+ 	return generate_series_step_numeric(fcinfo);
+ }
+
+ Datum
+ generate_series_step_numeric(PG_FUNCTION_ARGS)
+ {
+ 	generate_series_numeric_fctx *fctx;
+ 	FuncCallContext *funcctx;
+ 	MemoryContext oldcontext;
+
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		Numeric		start_num = PG_GETARG_NUMERIC(0);
+ 		Numeric		finish_num = PG_GETARG_NUMERIC(1);
+ 		NumericVar	steploc = const_one;
+
+ 		/* Handle NaN in start & finish */
+ 		if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num))
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 					 errmsg("start and finish cannot be NaN")));
+
+ 		/* see if we
Re: [HACKERS] KNN-GiST with recheck
Thanks. The main question now is the design of this patch. Currently, it does all the work inside the access method. We already have some discussion of the pros and cons of this method. I would like to clarify the alternatives now. I can see the following way:

1. Implement a new executor node which performs sorting by priority queue. Let's call it "Priority queue". I think it should be a separate node from the Sort node. Although Priority queue and Sort are essentially similar from the user's view, they would be completely different in implementation.
2. Implement some interface to transfer distance values from the access method to the Priority queue node. If we assume that all of them need recheck, maybe it can be done without passing distance values.
3. Somehow tell the planner that it could use the Priority queue in the corresponding cases. I see two ways of doing this:
   - Add a flag to an operator in the opclass indicating that the index can only order by the lower bound of "col op value", not by "col op value" itself.
   - Define a new relation between operators. The value of one operator could be a lower bound for the value of another operator. So, the planner can put a Priority queue node when lower-bound ordering is possible from the index. Also an ALTER OPERATOR command would be reasonable, so extensions could upgrade.

I think it would be better to make it a property of the operator class. We can add a column to pg_amop or define another value for amoppurpose on pg_amop. Syntax can be something like this:

    CREATE OPERATOR CLASS circle_ops
        DEFAULT FOR TYPE circle USING gist AS
            OPERATOR 15 <->(circle, point) FOR ORDER BY pg_catalog.float_ops LOWER BOUND;

While looking at it, I realized that the current version of the patch does not use the sort operator family defined with the operator class. It assumes that the distance function will return values compatible with the operator. The operator class definition makes me think that there is no such assumption. Besides overhead, this way makes significant infrastructural changes. So, it may be over-engineering.
However, it's probably a cleaner and more beautiful solution. I would like to get some feedback from people familiar with KNN-GiST, like Heikki or Tom. What do you think about this? Any other ideas? I would be happy to test and review the changes. I think it is nice to solve the problem in a generalized way, improving the access method infrastructure. Definitely, we should have a consensus on the design before working on the infrastructure changes.
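As a side note for readers of the "Priority queue" node proposal: the core ordering structure it describes is just a binary min-heap keyed by the (lower-bound) distance reported by the access method. Here is a minimal standalone sketch of such a heap, with hypothetical pq_* names; it makes no claim of matching any executor API:

```c
#include <stdlib.h>

/*
 * Minimal binary min-heap keyed by a double "distance".  A hypothetical
 * Priority queue executor node could keep index items ordered by the
 * lower-bound distance from the access method, rechecking the exact
 * distance before returning a row.
 */
typedef struct { double key; int item; } pq_entry;
typedef struct { pq_entry *a; int n, cap; } pq;

/* Insert an item, sifting it up until the heap property holds. */
static void
pq_push(pq *q, double key, int item)
{
    int i;

    if (q->n == q->cap)
    {
        q->cap = q->cap ? q->cap * 2 : 8;
        q->a = realloc(q->a, q->cap * sizeof(pq_entry));
    }
    i = q->n++;
    q->a[i].key = key;
    q->a[i].item = item;
    while (i > 0 && q->a[(i - 1) / 2].key > q->a[i].key)
    {
        pq_entry tmp = q->a[i];
        q->a[i] = q->a[(i - 1) / 2];
        q->a[(i - 1) / 2] = tmp;
        i = (i - 1) / 2;
    }
}

/* Remove and return the entry with the smallest key, sifting down. */
static pq_entry
pq_pop(pq *q)
{
    pq_entry top = q->a[0];
    int i = 0;

    q->a[0] = q->a[--q->n];
    for (;;)
    {
        int l = 2 * i + 1, r = l + 1, m = i;

        if (l < q->n && q->a[l].key < q->a[m].key) m = l;
        if (r < q->n && q->a[r].key < q->a[m].key) m = r;
        if (m == i)
            break;
        {
            pq_entry tmp = q->a[i];
            q->a[i] = q->a[m];
            q->a[m] = tmp;
        }
        i = m;
    }
    return top;
}
```

The interesting design point from the thread is correctness: when items are pushed with *lower-bound* keys, an item popped from the heap can only be returned once its rechecked exact distance is no larger than the lower bound of the next heap entry.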
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Wed, Oct 1, 2014 at 1:34 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-10-01 01:03:35 +1300, David Rowley wrote: On Wed, Oct 1, 2014 at 12:01 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-09-30 23:25:45 +1300, David Rowley wrote: I've not quite gotten my head around how we might stop the unneeded relation from being the primary path to join the other inner relations, i.e. what would stop the planner making a plan that hashed all the other relations and planned to perform a sequence scan on the relation that we have no need to scan (because the foreign key tells us the join is pointless). If we were not to use that relation then we'd just have a bunch of hash tables with no path to join them up. If we did anything to force the planner into creating a plan that would suit skipping relations, then we could possibly be throwing away the optimal plan. Right?

I'm not 100% sure I understand your problem description, but let me describe how I think this would work. During planning, you'd emit exactly the same plan as you do today, with two exceptions: a) When costing a node where one side of a join is very likely to be removable, you'd cost it nearly as if there wasn't a join.

Ok, given the tables:

    create table t1 (x int primary key);
    create table t2 (y int primary key);

suppose the planner came up with something like:

    test=# explain (costs off) select t2.* from t1 inner join t2 on t1.x=t2.y;
            QUERY PLAN
    ---------------------------
     Hash Join
       Hash Cond: (t1.x = t2.y)
       ->  Seq Scan on t1
       ->  Hash
             ->  Seq Scan on t2

If we had a foreign key...

    alter table t2 add constraint t2_y_fkey foreign key (y) references t1 (x);

...the join to t1 could possibly be ignored by the executor... but there's a problem, as the plan states we're going to seqscan then hash that relation, then seqscan t1 with a hash lookup on each of t1's rows. In this case how would the executor skip the scan on t1?
I can see how this might work if it was t2 that we were removing, as we'd just skip the hash lookup part in the hash join node.

Hm, right. But that doesn't seem like a fatal problem to me. The planner knows about t1/t2 and Seq(t1), Seq(t2), not just Hash(Seq(t2)). So it can tell the HashJoin node that when the 'shortcut' qualifier is true, it should source everything from Seq(t2). Since the sequence scan doesn't care about the node on top, that doesn't seem to be overly dramatic? Obviously reality makes this a bit more complicated...

Ok, after a bit of study on the hash join code I can see what you mean, it shouldn't really matter. I've started working on this now and I've made some changes in analyzejoins.c so that instead of removing joins, it just marks the RangeTblEntry, setting a new flag named skipJoinPossible to true. (I'll think of a better name later) I'm starting off with hash joins and I'm hacking a bit at ExecInitHashJoin. I want to add 2 bool flags to HashJoinState, outerIsRequired and innerIsRequired. I think if both of these flags are set then we can just abort the join altogether. Though in order to set these flags I need to identify which relations are for the outer and which are for the inner side of the join. I've got the logic for that only partially worked out. My understanding of it so far is that I just need to look at hjstate->js.ps.lefttree and righttree. Inner being right, and outer being left. I'm a little stuck on more complex cases where the scan is nested deeper in the tree and I'm not quite sure on the logic I should be using to navigate to it.
Take the following plan: (which I've amended to mark the left and right nodes)

    explain select t1.* from t1 inner join t2 on t1.t2_id = t2.id inner join t3 on t2.t3_id = t3.id;
                                 QUERY PLAN
    -------------------------------------------------------------------------------------
     Hash Join  (cost=122.15..212.40 rows=2140 width=8)
       Hash Cond: (t2.t3_id = t3.id)
       ->  Hash Join  (cost=58.15..118.98 rows=2140 width=12)           (left node)
             Hash Cond: (t1.t2_id = t2.id)
             ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=8)   (left node)
             ->  Hash  (cost=31.40..31.40 rows=2140 width=8)            (right node)
                   ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=8)  (left node)
       ->  Hash  (cost=34.00..34.00 rows=2400 width=4)                  (right node)
             ->  Seq Scan on t3  (cost=0.00..34.00 rows=2400 width=4)   (left node)

The schema is set up in such a way that the joins to t2 and t3 can be... skipped, and my new code in analyzejoin.c marks the RangeTblEntry records for this relation to reflect that. During ExecInitHashJoin for the join between t2 and t3, if I want to find t2 in the tree, I'd need to do hjstate->js.ps.lefttree->righttree->lefttree... (which I know just from looking at the explain output) I just can't work out the logic behind where the scan
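For what it's worth, the lefttree/righttree navigation being puzzled over above amounts to a depth-first walk of a binary plan tree. A toy standalone sketch follows; PlanNode and find_scan are hypothetical names for illustration, not the PlanState API:

```c
#include <stddef.h>
#include <string.h>

/*
 * Toy plan tree: each node has an optional relation name (set only on
 * scan nodes) plus lefttree (outer) and righttree (inner) children,
 * mirroring how executor state nodes link their children.
 */
typedef struct PlanNode
{
    const char *scanrel;        /* non-NULL for scan nodes */
    struct PlanNode *lefttree;  /* outer subplan */
    struct PlanNode *righttree; /* inner subplan */
} PlanNode;

/* Depth-first search for the scan node over a given relation. */
static PlanNode *
find_scan(PlanNode *node, const char *relname)
{
    PlanNode *found;

    if (node == NULL)
        return NULL;
    if (node->scanrel && strcmp(node->scanrel, relname) == 0)
        return node;
    if ((found = find_scan(node->lefttree, relname)) != NULL)
        return found;
    return find_scan(node->righttree, relname);
}
```

On a toy version of the plan above (top HashJoin, with the t1/t2 join as its outer child and the t3 Hash as its inner child), the t2 scan is reached via lefttree, then righttree, then lefttree, without hard-coding that path.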
[HACKERS] Improve automatic analyze messages for inheritance trees
I noticed that analyze messages shown by autovacuum don't discriminate between non-inherited cases and inherited cases, as shown in the below example:

LOG:  automatic analyze of table "postgres.public.pt" system usage: CPU 0.00s/0.01u sec elapsed 0.06 sec
LOG:  automatic analyze of table "postgres.public.pt" system usage: CPU 0.00s/0.02u sec elapsed 0.11 sec

(The first one is for the table postgres.public.pt and the second one is for the table inheritance tree postgres.public.pt.) So, I'd like to propose improving the messages for inherited cases, in order to easily distinguish such cases from non-inherited cases. Please find attached a patch. I'll add this to the upcoming CF.

Thanks,

Best regards,
Etsuro Fujita

*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
***************
*** 648,659 **** do_analyze_rel(Relation onerel, VacuumStmt *vacstmt,
  	if (Log_autovacuum_min_duration == 0 ||
  		TimestampDifferenceExceeds(starttime, GetCurrentTimestamp(),
  								   Log_autovacuum_min_duration))
! 		ereport(LOG,
! 				(errmsg("automatic analyze of table \"%s.%s.%s\" system usage: %s",
! 						get_database_name(MyDatabaseId),
! 						get_namespace_name(RelationGetNamespace(onerel)),
! 						RelationGetRelationName(onerel),
! 						pg_rusage_show(&ru0))));
  }

  /* Roll back any GUC changes executed by index functions */
--- 648,669 ----
  	if (Log_autovacuum_min_duration == 0 ||
  		TimestampDifferenceExceeds(starttime, GetCurrentTimestamp(),
  								   Log_autovacuum_min_duration))
! 	{
! 		if (inh)
! 			ereport(LOG,
! 					(errmsg("automatic analyze of table inheritance tree \"%s.%s.%s\" system usage: %s",
! 							get_database_name(MyDatabaseId),
! 							get_namespace_name(RelationGetNamespace(onerel)),
! 							RelationGetRelationName(onerel),
! 							pg_rusage_show(&ru0))));
! 		else
! 			ereport(LOG,
! 					(errmsg("automatic analyze of table \"%s.%s.%s\" system usage: %s",
! 							get_database_name(MyDatabaseId),
! 							get_namespace_name(RelationGetNamespace(onerel)),
! 							RelationGetRelationName(onerel),
! 							pg_rusage_show(&ru0))));
! 	}
  }

  /* Roll back any GUC changes executed by index functions */
Re: [HACKERS] pg_receivexlog and replication slots
On Mon, Oct 6, 2014 at 7:14 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-10-04 14:25:27 +0900, Michael Paquier wrote: And as I am on it, attached is a patch that can be applied to master and REL9_4_STABLE to rename the --create and --drop to --create-slot and --drop-slot. Thanks.

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index c48cecc..585d7b0 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -91,8 +91,8 @@ usage(void)
 			 " time between status packets sent to server (default: %d)\n"), (standby_message_timeout / 1000));
 	printf(_("  -S, --slot=SLOT        name of the logical replication slot\n"));
 	printf(_("\nAction to be performed:\n"));
-	printf(_("      --create           create a new replication slot (for the slot's name see --slot)\n"));
-	printf(_("      --start            start streaming in a replication slot (for the slot's name see --slot)\n"));
+	printf(_("      --create-slot      create a new replication slot (for the slot's name see --slot)\n"));
+	printf(_("      --start-slot       start streaming in a replication slot (for the slot's name see --slot)\n"));
 	printf(_("      --drop             drop the replication slot (for the slot's name see --slot)\n"));
 	printf(_("\nReport bugs to <pgsql-b...@postgresql.org>.\n"));
 }
@@ -618,9 +618,9 @@ main(int argc, char **argv)
 		{"status-interval", required_argument, NULL, 's'},
 		{"slot", required_argument, NULL, 'S'},
 		/* action */
-		{"create", no_argument, NULL, 1},
+		{"create-slot", no_argument, NULL, 1},
 		{"start", no_argument, NULL, 2},
-		{"drop", no_argument, NULL, 3},
+		{"drop-slot", no_argument, NULL, 3},
 		{NULL, 0, NULL, 0}

Uh? You're documenting --start-slot here and a couple other places, but you implement --drop-slot. And --start-slot doesn't seem to make much sense to me.

+-+, yes. It was aimed to be --create-slot :) Perhaps that was caused by the lack of caffeine. Thanks for noticing and fixing.
--
Michael
Re: [HACKERS] pg_receivexlog and replication slots
Hi,

On 2014-10-04 15:01:03 +0900, Michael Paquier wrote: On Fri, Oct 3, 2014 at 8:57 PM, Andres Freund and...@2ndquadrant.com wrote:

     <para>
    +<application>pg_receivexlog</application> can run in one of two following
    +modes, which control physical replication slot:

I don't think that's good enough. There's also the important mode where it's not doing --create/--drop at all.

Well, yes, however the third mode is not explicitly present, and I don't see much point in adding a --start mode thinking backward-compatibility. Now, I refactored a bit the documentation to mention that pg_receivexlog can perform additional actions to control replication slots. I added as well in the portion of option --slot how it interacts with --create-slot and --drop-slot.

That's better.

    +	if (db_name)
    +	{
    +		fprintf(stderr,
    +				_("%s: database defined for replication connection \"%s\"\n"),
    +				progname, replication_slot);
    +		disconnect_and_exit(1);
    +	}

I don't like 'defined' here. 'replication connection unexpectedly is database specific' or something would be better.

Sure. IMO the error message should as well mention the replication slot being used, so I reformulated it as such: replication connection using slot "foo" is unexpectedly database specific

I don't see why the slot should be there. If this has gone wrong it's not related to the slot.

    +<application>pg_receivexlog</application> can perform one of the two
    +following actions in order to control physical replication slots:
    +
    +<variablelist>
    + <varlistentry>
    +  <term><option>--create-slot</option></term>
    +  <listitem>
    +   <para>
    +    Create a new physical replication slot with the name specified in
    +    <option>--slot</option>, then start stream.
    +   </para>
    +  </listitem>
    + </varlistentry>

*, then start to stream WAL.
    +	if (replication_slot == NULL && (do_drop_slot || do_create_slot))
    +	{
    +		fprintf(stderr, _("%s: replication slot needed with action --create-slot or --drop-slot\n"), progname);
    +		fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
    +				progname);
    +		exit(1);
    +	}

I changed this to refer to --slot.

    +	/*
    +	 * Run IDENTIFY_SYSTEM so we can get the timeline and current xlog
    +	 * position.
    +	 */
    +	if (!RunIdentifySystem(conn, NULL, NULL, NULL, &db_name))
    +		disconnect_and_exit(1);

Comment is wrongly copied. Obviously we're neither looking at the timeline nor the WAL position.

Pushed with these adjustments. Amit, Fujii: Sorry for not mentioning you as reviewers of the patchset. I hadn't followed the earlier development and thus somehow missed that you'd both done a couple rounds of review.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Feasibility of supporting bind params for all command types
Tom Lane said: ... Craig Ringer cr...@2ndquadrant.com writes: While looking at an unrelated issue in PgJDBC I noticed that it's difficult for users and the driver to tell in advance if a given statement will support bind parameters. It's not that hard ;-) ... if it ain't SELECT/INSERT/UPDATE/DELETE, it won't accept parameters.

Yes, it is as easy as that. That's exactly what DBD::Pg does - looks at the first word of the statement. Although you also need to add VALUES and WITH to that list. :)

As a result, some code that worked with PgJDBC using the v2 protocol will fail with the v3 protocol, e.g. It'd be nice not to force users to do their own escaping of literals in non-plannable statements. Before embarking on anything like this I thought I'd check and see if anyone's looked into supporting bind parameters in utility statements, or if not, if anyone has any ideas about the feasibility of adding such support.

I don't think that's a hill you want to conquer. Let that code relying on v2 behavior get rewritten, or make the driver smart enough to handle it automagically the best it can.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201410060710 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] WAL format and API changes (9.5)
On 2014-10-06 14:19:39 +0300, Heikki Linnakangas wrote: On 10/06/2014 04:42 AM, Michael Paquier wrote: On Fri, Oct 3, 2014 at 9:51 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So I now have a refactoring patch ready that I'd like to commit (the attached two patches together), but to be honest, I have no idea why the second patch is so essential to performance. Thanks. I did some more tests with master, master+patch1, master+patch1+CRC refactoring, but I am not able to see any performance difference with pgbench (--no-vacuum, -t) and the test suite you provided, just some noise that barely changed performance. Thanks for the confirmation. I'm really going crazy with benchmarking this. Sometimes I see a big difference, the next day it's gone. A usual suspect for this is turbo mode and power control. It often already helps to disable the latter to get much more reproducible benchmark results. Barring objections, I'll commit this, and then continue benchmarking the second patch with the WAL format and API changes. I'd like to have a look at it beforehand. I've not yet really looked, but on a quick readthrough XLogInsertRecData() staying in xlog.c doesn't make me happy... Greetings, Andres Freund
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 03/10/14 23:12, Andres Freund wrote: On 2014-10-03 17:31:45 +0200, Marco Nenciarini wrote: I've updated the wiki page https://wiki.postgresql.org/wiki/Incremental_backup following the result of discussion on hackers. Compared to the first version, we switched from a timestamp+checksum based approach to one based on LSN. This patch adds an option to pg_basebackup and to the replication protocol BASE_BACKUP command to generate a backup_profile file. It is almost useless by itself, but it is the foundation on which we will build the file-based incremental backup (and hopefully a block-based incremental backup after it). Any comment will be appreciated. In particular I'd appreciate comments on the correctness of the relnode file detection and LSN extraction code.

Can you describe the algorithm you implemented in words?

Here is the relnode file detection algorithm: I've added a has_relfiles parameter to the sendDir function. If has_relfiles is true, every file in the directory is tested against the validateRelfilenodeName function. If the response is true, the maxLSN value is computed for the file. The sendDir function is called with has_relfiles=true by the sendTablespace function, and by sendDir itself when it is recursing into a subdirectory:
* if has_relfiles is true
* if we are recursing into a ./global or ./base directory

The validateRelfilenodeName function has been taken from the pg_computemaxlsn patch.
It's short enough to be pasted here:

static bool
validateRelfilenodename(char *name)
{
	int		pos = 0;

	while ((name[pos] >= '0') && (name[pos] <= '9'))
		pos++;

	if (name[pos] == '_')
	{
		pos++;
		while ((name[pos] >= 'a') && (name[pos] <= 'z'))
			pos++;
	}

	if (name[pos] == '.')
	{
		pos++;
		while ((name[pos] >= '0') && (name[pos] <= '9'))
			pos++;
	}

	if (name[pos] == 0)
		return true;
	return false;
}

To compute the maxLSN for a file, as the file is sent in TAR_SEND_SIZE chunks (32kb) and it is always a multiple of the block size, I've added the following code inside the send cycle:

+			char	   *page;
+
+			/* Scan every page to find the max file LSN */
+			for (page = buf; page < buf + (off_t) cnt; page += (off_t) BLCKSZ)
+			{
+				pagelsn = PageGetLSN(page);
+				if (filemaxlsn < pagelsn)
+					filemaxlsn = pagelsn;
+			}
+

Regards,
Marco

--
Marco Nenciarini - 2ndQuadrant Italy
PostgreSQL Training, Services and Support
marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
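As a quick sanity check of the name-matching logic quoted above (digits, an optional underscore-and-lowercase fork suffix, an optional dot-and-digits segment suffix), here is a self-contained version that compiles in isolation; the comparison operators are written out explicitly and the example names used in the tests are illustrative, not taken from the thread:

```c
#include <stdbool.h>

/*
 * Self-contained version of the validateRelfilenodename() logic: accept
 * a run of digits, then an optional "_<lowercase>" fork suffix, then an
 * optional ".<digits>" segment suffix, and require end-of-string after.
 */
static bool
validate_relfilenode_name(const char *name)
{
    int pos = 0;

    while (name[pos] >= '0' && name[pos] <= '9')
        pos++;

    if (name[pos] == '_')
    {
        pos++;
        while (name[pos] >= 'a' && name[pos] <= 'z')
            pos++;
    }

    if (name[pos] == '.')
    {
        pos++;
        while (name[pos] >= '0' && name[pos] <= '9')
            pos++;
    }

    return name[pos] == '\0';
}
```

Note that, as written, the logic also accepts names with no leading digits at all (e.g. an empty string), since each of the three scans is optional; whether that matters depends on what else lives in the scanned directories.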
Re: [HACKERS] pg_receivexlog and replication slots
On Mon, Oct 6, 2014 at 8:01 PM, Andres Freund and...@2ndquadrant.com wrote: Pushed with these adjustments. Thanks. The portions changed look fine to me. -- Michael
Re: [HACKERS] Promise index tuples for UPSERT
On 3 October 2014 11:54, Heikki Linnakangas hlinnakan...@vmware.com wrote: Simon's approach would actually pass that test case just fine. It inserts the (promise) index tuple first, and heap tuple only after that. It will fail the test case with more than one unique index, however.

Please explain what you mean by "fail" here? My understanding of what you're saying is that if
* we have a table with >1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current UPDATE/INSERT. Is there a business use case that requires that? (Or exactly what you meant, if that isn't it?)

My view is if we are going to base the whole design on this point, then we need to have it very clearly accessible for all to understand.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback
On 09/29/2014 01:13 PM, furu...@pm.nttdata.co.jp wrote: I don't understand what this patch does. When would you want to use the new --reply-fsync option? Is there any reason *not* to use it? In other words, do we need an option for this, couldn't you just always send the feedback message after fsync?

Thanks for the comment. The --reply-fsync option is intended for use in synchronous mode. By specifying the -F option and the --slot option, the process calls fsync() when it receives the WAL, and the flush location is set in the feedback message. The interval of sending feedback messages depends on the -s option in this state, so in the case of synchronous mode, waiting for the feedback message would occur. Therefore, the --reply-fsync option is necessary, because it can send the feedback message after fsync without waiting for the interval of the -s option. The reason for not always sending the feedback message after fsync without waiting for the interval of the -s option is to answer the needs of those who want to use fsync only (NOT using synchronous mode).

I still don't get it. AFAICS there are two ways to use pg_receivexlog. Either you use it as a synchronous standby, or not. What set of options would you pass if you want to use it as a synchronous standby? And if you don't? Could we just have a single --synchronous flag, instead of -F and --reply-fsync?

- Heikki
Re: [HACKERS] Add regression tests for autocommit-off mode for psql and fix some omissions
On Mon, Oct 6, 2014 at 7:36 PM, Feike Steenbergen feikesteenber...@gmail.com wrote: I would like to propose to add a regression test for all statements that call PreventTransactionChain in autocommit-off mode. I propose to add these tests to src/test/regress/sql/psql.sql as this is a psql-specific mode. Alternatively an isolated test called autocommit.sql could be created. Putting all this stuff in psql.sql is good enough IMO. During the writing of the regression test I found another statement not covered in the current function: DROP INDEX CONCURRENTLY. That's a good catch and it should be a separate patch. This could even be considered for a back-patch down to 9.2. Thoughts? I have created a patch consisting of a regression test and adding DROP INDEX CONCURRENTLY to command_no_begin. CREATE DATABASE and DROP DATABASE are not commands present (not allowed?) in the regression suite. ALTER SYSTEM has no tests as well, and REINDEX DATABASE may take time so they may be better ripped off... Also tests for CLUSTER without arguments, transaction commands, DISCARD and VACUUM would be good things. Regards, -- Michael
Re: [HACKERS] Promise index tuples for UPSERT
On 10/06/2014 03:05 PM, Simon Riggs wrote: On 3 October 2014 11:54, Heikki Linnakangas hlinnakan...@vmware.com wrote: Simon's approach would actually pass that test case just fine. It inserts the (promise) index tuple first, and heap tuple only after that. It will fail the test case with more than one unique index, however. Please explain what you mean by fail here? I meant that the test case will sometimes deadlock, and some transactions will therefore be rolled back. My understanding of what you're saying is that if
* we have a table with more than 1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current UPDATE/INSERT. Right. To be precise: you don't need to update both of the columns in the same transaction, it's enough that some of the concurrent transactions update one column, while other transactions update the other column. Is there a business use case that requires that? I don't know. Conceivably any use case where you have two unique constraints to begin with. - Heikki
Re: [HACKERS] Promise index tuples for UPSERT
On 10/06/2014 03:21 PM, Heikki Linnakangas wrote: On 10/06/2014 03:05 PM, Simon Riggs wrote: My understanding of what you're saying is that if
* we have a table with more than 1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current UPDATE/INSERT. Right. To be precise: you don't need to update both of the columns in the same transaction, it's enough that some of the concurrent transactions update one column, while other transactions update the other column. Ok, that didn't make much sense. With UPSERT, you have to specify values for both columns. But it's sufficient that you have a mix of transactions where only some are UPSERTs, and others are regular UPDATEs on one of the columns. - Heikki
Re: [HACKERS] WAL format and API changes (9.5)
On Mon, Oct 6, 2014 at 8:19 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10/06/2014 04:42 AM, Michael Paquier wrote: On Fri, Oct 3, 2014 at 9:51 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: So I now have a refactoring patch ready that I'd like to commit (the attached two patches together), but to be honest, I have no idea why the second patch is so essential to performance. Thanks. I did some more tests with master, master+patch1, master+patch1+CRC refactoring, but I am not able to see any performance difference with pgbench (--no-vacuum, -t) and the test suite you provided, just some noise that barely changed performance. Thanks for the confirmation. I'm really going crazy with benchmarking this. Sometimes I see a big difference, the next day it's gone. The benchmark paradigms. * Fixed XLogSaveBufferForHint. It didn't initialize the BkpBlock struct, rendering it completely broken. Note for other reviewers: that's represented by this addition in XLogSaveBufferForHint: + /* Make a BkpBlock struct representing the buffer */ + XLogFillBkpBlock(buffer, buffer_std, &bkpb); Regards, -- Michael
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 04/10/14 08:35, Michael Paquier wrote: On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Compared to the first version, we switched from a timestamp+checksum based approach to one based on LSN. Cool. This patch adds an option to pg_basebackup and to the replication protocol BASE_BACKUP command to generate a backup_profile file. It is almost useless by itself, but it is the foundation on which we will build the file-based incremental backup (and hopefully a block-based incremental backup after it). Hm. I am not convinced by the backup profile file. What's wrong with having a client send only an LSN position to get a set of files (or partial files filled with blocks) newer than the position given, and have the client do all the rebuild analysis? The main problem I see is the following: how can a client detect a truncated or removed file? Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] SSL regression test suite
On 08/12/2014 03:53 PM, Heikki Linnakangas wrote: On 08/12/2014 02:28 PM, Andres Freund wrote: On 2014-08-12 14:01:18 +0300, Heikki Linnakangas wrote: Also, to test sslmode=verify-full, where the client checks that the server certificate's hostname matches the hostname that it connected to, you need to have two aliases for the same server, one that matches the certificate and one that doesn't. But I think I found a way around that part; if the certificate is set up for localhost, and connect to 127.0.0.1, you get a mismatch. Alternatively, and to e.g. test wildcard certs and such, I think you can specify both host and hostaddr to connect to connect without actually doing a dns lookup. Oh, I didn't know that's possible! Yeah, that's a good solution. Here's a new version of the SSL regression suite I wrote earlier. It now specifies both host and hostaddr in the connection string as Andres suggested, so it no longer requires changes to network configuration. I added a bunch of tests for the SAN feature that Alexey Klyukin wrote and was committed earlier. Plus a lot of miscellaneous cleanup. This probably needs some further cleanup before it's ready for committing. One issues is that it creates a temporary cluster that listens for TCP connections on localhost, which isn't safe on a multi-user system. - Heikki diff --git a/src/test/Makefile b/src/test/Makefile index 0fd7eab..e6a7154 100644 --- a/src/test/Makefile +++ b/src/test/Makefile @@ -12,6 +12,6 @@ subdir = src/test top_builddir = ../.. 
include $(top_builddir)/src/Makefile.global -SUBDIRS = regress isolation +SUBDIRS = regress isolation ssl $(recurse) diff --git a/src/test/ssl/Makefile b/src/test/ssl/Makefile new file mode 100644 index 000..8e0db47 --- /dev/null +++ b/src/test/ssl/Makefile @@ -0,0 +1,59 @@ +#- +# +# Makefile for src/test/ssl +# +# Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group +# Portions Copyright (c) 1994, Regents of the University of California +# +# src/test/ssl/Makefile +# +#- + +subdir = src/test/ssl +top_builddir = ../../.. +include $(top_builddir)/src/Makefile.global + +CERTIFICATES := serverroot server-cn-and-alt-names \ + server-cn-only server-single-alt-name server-multiple-alt-names \ + server-no-names \ + clientroot client + +SSLFILES := $(CERTIFICATES:%=ssl/%.key) $(CERTIFICATES:%=ssl/%.crt) + +sslfiles: $(SSLFILES) + +# Rule for creating private/public key pairs +ssl/%.key: + openssl genrsa -out $@ 1024 + chmod 0600 $@ + +# Rule for creating CA certificates (client and server) +ssl/%root.crt: ssl/%root.key %root.config + openssl req -new -key ssl/$*root.key -days 36500 -out ssl/$*root.crt -x509 -config $*root.config + echo 00 ssl/$*root.srl + +# Server certificates, signed by server root CA: +ssl/server-%.crt: ssl/server-%.key ssl/serverroot.crt +# Generate a Certificate Sign Request (CSR) + openssl req -new -key ssl/server-$*.key -out ssl/server-$*.csr -config server-$*.config +# Sign the certificate with the right CA + openssl x509 -req -in ssl/server-$*.csr -CA ssl/serverroot.crt -CAkey ssl/serverroot.key -CAserial ssl/serverroot.srl -out ssl/server-$*.crt -extfile server-$*.config -extensions v3_req + rm ssl/server-$*.csr + +# Client certificate, signed by the client root CA: +ssl/client.crt: ssl/client.key ssl/clientroot.crt +# Generate a Certificate Sign Request (CSR) + openssl req -new -key ssl/client.key -out ssl/client.csr -config client.config +# Sign the certificate with the right CA + openssl x509 -req -in ssl/client.csr -CA 
ssl/clientroot.crt -CAkey ssl/clientroot.key -CAserial ssl/clientroot.srl -out ssl/client.crt + rm ssl/client.csr + +sslfiles-clean: + rm -f $(SSLFILES) ssl/client-root.srl ssl/server-root.srl + +check: + $(prove_check) + +installcheck: + rm -rf tmp_check + $(prove_installcheck) diff --git a/src/test/ssl/README b/src/test/ssl/README new file mode 100644 index 000..dfd2d79 --- /dev/null +++ b/src/test/ssl/README @@ -0,0 +1,43 @@ +src/test/ssl/README + +SSL regression tests + + +This directory contains a test suite for SSL support. + +Running the tests += + +make check + +Certificates + + +The test suite needs a set of public/private key pairs and certificates to +run: + +serverroot.crt: CA used to sign server certificates +clientroot.crt: CA used to sign client certificates +server-*.crt: server certificate, with small variations in the hostnames + present in the certificate. +client.crt: a client certificate, for user ssltestuser + +For convenience, these keypairs and certificates are included in the ssl/ +subdirectory, but the Makefile also contains a rule, make sslfiles, to +recreate them if you want to make changes. + + +TODO + + +* Allow the client-side of the tests to be run on different host easily. + Currently,
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 4:49 AM, Marti Raudsepp ma...@juffo.org wrote: On Mon, Oct 6, 2014 at 5:12 AM, Marti Raudsepp ma...@juffo.org wrote: On Mon, Oct 6, 2014 at 4:17 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: CREATE INDEX ... [ IF NOT EXISTS [ name ] ] ON ... I think this one is wrong now. I see now, I think you meant: CREATE INDEX ... [ [ IF NOT EXISTS ] name ] ON ... If yes, +1 for this, there's no redundancy any more. You are correct... Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index e469b17..ecebcbf 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -21,7 +21,7 @@ PostgreSQL documentation refsynopsisdiv synopsis -CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/replaceable ] ON replaceable class=parametertable_name/replaceable [ USING replaceable class=parametermethod/replaceable ] +CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] replaceable class=parametername/replaceable ] ON replaceable class=parametertable_name/replaceable [ USING replaceable class=parametermethod/replaceable ] ( { replaceable class=parametercolumn_name/replaceable | ( replaceable class=parameterexpression/replaceable ) } [ COLLATE replaceable class=parametercollation/replaceable ] [ replaceable class=parameteropclass/replaceable ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ) [ WITH ( replaceable class=PARAMETERstorage_parameter/replaceable = replaceable class=PARAMETERvalue/replaceable [, ... 
] ) ] [ TABLESPACE replaceable class=parametertablespace_name/replaceable ] @@ -99,6 +99,18 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ replaceable class=parametername/ variablelist varlistentry + termliteralIF NOT EXISTS/literal/term + listitem + para +Do not throw an error if the index already exists. A notice is issued +in this case. Note that there is no guarantee that the existing index +is anything like the one that would have been created. +Index name is required when IF NOT EXISTS is specified. + /para + /listitem + /varlistentry + + varlistentry termliteralUNIQUE/literal/term listitem para diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index ee10594..8905e30 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -697,7 +697,8 @@ index_create(Relation heapRelation, bool allow_system_table_mods, bool skip_build, bool concurrent, - bool is_internal) + bool is_internal, + bool if_not_exists) { Oid heapRelationId = RelationGetRelid(heapRelation); Relation pg_class; @@ -773,10 +774,22 @@ index_create(Relation heapRelation, elog(ERROR, shared relations must be placed in pg_global tablespace); if (get_relname_relid(indexRelationName, namespaceId)) + { + if (if_not_exists) + { + ereport(NOTICE, + (errcode(ERRCODE_DUPLICATE_TABLE), + errmsg(relation \%s\ already exists, skipping, + indexRelationName))); + heap_close(pg_class, RowExclusiveLock); + return InvalidOid; + } + ereport(ERROR, (errcode(ERRCODE_DUPLICATE_TABLE), errmsg(relation \%s\ already exists, indexRelationName))); + } /* * construct tuple descriptor for index tuples diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c index 160f006..5ef6dcc 100644 --- a/src/backend/catalog/toasting.c +++ b/src/backend/catalog/toasting.c @@ -342,7 +342,7 @@ create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid, rel-rd_rel-reltablespace, collationObjectId, classObjectId, coloptions, (Datum) 0, true, false, false, false, - true, false, 
false, true); + true, false, false, true, false); heap_close(toast_rel, NoLock); diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c index 8a1cb4b..a03773b 100644 --- a/src/backend/commands/indexcmds.c +++ b/src/backend/commands/indexcmds.c @@ -610,7 +610,14 @@ DefineIndex(Oid relationId, stmt-isconstraint, stmt-deferrable, stmt-initdeferred, allowSystemTableMods, skip_build || stmt-concurrent, - stmt-concurrent, !check_rights); + stmt-concurrent, !check_rights, + stmt-if_not_exists); + + if (!OidIsValid(indexRelationId)) + { + heap_close(rel, NoLock); + return indexRelationId; + } /* Add any requested comment */ if (stmt-idxcomment != NULL) diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 225756c..39b55db 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -2907,6 +2907,7 @@
Re: [HACKERS] Promise index tuples for UPSERT
On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com wrote: My understanding of what you're saying is that if
* we have a table with more than 1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current UPDATE/INSERT. Right. To be precise: you don't need to update both of the columns in the same transaction, it's enough that some of the concurrent transactions update one column, while other transactions update the other column.

CREATE TABLE foo (
    id1 integer not null primary key,
    id2 integer not null unique,
    val integer
);

Given the table above, which one do we mean?
1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo SET val = W WHERE id2 = X; we can deadlock
(2) is a common use case; (1) is a very rare use case and most likely a poor design. If the user wishes to protect against such deadlocks they retain the option to use row locking. Yes? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Commitfest: patches Ready for Committer
Heikki Linnakangas hlinnakan...@vmware.com writes: Committers: Could you please pick a patch, and commit if appropriate? Or if there's a patch there that you think should *not* be committed, please speak up. The custom plan API thing may be marked ready for committer, but that doesn't mean it's committable, or even that there is any consensus about whether we want it or what it should look like. The levenshtein-distance thingy seems to still be a topic of debate as well, both as to how we're going to refactor the code and as to what the exact hinting rules ought to be. If some committer wants to take charge of it and resolve those issues, fine; but I don't want to see it done by just blindly committing whatever the last submitted version was. I've not paid much of any attention to the other four. regards, tom lane
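For readers following the hinting thread: the core of the errorMissingColumn() hint is a plain Levenshtein edit distance between the misspelled column name and each candidate, with the closest one (or two) suggested. A minimal sketch of that distance function, in Python for illustration only — the patch itself is C code shared with contrib/fuzzystrmatch, and it additionally supports weighted costs:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance: insert, delete, and
    substitute all cost 1.  Used to pick the closest column name."""
    prev = list(range(len(b) + 1))      # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        cur = [i]                       # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]

# e.g. hinting "columname" when the user typed "colname":
assert levenshtein("colname", "columname") == 2
```

The hinting-rule debate Tom refers to is about thresholds layered on top of this (how large a distance still justifies a hint), not about the distance computation itself.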
Re: [HACKERS] Add regression tests for autocommit-off mode for psql and fix some omissions
On 6 October 2014 14:09, Michael Paquier michael.paqu...@gmail.com wrote: That's a good catch and it should be a separate patch. This could even be considered for a back-patch down to 9.2. Thoughts? If I isolate DROP INDEX concurrently, this patch would do the trick. 20141006_drop_index_concurrently.patch
Re: [HACKERS] Add regression tests for autocommit-off mode for psql and fix some omissions
Feike Steenbergen feikesteenber...@gmail.com writes: I would like to propose to add a regression test for all statements that call PreventTransactionChain in autocommit-off mode. What class of bug would that prevent exactly? It seems to me like something that would normally get forgotten when we add any new such statement. regards, tom lane
Re: [HACKERS] Promise index tuples for UPSERT
On 10/06/2014 04:44 PM, Simon Riggs wrote: On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com wrote: My understanding of what you're saying is that if
* we have a table with more than 1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would do using current UPDATE/INSERT. Right. To be precise: you don't need to update both of the columns in the same transaction, it's enough that some of the concurrent transactions update one column, while other transactions update the other column.

CREATE TABLE foo (
    id1 integer not null primary key,
    id2 integer not null unique,
    val integer
);

Given the table above, which one do we mean?
1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo SET val = W WHERE id2 = X; we can deadlock
(2) is a common use case, (1) is a very rare use case and most likely a poor design. Well, at least one of the statements has to be an UPSERT, and at least one of them has to update a column with a unique constraint on it. This pair of transactions could deadlock, for example: Transaction 1: INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE; Transaction 2: UPDATE foo SET id2 = X WHERE id1 = Y; That's made-up syntax, but the idea is that the first transaction attempts to insert a row with values id1=Y, id2=X, val=Z. If that fails because a row with id1=Y or id2=X already exists, then it's supposed to do nothing. If the user wishes to protect against such deadlocks they retain the option to use row locking. Yes? Sorry, I didn't understand that. Row locking? In general, this is of course a lot easier to implement if we restrict it so that it only works in some limited cases.
That may be fine, but then we have to be able to document clearly what the limitations are, and throw an error if you violate those limitations. - Heikki
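To make the hazard in this subthread concrete: with promise index tuples, each session effectively claims entries in the unique indexes one index at a time, so two sessions touching the two indexes in opposite orders can end up waiting on each other. A hand-wavy interleaving sketch (pseudocode, not real syntax; it assumes the two-unique-index table foo from upthread, and the exact blocking points depend on implementation details not settled in this thread):

```
session 1 (UPSERT):  insert promise tuple into the index on id1   -- claimed
session 2 (UPDATE):  insert new entry into the index on id2       -- claimed
session 1:           try to insert into the index on id2          -- blocks on session 2
session 2:           try to touch the entry in the index on id1   -- blocks on session 1
                     => classic opposite-order acquisition; the deadlock
                        detector eventually aborts one session
```

This is the same lock-ordering pattern that makes plain concurrent INSERTs into multiple unique indexes deadlock-prone today, which is why Heikki notes the behavior is not a regression so much as a limitation to document.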
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 4:27 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: create_index_if_not_exists_v7.patch Looks good to me. Marking ready for committer. If you have any feedback about my reviews, I would gladly hear it. I'm quite new to this. PS: You seem to be submitting many patches, but have you reviewed any recently? See: https://wiki.postgresql.org/wiki/Submitting_a_Patch#Mutual_Review_Offset_Obligations Each patch submitter to a CommitFest is expected to review at least one other patch Regards, Marti
Re: [HACKERS] Add generate_series(numeric, numeric)
I'm a bit confused about who I should be replying to, but since you were the last one with a patch... On Mon, Oct 6, 2014 at 11:44 AM, Ali Akbar the.ap...@gmail.com wrote: Thanks for the review. Attached the formatted patch according to your suggestion. + select * from generate_series(0.1::numeric, 10.0::numeric, 0.1::numeric); + generate_series + - + 0.1 ... + 10.0 + (100 rows) Unless there is a good reason, can you please keep individual test output fewer than 100 lines? I think the 41-line result is overkill as well. This just bloats the repository size unnecessarily. Also, I see that this patch was added to the 2014-10 commitfest and then deleted again (by user apaan). Why? Regards, Marti
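As a sanity check on the quoted test output, the intended semantics of the numeric variant can be modeled with exact decimal arithmetic. This is a rough Python sketch only — the actual patch is C using the numeric datatype — assuming the same inclusive bounds and zero-step error as the existing int/bigint variants:

```python
from decimal import Decimal

def generate_series_numeric(start, stop, step=Decimal(1)):
    # Inclusive bounds; direction given by the sign of step; zero step is an
    # error, mirroring generate_series(int, int, int).  Decimal stands in for
    # numeric's exact arithmetic -- float would accumulate rounding error.
    if step == 0:
        raise ValueError("step size cannot equal zero")
    cur = start
    while (cur <= stop) if step > 0 else (cur >= stop):
        yield cur
        cur += step

# The quoted regression test: 0.1 .. 10.0 by 0.1 yields exactly 100 rows.
vals = list(generate_series_numeric(Decimal("0.1"), Decimal("10.0"), Decimal("0.1")))
assert len(vals) == 100 and vals[-1] == Decimal("10.0")
```

The 100-row count is exact only because the arithmetic is exact, which is part of the point of a numeric variant; Marti's complaint is purely about test-output volume, not correctness.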
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Mon, Oct 6, 2014 at 5:57 AM, David Rowley dgrowle...@gmail.com wrote: Hm, right. But that doesn't seem like a fatal problem to me. The planner knows about t1/t2 and Seq(t1), Seq(t2), not just Hash(Seq(t2)). So it can tell the HashJoin node that when the 'shortcut' qualifier is true, it should source everything from Seq(t2). Since the sequence scan doesn't care about the node on top, that doesn't seem to be overly dramatic? Obviously reality makes this a bit more complicated... Ok, after a bit of study on the hash join code I can see what you mean, it shouldn't really matter. I've started working on this now and I've made some changes in analyzejoins.c so that instead of removing joins, it just marks the RangeTblEntry, setting a new flag named skipJoinPossible to true. (I'll think of a better name later) I'm starting off with hash joins and I'm hacking a bit at ExecInitHashJoin. I want to add 2 bool flags to HashJoinState, outerIsRequired and innerIsRequired. I think if both of these flags are set then we can just abort the join altogether. Though in order to set these flags I need to identify which relations are for the outer and which are for the inner side of the join. I've got the logic for that only partially worked out. My understanding of it so far is that I just need to look at the hjstate->js.ps.lefttree and righttree. Inner being right, and outer being left. I'm a little stuck on more complex cases where the scan is nested deeper in the tree and I'm not quite sure on the logic I should be using to navigate to it.
Take the following plan: (which I've amended to mark the left and right nodes)

explain select t1.* from t1 inner join t2 on t1.t2_id = t2.id inner join t3 on t2.t3_id = t3.id;

                                QUERY PLAN
 Hash Join  (cost=122.15..212.40 rows=2140 width=8)
   Hash Cond: (t2.t3_id = t3.id)
   ->  Hash Join  (cost=58.15..118.98 rows=2140 width=12)                (left node)
         Hash Cond: (t1.t2_id = t2.id)
         ->  Seq Scan on t1  (cost=0.00..31.40 rows=2140 width=8)        (left node)
         ->  Hash  (cost=31.40..31.40 rows=2140 width=8)                 (right node)
               ->  Seq Scan on t2  (cost=0.00..31.40 rows=2140 width=8)  (left node)
   ->  Hash  (cost=34.00..34.00 rows=2400 width=4)                       (right node)
         ->  Seq Scan on t3  (cost=0.00..34.00 rows=2400 width=4)        (left node)

The schema is set up in such a way that the joins to t2 and t3 can be... skipped, and my new code in analyzejoin.c marks the RangeTblEntry records for this relation to reflect that. During ExecInitHashJoin for the join between t2 and t3, if I want to find t2 in the tree, I'd need to do hjstate->js.ps.lefttree->righttree->lefttree... (which I know just from looking at the explain output) I just can't work out the logic behind where the scan node will actually be. At first I had thought something like, loop down the lefttree path until I reach a dead end, and that's the outer scan node, but in this case there's a right turn in there too, so that won't work. If I keep going down the left path I'd end up at t1, which is completely wrong. Can anyone shed any light on how I might determine where the scan rel is in the tree? I need to find it so I can check if the RangeTblEntry is marked as skip-able. I think you're probably going to need logic that knows about particular node types and does the right thing for each one. I think - but maybe I'm misunderstanding - what you're looking for is a function of the form Oid ScansOnePlainTableWithoutQuals().
The algorithm could be something like:

    switch (node type)
    {
        case T_SeqScanState:
            if (no quals)
                return the_appropriate_oid;
            return false;
        case T_HashJoin:
            decide whether we can ignore one side of the join
            fish out the node from the other side of the join
                (including reaching through the Hash node if necessary)
            return ScansOnePlainTableWithoutQuals(the node we fished out);
        ...other specific cases...
        default:
            return false;
    }

This seems messy, though. Can't the deferred trigger queue become non-empty at pretty much any point in time? At exactly what point are we making this decision, and how do we know the correct answer can't change after that point? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 6, 2014 at 8:59 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: On 04/10/14 08:35, Michael Paquier wrote: On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Compared to the first version, we switched from a timestamp+checksum based approach to one based on LSN. Cool. This patch adds an option to pg_basebackup and to the replication protocol BASE_BACKUP command to generate a backup_profile file. It is almost useless by itself, but it is the foundation on which we will build the file-based incremental backup (and hopefully a block-based incremental backup after it). Hm. I am not convinced by the backup profile file. What's wrong with having a client send only an LSN position to get a set of files (or partial files filled with blocks) newer than the position given, and have the client do all the rebuild analysis? The main problem I see is the following: how can a client detect a truncated or removed file? When you take a differential backup, the server needs to send some piece of information about every file so that the client can compare that list against what it already has. But a full backup does not need to include similar information. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
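A toy model of Marco's point: if the server only streams files containing blocks newer than the given LSN, nothing in that stream mentions files that were dropped or truncated since the previous backup. A per-file manifest (what the backup profile provides) makes the comparison possible on the client side. A Python sketch, with hypothetical field names and mapping shapes of my own invention:

```python
def reconcile(profile, local):
    """profile: {path: size} listing *every* file in the new backup
    (the role played by backup_profile); local: {path: size} as seen
    in the client's previous base backup.  Returns the files the client
    must delete and the files it must truncate/refetch."""
    removed = sorted(set(local) - set(profile))
    truncated = sorted(p for p in profile
                       if p in local and profile[p] < local[p])
    return removed, truncated

local = {"base/16384/16397": 8192 * 10, "base/16384/16398": 8192}
profile = {"base/16384/16397": 8192 * 4}   # 16398 was dropped, 16397 shrank
removed, truncated = reconcile(profile, local)
assert removed == ["base/16384/16398"]
assert truncated == ["base/16384/16397"]
```

Without the full manifest, `set(local) - set(profile)` cannot be computed at all, which is exactly Robert's answer: a differential backup needs some piece of information about every file, even unchanged ones.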
Re: [HACKERS] Last Commitfest patches waiting review
On Fri, Oct 3, 2014 at 12:14 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Sort support for text with strxfrm() poor man's keys Robert? What do you think of Peter's latest patch? I haven't had time to look at it yet. Can we move it to the next CommitFest? I spent a lot of time on this one, but I can't keep doing that forever, because, you know, other work. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Add regression tests for autocommit-off mode for psql and fix some omissions
It would test that, when AUTOCOMMIT is set to off, you do not run into: ERROR: [...] cannot run inside a transaction block when issuing one of the PreventTransactionChain commands listed in command_no_begin in src/bin/psql/common.c.
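The behavior under test can be illustrated with a psql session sketch (illustrative only; the exact error text varies per command):

```
\set AUTOCOMMIT off
VACUUM;     -- works: VACUUM is in command_no_begin, so psql does not
            -- send an implicit BEGIN first
SELECT 1;   -- psql now opens an implicit transaction block
VACUUM;     -- fails: ERROR: VACUUM cannot run inside a transaction block
```

The proposed regression test would exercise the first case for every command that calls PreventTransactionChain, so that a newly added command missing from command_no_begin shows up as an unexpected error in autocommit-off mode.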
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On 2014-10-06 10:46:09 -0400, Robert Haas wrote: This seems messy, though. Can't the deferred trigger queue become non-empty at pretty much any point in time? At exactly what point are we making this decision, and how do we know the correct answer can't change after that point? What we've been talking about is doing this during executor startup. And at that point we really don't care about new entries in the queue during query execution - we can't see them anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: CRC algorithm (was Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes)
On Tue, Sep 16, 2014 at 6:49 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: As it happens, I also wrote an implementation of Slice-by-4 the other day. If Heikki's version works I see little need to use my/Abhijit's patch. That version has part of it under the zlib license. If Heikki's version is a 'clean room', then I'd say we go with it. It looks really quite similar though... We can make minor changes like additional unrolling without problems later on. I used http://create.stephan-brumme.com/crc32/#slicing-by-8-overview as reference - you can probably see the similarity. Any implementation is going to look more or less the same, though; there aren't that many ways to write the implementation. So, it seems like the status of this patch is: 1. It probably has a bug, since Amit's testing seemed to show that it wasn't returning the same results as unpatched master. 2. The performance tests showed a significant win on an important workload. 3. It's not in any CommitFest anywhere. Given point #2, it seems like we ought to find a way to keep this from sliding into oblivion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
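For anyone picking this up: the correctness bar for any slicing implementation (point #1 above) is bit-for-bit equality with the byte-at-a-time CRC. Here is an illustrative slicing-by-4 in Python — reflected CRC-32 with polynomial 0xEDB88320, i.e. the zlib/Brumme-style algorithm, not PostgreSQL's actual code — that can be checked against known-good values:

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial

# table[0] is the classic byte-at-a-time lookup table; table[1..3] let us
# consume four input bytes per iteration (the slicing-by-4 scheme).
table = [[0] * 256 for _ in range(4)]
for i in range(256):
    crc = i
    for _ in range(8):
        crc = (crc >> 1) ^ (POLY if crc & 1 else 0)
    table[0][i] = crc
for i in range(256):
    for k in range(1, 4):
        table[k][i] = (table[k - 1][i] >> 8) ^ table[0][table[k - 1][i] & 0xFF]

def crc32_slice4(data: bytes) -> int:
    crc = 0xFFFFFFFF
    n = len(data) - len(data) % 4
    for i in range(0, n, 4):
        # XOR in a little-endian 32-bit word, then look up all four bytes at once
        crc ^= data[i] | data[i + 1] << 8 | data[i + 2] << 16 | data[i + 3] << 24
        crc = (table[3][crc & 0xFF] ^ table[2][(crc >> 8) & 0xFF] ^
               table[1][(crc >> 16) & 0xFF] ^ table[0][crc >> 24])
    for b in data[n:]:                       # leftover tail, byte at a time
        crc = (crc >> 8) ^ table[0][(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF

assert crc32_slice4(b"123456789") == 0xCBF43926   # standard CRC-32 check value
```

A test of this shape (slicing output vs. a trusted reference over many lengths and alignments) is exactly what would have caught the mismatch Amit observed.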
Re: [HACKERS] Patch to support SEMI and ANTI join removal
On Mon, Oct 6, 2014 at 10:59 AM, Andres Freund and...@2ndquadrant.com wrote: On 2014-10-06 10:46:09 -0400, Robert Haas wrote: This seems messy, though. Can't the deferred trigger queue become non-empty at pretty much any point in time? At exactly what point are we making this decision, and how do we know the correct answer can't change after that point? What we've been talking about is doing this during executor startup. And at that point we really don't care about new entries in the queue during query execution - we can't see them anyway. Ah, OK. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add generate_series(numeric, numeric)
+ select * from generate_series(0.1::numeric, 10.0::numeric, 0.1::numeric);
+  generate_series
+ -----------------
+  0.1
...
+  10.0
+ (100 rows)

Unless there is a good reason, can you please keep individual test output fewer than 100 lines? I think the 41-line result is overkill as well. This just bloats the repository size unnecessarily.

Okay. In this revision I cut the tests' results except the last one (the one that tests values just before numeric overflow).

Also, I see that this patch was added to the 2014-10 commitfest and then deleted again (by user apaan). Why?

User apaan is me. When I added it to the commitfest, the patch was listed there by me (apaan). I only added some bits to the original patch by Платон Малюгин, which was revised further by Michael Paquier. So I think they should have the credit for this patch, not me. In this situation, what should I do?

Thanks for the review, Marti!

Regards,
--
Ali Akbar

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 14074,14081 ****
        AND
       <tbody>
        <row>
         <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>)</function></literal></entry>
!        <entry><type>int</type> or <type>bigint</type></entry>
!        <entry><type>setof int</type> or <type>setof bigint</type> (same as argument type)</entry>
         <entry>
          Generate a series of values, from <parameter>start</parameter> to
          <parameter>stop</parameter> with a step size of one
--- 14074,14081 ----
        AND
       <tbody>
        <row>
         <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>)</function></literal></entry>
!        <entry><type>int</type>, <type>bigint</type> or <type>numeric</type></entry>
!        <entry><type>setof int</type>, <type>setof bigint</type>, or <type>setof numeric</type> (same as argument type)</entry>
         <entry>
          Generate a series of values, from <parameter>start</parameter> to
          <parameter>stop</parameter> with a step size of one
***************
*** 14084,14091 ****
        <row>
         <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>, <parameter>step</parameter>)</function></literal></entry>
!        <entry><type>int</type> or <type>bigint</type></entry>
!        <entry><type>setof int</type> or <type>setof bigint</type> (same as argument type)</entry>
         <entry>
          Generate a series of values, from <parameter>start</parameter> to
          <parameter>stop</parameter> with a step size of <parameter>step</parameter>
--- 14084,14091 ----
        <row>
         <entry><literal><function>generate_series(<parameter>start</parameter>, <parameter>stop</parameter>, <parameter>step</parameter>)</function></literal></entry>
!        <entry><type>int</type>, <type>bigint</type> or <type>numeric</type></entry>
!        <entry><type>setof int</type>, <type>setof bigint</type> or <type>setof numeric</type> (same as argument type)</entry>
         <entry>
          Generate a series of values, from <parameter>start</parameter> to
          <parameter>stop</parameter> with a step size of <parameter>step</parameter>
*** a/src/backend/utils/adt/numeric.c
--- b/src/backend/utils/adt/numeric.c
***************
*** 28,33 ****
--- 28,34 ----
  #include "access/hash.h"
  #include "catalog/pg_type.h"
+ #include "funcapi.h"
  #include "libpq/pqformat.h"
  #include "miscadmin.h"
  #include "nodes/nodeFuncs.h"
***************
*** 261,266 **** typedef struct NumericVar
--- 262,279 ----

  /* ----------
+  * Data for generate series
+  * ----------
+  */
+ typedef struct
+ {
+ 	NumericVar	current;
+ 	NumericVar	finish;
+ 	NumericVar	step;
+ } generate_series_numeric_fctx;
+
+
+ /* ----------
   * Some preinitialized constants
   * ----------
   */
***************
*** 1221,1226 **** numeric_floor(PG_FUNCTION_ARGS)
--- 1234,1346 ----
  	PG_RETURN_NUMERIC(res);
  }
+
+ /*
+  * generate_series_numeric() -
+  *
+  *	Generate series of numeric.
+  */
+ Datum
+ generate_series_numeric(PG_FUNCTION_ARGS)
+ {
+ 	return generate_series_step_numeric(fcinfo);
+ }
+
+ Datum
+ generate_series_step_numeric(PG_FUNCTION_ARGS)
+ {
+ 	generate_series_numeric_fctx *fctx;
+ 	FuncCallContext *funcctx;
+ 	MemoryContext oldcontext;
+
+ 	if (SRF_IS_FIRSTCALL())
+ 	{
+ 		Numeric		start_num = PG_GETARG_NUMERIC(0);
+ 		Numeric		finish_num = PG_GETARG_NUMERIC(1);
+ 		NumericVar	steploc = const_one;
+
+ 		/* Handle NaN in start & finish */
+ 		if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num))
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 					 errmsg("start and finish cannot be NaN")));
+
+ 		/* see if we were given an explicit step size */
+ 		if (PG_NARGS() == 3)
+ 		{
+ 			Numeric		step_num = PG_GETARG_NUMERIC(2);
+
+ 			if (NUMERIC_IS_NAN(step_num))
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 						 errmsg("step size cannot be NaN")));
+
+ 			init_var_from_num(step_num, &steploc);
+
+ 			if (cmp_var(&steploc, &const_zero) == 0)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 						 errmsg("step size cannot equal zero")));
+
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 03/10/14 22:47, Robert Haas wrote: On Fri, Oct 3, 2014 at 12:08 PM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Il 03/10/14 17:53, Heikki Linnakangas ha scritto: If we're going to need a profile file - and I'm not convinced of that - is there any reason to not always include it in the backup? The main reason is to have a centralized list of files that need to be present. Without a profile, you have to insert some sort of placeholder for skipped files. Why do you need to do that? And where do you need to do that? It seems to me that there are three interesting operations: 1. Take a full backup. Basically, we already have this. In the backup label file, make sure to note the newest LSN guaranteed to be present in the backup. Don't we already have it in START WAL LOCATION? 2. Take a differential backup. In the backup label file, note the LSN of the full backup to which the differential backup is relative, and the newest LSN guaranteed to be present in the differential backup. The actual backup can consist of a series of 20-byte buffer tags, those being the exact set of blocks newer than the base-backup's latest-guaranteed-to-be-present LSN. Each buffer tag is followed by an 8kB block of data. If a relfilenode is truncated or removed, you need some way to indicate that in the backup; e.g. include a buffertag with forknum = -(forknum + 1) and blocknum = the new number of blocks, or InvalidBlockNumber if removed entirely. To have a working backup you need to ship each block which is newer than latest-guaranteed-to-be-present in the full backup and not newer than latest-guaranteed-to-be-present in the current backup. Also, as a further optimization, you can think about not sending the empty space in the middle of each page. My main concern here is about how postgres can remember that a relfilenode has been deleted, in order to send the appropriate deletion tag.
IMHO the easiest way is to send the full list of files along with the backup and leave to the client the task of deleting unneeded files. The backup profile has this purpose. Moreover, I do not like the idea of using only a stream of blocks as the actual differential backup, for the following reasons: * AFAIK, with the current infrastructure, you cannot do a backup with a block stream only. To have a valid backup you need many files for which the concept of LSN doesn't apply. * I don't like to have all the data from the various tablespace/db/whatever all mixed in the same stream. I'd prefer to have the blocks saved on a per file basis. 3. Apply a differential backup to a full backup to create an updated full backup. This is just a matter of scanning the full backup and the differential backup and applying the changes in the differential backup to the full backup. You might want combinations of these, like something that does 2+3 as a single operation, for efficiency, or a way to copy a full backup and apply a differential backup to it as you go. But that's it, right? What else do you need? Nothing else. Once we agree on the definition of the involved file and protocol formats, only the actual coding remains. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it signature.asc Description: OpenPGP digital signature
Re: [HACKERS] UPSERT wiki page, and SQL MERGE syntax
On Fri, Oct 3, 2014 at 4:16 PM, Kevin Grittner kgri...@ymail.com wrote: ... the SQL standard does not require that MERGE be atomic in the sense of atomically providing either an INSERT or UPDATE, ... My understanding is that the standard logically requires (without concern for implementation details) that the second specified table (via table name or subquery -- which could be a VALUES statement) is scanned, and for each row it attempts to match a row in the target table. That will either match or not, according to the boolean expression in the ON clause. You can have one WHEN MATCHED THEN UPDATE clause and/or one WHEN NOT MATCHED THEN INSERT clause. If both clauses are present, I believe that it is guaranteed that one or the other (but not both) will fire for each row in the second table. The usual guarantees for each isolation level must not be violated (although an implementation is not required to generate anomalies which could not happen with serial execution). So as usual for a transaction involving multiple tables, serialization anomalies are possible if the transaction isolation level is REPEATABLE READ or less. Is that what you're getting at, or something else? I think the problem is that it's not possible to respect the usual guarantees even at READ COMMITTED level when performing an INSERT OR UPDATE operation (however spelled). You may find that there's a tuple with the same PK which is committed but not visible to the snapshot you took at the beginning of the statement. An INSERT would fail; an UPDATE would see no rows and do nothing. IOW, *neither* operation succeeds according to its classic semantics; to succeed, the INSERT OR UPDATE has to do something altogether novel. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inefficient barriers on solaris with sun cc
On Thu, Oct 2, 2014 at 2:06 PM, Andres Freund and...@2ndquadrant.com wrote: Also, I pretty much designed those definitions to match what Linux does. And it doesn't require that either, though it says that in most cases it will work out that way. My point is that read barriers aren't particularly meaningful without a defined store order from another thread/process. Without any form of pairing you don't have that. The writing side could just have reordered the writes in a way you didn't want them. And the kernel docs do say "A lack of appropriate pairing is almost certainly an error." But since read barriers also pair with lock release operations, that's normally not a big problem. Agreed, but it's possible to have a read-fence where an atomic operation provides the ordering on the other side, or something like that. I'm still unsure what you want to show with that example? Me, too. I think we've drifted off in the weeds. Do we know what we need to know to fix $SUBJECT? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Inefficient barriers on solaris with sun cc
On 2014-10-06 11:38:47 -0400, Robert Haas wrote: On Thu, Oct 2, 2014 at 2:06 PM, Andres Freund and...@2ndquadrant.com wrote: Also, I pretty much designed those definitions to match what Linux does. And it doesn't require that either, though it says that in most cases it will work out that way. My point is that that read barriers aren't particularly meaningful without a defined store order from another thread/process. Without any form of pairing you don't have that. The writing side could just have reordered the writes in a way you didn't want them. And the kernel docs do say A lack of appropriate pairing is almost certainly an error. But since read barriers also pair with lock releases operations, that's normally not a big problem. Agreed, but it's possible to have a read-fence where an atomic operation provides the ordering on the other side, or something like that. Sure, that's one of the possible pairings. Most atomics have barrier semantics... I'm still unsure what you want to show with that example? Me, too. I think we've drifted off in the weeds. Do we know what we need to know to fix $SUBJECT? I think we can pretty much apply Oskari's patch after replacing acquire/release with read/write intrinsics. I'm opening a bug with the gcc folks about clarifying the docs on their intrinsics. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: 1. Take a full backup. Basically, we already have this. In the backup label file, make sure to note the newest LSN guaranteed to be present in the backup. Don't we already have it in START WAL LOCATION? Yeah, probably. I was too lazy to go look for it, but that sounds like the right thing. 2. Take a differential backup. In the backup label file, note the LSN of the fullback to which the differential backup is relative, and the newest LSN guaranteed to be present in the differential backup. The actual backup can consist of a series of 20-byte buffer tags, those being the exact set of blocks newer than the base-backup's latest-guaranteed-to-be-present LSN. Each buffer tag is followed by an 8kB block of data. If a relfilenode is truncated or removed, you need some way to indicate that in the backup; e.g. include a buffertag with forknum = -(forknum + 1) and blocknum = the new number of blocks, or InvalidBlockNumber if removed entirely. To have a working backup you need to ship each block which is newer than latest-guaranteed-to-be-present in full backup and not newer than latest-guaranteed-to-be-present in the current backup. Also, as a further optimization, you can think about not sending the empty space in the middle of each page. Right. Or compressing the data. My main concern here is about how postgres can remember that a relfilenode has been deleted, in order to send the appropriate deletion tag. You also need to handle truncation. IMHO the easiest way is to send the full list of files along the backup and let to the client the task to delete unneeded files. The backup profile has this purpose. Moreover, I do not like the idea of using only a stream of block as the actual differential backup, for the following reasons: * AFAIK, with the current infrastructure, you cannot do a backup with a block stream only. 
To have a valid backup you need many files for which the concept of LSN doesn't apply. * I don't like to have all the data from the various tablespace/db/whatever all mixed in the same stream. I'd prefer to have the blocks saved on a per file basis. OK, that makes sense. But you still only need the file list when sending a differential backup, not when sending a full backup. So maybe a differential backup looks like this: - Ship a table-of-contents file with a list of relation files currently present and the length of each in blocks. - For each file that's been modified since the original backup, ship a file called delta_<original file name> which is of the form <block number><changed block contents> [...]. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 06/10/14 16:51, Robert Haas wrote: On Mon, Oct 6, 2014 at 8:59 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Il 04/10/14 08:35, Michael Paquier ha scritto: On Sat, Oct 4, 2014 at 12:31 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Compared to first version, we switched from a timestamp+checksum based approach to one based on LSN. Cool. This patch adds an option to pg_basebackup and to replication protocol BASE_BACKUP command to generate a backup_profile file. It is almost useless by itself, but it is the foundation on which we will build the file based incremental backup (and hopefully a block based incremental backup after it). Hm. I am not convinced by the backup profile file. What's wrong with having a client send only an LSN position to get a set of files (or partial files filled with blocks) newer than the position given, and have the client do all the rebuild analysis? The main problem I see is the following: how can a client detect a truncated or removed file? When you take a differential backup, the server needs to send some piece of information about every file so that the client can compare that list against what it already has. But a full backup does not need to include similar information. I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup software. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] Add generate_series(numeric, numeric)
On Mon, Oct 6, 2014 at 6:12 PM, Ali Akbar the.ap...@gmail.com wrote: User apaan is me. When i added to the commitfest, the patch is listed there by me (apaan). That's fine I think, it's just for tracking who made the changes in the CommitFest app. What actually matters is what you write in the Author field, which could contain all 3 names separated by commas. the one that tests values just before numeric overflow Actually I don't know if that's too useful. I think you should add a test case that causes an error to be thrown. Also, I noticed that there are a few trailing spaces in the patch that should be removed: +generate_series_numeric(PG_FUNCTION_ARGS) ... + if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num)) ... + if (PG_NARGS() == 3) ... + if (NUMERIC_IS_NAN(step_num)) Regards, Marti -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup softwares. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid the needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. That doesn't seem to be buying you much. Calling stat() on every file in a directory tree is a pretty cheap operation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
Hello, 2014-10-06 17:51 GMT+02:00 Marco Nenciarini marco.nenciar...@2ndquadrant.it : I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup softwares. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid the needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. I really like this approach. I think we should leave users the ability to ship a profile file even in case of full backup (by default disabled). Thanks, Gabriele
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 06/10/14 17:55, Robert Haas wrote: On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup softwares. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid the needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. That doesn't seem to be buying you much. Calling stat() on every file in a directory tree is a pretty cheap operation. In the case of an incremental backup that is not true. You have to read the delta file to know the final size. You can optimize that by putting this information in the first few bytes, but with the compressed tar format you will need to scan the whole archive. Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 10/06/2014 06:33 PM, Marco Nenciarini wrote: Il 03/10/14 22:47, Robert Haas ha scritto: 2. Take a differential backup. In the backup label file, note the LSN of the fullback to which the differential backup is relative, and the newest LSN guaranteed to be present in the differential backup. The actual backup can consist of a series of 20-byte buffer tags, those being the exact set of blocks newer than the base-backup's latest-guaranteed-to-be-present LSN. Each buffer tag is followed by an 8kB block of data. If a relfilenode is truncated or removed, you need some way to indicate that in the backup; e.g. include a buffertag with forknum = -(forknum + 1) and blocknum = the new number of blocks, or InvalidBlockNumber if removed entirely. To have a working backup you need to ship each block which is newer than latest-guaranteed-to-be-present in full backup and not newer than latest-guaranteed-to-be-present in the current backup. Also, as a further optimization, you can think about not sending the empty space in the middle of each page. My main concern here is about how postgres can remember that a relfilenode has been deleted, in order to send the appropriate deletion tag. IMHO the easiest way is to send the full list of files along the backup and let to the client the task to delete unneeded files. The backup profile has this purpose. Right, but the server doesn't need to send a separate backup profile file for that. Rather, anything that the server *didn't* send, should be deleted. I think the missing piece in this puzzle is that even for unmodified blocks, the server should send a note saying the blocks were present, but not modified. So for each file present in the server, the server sends a block stream. For each block, it sends either the full block contents, if it was modified, or a simple indicator that it was not modified. There's a downside to this, though. The client has to read the whole stream, before it knows which files were present. 
So when applying a block stream directly over an old backup, the client cannot delete files until it has applied all the other changes. That needs more disk space. With a separate profile file that's sent *before* the rest of the backup, you could delete the obsolete files first. But that's not a very big deal. I would suggest that you leave out the profile file in the first version, and add it as an optimization later, if needed. Moreover, I do not like the idea of using only a stream of block as the actual differential backup, for the following reasons: * AFAIK, with the current infrastructure, you cannot do a backup with a block stream only. To have a valid backup you need many files for which the concept of LSN doesn't apply. Those should be sent in whole. At least in the first version. The non-relation files are small compared to relation files, so it's not too bad to just include them in full. 3. Apply a differential backup to a full backup to create an updated full backup. This is just a matter of scanning the full backup and the differential backup and applying the changes in the differential backup to the full backup. You might want combinations of these, like something that does 2+3 as a single operation, for efficiency, or a way to copy a full backup and apply a differential backup to it as you go. But that's it, right? What else do you need? Nothing else. Once we agree on definition of involved files and protocols formats, only the actual coding remains. BTW, regarding the protocol, I have an idea. Rather than invent a whole new file format to represent the modified blocks, can we reuse some existing binary diff file format? For example, the VCDIFF format (RFC 3284). For each unmodified block, the server would send a vcdiff COPY instruction, to copy the block from the old backup, and for a modified block, the server would send an ADD instruction, with the new block contents.
The VCDIFF file format is quite flexible, but we would only use a small subset of it. I believe that subset would be just as easy to generate in the backend as a custom file format, but you could then use an external tool (xdelta3, open-vcdiff) to apply the diff manually, in case of emergency. In essence, the server would send a tar stream as usual, but for each relation file, it would send a VCDIFF file with name <relfilenode>.vcdiff instead. - Heikki
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
Il 06/10/14 17:50, Robert Haas ha scritto: On Mon, Oct 6, 2014 at 11:33 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: 2. Take a differential backup. In the backup label file, note the LSN of the fullback to which the differential backup is relative, and the newest LSN guaranteed to be present in the differential backup. The actual backup can consist of a series of 20-byte buffer tags, those being the exact set of blocks newer than the base-backup's latest-guaranteed-to-be-present LSN. Each buffer tag is followed by an 8kB block of data. If a relfilenode is truncated or removed, you need some way to indicate that in the backup; e.g. include a buffertag with forknum = -(forknum + 1) and blocknum = the new number of blocks, or InvalidBlockNumber if removed entirely. To have a working backup you need to ship each block which is newer than latest-guaranteed-to-be-present in full backup and not newer than latest-guaranteed-to-be-present in the current backup. Also, as a further optimization, you can think about not sending the empty space in the middle of each page. Right. Or compressing the data. If we want to introduce compression on server side, I think that compressing the whole tar stream would be more effective. My main concern here is about how postgres can remember that a relfilenode has been deleted, in order to send the appropriate deletion tag. You also need to handle truncation. Yes, of course. The current backup profile contains the file size, and it can be used to truncate the file to the right size. IMHO the easiest way is to send the full list of files along the backup and let to the client the task to delete unneeded files. The backup profile has this purpose. Moreover, I do not like the idea of using only a stream of block as the actual differential backup, for the following reasons: * AFAIK, with the current infrastructure, you cannot do a backup with a block stream only. 
To have a valid backup you need many files for which the concept of LSN doesn't apply. * I don't like to have all the data from the various tablespace/db/whatever all mixed in the same stream. I'd prefer to have the blocks saved on a per file basis. OK, that makes sense. But you still only need the file list when sending a differential backup, not when sending a full backup. So maybe a differential backup looks like this: - Ship a table-of-contents file with a list of relation files currently present and the length of each in blocks. Having the size in bytes allows you to use the same format for non-block files. Am I missing any advantage of having the size in blocks over having the size in bytes? Regards, Marco -- Marco Nenciarini - 2ndQuadrant Italy PostgreSQL Training, Services and Support marco.nenciar...@2ndquadrant.it | www.2ndQuadrant.it
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 10/06/2014 07:06 PM, Marco Nenciarini wrote: Il 06/10/14 17:55, Robert Haas ha scritto: On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup softwares. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid the needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. That doesn't seem to be buying you much. Calling stat() on every file in a directory tree is a pretty cheap operation. In case of incremental backup it is not true. You have to read the delta file to know the final size. You can optimize it putting this information in the first few bytes, but in case of compressed tar format you will need to scan the whole archive. I think you're pretty much screwed with the compressed tar format anyway. The files in the .tar can be in different order in the 'diff' and the base backup, so you need to do random access anyway when you try apply the diff. And random access isn't very easy with uncompressed tar format either. I think it would be acceptable to only support incremental backups with the directory format. In hindsight, our compressed tar format was not a very good choice, because it makes random access impossible. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 6, 2014 at 12:06 PM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: Il 06/10/14 17:55, Robert Haas ha scritto: On Mon, Oct 6, 2014 at 11:51 AM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup softwares. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid the needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. That doesn't seem to be buying you much. Calling stat() on every file in a directory tree is a pretty cheap operation. In case of incremental backup it is not true. You have to read the delta file to know the final size. You can optimize it putting this information in the first few bytes, but in case of compressed tar format you will need to scan the whole archive. Well, sure. But I never objected to sending a profile in a differential backup. I'm just objecting to sending one in a full backup. At least not without a more compelling reason why we need it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 6, 2014 at 12:18 PM, Marco Nenciarini marco.nenciar...@2ndquadrant.it wrote: - Ship a table-of-contents file with a list relation files currently present and the length of each in blocks. Having the size in bytes allow you to use the same format for non-block files. Am I missing any advantage of having the size in blocks over having the size in bytes? Size in bytes would be fine, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On 10/06/2014 07:00 PM, Gabriele Bartolini wrote: Hello, 2014-10-06 17:51 GMT+02:00 Marco Nenciarini marco.nenciar...@2ndquadrant.it : I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup software. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. I really like this approach. I think we should leave users the ability to ship a profile file even in the case of a full backup (disabled by default). I don't see the point of making the profile optional. Why burden the user with that decision? I'm not convinced we need it at all, but if we're going to have a profile file, it should always be included. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_receivexlog always handles -d option argument as connstr
Hi all, pg_receivexlog always treats the argument of the -d option as a connstr-formatted value, so we can specify the host name and port number twice. The other client tools treat the -d argument as a connstr value only if the argument contains an '=' character. The documentation says that pg_receivexlog ignores the database name, and that this option is accepted for consistency with other client applications. But if we specify a database name the way the other client tools accept it, e.g. '-d hoge', we get an error. I think that pg_receivexlog should be changed to be consistent with the other client tools. Thoughts? Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
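The behavior of the other client tools that Sawada describes can be sketched like this (a simplified stand-in for illustration, not the actual libpq code): the -d argument is expanded as a connection string only when it contains '=' or looks like a connection URI, and is otherwise taken as a bare database name:

```c
#include <string.h>
#include <stdbool.h>

/*
 * Sketch of the heuristic libpq-based tools apply to a -d argument:
 * treat it as a full connection string only if it contains a '=' or
 * carries a connection-URI prefix; otherwise it is a plain dbname.
 */
static bool
looks_like_connstr(const char *dbarg)
{
    return strchr(dbarg, '=') != NULL ||
           strncmp(dbarg, "postgresql://", 13) == 0 ||
           strncmp(dbarg, "postgres://", 11) == 0;
}
```

Under this rule, '-d hoge' is a plain database name; pg_receivexlog's unconditional connstr parsing is what makes it inconsistent with the other tools.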
Re: [HACKERS] CREATE IF NOT EXISTS INDEX
On Mon, Oct 6, 2014 at 11:13 AM, Marti Raudsepp ma...@juffo.org wrote: On Mon, Oct 6, 2014 at 4:27 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: create_index_if_not_exists_v7.patch Looks good to me. Marking ready for committer. Thanks. If you have any feedback about my reviews, I would gladly hear it. I'm quite new to this. Was great... PS: You seem to be submitting many patches, but have you reviewed any recently? See: https://wiki.postgresql.org/wiki/Submitting_a_Patch#Mutual_Review_Offset_Obligations Each patch submitter to a CommitFest is expected to review at least one other patch Yes, I know it... I'll dedicate more time to help on reviews too... It's very important and fundamental activity. Thanks for reminder. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Simplify calls of pg_class_aclcheck when multiple modes are used
On Wed, Aug 27, 2014 at 9:02 AM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, In a couple of code paths we do the following to check permissions on an object: if (pg_class_aclcheck(relid, userid, ACL_USAGE) != ACLCHECK_OK && pg_class_aclcheck(relid, userid, ACL_UPDATE) != ACLCHECK_OK) ereport(ERROR, blah); Wouldn't it be better to simplify that with a single call of pg_class_aclcheck, gathering together the modes that need to be checked? In the case above, the call to pg_class_aclcheck would become like this: if (pg_class_aclcheck(relid, userid, ACL_USAGE | ACL_UPDATE) != ACLCHECK_OK) ereport(ERROR, blah); That's not a critical thing, but it would save some cycles. Patch is attached. I did a little review: - applied to master without errors - no compiler warnings - no regressions It's a minor change and, as Michael already said, it would save some cycles. Marked as Ready for Committer. Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Last Commitfest patches waiting review
On Mon, Oct 6, 2014 at 7:57 AM, Robert Haas robertmh...@gmail.com wrote: I haven't had time to look at it yet. Can we move it to the next CommitFest? I spent a lot of time on this one, but I can't keep doing that forever, because, you know, other work. Are you suggesting that it would be useful to have input from another person? Because if you are, then I agree that ideally that would be possible. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] copy.c handling for RLS is insecure
In DoCopy, some RLS-specific code constructs a SelectStmt to handle the case where COPY TO is invoked on an RLS-protected relation. But I think this step is bogus in two ways: /* Build FROM clause */ from = makeRangeVar(NULL, RelationGetRelationName(rel), 1); First, because relations are schema objects, there could be multiple relations with the same name. The RangeVar might end up referring to a different one of those objects than the user originally specified. That seems like it could be surprising at best and a security vulnerability at worst. So at the very least I think this needs to pass the schema name as well as the relation name. I'm not 100% sure it's OK even then, because what about a concurrent rename of the schema? Second, the third argument to makeRangeVar is a parse location, and I believe it is conventional to use -1, rather than 1, when no location can be identified. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UPSERT wiki page, and SQL MERGE syntax
On Mon, Oct 6, 2014 at 8:35 AM, Robert Haas robertmh...@gmail.com wrote: I think the problem is that it's not possible to respect the usual guarantees even at READ COMMITTED level when performing an INSERT OR UPDATE operation (however spelled). That's definitely the main problem. Also, there could be garden variety race conditions. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copy.c handling for RLS is insecure
* Robert Haas (robertmh...@gmail.com) wrote: In DoCopy, some RLS-specific code constructs a SelectStmt to handle the case where COPY TO is invoked on an RLS-protected relation. But I think this step is bogus in two ways: /* Build FROM clause */ from = makeRangeVar(NULL, RelationGetRelationName(rel), 1); First, because relations are schema objects, there could be multiple relations with the same name. The RangeVar might end up referring to a different one of those objects than the user originally specified. Argh. That's certainly no good. It should just be using the RangeVar relation passed in from CopyStmt, no? We don't have to address the case where it's NULL (tho we should perhaps Assert(), just to be sure), as that would only happen in the COPY select_with_parens ... production and this is only for the normal 'COPY relname' case. That seems like it could be surprising at best and a security vulnerability at worst. So at the very least I think this needs to pass the schema name as well as the relation name. I'm not 100% sure it's OK even then, because what about a concurrent rename of the schema? Hmm, that's certainly an interesting point, but I'm trying to work out how this is different from normal COPY..? pg_analyze_and_rewrite() happens for both cases down in BeginCopy(). Second, the third argument to makeRangeVar is a parse location, and I believe it is conventional to use -1, rather than 1, when no location can be identified. Err, you're right, but I think we should just eliminate the entire makeRangeVar() call, and then the location would be correctly set too. Thanks! Stephen
Re: [HACKERS] Last Commitfest patches waiting review
On Mon, Oct 6, 2014 at 11:27 AM, Robert Haas robertmh...@gmail.com wrote: Well, really, I was just suggesting that I can spend more time on the patch, but not immediately. We haven't really talked about the idea of the HyperLogLog-based abort mechanism - the actual cost model - even though I thought we'd have discussed that extensively by now. Maybe I'm reading too much into that, but my guess is that that's because it's a particularly hard thing to have an opinion on. I think it might not be a bad idea to have another opinion on that in particular. We've historically resisted this kind of special-case optimization, and yet the optimization is likely to be very effective in many representative cases. Plus, some aspects of the cost model are a bit fuzzy, in a way that things in the executor ordinarily are not, and so I can understand how this would present any reviewer with difficulty. By the way, the original description of this approach to accelerating sorts I saw was here, in this 2001 paper: http://lgis.informatik.uni-kl.de/archiv/wwwdvs.informatik.uni-kl.de/courses/DBSREAL/SS2001/Vorlesungsunterlagen/Implementing_Sorting.pdf (Under 2.4 Cache-optimized techniques) -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
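For readers unfamiliar with the mechanism being debated: the patch tracks how many distinct abbreviated keys show up and abandons abbreviation when the cardinality is too low to pay for itself. A toy sketch of that abort decision, using an exact hash-set count as a stand-in for the patch's HyperLogLog estimator, with an invented 100-row warm-up and 5% cutoff that do not reflect the actual cost model:

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define SETSZ 4096              /* power of two; must exceed input size */

/* exact distinct counter over 32-bit abbreviated keys (open addressing) */
typedef struct
{
    uint32_t slots[SETSZ];
    bool     used[SETSZ];
    int      count;
} DistinctSet;

static void
set_init(DistinctSet *s)
{
    memset(s, 0, sizeof(*s));
}

static void
set_add(DistinctSet *s, uint32_t v)
{
    uint32_t h = (v * 2654435761u) & (SETSZ - 1);   /* Knuth-style hash */

    while (s->used[h] && s->slots[h] != v)
        h = (h + 1) & (SETSZ - 1);                  /* linear probing */
    if (!s->used[h])
    {
        s->used[h] = true;
        s->slots[h] = v;
        s->count++;
    }
}

/*
 * Abort abbreviation when too few distinct abbreviated keys appear.
 * Both thresholds here are illustrative only; the real patch also uses
 * a probabilistic HyperLogLog estimate rather than an exact set.
 */
static bool
should_abort_abbrev(const DistinctSet *s, int rows_seen)
{
    if (rows_seen < 100)
        return false;           /* too early to judge */
    return (double) s->count / rows_seen < 0.05;
}
```

The "fuzzy" part Peter mentions is exactly the choice of those thresholds: unlike most executor code, the right values depend on workload-wide assumptions rather than local invariants.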
Re: [HACKERS] copy.c handling for RLS is insecure
Stephen Frost sfr...@snowman.net writes: * Robert Haas (robertmh...@gmail.com) wrote: First, because relations are schema objects, there could be multiple relations with the same name. The RangeVar might end up referring to a different one of those objects than the user originally specified. Argh. That's certainly no good. It should just be using the RangeVar relation passed in from CopyStmt, no? No, it shouldn't be doing that either. That would imply looking up the relation a second time, and then you have a race condition against concurrent renames (the same type of security bug Robert spent a great deal of time on, not so long ago). Once you've identified the target relation by OID, nothing else later in the command should be doing a fresh lookup by name. Period. If you've got APIs in here that depend on passing RangeVars to identify relations, those APIs are broken and need to be changed. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copy.c handling for RLS is insecure
On Mon, Oct 6, 2014 at 2:49 PM, Stephen Frost sfr...@snowman.net wrote: * Robert Haas (robertmh...@gmail.com) wrote: In DoCopy, some RLS-specific code constructs a SelectStmt to handle the case where COPY TO is invoked on an RLS-protected relation. But I think this step is bogus in two ways: /* Build FROM clause */ from = makeRangeVar(NULL, RelationGetRelationName(rel), 1); First, because relations are schema objects, there could be multiple relations with the same name. The RangeVar might end up referring to a different one of those objects than the user originally specified. Argh. That's certainly no good. It should just be using the RangeVar relation passed in from CopyStmt, no? I don't think that's adequate. You can't do a RangeVar-to-OID translation, use the resulting OID for some security-relevant decision, and then repeat the RangeVar-to-OID translation and hope you get the same answer. That's what led to CVE-2014-0062, fixed by commit 5f173040e324f6c2eebb90d86cf1b0cdb5890f0a. I can't work out off-hand whether the issue is exploitable in this instance, but it sure seems like a bad idea to rely on it not being so. We don't have to address the case where it's NULL (tho we should perhaps Assert(), just to be sure), as that would only happen in the COPY select_with_parens ... production and this is only for the normal 'COPY relname' case. The antecedent of it in the case where it's NULL is unclear to me. Hmm, that's certainly an interesting point, but I'm trying to work out how this is different from normal COPY..? pg_analyze_and_rewrite() happens for both cases down in BeginCopy(). As far as I can see, the previous code only looked up any given name once. If you got a relation name, DoCopy() looked it up, and then BeginCopy() references it only by the passed-down Relation descriptor; if you got a query, DoCopy() ignores it, and then BeginCopy. All of which is fine, at least AFAICS; if you think otherwise, that should be reported to pgsql-security. 
The problem with your code is that you start with a relation name (and thus look it up in DoCopy()) and then construct a query (which causes the name to be looked up again when the query is passed to pg_analyze_and_rewrite() from BeginCopy()) -- and the lookup might not get the same answer both times. That is, not to put too fine a point on it, bad news. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
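The race Robert describes is the classic time-of-check/time-of-use pattern. A self-contained toy model (a mock two-entry catalog, not PostgreSQL code) showing how two lookups of the same name can yield different relations when a rename slips in between, which is why the name must be resolved to an OID exactly once and only the OID used thereafter:

```c
#include <string.h>

typedef struct
{
    unsigned    oid;
    const char *name;
} MockRel;

/* a tiny mock "catalog": two relations whose names can change over time */
static MockRel catalog[] = {
    { 1001, "accounts" },
    { 2002, "scratch"  },
};

static unsigned
lookup_oid(const char *name)
{
    for (int i = 0; i < 2; i++)
        if (strcmp(catalog[i].name, name) == 0)
            return catalog[i].oid;
    return 0;                   /* not found */
}

/* simulate a concurrent rename: "accounts" now names a different table */
static void
concurrent_rename(void)
{
    catalog[0].name = "accounts_old";
    catalog[1].name = "accounts";
}
```

Any security decision made against the first OID is silently applied to a different relation if the name is translated again later; resolving once closes the window.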
Re: [HACKERS] copy.c handling for RLS is insecure
I left out a few words there. On Mon, Oct 6, 2014 at 3:07 PM, Robert Haas robertmh...@gmail.com wrote: Hmm, that's certainly an interesting point, but I'm trying to work out how this is different from normal COPY..? pg_analyze_and_rewrite() happens for both cases down in BeginCopy(). As far as I can see, the previous code only looked up any given name once. If you got a relation name, DoCopy() looked it up, and then BeginCopy() references it only by the passed-down Relation descriptor; if you got a query, DoCopy() ignores it, and then BeginCopy. ...passes it to pg_analyze_and_rewrite(), which looks up any names it contains. All of which is fine, at least AFAICS; if you think otherwise, that should be reported to pgsql-security. The problem with your code is that you start with a relation name (and thus look it up in DoCopy()) and then construct a query (which causes the name to be looked up again when the query is passed to pg_analyze_and_rewrite() from BeginCopy()) -- and the lookup might not get the same answer both times. That is, not to put too fine a point on it, bad news. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
On Mon, Oct 6, 2014 at 9:58 PM, Arcadiy Ivanov arca...@gmail.com wrote: Hi folks, My corp (CSC) OSS division requires CLAs to be signed for OSS project participation to begin. I need to fix a few problems in PGJDBC driver and am unable to start without them. Neither Google nor PG Wiki contain CLA licenses and I have no idea where else to look. PostgreSQL Global Development Group doesn't have its own website nor mentions any contributors/volunteers on legal side. Can anyone help by letting me know who to contact or where to look? This is covered in the Developer FAQ at http://wiki.postgresql.org/wiki/Developer_FAQ#Do_I_need_to_sign_a_copyright_assignment.3F Bottom line is, there is no CLA to sign. (And the PostgreSQL Global Development Group certainly has its own website - it's www.postgresql.org) -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
Arcadiy Ivanov arca...@gmail.com writes: My corp (CSC) OSS division requires CLAs to be signed for OSS project participation to begin. I need to fix a few problems in PGJDBC driver and am unable to start without them. Neither Google nor PG Wiki contain CLA licenses and I have no idea where else to look. PostgreSQL Global Development Group doesn't have its own website nor mentions any contributors/volunteers on legal side. Can anyone help by letting me know who to contact or where to look? There are no such agreements for Postgres work. The community explicitly rejected the idea more than a dozen years ago. If your lawyers can't cope with the concept of informal communities, I'm sorry, but we're not going to burden ourselves with paperwork in order to make them happy. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
Thank you Magnus. The absence of a legal entity and therefore of CLAs will make for an awesome discussion with legal. :D On 2014-10-06 16:04, Magnus Hagander wrote: On Mon, Oct 6, 2014 at 9:58 PM, Arcadiy Ivanov arca...@gmail.com wrote: Hi folks, My corp (CSC) OSS division requires CLAs to be signed for OSS project participation to begin. I need to fix a few problems in PGJDBC driver and am unable to start without them. Neither Google nor PG Wiki contain CLA licenses and I have no idea where else to look. PostgreSQL Global Development Group doesn't have its own website nor mentions any contributors/volunteers on legal side. Can anyone help by letting me know who to contact or where to look? This is covered in the Developer FAQ at http://wiki.postgresql.org/wiki/Developer_FAQ#Do_I_need_to_sign_a_copyright_assignment.3F Bottom line is, there is no CLA to sign. (And the PostgreSQL Global Development Group certainly has its own website - it's www.postgresql.org) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
Thanks Tom. On 2014-10-06 16:06, Tom Lane wrote: Arcadiy Ivanov arca...@gmail.com writes: My corp (CSC) OSS division requires CLAs to be signed for OSS project participation to begin. I need to fix a few problems in PGJDBC driver and am unable to start without them. Neither Google nor PG Wiki contain CLA licenses and I have no idea where else to look. PostgreSQL Global Development Group doesn't have its own website nor mentions any contributors/volunteers on legal side. Can anyone help by letting me know who to contact or where to look? There are no such agreements for Postgres work. The community explicitly rejected the idea more than a dozen years ago. If your lawyers can't cope with the concept of informal communities, I'm sorry, but we're not going to burden ourselves with paperwork in order to make them happy. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copy.c handling for RLS is insecure
On Mon, Oct 06, 2014 at 03:15:25PM -0400, Stephen Frost wrote: As far as I can see, the previous code only looked up any given name once. If you got a relation name, DoCopy() looked it up, and then BeginCopy() references it only by the passed-down Relation descriptor; if you got a query, DoCopy() ignores it, and then BeginCopy. All of which is fine, at least AFAICS; if you think otherwise, that should be reported to pgsql-security. Yeah, that's correct. I suppose there's some possible risk of things changing between when you parse the query and when it actually gets analyzed and rewritten, but that's not a security risk per-se.. I'm not sure I understand. If that change violates an access control, it's a security risk /per se/, as you put it. Are you saying that such changes, even though they might be bugs, categorically couldn't violate an access control? Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] copy.c handling for RLS is insecure
David, On Monday, October 6, 2014, David Fetter da...@fetter.org wrote: On Mon, Oct 06, 2014 at 03:15:25PM -0400, Stephen Frost wrote: As far as I can see, the previous code only looked up any given name once. If you got a relation name, DoCopy() looked it up, and then BeginCopy() references it only by the passed-down Relation descriptor; if you got a query, DoCopy() ignores it, and then BeginCopy. All of which is fine, at least AFAICS; if you think otherwise, that should be reported to pgsql-security. Yeah, that's correct. I suppose there's some possible risk of things changing between when you parse the query and when it actually gets analyzed and rewritten, but that's not a security risk per-se.. I'm not sure I understand. If that change violates an access control, it's a security risk /per se/, as you put it. The case I was referring to doesn't violate an access control. I was merely pointing out that things can change between when the query is submitted by the user (or even later, during parse analysis) and when we actually resolve names to OIDs. Thanks, Stephen
Re: [HACKERS] [RFC] Incremental backup v2: add backup profile to base backup
On Mon, Oct 06, 2014 at 07:24:32PM +0300, Heikki Linnakangas wrote: On 10/06/2014 07:00 PM, Gabriele Bartolini wrote: Hello, 2014-10-06 17:51 GMT+02:00 Marco Nenciarini marco.nenciar...@2ndquadrant.it : I agree that a full backup does not need to include a profile. I've added the option to require the profile even for a full backup, as it can be useful for backup software. We could remove the option and build the profile only during incremental backups, if required. However, I would avoid needing to scan the whole backup to know the size of the recovered data directory, hence the backup profile. I really like this approach. I think we should leave users the ability to ship a profile file even in the case of a full backup (disabled by default). I don't see the point of making the profile optional. Why burden the user with that decision? I'm not convinced we need it at all, but if we're going to have a profile file, it should always be included. +1 for fewer user decisions, especially with something light-weight in resource consumption like the profile. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] replicating DROP commands across servers
On 10/4/14, 8:12 PM, Robert Haas wrote: It's just not sane to try to parse such text strings. But this is a pretty ridiculous argument. We have an existing parser that does it just fine, and a special-purpose parser that does just that (and not all of the other stuff that the main parser does) would be a great deal simpler. Maybe there are examples other than the ones you listed here that demonstrate that this is actually a hard problem, but the fact that you might need to undo quote_literal() or search for and split on fixed strings does not. FWIW, I've run into situations more than once in userspace where I need a way to properly separate schema and object name. Generally I can make do using reg* casts and then hitting catalog tables, but it'd be nice if there was an easier way. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add generate_series(numeric, numeric)
On Tue, Oct 7, 2014 at 12:51 AM, Marti Raudsepp ma...@juffo.org wrote: On Mon, Oct 6, 2014 at 6:12 PM, Ali Akbar the.ap...@gmail.com wrote: User apaan is me. When i added to the commitfest, the patch is listed there by me (apaan). That's fine I think, it's just for tracking who made the changes in the CommitFest app. What actually matters is what you write in the Author field, which could contain all 3 names separated by commas. It's fine not to add mine to the list of authors, I did a hack review. Feel free to add it to the list of reviewers though. -- Michael
Re: [HACKERS] Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
On Mon, Oct 6, 2014 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Paquier michael.paqu...@gmail.com writes: On Mon, Oct 6, 2014 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote: That looks about like mine too, though I'm not using --disable-rpath ... what's the reason for that? No real reason. That was only some old remnant in a build script that has been here for ages :) Hm. Grasping at straws here ... what's your locale environment? The system locales have nothing really special... $ locale LANG= LC_COLLATE=C LC_CTYPE=UTF-8 LC_MESSAGES=C LC_MONETARY=C LC_NUMERIC=C LC_TIME=C LC_ALL= But now that you mention it I have as well that: $ defaults read -g AppleLocale en_JP -- Michael
Re: [HACKERS] Proposal for better support of time-varying timezone abbreviations
On 10/5/14, 5:42 PM, Tom Lane wrote: Gavin Flower gavinflo...@archidevsys.co.nz writes: The use of an /as_at_date/ is far more problematic. The idea relates to how existing date/times should be treated with respect to the date/time that a pg database is updated with new time zone data files. In the simplest form: there would be a function in pg that would return the date/time a new time zone data file was entered into the system, so that application software can manually correct when the stored GMT date/time was stored incorrectly because the wrong GMT offset was used due to the updated time zone data files not being in place. Alternatively, pg could offer to do the correction in a one-off action at the time the new zone data files were updated. Right now there's basically no way to do something like that, since what we store for timestamptz is just a UTC time instant, with no record of what GMT offset was involved much less exactly how the offset was specified in the input. We'd probably have to (at least) double the on-disk size of timestamptz values to record that ... which seems like a mighty high price to pay to fix a corner case. Not to mention that nobody's going to be willing to break on-disk compatibility of timestamptz for this. FWIW, I agree for timestamptz, but I do wish we had a timestamp datatype that stored the exact timezone in effect when the data was entered. That can really, REALLY save your rear if you screw up either the timezone in postgresql.conf, or the server's timezone. The part that seems hard (at least to me) is the question of how to actually store the timezone, because I don't think storing the text string America/Central is going to cut it. :/ -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
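A purely hypothetical layout for what Jim is asking for (not anything PostgreSQL implements): store the UTC instant as timestamptz does today, plus the GMT offset in effect at entry time, at the cost of roughly doubling the stored size as Tom notes. A bare offset can reconstruct the wall-clock time the user originally wrote, though unlike a zone name it cannot follow later rule changes:

```c
#include <stdint.h>

/*
 * Hypothetical on-disk layout (illustration only): the UTC instant plus
 * the GMT offset that applied when the value was entered.  This is the
 * "(at least) double the on-disk size" trade-off from the thread.
 */
typedef struct
{
    int64_t utc_usec;   /* microseconds since some epoch, in UTC */
    int32_t gmtoff;     /* seconds east of Greenwich at entry time */
} TimestampWithOffset;

/* recover the wall-clock time the user originally wrote, in microseconds */
static int64_t
original_local_usec(TimestampWithOffset ts)
{
    return ts.utc_usec + (int64_t) ts.gmtoff * 1000000;
}
```

This recovers the as-entered wall-clock time after a misconfigured server timezone, but it still cannot distinguish which named zone produced the offset, which is the hard part Jim identifies.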
Re: [HACKERS] Add generate_series(numeric, numeric)
2014-10-06 22:51 GMT+07:00 Marti Raudsepp ma...@juffo.org: That's fine I think, it's just for tracking who made the changes in the CommitFest app. What actually matters is what you write in the Author field, which could contain all 3 names separated by commas. Ok. Added to commitfest: https://commitfest.postgresql.org/action/patch_view?id=1591 the one that tests values just before numeric overflow Actually I don't know if that's too useful. I think you should add a test case that causes an error to be thrown. Actually i added the test case because in the code, when adding step into current for the last value, i expected it to overflow: /* increment current in preparation for next iteration */ add_var(fctx-current, fctx-step, fctx-current); where in the last calculation, current is 9 * 10^131071. Plus 10^131071, it will be 10^131072, which i expected to overflow numeric type (in the doc, numeric's range is up to 131072 digits before the decimal point). In attached patch, i narrowed the test case to produce smaller result. Also, I noticed that there are a few trailing spaces in the patch that should be removed: +generate_series_numeric(PG_FUNCTION_ARGS) ... + if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num)) ... + if (PG_NARGS() == 3) ... + if (NUMERIC_IS_NAN(step_num)) Ah, didn't see it. Thanks. Fixed in this patch. Regards, -- Ali Akbar *** a/doc/src/sgml/func.sgml --- b/doc/src/sgml/func.sgml *** *** 14074,14081 AND tbody row entryliteralfunctiongenerate_series(parameterstart/parameter, parameterstop/parameter)/function/literal/entry ! entrytypeint/type or typebigint/type/entry ! entrytypesetof int/type or typesetof bigint/type (same as argument type)/entry entry Generate a series of values, from parameterstart/parameter to parameterstop/parameter with a step size of one --- 14074,14081 tbody row entryliteralfunctiongenerate_series(parameterstart/parameter, parameterstop/parameter)/function/literal/entry ! 
entrytypeint/type, typebigint/type or typenumeric/type/entry ! entrytypesetof int/type, typesetof bigint/type, or typesetof numeric/type (same as argument type)/entry entry Generate a series of values, from parameterstart/parameter to parameterstop/parameter with a step size of one *** *** 14084,14091 AND row entryliteralfunctiongenerate_series(parameterstart/parameter, parameterstop/parameter, parameterstep/parameter)/function/literal/entry ! entrytypeint/type or typebigint/type/entry ! entrytypesetof int/type or typesetof bigint/type (same as argument type)/entry entry Generate a series of values, from parameterstart/parameter to parameterstop/parameter with a step size of parameterstep/parameter --- 14084,14091 row entryliteralfunctiongenerate_series(parameterstart/parameter, parameterstop/parameter, parameterstep/parameter)/function/literal/entry ! entrytypeint/type, typebigint/type or typenumeric/type/entry ! entrytypesetof int/type, typesetof bigint/type or typesetof numeric/type (same as argument type)/entry entry Generate a series of values, from parameterstart/parameter to parameterstop/parameter with a step size of parameterstep/parameter *** a/src/backend/utils/adt/numeric.c --- b/src/backend/utils/adt/numeric.c *** *** 28,33 --- 28,34 #include access/hash.h #include catalog/pg_type.h + #include funcapi.h #include libpq/pqformat.h #include miscadmin.h #include nodes/nodeFuncs.h *** *** 261,266 typedef struct NumericVar --- 262,279 /* -- + * Data for generate series + * -- + */ + typedef struct + { + NumericVar current; + NumericVar finish; + NumericVar step; + } generate_series_numeric_fctx; + + + /* -- * Some preinitialized constants * -- */ *** *** 1221,1226 numeric_floor(PG_FUNCTION_ARGS) --- 1234,1346 PG_RETURN_NUMERIC(res); } + + /* + * generate_series_numeric() - + * + * Generate series of numeric. 
+ */ + Datum + generate_series_numeric(PG_FUNCTION_ARGS) + { + return generate_series_step_numeric(fcinfo); + } + + Datum + generate_series_step_numeric(PG_FUNCTION_ARGS) + { + generate_series_numeric_fctx *fctx; + FuncCallContext *funcctx; + MemoryContext oldcontext; + + if (SRF_IS_FIRSTCALL()) + { + Numeric start_num = PG_GETARG_NUMERIC(0); + Numeric finish_num = PG_GETARG_NUMERIC(1); + NumericVar steploc = const_one; + + /* Handle NaN in start finish */ + if (NUMERIC_IS_NAN(start_num) || NUMERIC_IS_NAN(finish_num)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg(start and finish cannot be NaN))); + + /* see if we were
Re: [HACKERS] Add regression tests for autocommit-off mode for psql and fix some omissions
On 10/6/14, 9:59 AM, Feike Steenbergen wrote: It would test that when setting AUTOCOMMIT to off you will not run into: ERROR: [...] cannot run inside a transaction block when issuing one of these PreventTransactionChain commands. In src/bin/psql/common.c

Yes, but what happens when a new non-transactional command is added? If we forget to exclude it in psql, we'll certainly also forget to add it to the unit test. The options I see:

1) If there's a definitive way to tell from the backend source code which commands disallow transactions, we can use that information to generate the list of commands psql shouldn't wrap.
2) Always run the regression test with autocommit turned off.
3) Run the regression test in both modes (presumably only on the buildfarm, due to how long it would take).

-- Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Commitfest: patches Ready for Committer
On Mon, Oct 6, 2014 at 10:53 PM, Tom Lane t...@sss.pgh.pa.us wrote: The levenshtein-distance thingy seems to still be a topic of debate as well, both as to how we're going to refactor the code and as to what the exact hinting rules ought to be. If some committer wants to take charge of it and resolve those issues, fine; but I don't want to see it done by just blindly committing whatever the last submitted version was.

My 2c. I imagine that in this case the consensus is going to be, first:
- Move a minimum of the core functions out of fuzzystrmatch and rename them (str_distance?)
- Do not drop fuzzystrmatch; have its levenshtein functions call those core functions

In any case the levenshtein code needs careful refactoring and some additional thought before the hint code is touched.
Regards.
-- Michael
Re: [HACKERS] Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
On Tue, Oct 7, 2014 at 8:14 AM, Michael Paquier michael.paqu...@gmail.com wrote: The system locales have nothing really special... $ locale LANG= LC_COLLATE=C LC_CTYPE=UTF-8 LC_MESSAGES=C LC_MONETARY=C LC_NUMERIC=C LC_TIME=C LC_ALL= But now that you mention it I have as well that: $ defaults read -g AppleLocale en_JP Hm... I have tried changing the system locales (to en_US for example) and time format but I can still trigger the issue all the time. I'll try to have a closer look.. It looks like this test does not like some settings at the OS level. -- Michael
Re: [HACKERS] TAP test breakage on MacOS X
On Thu, 2014-10-02 at 21:18 -0400, Robert Haas wrote: If none of this gets us closer to an answer, I can try to produce a patch that produces more details for such failures. A test that fails for no reason that can be gleaned from the output is not an improvement over not having a test at all. I understand that this isn't great, and it's certainly something I'm looking into. But it's like pg_regress saying that psql crashed and leaving you to find out why. I don't think saying that the entire regression test suite is useless because of that is fair. The TAP tests are arguably already much easier to debug than pg_regress ever was. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Promise index tuples for UPSERT
On 6 October 2014 15:04, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10/06/2014 04:44 PM, Simon Riggs wrote: On 6 October 2014 13:21, Heikki Linnakangas hlinnakan...@vmware.com wrote:

My understanding of what you're saying is that if
* we have a table with more than 1 unique index
* and we update the values of the uniquely indexed columns (e.g. PK update)
* on both of the uniquely indexed column sets
then we get occasional deadlocks, just as we would using the current UPDATE/INSERT.

Right. To be precise: you don't need to update both of the columns in the same transaction; it's enough that some of the concurrent transactions update one column, while other transactions update the other column.

CREATE TABLE foo
(id1 integer not null primary key
,id2 integer not null unique
,val integer);

Given the table above, which one do we mean?
1. When we mix UPDATE foo SET id2 = X WHERE id1 = Y; and UPDATE foo SET id1 = Y WHERE id2 = X; we can deadlock
2. When we mix UPDATE foo SET val = Z WHERE id1 = Y; and UPDATE foo SET val = W WHERE id2 = X; we can deadlock
(2) is a common use case; (1) is a very rare use case and most likely a poor design.

Well, at least one of the statements has to be an UPSERT, and at least one of them has to update a column with a unique constraint on it. This pair of transactions could deadlock, for example:

Transaction 1: INSERT INTO foo VALUES (Y, X, Z) ON CONFLICT IGNORE;
Transaction 2: UPDATE foo SET id2 = X WHERE id1 = Y;

That's made-up syntax, but the idea is that the first transaction attempts to insert a row with values id1=Y, id2=X, val=Z. If that fails because a row with id1=Y or id2=X already exists, then it's supposed to do nothing.
Let's look at a real-world example:

CREATE TABLE citizen
(ssn integer not null primary key
,email text not null unique
,tax_amount decimal);

Transaction 1: INSERT INTO citizen VALUES (555123456, 'si...@2ndquadrant.com', 1000.00) ON CONFLICT IGNORE;
Transaction 2: UPDATE foo SET email = 'si...@2ndquadrant.com', tax_amount = 1000.00 WHERE ssn = 555123456;

OK, now I understand how a deadlock is possible. Thanks for your help. Again I note that there is no isolation test that refers to this situation, nor any documentation, internal or user-facing, that describes the situation or its workaround.

My feeling is that this is an unlikely situation: two actors concurrently updating the same data in different ways from two different angles. How likely is it that we would issue those two transactions concurrently AND be concerned because this caused an error? If the tax_amount was the same, it wouldn't matter that one failed. If the tax_amount differed, we would want to know about the error, not accept it in silence. Are any of those things substantially worse than the current situation using INSERT/UPDATE loops?

It might be nice if the above never deadlocked. What is the price of ensuring that, in terms of code maintainability and performance? What would this do to COPY performance? If the user wishes to protect against such deadlocks they retain the option to use row locking. Yes?

Sorry, I didn't understand that. Row locking? I think that thought doesn't apply here.

In general, this is of course a lot easier to implement if we restrict it so that it only works in some limited cases. That may be fine, but then we have to be able to document clearly what the limitations are, and throw an error if you violate those limitations.

Seems reasonable. My point here is to establish that...
a) there are multiple ways to implement the UPSERT feature, and none should be thrown away too quickly
b) the current patch does not implement something we all agree on yet
c) not all requirements have been properly documented, understood or agreed by hackers

If we want to move forward we need to agree things based upon clarity and real-world usage. It may be that people, on reading this, now believe Peter's HW locking approach is the best. I'm happy to go with consensus. My feeling is that substantially more work is required on explaining the details around multiple unique index constraints, trigger behaviour and various other corner cases.
-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
Michael Paquier michael.paqu...@gmail.com writes: Hm... I have tried changing the system locales (to en_US for example) and time format but I can still trigger the issue all the time. I'll try to have a closer look... It looks like this test does not like some settings at the OS level.

I eventually realized that the critical difference was that you'd added CFLAGS= to the configure call. On this platform that has the net effect of removing -O2 from the compiler flags, and apparently that shifts around the stack layout enough to expose the clobber. The fix is simple enough: ecpg's version of ParseDateTime is failing to check for overrun of the field[] array until *after* it's already clobbered the stack:

*** a/src/interfaces/ecpg/pgtypeslib/dt_common.c
--- b/src/interfaces/ecpg/pgtypeslib/dt_common.c
*** ParseDateTime(char *timestr, char *lowst
*** 1695,1703 ****
  	while (*(*endstr) != '\0')
  	{
  		/* Record start of current field */
- 		field[nf] = lp;
  		if (nf >= MAXDATEFIELDS)
  			return -1;
  		/* leading digit? then date or time */
  		if (isdigit((unsigned char) *(*endstr)))
--- 1695,1703 ----
  	while (*(*endstr) != '\0')
  	{
  		/* Record start of current field */
  		if (nf >= MAXDATEFIELDS)
  			return -1;
+ 		field[nf] = lp;
  		/* leading digit? then date or time */
  		if (isdigit((unsigned char) *(*endstr)))

Kind of astonishing that nobody else has reported this, given that there's been a regression test specifically meant to catch such a problem since 4318dae. The stack layout in PGTYPESdate_from_asc must happen to avoid the issue on practically all platforms.

regards, tom lane
Re: [HACKERS] Failure with make check-world for pgtypeslib/dt_test2 with HEAD on OSX
On Tue, Oct 7, 2014 at 9:57 AM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Paquier michael.paqu...@gmail.com writes: Hm... I have tried changing the system locales (to en_US for example) and time format but I can still trigger the issue all the time. I'll try to have a closer look.. It looks like this test does not like some settings at the OS level. I eventually realized that the critical difference was you'd added CFLAGS= to the configure call. On this platform that has the net effect of removing -O2 from the compiler flags, and apparently that shifts around the stack layout enough to expose the clobber. At least my scripts are weird enough to trigger such behaviors. The funny part is that it's really a coincidence, CFLAGS was being set with an empty variable, variable removed in this script some time ago. The fix is simple enough: ecpg's version of ParseDateTime is failing to check for overrun of the field[] array until *after* it's already clobbered the stack: Kind of astonishing that nobody else has reported this, given that there's been a regression test specifically meant to catch such a problem since 4318dae. The stack layout in PGTYPESdate_from_asc must happen to avoid the issue on practically all platforms. Yes, thanks. That's it. At least I am not going crazy. Regards, -- Michael
Re: [HACKERS] Last Commitfest patches waiting review
2014-10-03 23:14 GMT+07:00 Heikki Linnakangas hlinnakan...@vmware.com: Fix xpath() to return namespace definitions (fixes the issue with nested or repeated xpath()) Peter, you've done some XML stuff before; could you have a look at this too?

I am the author of the patch. I've sent Peter an off-list review request email, as you had suggested before, but he didn't respond. How can I help to ease the review? The latest patch is a re-implementation, following Tom Lane's findings about xmlNodeCopy's behavior when there's not enough memory. It turns out that the reimplementation is not as simple as before (because some of xmlNodeCopy's code must be reimplemented here). Reviewing the patch myself, I've found some code-formatting problems. Will fix and post in the patch's thread.
Regards,
-- Ali Akbar
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
While reviewing the patch myself, i spotted some formatting problems in the code. Fixed in this v5 patch. Also, this patch uses context patch format (in first versions, because of the small modification, context patch format obfucates the changes. After reimplementation this isn't the case anymore) Thanks, Ali Akbar *** a/src/backend/utils/adt/xml.c --- b/src/backend/utils/adt/xml.c *** *** 141,149 static bool print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone); static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, int encoding); ! static text *xml_xmlnodetoxmltype(xmlNodePtr cur); static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, ! ArrayBuildState **astate); #endif /* USE_LIBXML */ static StringInfo query_to_xml_internal(const char *query, char *tablename, --- 141,151 pg_enc encoding, int standalone); static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, int encoding); ! static text *xml_xmlnodetoxmltype(xmlNodePtr cur, ! PgXmlErrorContext *xmlerrcxt); static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, ! ArrayBuildState **astate, ! PgXmlErrorContext *xmlerrcxt); #endif /* USE_LIBXML */ static StringInfo query_to_xml_internal(const char *query, char *tablename, *** *** 3594,3620 SPI_sql_row_to_xmlelement(int rownum, StringInfo result, char *tablename, #ifdef USE_LIBXML /* * Convert XML node to text (dump subtree in case of element, * return value otherwise) */ static text * ! xml_xmlnodetoxmltype(xmlNodePtr cur) { xmltype*result; if (cur-type == XML_ELEMENT_NODE) { xmlBufferPtr buf; buf = xmlBufferCreate(); PG_TRY(); { xmlNodeDump(buf, NULL, cur, 0, 1); result = xmlBuffer_to_xmltype(buf); } PG_CATCH(); { xmlBufferFree(buf); PG_RE_THROW(); } --- 3596,3706 #ifdef USE_LIBXML + /* check ns definition of node and its childrens. 
If any one of ns is + * not defined in node and it's children, but in the node's parent, + * copy the definition to node. + */ + static void + xml_checkandcopyns(xmlNodePtr root, + PgXmlErrorContext *xmlerrcxt, + xmlNodePtr node, + xmlNsPtr lastns_before) + { + xmlNsPtr ns = NULL; + xmlNsPtr cur_ns; + xmlNodePtr cur; + + if (node-ns != NULL) + { + /* check in new nses */ + cur_ns = lastns_before == NULL ? node-nsDef : lastns_before-next; + while (cur_ns != NULL) + { + if (cur_ns-href != NULL) + { + if (((cur_ns-prefix == NULL) (node-ns-prefix == NULL)) || + ((cur_ns-prefix != NULL) (node-ns-prefix != NULL) + xmlStrEqual(cur_ns-prefix, node-ns-prefix))) + { + ns = cur_ns; + break; + } + } + cur_ns = cur_ns-next; + } + if (ns == NULL) /* not in new nses */ + { + ns = xmlSearchNs(NULL, node-parent, node-ns-prefix); + + if (ns != NULL) + { + ns = xmlNewNs(root, ns-href, ns-prefix); + + if (ns == NULL xmlerrcxt-err_occurred) + xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY, + could not allocate xmlNs); + } + } + } + /* check and copy ns for children recursively */ + cur = node-children; + while (cur != NULL) + { + xml_checkandcopyns(root, xmlerrcxt, cur, lastns_before); + cur = cur-next; + } + } + /* * Convert XML node to text (dump subtree in case of element, * return value otherwise) */ static text * ! xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt) { xmltype*result; if (cur-type == XML_ELEMENT_NODE) { xmlBufferPtr buf; + xmlNsPtr lastns_before; + xmlNsPtr ns; + xmlNsPtr next; buf = xmlBufferCreate(); + PG_TRY(); { + lastns_before = cur-nsDef; + if (lastns_before != NULL) + { + while (lastns_before-next != NULL) + lastns_before = lastns_before-next; + } + xml_checkandcopyns(cur, xmlerrcxt, cur, lastns_before); + xmlNodeDump(buf, NULL, cur, 0, 1); result = xmlBuffer_to_xmltype(buf); + + /* delete and free new nses */ + ns = lastns_before == NULL ? 
cur-nsDef : lastns_before-next; + while (ns != NULL) + { + next = ns-next; + xmlFree(ns); + ns = next; + } + + if (lastns_before == NULL) + cur-nsDef = NULL; + else + lastns_before-next = NULL; } PG_CATCH(); { + /* new namespaces will be freed while free-ing the node, so we + * won't free it here + */ xmlBufferFree(buf); PG_RE_THROW(); } *** *** 3660,3666 xml_xmlnodetoxmltype(xmlNodePtr cur) */ static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, ! ArrayBuildState **astate) { int result = 0; Datum datum; --- 3746,3753 */ static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj, !
Re: [HACKERS] Promise index tuples for UPSERT
On Mon, Oct 6, 2014 at 5:33 PM, Simon Riggs si...@2ndquadrant.com wrote: Let's look at a real-world example:

CREATE TABLE citizen
(ssn integer not null primary key
,email text not null unique
,tax_amount decimal);

Transaction 1: INSERT INTO citizen VALUES (555123456, 'si...@2ndquadrant.com', 1000.00) ON CONFLICT IGNORE;
Transaction 2: UPDATE foo SET email = 'si...@2ndquadrant.com', tax_amount = 1000.00 WHERE ssn = 555123456;

OK, now I understand how a deadlock is possible. Thanks for your help. Again I note that there is no isolation test that refers to this situation, nor any documentation, internal or user-facing, that describes the situation or its workaround.

This seems like a concern specific to other approaches to value locking. But fair enough.

My feeling is that this is an unlikely situation: two actors concurrently updating the same data in different ways from two different angles.

Hard to say for sure.

How likely is it that we would issue those two transactions concurrently AND be concerned because this caused an error? If the tax_amount was the same, it wouldn't matter that one failed. If the tax_amount differed, we would want to know about the error, not accept it in silence. Are any of those things substantially worse than the current situation using INSERT/UPDATE loops?

Yes, because the new feature is supposed to make it so that you yourself don't have to put your UPSERT statement in a loop with subxacts. Taking away the burden of having to think about this stuff is something I'm striving for here.

It might be nice if the above never deadlocked. What is the price of ensuring that in terms of code maintainability and performance?

I am going to reserve judgement on that, at least for a little while. It seems like the person proposing an alternative ought to have a better sense of what the price of avoiding this is. I'd understand what you were getting at more here if it immediately made our lives easier in some obvious way.
I don't see that it does, though I admit that I may simply not understand where you're coming from. So sure, let's not be prejudiced about what's important, but at the same time I don't see that either Heikki or I have actually been inflexible to a degree that hurts things WRT not giving up on important high-level-ish goals. I am not completely inflexible on never error. I am very close to totally inflexible, though. I think I could live with an error that literally no one would ever see. For example, we could error if there was an excessive number of retries, which I find acceptable because it will never happen in the real world. I tend to think that what you're talking about is pretty far from that, though. My point here is to establish that... a) there are multiple ways to implement the UPSERT feature and none should be thrown away too quickly b) the current patch does not implement something we all agree on yet c) not all requirements have been properly documented, understood or agreed by hackers If we want to move forwards we need to agree things based upon clarity and real world usage. I certainly agree with that. It may be that people on reading this now believe Peter's HW locking approach is the best. I'm happy to go with consensus. I bet you didn't think that you'd say that a week ago. :-) I hope I don't sound smug when I say that. I just mean, as you say, that we all need to keep an open mind on this. A healthy respect for the problem is recommended. I think it's still possible that there are problems with design #1, even on its own terms. My feeling is that substantially more work is required on explaining the details around multiple unique index constraints, trigger behaviour and various other corner cases. Probably. Ideally, we should do that in a way driven by real-world prototypes. In that spirit, I attach a new version of my patch, but now implemented using approach #2 to value locking. 
I haven't spent all that much time testing this (at least recently, in this form), but it does pass all existing tests, including my stress-tests when run for half an hour. A lot of those corner cases you mention are big concerns. It's much easier to identify these issues by breaking real implementations. So surprisingly, to a certain extent (with something like this) it makes sense to have requirements driven by actual implementations. If we cannot do this iteratively, we are likely to fail. That's just how it is, I think. -- Peter Geoghegan 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch.gz Description: GNU Zip compressed data 0006-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch.gz Description: GNU Zip compressed data 0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch.gz Description: GNU Zip compressed data 0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch.gz Description: GNU Zip compressed data
Re: [HACKERS] pg_receivexlog always handles -d option argument as connstr
On Mon, Oct 6, 2014 at 10:23 PM, Sawada Masahiko sawada.m...@gmail.com wrote: Hi all, pg_receivexlog always handles the argument of the -d option as a connstr-formatted value, so host name and port number can end up being specified twice. The other client tools handle the -d option as a connstr value only if the argument contains an '=' character. pg_basebackup also seems to behave the same as pg_receivexlog.

psql also treats it in a similar way. The behaviour of psql is as below:

psql.exe -d=host=localhost port=5432 dbname=postgres
psql: invalid connection option

psql.exe -d host=localhost port=5432 dbname=postgres
psql (9.5devel)
WARNING: Console code page (437) differs from Windows code page (1252) 8-bit characters might not work correctly. See psql reference page Notes for Windows users for details.
Type help for help.
postgres=#

The documentation says that pg_receivexlog ignores the database name, and that this option is accepted for consistency with other client applications. But if we specify a database name the way other client tools allow ('-d hoge'), we definitely get an error. What I understand from the documentation is that it ignores the database name when given in a connection string.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] TAP test breakage on MacOS X
On Mon, Oct 6, 2014 at 8:15 PM, Peter Eisentraut pete...@gmx.net wrote: On Thu, 2014-10-02 at 21:18 -0400, Robert Haas wrote: If none of this gets us closer to an answer, I can try to produce a patch that produces more details for such failures. A test that fails for no reason that can be gleaned from the output is not an improvement over not having a test at all. I understand that this isn't great, and it's certainly something I'm looking into. But it's like pg_regress saying that psql crashed and leaving you to find out why. I don't think saying that the entire regression test suite is useless because of that is fair. The TAP tests are arguably already much easier to debug than pg_regress ever was. Well, maybe. I wasn't able, after about 5 minutes of searching, to locate either a log file with details of the failure or the code that revealed what the test, the expected result, and the actual result were. It's possible that all that information is there and I just don't know where to look; it took me a while to learn where the various logs (postmaster.log, initdb.log, results) left behind by pg_regress were, too. If that information is not there, then I'd say it's not easier to debug. If it is and I don't know where to look ... well then I just need to get educated. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
On 10/07/2014 04:10 AM, Arcadiy Ivanov wrote: Thank you Magnus. The absence of legal entity and therefore of CLAs will make for an awesome discussion with legal. :D They want a piece of paper to sign. So print out the PostgreSQL license. Sign it. Hand it to them. That might satisfy them. They can even scan it and send it to someone if they want, though I expect it'd go straight in the trash mailbox or get a perfunctory reply along the lines of I don't know what you expect me to do with this. Candidates to receive such a message might be: http://www.postgresql.eu/about/contact/ https://www.postgresql.us/contact though I don't speak for either of them. If they want some written acknowledgement from PostgreSQL that you're allowed to contribute that'll be harder because there isn't really anyone to supply such an acknowledgement. One of the two associations above might be able to help, or you could contact someone from: http://www.postgresql.org/support/professional_support/ to assist your legal team. A final option would be to contact the SFLC: https://www.softwarefreedom.org/ who are *not* associated with PostgreSQL, but might be able to provide useful general advice, especially if your company is willing to pay for the time they spend doing so. Note that there is never any guarantee that your contributions will be accepted by PgJDBC, the core project, or anybody else. That will be judged on perceived merit and on how effectively you explain the utility of the changes you want to make. Some companies find it easier to work via partners experienced with the development model; again, you can find candidates at http://www.postgresql.org/support/professional_support/ . -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Corporate and Individual Contributor License Agreements (CLAs)
On 10/07/2014 03:58 AM, Arcadiy Ivanov wrote: Hi folks, My corp (CSC) OSS division requires CLAs to be signed for OSS project participation to begin. I need to fix a few problems in PGJDBC driver and am unable to start without them. Would you mind enumerating those problems? Ideally, it'd be best to do so as github issues on https://github.com/pgjdbc/pgjdbc/ accompanied by details and where possible test cases. A post to the PgJDBC mailing list would be OK too. PgJDBC is a separate, though associated, project, and pgsql-hackers isn't really the right place. Before starting work on fixing issues it's important to discuss them and make sure there's a reasonable level of agreement that those issues are in fact problems, and that the approach you plan on taking to fix them will be acceptable. Otherwise you might do a lot of work only to have it rejected, or to find that you've fixed a problem that really stemmed from a misunderstanding of how something works. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 26 September 2014 01:24, Jeff Janes wrote: I think you have an off-by-one error in the index into the array of file handles. Actually the problem is that the socket for the master connection was not getting initialized; see my one-line addition here.

connSlot = (ParallelSlot *) pg_malloc(concurrentCons * sizeof(ParallelSlot));
connSlot[0].connection = conn;
+ connSlot[0].sock = PQsocket(conn);

Thanks for the review, I have fixed this.

However, I don't think it is good to just ignore errors from the select call (like the EBADF) and go into a busy loop instead, so there are more changes needed than this.

Actually, I implemented this select_loop function the same way other client applications handle it, i.e. pg_dump in parallel.c. However, parallel.c handles the case where the process is aborting (if Ctrl+C is received), and we need to handle the same, so I have fixed this in the attached patch.

Also, cancelling the run (by hitting Ctrl-C in the shell that invoked it) does not seem to work on Linux. I get a message that says Cancel request sent, but then it continues to finish the job anyway.

Apart from the above-mentioned reason, GetQueryResult was also not setting SetCancelConn, as Amit has pointed out; now this is also fixed.

Regards, Dilip Kumar

vacuumdb_parallel_v15.patch Description: vacuumdb_parallel_v15.patch
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 26 September 2014 12:24, Amit Kapila wrote: I don't think this can handle cancel requests properly because you are just setting it in GetIdleSlot(). What if the cancel request came during GetQueryResult(), after sending sql for all connections (probably that's the reason why Jeff is not able to cancel vacuumdb when using the parallel option)?

You are right, I have fixed it in the latest patch; please check the latest patch @ (4205e661176a124faf891e0a6ba9135266363...@szxeml509-mbs.china.huawei.com http://www.postgresql.org/message-id/4205e661176a124faf891e0a6ba9135266363...@szxeml509-mbs.china.huawei.com)

dilip@linux-ltr9:/home/dilip/9.4/install/bin ./vacuumdb -z -a -j 8 -p 9005
vacuumdb: vacuuming database db1
vacuumdb: vacuuming database postgres
Cancel request sent
vacuumdb: vacuuming of database postgres failed: ERROR: canceling statement due to user request

Few other points:

1.
+ vacuum_parallel(const char *dbname, bool full, bool verbose, { ..
+ connSlot = (ParallelSlot *) pg_malloc(concurrentCons * sizeof(ParallelSlot));
+ connSlot[0].connection = conn;

Fixed.
a. Does the above memory get freed anywhere? If not, isn't it a good idea to do the same?
b. For slot 0, you are not setting it with PQsetnonblocking, whereas I think it can be used to run commands like any other connection.
Yes, this was missing in the code; I have fixed it.

2.
+ /*
+ * If user has given the vacuum of complete db, then if
+ * any of the object vacuum failed it can be ignored and vacuuming
+ * of other object can be continued, this is the same behavior as
+ * vacuuming of complete db is handled without --jobs option
+ */
s/object/object's
FIXED

3.
+ if (!completedb ||
+ (sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) != 0))
+ {
+ fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
+ progname, dbname, PQerrorMessage(conn));
Indentation in both places is wrong. Check other places for similar issues.
FIXED

4.
+ bool analyze_only, bool freeze, int numAsyncCons,
In the code there is still a reference to AsyncCons; as decided, let's change it to concurrent_connections | conc_cons.
FIXED

Regards, Dilip
+bool analyze_only, bool freeze, int numAsyncCons, In code still there is reference to AsyncCons, as decided lets change it to concurrent_connections | conc_cons FIXED Regards, Dilip