Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Thu, Oct 4, 2012 at 11:51 AM, Tom Lane wrote:
> Greg Stark writes:
>> I'm a bit puzzled why we're so afraid of swapping the relfilenodes
>> when that's what the current REINDEX does.
>
> Swapping the relfilenodes is fine *as long as you have exclusive lock*.
> The trick is to make it safe without that. It will definitely not work
> to do that without exclusive lock, because at the instant you would try
> it, people will be accessing the new index (by OID).

OK, so the index swap could be done by:

1) Switching the index names. This is not thought to be safe, as the system pays no attention to index names at all.

2) Switching the relfilenodes. An ExclusiveLock is necessary. That lock is not compatible with concurrent operations, though it would be held only briefly, at the instant of the swap. Plain REINDEX uses this mechanism, so it would be consistent with existing behavior.

3) Switching the OIDs of the indexes. This looks safe from the system's perspective; it will be necessary to invalidate the cache entries for both relations after the swap.

Any opinions on this one?
--
Michael Paquier
http://michael.otacoo.com
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
On 2012-10-04 05:24, Peter Eisentraut wrote:
> On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:
>> The second generation of this work is now attached and contains a new
>> feature as was discussed and suggested by Magnus Hagander, Fujii Masao
>> and Peter Eisentraut. So libpq has grown a new function:
>>
>> +/* return the connection options used by a live connection */
>> +extern PQconninfoOption *PQconninfo(PGconn *conn);
>>
>> This copies all the connection parameters back from the live PGconn
>> itself, so everything that's needed to connect is already validated.
>
> I don't like that this code maintains a second list of all possible
> libpq connection parameters.

Where does it do that? In PQconninfo() itself? Why is it a problem? Or to put it bluntly: the same problem exists in fillPGconn() too, as it also has the same set of parameters listed. So there is already code that you don't like. :-)

How about a static mapping between option names and offsetof(struct pg_conn, member) values? That way both fillPGconn() and PQconninfo() could avoid maintaining a list of parameter names.

> The parameters to add to the connection string should be driven off
> the authoritative list in PQconninfoOptions.

So, should I add a second flag to PQconninfoOption to indicate that certain options should not be used for primary_conninfo?

Thanks and best regards,
Zoltán Böszörményi
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de http://www.postgresql.at/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] xmalloc => pg_malloc
Peter Eisentraut writes:
> xmalloc, xstrdup, etc. are pretty common names for functions that do
> alloc-or-die (another possible naming scheme ;-) ). The naming
> pg_malloc etc. on the other hand suggests that the allocation is being
> done in a PostgreSQL-specific way, and anyway sounds too close to
> palloc.
> So I'd be more in favor of xmalloc <= pg_malloc.

Meh. The fact that other people use that name is not really an advantage from where I sit. I'm concerned about possible name collisions, eg in libraries loaded into the backend. There are probably not any actual risks of collision right now, given that all these functions are currently in our client-side programs --- but it's foreseeable that we might use this same naming convention in more-exposed places in future. In fact, somebody was already proposing creating such functions in the core backend.

But having said that, I'm not absolutely wedded to these names; they were just the majority of existing cases.

regards, tom lane
Re: [HACKERS] xmalloc => pg_malloc
On Tue, 2012-10-02 at 12:02 -0400, Tom Lane wrote:
> While looking around to fix the pg_malloc(0) issue, I noticed that
> various other pieces of code such as pg_basebackup have essentially
> identical functions, except they're called xmalloc(). I propose to
> standardize all these things on this set of names:
>
> pg_malloc
> pg_malloc0 (for malloc-and-zero behavior)
> pg_calloc (randomly different API for pg_malloc0)
> pg_realloc
> pg_free
> pg_strdup
>
> Any objections?

xmalloc, xstrdup, etc. are pretty common names for functions that do alloc-or-die (another possible naming scheme ;-) ). The naming pg_malloc etc. on the other hand suggests that the allocation is being done in a PostgreSQL-specific way, and anyway sounds too close to palloc.

So I'd be more in favor of xmalloc <= pg_malloc.
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
On Wed, 2012-10-03 at 18:15 +0200, Boszormenyi Zoltan wrote:
> The second generation of this work is now attached and contains a new
> feature as was discussed and suggested by Magnus Hagander, Fujii Masao
> and Peter Eisentraut. So libpq has grown a new function:
>
> +/* return the connection options used by a live connection */
> +extern PQconninfoOption *PQconninfo(PGconn *conn);
>
> This copies all the connection parameters back from the live PGconn
> itself, so everything that's needed to connect is already validated.

I don't like that this code maintains a second list of all possible libpq connection parameters. The parameters to add to the connection string should be driven off the authoritative list in PQconninfoOptions.
Re: [HACKERS] Question on "box @> point" using GiST index on boxes
Ralf Rantzau writes:
> I would like to test the containment of a point against many boxes.
> I did not find a way to express "box @> point" in a straightforward way
> such that the GiST index on the boxes is exploited.

Yeah, that operator is not in any GiST opclass, as you can easily verify with a look into pg_amop. It seems like it probably wouldn't be terribly hard to add it (and related box vs point operators) to GiST box_ops, but nobody's bothered to do the legwork.

> The way I currently represent a point p is as: box(p, p). In this case,
> the GiST index use kicks in.

Right, that's the standard workaround. Note that I'm not sure that direct use of the box vs point operators would be any faster, but it'd surely be less surprising.

regards, tom lane
[HACKERS] Docs bug: SET ROLE docs should "see also: DISCARD ALL"
Hi folks

There's no mention anywhere in `SET ROLE` of the ability of `DISCARD ALL` to reset the role to default. Ditto `SET SESSION AUTHORIZATION`.

That's pretty important, since an app that wants to allow arbitrary SQL to be executed under an assumed user identity might be guarding against "RESET ROLE", "SET ROLE", etc. statements but not know to watch for "DISCARD". Sure, it's a bad idea to accept arbitrary SQL from a client, and filtering it is never going to be perfect, but it's clear from things like the discussion of RESET ROLE in SECURITY DEFINER functions that this is something people do and is a concern.

BTW, it'd be *really* nice if there were a way to:

    SET ROLE some_role RESET_COOKIE 'random-garbage';

that prevented RESET ROLE without supplying the RESET_COOKIE. Ditto again for `SET SESSION AUTHORIZATION`. For that matter it'd be helpful even to have a "NORESET" option that made it act like a privilege drop, where DISCARD ALL, RESET ROLE, RESET SESSION AUTHORIZATION etc. after a SET ROLE somebody NORESET or SET SESSION AUTHORIZATION somebody NORESET just did nothing.

--
Craig Ringer
Re: [HACKERS] Support for REINDEX CONCURRENTLY
Greg Stark writes:
> I'm a bit puzzled why we're so afraid of swapping the relfilenodes
> when that's what the current REINDEX does.

Swapping the relfilenodes is fine *as long as you have exclusive lock*. The trick is to make it safe without that. It will definitely not work to do that without exclusive lock, because at the instant you would try it, people will be accessing the new index (by OID).

regards, tom lane
[HACKERS] Make CREATE AGGREGATE check validity of initcond value?
In http://archives.postgresql.org/pgsql-general/2012-10/msg00138.php we see an example where a user tried to create an aggregate whose "initcond" (initial transition value) wasn't valid for the transition data type. CREATE AGGREGATE didn't complain because it just stores the initial condition as a text string. It seems to me that it'd be a lot more user-friendly if it did check the value, as per the attached proposed patch. Does anyone have an objection to this?

I can imagine cases where the check would reject values that would get accepted at runtime, if the type's input function was sensitive to the phase of the moon or something. But it doesn't seem very probable, whereas checking the value seems like an eminently useful thing to do. Or maybe I'm just overreacting to the report --- I can't recall any previous complaints like this, so maybe entering a bogus initcond is a corner case too.

regards, tom lane

diff --git a/src/backend/commands/aggregatecmds.c b/src/backend/commands/aggregatecmds.c
index c99c07c..b9f8711 100644
*** a/src/backend/commands/aggregatecmds.c
--- b/src/backend/commands/aggregatecmds.c
*** DefineAggregate(List *name, List *args,
*** 61,66 ****
--- 61,67 ----
  	Oid		   *aggArgTypes;
  	int			numArgs;
  	Oid			transTypeId;
+ 	char		transTypeType;
  	ListCell   *pl;
  
  	/* Convert list of names to a name and namespace */
*** DefineAggregate(List *name, List *args,
*** 181,187 ****
  	 * aggregate.
  	 */
  	transTypeId = typenameTypeId(NULL, transType);
! 	if (get_typtype(transTypeId) == TYPTYPE_PSEUDO &&
  		!IsPolymorphicType(transTypeId))
  	{
  		if (transTypeId == INTERNALOID && superuser())
--- 182,189 ----
  	 * aggregate.
  	 */
  	transTypeId = typenameTypeId(NULL, transType);
! 	transTypeType = get_typtype(transTypeId);
! 	if (transTypeType == TYPTYPE_PSEUDO &&
  		!IsPolymorphicType(transTypeId))
  	{
  		if (transTypeId == INTERNALOID && superuser())
*** DefineAggregate(List *name, List *args,
*** 194,199 ****
--- 196,219 ----
  	}
  
  	/*
+ 	 * If we have an initval, and it's not for a pseudotype (particularly a
+ 	 * polymorphic type), make sure it's acceptable to the type's input
+ 	 * function. We will store the initval as text, because the input
+ 	 * function isn't necessarily immutable (consider "now" for timestamp),
+ 	 * and we want to use the runtime not creation-time interpretation of the
+ 	 * value. However, if it's an incorrect value it seems much more
+ 	 * user-friendly to complain at CREATE AGGREGATE time.
+ 	 */
+ 	if (initval && transTypeType != TYPTYPE_PSEUDO)
+ 	{
+ 		Oid			typinput,
+ 					typioparam;
+ 
+ 		getTypeInputInfo(transTypeId, &typinput, &typioparam);
+ 		(void) OidInputFunctionCall(typinput, initval, typioparam, -1);
+ 	}
+ 
+ 	/*
  	 * Most of the argument-checking is done inside of AggregateCreate
  	 */
  	AggregateCreate(aggName,	/* aggregate name */
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Thu, Oct 4, 2012 at 2:19 AM, Michael Paquier wrote:
>> I think what you'd have to do is drop the old index (relying on the
>> assumption that no one is accessing it anymore after a certain point, so
>> you can take exclusive lock on it now) and then rename the new index
>> to have the old index's name. However, renaming an index without
>> exclusive lock on it still seems a bit risky. Moreover, what if you
>> crash right after committing the drop of the old index?

I think this would require a new state which is the converse of indisvalid=f. Right now there's no state the index can be in that means the index should be ignored for both scans and maintenance but might have old sessions that might be using it or maintaining it.

I'm a bit puzzled why we're so afraid of swapping the relfilenodes when that's what the current REINDEX does. It seems flaky to have two different mechanisms depending on which mode is being used. It seems more conservative to use the same mechanism and just figure out what's required to ensure it's safe in both modes. At least there won't be any bugs from unexpected consequences that aren't locking related if it's using the same mechanics.

--
greg
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
On 10/03/2012 09:23 PM, Tom Lane wrote:
> A bigger problem with this is that it only fixes the issue for cases in
> which somebody makes new threads of control with fork(). I believe that
> issues involving multiple threads trying to use the same PGconn are at
> least as widespread. I'm not terribly excited about removing
> functionality and adding overhead to protect against just one variant
> of the problem.

I had the same thought re threads.

cheers

andrew
Re: [HACKERS] FDW for PostgreSQL
On Fri, Sep 14, 2012 at 9:25 AM, Shigeru HANADA wrote:
> Hi all,
>
> I'd like to propose FDW for PostgreSQL as a contrib module again.
> Attached patch is updated version of the patch proposed in 9.2
> development cycle.

Very nice.

> - binary transfer (only against servers with same PG major version?)

Unfortunately this is not enough -- at least not without some additional work. The main problem is user-defined types, especially composites. The binary wire format sends internal member OIDs which the receiving server will have to interpret.

merlin
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
Andres Freund writes:
> On Thursday, October 04, 2012 12:16:14 AM Daniel Farina wrote:
>> I suppose this might needlessly eliminate someone who forks and hands
>> off the PGconn struct to exactly one child, but it's hard to argue
>> with its simplicity and portability of mechanism.
>
> Even that scenario isn't easy to get right... You need to get the socket
> from libpq in the parent after the fork() and close it. Only after that
> you can PQfinish it. Which you probably need to do before establishing
> other connections.

No, it's much easier than that: the parent can simply forget that it has a PGconn. It will leak the memory occupied by the PGconn object, and it will leak an open socket (which will only be half-open after the child does PQfinish). This would be noticeable if the parent is long-lived and creates many such connections over its lifespan, but otherwise people could be doing it just fine. In fact, I had to look closely to convince myself that pgbench didn't do it already.

I suspect that if we provide a mechanism like this, we'll have to provide a way to turn it off, or somebody is going to complain that we broke their code.

regards, tom lane
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2012/10/04, at 10:00, Tom Lane wrote:
> Michael Paquier writes:
>> On Wed, Oct 3, 2012 at 5:10 PM, Andres Freund wrote:
>> 14) Swap new and old indexes, consisting here in switching their names.
>>> I think switching based on their names is not going to work very well
>>> because indexes are referenced by oid at several places. Swapping
>>> pg_index.indexrelid or pg_class.relfilenode seems to be the better
>>> choice to me. We expect relfilenode changes for such commands, but not
>>> ::regclass oid changes.
>
>> OK, if there is a choice to be made, switching relfilenode would be a
>> better choice as it points to the physical storage itself. It looks more
>> straight-forward than switching oids, and takes the switch at the root.
>
> Andres is quite right that "switch by name" is out of the question ---
> for the most part, the system pays no attention to index names at all.
> It just gets a list of the OIDs of indexes belonging to a table and
> works with that.

Sure. Switching by changing the index name was just the direction taken by the first version of the patch, and only that. I wrote that version without really looking for a bulletproof solution, only for something to discuss.

> However, I'm pretty suspicious of the idea of switching relfilenodes as
> well. You generally can't change the relfilenode of a relation (either
> a table or an index) without taking an exclusive lock on it, because
> changing the relfilenode *will* break any concurrent operations on the
> index. And there is not anyplace in the proposed sequence where it's
> okay to have exclusive lock on both indexes, at least not if the goal
> is to not block concurrent updates at any time.

OK. As the goal is to allow concurrent operations, this is not reliable either. So what remains is the method of switching the OIDs of the old and new indexes in pg_index? Any other candidates?

> I think what you'd have to do is drop the old index (relying on the
> assumption that no one is accessing it anymore after a certain point, so
> you can take exclusive lock on it now) and then rename the new index
> to have the old index's name. However, renaming an index without
> exclusive lock on it still seems a bit risky. Moreover, what if you
> crash right after committing the drop of the old index?
>
> I'm really not convinced that we have a bulletproof solution yet,
> at least not if you insist on the replacement index having the same name
> as the original. How badly do we need that?

And we do not really need such a solution, as I am not insisting on the method that switches indexes by changing names. I am open to a reliable and robust method, and I hope that method can be decided in this thread. Thanks for those arguments; I feel they are leading the discussion in a good direction.

Thanks.

Michael
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
Daniel Farina writes:
> On Wed, Oct 3, 2012 at 3:14 PM, Andres Freund wrote:
>> Hm. An easier version of this could just be storing the pid of the
>> process that did the PQconnectdb* in the PGconn struct. You can then
>> check that PGconn->pid == getpid() at relatively few places and error
>> out on a mismatch. That should be doable with only minor overhead.
>
> I suppose this might needlessly eliminate someone who forks and hands
> off the PGconn struct to exactly one child, but it's hard to argue
> with its simplicity and portability of mechanism.

Yeah, the same thing had occurred to me, but I'm not sure that getpid() is really cheap enough to claim that the overhead is negligible.

A bigger problem with this is that it only fixes the issue for cases in which somebody makes new threads of control with fork(). I believe that issues involving multiple threads trying to use the same PGconn are at least as widespread. I'm not terribly excited about removing functionality and adding overhead to protect against just one variant of the problem.

regards, tom lane
Re: [HACKERS] PQping command line tool
Phil Sorber writes:
> How about adding it as an option to psql? That's not to say that I
> think we shouldn't also add it to 'pg_ctl status'.
> I was looking at the code, and originally I was using the return code to
> signify what the status was, plus some text output when quiet wasn't
> set; but psql has its own set of established return codes. How does
> everyone feel about using different return codes when psql is in the
> PQping mode?

Personally, I think adding this to psql has nothing to recommend it: it would be shoehorning an unrelated behavior in among what are already too many constraints. If we're going to do it at all, it should be a stand-alone tool. If it's not worth that much work, it's not worth doing.

regards, tom lane
Re: [HACKERS] Support for REINDEX CONCURRENTLY
Michael Paquier writes:
> On Wed, Oct 3, 2012 at 5:10 PM, Andres Freund wrote:
> 14) Swap new and old indexes, consisting here in switching their names.
>> I think switching based on their names is not going to work very well
>> because indexes are referenced by oid at several places. Swapping
>> pg_index.indexrelid or pg_class.relfilenode seems to be the better
>> choice to me. We expect relfilenode changes for such commands, but not
>> ::regclass oid changes.
> OK, if there is a choice to be made, switching relfilenode would be a
> better choice as it points to the physical storage itself. It looks more
> straight-forward than switching oids, and takes the switch at the root.

Andres is quite right that "switch by name" is out of the question --- for the most part, the system pays no attention to index names at all. It just gets a list of the OIDs of indexes belonging to a table and works with that.

However, I'm pretty suspicious of the idea of switching relfilenodes as well. You generally can't change the relfilenode of a relation (either a table or an index) without taking an exclusive lock on it, because changing the relfilenode *will* break any concurrent operations on the index. And there is not anyplace in the proposed sequence where it's okay to have exclusive lock on both indexes, at least not if the goal is to not block concurrent updates at any time.

I think what you'd have to do is drop the old index (relying on the assumption that no one is accessing it anymore after a certain point, so you can take exclusive lock on it now) and then rename the new index to have the old index's name. However, renaming an index without exclusive lock on it still seems a bit risky. Moreover, what if you crash right after committing the drop of the old index?

I'm really not convinced that we have a bulletproof solution yet, at least not if you insist on the replacement index having the same name as the original. How badly do we need that?

regards, tom lane
[HACKERS] bison location reporting for potentially-empty list productions
In the just-committed patch for CREATE SCHEMA IF NOT EXISTS, there is an error thrown by the grammar when IF NOT EXISTS is specified together with any schema-element clauses. I thought it would make more sense for the error cursor to point at the schema-element clauses, rather than at the IF NOT EXISTS as submitted. So the code looks like, eg,

	| CREATE SCHEMA IF_P NOT EXISTS ColId OptSchemaEltList
		...
		if ($7 != NIL)
			ereport(ERROR,
					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
					 errmsg("CREATE SCHEMA IF NOT EXISTS cannot include schema elements"),
					 parser_errposition(@7)));

However, it turns out this actually results in a cursor pointing at the previous word:

	CREATE SCHEMA IF NOT EXISTS test_schema_1 -- fail, disallowed
	       CREATE TABLE abc (
	              a serial,
	              b int UNIQUE
	       );
	ERROR:  CREATE SCHEMA IF NOT EXISTS cannot include schema elements
	LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1
	                                    ^

After some study I think what is happening is that there's a deficiency in the location-tracking macro in gram.y:

	/* Location tracking support --- simpler than bison's default */
	#define YYLLOC_DEFAULT(Current, Rhs, N) \
		do { \
			if (N) \
				(Current) = (Rhs)[1]; \
			else \
				(Current) = (Rhs)[0]; \
		} while (0)

OptSchemaEltList looks like this:

	OptSchemaEltList:
		OptSchemaEltList schema_stmt	{ $$ = lappend($1, $2); }
		| /* EMPTY */					{ $$ = NIL; }
		;

When the macro is evaluated for the initial empty production for OptSchemaEltList, the "else" case causes it to index off the beginning of the array and pick up the location for the preceding token. Then, each succeeding reduction of the first production copies that bogus value from the first RHS member item.

So this same problem applies not only to OptSchemaEltList but to any case where we've formed a zero-or-more-element-list production in this style. So far as I can tell, there are no other cases in the current grammar where we make use of the position of a list nonterminal for error-reporting purposes, which is why we'd not noticed this before. But it seems likely to come up again. It seems clear to me now that the macro ought at least to be changed to

	#define YYLLOC_DEFAULT(Current, Rhs, N) \
		do { \
			if (N) \
				(Current) = (Rhs)[1]; \
			else \
				(Current) = -1; \
		} while (0)

so that the parse location attributed to an empty production is -1 (ie, unknown) rather than the current quite bogus behavior of marking it as the preceding token's location. (For one thing, what if there *isn't* a preceding token? That's not possible I think in the current grammar, but if it were possible this code would be indexing off the beginning of the locations array.)

But that change wouldn't really fix the issue for CREATE SCHEMA --- the -1 would be propagated up to all instances of OptSchemaEltList even after we'd reduced the first production a few times, so that we'd end up printing no error cursor for this case. To produce a really useful cursor for this type of situation I think we would have to do something like this:

	#define YYLLOC_DEFAULT(Current, Rhs, N) \
		do { \
			int i; \
			(Current) = -1; \
			for (i = 1; i <= (N); i++) \
			{ \
				(Current) = (Rhs)[i]; \
				if ((Current) >= 0) \
					break; \
			} \
		} while (0)

This is pretty ugly and seems likely to create a visible hit on the parser's speed (which is already not that good). I'm not sure it's worth it for something that seems to be a corner case --- anyway we've not hit it before in six years since the location tracking support was added.

At this point I'm inclined to change the macro to the second case (with the -1) and accept a less good error cursor position for the CREATE SCHEMA error. Has anybody got a better idea?

regards, tom lane
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
On Oct 3, 2012, at 4:49 PM, Tom Lane wrote:
> Committed with some adjustments, notably repairing the
> order-of-operations error I complained about before.

Awesome, thanks!

David
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Fabrízio de Royes Mello writes:
> The attached patch implements the behavior we've discussed.

Committed with some adjustments, notably repairing the order-of-operations error I complained about before.

regards, tom lane
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wed, Oct 3, 2012 at 5:10 PM, Andres Freund wrote:
>> 14) Swap new and old indexes, consisting here in switching their names.
> I think switching based on their names is not going to work very well
> because indexes are referenced by oid at several places. Swapping
> pg_index.indexrelid or pg_class.relfilenode seems to be the better
> choice to me. We expect relfilenode changes for such commands, but not
> ::regclass oid changes.

OK, if there is a choice to be made, switching relfilenode would be a better choice, as it points to the physical storage itself. It looks more straightforward than switching OIDs, and takes the switch at the root.

Btw, there is still something I wanted to clarify. You mention in your ideas "old" and "new" indexes, as in: we create a new index at the beginning and drop the old one at the end. This is not completely true in the case of switching relfilenodes. What happens is that we create a new index with new physical storage; then, at the swap step, we switch the old storage and the new storage. Once the swap is done, the index that needs to be set as invalid and not ready is not the old index, but the index created at the beginning of the process, which now has the old relfilenode. The relation that is dropped at the end of the process is likewise the index with the old relfilenode, i.e. the index created at the beginning of the process. I understand that this is playing with words, but I just wanted to confirm that we are on the same page.
--
Michael Paquier
http://michael.otacoo.com
[HACKERS] Question on "box @> point" using GiST index on boxes
Hello,

I would like to test the containment of a point against many boxes. I did not find a way to express "box @> point" in a straightforward way such that the GiST index on the boxes is exploited. The only way to use a point directly is to turn the box into a polygon. Is it a missing feature?

The way I currently represent a point p is as: box(p, p). In this case, the GiST index use kicks in.

Regards,
Ralf

--
drop table if exists boxes cascade;

create table boxes (
  b box
);

-- Some random data
insert into boxes
select box(point((random()*100)::int, (random()*100)::int),
           point((random()*100)::int, (random()*100)::int))
from (select * from generate_series(1,1000)) as t;

create index i on boxes using gist (b);
vacuum analyze boxes;

explain select * from boxes where b @> '((0,0),(0,0))'::box;
explain select * from boxes where b::polygon @> '(0,0)'::point;

RESULT:

                           QUERY PLAN
-----------------------------------------------------------------
 Index Scan using i on boxes  (cost=0.00..8.27 rows=1 width=32)
   Index Cond: (b @> '(0,0),(0,0)'::box)
(2 rows)

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on boxes  (cost=0.00..23.00 rows=500 width=32)
   Filter: ((b)::polygon @> '(0,0)'::point)
(2 rows)
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
Martijn van Oosterhout writes:
> On Wed, Oct 03, 2012 at 01:57:47PM -0400, Bruce Momjian wrote:
>> On Wed, Oct 3, 2012 at 01:53:28PM -0400, Tom Lane wrote:
>>> How exactly would a library prevent such problems? In particular,
>>> let's see a proposal for how libpq might make it look like a fork
>>> was transparent for an open connection.
>> I guess that is impossible.
> Well, not quite impossible. A simple solution would be to use
> pthread_atfork() to register a handler that puts the socket into an
> invalid state in either the parent or the child.

That doesn't make it "transparent" --- that merely ensures that we break one of the two currently-working use cases (namely, that either the parent or the child can continue to use the connection as long as the other doesn't touch it).

regards, tom lane
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
On Wed, Oct 3, 2012 at 3:34 PM, Andres Freund wrote: > On Thursday, October 04, 2012 12:16:14 AM Daniel Farina wrote: >> On Wed, Oct 3, 2012 at 3:14 PM, Andres Freund wrote: >> > On Thursday, October 04, 2012 12:08:18 AM Daniel Farina wrote: >> >> It would be fantastic for libpq to somehow monitor use of a connection >> >> from multiple PIDs that share a parent and deliver an error indicating >> >> what is wrong. Unfortunately detecting that would require either a >> >> file or some kind of shared memory map, AFAIK, and I don't know how >> >> keen anyone is on accepting that patch. So, may I ask: how keen is >> >> anyone on accepting such a patch, and under what conditions of >> >> mechanism? >> > >> > Hm. An easier version of this could just be storing the pid of the >> > process that did the PQconnectdb* in the PGconn struct. You can then >> > check that PGconn->pid == getpid() at relatively few places and error >> > out on a mismatch. That should be doable with only minor overhead. >> >> I suppose this might needlessly eliminate someone who forks and hands >> off the PGconn struct to exactly one child, but it's hard to argue >> with its simplicity and portability of mechanism. > Even that scenario isn't easy to get right... You need to get the socket from > libpq in the parent after the fork() and close it. Only after that you can > PQfinish it. Which you probably need to do before establishing other > connections. Well, as a general rule, people care a lot less about "that strange error that happens when I'm done" as opposed to "that thing that happens randomly while I'm mid-workload," so odds are very slim that I'd see that defect reported -- I think chances are also poor that that bug would go fixed in applications that have it, because the impact would appear minor, so as per the natural order of things it'll be deprioritized indefinitely. But I see your point: the number of intentional abusers might be even smaller than I thought. 
-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
On Thursday, October 04, 2012 12:16:14 AM Daniel Farina wrote: > On Wed, Oct 3, 2012 at 3:14 PM, Andres Freund wrote: > > On Thursday, October 04, 2012 12:08:18 AM Daniel Farina wrote: > >> It would be fantastic for libpq to somehow monitor use of a connection > >> from multiple PIDs that share a parent and deliver an error indicating > >> what is wrong. Unfortunately detecting that would require either a > >> file or some kind of shared memory map, AFAIK, and I don't know how > >> keen anyone is on accepting that patch. So, may I ask: how keen is > >> anyone on accepting such a patch, and under what conditions of > >> mechanism? > > > > Hm. An easier version of this could just be storing the pid of the > > process that did the PQconnectdb* in the PGconn struct. You can then > > check that PGconn->pid == getpid() at relatively few places and error > > out on a mismatch. That should be doable with only minor overhead. > > I suppose this might needlessly eliminate someone who forks and hands > off the PGconn struct to exactly one child, but it's hard to argue > with its simplicity and portability of mechanism. Even that scenario isn't easy to get right... You need to get the socket from libpq in the parent after the fork() and close it. Only after that you can PQfinish it. Which you probably need to do before establishing other connections. The only scenario where I can dream up some use for doing something like that isn't really convincing and revolves around setreuid() and peer authentication. But you can do the setreuid after the fork just fine... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
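A minimal sketch of the hand-off dance Andres describes, using plain POSIX descriptors rather than libpq: `fake_conn`, `fake_finish`, and `parent_release_after_fork` are invented stand-ins. With real libpq the equivalent would presumably be `close(PQsocket(conn))` in the parent followed by `PQfinish(conn)`, so the terminate message is never written onto the socket the child now owns.

```c
#include <sys/socket.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for a PGconn: a descriptor plus some heap state. */
typedef struct
{
    int   sock;
    char *buffer;   /* pretend protocol buffer */
} fake_conn;

/* Stand-in for PQfinish: frees state and closes the socket (which in
 * real libpq would also send a terminate message over the wire). */
static void
fake_finish(fake_conn *conn)
{
    if (conn->sock >= 0)
        close(conn->sock);
    free(conn->buffer);
}

/* What the parent must do after fork() when the child keeps the
 * connection: close its own copy of the descriptor first, so releasing
 * the struct cannot write anything onto the child's wire. */
static void
parent_release_after_fork(fake_conn *conn)
{
    close(conn->sock);   /* parent's copy only; the child's fd stays open */
    conn->sock = -1;     /* fake_finish now only frees memory */
    fake_finish(conn);
}
```

Note the ordering: if the parent called `fake_finish` (or `PQfinish`) directly, the close would be fine, but anything written during shutdown would land on the descriptor the child is still using.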
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
On Wed, Oct 3, 2012 at 3:14 PM, Andres Freund wrote: > On Thursday, October 04, 2012 12:08:18 AM Daniel Farina wrote: >> It would be fantastic for libpq to somehow monitor use of a connection >> from multiple PIDs that share a parent and deliver an error indicating >> what is wrong. Unfortunately detecting that would require either a >> file or some kind of shared memory map, AFAIK, and I don't know how >> keen anyone is on accepting that patch. So, may I ask: how keen is >> anyone on accepting such a patch, and under what conditions of >> mechanism? > Hm. An easier version of this could just be storing the pid of the process > that did the PQconnectdb* in the PGconn struct. You can then check that > PGconn->pid == getpid() at relatively few places and error out on a mismatch. > That should be doable with only minor overhead. I suppose this might needlessly eliminate someone who forks and hands off the PGconn struct to exactly one child, but it's hard to argue with its simplicity and portability of mechanism. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Detecting libpq connections improperly shared via fork()
On Thursday, October 04, 2012 12:08:18 AM Daniel Farina wrote: > It would be fantastic for libpq to somehow monitor use of a connection > from multiple PIDs that share a parent and deliver an error indicating > what is wrong. Unfortunately detecting that would require either a > file or some kind of shared memory map, AFAIK, and I don't know how > keen anyone is on accepting that patch. So, may I ask: how keen is > anyone on accepting such a patch, and under what conditions of > mechanism? Hm. An easier version of this could just be storing the pid of the process that did the PQconnectdb* in the PGconn struct. You can then check that PGconn->pid == getpid() at relatively few places and error out on a mismatch. That should be doable with only minor overhead. I have seen such errors before... Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
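The pid-stamping idea could look roughly like this. It is a stand-alone sketch, not libpq code: `fake_conn`, `fake_connect`, and `pq_check_owner` are hypothetical names, and a real patch would put the field in `struct pg_conn` and run the check inside the libpq entry points.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for libpq's PGconn. */
typedef struct fake_conn
{
    int   sock;       /* connection socket */
    pid_t owner_pid;  /* pid recorded when the "connection" was made */
} fake_conn;

/* Record the creating process at "connect" time, as PQconnectdb* would. */
static void
fake_connect(fake_conn *conn, int sock)
{
    conn->sock = sock;
    conn->owner_pid = getpid();
}

/* Return 0 if the caller is the process that created the connection,
 * -1 otherwise -- the cheap cross-fork misuse check described above. */
static int
pq_check_owner(const fake_conn *conn)
{
    return (conn->owner_pid == getpid()) ? 0 : -1;
}
```

The appeal of this variant is that it needs no file or shared memory map: a `getpid()` comparison at the top of a handful of functions is enough to turn silent protocol corruption into an immediate, explainable error.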
Re: [HACKERS] PQping command line tool
On Wed, Oct 3, 2012 at 11:42 AM, Pavel Stehule wrote: > 2012/10/3 Phil Sorber : >> On Wed, Oct 3, 2012 at 11:35 AM, Bruce Momjian wrote: >>> On Tue, Oct 2, 2012 at 11:01:36PM -0400, Phil Sorber wrote: I was wondering recently if there was any command line tool that utilized PQping() or PQpingParams(). I searched the code and couldn't find anything and was wondering if there was any interest to have something like this included? I wrote something for my purposes of performing a health check that also supports nagios style status output. It's probably convenient for scripting purposes as well. It's not currently ready for submission to a commitfest, but if there was an interest I would clean it up so that it would be. >>> >>> I don't see any tool using PQping except pg_ctl. Perhaps we should >>> modify "pg_ctl status" to use PQping. Right now is only checks the >>> postmaster.pid file, and checks to see that the pid is a running >>> postmaster. What it currently doesn't do is to check if the server is >>> accepting connections with PQping(), like we do for "pg_ctl -w start". >>> >>> Comments? >> >> I was thinking that maybe this should be a new feature in an existing >> tool, however I don't think pg_ctl would satisfy my use case as it's >> normally bundled with the server. This would need to be something that >> I could install just a client package. It's not a deal breaker, but it >> makes things more complex. >> >> How about adding it as an option to psql? That's not to say that I >> think we shouldn't also add it to 'pg_ctl status'. I was looking at the code and originally I was using return code to signify what the status was and some text output when quiet wasn't set, but psql has its own set of established return codes. How does everyone feel about using different return codes when psql is in the PQping mode? I was also just printing out terse text forms of the enums: OK, NO_RESPONSE, etc. 
I was thinking they could be used in shell scripts that way, but we could do that with return codes as well. Would people like to see something more human friendly and descriptive? Also -C, --check was available and I went with that. Most of the other stuff I could think of already had the short option taken. > > +1 > > Pavel >> >>> >>> -- >>> Bruce Momjian http://momjian.us >>> EnterpriseDB http://enterprisedb.com >>> >>> + It's impossible for everything to be true. + >> >> >> -- >> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Detecting libpq connections improperly shared via fork()
Per http://archives.postgresql.org/pgsql-hackers/2012-10/msg00167.php On Wed, Oct 3, 2012 at 9:41 AM, Tom Lane wrote: > To bring that closer to home, suppose you have a program with an open > database connection in libpq, and you fork(), and then parent and child > both try to use the connection. How well would that work? Is it the > fault of fork()? This brings to mind a lot of issues we see reported among our users: I see the problem of database connections shared among processes as a reported problem *all the time*, because of the popularity of fork-based web servers. If you do something just a tiny bit wrong you end up establishing the connection before the fork and then two things can happen: * If SSL is off, you never notice until you get some really bizarre issues that result from an unfortunate collision of protocol traffic. Since many applications have idle-time, this happens rarely enough, and subtly enough, that some people never take notice. A tell-tale sign is an error reported from something that looks like one query jammed into another. * When SSL is enabled this strangeness is seen more or less immediately, but shows up as cryptic OpenSSL complaints, about which no earthly person is going to know what to do. It would be fantastic for libpq to somehow monitor use of a connection from multiple PIDs that share a parent and deliver an error indicating what is wrong. Unfortunately detecting that would require either a file or some kind of shared memory map, AFAIK, and I don't know how keen anyone is on accepting that patch. So, may I ask: how keen is anyone on accepting such a patch, and under what conditions of mechanism? As a minor functional downside, it would hurt a hypothetical user that is carefully sharing a backend file descriptor between processes using libpq and synchronizing them very carefully (notably, with encryption this becomes almost comically difficult and brittle). I conjecture such a person is almost entirely hypothetical in nature. 
-- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
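The non-SSL failure mode described above — two processes writing protocol traffic onto one inherited socket — can be reproduced with nothing but POSIX primitives. This is an illustrative sketch, not libpq code: a socketpair stands in for the server connection, and the single-byte "queries" are placeholders; the child's write is deliberately ordered before the parent's so the demo is deterministic.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent and child both write on the same inherited descriptor; the
 * "server" end sees the two streams mixed together on one wire, which
 * is the protocol corruption described above. Returns the number of
 * bytes the server end received, or -1 on setup failure. */
static int
demo_shared_socket_writes(char *out, int outlen)
{
    int sv[2];
    int total = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0)
    {
        /* child: writes its "query" on the shared descriptor and exits */
        write(sv[0], "B", 1);
        _exit(0);
    }
    waitpid(pid, NULL, 0);      /* order the writes for a deterministic demo */
    write(sv[0], "A", 1);       /* parent writes on the very same socket */
    close(sv[0]);

    /* drain the server side: both processes' bytes arrive on one stream */
    ssize_t n;
    while (total < outlen && (n = read(sv[1], out + total, outlen - total)) > 0)
        total += (int) n;
    close(sv[1]);
    return total;
}
```

In a real application the two writes race, so the server sees fragments of two messages interleaved at arbitrary byte boundaries — hence the "one query jammed into another" symptom.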
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wednesday, October 03, 2012 11:42:25 PM Michael Paquier wrote: > On 2012/10/04, at 5:41, Andres Freund wrote: > > On Wednesday, October 03, 2012 10:12:58 PM Michael Paquier wrote: > >> On 2012/10/03, at 23:52, Andres Freund wrote: > >>> On Wednesday, October 03, 2012 04:28:59 PM Tom Lane wrote: > Andres Freund writes: > > Maybe I am missing something here, but reindex concurrently should do > > 1) BEGIN > > 2) Lock table in share update exlusive > > 3) lock old index > > 3) create new index > > 4) obtain session locks on table, old index, new index > > 5) commit > > 6) process till newindex->insisready (no new locks) > > 7) process till newindex->indisvalid (no new locks) > > 8) process till !oldindex->indisvalid (no new locks) > > 9) process till !oldindex->indisready (no new locks) > > 10) drop all session locks > > 11) lock old index exlusively which should be "invisible" now > > 12) drop old index > > You can't drop the session locks until you're done. Consider somebody > else trying to do a DROP TABLE between steps 10 and 11, for instance. > >>> > >>> Yea, the session lock on the table itself probably shouldn't be > >>> dropped. If were holding only that one there shouldn't be any > >>> additional deadlock dangers when dropping the index due to lock > >>> upgrades as were doing the normal dance any DROP INDEX does. They seem > >>> pretty unlikely in a !valid !ready table > >> > >> Just à note... > >> My patch drops the locks on parent table and indexes at the end of > >> process, after dropping the old indexes ;) > > > > I think that might result in deadlocks with concurrent sessions in some > > circumstances if those other sessions already have a lower level lock on > > the index. Thats why I think dropping the lock on the index and then > > reacquiring an access exlusive might be necessary. > > Its not a too likely scenario, but why not do it right if its just 3 > > lines... > > Tom is right. 
This scenario does not cover the > parent table or you drop the index, which is indeed invisible, but still > has a pg_class and a pg_index entry, from a different session after step > 10 and before step 11. So you cannot either drop the locks on indexes > until you are done at step 12. Yep: > Yea, the session lock on the table itself probably shouldn't be dropped. But that does *not* mean you cannot avoid lock upgrade issues by dropping the lower level lock on the index first and only then acquiring the access exclusive lock. Note that dropping an index always includes *first* getting a lock on the table so doing it that way is safe and just the same as a normal drop index. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
On Wed, Oct 03, 2012 at 01:57:47PM -0400, Bruce Momjian wrote: > On Wed, Oct 3, 2012 at 01:53:28PM -0400, Tom Lane wrote: > > Bruce Momjian writes: > > > Yes, but those framework libraries are typically supposed to prevent > > > such problems from being seen by applications calling them. > > > > How exactly would a library prevent such problems? In particular, > > let's see a proposal for how libpq might make it look like a fork > > was transparent for an open connection. > > I guess that is impossible. Well, not quite impossible. A simple solution would be to use pthread_atfork() to register a handler that puts the socket into an invalid state in either the parent or the child. http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html Of course, backward compatibility issues prevent us from doing that, but it's *possible*. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
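The pthread_atfork() approach could be sketched as follows. This is a speculative stand-alone illustration with invented names (`fake_conn`, `current_conn`), not libpq code; it poisons the child's copy, but which of the two copies to invalidate is exactly the policy question raised upthread, since breaking either side breaks one currently-working use case.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical connection state; a stand-in for PGconn. */
typedef struct
{
    int sock;   /* set to -1 once invalidated */
} fake_conn;

static fake_conn *current_conn = NULL;

/* Child-side handler: mark the inherited connection unusable so any
 * later use fails cleanly instead of corrupting the protocol stream. */
static void
invalidate_in_child(void)
{
    if (current_conn != NULL)
        current_conn->sock = -1;
}

/* Register the handler once, e.g. when the first connection is made.
 * prepare = NULL and parent = NULL: only the child is affected. */
static void
register_fork_guard(void)
{
    pthread_atfork(NULL, NULL, invalidate_in_child);
}
```

After `register_fork_guard()`, every subsequent `fork()` runs `invalidate_in_child()` in the new child, so the parent keeps a working connection and the child gets a clean failure on first use rather than interleaved protocol traffic.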
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2012/10/04, at 5:41, Andres Freund wrote: > On Wednesday, October 03, 2012 10:12:58 PM Michael Paquier wrote: >> On 2012/10/03, at 23:52, Andres Freund wrote: >>> On Wednesday, October 03, 2012 04:28:59 PM Tom Lane wrote: Andres Freund writes: > Maybe I am missing something here, but reindex concurrently should do > 1) BEGIN > 2) Lock table in share update exlusive > 3) lock old index > 3) create new index > 4) obtain session locks on table, old index, new index > 5) commit > 6) process till newindex->insisready (no new locks) > 7) process till newindex->indisvalid (no new locks) > 8) process till !oldindex->indisvalid (no new locks) > 9) process till !oldindex->indisready (no new locks) > 10) drop all session locks > 11) lock old index exlusively which should be "invisible" now > 12) drop old index You can't drop the session locks until you're done. Consider somebody else trying to do a DROP TABLE between steps 10 and 11, for instance. >>> >>> Yea, the session lock on the table itself probably shouldn't be dropped. >>> If were holding only that one there shouldn't be any additional deadlock >>> dangers when dropping the index due to lock upgrades as were doing the >>> normal dance any DROP INDEX does. They seem pretty unlikely in a !valid >>> !ready table >> >> Just à note... >> My patch drops the locks on parent table and indexes at the end of process, >> after dropping the old indexes ;) > I think that might result in deadlocks with concurrent sessions in some > circumstances if those other sessions already have a lower level lock on the > index. Thats why I think dropping the lock on the index and then reacquiring > an access exlusive might be necessary. > Its not a too likely scenario, but why not do it right if its just 3 lines... Tom is right. 
This scenario does not cover the case where you drop the parent table or you drop the index, which is indeed invisible, but still has a pg_class and a pg_index entry, from a different session after step 10 and before step 11. So you cannot drop the locks on the indexes either until you are done at step 12. > > Andres > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER command reworks
Excerpts from Alvaro Herrera's message of mié oct 03 18:25:54 -0300 2012: > Excerpts from Kohei KaiGai's message of lun sep 10 08:08:32 -0300 2012: > > > As attached, I split off the original one into three portions; for > > set-schema, > > set-owner and rename-to. Please apply them in order of patch filename. > > Regarding to the error message, RenameErrorMsgAlreadyExists() was added > > to handle per object type messages in case when aclcheck_error() is not > > available to utilize. > > I have pushed the regression tests and parts 1 and 2. Only part 3 is > missing from this series, but I'm not as sure about that one as the > other two. Not really a fan of RenameErrorMsgAlreadyExists() as coded, > but I'll think more about it. I forgot to mention: I think with a little more effort (a callback to be registered as the permission check to run during SET OWNER, maybe?) we could move the foreign stuff and event triggers into the generic SET OWNER implementation. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ALTER command reworks
Excerpts from Kohei KaiGai's message of lun sep 10 08:08:32 -0300 2012: > As attached, I split off the original one into three portions; for set-schema, > set-owner and rename-to. Please apply them in order of patch filename. > Regarding to the error message, RenameErrorMsgAlreadyExists() was added > to handle per object type messages in case when aclcheck_error() is not > available to utilize. I have pushed the regression tests and parts 1 and 2. Only part 3 is missing from this series, but I'm not as sure about that one as the other two. Not really a fan of RenameErrorMsgAlreadyExists() as coded, but I'll think more about it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] FDW for PostgreSQL
Hanada-san, I tried to check this patch. Because we also had some discussion on this extension through the last two commit fests, I have no fundamental design arguments. So, let me drop into the implementation details of this patch. At postgresql_fdw/deparse.c: * Even though deparseVar() is never invoked with need_prefix=true, I wonder whether a Var reference really needs to be qualified with a relation alias. It seems to me the relation alias is never used in the remote query, so isn't it a possible bug? * deparseFuncExpr() has case handling depending on the funcformat of the FuncExpr. I think all the cases can be deparsed using an explicit function call, and that can avoid trouble when the remote host has an inconsistent cast configuration. At postgresql_fdw/connection.c: * I'm worried about the condition for invocation of begin_remote_tx(). + if (use_tx && entry->refs == 1) + begin_remote_tx(entry->conn); + entry->use_tx = use_tx; My preference is: if (use_tx && !entry->use_tx), instead. Even though there is no code path that makes the problem obvious, it may cause a difficult-to-find bug in the case when the caller tries to get a connection that is already acquired by someone else but with no transaction needed. At postgresql_fdw/postgresql_fdw.c: * When pgsqlGetForeignPaths() adds the SQL statement into fdw_private, it is implemented as: + /* Construct list of SQL statements and bind it with the path. */ + fdw_private = lappend(NIL, makeString(fpstate->sql.data)); Could you use list_make1() instead? * At the bottom half of query_row_processor(), I found these mysterious two lines. MemoryContextSwitchTo(festate->scan_cxt); MemoryContextSwitchTo(festate->temp_cxt); Why not switch to temp_cxt directly? At sgml/postgresql-fdw.sgml: * Please note that this version does not support sub-transaction handling. In particular, all we can do is abort the top-level transaction when an error occurs on the remote side within a sub-transaction. I hope to hand this patch over to a committer soon. 
Thanks, 2012/9/14 Shigeru HANADA : > Hi all, > > I'd like to propose FDW for PostgreSQL as a contrib module again. > Attached patch is updated version of the patch proposed in 9.2 > development cycle. > > For ease of review, I summarized what the patch tries to achieve. > > Abstract > > This patch provides FDW for PostgreSQL which allows users to access > external data stored in remote PostgreSQL via foreign tables. Of course > the external instance can be across the network. And I think that this FDW > could be an example of other RDBMS-based FDW, and it would be useful for > proof-of-concept of FDW-related features. > > Note that the name has been changed from "pgsql_fdw" which was used in > last proposal, since I got a comment which says that most of existing > FDWs have name "${PRODUCT_NAME}_fdw" so "postgresql_fdw" or > "postgres_fdw" would be better. For this issue, I posted another patch > which moves existing postgresql_fdw_validator into contrib/dblink with > renaming in order to reserve the name "postgresql_fdw" for this FDW. > Please note that the attached patch requires dblink_fdw_validator.patch > to be applied first. > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00454.php > > Query deparser > == > Now postgresql_fdw has its own SQL query deparser inside, so it's free > from backend's ruleutils module. > > This deparser maps object names when the generic options below were set. > > nspname of foreign table: used as namespace (schema) of relation > relname of foreign table: used as relation name > colname of foreign column: used as column name > > This mapping allows flexible schema design. > > SELECT optimization > === > postgresql_fdw always retrieves as many columns as the foreign table has from > the remote side to avoid the overhead of column mapping. However, often some of them > (or sometimes all of them) are not used on the local side, so postgresql_fdw > uses a NULL literal for such unused columns in the SELECT clause of the remote > query. 
For example, let's assume one of pgbench workloads: > > SELECT abalance FROM pgbench_accounts WHERE aid = 1; > > This query generates a remote query below. In addition to bid and > filler, aid is replaced with NULL because it's already evaluated on > remote side. > > SELECT NULL, NULL, abalance, NULL FROM pgbench_accounts > WHERE (aid OPERATOR(pg_catalog.=) 1); > > This trick would improve performance notably by reducing amount of data > to be transferred. > > One more example. Let's assume counting rows. > > SELECT count(*) FROM pgbench_accounts; > > This query requires only existence of row, so no actual column reference > is in SELECT clause. > > SELECT NULL, NULL, NULL, NULL FROM pgbench_accounts; > > WHERE push down > === > postgresql_fdw pushes down some of restrictions (IOW, top level elements > in WHERE clause which are connected with AND) which can be evaluated on > remote side
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wednesday, October 03, 2012 10:12:58 PM Michael Paquier wrote: > On 2012/10/03, at 23:52, Andres Freund wrote: > > On Wednesday, October 03, 2012 04:28:59 PM Tom Lane wrote: > >> Andres Freund writes: > >>> Maybe I am missing something here, but reindex concurrently should do > >>> 1) BEGIN > >>> 2) Lock table in share update exlusive > >>> 3) lock old index > >>> 3) create new index > >>> 4) obtain session locks on table, old index, new index > >>> 5) commit > >>> 6) process till newindex->insisready (no new locks) > >>> 7) process till newindex->indisvalid (no new locks) > >>> 8) process till !oldindex->indisvalid (no new locks) > >>> 9) process till !oldindex->indisready (no new locks) > >>> 10) drop all session locks > >>> 11) lock old index exlusively which should be "invisible" now > >>> 12) drop old index > >> > >> You can't drop the session locks until you're done. Consider somebody > >> else trying to do a DROP TABLE between steps 10 and 11, for instance. > > > > Yea, the session lock on the table itself probably shouldn't be dropped. > > If were holding only that one there shouldn't be any additional deadlock > > dangers when dropping the index due to lock upgrades as were doing the > > normal dance any DROP INDEX does. They seem pretty unlikely in a !valid > > !ready table > > Just à note... > My patch drops the locks on parent table and indexes at the end of process, > after dropping the old indexes ;) I think that might result in deadlocks with concurrent sessions in some circumstances if those other sessions already have a lower level lock on the index. That's why I think dropping the lock on the index and then reacquiring an access exclusive lock might be necessary. It's not a too likely scenario, but why not do it right if it's just 3 lines... 
Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade does not completely honor --new-port
On Wed, Oct 3, 2012 at 05:16:55PM -0300, Alvaro Herrera wrote: > Excerpts from Devrim GÜNDÜZ's message of mié oct 03 17:00:16 -0300 2012: > > > > Hi, > > > > On Wed, 2012-09-26 at 22:06 -0400, Bruce Momjian wrote: > > > > > > I just performed a test upgrade from 9.1 to 9.2, and used > > > > --new-port variable. However, the analyze_new_cluster.sh does not > > > > include the new port, thus when I run it, it fails. Any chance to > > > > add the port number to the script? > > > > > > Well, the reason people normally use the port number is to do a live > > > check, but obviously when the script is created it isn't doing a > > > check. I am worried that if I do embed the port number in there, then > > > if they change the port after the upgrade, they now can't use the > > > script. I assume users would have PGPORT set before running the > > > script, no? > > > > They can't use the script in each way -- at least we can make it usable > > for one case, I think. > > Well, you could have the script set the port number only if the variable > is not set from the calling shell ... you know, > PGPORT=${PGPORT:=the_other_number} . That way, if the user wants to > specify a different port, they have to set PGPORT before calling the > script. Good idea, but that is only going to work on Unix, and in fact only using certain shells. I don't think we want to go there, do we? I could expand that out to a normal shell _if_ statement, but again, only works on Unix. What we _could_ do is to add a comment line at the top that defines a string that can be supplied, and default it to the port number; that would work on Unix and Windows, e.g. # uncomment and adjust if you want a special port number # PGPORT_STR="-p 5435" # export PGPORT For Windows it would be "REM". Is everyone happy with that? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. 
+ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade does not completely honor --new-port
On Wed, Oct 3, 2012 at 11:00:16PM +0300, Devrim Gunduz wrote: > > Hi, > > On Wed, 2012-09-26 at 22:06 -0400, Bruce Momjian wrote: > > > > I just performed a test upgrade from 9.1 to 9.2, and used > > > --new-port variable. However, the analyze_new_cluster.sh does not > > > include the new port, thus when I run it, it fails. Any chance to > > > add the port number to the script? > > > > Well, the reason people normally use the port number is to do a live > > check, but obviously when the script is created it isn't doing a > > check. I am worried that if I do embed the port number in there, then > > if they change the port after the upgrade, they now can't use the > > script. I assume users would have PGPORT set before running the > > script, no? > > They can't use the script in each way -- at least we can make it usable > for one case, I think. Well, my assumption is that they are unlikely to move the old _binary_ directory, but they are more likely to change the port number. My point is that if they change the port number to the default from a non-default, or they set the PGPORT environment variable, the script will work. If we hard-code the port, it would not work. In fact, pg_upgrade defaults to use port 50432 if they don't supply one. We would embed the port number only if they supplied a custom port number, but again, they might change that before going live with the new server. I guess I am confused why you would use pg_upgrade, and start the new server on a non-default port that isn't the same as PGPORT. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade does not completely honor --new-port
Excerpts from Devrim GÜNDÜZ's message of mié oct 03 17:00:16 -0300 2012: > > Hi, > > On Wed, 2012-09-26 at 22:06 -0400, Bruce Momjian wrote: > > > > I just performed a test upgrade from 9.1 to 9.2, and used > > > --new-port variable. However, the analyze_new_cluster.sh does not > > > include the new port, thus when I run it, it fails. Any chance to > > > add the port number to the script? > > > > Well, the reason people normally use the port number is to do a live > > check, but obviously when the script is created it isn't doing a > > check. I am worried that if I do embed the port number in there, then > > if they change the port after the upgrade, they now can't use the > > script. I assume users would have PGPORT set before running the > > script, no? > > They can't use the script in each way -- at least we can make it usable > for one case, I think. Well, you could have the script set the port number only if the variable is not set from the calling shell ... you know, PGPORT=${PGPORT:=the_other_number} . That way, if the user wants to specify a different port, they have to set PGPORT before calling the script. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 2012/10/03, at 23:52, Andres Freund wrote: > On Wednesday, October 03, 2012 04:28:59 PM Tom Lane wrote: >> Andres Freund writes: >>> Maybe I am missing something here, but reindex concurrently should do >>> 1) BEGIN >>> 2) Lock table in share update exclusive >>> 3) lock old index >>> 3) create new index >>> 4) obtain session locks on table, old index, new index >>> 5) commit >>> 6) process till newindex->indisready (no new locks) >>> 7) process till newindex->indisvalid (no new locks) >>> 8) process till !oldindex->indisvalid (no new locks) >>> 9) process till !oldindex->indisready (no new locks) >>> 10) drop all session locks >>> 11) lock old index exclusively which should be "invisible" now >>> 12) drop old index >> >> You can't drop the session locks until you're done. Consider somebody >> else trying to do a DROP TABLE between steps 10 and 11, for instance. > Yea, the session lock on the table itself probably shouldn't be dropped. If > we're holding only that one there shouldn't be any additional deadlock dangers > when dropping the index due to lock upgrades as we're doing the normal dance > any DROP INDEX does. They seem pretty unlikely in a !valid !ready table > Just a note... My patch drops the locks on the parent table and indexes at the end of the process, after dropping the old indexes ;) Michael > > Greetings, > > Andres > -- > Andres Freund    http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] pg_upgrade does not completely honor --new-port
Hi, On Wed, 2012-09-26 at 22:06 -0400, Bruce Momjian wrote: > > I just performed a test upgrade from 9.1 to 9.2, and used > > --new-port variable. However, the analyze_new_cluster.sh does not > > include the new port, thus when I run it, it fails. Any chance to > > add the port number to the script? > > Well, the reason people normally use the port number is to do a live > check, but obviously when the script is created it isn't doing a > check. I am worried that if I do embed the port number in there, then > if they change the port after the upgrade, they now can't use the > script. I assume users would have PGPORT set before running the > script, no? They can't use the script either way -- at least we can make it usable for one case, I think. Regards, -- Devrim GÜNDÜZ Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
Re: [HACKERS] Hash id in pg_stat_statements
On Wed, Oct 3, 2012 at 11:04 AM, Tom Lane wrote: > Daniel Farina writes: >> Instead, I think it makes sense to assign a number -- arbitrarily, but >> uniquely -- to the generation of a new row in pg_stat_statements, and, >> on the flip side, whenever a row is retired its number should be >> eliminated, practically, for-ever. This way re-introductions between >> two samplings of pg_stat_statements cannot be confused for a >> contiguously maintained statistic on a query. > > This argument seems sensible to me. Is there any use-case where the > proposed counter wouldn't do what people wished to do with an exposed > hash value? Yes: unless schemas are included in the canonicalized query text, two identical texts can actually mean different things. This is the only corner case (besides collision) I can think of. I also referenced an idea in a similar vein to the counter/arbitrary number: instead, one can perform a kind of error propagation from evicted entries in the hash table. This might avoid having to even bother saving a counter, which feels rather nice to me. I have a sketch implementation (I now know of two stupid bugs in it) and it comes out very neatly so far. I got this idea from a paper that I saw implemented once, with strikingly good effect: http://www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf Here they have a very specific rendition that is, for a couple of reasons, not quite what we want, I think. But the part I found very interesting was the simple propagation of error that made the technique possible. Inspired by this, I gave every hash entry a maximum-under-estimation error. When a member of the hash table is evicted, the minimum of the metric gathered plus its already accumulated under-estimation error is propagated to the new entry. The general property is that hash table members that are frequently evicted will accumulate huge amounts of error. Those that are never evicted (thanks to constant use) never accumulate any error. 
A tool reading the table can take into account the error accumulation to determine if there was an eviction between two samplings, as well as bounding the error accrued between two snapshots. I think there is a tiny bit of room for some sort of ambiguity in a corner case, but I'd have to think more about it. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
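For concreteness, the inheritance step described above might be sketched like this (a minimal illustration of the idea; the struct and function names are hypothetical, not the posted patch or actual pg_stat_statements code):

```c
#include <assert.h>

/*
 * Each entry tracks an observed count plus a maximum under-estimation
 * error.  On eviction, the replacement entry inherits the victim's
 * count + error as its own error bound, so a reader can tell a
 * continuously maintained statistic (error stays 0) from one that was
 * evicted and re-introduced between two samplings.
 */
typedef struct SketchEntry
{
	long		calls;		/* count observed since (re)insertion */
	long		max_error;	/* upper bound on what evictions lost */
} SketchEntry;

/* Reuse the victim's slot for a newly inserted statement. */
static void
sketch_evict_and_replace(SketchEntry *slot)
{
	long		inherited = slot->calls + slot->max_error;

	slot->calls = 0;			/* new entry starts counting afresh... */
	slot->max_error = inherited;	/* ...but carries the victim's error */
}
```

Entries that are never evicted keep max_error at zero, while slots that churn accumulate large error, which is what lets a monitoring tool bound the error accrued between two snapshots.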
Re: [HACKERS] gistchoose vs. bloat
On Mon, Oct 1, 2012 at 5:15 AM, Jeff Davis wrote: > On Tue, 2012-09-04 at 19:21 +0400, Alexander Korotkov wrote: > > > New version of patch is attached. Parameter "randomization" was > > introduced. It controls whether to randomize choose. Choose algorithm > > was rewritten. > > > Review comments: > > 1. Comment above while loop in gistRelocateBuildBuffersOnSplit needs to > be updated. > Actually, I didn't realize exactly what comment you expect. Please check whether the added comment meets your expectations. > > 2. Typo in two places: "if randomization id required". > > 3. In gistRelocateBuildBuffersOnSplit, shouldn't that be: > splitPageInfo = &relocationBuffersInfos[bufferIndex]; >not: > splitPageInfo = &relocationBuffersInfos[i]; > Fixed. > 4. It looks like the randomization is happening while trying to compare > the penalties. I think it may be more readable to separate those two > steps; e.g. > > /* create a mapping whether randomization is on or not */ > for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i)) > offsets[i - FirstOffsetNumber] = i; > > if (randomization) > /* randomize offsets array */ > > for (i = 0; i < maxoff; i++) > { > offset = offsets[i]; > ... > } > > That's just an idea; if you think it's more readable as-is (or if I am > misunderstanding) then let me know. > Actually, the current implementation comes from the idea of creating as little overhead as possible when randomization is off. I'll try to measure the overhead in the worst case. If it is low enough then your proposal looks reasonable to me. -- With best regards, Alexander Korotkov. gist_choose_bloat-0.3.patch Description: Binary data
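The separation suggested in point 4 could be sketched as a standalone Fisher–Yates shuffle over the offsets mapping (OffsetNumber is re-declared here only to keep the sketch self-contained, and a real patch would presumably use the backend's own random-number facilities rather than rand()):

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned short OffsetNumber;	/* stand-in for the backend typedef */

/*
 * Randomize an offsets mapping in place (Fisher-Yates).  In the
 * proposed structure, the caller first fills offsets[] with the
 * identity mapping, calls this only when randomization is on, and
 * then walks the array the same way in both cases.
 */
static void
shuffle_offsets(OffsetNumber *offsets, int n)
{
	int			i;

	for (i = n - 1; i > 0; i--)
	{
		int			j = rand() % (i + 1);
		OffsetNumber tmp = offsets[i];

		offsets[i] = offsets[j];
		offsets[j] = tmp;
	}
}
```

Keeping the shuffle in one place like this makes the "randomization off" path a straight identity walk with no per-comparison branching, which is the overhead concern raised in the reply.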
Re: [HACKERS] Hash id in pg_stat_statements
On 3 October 2012 19:04, Tom Lane wrote: > Daniel Farina writes: >> Instead, I think it makes sense to assign a number -- arbitrarily, but >> uniquely -- to the generation of a new row in pg_stat_statements, and, >> on the flip side, whenever a row is retired its number should be >> eliminated, practically, for-ever. This way re-introductions between >> two samplings of pg_stat_statements cannot be confused for a >> contiguously maintained statistic on a query. > > This argument seems sensible to me. Is there any use-case where the > proposed counter wouldn't do what people wished to do with an exposed > hash value? Yes. The hash could be used to aggregate query execution costs across entire WAL-based replication clusters. I'm not opposed to Daniel's suggestion, though. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Hash id in pg_stat_statements
Daniel Farina writes: > Instead, I think it makes sense to assign a number -- arbitrarily, but > uniquely -- to the generation of a new row in pg_stat_statements, and, > on the flip side, whenever a row is retired its number should be > eliminated, practically, for-ever. This way re-introductions between > two samplings of pg_stat_statements cannot be confused for a > contiguously maintained statistic on a query. This argument seems sensible to me. Is there any use-case where the proposed counter wouldn't do what people wished to do with an exposed hash value? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
On Wed, Oct 3, 2012 at 01:53:28PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > Yes, but those framework libraries are typically supposed to prevent > > such problems from being seen by applications calling them. > > How exactly would a library prevent such problems? In particular, > let's see a proposal for how libpq might make it look like a fork > was transparent for an open connection. I guess that is impossible. > > This is > > certainly sloppy practice on Apple's part, and it leave us wondering if > > we are using anything that might be a problem. The bottom line is that > > we don't know. > > > Libraries are supposed to document these limitations, as we do with > > libpq. I wonder if they just documented fork() and now don't feel they > > need to document these limitations per-library. > > Do we know that they *didn't* document such issues per-library? > Mentioning the risk under fork() too doesn't seem unreasonable. > > Not trying to sound like an Apple apologist, but I see a whole lot of > bashing going on here on the basis of darn little evidence. Well, ideally if Apple is going to brand a Unix function as unsafe, it would be good to mention which libraries are unsafe. I have no idea if they are documenting the problems in the libraries themselves. I guess my point is that the fork() warning was too vague. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
=?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= writes: > The attached patch implements the behavior we've discussed. OK, I'll pick this up again, since we seem to have consensus on this behavior. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
Bruce Momjian writes: > Yes, but those framework libraries are typically supposed to prevent > such problems from being seen by applications calling them. How exactly would a library prevent such problems? In particular, let's see a proposal for how libpq might make it look like a fork was transparent for an open connection. > This is > certainly sloppy practice on Apple's part, and it leave us wondering if > we are using anything that might be a problem. The bottom line is that > we don't know. > Libraries are supposed to document these limitations, as we do with > libpq. I wonder if they just documented fork() and now don't feel they > need to document these limitations per-library. Do we know that they *didn't* document such issues per-library? Mentioning the risk under fork() too doesn't seem unreasonable. Not trying to sound like an Apple apologist, but I see a whole lot of bashing going on here on the basis of darn little evidence. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
2012/10/3 Alvaro Herrera > Excerpts from Fabrízio de Royes Mello's message of mié oct 03 10:11:03 > -0300 2012: > > > Maybe something like this? > > > >ereport(ERROR, > >(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > > errmsg("IF NOT EXISTS cannot be used with schema elements"), > > parser_errposition(@9))); > > Seems reasonable, but where? Please submit a complete patch. > > The attached patch implements the behavior we've discussed. If we use "IF NOT EXISTS" with schema elements then an error like this occurs: [local]:5432 fabrizio@fabrizio=# CREATE SCHEMA IF NOT EXISTS test_schema_1 CREATE TABLE abc ( a serial, b int UNIQUE ); ERROR: IF NOT EXISTS cannot be used with schema elements LINE 1: CREATE SCHEMA IF NOT EXISTS test_schema_1 ^ Time: 0,773 ms Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello create_schema_if_not_exists_v7.patch Description: Binary data
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
On Wed, Oct 3, 2012 at 12:41:37PM -0400, Tom Lane wrote: > I wrote: > > Alvaro Herrera writes: > >> Noticed while perusing > >> http://lwn.net/Articles/518306/ > > > I'm afraid Brian was just looking for an excuse to dump on Apple. We > > have a lot of years of Postgres experience showing that fork() works > > fine on OS X. > > BTW, I think the commenter at the bottom of the thread puts his finger > on the core of the real problem: > > > I'd wager most libraries are not fork safe, including such libraries > > as SQLite as mentioned in the SQLite FAQ. Libraries that talk to the > > outside world contain much state that is not safe to share. > > To bring that closer to home, suppose you have a program with an open > database connection in libpq, and you fork(), and then parent and child > both try to use the connection. How well would that work? Is it the > fault of fork()? > > I think Apple is just pointing out that their framework libraries have > similar issues. Yes, but those framework libraries are typically supposed to prevent such problems from being seen by applications calling them. This is certainly sloppy practice on Apple's part, and it leaves us wondering if we are using anything that might be a problem. The bottom line is that we don't know. Libraries are supposed to document these limitations, as we do with libpq. I wonder if they just documented fork() and now don't feel they need to document these limitations per-library. Anyway, I agree that we need to see a failure before adjusting anything. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
I wrote: > Alvaro Herrera writes: >> Noticed while perusing >> http://lwn.net/Articles/518306/ > I'm afraid Brian was just looking for an excuse to dump on Apple. We > have a lot of years of Postgres experience showing that fork() works > fine on OS X. BTW, I think the commenter at the bottom of the thread puts his finger on the core of the real problem: > I'd wager most libraries are not fork safe, including such libraries > as SQLite as mentioned in the SQLite FAQ. Libraries that talk to the > outside world contain much state that is not safe to share. To bring that closer to home, suppose you have a program with an open database connection in libpq, and you fork(), and then parent and child both try to use the connection. How well would that work? Is it the fault of fork()? I think Apple is just pointing out that their framework libraries have similar issues. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
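The hazard described above can be demonstrated without libpq at all, since the same thing happens with any kernel object shared across fork(). Here a plain file offset stands in for a connection's protocol state (demo only; error handling omitted):

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * After fork(), parent and child share the open file description, so
 * the child's read() advances the offset the parent sees -- just as a
 * child reading from a shared PGconn socket would consume protocol
 * bytes the parent expects.
 */
static char
read_after_fork(const char *path)
{
	int			fd = open(path, O_RDONLY);
	char		c = 0;
	pid_t		pid = fork();

	if (pid == 0)
	{
		(void) read(fd, &c, 1);	/* child consumes the first byte */
		_exit(0);
	}
	waitpid(pid, NULL, 0);
	(void) read(fd, &c, 1);		/* parent then gets the *second* byte */
	close(fd);
	return c;
}
```

A PGconn layers buffered protocol state on top of the shared socket, so the corruption is even less recoverable; that is the kind of per-library caveat the thread says libpq documents.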
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
Alvaro Herrera writes: > See the CAVEATS here: > https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/fork.2.html > Apparently fork() without exec() isn't all that well supported. This doesn't say fork() doesn't work. It says that Apple's framework libraries aren't meant to work in a forked subprocess --- but we don't use those. > Noticed while perusing > http://lwn.net/Articles/518306/ I'm afraid Brian was just looking for an excuse to dump on Apple. We have a lot of years of Postgres experience showing that fork() works fine on OS X. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Tablefunc crosstab error messages
On Mon, Aug 27, 2012 at 08:07:02PM -0400, Mali Akmanalp wrote: > Returning the type information to the caller seems like a pain > but compatCrosstabTupleDescs already has instances in it where it fails > with an error message, so I propose we just do that and tell the user the > expected and actual types here, instead of returning false here and > throwing a generic error message in the caller. > > What do you think? Let me know so I can write up a patch for you guys. That would prove helpful, +1 from me. > Also, just curious, is the reason the query passed into crosstab is a > string (rather than an SQL expression) that it's just easier to implement > that way? Yes. Accepting the query as proper SQL syntax would require modifying the backend grammar. Extensions cannot do that, but they can define functions accepting a string and run it as a SQL command. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]
Hi, first, thanks for the review. Comments are below. 2012-09-20 12:30 keltezéssel, Amit Kapila írta: On Sun, 01 Jul 2012 13:02:17 +0200 Boszormenyi Zoltan wrote: >attached is a patch that does $SUBJECT. >It's a usability enhancement, to take a backup, write >a minimalistic recovery.conf and start the streaming >standby in one go. >Comments? *[Review of Patch]* *Basic stuff:* -- - Patch applies OK This is not true anymore with a newer GIT version. Some chunk for pg_basebackup.c was rejected. - Compiles cleanly with no warnings *What it does:* - The pg_basebackup tool does the backup of the cluster from the server to the specified location. This new functionality will also write the recovery.conf in the database directory and start the standby server based on options passed to pg_basebackup. *Usability* ** For the usability aspect, I am not very sure how many users would like to start the standby server using pg_basebackup. Also, Magnus raised the point that it wouldn't really work on MS Windows where you *have to* start the service via OS facilities. This part of the patch was killed. In my opinion, it can be useful for users who have automated scripts to start the server after backup. Well, scripting is not portable across UNIXes and Windows, you have to spell out starting the server differently. *Feature Testing:* - 1. Test pg_basebackup with option -R to check that the recovery.conf file is written to the data directory. --The recovery.conf file is created in the data directory. 2. Test pg_basebackup with option -R to check that the recovery.conf file cannot be created because the disk is full. --An error is given that the recovery.conf file cannot be created. 3. Test pg_basebackup with option -S to check the standby server start on the same/different machine. --Starting the standby server succeeds if pg_basebackup is taken on a different machine. 4. Test pg_basebackup with both options -S and -R to check the standby server start on the same/different machine. 
--Starting the standby server succeeds if pg_basebackup is taken on a different machine. 5. Test pg_basebackup with option -S including -h, -U, -p, -w and -W to check the standby server start and verify the recovery.conf which is created in the data directory. --Except for the password, the rest of the primary connection info parameters are working fine. The password part is now fixed. 6. Test pg_basebackup with conflicting options (-x or -X and -R or -S). --An error is given when the conflicting options are provided to pg_basebackup. 7. Test pg_basebackup with option -S where the pg_ctl/postmaster binaries are not present in the path. --An error is given that they cannot be executed. 8. Test pg_basebackup with option -S by connecting to a standby server. --The standby server is started successfully when pg_basebackup is made from a standby server also. *Code Review:* 1. In function WriteRecoveryConf(), an uninitialized filename is used, due to which it can print junk for the below line in the code printf("add password to primary_conninfo in %s if needed\n", filename); Fixed. 2. In function WriteRecoveryConf(), in the below code, if fopen fails (due to a full disk or any other file-related error) it will print the error and exit. So it can be confusing to the user whether he can consider the backup successful and proceed. IMO, either the error message or the documentation should suggest that on such an error the user can proceed with the backup, write his own recovery.conf and start the standby. +cf = fopen(filename, "w"); +if (cf == NULL) +{ +fprintf(stderr, _("cannot create %s"), filename); +exit(1); +} But the BaseBackup() function did already indicate that it has finished successfully with if (verbose) fprintf(stderr, "%s: base backup completed\n", progname); Would it be an expected (as in: not confusing) behaviour to return 0 from pg_basebackup if the backup itself has finished, but it failed to write the recovery.conf or start the standby if those were requested? 
I have modified my WriteRecoveryConf() to do exit(2) instead of exit(1) to indicate a different error. exit(1) seems to be for reporting configuration or connection errors. (I may be mistaken though.) 3. In function main, instead of the following code it can be changed in two different ways, if (startstandby) writerecoveryconf = true; change1: case 'R': writerecoveryconf = true; break; case 'S': startstandby = true; writerecoveryconf = true; break; change2: case 'S': startstandby = true; case 'R': writerecoveryconf = true;
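The "change2" variant relies on a deliberate C case fall-through ('S' falling into 'R'), which would deserve a comment in the actual patch. A small sketch, using hypothetical flag variables matching the quoted excerpt:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of "change2": -S implies -R via fall-through, so
 * writerecoveryconf is set in exactly one place instead of needing a
 * separate "if (startstandby) writerecoveryconf = true;" fixup later.
 */
static void
handle_option(int c, bool *startstandby, bool *writerecoveryconf)
{
	switch (c)
	{
		case 'S':
			*startstandby = true;
			/* fall through: starting the standby needs recovery.conf */
		case 'R':
			*writerecoveryconf = true;
			break;
	}
}
```

The fall-through keeps the implication "-S requires -R" local to the option parsing, at the cost of needing the comment so the missing break is clearly intentional.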
Re: [HACKERS] do we EXEC_BACKEND on Mac OS X?
On Wed, Oct 3, 2012 at 01:05:54PM -0300, Alvaro Herrera wrote: > See the CAVEATS here: > > https://developer.apple.com/library/mac/#documentation/Darwin/Referenc > e/ManPages/man2/fork.2.html > > Apparently fork() without exec() isn't all that well supported. > > Noticed while perusing http://lwn.net/Articles/518306/ I think this comment is more relevant: Ah, OK, I found this https://developer.apple.com/library/mac/#documentation/Da... It seems that from 10.5 this caveat was added to the official OS X documentation. In that light I think it's safest to conclude that Apple realised fork() is hard (check out the long list of things a current Linux fork does to retain sanity in the face of threads, asynchronous I/O, capabilities and other fun toys that didn't exist at the dawn of Unix) and decided they don't care. It will probably work, but if it doesn't they aren't interested in explaining why/ fixing the problem. On balance I agree this makes OS X a pretty shoddy Unix, but then, I would have been easily persuaded of that anyway. I am hesitant to avoid fork() on OS/X until someone reports a problem; the slowdown would be significant, and I don't think we use enough OS/X libraries to cause a problem for us, though Bonjour might be a problem. Anyway, good you asked and we should be aware of possible problems. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from Fabrízio de Royes Mello's message of mié oct 03 10:11:03 -0300 2012: > Maybe something like this? > >ereport(ERROR, >(errcode(ERRCODE_FEATURE_NOT_SUPPORTED), > errmsg("IF NOT EXISTS cannot be used with schema elements"), > parser_errposition(@9))); Seems reasonable, but where? Please submit a complete patch. -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] do we EXEC_BACKEND on Mac OS X?
See the CAVEATS here: https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/fork.2.html Apparently fork() without exec() isn't all that well supported. Noticed while perusing http://lwn.net/Articles/518306/ -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Missing OID define
On 2 October 2012 15:47, Phil Sorber wrote: > Thom Brown and I were doing some hacking the other day and came across > this missing define. We argued over who was going to send the patch in > and I lost. So here it is. Capital idea. +1 -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQping command line tool
2012/10/3 Phil Sorber : > On Wed, Oct 3, 2012 at 11:35 AM, Bruce Momjian wrote: >> On Tue, Oct 2, 2012 at 11:01:36PM -0400, Phil Sorber wrote: >>> I was wondering recently if there was any command line tool that >>> utilized PQping() or PQpingParams(). I searched the code and couldn't >>> find anything and was wondering if there was any interest to have >>> something like this included? I wrote something for my purposes of >>> performing a health check that also supports nagios style status >>> output. It's probably convenient for scripting purposes as well. It's >>> not currently ready for submission to a commitfest, but if there was >>> an interest I would clean it up so that it would be. >> >> I don't see any tool using PQping except pg_ctl. Perhaps we should >> modify "pg_ctl status" to use PQping. Right now is only checks the >> postmaster.pid file, and checks to see that the pid is a running >> postmaster. What it currently doesn't do is to check if the server is >> accepting connections with PQping(), like we do for "pg_ctl -w start". >> >> Comments? > > I was thinking that maybe this should be a new feature in an existing > tool, however I don't think pg_ctl would satisfy my use case as it's > normally bundled with the server. This would need to be something that > I could install just a client package. It's not a deal breaker, but it > makes things more complex. > > How about adding it as an option to psql? That's not to say that I > think we shouldn't also add it to 'pg_ctl status'. +1 Pavel > >> >> -- >> Bruce Momjian http://momjian.us >> EnterpriseDB http://enterprisedb.com >> >> + It's impossible for everything to be true. + > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQping command line tool
Bruce Momjian writes: > I don't see any tool using PQping except pg_ctl. Perhaps we should > modify "pg_ctl status" to use PQping. Right now is only checks the > postmaster.pid file, and checks to see that the pid is a running > postmaster. What it currently doesn't do is to check if the server is > accepting connections with PQping(), like we do for "pg_ctl -w start". The thing about pg_ctl is that it requires access to the data directory (and still would, in the variant you propose). If we were going to do something like what Phil suggests then I think it ought to be something usable remotely. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PQping command line tool
On Wed, Oct 3, 2012 at 11:35 AM, Bruce Momjian wrote: > On Tue, Oct 2, 2012 at 11:01:36PM -0400, Phil Sorber wrote: >> I was wondering recently if there was any command line tool that >> utilized PQping() or PQpingParams(). I searched the code and couldn't >> find anything and was wondering if there was any interest to have >> something like this included? I wrote something for my purposes of >> performing a health check that also supports nagios style status >> output. It's probably convenient for scripting purposes as well. It's >> not currently ready for submission to a commitfest, but if there was >> an interest I would clean it up so that it would be. > > I don't see any tool using PQping except pg_ctl. Perhaps we should > modify "pg_ctl status" to use PQping. Right now is only checks the > postmaster.pid file, and checks to see that the pid is a running > postmaster. What it currently doesn't do is to check if the server is > accepting connections with PQping(), like we do for "pg_ctl -w start". > > Comments? I was thinking that maybe this should be a new feature in an existing tool; however, I don't think pg_ctl would satisfy my use case as it's normally bundled with the server. This would need to be something that I could install with just a client package. It's not a deal breaker, but it makes things more complex. How about adding it as an option to psql? That's not to say that I think we shouldn't also add it to 'pg_ctl status'. > > -- > Bruce Momjian http://momjian.us > EnterpriseDB http://enterprisedb.com > > + It's impossible for everything to be true. +
Re: [HACKERS] PQping command line tool
On Tue, Oct 2, 2012 at 11:01:36PM -0400, Phil Sorber wrote: > I was wondering recently if there was any command line tool that > utilized PQping() or PQpingParams(). I searched the code and couldn't > find anything and was wondering if there was any interest to have > something like this included? I wrote something for my purposes of > performing a health check that also supports nagios style status > output. It's probably convenient for scripting purposes as well. It's > not currently ready for submission to a commitfest, but if there was > an interest I would clean it up so that it would be. I don't see any tool using PQping except pg_ctl. Perhaps we should modify "pg_ctl status" to use PQping. Right now it only checks the postmaster.pid file, and checks to see that the pid is a running postmaster. What it currently doesn't do is to check if the server is accepting connections with PQping(), like we do for "pg_ctl -w start". Comments? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
Re: [HACKERS] Switching timeline over streaming replication
On Tuesday, October 02, 2012 4:21 PM Heikki Linnakangas wrote: > Thanks for the thorough review! I committed the xlog.c refactoring patch > now. Attached is a new version of the main patch, comments on specific > points below. I didn't adjust the docs per your comments yet, will do > that next. I have some doubts regarding the comments fixed by you and some more new review comments. After this I shall focus mainly on testing this patch. > On 01.10.2012 05:25, Amit kapila wrote: > > 1. In function readTimeLineHistory(), > >two mechanisms are used to fetch timeline from history file > >+sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi, > > &switchpoint_lo); > > + > > > 8. In function writeTimeLineHistoryFile(), will it not be better to > > directly write rather than to later do pg_fsync(). > >as it is just one time write. > > Not sure I understood this right, but writeTimeLineHistoryFile() needs > to avoid putting a corrupt, e.g incomplete, file in pg_xlog. The same as > writeTimeLineHistory(). That's why the write+fsync+rename dance is > needed. > Why fsync? Isn't the above purpose served if write() writes directly to the file and we then rename it? > > 21. @@ -2411,27 +2411,6 @@ reaper(SIGNAL_ARGS) > > > > a. won't it impact stop of online basebackup functionality > because earlier on promotion > > I think this code will stop walsenders and basebackup stop > will also give error in such cases. > > Hmm, should a base backup be aborted when the standby is promoted? Does > the promotion render the backup corrupt? I think currently it does. Please refer to: 1. do_pg_stop_backup(char *labelfile, bool waitforarchive) { .. if (strcmp(backupfrom, "standby") == 0 && !backup_started_in_recovery) ereport(ERROR, (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("the standby was promoted during online backup"), errhint("This means that the backup being taken is corrupt " "and should not be used. " "Try taking another online backup."))); .. } 2. 
In the documentation of pg_basebackup there is a Note: "If the standby is promoted to the master during online backup, the backup fails." New Ones --- 35.WalSenderMain(void) { .. +if (walsender_shutdown_requested) +ereport(FATAL, +(errcode(ERRCODE_ADMIN_SHUTDOWN), + errmsg("terminating replication connection due to administrator command"))); + +/* Tell the client that we are ready to receive commands */ +ReadyForQuery(DestRemote); + .. +if (walsender_shutdown_requested) +ereport(FATAL, +(errcode(ERRCODE_ADMIN_SHUTDOWN), + errmsg("terminating replication connection due to administrator command"))); + Is it necessary to check walsender_shutdown_requested twice in the loop? If yes, can we add a comment explaining why it is important to check it again? 35. +SendTimeLineHistory(TimeLineHistoryCmd *cmd) { .. + fd = PathNameOpenFile(path, O_RDONLY | PG_BINARY, 0666); Error handling for fd < 0 is missing. 36.+SendTimeLineHistory(TimeLineHistoryCmd *cmd) { .. if (nread <= 0) +ereport(ERROR, +(errcode_for_file_access(), + errmsg("could not read file \"%s\": %m", +path))); FileClose() should be done in the error case as well. 37. static void XLogSend(char *msgbuf, bool *caughtup) { .. if (currentTimeLineIsHistoric && XLByteLE(currentTimeLineValidUpto, sentPtr)) { /* * This was a historic timeline, and we've reached the point where * we switched to the next timeline. Let the client know we've * reached the end of this timeline, and what the next timeline is. */ /* close the current file. */ if (sendFile >= 0) close(sendFile); sendFile = -1; *caughtup = true; /* Send CopyDone */ pq_putmessage_noblock('c', NULL, 0); streamingDoneSending = true; return; } } I am not able to understand how, after the CopyDone message is sent from the above code, the walreceiver handles it and then replies with its own CopyDone message. Basically, I want to know which walreceiver code handles it. 38. st
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wednesday, October 03, 2012 04:28:59 PM Tom Lane wrote: > Andres Freund writes: > > Maybe I am missing something here, but reindex concurrently should do > > 1) BEGIN > > 2) Lock table in share update exclusive > > 3) lock old index > > 3) create new index > > 4) obtain session locks on table, old index, new index > > 5) commit > > 6) process till newindex->indisready (no new locks) > > 7) process till newindex->indisvalid (no new locks) > > 8) process till !oldindex->indisvalid (no new locks) > > 9) process till !oldindex->indisready (no new locks) > > 10) drop all session locks > > 11) lock old index exclusively which should be "invisible" now > > 12) drop old index > > You can't drop the session locks until you're done. Consider somebody > else trying to do a DROP TABLE between steps 10 and 11, for instance. Yea, the session lock on the table itself probably shouldn't be dropped. If we're holding only that one, there shouldn't be any additional deadlock dangers from lock upgrades when dropping the index, as we're doing the normal dance any DROP INDEX does. They seem pretty unlikely in a !valid !ready table anyway. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [9.1] 2 bugs with extensions
Excerpts from Dimitri Fontaine's message of mié sep 26 11:36:38 -0300 2012: > Marko Kreen writes: > > 1) Dumpable sequences are not supported - if sequence is tagged > >with pg_catalog.pg_extension_config_dump(), the pg_dump tries > >to run COPY on it. > > I can only reproduce that in 9.1. I first tried in HEAD where pg_dump > will just entirely skip the sequence, which is not the right answer > either, but at least does not spit out that message. > > > pg_dump: Error message from server: ERROR: cannot copy from sequence > > "batch_id_seq" > > pg_dump: The command was: COPY pgq.batch_id_seq TO stdout; > > Please find attached a patch that fixes it in 9.1, in all classic > pg_dump, --data-only and --schema-only. > > git diff --stat > src/bin/pg_dump/pg_dump.c | 68 > +++- > 1 files changed, 54 insertions(+), 14 deletions(-) > > Once that's ok for 9.1, I'll get to work on a fix for master, oh and > look at what the situation is in 9.2, which I guess is the same as in > master actually. I had a look at the patches submitted by Dimitri and in my tests they do exactly what's intended, i.e. produce a data dump of the extension's config sequences when necessary. However, after a couple of hours trying to understand what the patches are *doing* I failed to figure it out completely, and I'm afraid that there might be secondary side effects that I'm failing to notice. So I'm punting this to Tom, who seems to be the author of this code. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
Andres Freund writes: > Maybe I am missing something here, but reindex concurrently should do > 1) BEGIN > 2) Lock table in share update exclusive > 3) lock old index > 3) create new index > 4) obtain session locks on table, old index, new index > 5) commit > 6) process till newindex->indisready (no new locks) > 7) process till newindex->indisvalid (no new locks) > 8) process till !oldindex->indisvalid (no new locks) > 9) process till !oldindex->indisready (no new locks) > 10) drop all session locks > 11) lock old index exclusively which should be "invisible" now > 12) drop old index You can't drop the session locks until you're done. Consider somebody else trying to do a DROP TABLE between steps 10 and 11, for instance. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
2012/10/3 Alvaro Herrera > Excerpts from Fabrízio de Royes Mello's message of mié oct 03 09:27:41 > -0300 2012: > > 2012/10/2 Fabrízio de Royes Mello > > > > > > > > You're right... the latest proposed patch doesn't implement it. > > > > > > I'll change the patch and send soon... > > > > > > > > What is more reasonable? > > * show a syntax error or > > * show a message that you cannot use the INE with contained objects > > Second one. > > Maybe something like this? ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("IF NOT EXISTS cannot be used with schema elements"), parser_errposition(@9))); -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
Excerpts from Fabrízio de Royes Mello's message of mié oct 03 09:27:41 -0300 2012: > 2012/10/2 Fabrízio de Royes Mello > > > > > You're right... the latest proposed patch doesn't implement it. > > > > I'll change the patch and send soon... > > > > > What is more reasonable? > * show a syntax error or > * show a message that you cannot use the INE with contained objects Second one. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: PATCH: pgbench - random sampling of transaction written into log
On 03.09.2012 01:40, Tomas Vondra wrote: So, here is a fixed version of the patch. I've made these changes: Committed with some minor kibitzing. 1) The option is now '--sampling-rate' instead of '-R' and accepts float arguments. I've decided not to use 0.01 = 1% but use 1 = 1%, so it accepts values between 0 and 100. I find this more natural. I find a fraction, i.e. 0.01 = 1%, more natural, so I changed it back to that. I realize this is bike-shedding, but I believe Robert also found this more natural, as that's what he suggested upthread. I edited the comments and variable names a little bit, and also added a small paragraph to the pgbench user manual about this under "Per-Transaction Logging", in addition to the explanation under "Benchmarking Options". - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby
2012-07-01 18:01 keltezéssel, Fujii Masao írta: On Mon, Jul 2, 2012 at 12:42 AM, Boszormenyi Zoltan wrote: Hi, 2012-07-01 17:38 keltezéssel, Fujii Masao írta: On Sun, Jul 1, 2012 at 8:02 PM, Boszormenyi Zoltan wrote: Hi, attached is a patch that does $SUBJECT. It's a usability enhancement, to take a backup, write a minimalistic recovery.conf and start the streaming standby in one go. Comments? Could you add the patch to the next CommitFest? I will. If the backup is taken from the standby server, the standby's recovery.conf is included in the backup. What happens in this case? As documented, the command line parameters of pg_basebackup will be used for recovery.conf. So, the new standby will replicate the previous one. Cascading replication works since 9.2. So pg_basebackup overwrites the recovery.conf which was backed up from the standby with the recovery.conf which was created by using the command line parameters of pg_basebackup? Only if requested. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE SCHEMA IF NOT EXISTS
2012/10/2 Fabrízio de Royes Mello > > You're right... the latest proposed patch doesn't implement it. > > I'll change the patch and send soon... > > What is more reasonable? * show a syntax error or * show a message that you cannot use the INE with contained objects Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL >> Blog sobre TI: http://fabriziomello.blogspot.com >> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello >> Twitter: http://twitter.com/fabriziomello
[HACKERS] Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
On 02.10.2012 14:09, Pavel Stehule wrote: fixed patch Thanks, committed with some minor editorializing. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wednesday, October 03, 2012 01:15:27 PM Michael Paquier wrote: > On Wed, Oct 3, 2012 at 8:08 PM, Andres Freund wrote: > > On Wednesday, October 03, 2012 12:59:25 PM Greg Stark wrote: > > > Just for background. The showstopper for REINDEX concurrently was not > > > that it was particularly hard to actually do the reindexing. But it's > > > not obvious how to obtain a lock on both the old and new index without > > > creating a deadlock risk. I don't remember exactly where the deadlock > > > risk lies but there are two indexes to lock and whichever order you > > > obtain the locks it might be possible for someone else to be waiting > > > to obtain them in the opposite order. > > > > > > I'm sure it's possible to solve the problem. But the footwork needed > > > to release locks then reobtain them in the right order and verify that > > > the index hasn't changed out from under you might be a lot of > > > headache. > > > > Maybe I am missing something here, but reindex concurrently should do > > 1) BEGIN > > 12) drop old index > > The code I sent already does that more or less btw. Just that it can be > more simplified... The above just tried to describe the stuff that's relevant for locking, maybe I wasn't clear enough on that ;) > > I don't see where the deadlock danger is hidden in that? > > I didn't find anything relevant in a quick search of the archives... > > About the deadlock issues, do you mean the case where 2 sessions are > running REINDEX and/or REINDEX CONCURRENTLY on the same table or index in > parallel? No idea. The bit about deadlocks originally came from Greg, not me ;) I guess it's more the interaction with normal sessions, because the locking used (SHARE UPDATE EXCLUSIVE) prevents another CONCURRENT action running at the same time. 
I don't really see the danger there, though, because we should never need to acquire locks that we don't already have, except for the final AccessExclusiveLock, and that's taken after we have dropped the other locks and after the index has been made unusable. Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wed, Oct 3, 2012 at 8:08 PM, Andres Freund wrote: > On Wednesday, October 03, 2012 12:59:25 PM Greg Stark wrote: > > Just for background. The showstopper for REINDEX concurrently was not > > that it was particularly hard to actually do the reindexing. But it's > > not obvious how to obtain a lock on both the old and new index without > > creating a deadlock risk. I don't remember exactly where the deadlock > > risk lies but there are two indexes to lock and whichever order you > > obtain the locks it might be possible for someone else to be waiting > > to obtain them in the opposite order. > > > > I'm sure it's possible to solve the problem. But the footwork needed > > to release locks then reobtain them in the right order and verify that > > the index hasn't changed out from under you might be a lot of > > headache. > Maybe I am missing something here, but reindex concurrently should do > 1) BEGIN > 2) Lock table in share update exclusive > 3) lock old index > 3) create new index > 4) obtain session locks on table, old index, new index > 5) commit Build new index. > 6) process till newindex->indisready (no new locks) > validate new index > 7) process till newindex->indisvalid (no new locks) > Forgot the swap of old index/new index. > 8) process till !oldindex->indisvalid (no new locks) > 9) process till !oldindex->indisready (no new locks) > 10) drop all session locks > 11) lock old index exclusively which should be "invisible" now > 12) drop old index > The code I sent already does that more or less btw. Just that it could be simplified further... > I don't see where the deadlock danger is hidden in that? > > I didn't find anything relevant in a quick search of the archives... > About the deadlock issues, do you mean the case where 2 sessions are running REINDEX and/or REINDEX CONCURRENTLY on the same table or index in parallel? -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wednesday, October 03, 2012 12:59:25 PM Greg Stark wrote: > Just for background. The showstopper for REINDEX concurrently was not > that it was particularly hard to actually do the reindexing. But it's > not obvious how to obtain a lock on both the old and new index without > creating a deadlock risk. I don't remember exactly where the deadlock > risk lies but there are two indexes to lock and whichever order you > obtain the locks it might be possible for someone else to be waiting > to obtain them in the opposite order. > > I'm sure it's possible to solve the problem. But the footwork needed > to release locks then reobtain them in the right order and verify that > the index hasn't changed out from under you might be a lot of > headache. Maybe I am missing something here, but reindex concurrently should do 1) BEGIN 2) Lock table in share update exclusive 3) lock old index 3) create new index 4) obtain session locks on table, old index, new index 5) commit 6) process till newindex->indisready (no new locks) 7) process till newindex->indisvalid (no new locks) 8) process till !oldindex->indisvalid (no new locks) 9) process till !oldindex->indisready (no new locks) 10) drop all session locks 11) lock old index exclusively which should be "invisible" now 12) drop old index I don't see where the deadlock danger is hidden in that? I didn't find anything relevant in a quick search of the archives... Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
Just for background. The showstopper for REINDEX concurrently was not that it was particularly hard to actually do the reindexing. But it's not obvious how to obtain a lock on both the old and new index without creating a deadlock risk. I don't remember exactly where the deadlock risk lies but there are two indexes to lock and whichever order you obtain the locks it might be possible for someone else to be waiting to obtain them in the opposite order. I'm sure it's possible to solve the problem. But the footwork needed to release locks then reobtain them in the right order and verify that the index hasn't changed out from under you might be a lot of headache. Perhaps a good way to tackle it is to have a generic "verify two indexes are equivalent and swap the underlying relfilenodes" operation that can be called from both regular reindex and reindex concurrently. As long as it's the only function that ever locks two indexes then it can just determine what locking discipline it wants to use. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework
Hi, this is the latest one, fixing a bug in the accounting of per-statement lock timeout handling and tweaking some comments. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ 2-lock_timeout-v27.patch.gz Description: Unix tar archive -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby
On Wed, Oct 3, 2012 at 10:37 AM, Boszormenyi Zoltan wrote: > Hi, > > 2012-10-03 10:25 keltezéssel, Magnus Hagander írta: >> >> On Tue, Jul 3, 2012 at 9:47 PM, Peter Eisentraut wrote: >>> >>> On mån, 2012-07-02 at 01:10 +0900, Fujii Masao wrote: > > But I think that part is lacking in functionality: AFAICT it's > hardcoded to only handle host, port, user and password. What about > other connection parameters, likely passed to pg_basebackup through > the environment in that case? isn't that quite likely to break the > server later? What about something like PQconninfo which returns the connection string value established at connection? > Maybe the proper way around that is to provide the ability for > pg_basebackup to take a full connection string, just like we allow > psql to do? +1 >>> I think both of these would be necessary to make this work smoothly. >>> >>> You also need to take into account situations like when pg_basebackup >>> found a server with the help of PG* environment variables that might no >>> longer be set when the server tries to start recovery. So you need >>> something like PQconninfo. >> >> Zoltan, >> >> are you planning to work on the things discussed in this thread? I >> notice the patch is sitting with "waiting on author" in the CF app - >> so the second question is that if you are doing that, do you think it >> will be done within the scope of the current CF? > > > yes, I am on it so it can be done in this CF. Great, thanks! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby
Hi, 2012-10-03 10:25 keltezéssel, Magnus Hagander írta: On Tue, Jul 3, 2012 at 9:47 PM, Peter Eisentraut wrote: On mån, 2012-07-02 at 01:10 +0900, Fujii Masao wrote: But I think that part is lacking in functionality: AFAICT it's hardcoded to only handle host, port, user and password. What about other connection parameters, likely passed to pg_basebackup through the environment in that case? isn't that quite likely to break the server later? What about something like PQconninfo which returns the connection string value established at connection? Maybe the proper way around that is to provide the ability for pg_basebackup to take a full connection string, just like we allow psql to do? +1 I think both of these would be necessary to make this work smoothly. You also need to take into account situations like when pg_basebackup found a server with the help of PG* environment variables that might no longer be set when the server tries to start recovery. So you need something like PQconninfo. Zoltan, are you planning to work on the things discussed in this thread? I notice the patch is sitting with "waiting on author" in the CF app - so the second question is that if you are doing that, do you think it will be done within the scope of the current CF? yes, I am on it so it can be done in this CF. Best regards, Zoltán Böszörményi -- -- Zoltán Böszörményi Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt, Austria Web: http://www.postgresql-support.de http://www.postgresql.at/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 3 October 2012 09:10, Andres Freund wrote: >> The following restrictions are applied. >> - REINDEX [ DATABASE | SYSTEM ] cannot be run concurrently. > I would like to support something like REINDEX USER TABLES; or similar at some > point, but that very well can be a second phase. Yes, that would be a nice feature anyway, even without concurrently. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Wed, Oct 3, 2012 at 5:10 PM, Andres Freund wrote: > > This basically allows performing read and write operations on a table > whose > > index(es) are reindexed at the same time. Pretty useful for a production > > environment. The caveats of this feature are that it is slower than > normal > > reindex, and impacts other backends with the extra CPU, memory and IO it > > uses to process. The implementation is based on the same > ideas > > as pg_reorg and on an idea of Andres. > > > > The following restrictions are applied. > > - REINDEX [ DATABASE | SYSTEM ] cannot be run concurrently. > I would like to support something like REINDEX USER TABLES; or similar at > some > point, but that very well can be a second phase. This is out of scope for the time being, honestly. Later? Why not... > > - REINDEX CONCURRENTLY cannot run inside a transaction block. > > > - toast relations are reindexed non-concurrently when a table reindex is > done > > and the table has toast relations > Why that restriction? > This is the state of the current version of the patch. And not what the final version should do. I agree that toast relations should also be reindexed concurrently, like the others. Regarding this current restriction, my point was just to get some feedback before digging deeper. I should have mentioned that, though... > > > Here is a description of what happens when reorganizing an index > > concurrently > > (the beginning of the process is similar to CREATE INDEX CONCURRENTLY): > > 1) creation of a new index based on the same columns and restrictions as > > the index that is rebuilt (called here old index). This new index has as > > name $OLDINDEX_cct. So only a suffix _cct is added. It is marked as > > invalid and not ready. > You probably should take a SHARE UPDATE EXCLUSIVE lock on the table at that > point already, to prevent schema changes. > > > 8) Take a reference snapshot and validate the new indexes > Hm. 
Unless you factor in corrupt indices, why should this be needed? > > > 14) Swap new and old indexes, consisting here in switching their names. > I think switching based on their names is not going to work very well > because > indexes are referenced by oid at several places. Swapping > pg_index.indexrelid > or pg_class.relfilenode seems to be the better choice to me. We expect > relfilenode changes for such commands, but not ::regclass oid changes. > OK, so you mean to create an index, then switch only the relfilenode. Why not. This is largely doable. I think that what is important here is to choose a way of doing it and keep it until the end. > > Such a behaviour would at least be complicated for pg_depend and > pg_constraint. > > > The following process might be reducible, but I would like that to be > > decided depending on the community feedback and experience on such > > concurrent features. > > For the time being I took an approach that looks slower, but secured to > my > > mind with multiple waits (perhaps sometimes unnecessary?) and > > subtransactions. > > > If during the process an error occurs, the table will finish with either > > the old or new index as invalid. In this case the user will be in charge > to > > drop the invalid index himself. > > The concurrent index can be easily identified with its suffix *_cct. > I am not really happy about relying on some arbitrary naming here. That > still > can result in conflicts and such. > The concurrent names are generated automatically with a function in indexcmds.c, the same way as pkey indexes are. Let's imagine that the reindex concurrently command is run twice after a failure. The second concurrent index will not have *_cct as suffix but _cct1. However, I am open to more ideas here. What I feel about the concurrent index is that it needs a pg_class entry, even if it is just temporary, and this entry needs a name. 
> > This patch has required some refactoring effort as I noticed that the > > code of index for concurrent operations was not very generic. In order > to do > > that, I created some new functions in index.c called index_concurrent_* > > which are used by CREATE INDEX and REINDEX in my patch. Some refactoring has > > also been done regarding the wait processes. > > > REINDEX TABLE and REINDEX INDEX follow the same code path > > (ReindexConcurrentIndexes in indexcmds.c). The patch structure relies as > much as possible on the functions of index.c when creating, building and > validating concurrent indexes. > I haven't looked at the patch yet, but I was pretty sure that you would > need > to do quite some refactoring to implement this and this looks like roughly > the > right direction... > Thanks for spending time on it. -- Michael Paquier http://michael.otacoo.com
Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby
On Tue, Jul 3, 2012 at 9:47 PM, Peter Eisentraut wrote: > On mån, 2012-07-02 at 01:10 +0900, Fujii Masao wrote: >> > But I think that part is lacking in functionality: AFAICT it's >> > hardcoded to only handle host, port, user and password. What about >> > other connection parameters, likely passed to pg_basebackup through >> > the environment in that case? isn't that quite likely to break the >> > server later? >> >> What about something like PQconninfo which returns the connection >> string value established at connection? >> >> > Maybe the proper way around that is to provide the ability for >> > pg_basebackup to take a full connection string, just like we allow >> > psql to do? >> >> +1 >> > I think both of these would be necessary to make this work smoothly. > > You also need to take into account situations like when pg_basebackup > found a server with the help of PG* environment variables that might no > longer be set when the server tries to start recovery. So you need > something like PQconninfo. Zoltan, are you planning to work on the things discussed in this thread? I notice the patch is sitting with "waiting on author" in the CF app - so the second question is that if you are doing that, do you think it will be done within the scope of the current CF? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
Hi, On Wednesday, October 03, 2012 03:14:17 AM Michael Paquier wrote: > One of the outcomes of the discussions about the integration of pg_reorg in > core > was that Postgres should provide some ways to do REINDEX, CLUSTER and ALTER > TABLE concurrently with low-level locks in a way similar to CREATE INDEX > CONCURRENTLY. > > The discussions can be found on this thread: > http://archives.postgresql.org/pgsql-hackers/2012-09/msg00746.php > > Well, I spent some spare time working on the implementation of REINDEX > CONCURRENTLY. Very cool! > This basically allows performing read and write operations on a table whose > index(es) are reindexed at the same time. Pretty useful for a production > environment. The caveats of this feature are that it is slower than normal > reindex, and impacts other backends with the extra CPU, memory and IO it > uses to process. The implementation is based on the same ideas > as pg_reorg and on an idea of Andres. > The following restrictions are applied. > - REINDEX [ DATABASE | SYSTEM ] cannot be run concurrently. I would like to support something like REINDEX USER TABLES; or similar at some point, but that very well can be a second phase. > - REINDEX CONCURRENTLY cannot run inside a transaction block. > - toast relations are reindexed non-concurrently when a table reindex is done > and the table has toast relations Why that restriction? > Here is a description of what happens when reorganizing an index > concurrently > (the beginning of the process is similar to CREATE INDEX CONCURRENTLY): > 1) creation of a new index based on the same columns and restrictions as > the index that is rebuilt (called here old index). This new index has as > name $OLDINDEX_cct. So only a suffix _cct is added. It is marked as > invalid and not ready. You probably should take a SHARE UPDATE EXCLUSIVE lock on the table at that point already, to prevent schema changes. > 8) Take a reference snapshot and validate the new indexes Hm. 
Unless you factor in corrupt indices, why should this be needed? > 14) Swap new and old indexes, consisting here in switching their names. I think switching based on their names is not going to work very well because indexes are referenced by oid at several places. Swapping pg_index.indexrelid or pg_class.relfilenode seems to be the better choice to me. We expect relfilenode changes for such commands, but not ::regclass oid changes. Such a behaviour would at least be complicated for pg_depend and pg_constraint. > The following process might be reducible, but I would like that to be > decided depending on the community feedback and experience on such > concurrent features. > For the time being I took an approach that looks slower, but secured to my > mind with multiple waits (perhaps sometimes unnecessary?) and > subtransactions. > If during the process an error occurs, the table will finish with either > the old or new index as invalid. In this case the user will be in charge to > drop the invalid index himself. > The concurrent index can be easily identified with its suffix *_cct. I am not really happy about relying on some arbitrary naming here. That still can result in conflicts and such. > This patch has required some refactorisation effort as I noticed that the > code of index for concurrent operations was not very generic. In order to do > that, I created some new functions in index.c called index_concurrent_* > which are used by CREATE INDEX and REINDEX in my patch. Some refactoring has > also been done regarding the> wait processes. > REINDEX TABLE and REINDEX INDEX follow the same code path > (ReindexConcurrentIndexes in indexcmds.c). The patch structure is relying a > maximum on the functions of index.c when creating, building and validating > concurrent index. I haven't looked at the patch yet, but I was pretty sure that you would need to do quite some refactoring to implement this and this looks like roughly the right direction... 
> Thanks, and looking forward to your feedback, I am very happy that youre taking this on! Greetings, Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
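[Editor's note: Andres's point that indexes are referenced by OID rather than by name can be seen directly in the system catalogs. The following queries are an illustrative sketch, not part of the patch; any index-backed constraint in the database will do:]

```sql
-- pg_constraint stores its supporting index by OID in conindid;
-- renaming the index changes nothing here, so a name swap alone
-- cannot redirect a constraint to a rebuilt index.
SELECT conname, conindid, conindid::regclass AS index_name
FROM pg_constraint
WHERE conindid <> 0;

-- pg_depend likewise records dependencies on the index by OID, and a
-- plain REINDEX changes pg_class.relfilenode while the OID stays put:
SELECT oid, relname, relfilenode
FROM pg_class
WHERE relkind = 'i';
```

[This is why swapping pg_class.relfilenode (or pg_index.indexrelid) is the catalog-consistent choice: everything that points at the index by OID keeps working.]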
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On 3 October 2012 02:14, Michael Paquier wrote:
> Well, I spent some spare time working on the implementation of REINDEX
> CONCURRENTLY.

Thanks.

> The following restrictions apply:
> - REINDEX [ DATABASE | SYSTEM ] cannot be run concurrently.

Fair enough.

> - Indexes for exclusion constraints cannot be reindexed concurrently.
> - Toast relations are reindexed non-concurrently when a table reindex is
> done and that table has toast relations.

Those restrictions are important ones to resolve, since they prevent the
CONCURRENTLY word from being true in a large proportion of cases. We need
to be clear that the remainder of this can be done in user space already,
so the proposal doesn't move us forwards very far, except in terms of
packaging. IMHO this needs to be more than just moving a useful script
into core.

> Here is a description of what happens when reorganizing an index
> concurrently

There are four waits for every index, again similar to what is possible in
user space.

When we refactor that, I would like to break things down into N discrete
steps, if possible. Each time we hit a wait barrier, a top-level process
would be able to switch to another task to avoid waiting. This would then
allow us to proceed more quickly through the task. I would admit that is a
later optimisation, but it would be useful to have the innards refactored
to allow for that more easily later. I'd accept "not yet" if doing that
becomes a problem in the short term.

--
Simon Riggs                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
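[Editor's note: the user-space sequence Simon alludes to can be sketched roughly as below. Index and table names are placeholders; this handles only plain indexes, not constraint-backed ones (those need ALTER TABLE ... ADD CONSTRAINT ... USING INDEX), and DROP INDEX CONCURRENTLY requires 9.2 or later:]

```sql
-- Rebuild an ordinary index with only short-lived locks, in user space.
-- Each statement must run outside a transaction block.
CREATE INDEX CONCURRENTLY idx_foo_new ON foo (bar);  -- build the replacement
DROP INDEX CONCURRENTLY idx_foo;                     -- drop the old index
ALTER INDEX idx_foo_new RENAME TO idx_foo;           -- restore the old name
```

[Each CONCURRENTLY step internally waits out conflicting transactions, which is where the "four waits for every index" come from.]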