Re: [HACKERS] ECPG FETCH readahead, was: Re: ECPG fixes
On 2013-12-21 14:56, Peter Eisentraut wrote:
> This patch didn't make it out of the 2013-11 commit fest. You should
> move it to the next commit fest (probably with an updated patch)
> before January 15th.

Done.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/
Re: [HACKERS] nested hstore patch
On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler <da...@justatheory.com> wrote:
> * New operators:
>   + `hstore -> int`: Get string value at array index (starting at 0)
>   + `hstore ^> text`: Get numeric value for key
>   + `hstore ^> int`: Get numeric value at array index
>   + `hstore ?> text`: Get boolean value for key
>   + `hstore ?> int`: Get boolean value at array index
>   + `hstore #> text[]`: Get string value for key path
>   + `hstore #^> text[]`: Get numeric value for key path
>   + `hstore #?> text[]`: Get boolean value for key path
>   + `hstore %> text`: Get hstore value for key
>   + `hstore %> int`: Get hstore value at array index
>   + `hstore #%> text[]`: Get hstore value for key path
>   + `hstore ? int`: Does hstore contain array index
>   + `hstore #? text[]`: Does hstore contain key path
>   + `hstore - int`: Delete index from left operand
>   + `hstore #- text[]`: Delete key path from left operand

Although in some ways there's a certain elegance to this, it also
sorta looks like punctuation soup. I can't help wondering whether
we'd be better off sticking to function names.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
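[For a sense of what is at stake, here is roughly how the two styles
compare on a nested value. This is only an illustrative sketch: the
operator spellings come from the patch quoted above (never committed in
this form), the nested-hstore literal syntax is approximate, and the
function name in the second query is a hypothetical stand-in for the
kind of spelling Robert suggests.

    -- Operator style, as proposed: pull a numeric value out by key path.
    SELECT ('"a" => {"b" => {"c" => 42}}'::hstore) #^> ARRAY['a','b','c'];

    -- Function-name style (hypothetical name, not part of the patch):
    SELECT hstore_numeric_path('"a" => {"b" => {"c" => 42}}'::hstore,
                               ARRAY['a','b','c']);
]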
Re: [HACKERS] nested hstore patch
On 12/23/2013 12:28 PM, Robert Haas wrote:
> On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler <da...@justatheory.com> wrote:
>> * New operators:
>> [operator list snipped]
>
> Although in some ways there's a certain elegance to this, it also
> sorta looks like punctuation soup. I can't help wondering whether
> we'd be better off sticking to function names.

Has anybody looked into how hard it would be to add method notation
to PostgreSQL, so that instead of calling

    getString(hstorevalue, n)

we could use

    hstorevalue.getString(n)

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
Re: [HACKERS] CLUSTER FREEZE
On 2013-12-22 20:45:02 -0500, Robert Haas wrote:
> I suspect we ought to extend this to rewriting variants of ALTER
> TABLE as well, but a little thought is needed there.
> ATRewriteTables() appears to just call heap_insert() for each updated
> row, which if I'm not mistaken is an MVCC violation - offhand, it
> seems like a transaction with an older MVCC snapshot could access the
> table for the first time after the rewriter commits, and fail to see
> rows which would have appeared to be there before the rewrite. (I
> haven't actually tested this, so watch me be wrong.) If we're OK with
> an MVCC violation here, we could just pass HEAP_INSERT_FROZEN and
> have a slightly different flavor of violation; if not, this needs
> some kind of more extensive surgery quite apart from what we do about
> freezing.

Yes, rewriting ALTER TABLEs certainly aren't MVCC safe. I thought that
was documented, but apparently not. I am not sure it can be fixed
easily using the tricks CLUSTER plays; there might be nasty edge cases
because of the changes in the column definition. Certainly not a
trivial project.

I think we should leave ALTER TABLE as a completely separate project
and just improve CLUSTER for now. The practical impact of rewriting
ALTER TABLEs not freezing is far smaller, because they are very seldom
performed in bigger databases.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
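[A minimal two-session sketch of the anomaly Robert describes -
untested, per his own caveat; table t and the type change are
hypothetical, chosen only because int-to-bigint forces a rewrite:

    -- Session 1: establish an older snapshot without touching t yet.
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT 1;  -- snapshot is taken at the first query

    -- Session 2: rewrite t; every row is re-inserted under the
    -- rewriting transaction's XID.
    ALTER TABLE t ALTER COLUMN x TYPE bigint;

    -- Session 1: first access to t after the rewrite commits. Under
    -- plain MVCC rules the re-inserted rows carry an XID invisible to
    -- the older snapshot, so rows that "were there all along" could
    -- fail to appear; with HEAP_INSERT_FROZEN they would instead be
    -- visible unconditionally - a different flavor of violation.
    SELECT count(*) FROM t;
    COMMIT;
]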
Re: [HACKERS] preserving forensic information when we freeze
On Fri, Dec 20, 2013 at 7:51 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>> I wondered that, too, but it's not well-defined for all tuples. What
>> happens if you pass in a constructed tuple rather than an on-disk
>> tuple?
>
> Those should be discernible I think, t_self/t_tableOid won't be set
> for generated tuples.

I went looking for existing precedent for code that does things like
this and found record_out, which does this:

    HeapTupleHeader rec = PG_GETARG_HEAPTUPLEHEADER(0);
    ...
    /* Extract type info from the tuple itself */
    tupType = HeapTupleHeaderGetTypeId(rec);
    tupTypmod = HeapTupleHeaderGetTypMod(rec);
    tupdesc = lookup_rowtype_tupdesc(tupType, tupTypmod);
    ncolumns = tupdesc->natts;

    /* Build a temporary HeapTuple control structure */
    tuple.t_len = HeapTupleHeaderGetDatumLength(rec);
    ItemPointerSetInvalid(&(tuple.t_self));
    tuple.t_tableOid = InvalidOid;
    tuple.t_data = rec;

This appears to be a typical pattern, although interestingly I noticed
that row_to_json() doesn't bother setting t_tableOid or t_self, which
I think it's supposed to do.

The problem I see here is that this code seems to imply that a
function passed a record doesn't actually have enough information to
know what sort of a thing it's getting. The use of
HeapTupleHeaderGetTypeId and HeapTupleHeaderGetTypMod implies that
it's safe to assume that t_choice will contain DatumTupleFields rather
than HeapTupleFields, which doesn't seem to bode well for your
approach. Am I missing a trick? If not, here's a patch done the way I
originally proposed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[Attachment: pg-tuple-header.patch]
Re: [HACKERS] preserving forensic information when we freeze
On 2013-12-20 13:22:07 +0100, Andres Freund wrote:
> On 2013-12-20 07:12:01 -0500, Robert Haas wrote:
>> I think the root of the problem is that nobody's very eager to add
>> more hidden system catalog columns because each one bloats
>> pg_attribute significantly.
>
> I think that part is actually solvable - there's no real need for
> them to be real columns, present in pg_attribute; things could easily
> enough be set up during parse analysis.

I've spent some time yesterday hacking up a prototype for this. The
amount of effort seems reasonable, although the attached patch
certainly doesn't do all the necessary things. The regression tests
pass.

In the prototype only oid keeps its pg_attribute entry; everything
else uses the static entries via
SystemAttributeDefinition()/SystemAttributeByName(). We might want to
remove oids as well, but it seems reasonable to continue allowing
column permissions to be defined on it. And it's likely the attribute
most deeply ingrained in user applications.

The bigger problem I see is that adding more useful columns will cause
name clashes, which will probably prohibit improvements in that
direction. I wonder if we could solve that by naming new columns like
"system:attribute" or similar. Not pretty, but maybe better than the
alternatives.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

From 3a5692575c0077601cfba938239f4dd6f74de3c5 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Mon, 23 Dec 2013 13:44:36 +0100
Subject: [PATCH] prototype-patch: remove system columns from pg_attribute

---
 src/backend/access/common/heaptuple.c |  14
 src/backend/catalog/aclchk.c          |  27 +++---
 src/backend/catalog/heap.c            |  18 ++--
 src/backend/commands/tablecmds.c      | 154 +++---
 src/backend/parser/parse_relation.c   |  65 ++
 src/backend/utils/cache/lsyscache.c   |  11 ++-
 src/include/access/sysattr.h          |   3 +-
 7 files changed, 188 insertions(+), 104 deletions(-)

diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index 347d616..99471ab 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -59,7 +59,9 @@

 #include "access/sysattr.h"
 #include "access/tuptoaster.h"
+#include "access/transam.h"
 #include "executor/tuptable.h"
+#include "storage/procarray.h"

 /* Does att's datatype allow packing into the 1-byte-header varlena format? */
@@ -286,6 +288,7 @@ heap_attisnull(HeapTuple tup, int attnum)
 		case MinCommandIdAttributeNumber:
 		case MaxTransactionIdAttributeNumber:
 		case MaxCommandIdAttributeNumber:
+		case CookedMaxTransactionIdAttributeNumber:
 			/* these are never null */
 			break;
@@ -558,6 +561,17 @@ heap_getsysattr(HeapTuple tup, int attnum, TupleDesc tupleDesc, bool *isnull)
 		case TableOidAttributeNumber:
 			result = ObjectIdGetDatum(tup->t_tableOid);
 			break;
+		case CookedMaxTransactionIdAttributeNumber:
+			{
+				TransactionId xid;
+
+				xid = HeapTupleHeaderGetUpdateXid(tup->t_data);
+				if (TransactionIdIsValid(xid) &&
+					!TransactionIdIsInProgress(xid) &&
+					!TransactionIdDidCommit(xid))
+					xid = InvalidTransactionId;
+				result = TransactionIdGetDatum(xid);
+			}
+			break;
 		default:
 			elog(ERROR, "invalid attnum: %d", attnum);
 			result = 0;			/* keep compiler quiet */
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index 5a46fd9..807e274 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -1532,16 +1532,19 @@ expand_all_col_privileges(Oid table_oid, Form_pg_class classForm,
 		if (classForm->relkind == RELKIND_VIEW && curr_att < 0)
 			continue;

-		attTuple = SearchSysCache2(ATTNUM,
-								   ObjectIdGetDatum(table_oid),
-								   Int16GetDatum(curr_att));
-		if (!HeapTupleIsValid(attTuple))
-			elog(ERROR, "cache lookup failed for attribute %d of relation %u",
-				 curr_att, table_oid);
-
-		isdropped = ((Form_pg_attribute) GETSTRUCT(attTuple))->attisdropped;
-
-		ReleaseSysCache(attTuple);
+		if (curr_att < 0)
+			isdropped = false;
+		else
+		{
+			attTuple = SearchSysCache2(ATTNUM,
+									   ObjectIdGetDatum(table_oid),
+									   Int16GetDatum(curr_att));
+			if (!HeapTupleIsValid(attTuple))
+				elog(ERROR, "cache lookup failed for attribute %d of relation %u",
+					 curr_att, table_oid);
+			isdropped = ((Form_pg_attribute) GETSTRUCT(attTuple))->attisdropped;
+			ReleaseSysCache(attTuple);
+		}

 		/* ignore dropped columns */
 		if (isdropped)
@@ -1579,6 +1582,10 @@ ExecGrant_Attribute(InternalGrant *istmt, Oid relOid, const char *relname,
 	Oid		   *oldmembers;
 	Oid		   *newmembers;

+	/* FIXME: don't set column permissions on system columns */
+	if (attnum < 0)
+		return;
+
 	attr_tuple = SearchSysCache2(ATTNUM,
 								 ObjectIdGetDatum(relOid),
 								 Int16GetDatum(attnum));
diff --git
Re: [HACKERS] preserving forensic information when we freeze
On 2013-12-20 21:56:42 -0500, Robert Haas wrote:
> On Fri, Dec 20, 2013 at 7:51 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>>> I wondered that, too, but it's not well-defined for all tuples. What
>>> happens if you pass in a constructed tuple rather than an on-disk
>>> tuple?
>>
>> Those should be discernible I think, t_self/t_tableOid won't be set
>> for generated tuples.
>
> I went looking for existing precedent for code that does things like
> this and found record_out, which does this:

I think there's plenty of precedent for C code, but here the problem
seems to be that, when we pass a record that's originally a plain row,
it's coerced (via IO functions) into a record with typeid/typmod set.
So, for this to work via SQL, we'd need to add a notation that allows
passing table rows to functions without being modified. That might be
a good idea in general, but likely more work than it's worth tackling
in this context.

> This appears to be a typical pattern, although interestingly I
> noticed that row_to_json() doesn't bother setting t_tableOid or
> t_self, which I think it's supposed to do.

Yes, that seems to be a minor bug. Andrew?

> Am I missing a trick?

No, don't think so, I wasn't thinking on the SQL level.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] preserving forensic information when we freeze
On Mon, Dec 23, 2013 at 7:45 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-12-20 13:22:07 +0100, Andres Freund wrote:
>> On 2013-12-20 07:12:01 -0500, Robert Haas wrote:
>>> I think the root of the problem is that nobody's very eager to add
>>> more hidden system catalog columns because each one bloats
>>> pg_attribute significantly.
>>
>> I think that part is actually solvable - there's no real need for
>> them to be real columns, present in pg_attribute; things could
>> easily enough be set up during parse analysis.
>
> I've spent some time yesterday hacking up a prototype for this. The
> amount of effort seems reasonable, although the attached patch
> certainly doesn't do all the necessary things. The regression tests
> pass.

Well, I'm all in favor of de-bloating pg_attribute, but I don't
particularly like cooked_xmax. Even if it were better-named
(updatexid?), I'm still not sure I'd like it very much. And I
definitely think that getting rid of the pg_attribute entries ought to
be a separate patch from adding any new system columns we want to
have.

> In the prototype only oid keeps its pg_attribute entry; everything
> else uses the static entries via
> SystemAttributeDefinition()/SystemAttributeByName(). We might want to
> remove oids as well, but it seems reasonable to continue allowing
> column permissions to be defined on it. And it's likely the attribute
> most deeply ingrained in user applications.

Seems fine to me.

> The bigger problem I see is that adding more useful columns will
> cause name clashes, which will probably prohibit improvements in that
> direction. I wonder if we could solve that by naming new columns like
> "system:attribute" or similar. Not pretty, but maybe better than the
> alternatives.

If we're going to add a prefix, I think it should be one that doesn't
require quoting the identifier. In retrospect pg_xmin or _pg_xmin or
__pg_xmin would have been a better name than xmin. But TBH, I'm kind
of enamored of the function-based approach at the moment. It just
seems a lot more flexible. Adding a function is nearly free and we can
add as many as we need doing whatever we want, and they can even go
into contrib modules if we like it better that way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] preserving forensic information when we freeze
On 2013-12-23 10:26:49 -0500, Robert Haas wrote:
> On Mon, Dec 23, 2013 at 7:45 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>> I've spent some time yesterday hacking up a prototype for this. The
>> amount of effort seems reasonable, although the attached patch
>> certainly doesn't do all the necessary things. The regression tests
>> pass.
>
> Well, I'm all in favor of de-bloating pg_attribute, but I don't
> particularly like cooked_xmax. Even if it were better-named
> (updatexid?), I'm still not sure I'd like it very much. And I
> definitely think that getting rid of the pg_attribute entries ought
> to be a separate patch from adding any new system columns we want to
> have.

Yea, that was just a demo attribute; I am far from sure we should add
any. It was just important to me to test that we're easily able to do
so. I don't even think its semantics are necessarily all that useful.

I think we should do this independent of adding additional attributes.
The reduction of rows in pg_attribute is quite noticeable, and I can't
see us just dropping the old system columns.

> But TBH, I'm kind of enamored of the function-based approach at the
> moment. It just seems a lot more flexible. Adding a function is
> nearly free and we can add as many as we need doing whatever we want,
> and they can even go into contrib modules if we like it better that
> way.

Well, it really depends on the usecase. Reading

    SELECT header.xmin, header.xmax
    FROM sometable tbl,
        pg_tuple_header(tbl.tableoid, tbl.ctid) header;

is surely harder to understand than

    SELECT tbl.xmin, tbl.xmax FROM sometable tbl;

and the latter is noticeably cheaper since we're not re-fetching the
tuple; especially when the tuple is the result of a more complicated
query including a seqscan, we might often need to re-fetch the tuple
from disk, thanks to the ringbuffer. That really makes me less than
convinced that the function-based approach is the right tool for
everything.

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [COMMITTERS] pgsql: Upgrade to Autoconf 2.69
On 12/21/13, 9:39 AM, Peter Eisentraut wrote:
> This is enabling large-file support on OS X, so that seems kind of
> important. It's not failing with newer versions of OS X, so that
> leaves the following possibilities, I think:
>
> - Large files never worked on 10.5. That would be strange because
>   Autoconf explicitly refers to that version.
>
> - New OS X versions have this on by default (could someone check?),
>   so it already worked before, and it's something specific to 10.5
>   that's broken.

Current OS X has 64-bit inodes on by default, so this code works in
general. So it's probably one of the other causes.

> - It's something specific to PowerPC or endianness.
>
> - That build farm member has a strange setup.
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Sun, Dec 22, 2013 at 6:42 PM, Peter Geoghegan <p...@heroku.com> wrote:
> On Fri, Dec 20, 2013 at 11:59 PM, Peter Geoghegan <p...@heroku.com> wrote:
>> I think that the way forward is to refine my design in order to
>> upgrade locks from exclusive buffer locks to something else, managed
>> by the lock manager but perhaps through an additional layer of
>> indirection. As previously outlined, I'm thinking of a new
>> SLRU-based granular value locking infrastructure built for this
>> purpose, with btree inserters marking pages as having an entry in
>> this table.
>
> I'm working on a revision that holds lmgr page-level exclusive locks
> (and buffer pins) across multiple operations. This isn't too
> different to what you've already seen, since they are still only held
> for an instant. Notably, hash indexes currently quickly grab and
> release lmgr page-level locks, though they're the only existing
> clients of that infrastructure. I think on reflection that
> fully-fledged value locking may be overkill, given the fact that
> these locks are only held for an instant, and only need to function
> as a choke point for unique index insertion, and only when upserting
> occurs.
>
> This approach seems promising. It didn't take me very long to get it
> to a place where it passed a few prior test cases of mine, with
> fairly varied input, though the patch isn't likely to be posted for
> another few days. I think I can get it to a place where it doesn't
> regress regular insertion at all. I think that that will tick all of
> the many boxes, without unwieldy complexity and without compromising
> conceptual integrity.
>
> I mention this now because obviously time is a factor. If you think
> there's something I need to do, or that there's some way that I can
> more usefully coordinate with you, please let me know. Likewise for
> anyone else following.

I don't think this is a project to rush through. We've lived without
MERGE/UPSERT for several years now, and we can live without it for
another release cycle while we try to reach agreement on the way
forward. I can tell that you're convinced you know the right way
forward here, and you may be right, but I don't think you've convinced
everyone else - maybe not even anyone else.

I wouldn't suggest modeling anything you do on the way hash indexes
use heavyweight locks. That is a performance disaster, not to mention
being basically a workaround for the fact that whoever wrote the code
originally didn't bother figuring out any way that splitting a bucket
could be accomplished in a crash-safe manner, even in theory. If it
weren't for that, we'd be using buffer locks there. That doesn't
necessarily mean that page-level heavyweight locks aren't the right
thing here, but the performance aspects of any such approach will need
to be examined carefully.

To be honest, I am still not altogether sold on any part of this
feature. I don't like the fact that it violates MVCC - although I
admit that some form of violation is inevitable in any feature in this
area unless we're content to live with many serialization failures, I
don't like the particular way it violates MVCC, I don't like the
syntax (returns rejects? blech), and I don't like the fact that
getting the locking right, or even getting the semantics right, seems
to be so darned hard. I think we're in real danger of building
something that will be too complex, or just too weird, for users to
use, and too complex to maintain as well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
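[For readers following along, the "returns rejects" syntax being
critiqued looks roughly like the following. This is a sketch of the
proposal on this thread, not committed syntax, and the exact spelling
changed between patch revisions:

    -- Proposed (uncommitted) syntax: lock the conflicting row instead
    -- of erroring, and report rejected rows back to the caller, so a
    -- writable CTE can turn them into an UPDATE.
    WITH rej AS (
        INSERT INTO tab (key, val)
        VALUES (1, 'foo')
        ON DUPLICATE KEY LOCK FOR UPDATE
        RETURNING REJECTS *
    )
    UPDATE tab SET val = 'foo'
    FROM rej
    WHERE tab.key = rej.key;
]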
Re: [HACKERS] nested hstore patch
Hello

2013/12/23 Hannu Krosing <ha...@2ndquadrant.com>:
> On 12/23/2013 12:28 PM, Robert Haas wrote:
>> On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler <da...@justatheory.com> wrote:
>>> * New operators:
>>> [operator list snipped]
>>
>> Although in some ways there's a certain elegance to this, it also
>> sorta looks like punctuation soup. I can't help wondering whether
>> we'd be better off sticking to function names.
>
> Has anybody looked into how hard it would be to add method notation
> to PostgreSQL, so that instead of calling getString(hstorevalue, n)
> we could use hstorevalue.getString(n)

Yes, I played with it some years ago. I stopped early; there was a
problem with the parser when I tried to append a new rule, and because
there was no simple solution, I didn't continue. But it could be a
nice feature - at least for PL/pgSQL coders.

Regards

Pavel
Re: [HACKERS] preserving forensic information when we freeze
On Mon, Dec 23, 2013 at 10:42 AM, Andres Freund <and...@2ndquadrant.com> wrote:
>> But TBH, I'm kind of enamored of the function-based approach at the
>> moment. It just seems a lot more flexible. Adding a function is
>> nearly free and we can add as many as we need doing whatever we
>> want, and they can even go into contrib modules if we like it better
>> that way.
>
> Well, it really depends on the usecase. Reading
>
>     SELECT header.xmin, header.xmax
>     FROM sometable tbl,
>         pg_tuple_header(tbl.tableoid, tbl.ctid) header;
>
> is surely harder to understand than
>
>     SELECT tbl.xmin, tbl.xmax FROM sometable tbl;
>
> and the latter is noticeably cheaper since we're not re-fetching the
> tuple; especially when the tuple is the result of a more complicated
> query including a seqscan, we might often need to re-fetch the tuple
> from disk, thanks to the ringbuffer. That really makes me less than
> convinced that the function-based approach is the right tool for
> everything.

Meh. Who are all of these people who are fetching xmin, xmax, cmin,
and cmax in complex queries, and why are they doing that? If those
columns are just for forensic purposes, then whatever performance
disadvantages the function-based approach has don't really matter. If
people are using them as integral parts of their application, then
that's more of a concern, but how are those people not getting broken
every release or three anyway? We keep inventing things like
combo-cids and multi-xacts and multi-xacts that also contain an
update, and now freezing without changing xmin, and if you're actually
relying on those columns for anything but forensics your application
is presumably going to break when we change whatever part it depends
on. Is anyone really doing that, and if so, why?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Assertion failure in base backup code path
On Dec 19, 2013 12:06 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> Hi Magnus,
>
> It looks to me like the path to do_pg_start_backup() outside of a
> transaction context comes from your initial commit of the base backup
> facility. The problem is that you're not allowed to do anything
> leading to a syscache/catcache lookup in those contexts.

I think that may have come with the addition of the replication
privilege actually, but that doesn't change the fact that yes, it
appears broken...

/Magnus
Re: [HACKERS] WITHIN GROUP patch
I wrote:
> Or, really, why don't we just do the same thing I'm advocating for
> the plain-ordered-set case? That is, if there's a single collation
> applying to all the collatable inputs, that's the collation to use
> for the aggregate; otherwise it has no determinate collation, and
> it'll throw an error at runtime if it needs one.

I went and tried to implement this, and realized that it would take
some pretty significant rewriting of parse_collate.c, because of
examples like this:

    rank(a,b) within group (order by c collate "foo", d collate "bar")

In the current parse_collate logic, it would throw an error
immediately upon being told to merge the two explicit-COLLATE results.
We'd need a way to postpone that error and instead just decide that
the rank aggregate's collation is indeterminate. While that's perhaps
just a SMOP, it would mean that ordered-set aggregates don't resolve
collation the same way as other functions, which pretty much destroys
the argument for this approach. What's more, the same problem applies
to non-hypothetical ordered-set aggregates, if they've got more than
one sortable input column.

What I'm now thinking we want to do is:

1. Non-hypothetical direct args always contribute to determining the
agg's collation.

2. Hypothetical and aggregated args contribute to the agg's collation
only if the agg is designed so that there is always exactly one
aggregated arg (ie, it's non-variadic with one aggregated arg).
Otherwise we assign their collations per-sort-column and don't merge
them into the aggregate's collation.

This specification ensures that a variadic aggregate doesn't change
behavior depending on how many sort columns there happen to be.

			regards, tom lane
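[To make the proposed rules concrete, a sketch of how they would play
out; table t and its text columns are hypothetical, and the collation
names are only illustrative:

    -- One aggregated arg (rule 2, first branch): the ORDER BY
    -- collation becomes the aggregate's collation, just as for an
    -- ordinary one-argument function.
    SELECT percentile_disc(0.5) WITHIN GROUP (ORDER BY name COLLATE "C")
    FROM t;

    -- Variadic hypothetical-set aggregate (rule 2, second branch):
    -- each sort column keeps its own collation; neither merges into
    -- the aggregate's collation, so this no longer errors immediately.
    SELECT rank('x', 'y') WITHIN GROUP (ORDER BY a COLLATE "C",
                                        b COLLATE "POSIX")
    FROM t;
]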
Re: [HACKERS] XML Issue with DTDs
On 12/19/13, 6:40 PM, Florian Pflug wrote:
> The following example fails for XMLOPTION set to DOCUMENT as well as
> for XMLOPTION set to CONTENT.
>
>     select xmlconcat(
>       xmlparse(document '<!DOCTYPE test [<!ELEMENT test EMPTY>]><test/>'),
>       xmlparse(content '<test/>')
>     )::text::xml;

The SQL standard specifies that DTDs are dropped by xmlconcat. It's
just not implemented.
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Sun, Dec 22, 2013 at 3:01 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> I found a bug in the ALTER SYSTEM SET patch. The procedure to
> reproduce it is here.
>
>     $ psql
>     =# ALTER SYSTEM SET shared_buffers = '512MB';
>     ALTER SYSTEM
>     =# \q
>
>     $ pg_ctl -D data reload
>     server signaled
>     2013-12-22 18:24:13 JST LOG:  received SIGHUP, reloading configuration files
>     2013-12-22 18:24:13 JST LOG:  parameter "shared_buffers" cannot be changed without restarting the server
>     2013-12-22 18:24:13 JST LOG:  configuration file "X??" contains errors; unaffected changes were applied
>
> The configuration file name in the last log message is broken. This
> problem was introduced by the ALTER SYSTEM SET patch.
>
>     FreeConfigVariables(head);
>     <snip>
>     else if (apply)
>         ereport(elevel,
>                 (errcode(ERRCODE_CONFIG_FILE_ERROR),
>                  errmsg("configuration file \"%s\" contains errors; unaffected changes were applied",
>                         ErrorConfFile)));
>
> The cause of the problem is that, in ProcessConfigFile(), the log
> message including 'ErrorConfFile' is emitted after 'head' is freed,
> even though 'ErrorConfFile' points to one of the entries of 'head'.

Your analysis is absolutely right. The reason for this issue is that
we want to display the filename to which the erroneous parameter
belongs, and in this case we are freeing the parameter info before the
actual error. To fix, we need to save the filename value before
freeing the parameter list. Please find the attached patch to fix this
issue.

Note - on my machine, I could not reproduce the issue. I think this is
not surprising, because we are not setting the freed memory to NULL,
so it can display anything (sometimes the right value as well).

If you find that the provided patch doesn't fix the problem, or you
don't find it appropriate due to some other reason, let me know the
feedback; I shall be able to provide an updated patch early next week,
as I will be on vacation for the remaining part of this week.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] nested hstore patch
On Dec 23, 2013, at 6:28 AM, Robert Haas wrote:
> On Fri, Dec 20, 2013 at 6:16 PM, David E. Wheeler <da...@justatheory.com> wrote:
>> * New operators:
>> [operator list snipped]
>
> Although in some ways there's a certain elegance to this, it also
> sorta looks like punctuation soup. I can't help wondering whether
> we'd be better off sticking to function names.

The key thing is making it easy for people to easily chain calls to
their nested hstore objects, and I think these operators accomplish
that. Some of them are fairly intuitive, and I think as a community if
we have a) good docs, b) good blog posts on how to use nested hstore,
and c) clear instructions at PG events on how to use it, it would be
okay, though some things, i.e. extracting a key by a path, might be
better off in a function anyway. However, having it as an operator
might encourage more usage, only because people tend to think that
"functions will slow my query down."

My only concern is consistency with the generally accepted standard of
JSON and with the upcoming jsonb type. I'm not sure if the jsonb API
has been defined yet, but it would be great to keep consistency
between nested hstore and jsonb so people don't have to learn two
different access systems. Data extraction from JSON is often done by
the dot operator in implementations, and depending on the language you
are in, there are ways to add / test existence of / remove objects
from the JSON blob. Being able to extract objects from nested hstore /
JSON using the dot operator would be simple, intuitive and generally
well understood, but of course there are challenges with doing that in
PG and, well, proper SQL.

Jonathan
Re: [HACKERS] preserving forensic information when we freeze
Robert Haas <robertmh...@gmail.com> wrote:
> Meh. Who are all of these people who are fetching xmin, xmax, cmin,
> and cmax in complex queries, and why are they doing that? [...] Is
> anyone really doing that, and if so, why?

I've seen an ORM use xmin for a sort of optimistic concurrency control
scheme. Let's say that you bring up an address by primary key, and
change an incorrect apartment number. At the same time, someone else
is entering a change of address, to a whole new location where a
totally different apartment number is supplied. The developers want to
prevent portions of the two addresses from being intermingled, they
don't want to hold a database transaction open for the time the user
has the window up, they want to avoid blocking as much as possible,
and they don't want to store an extra column with a version number or
timestamp just to do that -- so they use xmin. When they go to rewrite
the row they use the PK values plus the xmin in the UPDATE statement's
WHERE clause. If the row is not found, the user gets an error message.

Currently that does mean that users can get a spurious error message
when there is a tuple freeze while the window is open, so the current
changes are probably good news for those using this approach.

Other than that, and ctid (for similar reasons), I have not seen
system columns used in applications or application frameworks.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
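[A minimal sketch of that scheme, with a hypothetical table and values;
the application reads xmin alongside the row and replays it in the
UPDATE's WHERE clause:

    -- Read the row, remembering xmin as a free version token.
    SELECT xmin, apartment FROM address WHERE person_id = 42;
    -- suppose this returns xmin = 123456

    -- Write back only if the row is unchanged: any committed
    -- concurrent update gave the row a new xmin, so this then matches
    -- zero rows and the application reports a conflict instead of
    -- silently intermingling the two edits.
    UPDATE address
       SET apartment = '4B'
     WHERE person_id = 42
       AND xmin = '123456'::xid;
]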
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
Discussion around this topic has reached hundreds of messages, and
whilst I have failed to find mention of my following question, I
appreciate it may have already been discussed.

I have noticed that configuration parameters for extensions are only
supported if the server was started with one of the parameters already
being set in postgresql.conf:

    # without any mention in postgresql.conf
    postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
    ERROR:  unrecognized configuration parameter "auto_explain.log_verbose"

    # with auto_explain.log_verbose = false added to postgresql.conf
    postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
    ALTER SYSTEM

Removing the entry from postgresql.conf, restarting Postgres, setting
it to the default, then restarting again renders it unsettable again:

    # removed entry from postgresql.conf and restarted
    postgres=# ALTER SYSTEM SET auto_explain.log_verbose = default;
    ALTER SYSTEM

    # restart postgres
    postgres=# ALTER SYSTEM SET auto_explain.log_verbose = true;
    ERROR:  unrecognized configuration parameter "auto_explain.log_verbose"

Is this by design?

--
Thom
Re: [HACKERS] WITHIN GROUP patch
>>>>> "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes:

 Tom> What I'm now thinking we want to do is:

 Tom> 1. Non-hypothetical direct args always contribute to determining
 Tom> the agg's collation.

 Tom> 2. Hypothetical and aggregated args contribute to the agg's
 Tom> collation only if the agg is designed so that there is always
 Tom> exactly one aggregated arg (ie, it's non-variadic with one
 Tom> aggregated arg). Otherwise we assign their collations
 Tom> per-sort-column and don't merge them into the aggregate's
 Tom> collation.

 Tom> This specification ensures that a variadic aggregate doesn't
 Tom> change behavior depending on how many sort columns there happen
 Tom> to be.

Works for me.

--
Andrew (irc:RhodiumToad)
Re: [HACKERS] preserving forensic information when we freeze
On 2013-12-23 10:56:07 -0500, Robert Haas wrote:
>> Well, it really depends on the usecase. Reading
>>
>>     SELECT header.xmin, header.xmax
>>     FROM sometable tbl,
>>         pg_tuple_header(tbl.tableoid, tbl.ctid) header;
>>
>> is surely harder to understand than
>>
>>     SELECT tbl.xmin, tbl.xmax FROM sometable tbl;
>>
>> and the latter is noticeably cheaper since we're not re-fetching the
>> tuple [...] That really makes me less than convinced that the
>> function-based approach is the right tool for everything.
>
> Meh. Who are all of these people who are fetching xmin, xmax, cmin,
> and cmax in complex queries, and why are they doing that?

I have seen it done to avoid ABA problems in cache invalidation
mechanisms, by simply checking that ctid, xmin, xmax stay the same.

> If those columns are just for forensic purposes, then whatever
> performance disadvantages the function-based approach has don't
> really matter. If people are using them as integral parts of their
> application, then that's more of a concern, but how are those people
> not getting broken every release or three anyway? We keep inventing
> things like combo-cids and multi-xacts and multi-xacts that also
> contain an update, and now freezing without changing xmin, and if
> you're actually relying on those columns for anything but forensics
> your application is presumably going to break when we change whatever
> part it depends on.

Well, all of the fundamental changes (combocids, the initial multixact
introduction) happened quite some time ago. I think backward
compatibility has a much higher value these days (I also don't see
much point in looking at cmin/cmax for anything but diagnostic
purposes). I don't think any of the use cases I've seen would be
broken by either fk-locks (multixacts have been there before; it
doesn't matter much that they now contain updates) or by forensic
freezing. The latter because they really only checked that xmin/xmax
were the same, and we hopefully haven't broken that...

But part of my point really was the usability, not only the
performance. Requiring LATERAL queries to be usable sensibly causes a
"meh" from my side ;)

Greetings,

Andres Freund

--
Andres Freund                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Mon, Dec 23, 2013 at 1:42 PM, Thom Brown <t...@linux.com> wrote:
> Discussion around this topic has reached hundreds of messages, and
> whilst I have failed to find mention of my following question, I
> appreciate it may have already been discussed.
>
> I have noticed that configuration parameters for extensions are only
> supported if the server was started with one of the parameters
> already being set in postgresql.conf:
> [examples snipped]
>
> Is this by design?

I would think that you'd need to have auto_explain loaded in the
backend where you're trying to make a change, but you shouldn't need
the setting to be present in postgresql.conf, I would think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
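[A minimal sketch of that, assuming the auto_explain module is
available and behaves as described in this exchange - LOAD defines the
module's GUCs in the current backend, which is what lets ALTER SYSTEM
recognize and validate the setting:

    -- Without the module loaded, the parameter is unknown:
    ALTER SYSTEM SET auto_explain.log_verbose = true;
    -- ERROR:  unrecognized configuration parameter "auto_explain.log_verbose"

    -- Load the module in this backend first; its GUCs are then defined:
    LOAD 'auto_explain';
    ALTER SYSTEM SET auto_explain.log_verbose = true;
]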
Re: [HACKERS] preserving forensic information when we freeze
On Mon, Dec 23, 2013 at 1:57 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> Well, all of the fundamental changes (combocids, the initial
> multixact introduction) happened quite some time ago. I think
> backward compatibility has a much higher value these days (I also
> don't see much point in looking at cmin/cmax for anything but
> diagnostic purposes). I don't think any of the use cases I've seen
> would be broken by either fk-locks (multixacts have been there
> before; it doesn't matter much that they now contain updates) or by
> forensic freezing. The latter because they really only checked that
> xmin/xmax were the same, and we hopefully haven't broken that...
>
> But part of my point really was the usability, not only the
> performance. Requiring LATERAL queries to be usable sensibly causes a
> "meh" from my side ;)

I simply can't imagine that we're going to want to add a system column
every time we change something, or even enough of them to cover all of
the things we've already got. We'd need at least infomask, infomask2,
and hoff, and maybe some functions for peeking through mxacts to find
the updater and locker XIDs and lock modes. With a couple of functions
we can do all that and not look back.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On 23 December 2013 19:14, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Dec 23, 2013 at 1:42 PM, Thom Brown <t...@linux.com> wrote:
>> [...]
>> Is this by design?
>
> I would think that you'd need to have auto_explain loaded in the
> backend where you're trying to make a change, but you shouldn't need
> the setting to be present in postgresql.conf, I would think.

This appears to be the case. I hadn't set the library to be loaded in
the config. I guess it therefore follows that arbitrary configuration
parameters aren't supported (e.g. moo.bark = 5), only pre-defined
ones.

Thanks

Thom
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On Mon, Dec 23, 2013 at 2:19 PM, Thom Brown <t...@linux.com> wrote:
>> I would think that you'd need to have auto_explain loaded in the
>> backend where you're trying to make a change, but you shouldn't need
>> the setting to be present in postgresql.conf, I would think.
>
> This appears to be the case. I hadn't set the library to be loaded in
> the config. I guess it therefore follows that arbitrary configuration
> parameters aren't supported (e.g. moo.bark = 5), only pre-defined
> ones.

Yeah, and that's by design. Otherwise, it would be too easy to set a
config parameter to a value that wasn't legal, and therefore make the
server fail to start. moo.bark = 5 likely won't cause any problems,
but auto_explain.log_verbose = fasle could. By insisting that the
module providing the GUC be loaded, we can sanity-check the value at
the time it gets set.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters
On 23 December 2013 19:35, Robert Haas <robertmh...@gmail.com> wrote:
> Yeah, and that's by design. Otherwise, it would be too easy to set a
> config parameter to a value that wasn't legal, and therefore make the
> server fail to start. moo.bark = 5 likely won't cause any problems,
> but auto_explain.log_verbose = fasle could. By insisting that the
> module providing the GUC be loaded, we can sanity-check the value at
> the time it gets set.

All clear. :)

--
Thom
Re: [HACKERS] ERROR during end-of-xact/FATAL
On Fri, Nov 15, 2013 at 09:51:32AM -0500, Robert Haas wrote:
> On Wed, Nov 13, 2013 at 11:04 AM, Noah Misch <n...@leadboat.com> wrote:
>> So, in short, ERROR + ERROR*10 = PANIC, but FATAL + ERROR*10 = FATAL.
>> That's bizarre.
>
> Quite so. Given that that's where we are, promoting an ERROR during
> FATAL processing to PANIC doesn't seem like it's losing much; we're
> essentially already doing that in the (probably more likely) case of
> a persistent ERROR during ERROR processing. But since PANIC sucks,
> I'd rather go the other direction: let's make an ERROR during ERROR
> processing promote to FATAL. And then let's do what you write above:
> make sure that there's a separate on-shmem-exit callback for each
> critical shared memory resource and that we call all of those during
> FATAL processing.

Many of the factors that can cause AbortTransaction() to fail can also
cause CommitTransaction() to fail, and those would still PANIC if the
transaction had an xid. How practical might it be to also escape from
an error during CommitTransaction() with a FATAL instead of PANIC?
There's more to fix up in that case (sinval, NOTIFY), but it may be
within reach. If such a technique can only reasonably fix abort,
though, I have doubts it buys us enough.

> The critical stuff that's got to happen after
> RecordTransactionCommit() appears to be ProcArrayEndTransaction() and
> AtEOXact_Inval(). Unfortunately, the latter is well after the point
> when we're supposed to only be doing non-critical resource cleanup,
> notwithstanding which it appears to be critical.
>
> So here's a sketch. Hoist the preparatory logic in
> RecordTransactionCommit() - smgrGetPendingDeletes,
> xactGetCommittedChildren, and xactGetCommittedInvalidationMessages -
> up into the caller and do it before setting TRANS_COMMIT. If any of
> that stuff fails, we'll land in AbortTransaction() which must cope.
> As soon as we exit the commit critical section, set a flag somewhere
> (where?) indicating that we have written our commit record; when that
> flag is set, (a) promote any ERROR after that point through the end
> of commit cleanup to FATAL and (b) if we enter AbortTransaction(),
> don't try to RecordTransactionAbort().
>
> I can't see that the notification stuff requires fixup in this case;
> AFAICS, it is just adjusting backend-local state, and it's OK to
> disregard any problems there during a FATAL exit. Do you see
> something to the contrary?

The part not to miss is SignalBackends() in
ProcessCompletedNotifies(). It happens outside CommitTransaction(),
just before the backend goes idle. Without that, listeners could miss
our notification until some other NOTIFY transaction signals them.
That said, since ProcessCompletedNotifies() already accepts the
possibility of failure, it would not be crazy to overlook this.

> But invalidation messages are a problem: if we commit and exit
> without sending our queued-up invalidation messages, Bad Things Will
> Happen. Perhaps we could arrange things so that in that case only, we
> just PANIC. That would allow most write transactions to get by with
> FATAL, promoting to PANIC only in the case of transactions that have
> modified system catalogs and only until the invalidations have
> actually been sent. Avoiding the PANIC in that case seems to require
> some additional wizardry which is not entirely clear to me at this
> time.
>
> Agreed. I think we'll have to approach the various problems in this
> area stepwise, or we'll never make any progress.

That's one option. The larger point is that allowing
ERROR-late-in-COMMIT and FATAL + ERROR scenarios to get away with
FATAL requires this sort of analysis of every exit callback and all
the later stages of CommitTransaction(). The current bug level shows
such analysis hasn't been happening, and the dearth of bug reports
suggests the scenarios are rare. I don't think the project stands to
gain enough from the changes contemplated here and, perhaps more
notably, from performing such analysis during review of future patches
that touch CommitTransaction() and the exit sequence.

It remains my recommendation to run proc_exit_prepare() and the
post-CLOG actions of CommitTransaction() and AbortTransaction() in
critical sections, giving the scenarios blunt but reliable treatment.
If reports of excess PANIC arise, the thing to fix is the code causing
the late error.

Thanks,
nm

--
Noah Misch
EnterpriseDB                                 http://www.enterprisedb.com
Re: [HACKERS] better atomics - v0.2
On Thu, Dec 5, 2013 at 6:39 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-11-19 10:37:35 -0500, Tom Lane wrote:
>> Andres Freund <and...@2ndquadrant.com> writes:
>>> The only animal we have that doesn't support quiet inlines today is
>>> HP-UX/ac++, and I think - as in patch 1 in the series - we might be
>>> able to simply suppress the warning there.
>>
>> Or just not worry about it, if it's only a warning?
>
> So, my suggestion on that end is that we remove the requirement for
> quiet inline now and see if that has any negative consequences on the
> buildfarm for a week or so. Imo that's a good idea regardless of us
> relying on inlining support. Does anyone have anything against that
> plan? If not, I'll prepare a patch.
>
>> Or does the warning mean code bloat (lots of useless copies of the
>> inline function)?
>
> After thinking on that for a bit: yes, that's a possible consequence,
> but it's quite possible that it happens in cases where we don't get
> the warning too, so I don't think that argument has too much bearing,
> as I don't recall a complaint about it.

But I bet the warning we're getting there is complaining about exactly
that thing. It's possible that it means "warning: I've optimized away
your unused function into nothing", but I think it's more likely that
it means "warning: I just emitted dead code." Indeed, I suspect that
this is why that test is written that way in the first place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] WITHIN GROUP patch
Atri Sharma <atri.j...@gmail.com> writes:
> Please find attached the latest patch for WITHIN GROUP. This patch is
> after fixing the merge conflicts.

I've committed this after significant editorialization --- most
notably, I pushed control of the sort step into the aggregate support
functions. I didn't like the way nodeAgg.c had been hacked up to do it
the other way. There's a couple hundred lines of code handling that in
orderedsetaggs.c, which is more or less comparable to the amount of
code that didn't get added to nodeAgg.c, so I think the argument that
the original approach avoided code bloat is bogus.

The main reason I pushed all the new aggregates into a single file
(orderedsetaggs.c) was so they could share a private typedef for the
transition state value. It's possible that we should expose that
struct so that third-party aggregate functions could leverage the
existing transition-function infrastructure instead of having to
copy-and-paste it. I wasn't sure where to put it though --- maybe a
new include file would be needed. Anyway I left the point for future
discussion.

			regards, tom lane
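[For readers who haven't followed the thread, the committed feature
provides ordered-set and hypothetical-set aggregates along these lines;
the emp table and its salary column are hypothetical:

    -- Ordered-set aggregates: direct arguments in parentheses,
    -- aggregated input in the WITHIN GROUP clause.
    SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY salary),
           percentile_disc(0.9) WITHIN GROUP (ORDER BY salary)
    FROM emp;

    -- Hypothetical-set aggregate: where would a salary of 42000 rank?
    SELECT rank(42000) WITHIN GROUP (ORDER BY salary) FROM emp;
]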
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Mon, Dec 23, 2013 at 7:49 AM, Robert Haas robertmh...@gmail.com wrote: I don't think this is a project to rush through. We've lived without MERGE/UPSERT for several years now, and we can live without it for another release cycle while we try to reach agreement on the way forward. I can tell that you're convinced you know the right way forward here, and you may be right, but I don't think you've convinced everyone else - maybe not even anyone else. That may be. Attention from reviewers has been in relatively short supply. Not that that isn't always true. I wouldn't suggest modeling anything you do on the way hash indexes using heavyweight locks. That is a performance disaster, not to mention being basically a workaround for the fact that whoever wrote the code originally didn't bother figuring out any way that splitting a bucket could be accomplished in a crash-safe manner, even in theory. If it weren't for that, we'd be using buffer locks there. Having looked at the code for the first time recently, I'd agree that hash indexes are a disaster. A major advantage of The Lehman and Yao Algorithm, as prominently noted in the paper, is that exclusive locks are only acquired on leaf pages to increase concurrency. Since I only propose to extend this to a heavyweight page lock, and still only for an instant, it seems reasonable to assume that the performance will be acceptable for an initial version of this. It's not as if most places will have to pay any heed to this heavyweight lock - index scans and non-upserting inserts are generally unaffected. We can later optimize performance as we measure a need to do so. Early indications are that the performance is reasonable. Holding value locks for more than an instant doesn't make sense. The reason is simple: when upserting, we're tacitly only really looking for violations on one particular unique index. We just lock them all at once because the implementation doesn't know *which* unique index. So in actuality, it's really no different from existing potential-violation handling for unique indexes, except we have to do some extra work in addition to the usual restart from scratch stuff (iff we have multiple unique indexes). To be honest, I am still not altogether sold on any part of this feature. I don't like the fact that it violates MVCC - although I admit that some form of violation is inevitable in any feature in this area unless we're content to live with many serialization failures, I don't like the particular way it violates MVCC Discussions around visibility issues have not been very useful. As I've said, I don't like the term MVCC violation, because it's suggestive of some classical, codified definition of MVCC, a definition that doesn't actually exist anywhere, even in research papers, AFAICT. So while I understand your concerns around the modifications to HeapTupleSatisfiesMVCC(), and while I respect that we need to be mindful of the performance impact, my position is that if that really is what we need to do, we might as well be honest about it, and express intent succinctly and directly. This is a position that is orthogonal to the proposed syntax, even if that is convenient to my patch. It's already been demonstrated that yes, the MVCC violation can be problematic when we call HeapTupleSatisfiesUpdate(), which is a bug that was fixed by making another modest modification to HeapTupleSatisfiesUpdate(). It is notable that that bug would have still occurred had a would-be-HTSMVCC-invisible tuple been passed through any other means. 
What problem, specifically, do you envisage avoiding by doing it some other way? What other way do you have in mind? We invested huge effort into more granular FK locking when we had a few complaints about it. I wouldn't be surprised if that effort modestly regressed HeapTupleSatisfiesMVCC(). On the other hand, this feature has been in very strong demand for over a decade, and has a far smaller code footprint. I don't want to denigrate the FK locking stuff in any way - it is a fantastic piece of work - but it's important to have a sense of proportion about these things. In order to make visibility work in the way we require, we're almost always just doing additional checking of infomask bits, and the t_infomask variable is probably already in a CPU register (this is a huge simplification, but is essentially true). Like you, I have noted before that HeapTupleSatisfiesMVCC() is a fairly hot routine during profiling, but it's not *that* hot. It's understandable that you raise these points, but from my perspective it's hard to address your concerns without more concrete objections.

I don't like the syntax (returns rejects? blech)

I suppose it isn't ideal in some ways. On the other hand, it is extremely flexible, with many of the same advantages as SQL MERGE. Importantly, it will facilitate merging as part of conflict resolution on multi-master replication systems, which I think is of considerable value.
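[For context, the "RETURNING REJECTS" syntax being debated looks roughly like this in use - a reconstruction from the thread, with a hypothetical "upsert" table, not an excerpt from the patch itself:]

-- Hypothetical upsert written with the proposed syntax: rows that hit a
-- duplicate key are locked and handed back as "rejects", which a wCTE
-- can then turn into UPDATEs.
WITH rej AS (
    INSERT INTO upsert (key, val)
    VALUES (1, 'inserted')
    ON DUPLICATE KEY LOCK FOR UPDATE
    RETURNING REJECTS *
)
UPDATE upsert
SET val = 'updated'
FROM rej
WHERE upsert.key = rej.key;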
[HACKERS] trailing comment ghost-timing
With \timing on, a trailing comment yields a timing.

# test.sql
select 1;
/* select 2 */

$ psql -f test.sql
 ?column?
----------
        1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?). Confusing and annoying, IMHO. Is there any chance such trailing ghost-timings can be removed?

BTW:
$ cat ~/.psqlrc
\set QUIET on
\timing on

Thanks, Erik Rijkers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Planning time in explain/explain analyze
Hi,

A user asked in -performance[1] if there is a way to measure the planning time. Other than measuring it in the client I do not think there is, so I quickly hacked a patch which adds "Planning time" to the outputs of EXPLAIN and EXPLAIN ANALYZE. Is this something useful? I think it is, since plan time can become an issue for complex queries. A couple of questions about the patch: Should the planning time be added to both EXPLAIN and EXPLAIN ANALYZE? And is the output obvious enough? The patch does not include any changes to documentation or tests. I will fix that if people think this patch is useful.

$ EXPLAIN SELECT * FROM pg_stat_activity;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..2.69 rows=1 width=337)
   ->  Hash Join  (cost=1.02..2.41 rows=1 width=273)
         Hash Cond: (s.usesysid = u.oid)
         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=209)
         ->  Hash  (cost=1.01..1.01 rows=1 width=68)
               ->  Seq Scan on pg_authid u  (cost=0.00..1.01 rows=1 width=68)
   ->  Index Scan using pg_database_oid_index on pg_database d  (cost=0.13..0.27 rows=1 width=68)
         Index Cond: (oid = s.datid)
 Planning time: 0.258 ms
(9 rows)

$ EXPLAIN ANALYZE SELECT * FROM pg_stat_activity;
                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..2.69 rows=1 width=337) (actual time=0.094..0.096 rows=1 loops=1)
   ->  Hash Join  (cost=1.02..2.41 rows=1 width=273) (actual time=0.078..0.079 rows=1 loops=1)
         Hash Cond: (s.usesysid = u.oid)
         ->  Function Scan on pg_stat_get_activity s  (cost=0.00..1.00 rows=100 width=209) (actual time=0.053..0.053 rows=1 loops=1)
         ->  Hash  (cost=1.01..1.01 rows=1 width=68) (actual time=0.014..0.014 rows=1 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 1kB
               ->  Seq Scan on pg_authid u  (cost=0.00..1.01 rows=1 width=68) (actual time=0.007..0.009 rows=1 loops=1)
   ->  Index Scan using pg_database_oid_index on pg_database d  (cost=0.13..0.27 rows=1 width=68) (actual time=0.009..0.010 rows=1 loops=1)
         Index Cond: (oid = s.datid)
 Planning time: 0.264 ms
 Total runtime: 0.158 ms
(11 rows)

Links
1. http://www.postgresql.org/message-id/cacfv+pknembqyjpcqrgsvmc_hvrgai3d_ge893n8qbx+ymh...@mail.gmail.com

-- Andreas Karlsson

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 9969a25..42425fa 100644
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** ExplainOneQuery(Query *query, IntoClause
*** 318,330 ****
  		(*ExplainOneQuery_hook) (query, into, es, queryString, params);
  	else
  	{
! 		PlannedStmt *plan;
  
  		/* plan the query */
  		plan = pg_plan_query(query, 0, params);
  
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, into, es, queryString, params);
  	}
  }
--- 318,336 ----
  		(*ExplainOneQuery_hook) (query, into, es, queryString, params);
  	else
  	{
! 		PlannedStmt *plan;
! 		instr_time	planstart, planduration;
! 
! 		INSTR_TIME_SET_CURRENT(planstart);
  
  		/* plan the query */
  		plan = pg_plan_query(query, 0, params);
  
+ 		INSTR_TIME_SET_CURRENT(planduration);
+ 		INSTR_TIME_SUBTRACT(planduration, planstart);
+ 
  		/* run it (if needed) and produce output */
! 		ExplainOnePlan(plan, into, es, queryString, params, planduration);
  	}
  }
*************** ExplainOneUtility(Node *utilityStmt, Int
*** 401,412 ****
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
! 			   const char *queryString, ParamListInfo params)
  {
  	DestReceiver *dest;
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
  	double		totaltime = 0;
  	int			eflags;
  	int			instrument_option = 0;
--- 407,420 ----
   */
  void
  ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
! 			   const char *queryString, ParamListInfo params,
! 			   instr_time planduration)
  {
  	DestReceiver *dest;
  	QueryDesc  *queryDesc;
  	instr_time	starttime;
  	double		totaltime = 0;
+ 	double		plantime = INSTR_TIME_GET_DOUBLE(planduration);
  	int			eflags;
  	int			instrument_option = 0;
*************** ExplainOnePlan(PlannedStmt *plannedstmt,
*** 482,487 ****
--- 490,500 ----
  	/* Create textual dump of plan tree */
  	ExplainPrintPlan(es, queryDesc);
  
+ 	if (es->format == EXPLAIN_FORMAT_TEXT)
+ 		appendStringInfo(es->str, "Planning time: %.3f ms\n", 1000.0 * plantime);
+ 	else
+ 		ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es);
+ 
  	/* Print info about runtime of triggers */
  	if (es->analyze)
  	{
diff --git a/src/backend/commands/prepare.c
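[For readers unfamiliar with the instr_time machinery the patch uses, here is a minimal standalone sketch of the same measurement pattern; some_work() is a hypothetical stand-in for the code being timed:]

#include "postgres.h"
#include "portability/instr_time.h"

extern void some_work(void);	/* hypothetical workload being timed */

/* Measure how long some_work() takes, with the same macros as the patch. */
static double
time_some_work(void)
{
	instr_time	start,
				duration;

	INSTR_TIME_SET_CURRENT(start);		/* sample the clock before... */
	some_work();
	INSTR_TIME_SET_CURRENT(duration);	/* ...and after */
	INSTR_TIME_SUBTRACT(duration, start);	/* duration -= start */

	/* INSTR_TIME_GET_DOUBLE() yields seconds; the patch multiplies by
	 * 1000.0 the same way to report milliseconds. */
	return 1000.0 * INSTR_TIME_GET_DOUBLE(duration);
}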
Re: [HACKERS] trailing comment ghost-timing
On 12/24/2013 02:05 AM, Erik Rijkers wrote: With \timing on, a trailing comment yields a timing.

# test.sql
select 1;
/* select 2 */

$ psql -f test.sql
 ?column?
----------
        1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?). Confusing and annoying, IMHO. Is there any chance such trailing ghost-timings can be removed?

This seems to be caused by psql sending the comment to the server to evaluate as a query. I personally think timing should always output something for every command sent to the server. To fix this problem I guess one would have to modify psql_scan() to check if a scanned query only contains comments and then not send it to the server at all. The minimal file to reproduce it is:

/**/

-- Andreas Karlsson -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
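[A comment-only check of the kind Andreas describes could look roughly like this - a from-scratch sketch, not psql source; psql_scan() and its real lexer state are considerably more involved:]

#include <stdbool.h>

/*
 * Sketch: return true if "sql" contains only whitespace, "--" line
 * comments, semicolons, and (possibly nested) block comments, i.e.
 * nothing worth sending to the server. String literals and dollar
 * quoting need no special handling here, since any such token already
 * makes the buffer non-empty.
 */
static bool
query_is_comment_only(const char *sql)
{
	int			depth = 0;		/* block-comment nesting depth */

	for (const char *p = sql; *p; p++)
	{
		if (depth > 0)
		{
			if (p[0] == '*' && p[1] == '/')
			{
				depth--;		/* close one level of block comment */
				p++;
			}
			else if (p[0] == '/' && p[1] == '*')
			{
				depth++;		/* SQL block comments nest */
				p++;
			}
		}
		else if (p[0] == '/' && p[1] == '*')
		{
			depth++;
			p++;
		}
		else if (p[0] == '-' && p[1] == '-')
		{
			while (*p && *p != '\n')	/* skip to end of line */
				p++;
			if (!*p)
				break;
		}
		else if (*p != ' ' && *p != '\t' && *p != '\n' && *p != '\r' && *p != ';')
			return false;		/* found a real token */
	}
	return true;
}

[psql could then skip sending such a buffer entirely, or at least skip printing the Time: line for it.]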
Re: [HACKERS] PoC: Partial sort
On 12/22/2013 04:38 PM, Alexander Korotkov wrote:

postgres=# explain analyze select * from test order by v1, id limit 10;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=11441.77..11442.18 rows=10 width=12) (actual time=79.980..79.982 rows=10 loops=1)
   ->  Partial sort  (cost=11441.77..53140.44 rows=100 width=12) (actual time=79.978..79.978 rows=10 loops=1)
         Sort Key: v1, id
         Presorted Key: v1
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using test_v1_idx on test  (cost=0.42..47038.83 rows=100 width=12) (actual time=0.031..38.275 rows=100213 loops=1)
 Total runtime: 81.786 ms
(7 rows)

Have you thought about how you plan to print which sort method and how much memory was used? Several different sort methods may have been used in the query. Should the largest amount of memory/disk be printed?

However, work with joins needs more improvements.

That would be really nice to have, but the patch seems useful even without the improvements to joins.

-- Andreas Karlsson -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trailing comment ghost-timing
Andreas Karlsson wrote: On 12/24/2013 02:05 AM, Erik Rijkers wrote: With \timing on, a trailing comment yields a timing.

# test.sql
select 1;
/* select 2 */

$ psql -f test.sql
 ?column?
----------
        1
(1 row)

Time: 0.651 ms
Time: 0.089 ms

I assume it is timing something about that comment (right?). Confusing and annoying, IMHO. Is there any chance such trailing ghost-timings can be removed?

This seems to be caused by psql sending the comment to the server to evaluate as a query. I personally think timing should always output something for every command sent to the server. To fix this problem I guess one would have to modify psql_scan() to check if a scanned query only contains comments and then not send it to the server at all. The minimal file to reproduce it is:

/**/

-- Andreas Karlsson

I need to be convinced that the server should not just silently ignore trailing comments. I'd consider an exception if the only text sent is a comment (in such a case we should throw an error), but if valid commands are sent and there is just some comment text at the end, it should be ignored the same as if the comments were embedded in the middle of the query text. I've encountered other clients that output phantom results in this situation, and solving it at the server seems worthwhile so client applications do not have to care.

In the example case, I think, putting the comment before the command results in only one timing. This inconsistency is a symptom of this situation being handled incorrectly.

David J.

-- View this message in context: http://postgresql.1045698.n5.nabble.com/trailing-comment-ghost-timing-tp5784473p5784484.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
Andreas Karlsson andr...@proxel.se writes: The patch does not include any changes to documentation or tests. I will fix that if people think this patch is useful. I take it you've not tried the regression tests with this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] trailing comment ghost-timing
On 12/24/2013 03:17 AM, David Johnston wrote: I need to be convinced that the server should not just silently ignore trailing comments. I'd consider an exception if the only text sent is a comment (in such a case we should throw an error), but if valid commands are sent and there is just some comment text at the end, it should be ignored the same as if the comments were embedded in the middle of the query text. I've encountered other clients that output phantom results in this situation, and solving it at the server seems worthwhile so client applications do not have to care. In the example case, I think, putting the comment before the command results in only one timing. This inconsistency is a symptom of this situation being handled incorrectly.

It is not sent to the server as a trailing comment. The following file is sent to the server like this.

File:
/**/;
/**/

Commands:
PQexec(..., "/**/;");
PQexec(..., "/**/");

If this has to be fixed it should be in the client. I think people would complain if we broke the API by starting to throw an exception on PQexec with a string containing no actual query.

-- Andreas Karlsson -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
On 23-12-2013 22:12, Andreas Karlsson wrote: A user asked in -performance[1] if there is a way to measure the planning time. Other than measuring it in the client I do not think there is so I quickly hacked a patched which adds Planning time to the outputs of EXPLAIN and EXPLAIN ANALYZE. Is this something useful? I think it is, since plan time can become an issue for complex queries. Yes, but a couple of years ago, this same feature was proposed as a module [1][2]. I'm not sure if it is worth including as an additional module or not. Opinions? [1] http://www.postgresql.org/message-id/BANLkTi=sxKTCTcv9hsACu38eaj=fw_4...@mail.gmail.com [2] https://github.com/umitanuki/planinstr -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
On 12/24/2013 03:33 AM, Tom Lane wrote: Andreas Karlsson andr...@proxel.se writes: The patch does not include any changes to documentation or tests. I will fix that if people think this patch is useful. I take it you've not tried the regression tests with this. Yeah, forgot to mention that we need some way to disable it in the tests. Either by not having it included in EXPLAIN or by adding an option to turn it off. Any suggestion on which would be preferable? -- Andreas Karlsson -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote: On 12/24/2013 03:33 AM, Tom Lane wrote: Andreas Karlsson andr...@proxel.se writes: The patch does not include any changes to documentation or tests. I will fix that if people think this patch is useful. I take it you've not tried the regression tests with this. Yeah, forgot to mention that we need some way to disable it in the tests. Either by not having it included in EXPLAIN or by adding an option to turn it off. Any suggestion on which would be preferable? I would be tempted to display it only if (COSTS OFF) is not given. As far as I can tell, the major use case for (COSTS OFF) is when you want the output to be stable so you can include it in a regression test. Technically speaking, planning time is not a cost, but I'm not sure I could live with myself if we forced everyone to write (COSTS OFF, PLANNING_TIME OFF). And I don't think much of the idea of only including planning time when ANALYZE is used, because you don't have to want to run the query to want to know how long it took to plan. Also, +1 for this general concept. Great idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
Robert Haas robertmh...@gmail.com writes: On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote: Yeah, forgot to mention that we need some way to disable it in the tests. Either by not having it included in EXPLAIN or by adding an option to turn it off. Any suggestion on which would be preferable? I would be tempted to display it only if (COSTS OFF) is not given. Yeah. The alternatives seem to be: 1. Change a lot of regression tests. This would be a serious PITA, not so much for the one-time cost as for every time we needed to back-patch a regression test that includes explain output. -1. 2. Don't display the planning time by default, necessitating a new PLANNING_TIME ON option. This has backwards compatibility to recommend it, but not much else. 3. Let COSTS OFF suppress it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Planning time in explain/explain analyze
On Mon, Dec 23, 2013 at 10:47 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Dec 23, 2013 at 9:54 PM, Andreas Karlsson andr...@proxel.se wrote: Yeah, forgot to mention that we need some way to disable it in the tests. Either by not having it included in EXPLAIN or by adding an option to turn it off. Any suggestion on which would be preferable? I would be tempted to display it only if (COSTS OFF) is not given. Yeah. The alternatives seem to be: 1. Change a lot of regression tests. This would be a serious PITA, not so much for the one-time cost as for every time we needed to back-patch a regression test that includes explain output. -1. 2. Don't display the planning time by default, necessitating a new PLANNING_TIME ON option. This has backwards compatibility to recommend it, but not much else. 3. Let COSTS OFF suppress it. Another option would be to not display it by default, but make VERBOSE show it. However, I think making it controlled by COSTS is a better fit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
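[Concretely, under the COSTS-controlled behavior being proposed, sessions would look something like this - the output below is illustrative only, not from a build of the patch:]

-- Default: cost estimates are shown, and the new line appears with them.
EXPLAIN SELECT * FROM pg_class WHERE relname = 'pg_type';
--  Seq Scan on pg_class  (cost=0.00..11.30 rows=1 width=203)
--    Filter: (relname = 'pg_type'::name)
--  Planning time: 0.123 ms

-- COSTS OFF: output stays stable for regression tests; the planning
-- time line is suppressed along with the cost estimates.
EXPLAIN (COSTS OFF) SELECT * FROM pg_class WHERE relname = 'pg_type';
--  Seq Scan on pg_class
--    Filter: (relname = 'pg_type'::name)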
Re: [HACKERS] Logging WAL when updating hintbit
On Sat, Dec 21, 2013 at 4:13 AM, Robert Haas robertmh...@gmail.com wrote: On Fri, Dec 20, 2013 at 9:06 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Michael Paquier escribió: On Fri, Dec 20, 2013 at 1:05 PM, Sawada Masahiko sawada.m...@gmail.com wrote: Sorry, the patch which I attached has wrong indent on pg_controldata. I have modified it and attached the new version patch.

Now that you've sent this patch, I am just recalling some recent email from Tom arguing about avoiding mixing lower- and upper-case characters in a GUC parameter name: http://www.postgresql.org/message-id/30569.1384917...@sss.pgh.pa.us To fulfill this requirement, could you replace walLogHints with wal_log_hints in your patch? Thoughts from others?

The issue is with the user-visible variables, not with internal variables implementing them. I think the patch is sane. (Other than the fact that it was posted as a patch-on-patch instead of patch-on-master.)

But spelling it the same way everywhere really improves greppability.

Yep, finding all the code paths with a single keyword is usually a good thing. Attached is a purely-aesthetical patch that updates the internal variable name to wal_log_hints.
-- Michael

commit a727578f38ae6574d36fc840ec903cc2a6f4a860
Author: Michael Paquier mich...@otacoo.com
Date:   Tue Dec 24 22:56:13 2013 +0900

    Rename internal variable of wal_log_hints for better consistency

    Purely-aesthetical patch...

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1277e71..43a0416 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -79,7 +79,7 @@ bool		XLogArchiveMode = false;
 char	   *XLogArchiveCommand = NULL;
 bool		EnableHotStandby = false;
 bool		fullPageWrites = true;
-bool		walLogHints = false;
+bool		wal_log_hints = false;
 bool		log_checkpoints = false;
 int			sync_method = DEFAULT_SYNC_METHOD;
 int			wal_level = WAL_LEVEL_MINIMAL;
@@ -5271,7 +5271,7 @@ BootStrapXLOG(void)
 	ControlFile->max_prepared_xacts = max_prepared_xacts;
 	ControlFile->max_locks_per_xact = max_locks_per_xact;
 	ControlFile->wal_level = wal_level;
-	ControlFile->wal_log_hints = walLogHints;
+	ControlFile->wal_log_hints = wal_log_hints;
 	ControlFile->data_checksum_version = bootstrap_data_checksum_version;
 
 	/* some additional ControlFile fields are set in WriteControlFile() */
@@ -9060,7 +9060,7 @@ static void
 XLogReportParameters(void)
 {
 	if (wal_level != ControlFile->wal_level ||
-		walLogHints != ControlFile->wal_log_hints ||
+		wal_log_hints != ControlFile->wal_log_hints ||
 		MaxConnections != ControlFile->MaxConnections ||
 		max_worker_processes != ControlFile->max_worker_processes ||
 		max_prepared_xacts != ControlFile->max_prepared_xacts ||
@@ -9083,7 +9083,7 @@ XLogReportParameters(void)
 		xlrec.max_prepared_xacts = max_prepared_xacts;
 		xlrec.max_locks_per_xact = max_locks_per_xact;
 		xlrec.wal_level = wal_level;
-		xlrec.wal_log_hints = walLogHints;
+		xlrec.wal_log_hints = wal_log_hints;
 
 		rdata.buffer = InvalidBuffer;
 		rdata.data = (char *) &xlrec;
@@ -9098,7 +9098,7 @@ XLogReportParameters(void)
 		ControlFile->max_prepared_xacts = max_prepared_xacts;
 		ControlFile->max_locks_per_xact = max_locks_per_xact;
 		ControlFile->wal_level = wal_level;
-		ControlFile->wal_log_hints = walLogHints;
+		ControlFile->wal_log_hints = wal_log_hints;
 
 		UpdateControlFile();
 	}
 }
@@ -9485,7 +9485,7 @@ xlog_redo(XLogRecPtr lsn, XLogRecord *record)
 		ControlFile->max_prepared_xacts = xlrec.max_prepared_xacts;
 		ControlFile->max_locks_per_xact = xlrec.max_locks_per_xact;
 		ControlFile->wal_level = xlrec.wal_level;
-		ControlFile->wal_log_hints = walLogHints;
+		ControlFile->wal_log_hints = wal_log_hints;
 
 		/*
 		 * Update minRecoveryPoint to ensure that if recovery is aborted, we
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index a605363..f8261b6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -876,7 +876,7 @@ static struct config_bool ConfigureNamesBool[] =
 		gettext_noop("Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications"),
 		NULL
 	},
-	&walLogHints,
+	&wal_log_hints,
 	false,
 	NULL, NULL, NULL
 },
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 954486a..91ba0d1 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -189,7 +189,7 @@ extern bool XLogArchiveMode;
 extern char *XLogArchiveCommand;
 extern bool EnableHotStandby;
 extern bool fullPageWrites;
-extern bool walLogHints;
+extern bool wal_log_hints;
 extern bool log_checkpoints;
 extern int	num_xloginsert_slots;
@@ -221,7 +221,7 @@ extern int	wal_level;
  * of the bits make it to disk, but the checksum wouldn't match. Also WAL-log
  * them if forced by wal_log_hints=on.
  */
-#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || walLogHints)
+#define XLogHintBitIsNeeded() (DataChecksumsEnabled() || wal_log_hints)
Re: [HACKERS] INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
On Mon, Dec 23, 2013 at 5:59 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 23, 2013 at 7:49 AM, Robert Haas robertmh...@gmail.com wrote: I don't think this is a project to rush through. We've lived without MERGE/UPSERT for several years now, and we can live without it for another release cycle while we try to reach agreement on the way forward. I can tell that you're convinced you know the right way forward here, and you may be right, but I don't think you've convinced everyone else - maybe not even anyone else.

That may be. Attention from reviewers has been in relatively short supply. Not that that isn't always true.

I think concrete concerns about usability have largely been subordinated to abstruse discussions about locking protocols. A discussion strictly about what syntax people would consider reasonable, perhaps on another thread, might elicit broader participation (although this week might not be the right time to try to attract an audience).

Having looked at the code for the first time recently, I'd agree that hash indexes are a disaster. A major advantage of the Lehman and Yao algorithm, as prominently noted in the paper, is that exclusive locks are acquired only on leaf pages, which increases concurrency. Since I only propose to extend this to a heavyweight page lock, and still only for an instant, it seems reasonable to assume that the performance will be acceptable for an initial version of this. It's not as if most places will have to pay any heed to this heavyweight lock - index scans and non-upserting inserts are generally unaffected. We can later optimize performance as we measure a need to do so. Early indications are that the performance is reasonable.

OK.

To be honest, I am still not altogether sold on any part of this feature. I don't like the fact that it violates MVCC - although I admit that some form of violation is inevitable in any feature in this area unless we're content to live with many serialization failures, I don't like the particular way it violates MVCC.

Discussions around visibility issues have not been very useful. As I've said, I don't like the term MVCC violation, because it's suggestive of some classical, codified definition of MVCC, a definition that doesn't actually exist anywhere, even in research papers, AFAICT.

I don't know whether or not that's literally true, but like Potter Stewart, I don't think there's any real ambiguity about the underlying concept. The concepts of read-write, write-read, and write-write dependencies between transactions are well described in textbooks such as Jim Gray's Transaction Processing: Concepts and Techniques and this paper on MVCC: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.552&rep=rep1&type=pdf

I think the definition of an MVCC violation is that a snapshot sees the effects of a transaction which committed after that snapshot was taken. And maybe it's good and right that this patch is introducing a new way for that to happen, or maybe it's not, but the point is that we get to decide.

I've been very consistent even in the face of strong criticism. What I have now is essentially the same design as back in early September. After the initial ON DUPLICATE KEY IGNORE patch in August, I soon realized that value locking and row locking could not be sensibly considered in isolation, and over the objections of others pushed ahead with integrating the two.
I believe now, as I believed then, that value locks need to be cheap to release (or it at least needs to be possible), and that it was okay to drop all value locks when we need to deal with a possible conflict/getting an xid shared lock - if those unlocked pages have separate conflicts on our next attempt, the feature is being badly misused (for upserting) or it doesn't matter, because we only need one conclusive "No" answer (for IGNOREing, but also for upserting).

I'm not saying that you haven't been consistent, or that you've done anything wrong at all. I'm just saying that the default outcome is that we change nothing, and the fact that nobody's been able to demonstrate that an approach is clearly superior to what you've proposed does not mean we have to accept what you've proposed. I am not necessarily advocating for rejecting your proposed approach, although I do have concerns about it, but I think it is clear that it is not backed by any meaningful amount of consensus. Maybe that will change in the next two months, and maybe it won't. If it doesn't, whether through any fault of yours or not, I don't think this is going in. If this is all perfectly clear to you already, then I apologize for belaboring the point.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
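[For the archives, Robert's definition is easy to see in a two-session schedule. A sketch - table t is assumed to exist, and each column runs in its own psql session:]

-- Session A                                  -- Session B
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;   -- snapshot is taken here; say it returns 0
                                              INSERT INTO t VALUES (1);
                                              COMMIT;
SELECT count(*) FROM t;   -- under MVCC this must still return 0; if it
                          -- returned 1, the snapshot would be seeing a
                          -- transaction that committed after the snapshot
                          -- was taken, i.e. the violation being discussed
COMMIT;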
[HACKERS] Fix typo in src/backend/utils/mmgr/README
This is a small patch to fix a typo in src/backend/utils/mmgr/README. Thanks, Best regards, Etsuro Fujita typofix.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] varattno remapping
Hi all

I'm finding it necessary to remap the varno and varattno of Var elements in RETURNING lists as part of updatable security barrier (s.b.) view processing. A reference to a view col in RETURNING must be rewritten to point to the new resultRelation at the top level, so that the effects of BEFORE triggers are visible in the returned result. Usually the view will have a different structure to the base table, and in this case it's necessary to change the varattno of the Var to point to the corresponding col on the base rel.

So the short version: Given the RTE for a simple view with only one base rel and an attribute number for a col in that view, and assuming that the view col is a simple reference to a col in the underlying base rel, is there any sane way to get the attribute number for the corresponding col on the base rel?

For example, given:

CREATE TABLE t (a integer, b text, c numeric);
CREATE VIEW v1 AS SELECT c, a FROM t;
UPDATE v1 SET c = NUMERIC '42' RETURNING a, c;

... the Vars for a and c in the RETURNING clause need to be remapped so their varno is the rti for t once the RTE has been injected, and the varattnos need changing to the corresponding attribute numbers on the base table. So a Var for v1(c), say (1,1), must be remapped to (2,3), i.e. varno 2 (t), varattno 3 (c).

I'm aware that this is somewhat like the var/attr twiddling being complained about in the RLS patch. I don't see a way around it, though. I can't push the RETURNING tlist entries down into the expanded view's subquery and add an outer Var reference - they must point directly to the resultRelation so that we see the effects of any BEFORE triggers.

I'm looking for advice on how to determine, given an RTI and an attribute number for a simple view, what the attribute number of the col in the view's base relation is. It can be assumed for this purpose that the view attribute is a simple Var (otherwise we'll ERROR out, since we can't update an expression).

I'm a bit stumped at this point. I could adapt the ChangeVarNodes walker to handle the actual recursive rewriting and Var changing. It assumes that the varattno is the same on both the old and new varno, as it's intended for when you have an RT index change, but it could be taught to optionally remap varattnos. To do that, though, I need a way to actually do that varattno remapping cleanly.

Anyone got any ideas?

-- Craig Ringer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] varattno remapping
On 12/24/2013 02:35 PM, Craig Ringer wrote: So the short version: Given the RTE for a simple view with only one base rel and an attribute number for a col in that view, and assuming that the view col is a simple reference to a col in the underlying base rel, is there any sane way to get the attribute number for the corresponding col on the base rel? So, of course, I find it as soon as I post. map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c . Just generate the mapping table and go. Sorry for the noise folks. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
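[For the archives, a rough sketch of what that looks like for the v1/t example above, assuming the current map_variable_attnos() signature; the wrapper function below is hypothetical:]

#include "postgres.h"
#include "rewrite/rewriteManip.h"

/*
 * Sketch: remap Vars in a RETURNING expression that reference view
 * v1(c, a) so their varattnos match base table t(a, b, c). The map is
 * indexed by view attno: v1 attno 1 (c) -> t attno 3, v1 attno 2 (a)
 * -> t attno 1.
 */
static Node *
remap_view_vars(Node *returning_expr, int view_varno)
{
	AttrNumber	attno_map[2] = {3, 1};
	bool		found_whole_row;

	return map_variable_attnos(returning_expr, view_varno, 0,
							   attno_map, lengthof(attno_map),
							   &found_whole_row);
}

[Note that map_variable_attnos() rewrites only the varattno of Vars with the given varno; retargeting the varno itself at the new resultRelation would still be a separate step, e.g. via ChangeVarNodes().]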
Re: [HACKERS] varattno remapping
On Tue, Dec 24, 2013 at 11:47 AM, Craig Ringer cr...@2ndquadrant.com wrote: On 12/24/2013 02:35 PM, Craig Ringer wrote: So the short version: Given the RTE for a simple view with only one base rel and an attribute number for a col in that view, and assuming that the view col is a simple reference to a col in the underlying base rel, is there any sane way to get the attribute number for the corresponding col on the base rel? So, of course, I find it as soon as I post. map_variable_attnos(...), also in src/backend/rewrite/rewriteManip.c . Just generate the mapping table and go.

Could you please explain a little more how you would solve the posed problem using map_variable_attnos? I was recently working on a similar problem and used the following algorithm to solve it. I had to find which column of the base table a column in the select statement of the view belongs to. To relate a target list entry in the select query of the view to an actual column in the base table, here is what I did: First, determine whether the var's varno refers to an RTE which is a view. If yes, then get the view query (RangeTblEntry::subquery) and see which element in the view query's target list this var's varattno points to. Get the varno of that target list entry. Look up that RTE in the view's query and, if that RTE is a real table, use that var, making sure its varno now points to the actual table.

Thanks in advance.

Sorry for the noise folks. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers

--
Abbas
Architect
Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com
Follow us on Twitter @EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers and more: http://www.enterprisedb.com/resources-community
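[A minimal sketch of the lookup Abbas describes, with hypothetical scaffolding - no resjunk handling or error reporting, and assuming the view has already been expanded to a subquery RTE:]

#include "postgres.h"
#include "nodes/parsenodes.h"
#include "parser/parsetree.h"

/*
 * Sketch: given a view's RTE (expanded to a subquery) and a column
 * number into its SELECT list, return the Var for the underlying
 * base-relation column, or NULL if the view column isn't a simple Var
 * over a plain relation.
 */
static Var *
view_col_to_base_var(RangeTblEntry *rte, AttrNumber attno)
{
	TargetEntry *tle;
	Var		   *var;
	RangeTblEntry *base_rte;

	if (rte->rtekind != RTE_SUBQUERY)
		return NULL;			/* not a view expanded to a subquery */

	/* which element of the view query's target list does attno point to? */
	tle = (TargetEntry *) list_nth(rte->subquery->targetList, attno - 1);
	if (!IsA(tle->expr, Var))
		return NULL;			/* view column is an expression */

	var = (Var *) tle->expr;

	/* is the Var's own RTE a plain relation? */
	base_rte = rt_fetch(var->varno, rte->subquery->rtable);
	if (base_rte->rtekind != RTE_RELATION)
		return NULL;

	return var;					/* its varattno is the base column's attno */
}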
Re: [HACKERS] trailing comment ghost-timing
On Tue, Dec 24, 2013 at 03:40:58AM +0100, Andreas Karlsson wrote: On 12/24/2013 03:17 AM, David Johnston wrote: It is not sent to the server as a trailing comment. The following file is sent to the server like this.

File:
/**/;
/**/

Commands:
PQexec(..., "/**/;");
PQexec(..., "/**/");

If this has to be fixed it should be in the client. I think people would complain if we broke the API by starting to throw an exception on PQexec with a string containing no actual query.

(Untested.) Isn't this just a case of psql not printing out a timing if the server responds with PGRES_EMPTY_QUERY?

Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
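[Martijn's observation is easy to verify with a small libpq program - a sketch, assuming connection parameters come from the environment:]

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("");	/* connection params from environment */
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return EXIT_FAILURE;
	}

	/* A comment-only string yields an EmptyQueryResponse from the server,
	 * which libpq reports as PGRES_EMPTY_QUERY. */
	res = PQexec(conn, "/**/");
	if (PQresultStatus(res) == PGRES_EMPTY_QUERY)
		printf("empty query - a timing-aware client could suppress Time: here\n");

	PQclear(res);
	PQfinish(conn);
	return EXIT_SUCCESS;
}

[psql could apply the same PGRES_EMPTY_QUERY test before printing its Time: line, which would also resolve the inconsistency David mentioned.]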