Re: [HACKERS] Enabling Checksums
On 08.03.2013 05:31, Bruce Momjian wrote:
> Also, don't all modern storage drives have built-in checksums, and report problems to the system administrator? Does smartctl help report storage corruption?
>
> Let me take a guess at answering this --- we have several layers in a database server:
>
> 1. storage
> 2. storage controller
> 3. file system
> 4. RAM
> 5. CPU
>
> My guess is that storage checksums only cover layer 1, while our patch covers layers 1-3, and probably not 4-5 because we only compute the checksum on write.

There is a thing called the Data Integrity Field (DIF) and/or Data Integrity Extensions (DIX), which allow storing a checksum with each disk sector and verifying the checksum in each layer. The basic idea is that instead of 512-byte sectors, the drive is formatted to use 520-byte sectors, with the extra 8 bytes used for the checksum and some other metadata. That gets around the problem we have in PostgreSQL, and that filesystems have, which is that you need to store the checksum somewhere along with the data.

When a write I/O request is made in the OS, the OS calculates the checksum and passes it through the controller to the drive. The drive verifies the checksum, and aborts the I/O request if it doesn't match. On a read, the checksum is read from the drive along with the actual data, passed through the controller, and the OS verifies it. This covers layers 1-2 or 1-3.

Now, this requires all the components to have support for it. I'm not an expert on these things, but I'd guess that's a tall order today; I don't know which hardware vendors and kernel versions support it. But things usually keep improving, and hopefully in a few years you will be able to easily buy a hardware stack that supports DIF all the way through.

In theory, the OS could also expose the DIF field to the application, so that you get end-to-end protection from the application to the disk.
This means that the application somehow gets access to those extra bytes in each sector, and you have to calculate and verify the checksum in the application. There are no standard APIs for that yet, though. See https://www.kernel.org/doc/Documentation/block/data-integrity.txt.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
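The 520-byte sector format described above can be sketched in miniature. The 8-byte protection-information field in T10 DIF holds a 2-byte guard (a CRC-16 over the 512 data bytes, polynomial 0x8BB7), a 2-byte application tag, and a 4-byte reference tag; everything else in this Python sketch (function names, packing) is invented for illustration and has nothing to do with any real kernel or drive API:

```python
import struct

def crc16_t10dif(data: bytes) -> int:
    """CRC-16/T10-DIF: poly 0x8BB7, init 0x0000, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x8BB7) if (crc & 0x8000) else (crc << 1)
            crc &= 0xFFFF
    return crc

def format_sector(data: bytes, app_tag: int = 0, ref_tag: int = 0) -> bytes:
    """Pack a 512-byte payload into a 520-byte 'sector' with an 8-byte PI field."""
    assert len(data) == 512
    guard = crc16_t10dif(data)
    return data + struct.pack(">HHI", guard, app_tag, ref_tag)

def verify_sector(sector: bytes) -> bool:
    """What the drive (or the OS, on read) would do: recompute and compare the guard."""
    data, pi = sector[:512], sector[512:]
    guard, _app_tag, _ref_tag = struct.unpack(">HHI", pi)
    return crc16_t10dif(data) == guard
```

A single flipped bit anywhere in the payload changes the CRC, so verify_sector() fails - which is the point: every layer that carries the sector can run the same check.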
Re: [HACKERS] Enabling Checksums
On 07.03.2013 23:45, Jeff Davis wrote:
> By the way, I can not find any trace of XLogCheckBufferNeedsBackup(); was that a typo?

Ah, sorry, that was a new function introduced by another patch I was reviewing at the same time, and I conflated the two.

- Heikki
[HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
2013/3/8 Josh Kupershmidt schmi...@gmail.com:
> [Moving to -hackers]
>
> On Sat, Feb 23, 2013 at 2:51 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
>> so
>> * --conditional-drops replaced by --if-exists
>
> Thanks for the fixes, I played around with the patch a bit. I was sort of expecting this example to work (after setting up the regression database with `make installcheck`):
>
>   pg_dump --clean --if-exists -Fp -d regression --file=regression.sql
>   createdb test
>   psql -v ON_ERROR_STOP=1 --single-transaction -d test -f regression.sql
>
> But it fails, first at:
>
>   DROP TRIGGER IF EXISTS tsvectorupdate ON public.test_tsvector;
>   ERROR: relation "public.test_tsvector" does not exist
>
> This seems like a shortcoming of DROP TRIGGER ... IF EXISTS, and it looks like DROP RULE ... IF EXISTS has the same problem. I recall DROP ... IF EXISTS being fixed recently so as not to error out if the schema specified for the object does not exist, and ISTM the same arguments could be made in favor of fixing DROP TRIGGER/TABLE ... IF EXISTS not to error out if the table doesn't exist.
>
> Working further through the dump of the regression database, these also present problems for --clean --if-exists dumps:
>
>   DROP CAST IF EXISTS (text AS public.casttesttype);
>   ERROR: type "public.casttesttype" does not exist
>
>   DROP OPERATOR IF EXISTS public.% (point, widget);
>   ERROR: type "widget" does not exist
>
>   DROP FUNCTION IF EXISTS public.pt_in_widget(point, widget);
>   ERROR: type "widget" does not exist
>
> I'm not sure whether DROP CAST/OPERATOR/FUNCTION IF EXISTS should be more tolerant of nonexistent types, or if the mess could perhaps be avoided by dump reordering. Note, this usability problem affects unpatched head as well:
>
>   pg_dump -Fc -d regression --file=regression.dump
>   pg_restore --clean -1 -d regression regression.dump
>   ...
>   pg_restore: [archiver (db)] could not execute query: ERROR: type "widget" does not exist
>   Command was: DROP FUNCTION public.widget_out(widget);
>
> (The use here is a little different than the first example above, but I would still expect this case to work.) The above problems with IF EXISTS aren't really a problem of the patch per se, but IMO it would be nice to straighten all the issues out together for 9.4.

ok

>> * -- additional check, available only with -c option
>
> Cool. I think it would also be useful to check that --clean may only be used with --format=p to avoid any confusion there. (This issue could be addressed in a separate patch if you'd rather not lard this patch.)

No - some people (like us, in our company) would like to use this feature as a quiet, strict mode for the plain-text format too. Enabling this feature has zero overhead, so there is no reason to block it anywhere.

> Some comments on the changes:
>
> 1. There is at least one IF EXISTS check missing from pg_dump.c; see for example this statement from a dump of the regression database with --if-exists:
>
>   ALTER TABLE public.nv_child_2010 DROP CONSTRAINT nv_child_2010_d_check;
>
> 2. Shouldn't pg_restore get --if-exists as well?
>
> 3.
>   + printf(_("--if-exists don't report error if cleaned object doesn't exist\n"));
>
> This help output bleeds just over our de facto 80-character limit. Also, contractions seem to be avoided elsewhere. It's a little hard to squeeze a decent explanation into one line, but perhaps "use IF EXISTS when dropping objects" would be better. The sgml changes could use some wordsmithing and grammar fixes. I could clean these up for you in a later version if you'd like.
>
> 4. There seem to be spurious whitespace changes to the function prototype and declaration for _printTocEntry.

I'll send an updated version next month. Thank you for the review.

Regards

Pavel

> That's all I've had time for so far...
> Josh
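The effect the proposed --if-exists flag has on a plain-text dump can be shown with a toy post-processor. This is purely illustrative - the real patch emits IF EXISTS directly from pg_dump, and this naive regex handles only the simple statement shapes quoted above:

```python
import re

def add_if_exists(dump_sql: str) -> str:
    """Rewrite simple DROP statements into DROP ... IF EXISTS form.

    Rough sketch only: real dumps contain many more object types, and
    DROP CAST/OPERATOR/FUNCTION have trickier argument syntax.
    """
    pattern = r'^DROP (TABLE|VIEW|SEQUENCE|INDEX|SCHEMA|FUNCTION|TRIGGER)(?! IF EXISTS)'
    return re.sub(pattern, r'DROP \1 IF EXISTS', dump_sql, flags=re.MULTILINE)

dump = """DROP TABLE public.test_tsvector;
DROP TRIGGER tsvectorupdate ON public.test_tsvector;
DROP TABLE IF EXISTS public.already_guarded;
"""
out = add_if_exists(dump)
```

Note that the thread's complaint survives even in the sketch: DROP TRIGGER IF EXISTS still errors out on the server if the trigger's *table* is gone, which is exactly the shortcoming Josh points out.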
Re: [HACKERS] REFRESH MATERIALIZED VIEW locklevel
2013/3/8 Andres Freund and...@anarazel.de:
> On 2013-03-07 15:21:35 -0800, Josh Berkus wrote:
>> This limitation is in no way crippling for this feature, or even a major detraction. I still intend to promote the heck out of this feature.
>
> That's scaring me. Because the current state of the feature isn't something that people expect under the term "materialized views", and I am pretty damn sure people will then remember postgres as trying to provide a tick-box item without it being really usable in the real world. And that's not something I want postgres to be known for.

+1. It seems wise to wait for the feature to ripen some more. That way, the impact of any promotion will be stronger: most people understand “materialized views” to mean something more than what is currently there.

Of course, a drawback of waiting would be that you might lose the momentum of the expression “materialized views.” OTOH, any questions along the lines of “I thought PG supported materialized views since 9.3? Why are they making such a fuss about it now (i.e., 9.3)?” would lead to people discussing even more, which might enhance the effect of the promotion.

Nicolas

--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?
Re: [HACKERS] [v9.3] OAT_POST_ALTER object access hooks
Thanks for your review.

2013/3/7 Robert Haas robertmh...@gmail.com:
> On Sun, Jan 27, 2013 at 1:55 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
>> The part-2 patch adds the new OAT_POST_ALTER event type, and its relevant permission checks in contrib/sepgsql.
>
> This documentation hunk is unclear:
>
>   +On xref linkend=sql-createfunction, literalinstall/ permission
>   +will be checked if literalleakproof/ attribute was given, not only
>   +literalcreate/ on the new function.
>
> Based on previous experience reading your patches, I'm guessing that what you actually mean is that both things are checked, but the wording doesn't make that clear. Also, if permissions are now checked on functions, doesn't the previous sentence need an update?

Your guess is right. When a user defines a function with the leakproof attribute, sepgsql checks both the create and install permissions on the new function, which is labeled according to the default security labeling rules. The previous section introduces the common behavior when a user creates any database object, not a particular object class, so it mentions only the create permission on object creation. On the other hand, the later section introduces the special checks that depend on the object class, such as for schema objects. That section says, at its top:

| A few additional checks are applied depending on object types.

And the sentence says "not only literalcreate/". Please suggest how to make the sentence less misleading.

> +In addition, literaladd_name/ and literalremove_name/ permission
> +will be checked towards relevant schema when we try to rename or set
> +new schema on the altered object.
>
> Suggest: In addition, literalremove_name/ and literaladd_name/ will be checked on the old and new schemas, respectively, when an object is moved to a new schema.
>
> +A few additional checks are applied depending on object types.
>
> For certain object types, additional checks are performed.

Thanks, I applied it.
> +On xref linkend=sql-alterfunction, literalinstall/ permission
> +will be checked if literalleakproof/ attribute was turned on, not
> +only literalsetattr/ on the new function.
>
> This is a near-duplicate of the previous hunk and suffers from the same awkwardness.

The above section introduces the behavior when a user creates an object of a particular object class. Should I revise it to introduce the behavior

> + * is_internal: TRUE if constraint is constructed unless user's intention
>
> I dunno what this means. What's the difference between an internal constraint and a non-internal constraint, and why do we need that distinction? This seems to percolate to a lot of places; I'd rather not do that without a real good reason.

is_internal is not a property of the constraint itself; it reflects the nature of the invocation context. Unfortunately, some invocation paths need to handle the event of a constraint being created or altered as an internal one. For example, make_new_heap(), which also calls heap_create_with_catalog(), is called to construct an empty clone relation in order to rewrite the whole table during some ALTER TABLE commands and elsewhere. This table creation is purely internal (in other words, the object is constructed for implementation reasons only). heap_create_with_catalog() also calls StoreConstraints(), which adds a new constraint with a hook invocation. This is a situation where an extension wants to distinguish an internal constraint from a non-internal one. Similarly, when the AT_ReAddConstraint command adds a constraint, it is constructed because of a data type change in the primary ALTER TABLE command, even though the existing one is internally dropped. That is why I had to add the is_internal flag for constraints.

> + /* XXX - should be checked at caller side */
>
> XXX should be used only for things that really ought to be revisited and changed. See the wording I used in the just-committed part 1 patch.

OK, I'll fix it.
> +#include catalog/objectaccess.h
>
> This is the only hunk in collationcmds.c, hence presumably not needed.
>
> + /* Post create hook of this access method operator */
> + InvokeObjectPostCreateHook(AccessMethodOperatorRelationId,
> +                            entryoid, 0);
>
> I suggest uniformly adding a blank line before each of these hunks, rather than adding it for some and not others. I think, though, that we could probably ditch the comments throughout. They don't add anything, really.

OK, I'll follow that convention. The comments about the hooks might have made sense in a previous version, but these comments do not add anything beyond the function name.

> @@ -3330,7 +3342,6 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
>   */
>   break;
>   case AT_SetTableSpace: /* SET TABLESPACE */
> - /*
>   * Nothing to do here; Phase 3 does the work
Re: [HACKERS] SP-GiST for ranges based on 2d-mapping and quad-tree
On 03.03.2013 19:42, Alexander Korotkov wrote:
> This patch only adds one more operator to the already committed new opclass. Tests already cover this case. Without the patch, the corresponding test leads to a sequential scan instead of an index scan.

Thanks, committed with some trivial cleanup.

> However, I can't see any documentation changes about the already committed opclass. I think we need a documentation update as a separate patch.

Agreed. I think a small mention in the Range Types - Indexing section (http://www.postgresql.org/docs/devel/static/rangetypes.html#RANGETYPES-GIST) will do.

- Heikki
Re: [HACKERS] sql_drop Event Triggerg
On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
> Hmm, maybe I should be considering a pair of macros instead -- UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other ideas are welcome.

That seems like a possibly promising idea. I do wonder how well any of this is going to scale. Presumably people are going to want similar things for CREATE and (hardest) ALTER. Seems like ProcessUtility() could get pretty messy and confusing. But I don't have a better idea, either. :-(

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Index Unqiueness
On Tue, Mar 5, 2013 at 10:41 PM, abhinav batra abba...@gmail.com wrote:
> Hey
> I want to work towards the following feature in the TODO list: "Prevent index uniqueness checks when UPDATE does not modify the column. Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE. However, HOT already short-circuits this in common cases, so more work might not be helpful."
> So I wanted to know if someone is already working on this.

Not to my knowledge. But... I wonder why you want to work on this? I've never heard of anyone having a performance problem because of this. If you are, it would be interesting to hear the details.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Index Unqiueness
On 2013-03-06 09:11:38 +0530, abhinav batra wrote:
> Hey
> I want to work towards the following feature in the TODO list: "Prevent index uniqueness checks when UPDATE does not modify the column. Uniqueness (index) checks are done when updating a column even if the column is not modified by the UPDATE. However, HOT already short-circuits this in common cases, so more work might not be helpful."

I'd be interested in something slightly related: doing HOT on a per-index basis. Currently we don't do HOT if any index is updated, even though quite possibly most of the indexes don't change. I think that might result in some quite nice speedups...

Greetings,

Andres Freund

--
Andres Freund                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Materialized views WIP patch
On Wed, Mar 6, 2013 at 09:16:59AM -0500, Tom Lane wrote:
> Simon Riggs si...@2ndquadrant.com writes:
>> On 5 March 2013 22:02, Tom Lane t...@sss.pgh.pa.us wrote:
>>> FWIW, my opinion is that doing anything like this in the planner is going to be enormously expensive.
>>
>> As we already said: no MVs = zero overhead = no problem.
>
> Well, in the first place that statement is false on its face: we'll still spend cycles looking for relevant MVs, or at least maintaining a complexly-indexed cache that helps us find out that there are none in a reasonable amount of time. In the second place, even if it were approximately true it wouldn't help the people who were using MVs.
>
>> It costs in the cases where time savings are possible and not otherwise.
>
> And that is just complete nonsense: matching costs whether you find a match or not. Could we have a little less Pollyanna-ish optimism and a bit more realism about the likely cost of such a feature?

Should we add this to the TODO list as a possibility?

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] sql_drop Event Triggerg
Robert Haas escribió:
> On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
>> Hmm, maybe I should be considering a pair of macros instead -- UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other ideas are welcome.
>
> That seems like a possibly promising idea. I do wonder how well any of this is going to scale.

I did follow up with a patch implementing that; did you see it?

> Presumably people are going to want similar things for CREATE and (hardest) ALTER. Seems like ProcessUtility() could get pretty messy and confusing. But I don't have a better idea, either. :-(

Well, the first thing that we need to settle is the user interface. Normalized command strings don't seem to cut it; requiring users to write SQL parsers is rather unfriendly IMHO. The current idea of having a function that returns the objects affected by the command seems relatively sensible. For drops, it seems pretty straightforward so far. For CREATE it's probably somewhat more involved, but seems doable in principle (but yes, we're going to have to sprinkle ProcessUtility() with a lot of UTILITY_START/END_CREATE calls). Not sure about ALTER; maybe we will need a completely different idea to attack that.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Parameterized paths vs index clauses extracted from OR clauses
On Tue, Mar 5, 2013 at 11:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Tue, Mar 5, 2013 at 3:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>>> If "foo OR bar" is useful as an indexqual condition in the inner scan, that's one thing. But if it isn't, the cycles expended to check it in the inner scan are possibly wasted, because we'll still have to check the full original OR clause later. It's possible that the filter condition removes enough rows from the inner scan's result to justify the redundant checks, but it's at least as possible that it doesn't.
>>
>> Yeah, that's pretty unappealing. It probably doesn't matter much if foo is just a column reference, but what if it's an expensive function? For that matter, what if it's a volatile function that we can't execute twice without changing the results?
>
> Well, *that* worry at least is a nonissue: if the clause contains volatile functions then it's not a candidate to be an indexqual anyway. The code that pulls out these simplified OR clauses is only looking for clauses that can be shown to match existing indexes, so it won't pick anything like that. But expensive functions could be a hazard. (Hm, maybe what we need is a marker for "enforce this clause only if you feel like it"?)

Not sure I get the parenthesized bit.

> I was thinking that we'd extract the simplified clause and then mark it as something to be used only if it can be used as an indexqual. However, on second thought that still leaves us with the problem that some parameterized paths yield more rows than others.

OK, that's what I was thinking, and why I was confused.

> Maybe that assumption simply has to go ...

That assumption is pretty darn important from a planning-time perspective, though. I think if we rip it out we'd better have some idea what we're going to do instead. This parameterized path stuff is very cool, but just like any other planner trick it has to be worth the cycles spent at runtime.
On further reflection, I'm wondering about this part from your original email:

> We do get as far as finding that out and building a bitmap scan path, but the path is marked as yielding 2918 rows (the whole product table), not the 2 rows it actually will produce. That's because the parameterized path code is designed to assume that all identically-parameterized scans will produce the same number of rows, and it's already computed that row estimate without the benefit of the extracted OR clause.

Can't we fix this in some more-localized way? I mean, bitmap index scans are a bit of a special case, because the individual quals get their own plan tree nodes. With a sequential scan, index scan, or index-only scan, any scan of the same relation can incorporate all of the same quals. But if we implement a scan for "a = 1 or b = 1" as a bitmap-or of two bitmap scans, then suddenly there's a node in the tree that can accommodate only the a = 1 qual and another that can accommodate only the b = 1 qual. Expecting those to have the same row count is clearly nonsense, but it feels like an artifact of the representational choice.

One can imagine a database where a sequential scan with a complex filter condition is represented as a filtering node atop a sequential-scan node rather than as a sequential scan node with a filter condition glued onto it; conversely, one could imagine a representation for bitmap scans where a bitmap-or operation comes out as a single node with a more complex internal structure. I'm not actually proposing that we change the representational choices we've made here (though I wouldn't be averse to it), but the point I'm making is that it seems like the behavior that's causing the problem is a representational artifact, and thinking about the problem that way might suggest a narrower fix.
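The row-estimate mismatch under discussion can be made concrete with a toy calculation (the 2918-row table size is taken from the quoted example; the column values are invented so that the OR clause matches exactly 2 rows):

```python
# Toy table: 2918 rows; the qual "a = 1 OR b = 1" matches only 2 of them.
rows = [{"a": i, "b": i + 1} for i in range(2918)]

a_match = [r for r in rows if r["a"] == 1]   # feeds one bitmap index scan
b_match = [r for r in rows if r["b"] == 1]   # feeds the other
or_match = [r for r in rows if r["a"] == 1 or r["b"] == 1]  # the bitmap-or result

# The two bitmap scans and their union all have different cardinalities,
# so assuming that every identically-parameterized scan of the relation
# yields the same row count leaves the path marked with the whole-table
# estimate (2918) instead of the 2 rows it actually produces.
```

This is only a cardinality illustration, not a model of the planner; the point is simply that the per-node counts (1, 1, and 2) and the whole-table count (2918) are four different numbers.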
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Materialized views WIP patch
On Tue, Mar 5, 2013 at 08:50:39AM +0000, Simon Riggs wrote:
> It's not a different feature, it's what most people expect a feature called MV to deliver. That's not a matter of opinion, it's simply how every other database works currently - Oracle, Teradata, SQL Server at least.

Good points.

> The fact that we don't allow MVs to automatically optimize queries is acceptable, as long as that is clearly marked in some way to avoid confusion, and I don't mean buried on p5 of the docs. What we have here is a partial implementation that can be improved upon over the next few releases. I hope no one is going to claim that "Materialized Views have been implemented" in the release notes for this release, because unqualified that would be seriously misleading and might even stifle further funding to improve things to the level already implemented elsewhere. Just to reiterate, I fully support the committing of this partial feature into Postgres in this release, because it will be a long haul to complete the full feature and what we have here is a reasonable stepping stone to get there.
>
> Transactionally up-to-date MVs can be used like indexes in the planner. The idea that this is impossible because of the permutations involved is somewhat ridiculous; there is much published work on optimising that, and some obvious optimisations. Clearly that varies according to the number of MVs and the number of tables they touch, not the overall complexity of the query. The overhead is probably the same as or less than that of partial indexes, which we currently think is acceptable. In any case, if you don't wish that overhead, don't use MVs.

While you are right that automatically using materialized views is like the optimizer choosing partial indexes, we actually already have auto-selection of row-level materialized views with expression indexes and index-only scans. When you do the insert or update, the indexed function is called and the value stored in the index.
If you later query the function call, we can pull the value right from the index. This, of course, is a very crude definition of materialized view, but it seems relevant.

--
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
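Bruce's point - compute at write time, answer reads from the stored value without recomputing - can be sketched with a toy model. This is illustrative only: the class and names are invented, and real PostgreSQL expression indexes are btrees over the computed value, not hash maps:

```python
class ExpressionIndex:
    """Toy model of an expression index: stores f(x) at write time,
    so reads can answer from the precomputed value without calling f."""

    def __init__(self, fn):
        self.fn = fn
        self.entries = {}  # computed value -> set of row ids

    def insert(self, row_id, column_value):
        # Write path: the indexed expression is evaluated here, once.
        self.entries.setdefault(self.fn(column_value), set()).add(row_id)

    def lookup(self, computed_value):
        # Read path ("index-only scan"): no call to fn at all.
        return self.entries.get(computed_value, set())


calls = 0
def expensive_fn(x):
    global calls
    calls += 1
    return x * x  # stand-in for an expensive immutable function

idx = ExpressionIndex(expensive_fn)
for row_id, value in enumerate([2, 3, 2]):
    idx.insert(row_id, value)

writes = calls
hits = idx.lookup(4)   # rows where expensive_fn(x) = 4; fn is not re-run
```

The "materialization" is exactly that `calls` does not increase during lookup: the function's result was persisted alongside the data at write time, which is the crude-materialized-view behavior described above.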
Re: [HACKERS] Index Unqiueness
On Fri, Mar 8, 2013 at 8:58 AM, Andres Freund and...@2ndquadrant.com wrote:
> I'd be interested in something slightly related: doing HOT on a per-index basis. Currently we don't do HOT if any index is updated, even though quite possibly most of the indexes don't change. I think that might result in some quite nice speedups...

Yeah, I agree. The thing that seems tricky is that, at present, once you have the CTID of a tuple in a HOT chain, you can follow the chain all the way to the end without inspecting the tuple contents - just xmin/xmax; and you know that all the tuples in the chain agree on all indexed attributes. If we relax that, consider a chain A -> B -> C, where A and B agree on all indexed attributes but B and C agree on only some of them. At present, only A will have any index pointers. If we break that assumption, then we lose the ability to HOT-prune away B and eventually C without index vac, which is a bummer. If we keep that assumption, then we can have an index pointer whose attributes don't match those of the tuple it references, which threatens to break index-only scans (at least). I haven't been able to think of a way to make this work out cleanly.

I think it's important to understand whether we're trying to extend the benefits of HOT update (skipping index insertions) or HOT cleanup (page-at-a-time vacuuming). They're both significant. If the goal is to optimize away index maintenance for the columns that aren't updated, we could change the way vacuum works: when vacuum finds a dead tuple, instead of scanning for the index pointers and removing them, it could scan for the index pointers and update them to point to the new CTID (which is available from looking at the original tuple). Then, potentially, updates don't need to insert into indexes on unchanged columns, because scans can follow the CTID pointers forward where needed.
For sanity's sake, it might be best to limit this to cases where the old and new tuples are on the same page, both to minimize the overhead during scans, and also because there's not enough space available in the line pointer to store a full CTID (and we definitely don't want to postpone truncating tuples to line pointers to get this optimization). I'm waving my hands wildly here; I might be totally off-base in thinking that any of this can work.

More broadly, I'm not sure if it's the right thing to optimize. It essentially aims to speed up updates, but the cost would be slower scans; indeed, it's hard to think of anything we could do in this area that wouldn't make scans more complex and therefore potentially slower, and it's not clear that's the right way to go. HOT cleanups - or hint bit setting - during scans are already a frequent cause of performance complaints. We could skip them during read-only operations, but we do them for a good reason: otherwise, repeated scans can get very painful. So I think there's a decent argument that we're already over-optimized for writes at the expense of reads.

I'm increasingly tempted to think that our heap representation needs a much broader rethink. We hear endless complaints in this forum to the effect that hint bits cause problems, both directly with performance and indirectly by greatly complicating things for other features, such as page checksums. We all love HOT, but it doesn't cover enough cases, and the HOT cleanups can occasionally cause pain. Freezing sucks. The often-expressed desire to cluster a table and have it stay clustered is unimplementable. Returning space to the OS requires painfully expensive maintenance. The 24-byte-per-tuple overhead seems badly overpriced for insert-only tables. These are not fundamental truths of the universe, or even of PostgreSQL; they are specific consequences of the representation we've chosen for heaps.
Many of them are things that we've grown into, rather than designed with malice aforethought: for example, freezing is a consequence of the after-the-fact desire to be able to support more than 4bn transactions over the lifetime of the database. So it's way better than what we had before, and yet, if we all sat down and designed a new on-disk storage format for a new product today, I'm sure none of us would pick one that expires after 2bn transactions.

We can continue to whittle away at these problems incrementally, and I hope we do, but I've started to feel like we're bumping up against the limits of what is feasible given the design decisions to which we've already committed. To take one concrete example, suppose we sat down and designed a new heap format, and suppose we made it our principal goal to make it write-once and zero-maintenance. In other words, if a user sat down and bulk-loaded data into a table, and never modified it, there would be no further writes to that table after the initial flush to disk (no hint bits, no visibility map bits, no freezing) and no scans of the data except as a result of user-initiated activity (i.e. no scans
Re: [HACKERS] Materialized views WIP patch
On Thu, Mar 7, 2013 at 12:14 PM, David E. Wheeler da...@justatheory.com wrote:
> On Mar 7, 2013, at 7:55 AM, Kevin Grittner kgri...@ymail.com wrote:
>>> If the answer to both those questions is “yes,” I think the term should remain “table,” with a few mentions that the term includes materialized views (and excludes foreign tables).
>>
>> And if the answers are “not exactly” and “yes”?
>
> I still tend to think that the term should remain “table,” with brief mentions at the top of pages when the term should be assumed to represent tables and matviews, and otherwise required disambiguations.

This ship has already sailed. There are plenty of places where operations apply to a subset of the relation types that exist today, and we either list them out or refer to relations generically. Changing that would require widespread changes to both the documentation and existing error message text. We cannot decide that "table" now means "table or materialized view" any more than we can decide that it means "table or foreign table", as was proposed around the time those changes went in. Yeah, it's more work, and it's a little annoying, but it's also clear. Nothing else is.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] sql_drop Event Triggerg
On Fri, Mar 8, 2013 at 9:18 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas escribió: On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Hmm, maybe I should be considering a pair of macros instead -- UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other ideas are welcome. That seems like a possibly promising idea. I do wonder how well any of this is going to scale. I did follow up with a patch implementing that; did you see it? No, sorry. Which thread is it on? Presumably people are going to want similar things for CREATE and (hardest) ALTER. Seems like ProcessUtility() could get pretty messy and confusing. But I don't have a better idea, either. :-( Well, the first thing that we need to settle is the user interface. Normalized command strings don't seem to cut it; requiring users to write SQL parsers is rather unfriendly IMHO. I could not agree more. The format we're moving towards for dropped objects can easily be converted back into SQL if you happen to want to do that, but it's also much easier to process programmatically than a normalized command string. So I think we get the best of both worlds with this design. The current idea of having a function that returns objects affected by the command seems relatively sensible. For drops, it seems pretty straightforward so far. For CREATE it's probably somewhat more involved, but seems doable in principle (but yes, we're going to have to sprinkle ProcessUtility() with a lot of UTILITY_START/END_CREATE calls). Not sure about ALTER; maybe we will need a completely different idea to attack that. I am inclined to think that putting this logic in ProcessUtility isn't scalable, even for CREATE, and even more so for ALTER, unless we can put it around everything in that function, rather than each command individually.
Suppose for example that on entry to that function we simply did this: if (isCompleteQuery) ++CompleteQueryNestingLevel; ...and at exit, we did the reverse. This could work a bit like the GUC nesting level. When an object is dropped, we find the array slot for the current nesting level, and it's, say, a List **, and we push a new element onto that list. When somebody asks for a list of dropped objects, we pull from the list for the current nesting level. When decrementing the nesting level, we flush the list associated with the old nesting level, if any. if (isCompleteQuery) { list_free(dropped_objects_list[CompleteQueryNestingLevel]); dropped_objects_list[CompleteQueryNestingLevel] = NIL; --CompleteQueryNestingLevel; } Now, if we want to support CREATE, we can just have a created_objects_list array as well, and only minimal changes are required here. If we want to support ALTER we can have an altered_objects_list, and the only decision is what data to stuff into the list elements. I think this is a lot better than decorating each individual command, which is already unwieldy and bound to get worse. It is a bit of a definitional change, because it implies that the list of dropped objects is the list of objects dropped by the most-nearly-enclosing DDL command, rather than the list of objects dropped by the most-nearly-enclosing DDL command *that is capable of dropping objects*. However, I'm inclined to think that's a better definition anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
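Robert's nesting-level bookkeeping can be sketched in miniature. This is a hypothetical illustration, not actual PostgreSQL code: the backend's List/NIL machinery is replaced by a plain linked list, and the names MAX_NESTING, enter_query, record_drop, and exit_query are invented here.

```c
/* Hypothetical sketch of the nesting-level bookkeeping described above.
 * Not PostgreSQL code: the backend's List/NIL types are replaced by a
 * plain linked list, and MAX_NESTING / record_drop are invented names. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NESTING 64          /* assumed bound on DDL nesting depth */

typedef struct DroppedObject
{
    const char *obj;
    struct DroppedObject *next;
} DroppedObject;

static int CompleteQueryNestingLevel = 0;
static DroppedObject *dropped_objects_list[MAX_NESTING];

/* on entry to the utility-command processor */
static void enter_query(int isCompleteQuery)
{
    if (isCompleteQuery)
        ++CompleteQueryNestingLevel;
}

/* called wherever an object drop is performed */
static void record_drop(const char *obj)
{
    DroppedObject *n = malloc(sizeof(DroppedObject));

    n->obj = obj;
    n->next = dropped_objects_list[CompleteQueryNestingLevel];
    dropped_objects_list[CompleteQueryNestingLevel] = n;
}

/* on exit: flush the list associated with the departing nesting level */
static void exit_query(int isCompleteQuery)
{
    if (isCompleteQuery)
    {
        DroppedObject *n = dropped_objects_list[CompleteQueryNestingLevel];

        while (n != NULL)
        {
            DroppedObject *next = n->next;

            free(n);
            n = next;
        }
        dropped_objects_list[CompleteQueryNestingLevel] = NULL;
        --CompleteQueryNestingLevel;
    }
}
```

A created_objects_list or altered_objects_list array would follow exactly the same pattern, which is the point of the proposal: the per-level push/flush logic is shared, and only the element payload differs.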
[HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
On Fri, Mar 8, 2013 at 2:23 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2013/3/8 Josh Kupershmidt schmi...@gmail.com: Cool. I think it would also be useful to check that --clean may only be used with --format=p to avoid any confusion there. (This issue could be addressed in a separate patch if you'd rather not lard this patch.) no, some people - like us in our company - would like to use this feature for quiet and strict mode for plain text format too. Enabling this feature has zero overhead so there is no reason to block it anywhere. I'm not sure I understand what you're getting at, but maybe my proposal wasn't so clear. Right now, you can specify --clean along with -Fc to pg_dump, and pg_dump will not complain even though this combination is nonsense. I am proposing that pg_dump error out in this case, i.e. $ pg_dump -Fc --file=test.dump --clean -d test pg_dump: option --clean only valid with plain format dump Although this lack of an error is (IMO) a misfeature of existing pg_dump, so if you'd rather leave this issue aside for your patch, that is fine. Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] sql_drop Event Triggerg
Robert Haas escribió: On Fri, Mar 8, 2013 at 9:18 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Robert Haas escribió: On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Hmm, maybe I should be considering a pair of macros instead -- UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other ideas are welcome. That seems like a possibly promising idea. I do wonder how well any of this is going to scale. I did follow up with a patch implementing that; did you see it? No, sorry. Which thread is it on? http://www.postgresql.org/message-id/20130305214218.gp9...@alvh.no-ip.org I think Gmail's feature of breaking threads when subject changes is an annoyance here. I somehow added a g at the end and later dropped it. I didn't remember that behavior of Gmail's. The current idea of having a function that returns objects affected by the command seems relatively sensible. For drops, it seems pretty straightforward so far. For CREATE it's probably somewhat more involved, but seems doable in principle (but yes, we're going to have to sprinkle ProcessUtility() with a lot of UTILITY_START/END_CREATE calls). Not sure about ALTER; maybe we will need a completely different idea to attack that. I am inclined to think that putting this logic in ProcessUtility isn't scalable, even for CREATE, and even more so for ALTER, unless we can put it around everything in that function, rather than each command individually. Suppose for example that on entry to that function we simply did this: if (isCompleteQuery) ++CompleteQueryNestingLevel; ...and at exit, we did the reverse. This could work a bit like the GUC nesting level. Hmm, this seems an interesting idea to explore. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Support for REINDEX CONCURRENTLY
On Fri, Mar 8, 2013 at 10:00 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Mar 8, 2013 at 1:41 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Mar 7, 2013 at 7:19 AM, Andres Freund and...@2ndquadrant.com wrote: The strange thing about hoge_pkey_cct_cct is that it seems to imply that an invalid index was reindexed concurrently? But I don't see how it could happen either. Fujii, can you reproduce it? Yes, I can even with the latest version of the patch. The test case to reproduce it is: (Session 1) CREATE TABLE hoge (i int primary key); INSERT INTO hoge VALUES (generate_series(1,10)); (Session 2) BEGIN; SELECT * FROM hoge; (keep this session as it is) (Session 1) SET statement_timeout TO '1s'; REINDEX TABLE CONCURRENTLY hoge; \d hoge REINDEX TABLE CONCURRENTLY hoge; \d hoge I fixed this problem in the patch attached. It was caused by 2 things: - The concurrent index was seen as valid from other backends between phases 3 and 4. So the concurrent index is made valid at phase 4, then swap is done and finally marked as invalid. So it remains invalid seen from the other sessions. - index_set_state_flags used heap_inplace_update, which is not completely safe at swapping phase, so I had to extend it a bit to use a safe simple_heap_update at swap phase. Thanks! + <para> + Concurrent indexes based on a <literal>PRIMARY KEY</> or an <literal> + EXCLUSION</> constraint need to be dropped with <literal>ALTER TABLE Typo: s/EXCLUSION/EXCLUDE I encountered a segmentation fault when I ran REINDEX CONCURRENTLY. The test case to reproduce the segmentation fault is: 1. Install btree_gist 2. Run btree_gist's regression test (i.e., make installcheck) 3. Log in contrib_regression database after the regression test 4. Execute REINDEX TABLE CONCURRENTLY moneytmp Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist
2013/3/8 Josh Kupershmidt schmi...@gmail.com: On Fri, Mar 8, 2013 at 2:23 AM, Pavel Stehule pavel.steh...@gmail.com wrote: 2013/3/8 Josh Kupershmidt schmi...@gmail.com: Cool. I think it would also be useful to check that --clean may only be used with --format=p to avoid any confusion there. (This issue could be addressed in a separate patch if you'd rather not lard this patch.) no, some people - like us in our company - would like to use this feature for quiet and strict mode for plain text format too. Enabling this feature has zero overhead so there is no reason to block it anywhere. I'm not sure I understand what you're getting at, but maybe my proposal wasn't so clear. Right now, you can specify --clean along with -Fc to pg_dump, and pg_dump will not complain even though this combination is nonsense. I am proposing that pg_dump error out in this case, i.e. $ pg_dump -Fc --file=test.dump --clean -d test pg_dump: option --clean only valid with plain format dump Although this lack of an error is (IMO) a misfeature of existing pg_dump, so if you'd rather leave this issue aside for your patch, that is fine. I'll see - please stay tuned for the first 9.4 commitfest. Regards Pavel Josh -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [DOCS] Contrib module xml2 status
On Fri, Feb 22, 2013 at 08:25:31AM +0900, Ian Lawrence Barwick wrote: The point is we can remove the module when someone fixes and replaces the functionality that's left in there that some people rely on. So far nobody has stepped up to the plate, although now that we have LATERAL, a sane replacement for xpath_table might well be a lot easier to achieve. If someone is interested in working on this I'd be happy to hear about it. Maybe it would be a good Google SOC project. It might be worth adding an explicit entry in the TODO list for removing this and summarising what needs to be done. https://wiki.postgresql.org/wiki/Todo#XML I think this is already covered in this TODO item: Restructure XML and /contrib/xml2 functionality http://archives.postgresql.org/pgsql-hackers/2011-02/msg02314.php http://archives.postgresql.org/pgsql-hackers/2011-03/msg00017.php -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
I also see the checksum patch is taking a beating. I wanted to step back and ask what percentage of known corruption cases will this checksum patch detect? I'm pretty sure that early on Jeff posted some statistics which indicated that the current approach would detect 99% of corruption introduced at the PostgreSQL, filesystem, or storage layer, and a significant but minority amount of the corruption introduced through bad RAM (this is harder to detect, and FS checksums don't catch it either). What percentage of these corruptions would filesystem checksums have detected? In what way is that relevant? Given that there were already a couple dozen posts establishing that FS checksums are not adequate, please don't bring this up again. Also, don't all modern storage drives have built-in checksums, and report problems to the system administrator? Does smartctl help report storage corruption? To date, there are no useful tools which would detect user-level file corruption using these. Not that there couldn't theoretically be, but such tools appearing in enterprise OSes is at least several years away. Let me take a guess at answering this --- we have several layers in a database server: 1 storage 2 storage controller 3 file system 4 RAM 5 CPU My guess is that storage checksums only cover layer 1, while our patch covers layers 1-3, and probably not 4-5 because we only compute the checksum on write. You're forgetting two other major causes: * PostgreSQL bugs * operator error If that is correct, the open question is what percentage of corruption happens in layers 1-3? The majority. I don't know that anyone has done an industry survey to determine this, but out of the cases of Postgres corruption we've had to deal with for clients, only one was the result of bad RAM. I have never seen corruption caused by a CPU bug.
The rest have been caused by: * operator error * postgres bugs * bad controller/driver * bad disk * filesystem bug Further, the solution for bad RAM is fairly easy: use ECC RAM, and make sure that the syslog goes to some real person. ECC RAM is pretty good at detecting its own errors. There's also another use case people have not been discussing, which is the technical validation use case. To give you an example: We had a client who had a server device running on FreeBSD/UFS. In 2009, they upgraded the device spec, including new storage and a new version of PostgreSQL. Their customers began filing corruption bug reports. After some examination of the systems involved, we concluded that the issue was the FreeBSD drivers for the new storage, which were unstable and had custom source patches. However, without PostgreSQL checksums, we couldn't *prove* it wasn't PostgreSQL at fault. It ended up taking weeks of testing, most of which was useless, to prove to them they had a driver problem so it could be fixed. If Postgres had had checksums, we could have avoided wasting a couple of weeks looking for non-existent PostgreSQL bugs. In any large enterprise with dozens to hundreds of PostgreSQL servers, PostgreSQL, the OS/FS, and the hardware are going to be run by 3 different teams. When corruption occurs, the DBAs need to be able to demonstrate that the corruption is not in the DBMS, in order to get the other teams to hunt corruption bugs on their own layers. Also, I'm kinda embarrassed that, at this point, InnoDB has checksums and we don't. :-( -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
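The layer argument above (a checksum computed at write time covers storage, controller, and filesystem, but not RAM or CPU) can be demonstrated with a toy page checksum. This is an illustrative Fletcher-style sum, not the algorithm used in the proposed patch; PAGE_SIZE and page_checksum are names invented for the sketch.

```c
/* Toy sketch (NOT PostgreSQL's actual checksum algorithm) of why a
 * checksum computed at write time covers the storage stack but not
 * RAM or CPU: corruption that happens before the checksum is computed
 * gets checksummed along with the damage, so verification passes. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192

/* Fletcher-style sum over a page buffer; illustrative only. */
static uint16_t page_checksum(const uint8_t *page)
{
    uint32_t a = 0, b = 0;

    for (int i = 0; i < PAGE_SIZE; i++)
    {
        a = (a + page[i]) % 255;
        b = (b + a) % 255;
    }
    return (uint16_t) ((b << 8) | a);
}
```

Corruption that occurs before the checksum is computed is baked into the stored value and sails through verification on read-back, which is exactly why layers 4-5 stay uncovered by a write-time scheme.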
Re: [HACKERS] Identity projection
On 12.02.2013 11:03, Amit Kapila wrote:

+ /*
+  * equivalent_tlists
+  *	  returns whether two target lists are equivalent
+  *
+  * We consider two target lists equivalent if both have
+  * only Var entries and resjunk of each target entry is same.
+  *
+  * This function is used to decide whether to create a result node.
+  * We need to ensure that each entry is Var as below node will not be
+  * projection capable, so it will not be able to compute expressions.
+  */
+ bool
+ equivalent_tlists(List *tlist1, List *tlist2)
+ {
+ 	ListCell   *lp,
+ 			   *lc;
+
+ 	if (list_length(tlist1) != list_length(tlist2))
+ 		return false;		/* tlists not same length */
+
+ 	forboth(lp, tlist1, lc, tlist2)
+ 	{
+ 		TargetEntry *tle1 = (TargetEntry *) lfirst(lp);
+ 		TargetEntry *tle2 = (TargetEntry *) lfirst(lc);
+
+ 		if (tle1->resjunk != tle2->resjunk)
+ 			return false;	/* tlist doesn't match junk status */
+
+ 		if (tle1->expr && IsA(tle1->expr, Var) &&
+ 			tle2->expr && IsA(tle2->expr, Var))
+ 		{
+ 			Var		   *var1 = (Var *) tle1->expr;
+ 			Var		   *var2 = (Var *) tle2->expr;
+
+ 			if (var1->varattno != var2->varattno)
+ 				return false;	/* different order */
+ 		}
+ 		else
+ 			return false;
+ 	}
+ 	return true;
+ }

Hmm, shouldn't that also check Var.varno and Var.varlevelsup? I tried really hard to come up with a query that would fail because of that, but I wasn't able to find one. Nevertheless, seems like those fields should be checked. On a more trivial note, equivalent_tlists() seems too generic for this. I'd expect two tlists that contain e.g. an identical Const node to be equivalent, but this returns false for it. I'd suggest calling it something like tlist_needs_projection() instead. Also, it probably belongs in tlist.c. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFRESH MATERIALIZED VIEW locklevel
Andres, Crossing this over to pgsql-advocacy, because this is really an Advocacy discussion. The point is that a) refreshing is the only way to update materialized views. There's no incremental support. b) refreshing will take a long time (otherwise you wouldn't have created a materialized view) and you can't use the view during that. Which means for anyone wanting to use matviews in a busy environment you will need to build the new matview separately and then move it into place via renames. With all the issues that brings like needing to recreate dependent views and such. There's a lot of shops which currently have matviews which are refreshed daily, during low-activity periods. I consult for several. While concurrent refresh will make MVs much more useful for shops with a tighter refresh cycle, what Kevin has developed is useful *to me* immediately. It allows me to cut dozens to hundreds of lines of application code and replace it with a simple declaration and a Refresh cron job. Sorry, but that's not very useful except (and there very much so) as a stepping stone for further work. What you're saying is MVs aren't useful *to me* in their current state, therefore they aren't useful, therefore they're a non-feature. Well, the 9.3 version is useful to *me*, and I expect that they will be useful to a large number of other people, even if they don't help *you*. As a parallel feature, 9.2's cascading replication is completely useless to me and my clients because streaming-only remastering isn't supported. I expressed the opinion that maybe we shouldn't promote cascade rep as a major feature until it was; I was outvoted, because it turns out that 9.2 cascade rep *is* useful to a large number of people who are willing to work around its current limitations. Further, we get pretty much one and only one chance to promote a new major feature, which is when that feature is first introduced.
Improving the feature in the next version of Postgres is not news, so we can't successfully promote it. If we soft-pedal MVs in the 9.3 announcement, we will not be able to get people excited about them in 9.4; they will be yesterday's news. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFRESH MATERIALIZED VIEW locklevel
On Fri, Mar 8, 2013 at 11:59 AM, Josh Berkus j...@agliodbs.com wrote: Andres, Crossing this over to pgsql-advocacy, because this is really an Advocacy discussion. The point is that a) refreshing is the only way to update materialized views. There's no incremental support. b) refreshing will take a long time (otherwise you wouldn't have created a materialized view) and you can't use the view during that. Which means for anyone wanting to use matviews in a busy environment you will need to build the new matview separately and then move it into place via renames. With all the issues that brings like needing to recreate dependent views and such. There's a lot of shops which currently have matviews which are refreshed daily, during low-activity periods. I consult for several. While concurrent refresh will make MVs much more useful for shops with a tighter refresh cycle, what Kevin has developed is useful *to me* immediately. It allows me to cut dozens to hundreds of lines of application code and replace it with a simple declaration and a Refresh cron job. Sorry, but that's not very useful except (and there very much so) as a stepping stone for further work. What you're saying is MVs aren't useful *to me* in their current state, therefore they aren't useful, therefore they're a non-feature. Well, the 9.3 version is useful to *me*, and I expect that they will be useful to a large number of other people, even if they don't help *you*. As a parallel feature, 9.2's cascading replication is completely useless to me and my clients because streaming-only remastering isn't supported. I expressed the opinion that maybe we shouldn't promote cascade rep as a major feature until it was; I was outvoted, because it turns out that 9.2 cascade rep *is* useful to a large number of people who are willing to work around its current limitations. Further, we get pretty much one and only one chance to promote a new major feature, which is when that feature is first introduced.
Improving the feature in the next version of Postgres is not news, so we can't successfully promote it. If we soft-pedal MVs in the 9.3 announcement, we will not be able to get people excited about them in 9.4; they will be yesterday's news. +1 on this. They are useful to me immediately, and I work in busy environments. The formal matview feature is a drop-in replacement for my ad hoc implementation of 'drop cache table, replace from view'. I already have to work around the locking issue anyway -- sure, it would be great if I didn't have to do that either but I'll take the huge syntactical convenience alone. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] REFRESH MATERIALIZED VIEW locklevel
On 03/08/2013 10:09 AM, Merlin Moncure wrote: On Fri, Mar 8, 2013 at 11:59 AM, Josh Berkus j...@agliodbs.com wrote: Andres, Further, we get pretty much one and only one chance to promote a new major feature, which is when that feature is first introduced. Improving the feature in the next version of Postgres is not news, so we can't successfully promote it. If we soft-pedal MVs in the 9.3 announcement, we will not be able to get people excited about them in 9.4; they will be yesterday's news. +1 on this. They are useful to me immediately, and I work in busy environments. The formal matview feature is a drop-in replacement for my ad hoc implementation of 'drop cache table, replace from view'. I already have to work around the locking issue anyway -- sure, it would be great if I didn't have to do that either but I'll take the huge syntactical convenience alone. Just to throw my +1 into the ring. Well written, Josh. JD merlin -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC @cmdpromptinc - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
On Mon, Feb 25, 2013 at 02:31:21PM +0000, Peter Geoghegan wrote: On 25 February 2013 11:49, Robert Haas robertmh...@gmail.com wrote: I did attempt to do some tinkering with this while I was playing with it, but I didn't come up with anything really compelling. You can reduce the number of comparisons on particular workloads by tinkering with the algorithm, but then somebody else ends up doing more comparisons, so it's hard to say whether you've really made things better. Or at least I found it so. Right. To be honest, the real reason that it bothers me is that everything else that our qsort routine does that differs from classic quicksort (mostly quadratic insurance, like the median-of-medians pivot selection, but also the fallback to insertion sort when n < 7) is very well supported by peer-reviewed research. Like Tom, I find it implausible that Sedgewick and others missed a trick, where we did not, particularly with something so simple. Perhaps we are more likely to be fed sorted data than a typical qsort use case. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
-Original Message- From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Bruce Momjian Sent: Friday, March 08, 2013 11:22 AM To: Peter Geoghegan Cc: Robert Haas; Tom Lane; PG Hackers Subject: Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants? On Mon, Feb 25, 2013 at 02:31:21PM +0000, Peter Geoghegan wrote: On 25 February 2013 11:49, Robert Haas robertmh...@gmail.com wrote: I did attempt to do some tinkering with this while I was playing with it, but I didn't come up with anything really compelling. You can reduce the number of comparisons on particular workloads by tinkering with the algorithm, but then somebody else ends up doing more comparisons, so it's hard to say whether you've really made things better. Or at least I found it so. Right. To be honest, the real reason that it bothers me is that everything else that our qsort routine does that differs from classic quicksort (mostly quadratic insurance, like the median-of-medians pivot selection, but also the fallback to insertion sort when n < 7) is very well supported by peer-reviewed research. Like Tom, I find it implausible that Sedgewick and others missed a trick, where we did not, particularly with something so simple. Perhaps we are more likely to be fed sorted data than a typical qsort use case. Checking for pre-sorted input will not make the routine faster on average. However, it prevents a common perverse case where the runtime goes quadratic, so sorting 10^6 elements will take K*10^12 operations when the bad condition does occur. Checking for pre-sorted data can be thought of as an insurance policy. It's kind of a pain to pay the premiums, but you sure are glad you have it when an accident occurs. Because the check is linear, and the algorithm is O(n*log(n)), the cost is not terribly significant. Switching to insertion sort for small partitions is almost always a good idea.
This idea was invented by Richard Singleton as ACM algorithm 347. A different sort of safeguard, as compared to checking for pre-ordered data, is to watch recursion depth. If we find that we have already gone past a depth of log2(n) levels and we are not ready to shift to insertion sort, then there is some sort of perverse data set. A traditional fix is to switch to heap sort at this point. Since heap sort is also O(n*log(n)) {though the constant multiplier is larger than for quicksort on average} you never get O(n*n) behavior. This method is called introspective sort. There are, of course, other clever things that can be tried. The relaxed weak heapsort of Edelkamp and Steigler, for instance, or using tries as in Cache-Efficient String Sorting Using Copying by Ranjan Sinha. There is also possibly significant benefit to using radix sort for fixed width data that can be easily binned. I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
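The two safeguards Dann describes (a linear pre-sorted check, and an introsort-style depth limit that falls back to heapsort so the worst case stays O(n*log(n))) can be combined in a toy sort. This sketch is not PostgreSQL's qsort: it sorts plain ints, the pivot choice is deliberately naive to keep it short, and all the names (is_presorted, intro_sort, and so on) are invented here.

```c
/* Toy combination of the two safeguards discussed above: a linear
 * pre-sorted check and a depth-limited quicksort that degrades to
 * heapsort on perverse input (introspective sort).  Illustrative
 * only; not PostgreSQL's actual qsort implementation. */
#include <assert.h>
#include <stddef.h>

static int is_presorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}

static void swap_int(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* restore the max-heap property below index `root`, heap ends at `end` */
static void sift_down(int *a, size_t root, size_t end)
{
    while (2 * root + 1 < end)
    {
        size_t child = 2 * root + 1;

        if (child + 1 < end && a[child] < a[child + 1])
            child++;
        if (a[root] >= a[child])
            return;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

static void heap_sort(int *a, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )   /* build max-heap */
        sift_down(a, i, n);
    for (size_t end = n; end > 1; )     /* repeatedly extract max */
    {
        swap_int(&a[0], &a[--end]);
        sift_down(a, 0, end);
    }
}

static void intro_rec(int *a, size_t n, int depth)
{
    if (n < 2)
        return;
    if (depth <= 0)                     /* perverse input: bound the cost */
    {
        heap_sort(a, n);
        return;
    }
    /* Lomuto partition with a last-element pivot; a real implementation
     * would use median-of-three or better pivot selection. */
    {
        int     pivot = a[n - 1];
        size_t  i = 0;

        for (size_t j = 0; j + 1 < n; j++)
            if (a[j] < pivot)
                swap_int(&a[i++], &a[j]);
        swap_int(&a[i], &a[n - 1]);
        intro_rec(a, i, depth - 1);
        intro_rec(a + i + 1, n - i - 1, depth - 1);
    }
}

static void intro_sort(int *a, size_t n)
{
    if (is_presorted(a, n))             /* the linear insurance check */
        return;
    {
        int depth = 2;

        for (size_t m = n; m > 1; m >>= 1)
            depth += 2;                 /* roughly 2*log2(n) levels */
        intro_rec(a, n, depth);
    }
}
```

The pre-sorted check pays a linear premium on every call but short-circuits the common already-ordered case, while the depth limit guarantees the quadratic case never materializes, which is the insurance-policy framing in the message above.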
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
On Fri, Mar 8, 2013 at 07:43:10PM +0000, Dann Corbit wrote: I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests. Well, that's good to hear. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
On 8 March 2013 11:48, Bruce Momjian br...@momjian.us wrote: On Fri, Mar 8, 2013 at 07:43:10PM +, Dann Corbit wrote: I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests. Well, that's good to hear. I wouldn't mind taking a look at that. We're not the only ones that use (more or less) that same algorithm. I noticed that Redis does so too, just for example (though they didn't get wise to the problems with the swap_cnt pessimisation). -- Regards, Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
-Original Message- From: Peter Geoghegan [mailto:peter.geoghega...@gmail.com] Sent: Friday, March 08, 2013 12:00 PM To: Bruce Momjian Cc: Dann Corbit; Robert Haas; Tom Lane; PG Hackers Subject: Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants? On 8 March 2013 11:48, Bruce Momjian br...@momjian.us wrote: On Fri, Mar 8, 2013 at 07:43:10PM +0000, Dann Corbit wrote: I seem to recall that a year or two back some study was done on quicksort methodology as used in PostgreSQL. As I recall, the algorithm used in PostgreSQL fared well in the tests. Well, that's good to hear. I wouldn't mind taking a look at that. We're not the only ones that use (more or less) that same algorithm. I noticed that Redis does so too, just for example (though they didn't get wise to the problems with the swap_cnt pessimisation). Turns out, it was a lot longer ago than I thought (2005!). The thread was called "Which qsort is used", but I also seem to recall that the topic was revisited a few times. See, for instance: http://www.postgresql.org/message-id/pine.lnx.4.58.0512121138080.18...@eon.cs http://www.cs.toronto.edu/~zhouqq/postgresql/sort/sort.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicate JSON Object Keys
On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. We could add functions like json_is_non_stupid(json) so that people can easily add a CHECK constraint that enforces this if they so desire. But enforcing it categorically seems like a bad plan, especially since at this point it would require a compatibility break with previous releases. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
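A CHECK-constraint helper along the lines of the json_is_non_stupid() idea Robert floats would need to track the keys seen so far while walking an object. The sketch below is a toy detector over a deliberately simplified flat JSON object (string keys and string values only, no escapes, no nesting); a real helper would sit on top of a proper JSON parser, and the name has_duplicate_keys is invented here.

```c
/* Toy duplicate-key detector for a *simplified* flat JSON object:
 * string keys and string values only, no escape sequences, no nested
 * objects or arrays.  Illustrates the key-tracking a real helper
 * (built on an actual JSON parser) would need to do. */
#include <assert.h>
#include <string.h>

#define MAX_KEYS   32
#define MAX_KEYLEN 64

static int has_duplicate_keys(const char *json)
{
    char        keys[MAX_KEYS][MAX_KEYLEN];
    int         nkeys = 0;
    const char *p = strchr(json, '{');

    if (p == NULL)
        return 0;
    p++;
    while (*p != '\0' && *p != '}')
    {
        const char *start, *end;
        size_t      len;

        start = strchr(p, '"');         /* opening quote of the key */
        if (start == NULL)
            break;
        end = strchr(start + 1, '"');   /* closing quote (no escapes) */
        if (end == NULL)
            break;
        len = (size_t) (end - start - 1);
        if (nkeys < MAX_KEYS && len < MAX_KEYLEN)
        {
            memcpy(keys[nkeys], start + 1, len);
            keys[nkeys][len] = '\0';
            for (int i = 0; i < nkeys; i++)
                if (strcmp(keys[i], keys[nkeys]) == 0)
                    return 1;           /* duplicate key found */
            nkeys++;
        }
        p = strchr(end + 1, ',');       /* skip past this pair's value */
        if (p == NULL)
            break;
        p++;
    }
    return 0;
}
```

Even this stripped-down version shows the cost Robert alludes to: every key must be remembered and compared against all earlier keys, which is the extra memory and time that pure syntactic validation avoids.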
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 09:39 PM, Robert Haas wrote: On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. this sounds very much like MySQLs decision to support timestamp -00-00 00:00 - syntactically correct, but semantically wrong. We could add functions like json_is_non_stupid(json) so that people can easily add a CHECK constraint that enforces this if they so desire. But enforcing it categorically seems like a bad plan, especially since at this point it would require a compatibility break with previous releases If we ever will support real spec-compliant JSON (maybe based on recursive hstore ?) then there will be a compatibility break anyway, so why not do it now. Or do you seriously believe that somebody is using PostgreSQL JSON to store these kind of json documents Cheers Hannu Krosing -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicate JSON Object Keys
Hannu Krosing escribió: On 03/08/2013 09:39 PM, Robert Haas wrote: On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. This sounds very much like MySQL's decision to support the timestamp 0000-00-00 00:00 - syntactically correct, but semantically wrong. Is it wrong? The standard cited says SHOULD, not MUST. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Duplicate JSON Object Keys
On Mar 8, 2013, at 1:01 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. This sounds very much like MySQL's decision to support the timestamp 0000-00-00 00:00 - syntactically correct, but semantically wrong. Is it wrong? The standard cited says SHOULD, not MUST. Yes, it is wrong, because multiple keys are specifically disallowed for accessing values. Hence this new error:

david=# select json_get('{"foo": 1, "foo": 2}', 'foo');
ERROR: field name is not unique in json object

I really don't think that should be possible. Best, David
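[Editor's aside: the uniqueness check being debated is easy to prototype outside the backend. This is a hedged sketch in Python, purely illustrative - it is not PostgreSQL's parser, and `reject_duplicates` is a made-up name - showing how a parser hook can refuse duplicate keys while plain parsing silently accepts them, last key winning as in JavaScript.]

```python
import json

def reject_duplicates(pairs):
    # Build the object ourselves so a repeated key can be caught,
    # mirroring an error like "field name is not unique in json object".
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError("field name is not unique in json object: %r" % key)
        obj[key] = value
    return obj

# Plain parsing accepts the duplicate; the last key wins:
print(json.loads('{"foo": 1, "foo": 2}'))  # {'foo': 2}

# With the hook, the same text is rejected:
try:
    json.loads('{"foo": 1, "foo": 2}', object_pairs_hook=reject_duplicates)
except ValueError as e:
    print(e)
```

The hook approach is roughly what a `json_is_non_stupid()`-style CHECK-constraint helper would have to do: one extra dictionary probe per key during the parse.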
Re: [HACKERS] Duplicate JSON Object Keys
Well I would much prefer to find out sooner rather than later that there is a problem, so I would much prefer know I've created a duplicate as soon as the system can detect it. In general, Postgresql appears much better at this than MySQL On 09/03/13 10:01, Alvaro Herrera wrote: Hannu Krosing escribió: On 03/08/2013 09:39 PM, Robert Haas wrote: On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. this sounds very much like MySQLs decision to support timestamp -00-00 00:00 - syntactically correct, but semantically wrong. Is it wrong? The standard cited says SHOULD, not MUST. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 04:01 PM, Alvaro Herrera wrote: Hannu Krosing escribió: On 03/08/2013 09:39 PM, Robert Haas wrote: On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. this sounds very much like MySQLs decision to support timestamp -00-00 00:00 - syntactically correct, but semantically wrong. Is it wrong? The standard cited says SHOULD, not MUST. Here's what rfc2119 says about that wording: 4. SHOULD NOT This phrase, or the phrase NOT RECOMMENDED mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label. So we're allowed to do as Robert chose, and I think there are good reasons for doing so (apart from anything else, checking it would slow down the parser enormously). Now you could argue that in that case the extractor functions should allow it too, and it's probably fairly easy to change them to allow it. In that case we need to decide who wins. We could treat a later field lexically as overriding an earlier field of the same name, which I think is what David expected. 
That's what plv8 does (i.e. it's how v8 interprets JSON):

andrew=# create or replace function jget(t json, fld text) returns text language plv8 as ' return t[fld]; ';
CREATE FUNCTION
andrew=# select jget('{"f1":"x","f1":"y"}','f1');
 jget
------
 y
(1 row)

Or you could take the view I originally took, that in view of the RFC wording we should raise an error if this was found. I can live with either view. cheers andrew
Re: [HACKERS] Duplicate JSON Object Keys
On Mar 8, 2013, at 1:21 PM, Andrew Dunstan and...@dunslane.net wrote: Here's what rfc2119 says about that wording: 4. SHOULD NOT This phrase, or the phrase NOT RECOMMENDED mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label. I suspect this was allowed for the JavaScript behavior where multiple keys are allowed, but the last key in the list wins. So we're allowed to do as Robert chose, and I think there are good reasons for doing so (apart from anything else, checking it would slow down the parser enormously). Yes, but the implications are going to start biting us on the ass now. Now you could argue that in that case the extractor functions should allow it too, and it's probably fairly easy to change them to allow it. In that case we need to decide who wins. We could treat a later field lexically as overriding an earlier field of the same name, which I think is what David expected. That's what plv8 does (i.e. it's how v8 interprets JSON): andrew=# create or replace function jget(t json, fld text) returns text language plv8 as ' return t[fld]; '; CREATE FUNCTION andrew=# select jget('{f1:x,f1:y}','f1'); jget -- y (1 row) Or you could take the view I originally took that in view of the RFC wording we should raise an error if this was found. I can live with either view. I’m on the fence. On the one hand, I like the plv8 behavior, which is nice for a dynamic language. On the other hand, I don't much care for it in my database, where I want data storage requirements to be quite strict. I hate the idea of -00-00 being allowed as a date, and am uncomfortable with allowing duplicate keys to be stored in the JSON data type. So my order of preference for the options would be: 1. 
Have the JSON type collapse objects so the last instance of a key wins and is actually stored
2. Throw an error when a JSON type has duplicate keys
3. Have the accessors find the last instance of a key and return that value
4. Let things remain as they are now

On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. Best, David
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 10:01 PM, Alvaro Herrera wrote: Hannu Krosing escribió: On 03/08/2013 09:39 PM, Robert Haas wrote: On Thu, Mar 7, 2013 at 2:48 PM, David E. Wheeler da...@justatheory.com wrote: In the spirit of being liberal about what we accept but strict about what we store, it seems to me that JSON object key uniqueness should be enforced either by throwing an error on duplicate keys, or by flattening so that the latest key wins (as happens in JavaScript). I realize that tracking keys will slow parsing down, and potentially make it more memory-intensive, but such is the price for correctness. I'm with Andrew. That's a rathole I emphatically don't want to go down. I wrote this code originally, and I had the thought clearly in mind that I wanted to accept JSON that was syntactically well-formed, not JSON that met certain semantic constraints. If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. This sounds very much like MySQL's decision to support the timestamp 0000-00-00 00:00 - syntactically correct, but semantically wrong. Is it wrong? The standard cited says SHOULD, not MUST. I think one MAY start implementation with a loose interpretation of SHOULD, but if at all possible we SHOULD implement the SHOULD-qualified features :) http://www.ietf.org/rfc/rfc2119.txt: SHOULD This word, or the adjective RECOMMENDED, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. We might start with just throwing a warning for duplicate keys, but I can see no good reason to do so, except ease of implementation and performance with the current JSON-as-text implementation. And providing a boolean function is_really_json_object(json) to be used in check constraints seems plain weird.
Otoh, as the spec defines JSON as being designed to be a subset of JavaScript, it SHOULD accept select '{"foo": 1, "foo": 2}'::json; but turn it into '{"foo": 2}'::json; for storage. I do not think it would be a good idea to leave it to data extraction functions to always get the last value for "foo" (in this case 2). -- Hannu
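[Editor's aside: the accept-then-collapse behavior Hannu proposes can be illustrated with a tiny sketch. Python here is purely for illustration - `normalize` is a hypothetical name, and this is not how the json type is implemented - but parsing and re-serializing is exactly the normalization that makes the last key win at storage time.]

```python
import json

def normalize(text):
    # Parse and re-serialize: duplicate keys collapse and the last one
    # wins, the way v8/JavaScript interprets repeated keys.
    return json.dumps(json.loads(text))

print(normalize('{"foo": 1, "foo": 2}'))  # {"foo": 2}
```

A function with this shape is also what the json_normalize() TODO item suggested later in the thread would boil down to.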
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 04:28 PM, David E. Wheeler wrote: On Mar 8, 2013, at 1:21 PM, Andrew Dunstan and...@dunslane.net wrote: Here's what rfc2119 says about that wording: 4. SHOULD NOT This phrase, or the phrase NOT RECOMMENDED mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label. I suspect this was allowed for the JavaScript behavior where multiple keys are allowed, but the last key in the list wins. So we're allowed to do as Robert chose, and I think there are good reasons for doing so (apart from anything else, checking it would slow down the parser enormously). Yes, but the implications are going to start biting us on the ass now. Now you could argue that in that case the extractor functions should allow it too, and it's probably fairly easy to change them to allow it. In that case we need to decide who wins. We could treat a later field lexically as overriding an earlier field of the same name, which I think is what David expected. That's what plv8 does (i.e. it's how v8 interprets JSON): andrew=# create or replace function jget(t json, fld text) returns text language plv8 as ' return t[fld]; '; CREATE FUNCTION andrew=# select jget('{f1:x,f1:y}','f1'); jget -- y (1 row) Or you could take the view I originally took that in view of the RFC wording we should raise an error if this was found. I can live with either view. I’m on the fence. On the one hand, I like the plv8 behavior, which is nice for a dynamic language. On the other hand, I don't much care for it in my database, where I want data storage requirements to be quite strict. I hate the idea of -00-00 being allowed as a date, and am uncomfortable with allowing duplicate keys to be stored in the JSON data type. So my order of preference for the options would be: 1. 
Have the JSON type collapse objects so the last instance of a key wins and is actually stored
2. Throw an error when a JSON type has duplicate keys
3. Have the accessors find the last instance of a key and return that value
4. Let things remain as they are now

On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. cheers andrew
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 10:42 PM, Andrew Dunstan wrote: On 03/08/2013 04:28 PM, David E. Wheeler wrote: On Mar 8, 2013, at 1:21 PM, Andrew Dunstan and...@dunslane.net wrote: Here's what rfc2119 says about that wording: 4. SHOULD NOT This phrase, or the phrase NOT RECOMMENDED mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label. I suspect this was allowed for the JavaScript behavior where multiple keys are allowed, but the last key in the list wins. So we're allowed to do as Robert chose, and I think there are good reasons for doing so (apart from anything else, checking it would slow down the parser enormously). Yes, but the implications are going to start biting us on the ass now. Now you could argue that in that case the extractor functions should allow it too, and it's probably fairly easy to change them to allow it. In that case we need to decide who wins. We could treat a later field lexically as overriding an earlier field of the same name, which I think is what David expected. That's what plv8 does (i.e. it's how v8 interprets JSON): andrew=# create or replace function jget(t json, fld text) returns text language plv8 as ' return t[fld]; '; CREATE FUNCTION andrew=# select jget('{f1:x,f1:y}','f1'); jget -- y (1 row) Or you could take the view I originally took that in view of the RFC wording we should raise an error if this was found. I can live with either view. I’m on the fence. On the one hand, I like the plv8 behavior, which is nice for a dynamic language. On the other hand, I don't much care for it in my database, where I want data storage requirements to be quite strict. I hate the idea of -00-00 being allowed as a date, and am uncomfortable with allowing duplicate keys to be stored in the JSON data type. 
So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Ok, so the "make valid" part will have to wait for http://www.pgcon.org/2013/schedule/events/518.en.html if this will ever happen ;) Which means that all extractor functions will need to do much more work in case of complex json. Think of json_get('{"a":{"b":1}, "a":{"1":"x"}, "a":[1,{"b":7}]}'::json, ['a','1','b']): the true value in JavaScript is 7, but here the json_get function has several options to error out early, and real confusion as to where to report the error if in the end it is not found. Essentially all extractor functions have to do what we omitted doing at input time. Cheers Hannu
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 04:42 PM, Andrew Dunstan wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Actually, now that I think more about it, 3 is the best answer. Here's why: even the JSON generators can produce JSON with non-unique field names:

andrew=# select row_to_json(q) from (select x as a, y as a from generate_series(1,2) x, generate_series(3,4) y) q;
  row_to_json
---------------
 {"a":1,"a":3}
 {"a":1,"a":4}
 {"a":2,"a":3}
 {"a":2,"a":4}

So I think we have no option but to say, in terms of rfc 2119, that we have carefully considered and decided not to comply with the RFC's recommendation (and we should note that in the docs). cheers andrew
[HACKERS] pg_dump selectively ignores extension configuration tables
(reposting; apparently I used a verboten word the first time around, sigh. Sorry for any duplicates) The -t and -n options of pg_dump do not dump anything from an extension configuration table, whereas normal pg_dump will dump the user data. To see what I mean, in psql do (tested on pg9.2):

8<--
create extension postgis;
insert into spatial_ref_sys values(42,'test',42,'foo','bar');
8<--

Then in bash do:

8<--
pg_dump test|grep spatial_ref_sys
pg_dump -t spatial_ref_sys test|grep spatial_ref_sys
pg_dump -n public test|grep spatial_ref_sys
8<--

Is this intentional, or oversight, or missing feature? Thanks, Joe -- Joe Conway credativ LLC: http://www.credativ.us Linux, PostgreSQL, and general Open Source Training, Service, Consulting, 24x7 Support
Re: [HACKERS] Duplicate JSON Object Keys
Actually, now I think more about it 3 is the best answer. Here's why: even the JSON generators can produce JSON with non-unique field names: +1 Also, I think we should add a json_normalize() function to the TODO list. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] scanner/parser minimization
On Thu, Feb 28, 2013 at 04:09:11PM -0500, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: A whole lot of those state transitions are attributable to states which have separate transitions for each of many keywords. Yeah, that's no surprise. The idea that's been in the back of my mind for awhile is to try to solve the problem at the lexer level, not the parser level: that is, have the lexer return IDENT whenever a keyword appears in a context where it should be interpreted as unreserved. You suggest somehow driving that off mid-rule actions, but that seems to me to be probably a nonstarter from a code complexity and maintainability standpoint. I believe however that it's possible to extract an idea of which tokens the parser believes it can see next at any given parse state. (I've seen code for this somewhere on the net, but am too lazy to go searching for it again right now.) So we could imagine a rule along I believe tokenizing of typedefs requires the lexer to peek at the parser state: http://calculist.blogspot.com/2009/02/c-typedef-parsing-problem.html The well-known typedef problem with parsing C is that the standard C grammar is ambiguous unless the lexer distinguishes identifiers bound by typedef and other identifiers as two separate lexical classes. This means that the parser needs to feed scope information to the lexer during parsing. One upshot is that lexing must be done concurrently with parsing. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. +
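[Editor's aside: the feedback loop Bruce quotes - the parser telling the lexer which identifiers are currently typedef names - can be sketched in a few lines. Illustrative Python only, not the C lexer itself; `typedef_names` and `classify` are made-up names for the shared symbol table and the lexer's token classifier.]

```python
# The classic "lexer hack" for C typedefs: the parser maintains a table of
# typedef names in scope, and the lexer consults it to choose a token class.
typedef_names = set()  # updated by the parser as it reduces typedef declarations

def classify(identifier):
    return "TYPE_NAME" if identifier in typedef_names else "IDENT"

print(classify("T"))    # IDENT: here "T * x;" would parse as a multiplication
typedef_names.add("T")  # parser just processed: typedef int T;
print(classify("T"))    # TYPE_NAME: now "T * x;" parses as a declaration
```

The same statement text lexes differently before and after the typedef is seen, which is why lexing cannot run as a separate pass ahead of parsing.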
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 11:03 PM, Andrew Dunstan wrote: On 03/08/2013 04:42 PM, Andrew Dunstan wrote: So my order of preference for the options would be: 1. Have the JSON type collapse objects so the last instance of a key wins and is actually stored 2. Throw an error when a JSON type has duplicate keys 3. Have the accessors find the last instance of a key and return that value 4. Let things remain as they are now On second thought, I don't like 4 at all. It means that the JSON type thinks a value is valid while the accessor does not. They contradict one another. You can forget 1. We are not going to have the parser collapse anything. Either the JSON it gets is valid or it's not. But the parser isn't going to try to MAKE it valid. Actually, now that I think more about it, 3 is the best answer. Here's why: even the JSON generators can produce JSON with non-unique field names: Yes, especially if you consider popular json generators vim and strcat() :) It is not a serialisation of some existing object, but it is something that JavaScript could interpret as a valid subset of JavaScript which produces a JavaScript Object when interpreted. In this sense it is way better than MySQL's timestamp 0000-00-00 00:00. So the loose (without implementing the SHOULD part) meaning of the JSON spec is "anything that can be read into JavaScript producing a JS Object", and not "a serialisation of a JavaScript Object" as I wanted to read it initially. andrew=# select row_to_json(q) from (select x as a, y as a from generate_series(1,2) x, generate_series(3,4) y) q; row_to_json --- {"a":1,"a":3} {"a":1,"a":4} {"a":2,"a":3} {"a":2,"a":4} So I think we have no option but to say, in terms of rfc 2119, that we have carefully considered and decided not to comply with the RFC's recommendation The downside is that we have just shifted the burden of JS Object generation to the getter functions.
I suspect that 99.98% of the time we will get valid and unique JS Object serializations or equivalent as input to json_in(). If we want the getter functions to handle the loose JSON to Object conversion side, assuming our stored JSON can contain non-unique keys, then these are bound to be slower, as they have to do these checks. They can't just grab the first matching one and return or recurse on that. (and we should note that in the docs). definitely +1 cheers andrew
Re: [HACKERS] Duplicate JSON Object Keys
On 03/08/2013 06:37 PM, Hannu Krosing wrote: I suspect that 99.98% of the time we will get valid and unique JS Object serializations or equivalent as input to json_in() If we want the getter functions to handle the loose JSON to Object conversion side assuming our stored JSON can contain non-unique keys then these are bound to be slower, as they have to do these checks. Thay can't just grab the first matching one and return or recurse on that. No, there will be no slowdown. The parser doesn't short circuit. Read the code. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Index Uniqueness
On Fri, Mar 8, 2013 at 10:26:21AM -0500, Robert Haas wrote: overhead seems badly overpriced for insert-only tables. These are not fundamental truths of the universe, or even of PostgreSQL; they are specific consequences of the representation we've chosen for heaps. Many of them are things that we've grown into, rather than designed with malice aforethought: for example, freezing is a consequence of the after-the-fact desire to be able to support more than 4bn transactions over the lifetime of the database. So it's way better than what we had before, and yet, if we all sat down and designed a new on-disk storage format for a new product today, I'm sure none of us would pick one that expires after 2bn transactions. One thing to remember is that our freeze level is 200M transactions because of clog lookups/size, not wraparound concerns. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Enabling Checksums
On Fri, Mar 8, 2013 at 5:46 PM, Josh Berkus j...@agliodbs.com wrote: After some examination of the systems involved, we concluded that the issue was the FreeBSD drivers for the new storage, which were unstable and had custom source patches. However, without PostgreSQL checksums, we couldn't *prove* it wasn't PostgreSQL at fault. It ended up taking weeks of testing, most of which was useless, to prove to them they had a driver problem so it could be fixed. If Postgres had had checksums, we could have avoided wasting a couple weeks looking for non-existent PostgreSQL bugs. How would Postgres checksums have proven that? A checksum failure just means *something* has gone wrong; it could still be Postgres that's done it. In fact I would hazard that checksum failures would be the way most Postgres bugs will be found at some point. Also, I'm kinda embarrassed that, at this point, InnoDB has checksums and we don't. :-( As much as it sounds silly, I think this is a valid argument. Not just InnoDB but Oracle and other databases and even other storage software. I think even if the patch doesn't get accepted this go around it'll be in the next release. Either we'll think of solutions for some of the performance bottlenecks, we'll iron out the transition so you can turn it off and on freely, or we'll just realize that people are running with the patch and life is ok even with these problems. If I understand the performance issues right, the main problem is the extra round trip to the wal log which can require a sync. Is that right? That seems like a deal breaker to me. I would think a 0-10% i/o bandwidth or cpu bandwidth penalty would be acceptable, but an extra rotational latency even just on some transactions would be a real killer. -- greg
Re: [HACKERS] Why do we still perform a check for pre-sorted input within qsort variants?
On Fri, Mar 8, 2013 at 7:43 PM, Dann Corbit dcor...@connx.com wrote: Checking for pre-sorted input will not make the routine faster on average. However, it prevents a common perverse case where the runtime goes quadratic, so sorting 10^6 elements will take K*10^12 operations when the bad condition does occur. Checking for pre-sorted data can be thought of as an insurance policy. It's kind of a pain to pay the premiums, but you sure are glad you have it when an accident occurs. Because the check is linear, and the algorithm is O(n*log(n)), the cost is not terribly significant. Well, pre-sorted inputs are not the only quadratic case. If we really wanted to eliminate quadratic cases we could implement the pivot choice algorithm that guarantees n*log(n) worst-case. But that will increase the average run-time. If we're not doing that then I think your whole argument falls apart. We do care about the average case as well as the worst-case. There's been a *ton* of research on sorting. I find it hard to believe there isn't a pretty solid consensus on which of these defense mechanisms is the best trade-off. -- greg
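[Editor's aside: for concreteness, the "insurance policy" being debated looks roughly like this. A toy sketch in Python, not the Bentley & McIlroy qsort PostgreSQL actually uses: one linear scan up front short-circuits the already-sorted case, while the naive middle-element pivot leaves other adversarial inputs quadratic, which is Greg's point.]

```python
def qsort(a):
    # Linear "insurance" check: already-sorted input returns immediately,
    # turning one common quadratic case into an O(n) pass.
    if all(a[i] <= a[i + 1] for i in range(len(a) - 1)):
        return list(a)
    pivot = a[len(a) // 2]  # simple pivot choice; no worst-case guarantee
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return qsort(less) + equal + qsort(greater)

print(qsort([3, 1, 2, 3, 0]))  # [0, 1, 2, 3, 3]
```

A median-of-medians pivot would bound the worst case at O(n log n) without the check, at the cost of a slower average case - the trade-off discussed above.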
Re: [HACKERS] [pgsql-advocacy] Call for Google Summer of Code mentors, admins
Thom, I don't mind being an admin again. Can you gather together all of the projects suggested on this thread and use them to create updated text for the GSOC page? If you don't have web repo access, I can create a patch, but if you can do the text, that would be a big help. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Enabling Checksums
On 3/8/13 3:38 AM, Heikki Linnakangas wrote: See https://www.kernel.org/doc/Documentation/block/data-integrity.txt That includes an interesting comment that's along the lines of the MySQL checksum tests already mentioned: The 16-bit CRC checksum mandated by both the SCSI and SATA specs is somewhat heavy to compute in software. Benchmarks found that calculating this checksum had a significant impact on system performance for a number of workloads. Some controllers allow a lighter-weight checksum to be used when interfacing with the operating system. Emulex, for instance, supports the TCP/IP checksum instead. The TCP/IP checksum spec is at https://tools.ietf.org/html/rfc793 ; its error detection limitations are described at http://www.noahdavids.org/self_published/CRC_and_checksum.html ; and a good article about optimizing its code is at http://www.locklessinc.com/articles/tcp_checksum/ I'll take a longer look at whether it's an improvement on the Fletcher-16 used by the current patch. All of these 16-bit checksums are so much better than nothing. I don't think some shift toward prioritizing computation speed over detection rate is a problem. In the long run really sensitive 32-bit checksums will become more practical. As Heikki pointed out, the direction this whole area seems to be going is that one day you might get checksums all the way from application to hardware. That's another possible future where having some field-tested checksum feature in the database will be valuable. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
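For readers unfamiliar with the two candidates Greg is comparing, here are textbook formulations of both — a sketch in Python for illustration only, not the checksum code in the actual patch:

```python
def fletcher16(data: bytes) -> int:
    """Textbook Fletcher-16: two running sums modulo 255.

    Cheap to compute (no table lookups or bit reversal), which is
    why it is attractive relative to a 16-bit CRC.
    """
    s1 = s2 = 0
    for b in data:
        s1 = (s1 + b) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1

def internet_checksum(data: bytes) -> int:
    """RFC 793/1071-style TCP/IP checksum: ones'-complement sum
    over big-endian 16-bit words, with the carry folded back in."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold carry
    return ~total & 0xFFFF
```

A known property of the internet checksum: appending the checksum to the data and re-running the computation yields zero, which is how receivers verify it. Its weaker detection of certain error patterns (e.g., reordered words) is what the linked CRC-comparison article discusses.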
Re: [HACKERS] Duplicate JSON Object Keys
On Fri, Mar 08, 2013 at 10:34:20PM +0100, Hannu Krosing wrote: On 03/08/2013 10:01 PM, Alvaro Herrera wrote: Hannu Krosing escribió: If it does not meet these semantic constraints, then it is not really JSON - it is merely JSON-like. Is it wrong? The standard cited says SHOULD, not MUST. I think one MAY start implementation with a loose interpretation of SHOULD, but if at all possible we SHOULD implement the SHOULD-qualified features :) http://www.ietf.org/rfc/rfc2119.txt: SHOULD This word, or the adjective RECOMMENDED, mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course. That SHOULD in section 2.2 of RFC 4627 constrains JSON data, not JSON parsers. Section 4 addresses parsers, saying A JSON parser MUST accept all texts that conform to the JSON grammar. We might start with just throwing a warning for duplicate keys, but I can see no good reason to do so. Except ease of implementation, and performance with the current JSON-as-text implementation. Since its departure from a SHOULD item does not impugn the conformance of an input text, it follows that json_in(), to be a conforming JSON parser, MUST not reject objects with duplicate keys. -- Noah Misch EnterpriseDB http://www.enterprisedb.com
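Other widely used parsers already behave the way Noah describes: they accept duplicate keys rather than reject the text. Python's `json` module is a convenient illustration (it keeps the last value for a repeated key), and its `object_pairs_hook` can express the "warn but accept" behavior floated in the thread. This is a sketch of the semantics under discussion, not PostgreSQL's json_in():

```python
import json

text = '{"key": 1, "key": 2}'

# A conforming parser accepts this text; Python's json keeps the
# last value for the duplicated key.
assert json.loads(text) == {"key": 2}

def warn_on_dupes(pairs):
    """object_pairs_hook: accept duplicates but report them,
    approximating the 'throw a warning' option."""
    seen = {}
    for k, v in pairs:
        if k in seen:
            print(f"warning: duplicate key {k!r}")
        seen[k] = v
    return seen

assert json.loads(text, object_pairs_hook=warn_on_dupes) == {"key": 2}
```

Whether last-key-wins is the right choice is exactly the semantic question the thread leaves open; the MUST in RFC 4627 section 4 only settles that the text must be accepted.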
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Wed, Mar 6, 2013 at 2:08 PM, Michael Paquier michael.paqu...@gmail.com wrote: Thanks for taking time in typing a complete summary of the situation. That really helps. On Mon, Mar 4, 2013 at 11:25 AM, Greg Smith g...@2ndquadrant.com wrote: On 1/23/13 6:36 AM, Michael Paquier wrote: The only problem I see is if the same parameter is defined in recovery.conf and postgresql.conf, is the priority given to recovery.conf? This sort of thing is what dragged down the past work on this. I don't really agree with the idea this thread is based on, that it was backward compatibility that kept this from being finished last year. I put a good bit of time into a proposal that a few people seemed happy with; all its ideas seem to have washed away. That balanced a couple of goals: -Allow a pg_ctl standby and pg_ctl recover command that work similarly to promote. This should slim down the work needed for the first replication setup people do. -Make it obvious when people try to use recovery.conf that it's not supported anymore. -Provide a migration path for tool authors strictly in the form of some documentation and error message hints. That was it as far as concessions to backward compatibility. The wrap-up I did started at http://www.postgresql.org/message-id/4EE91248.8010505@2ndQuadrant.com and only had a few responses/controversy from there. Robert wrote a good summary: 1. Get rid of recovery.conf - error out if it is seen 2. For each parameter that was previously a recovery.conf parameter, make it a GUC 3. For the parameter that was "does recovery.conf exist?", replace it with "does standby.enabled exist?". Agreed on that. I thought this stopped there because no one went back to clean up Fujii's submission, which Simon and Michael have now put more time into. There is not much distance between it and the last update Michael sent. Just to be picky.
I recommend using the version dated the 23rd of January as a work base if we definitely get rid of recovery.conf; the patch file is called 20130123_recovery_unite.patch. The second patch I sent on the 27th tried to support both recovery.conf and postgresql.conf, but this finishes with a lot of duplicated code, as two sets of functions are necessary to deparse options for both postgresql.conf and recovery.conf... Here are the detailed notes from my original proposal, with updates to incorporate the main feedback I got then; note that much of this is documentation rather than code: -Creating a standby.enabled file puts the system into recovery mode. That feature needs to save some state, and making those decisions based on existence of a file is already a thing we do. Rather than emulating the rename to recovery.done that happens now, the server can just delete it, to keep from incorrectly returning to a state it's exited. A UI along the lines of the promote one, allowing pg_ctl standby, should fall out of here. This file can be relocated to the config directory, similarly to how the include directory looks for things. There was a concern that this would require write permissions that don't exist on systems that relocate configs, like Debian/Ubuntu. That doesn't look to be a real issue though. Here's a random Debian server showing the postgres user can write to all of those: $ ls -ld /etc/postgresql drwxr-xr-x 4 root root 4096 Jun 29 2012 /etc/postgresql $ ls -ld /etc/postgresql/9.1 drwxr-xr-x 3 postgres postgres 4096 Jul 1 2012 /etc/postgresql/9.1 $ ls -ld /etc/postgresql/9.1/main drwxr-xr-x 2 postgres postgres 4096 Mar 3 11:00 /etc/postgresql/9.1/main -A similar recovery.enabled file turns on PITR recovery. -It should be possible to copy a postgresql.conf file from master to standby and just use it. For example: --standby_mode = on: Ignored unless you've started the server with standby.enabled, won't bother the master if you include it.
--primary_conninfo: This will look funny on the master showing it connecting to itself, but it will get ignored there too. -If startup finds a recovery.conf file where it used to live, abort--someone is expecting the old behavior. Hint to RTFM or include a short migration guide right on the spot. That can have a nice section about how you might use the various postgresql.conf include* features if they want to continue managing those files separately. Example: rename what you used to make recovery.conf as replication.conf and use include_if_exists if you want to be able to rename it to recovery.done like before. Or drop it into a config/ directory (similarly to the proposal for SET PERSISTENT) where the rename to recovery.done will make it then skipped. (Only files ending with .conf are processed by include_dir) -Tools such as pgpool that want to write a simple configuration file, only touching the things that used to go into recovery.conf, can
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
Michael Paquier escribió: On Wed, Mar 6, 2013 at 2:08 PM, Michael Paquier michael.paqu...@gmail.com wrote: Looks good to me too. Based on the patch I already sent before, there are a couple of things missing: - There are no pg_ctl standby/recover commands implemented yet (not that difficult to add) - a certain amount of work is needed for documentation Btw, I have a concern about that: is it really the time to finish that for this release, knowing that the 9.3 feature freeze is getting close? I still don't know when it will be but it is... close. This patch is still in the current commit fest. Any objections to marking it as returned with feedback and putting it in the next commit fest? We could work on it again in the next release, as it looks like there is not much time to work on it for 9.3. If the patch is almost ready, IMHO you should complete it and post a version that implements the consensus design. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On 3/8/13 10:34 PM, Michael Paquier wrote: This patch is still in the current commit fest. Any objections to marking it as returned with feedback and putting it in the next commit fest? There are currently 20 Needs Review and 14 Waiting on Author items left in the queue, so it's not quite that there's no time left. There really isn't very much left to do on this. The rough consensus idea from before takes a while to describe, but there was not a complicated implementation in it. The overlap with the still-possible-to-commit SET PERSISTENT is probably the worst potential issue this is facing now, but that's not even a real issue yet. If you're out of time to work on it and want to back out of here having made good progress, that's fine. I'd be tempted to work on this thing myself for a bit just to try and finally get it done. If it gets punted forward, we'll be right back to facing bit rot and remembering what was going on again, which is what killed the momentum toward committing this the last time. -- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: [HACKERS] Request for vote to move forward with recovery.conf overhaul
On Sat, Mar 9, 2013 at 1:20 PM, Greg Smith g...@2ndquadrant.com wrote: There are currently 20 Needs Review and 14 Waiting on Author things left in the queue, so it's not quite that there's no time left. There really isn't very much left to do on this. The rough consensus idea from before takes a while to describe, but there was not a complicated implementation in that. The overlap with the still possible to commit SET PERSISTENT is probably the worst potential issue this is facing now, but that's not even a real issue yet. OK thanks for your feedback. If you're out of time to work on it and want to back out of here having made good progress, that's fine. I'd be tempted to work on this thing myself for a bit just to try and finally get it done. If it gets punted forward, we'll be right back to facing bit rot and remembering what was going on again, which is what killed the momentum toward committing this the last time. I think I will be able to work on that but not before Monday. This depends also on how REINDEX CONCURRENTLY goes... -- Michael
Re: [HACKERS] Fix pgstattuple/pgstatindex to use regclass-type as the argument
(2013/03/05 22:46), Robert Haas wrote: On Sun, Mar 3, 2013 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Maybe this is acceptable collateral damage. I don't know. But we definitely stand a chance of breaking applications if we change pgstatindex like this. It might be better to invent a differently-named function to replace pgstatindex. If this were a built-in function, we might have to make a painful decision between breaking backward compatibility and leaving this broken forever, but as it isn't, we don't. I think your suggestion of adding a new function is exactly right. We can remove the old one in a future release, and support both in the meantime. It strikes me that if extension versioning is for anything, this is it. It is obviously easy to keep two types of function interfaces, one with regclass-type and another with text-type, in the next release for backward compatibility, like below: pgstattuple(regclass) -- safer interface. pgstattuple(text) -- will be deprecated in a future release. Having both interfaces for a while would allow users enough time to rewrite their applications. Then, we will be able to obsolete (or just drop) the old interfaces in a future release, maybe 9.4 or 9.5. I think this approach would minimize the impact of such an interface change. So, I think we can clean up the function arguments in the pgstattuple module, and also have two interfaces, both regclass and text, for the next release. Any comments? Regards, -- Satoshi Nagayasu sn...@uptime.jp Uptime Technologies, LLC. http://www.uptime.jp