Re: [HACKERS] Reduce pinning in btree indexes
Hello, I measured the performance of this patch considering markpos/restorepos. The loss seems to be up to about 10%, unfortunately.

At Thu, 26 Feb 2015 17:49:23 + (UTC), Kevin Grittner kgri...@ymail.com wrote in 440831854.629116.1424972963735.javamail.ya...@mail.yahoo.com

> Heikki Linnakangas hlinnakan...@vmware.com wrote:
>> On 02/15/2015 02:19 AM, Kevin Grittner wrote:
>> Hmm. Did your test case actually exercise mark/restore? The memcpy()s
>> are not that small. Especially if it's an index-only scan, you're
>> copying a large portion of the page. Some scans call markpos on every
>> tuple, so that could add up.
>
> I have results from the `make world` regression tests and a 48-hour
> customer test. Unfortunately I don't know how heavily either of those
> exercise this code. Do you have a suggestion for a way to test whether
> there is a serious regression for something approaching a worst case?

ammarkpos/amrestrpos are called in merge joins. By the steps shown below, I
had 1M times of markpos and no restorepos for 1M result rows, and had 500k
times of markpos and the same number of times of restorepos for a 2M-row
result with a slightly different configuration. I suppose we can say that
these are the worst case considering markpos/restrpos. The call counts can
be taken using the attached patch.

Both binaries were compiled with -O2. shared_buffers=1GB, and all shared
pages used in the query were hit when measuring. The numbers were taken 12
times for each configuration; I took averages and stddevs of 10 numbers,
excluding the best and the worst.

Case 1. 500k markpos/restorepos for 2M result rows.

 Index scan: The patched loses about 2%. (1.98%)
   master:  6166 ms (std dev = 3.2 ms)
   Patched: 6288 ms (std dev = 3.7 ms)

 IndexOnly scan: The patched loses about 2%. (2.14%)
   master:  5946 ms (std dev = 5.0 ms)
   Patched: 6073 ms (std dev = 3.3 ms)

The patched version is slower by about 2%. Of course not all of it is the
effect of the memcpy, but I don't know the distribution.

Case 2.
1M markpos, no restorepos for 1M result rows.

 IndexOnly scan: The patched loses about 10%.
   master:  3655 ms (std dev = 2.5 ms)
   Patched: 4038 ms (std dev = 2.6 ms)

The loss is about 10%. The case looks a bit impractical, but unfortunately
the number might not be ignorable. The distribution of the loss is unknown,
too.

regards,

===
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
delete from t1;
delete from t2;
-- This causes 1M times of markpos and no restorepos
INSERT INTO t1 (SELECT a FROM generate_series(0, 999999) a);
INSERT INTO t2 (SELECT a FROM generate_series(0, 999999) a);
-- This causes 500k times of markpos and the same number of restorepos
-- INSERT INTO t1 (SELECT a/2 FROM generate_series(0, 999999) a);
-- INSERT INTO t2 (SELECT a/2 FROM generate_series(0, 999999) a);
CREATE INDEX it1 ON t1 (a);
CREATE INDEX it2 ON t2 (a);
ANALYZE t1;
ANALYZE t2;
SET enable_seqscan to false;
SET enable_material to false;
SET enable_hashjoin to false;
SET enable_nestloop to false;
SET enable_indexonlyscan to false;  -- omit this to do an index-only scan
EXPLAIN (ANALYZE) SELECT t1.a, t2.a FROM t1 JOIN t2 on (t1.a = t2.a);

                                QUERY PLAN
--------------------------------------------------------------------------
 Merge Join  (cost=2.83..322381.82 rows=3031231 width=8)
   (actual time=0.013..5193.566 rows=2000000 loops=1)
   Merge Cond: (t1.a = t2.a)
   ->  Index Scan using it1 on t1  (cost=0.43..65681.87 rows=167 width=4)
         (actual time=0.004..727.557 rows=1000000 loops=1)
   ->  Index Scan using it2 on t2  (cost=0.43..214642.89 rows=3031231 width=4)
         (actual time=0.004..1478.361 rows=1 loops=1)
 Planning time: 25.585 ms
 Execution time: 6299.521 ms
(6 rows)

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index 3af462b..04d6cec 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -543,15 +543,19 @@ btendscan(PG_FUNCTION_ARGS)
 	PG_RETURN_VOID();
 }
 
+
 /*
  * btmarkpos() -- save current scan position
  */
+int hoge_markpos_count = 0;
+int hoge_restrpos_count = 0;
 Datum
 btmarkpos(PG_FUNCTION_ARGS)
 {
 	IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0);
 	BTScanOpaque so = (BTScanOpaque) scan->opaque;
 
+	hoge_markpos_count++;
 	/* we aren't holding any read locks, but gotta drop the pin */
 	if (BTScanPosIsValid(so->markPos))
 	{
@@ -585,7 +589,7 @@ btrestrpos(PG_FUNCTION_ARGS)
 {
 	IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0);
 	BTScanOpaque so = (BTScanOpaque) scan->opaque;
-
+	hoge_restrpos_count++;
 	/* Restore the marked positions of any array keys */
 	if (so->numArrayKeys)
 		_bt_restore_array_keys(scan);
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 33b172b..e7c1b99 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -258,14 +258,19 @@
Re: [HACKERS] Reduce pinning in btree indexes
>>>>> "Kyotaro" == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:

 Kyotaro> ammarkpos/amrestrpos are called in merge joins. By the steps
 Kyotaro> shown below, I had 1M times of markpos and no restorepos for
 Kyotaro> 1M result rows, and had 500k times of markpos and the same
 Kyotaro> number of times of restorepos for 2M rows result by a bit
 Kyotaro> different configuration. I suppose we can say that they are
 Kyotaro> the worst case considering markpos/restrpos. The call counts
 Kyotaro> can be taken using the attached patch.

You might want to try running the same test, but after patching
ExecSupportsMarkRestore to return false for index scans. This will cause
the planner to insert a Materialize node to handle the mark/restore.

This way, you can get an idea of how much gain (if any) the direct support
of mark/restore in the scan is actually providing.

-- 
Andrew (irc:RhodiumToad)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance improvement for joins where outer side is unique
On 3 February 2015 at 22:23, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote:

> Hi, I had a look at this patch. Although I haven't understood the whole
> stuff and all of the related things, I will comment as possible.

Great, thank you for taking the time to look at and review the patch.

> Performance: I looked at the performance gain this patch gives. For
> several on-memory joins, I had gains of about 3% for merge join, 5% for
> hash join, and 10% for nestloop (@CentOS6), for simple 1-level joins with
> aggregation similar to what you mentioned in a previous mail, like this:
>
>   =# SELECT count(*) FROM t1 JOIN t2 USING (id);
>
> Of course, the gain would be trivial when returning many tuples, or when
> scans go to disk.

That's true, but joins where rows are filtered after the join condition
will win here too. For example:

  SELECT * FROM t1 INNER JOIN t2 ON t1.id = t2.id WHERE t1.value < t2.value;

Also queries with a GROUP BY clause. (Since grouping is always performed
after the join :-( )

> I haven't measured the loss by additional computation when the query is
> regarded as not a single join.

I think this would be hard to measure, but if it is measurable then you'd
likely want to look at just planning time, rather than planning and
execution time.

> Explain representation: I don't like that 'Unique Join:' occupies its own
> line in the result of explain; moreover, it doesn't show the meaning
> clearly. What about a representation like the following? Or, it might not
> be necessary to appear there at all.
>
>    Nested Loop ...
>      Output: ...
>  -   Unique Join: Yes
>      ->  Seq Scan on public.t2  (cost=...
>  -   ->  Seq Scan on public.t1  (cost=...
>  +   ->  Seq Scan on public.t1 (unique)  (cost=...

Yeah, I'm not too big a fan of this either. At one evolution of the patch I
had "Unique Left Join" in the join node's line in the explain output, but I
hated that more and changed it to appear just in the VERBOSE output; and
after I did that I didn't want to change the join node's line only when in
verbose mode.
I do quite like that it's a separate item for the XML and JSON explain
output. That's perhaps quite useful when the explain output must be
processed by software. I'm totally open to better ideas on names, but I
just don't have any at the moment.

> Coding: The style looks OK. Could be applied on master. It looks to work
> as you expected for some cases. Random comments follow.
>
> - Looking at specialjoin_is_unique_join, you seem to assume that
>   !delay_upper_joins is a sufficient condition for not being a unique
>   join. The former simply indicates that it doesn't commute with upper
>   OJs, and the latter seems to indicate that the RHS is unique. Could you
>   tell me what kind of relationship there is between them?

The rationalisation around that is from the (now changed) version of
join_is_removable(), where the code read:

    /*
     * Must be a non-delaying left join to a single baserel, else we aren't
     * going to be able to do anything with it.
     */
    if (sjinfo->jointype != JOIN_LEFT ||
        sjinfo->delay_upper_joins)
        return false;

I have to admit that I didn't go and investigate why delayed upper joins
cannot be removed by the left-join removal code; I really just assumed that
we're unable to prove that a join to such a relation won't match more than
one of the outer side's rows. I kept this to maintain that behaviour, as I
assumed it was there for a good reason.

> - The naming "unique join" seems a bit obscure to me, but I have no
>   better idea :( However, the member name has_unique_join seems better as
>   is_unique_join.

Yeah, I agree with this. I just can't find something short enough that
means "based on the join condition, the inner side of the join will never
produce more than 1 row for any single outer row". "Unique join" was the
best I came up with. I'm totally open to better ideas. I agree that
is_unique_join is better than has_unique_join. I must have just copied the
has_eclass_joins struct member without thinking too hard about it. I've now
changed this in my local copy of the patch.
> - eclassjoin_is_unique_join involves a seemingly semi-exhaustive scan
>   over ec members for every element in joinlist. Even if not, it might be
>   thought a bit too large for the gain. Could you do the equivalent
>   things with smaller code?

I'd imagine some very complex queries could have many equivalence classes
and members, though I hadn't really thought that this number would be so
great that processing here would suffer much. There are quite a few fast
paths out, like the test to ensure that both rels are mentioned in
ec_relids. Though for a query complex enough to have a great number of
eclasses and members, likely there would be quite a few relations involved
and planning would be slower anyway. I'm not quite sure how else I could
write this and still find the unique join cases each time. We can't really
just give up half way through looking.

Thanks again for the review.

Regards
Re: [HACKERS] Reduce pinning in btree indexes
Hi,

At Fri, 27 Feb 2015 05:56:18 +, Andrew Gierth and...@tao11.riddles.org.uk
wrote in 874mq77vuu@news-spur.riddles.org.uk

> >>>>> "Kyotaro" == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
>
>  Kyotaro> ammarkpos/amrestrpos are called in merge joins. By the steps
>  Kyotaro> shown below, I had 1M times of markpos and no restorepos for
>  Kyotaro> 1M result rows, and had 500k times of markpos and the same
>  Kyotaro> number of times of restorepos for 2M rows result by a bit
>  Kyotaro> different configuration. I suppose we can say that they are
>  Kyotaro> the worst case considering markpos/restrpos. The call counts
>  Kyotaro> can be taken using the attached patch.
>
> You might want to try running the same test, but after patching
> ExecSupportsMarkRestore to return false for index scans. This will cause
> the planner to insert a Materialize node to handle the mark/restore.

Mmm? The patch bt-nopin-v1.patch seems not to contain the change for
ExecSupportsMarkRestore, and that very simple function remains returning
true for T_Index(Only)Scan after the patch is applied.

Explain shows that the plan does not involve a materializing step; in
addition, the counters I put into both btmarkpos() and btrestrpos() showed
that they are actually called during the execution, like this, for both
unpatched and patched:

| LOG: markpos=1000000, restrpos=0
| STATEMENT: EXPLAIN (ANALYZE,BUFFERS) SELECT t1.a, t2.a FROM t1 JOIN t2 on (t1.a = t2.a);

To make sure, I looked into btmarkpos and btrestrpos to confirm that they
really do the memcpy() we're talking about, then recompiled the whole
source tree.

> This way, you can get an idea of how much gain (if any) the direct
> support of mark/restore in the scan is actually providing.

Does "the scan" mean a T_Material node? But there is no Material in the
plan, and the counters in bt*pos were in fact incremented, as mentioned
above. Could you point out any other possible mistake in my thought?
regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Merge compact/non compact commits, make aborts dynamically sized
On Wed, Feb 25, 2015 at 8:10 PM, Andres Freund and...@2ndquadrant.com wrote:

> On 2015-02-24 20:51:42 +0200, Heikki Linnakangas wrote:
>> On 02/20/2015 05:21 PM, Andres Freund wrote:
>>> There's one bit that I'm not so sure about though: To avoid duplication
>>> I've added Parse(Commit/Abort)Record(), but unfortunately that has to
>>> be available both in front and backend code - so it's currently living
>>> in xactdesc.c. I think we can live with that, but it's certainly not
>>> pretty.
>>
>> Yeah, that's ugly. Why does frontend code need that?
>
> The old format isn't exactly trivial for frontend code to decode either.
> pg_xlogdump outputs subxacts and such; I don't foresee other usages.
> Sure, we could copy the code around, but I think that's worse than having
> it in xactdesc.c. Needs a comment explaining why it's there if I haven't
> added one already.

FWIW, I think they would live better in frontend code for client
applications.

That's a nice patch. +1 for merging them. Here are a couple of comments:

+/* Parse the WAL format of a xact abort into a easier to understand format. */
+void
+ParseCommitRecord(uint8 info, xl_xact_commit *xlrec, xl_xact_parsed_commit *parsed)

I think that you mean here "of an xact commit", not abort.

+ * Emit, but don't insert, a abort record.

s/a abort/an abort/

XactEmitAbortRecord has some problems with tabs replaced by 4 spaces at a
couple of places.

+/* free opcode 0x70 */
+
+#define XLOG_XACT_OPMASK 0x70

There is a contradiction here: 0x70 is not free.

Regards,
-- 
Michael
Re: [HACKERS] Reduce pinning in btree indexes
>>>>> "Kyotaro" == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:

 >> You might want to try running the same test, but after patching
 >> ExecSupportsMarkRestore to return false for index scans. This will
 >> cause the planner to insert a Materialize node to handle the
 >> mark/restore.

 Kyotaro> Mmm? The patch bt-nopin-v1.patch seems not contain the change
 Kyotaro> for ExecSupportMarkRestore and the very simple function remain
 Kyotaro> looking to return true for T_Index(Only)Scan after the patch
 Kyotaro> applied.

Right. I'm suggesting you change that, in order to determine what
performance cost, if any, would result from abandoning the idea of doing
mark/restore entirely.

-- 
Andrew (irc:RhodiumToad)
Re: [HACKERS] New CF app deployment
On 02/26/2015 01:59 PM, Michael Paquier wrote:

> On Thu, Feb 26, 2015 at 9:15 PM, Asif Naeem anaeem...@gmail.com wrote:
>> This thread seems relevant. Please guide me on how I can access older
>> CF pages. The MSVC portion of this fix got completely lost in the void:
>> https://commitfest.postgresql.org/action/patch_view?id=1330
>> The above link results in the following error: "Not found - The
>> specified URL was not found." Please do let me know if I missed
>> something. Thanks.
>
> Try commitfest-old instead; that is where the past CF app stores its
> data, like this:
> https://commitfest-old.postgresql.org/action/patch_view?id=1330

hmm, maybe we should have some sort of handler that redirects/reverse
proxies to the old commitfest app for this.

Stefan
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
>>>>> "Tom" == Tom Lane t...@sss.pgh.pa.us writes:

 Tom> We've discussed doing $SUBJECT off and on for nearly ten years, the
 Tom> oldest thread I could find about it being here:
 Tom> http://www.postgresql.org/message-id/flat/1186435268.16321.37.ca...@dell.linuxdev.us.dell.com
 Tom> It's come up again every time we found another leak of dead child
 Tom> contexts, which happened twice last year for example. And that patch
 Tom> I'm pushing for expanded out-of-line objects really needs this to
 Tom> become the default behavior anywhere that we can detoast objects.

So, this also changes (indirectly) the effect of ReScanExprContext so that
it deletes child contexts too. In the grouping sets work I noticed I had to
explicitly add some MemoryContextDeleteChildren after ReScanExprContext in
order to clean up properly; this change obviously makes that redundant.
(That looks like another data point in favour of this change.)

I guess the only question is whether anything currently relies on
ReScanExprContext's current behavior.

-- 
Andrew (irc:RhodiumToad)
Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission
On Sat, Feb 7, 2015 at 8:26 PM, Pavel Stehule pavel.steh...@gmail.com wrote:

> Hi
>
> I am sending a review of this patch.

Thanks for the review. Sorry for the delay.

> 4. Regress tests test rules... FAILED -- missing info about new view

Thanks. Corrected.

> My objections:
>
> 1. The data type for the database field should be an array of name or an
> array of text. When a name contains a comma, the comma is not escaped
> currently:
>
>   {omega,my stupid extremly, name2,my stupid name}
>
> expected:
>
>   {my stupid name,omega,my stupid extremly, name2}
>
> I see the same issue in the options field.

Changed, including the user column also.

> 2. Reload is not enough for content refresh - logout is necessary. I
> understand why it is, but it is not very friendly, and can be very
> stressful. It should work with the last reloaded data.

At present the view data is populated from the variable parsed_hba_lines,
which contains the already-parsed lines of the pg_hba.conf file. During a
reload, the hba entries are reloaded only in the postmaster process and not
in every backend; for this reason the reloaded data is not visible until
the session is disconnected and reconnected. Instead of changing the SIGHUP
behavior in the backend to load the hba entries as well, whenever a call is
made to the view, the pg_hba.conf file is parsed, so the view shows the
latest reloaded data. This way it doesn't impact the existing behavior, but
it may impact performance because the file is loaded into memory every
time.

Updated patch is attached. Comments?

Regards,
Hari Babu
Fujitsu Australia
Re: [HACKERS] Idea: GSoC - Query Rewrite with Materialized Views
Thank you for your answers. I would be very interested in tracking the
staleness of the MV. You see, I work in a research group on database
tuning, and we have implemented some solutions to take advantage of MVs and
speed up queries. The query rewrite feature would be extremely desirable
for us.

Do you think that implementing the staleness check as suggested by Thomas
could get us started in the query rewrite business? Do you think I should
make a proposal, or are there more interesting subjects for GSoC? I'd be
happy to hear project suggestions, especially related to the optimizer,
tuning, etc.

Eric

2015-02-20 22:35 GMT-02:00 Tomas Vondra tomas.von...@2ndquadrant.com:

> On 21.2.2015 00:20, Kevin Grittner wrote:
>> Tomas Vondra tomas.von...@2ndquadrant.com wrote:
>>> I share the view that this would be very valuable, but the scope far
>>> exceeds what can be done within a single GSoC project. But maybe we
>>> could split that into multiple pieces, and Eric would implement only
>>> the first piece? For example the 'is_stale' flag for a MV would be
>>> really useful, making it possible to refresh only the MVs that
>>> actually need a refresh.
>>
>> You may be on to something there. Frankly, though, I'm not sure that we
>> could even reach consensus within the community on a detailed design
>> for how we intend to track staleness (that will hold up both now and
>> once we have incremental maintenance of materialized views working)
>> within the time frame of a GSoC project. This would need to be done
>> with an eye toward how it might be used in direct references (will we
>> allow a staleness limit on a reference from a query?), for use in a
>> rewrite, and how it will interact with changes to base tables and with
>> both REFRESH statements and incremental maintenance at various levels
>> of eagerness. I'm not sure that staleness management wouldn't be better
>> left until we have some of those other parts for it to work with.
>
> Doing that properly is going to be nontrivial, no doubt about that.
> I was thinking about keeping a simple list of updated tables (oids) and
> then, at commit time, deciding which MVs depend on them and setting some
> sort of flag (or XID) for all those MVs. But maybe there's a better way.
>
>> Questions to consider: Some other products allow materialized views to
>> be partitioned and staleness to be tracked by partition, and will check
>> which partitions will be accessed in determining staleness. Is that
>> something we want to allow for?
>
> I think we need to get this working for simple MVs, especially because
> we don't have partitioned MVs (or the type of declarative partitioning
> the other products do have).
>
>> Once we have incremental maintenance, an MV maintained in an eager
>> fashion (changes are visible in the MV as soon as the transaction
>> modifying the underlying table commits) could be accessed with MVCC
>> snapshots, with different snapshots seeing different versions. It seems
>> pretty clear that such an MV would always be considered fresh, so there
>> would be no need to constantly flip to stale and back again as the
>> underlying table were changed and the changes were reflected in the MV.
>> How do we handle that?
>
> Yes, incrementally updated MVs might be used more easily, without
> tracking staleness. But we don't have that now, and it's going to take a
> significant amount of time to get there. Also, not all MVs can be
> updated incrementally, so either we allow only simple MVs to be used for
> rewrites, or we'll have to implement the 'stale' flag anyway.
>
>> If changes to an MV are less eager (they are queued for application
>> after COMMIT, as time permits), would we want to track the xid of how
>> far along they are, so that we can tell whether a particular snapshot
>> is safe to use? Do we want to allow a non-MVCC snapshot that shows the
>> latest version of each row? Only if staleness is minimal?
>
> Maybe. When I talk about a 'flag' I actually mean a simple way to
> determine whether the MV is up-to-date or not.
> Snapshots and XIDs are probably the right way to do that in an
> MVCC-based system.
>
>> What about MVs which don't have incremental maintenance? We can still
>> determine what xid they are current as of, from the creation or the
>> latest refresh. Do we want to track that instead of a simple boolean
>> flag? How would we use the 'as of' XID?
>
> IMHO it's unacceptable to quietly use stale data unless the user
> explicitly references the MV, so we'd have to assume we can't use that
> MV.
>
> regards
>
> -- 
> Tomas Vondra                  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] logical column ordering
On 2/26/15 4:34 PM, Andres Freund wrote:

> On 2015-02-26 16:16:54 -0600, Jim Nasby wrote:
>> On 2/26/15 4:01 PM, Alvaro Herrera wrote:
>>> The reason for doing it this way is that changing the underlying
>>> architecture is really hard, without having to bear an endless hackers
>>> bike shed discussion about the best userland syntax to use. It seems a
>>> much better approach to do the actually difficult part first, then
>>> leave the rest to be argued to death by others and let those others do
>>> the easy part (and take all the credit along with that); that way,
>>> that discussion does not kill other possible uses that the new
>>> architecture allows.
>
> I agree that it's a sane order of developing things. But: I don't think
> it makes sense to commit it without the capability to change the order.
> Not so much because it'll not serve a purpose at that point, but rather
> because it'll be essentially untestable. And this is a patch that needs
> a fair amount of changes, both automated and manual.

It's targeted for 9.6 anyway, so we have a while to decide on what's
committed when. It's possible that there's no huge debate on the syntax.

> At least during development I'd even add a function that randomizes the
> physical order of columns, and keeps the logical one. Running the
> regression tests that way seems likely to catch a fair number of bugs.

Yeah, I've thought that would be a necessary part of testing. Not really
sure how we'd go about it with the existing framework though...

>> +1. This patch is already several years old; let's not delay it
>> further. Plus, I suspect that you could hack the catalog directly if
>> you really wanted to change LCO. Worst case you could create a C
>> function to do it.
>
> I don't think that's sufficient for testing purposes. Not only for the
> patch itself, but also for getting it right in new code.

We'll want to do testing that it doesn't make sense for the API to
support. For example, randomizing the storage ordering; that's not a sane
use case.
Ideally we wouldn't even expose physical ordering, because the code would
be smart enough to figure it out.

-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] MemoryContext reset/delete callbacks
On 2015-02-27 01:54:27 +0100, Andres Freund wrote:

> On 2015-02-26 19:28:50 -0500, Tom Lane wrote:
>> *** typedef struct MemoryContextData
>> *** 59,72 ****
>>   	MemoryContext firstchild;	/* head of linked list of children */
>>   	MemoryContext nextchild;	/* next child of same parent */
>>   	char	   *name;		/* context name (just for debugging) */
>>   	bool		isReset;	/* T = no space alloced since last reset */
>>   #ifdef USE_ASSERT_CHECKING
>> ! 	bool		allowInCritSection;	/* allow palloc in critical section */
>>   #endif
>>   } MemoryContextData;
>
> It's a bit sad to push AllocSetContextData onto four cachelines from the
> current three... That stuff is hot. But I don't really see a way around
> it right now. And it seems like it'd give us more ammunition to improve
> things than the small loss of speed it implies.

Actually:

struct MemoryContextData {
	NodeTag                    type;                /*     0     4 */

	/* XXX 4 bytes hole, try to pack */

	MemoryContextMethods *     methods;             /*     8     8 */
	MemoryContext              parent;              /*    16     8 */
	MemoryContext              firstchild;          /*    24     8 */
	MemoryContext              nextchild;           /*    32     8 */
	char *                     name;                /*    40     8 */
	bool                       isReset;             /*    48     1 */
	bool                       allowInCritSection;  /*    49     1 */

	/* size: 56, cachelines: 1, members: 8 */
	/* sum members: 46, holes: 1, sum holes: 4 */
	/* padding: 6 */
	/* last cacheline: 56 bytes */
};

If we move isReset and allowInCritSection after type, we'd stay at the same
size...

Andres Freund

-- 
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] New CF app deployment
On Thu, Feb 26, 2015 at 9:15 PM, Asif Naeem anaeem...@gmail.com wrote:

> This thread seems relevant. Please guide me on how I can access older CF
> pages. The MSVC portion of this fix got completely lost in the void:
> https://commitfest.postgresql.org/action/patch_view?id=1330
> The above link results in the following error: "Not found - The
> specified URL was not found." Please do let me know if I missed
> something. Thanks.

Try commitfest-old instead; that is where the past CF app stores its data,
like this:
https://commitfest-old.postgresql.org/action/patch_view?id=1330
-- 
Michael
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 2/25/15 11:40 PM, Alvaro Herrera wrote:

> Fujii Masao wrote:
>> On Tue, Feb 24, 2015 at 1:29 AM, David Steele da...@pgmasters.net wrote:
>>> 1) Follow Oracle's "as session" option and only log each statement
>>> type against an object the first time it happens in a session. This
>>> would greatly reduce logging, but might be too little detail. It would
>>> increase the memory footprint of the module to add the tracking.
>
> Doesn't this mean that a user could conceal malicious activity simply by
> doing an innocuous query that silences all further activity?

True enough, but I thought it might be useful as an option. I tend to lock
down users pretty tightly, so there's not much they can do without somehow
getting superuser access, at which point all bets are off anyway. In the
case where users are highly constrained, the idea here would be to just
keep tabs on the sort of things they are reading/writing, for financial
audits and ISO compliance.

>>> 2) Only log once per call to the backend. Essentially, we would only
>>> log the statement you see in pg_stat_activity. This could be a good
>>> option because it logs what the user accesses directly, rather than
>>> functions, views, etc., which hopefully are already going through a
>>> review process and can be audited that way.
>
> ... assuming the user does not create stuff on the fly behind your back.

Sure, but then the thing they created/modified would also be logged
somewhere earlier (assuming pg_audit.log = 'all'). It would require some
analysis to figure out what they did, but I think that would be true in
many cases.

>>> Would either of those address your concerns?
>>
>> Before discussing how to implement, probably we need to consider the
>> spec about this. For example, should we log even nested statements for
>> the audit purpose? If yes, how should we treat the case where the user
>> changes the setting so that only DDL is logged, and then a user-defined
>> function which internally executes DDL is called?
>> Since the top-level SQL (calling the function) is not the target of
>> audit, should we not log even the nested DDL?
>
> Clearly if you log only DROP TABLE, and then the malicious user hides
> one such call inside a function that executes the drop (which is called
> via a SELECT top-level SQL), you're not going to be happy.

If the purpose of the auditing is to catch malicious activity, then all
statements would need to be logged. If only top-level statements are
logged then it might be harder to detect, but it would still be logged.

-- 
David Steele
da...@pgmasters.net
Re: [HACKERS] mogrify and indent features for jsonb
On 26 February 2015 at 15:09, Dmitry Dolgov 9erthali...@gmail.com wrote:

> Hi, Thom.
>
>> Would this support deleting "type" and the value "dd"?
>
> With this patch you can delete them one by one:
>
>   select '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb
>          - '{c, type}'::text[] - '{d, -1}'::text[];
>                              ?column?
>   -----------------------------------------------------------------
>    {"a": 1, "b": 2, "c": {"stuff": "test"}, "d": ["aa", "bb", "cc"]}
>   (1 row)

Doesn't work for me:

  # select '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb
           - '{c, type}'::text[] - '{d, -1}'::text[];
  ERROR:  operator does not exist: jsonb - text[]
  LINE 1: ...ff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb - '{c, typ...
                                                               ^
  HINT:  No operator matches the given name and argument type(s). You might
  need to add explicit type casts.

>> Is there a way to take the json
>> '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'
>> and add "ee" to "d" without replacing it?
>
> No, it looks like there is no way to add a new element to an array with
> the help of this patch. I suppose this feature could be implemented
> easily enough inside the jsonb_concat function:
>
>   select '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb
>          || '{"d": ["ee"]}'::jsonb
>
> but I'm not sure that it would be the best way.

Yeah, I think that may be problematic. I agree with Josh that there's
probably no sane mix of operators for this, as I would expect your example
to replace "d": ["aa","bb","cc","dd"] with "d": ["ee"] rather than append
to it. Hmm... unless we used a + operator, but then I'm not sure what
should happen in the instance of '{"d": ["aa"]}' + '{"d": "bb"}'.

-- 
Thom
Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL
On 2/25/15 10:42 PM, Fujii Masao wrote: On Tue, Feb 24, 2015 at 1:29 AM, David Steele da...@pgmasters.net wrote: On 2/18/15 10:25 AM, David Steele wrote: On 2/18/15 6:11 AM, Fujii Masao wrote: pg_audit doesn't log BIND parameter values when a prepared statement is used. Seems this is an oversight of the patch. Or is this intentional? It's actually intentional - following the model I talked about in my earlier emails, the idea is to log statements only. This also follows 2ndQuadrant's implementation. Unfortunately, I think it's beyond the scope of this module to log bind variables. Maybe I can live with that at least in the first version. I'm following not only 2ndQuadrant's implementation, but Oracle's as well. Oracle's audit_trail (e.g., = db, extended) can log even bind values. Also, log_statement=on in PostgreSQL can log bind values. Maybe we can reuse the same technique that log_statement uses. I'll look at how this is done in the logging code and see if it can be used in pg_audit. Imagine the case where you call a user-defined function which executes many nested statements. In this case, pg_audit logs only the top-level statement (i.e., the one issued directly by the client) every time a nested statement is executed. In fact, one call of such a UDF can cause lots of the *same* log message. I think this is problematic. I agree - not sure how to go about addressing it, though. I've tried to cut down on the verbosity of the logging in general, but of course it can still be a problem. Using security definer and a different logging GUC for the defining role might work. I'll add that to my unit tests and see what happens. That didn't work - but I didn't really expect it to. Here are two options I thought of: 1) Follow Oracle's "as session" option and only log each statement type against an object the first time it happens in a session. This would greatly reduce logging, but might be too little detail. It would increase the memory footprint of the module to add the tracking.
2) Only log once per call to the backend. Essentially, we would only log the statement you see in pg_stat_activity. This could be a good option because it logs what the user accesses directly, rather than functions, views, etc. which hopefully are already going through a review process and can be audited that way. Would either of those address your concerns? Before discussing how to implement, probably we need to consider the spec about this. For example, should we log even nested statements for the audit purpose? If yes, how should we treat the case where the user changes the setting so that only DDL is logged, and then the user-defined function which internally executes DDL is called? Since the top-level SQL (calling the function) is not the target of audit, we should not log even the nested DDL? I think logging nested statements should at least be an option. And yes, I think that nested statements should be logged even if the top-level SQL is not (depending on configuration). The main case for this would be DO blocks which can be run by anybody. -- - David Steele da...@pgmasters.net
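[Editor's note] Option 1 above, Oracle-style "by session" deduplication, can be illustrated with a toy model. All names here are invented for illustration; this is not pg_audit code, just a sketch of the trade-off (less volume, less detail, plus per-session tracking state):

```python
class SessionAuditLog:
    """Toy model of option 1: each (statement type, object) pair is
    logged only the first time it occurs in a session. The 'seen' set
    is the extra per-session memory footprint David mentions."""
    def __init__(self):
        self.seen = set()
        self.lines = []

    def log(self, stmt_type, obj):
        key = (stmt_type, obj)
        if key in self.seen:
            return False            # suppressed: already audited this session
        self.seen.add(key)
        self.lines.append(f"AUDIT: {stmt_type} {obj}")
        return True

audit = SessionAuditLog()
audit.log("SELECT", "accounts")
audit.log("SELECT", "accounts")     # repeated/nested execution: suppressed
audit.log("UPDATE", "accounts")     # new statement type: logged
```

The suppression is what tames the "same log message many times" problem for a UDF executing the same nested statement in a loop, at the cost of hiding how often it ran.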
Re: [HACKERS] [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
FWIW a fix for this has been posted to all active branches: Author: Andres Freund and...@anarazel.de Branch: master [fd6a3f3ad] 2015-02-26 12:50:07 +0100 Branch: REL9_4_STABLE [d72115112] 2015-02-26 12:50:07 +0100 Branch: REL9_3_STABLE [abce8dc7d] 2015-02-26 12:50:07 +0100 Branch: REL9_2_STABLE [d67076529] 2015-02-26 12:50:07 +0100 Branch: REL9_1_STABLE [5c8dabecd] 2015-02-26 12:50:08 +0100 Branch: REL9_0_STABLE [82e0d6eb5] 2015-02-26 12:50:08 +0100 Reconsider when to wait for WAL flushes/syncrep during commit. Up to now RecordTransactionCommit() waited for WAL to be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transaction did not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid, but are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit. Not something people expect in a read only transaction. This led to such strange symptoms as backends being seemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying when said stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involved in a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrep using latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler. Fixing the issue here doesn't properly solve that issue; it merely papers over the problems. In 9.5 catchup interrupts aren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid, and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records.
If only a xid has been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to cause overhead in real-world workloads. For one it only happens SEQ_LOG_VALS/32 values anyway, for another only usage of nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.gf30...@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau Per complaint from maumau and Thom Brown Backpatch all the way back; 9.0 doesn't have syncrep, but it seems better to have consistent behavior across all maintained branches. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
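[Editor's note] The rule the commit describes, wait at commit only when the transaction both acquired an xid and wrote WAL, condenses to a small predicate. This is a sketch with invented names, not the actual RecordTransactionCommit() code:

```python
def must_wait_at_commit(assigned_xid, wrote_wal, sync_commit_on=True):
    """Sketch of the backpatched rule: wait for flush/syncrep only when
    the transaction BOTH acquired an xid AND emitted WAL. An xid alone
    (e.g. temp/unlogged-table writes) or WAL alone (e.g. a HOT prune in
    a read-only query) no longer blocks."""
    return sync_commit_on and assigned_xid and wrote_wal

# Read-only query that HOT-pruned a page: WAL written, but no xid.
hot_prune_waits = must_wait_at_commit(assigned_xid=False, wrote_wal=True)
# nextval() under the fix: top-level xid assigned and WAL written.
nextval_waits = must_wait_at_commit(assigned_xid=True, wrote_wal=True)
```

This is why the stuck-at-connection-establishment symptom disappears: a backend that only pruned pages during startup queries never enters the syncrep wait.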
Re: [HACKERS] Precedence of standard comparison operators
Simon Riggs si...@2ndquadrant.com writes: On 20 February 2015 at 20:44, Tom Lane t...@sss.pgh.pa.us wrote: Well, assuming that we're satisfied with just having a way to warn when the behavior changed (and not, in particular, a switch that can select old or new behavior) I'm in favour of your proposed improvements, but I'm having a problem thinking about random application breakage that would result. Having a warn_if_screwed parameter that we disable by default won't help much because if you are affected you can't change that situation. There are too many applications to test all of them and not all applications can be edited, even if they were tested. I find this argument to be unhelpful, because it could be made in exactly the same words against any non-backwards-compatible change whatsoever. Nonetheless, we do make non-backwards-compatible changes all the time. I think the way to do this is to have a pluggable parser, so users can choose 1) old parser, 2) new, better parser, 3) any other parser they fancy that they maintain to ensure application compatibility. We've got a pluggable executor and optimizer, so why not a parser too? Implementing that in the same way we have done for executor and optimizer is a 1 day patch, so easily achievable for 9.5. I don't find that particularly attractive either. It seems quite unlikely to me that anyone would actually use such a hook; replacing the whole parser would be essentially unmaintainable. Nobody uses the hooks you mention to replace the whole planner or whole executor --- there are wrappers out there, which is a different thing altogether. But you could not undo the issue at hand here without at the very least a whole new copy of gram.y, which would need to be updated constantly if you wanted it to keep working across more than one release. regards, tom lane
Re: [HACKERS] mogrify and indent features for jsonb
Hi, Thom. Would this support deleting "type" and the value 'dd'? With this patch you can delete them one by one: select '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb - '{c, type}'::text[] - '{d, -1}'::text[]; ?column? --- {"a": 1, "b": 2, "c": {"stuff": "test"}, "d": ["aa", "bb", "cc"]} (1 row) Is there a way to take the json: '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}' and add "ee" to "d" without replacing it? No, it looks like there is no way to add a new element to an array with the help of this patch. I suppose this feature can be implemented easily enough inside the jsonb_concat function: select '{"a": 1, "b": 2, "c": {"type": "json", "stuff": "test"}, "d": ["aa","bb","cc","dd"]}'::jsonb || '{"d": ["ee"]}'::jsonb but I'm not sure that it will be the best way. On 26 February 2015 at 01:13, Josh Berkus j...@agliodbs.com wrote: On 02/25/2015 03:13 AM, Thom Brown wrote: Can you think of a reasonable syntax for doing that via operators? I can imagine that as a json_path function, i.e.: jsonb_add_to_path(jsonb, text[], jsonb) or where the end of the path is an array: jsonb_add_to_path(jsonb, text[], text|int|float|bool) But I simply can't imagine an operator syntax which would make it clear what the user intended. No, there probably isn't a sane operator syntax for such an operation. A function would be nice. I'd just want to avoid hacking away at arrays by exploding them, adding a value then re-arraying them and replacing the value. Well, anyway, that doesn't seem like a reason to block the patch. Rather, it's a reason to create another one for 9.6 ... -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Primary not sending to synchronous standby
On 26 February 2015 at 13:08, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-23 17:09:24 +, Thom Brown wrote: On 23 February 2015 at 16:53, Andres Freund and...@2ndquadrant.com wrote: Comments? This is obviously just a POC, but I think something like this does make a great deal of sense. Thom, does that help? Yeah, this appears to eliminate the problem, at least in the case I reported. I've pushed a somewhat more evolved version of this after more testing. Thanks. I'll give it another round of testing later. -- Thom
Re: [HACKERS] Primary not sending to synchronous standby
On 2015-02-23 17:09:24 +, Thom Brown wrote: On 23 February 2015 at 16:53, Andres Freund and...@2ndquadrant.com wrote: Comments? This is obviously just a POC, but I think something like this does make a great deal of sense. Thom, does that help? Yeah, this appears to eliminate the problem, at least in the case I reported. I've pushed a somewhat more evolved version of this after more testing. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Partitioning WIP patch
On 2/26/15 3:22 AM, Andres Freund wrote: On 2015-02-26 02:20:21 -0600, Jim Nasby wrote: The reason I'd like to do this with partitioning vs plain inheritance is presumably as we build out partitioning we'll get very useful things like the ability to have FKs to properly partitioned tables. Insert tuple routing could also be useful. The problem there imo isn't so much inheritance, but lack of working unique checks across partitions. That's something we can implement independent of this, it's just not trivial. There's been discussion of allowing for uniqueness when we can guarantee no overlap between partitions, and the partition key is part of the unique constraint. That's the particular use case I was thinking of. I suspect there's other partitioning features that would be useful in a generic inheritance setup as well; that's why I'd love to see both features work together... but I fear there's enough work to get there that it may not happen, and I don't want us to accidentally start mixing the two and have users start relying on it. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] Partitioning WIP patch
On 2/26/15 3:09 AM, Amit Langote wrote: Yes. If it helps, the exact use-case I have in mind is using list-based partitioning + additional columns in some/all children (different between children). For example, if you need to track different types of customer payment methods, you'd have a payment parent table, a list partition for credit debit cards, a different list partition for bank accounts, etc. The reason I'd like to do this with partitioning vs plain inheritance is presumably as we build out partitioning we'll get very useful things like the ability to have FKs to properly partitioned tables. Insert tuple routing could also be useful. Unless I'm missing something again, isn't allowing partitions to have heterogeneous rowtypes a problem in the long run? I'm afraid I'm confused as to your stand regarding inheritance vs. new partitioning. To be specific, children with heterogeneous schemas sounds much like what inheritance would be good for as you say. But then isn't that why we have to plan children individually which you said new partitioning should get away from? Apologies if I haven't been clear enough. What I'd like to see is the best of both worlds; fast partitioning when not using inheritance, and perhaps somewhat slower when using inheritance, but still with the features partitioning gives you. But my bigger concern from a project standpoint is that we not put ourselves in a position of supporting something that we really don't want to support (a partitioning system that's got inheritance mixed in). As much as I'd personally like to have both features together, I think it would be bad for the community to go down that road without careful thought. With explicit partitioning, shouldn't we go in direction where we remove some restrictions imposed by inheritance (think multiple inheritance)? 
I recall the link with which Alvaro had started the discussion - I think a link to a remark of Tom's about something very related: http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us That post looks like Tom figured out a way to eliminate a problem that hurts inheritance, so that's good. My fear is that at some point we'll hit a problem with partitioning that we can't solve in the inheritance model. If we allow inheritance features into partitioning now we'll be painted into a corner. If we disallow those features now we can always re-enable them if we get to the point where we're in the clear. Does that make sense? Yes, it does. In fact, I do intend to keep them separate, the first attempt of which is to choose NOT to transform a PARTITION OF parent clause into INHERITS parent. Any code that may look like it's trying to do that is because the patch is not fully baked yet. Ok, good to know. That's why I was asking about ALTER TABLE ADD COLUMN on a partition. If we release something without that being restricted it'll probably cause trouble later on. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] plpgsql versus domains
Hi 2015-02-26 18:31 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: At the behest of Salesforce, I've been looking into improving plpgsql's handling of variables of domain types, which is currently pretty awful. It's really slow, because any assignment to a domain variable goes through the slow double-I/O-conversion path in exec_cast_value, even if no actual work is needed. And I noticed that the domain's constraints get looked up only once per function per session; for example this script gets unexpected results: create domain di as int; create function foo1(int) returns di as $$ declare d di; begin d := $1; return d; end $$ language plpgsql immutable; select foo1(0); -- call once to set up cache alter domain di add constraint pos check (value > 0); select 0::di; -- fails, as expected select foo1(0); -- should fail, does not \c - select foo1(0); -- now it fails The reason for this is that domain_in looks up the domain's constraints and caches them under the calling FmgrInfo struct. That behavior was designed to ensure we'd do just one constraint-lookup per query, which I think is reasonable. But plpgsql sets up an FmgrInfo in the variable's PLpgSQL_type record, which persists for the whole session unless the function's parse tree is flushed for some reason. So the constraints only get looked up once. The rough sketch I have in mind for fixing this is: 1. Arrange to cache the constraints for domain types in typcache.c, so as to reduce the number of trips to the actual catalogs. I think typcache.c will have to flush these cache items upon seeing any sinval change in pg_constraint, but that's still cheaper than searching pg_constraint for every query. 2. In plpgsql, explicitly model a domain type as being its base type plus some constraints --- that is, stop depending on domain_in() to enforce the constraints, and do it ourselves.
This would mean that the FmgrInfo in a PLpgSQL_type struct would reference the base type's input function rather than domain_in, so we'd not have to be afraid to cache that. The constraints would be represented as a separate list, which we'd arrange to fetch from the typcache once per transaction. (I think checking for new constraints once per transaction is sufficient for reasonable situations, though I suppose that could be argued about.) The advantage of this approach is first that we could avoid an I/O conversion when the incoming value to be assigned matches the domain's base type, which is the typical case; and second that a domain without any CHECK constraints would become essentially free, which is also a common case. I like this variant. There could be some good optimizations with scalar types (text, varchar, numbers) by reducing I/O casts. 3. We could still have domains.c responsible for most of the details; the domain_check() API may be good enough as-is, though it seems to lack any simple way to force re-lookup of the domain constraints once per xact. Thoughts, better ideas? Given the lack of field complaints, I don't feel a need to try to back-patch a fix for this, but I do want to see it fixed for 9.5. +1 Regards Pavel regards, tom lane
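[Editor's note] Tom's once-per-transaction re-lookup idea can be illustrated with a toy cache. This is a Python model with invented names; the real mechanism would live in typcache.c and be driven by sinval messages on pg_constraint:

```python
class DomainConstraintCache:
    """Toy model: constraints are fetched from the 'catalog' at most
    once per transaction, so a mid-session ALTER DOMAIN is picked up
    by the next transaction rather than never (the bug in the script
    above, where the stale cache outlives the session)."""
    def __init__(self, catalog):
        self.catalog = catalog          # domain name -> list of checks
        self.cache = {}
        self.cached_xact = None

    def constraints(self, domain, current_xact):
        if self.cached_xact != current_xact:
            self.cache.clear()          # new transaction: force re-lookup
            self.cached_xact = current_xact
        if domain not in self.cache:
            self.cache[domain] = list(self.catalog[domain])
        return self.cache[domain]

catalog = {"di": []}                    # create domain di as int
cache = DomainConstraintCache(catalog)
cache.constraints("di", current_xact=1)           # cached: no constraints
catalog["di"].append(lambda v: v > 0)             # alter domain di add check (value > 0)

stale = cache.constraints("di", current_xact=1)   # same xact: still stale
fresh = cache.constraints("di", current_xact=2)   # next xact: sees the check
```

The `stale` result within the same transaction is the deliberately accepted trade-off ("checking for new constraints once per transaction is sufficient for reasonable situations").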
Re: [HACKERS] mogrify and indent features for jsonb
On 02/26/2015 07:25 AM, Thom Brown wrote: Yeah, I think that may be problematic. I agree with Josh that there's probably no sane mix of operators for this, as I would expect your example to replace "d": ["aa","bb","cc","dd"] with "d": ["ee"] rather than append to it. Hmm... unless we used a + operator, but then I'm not sure what should happen in the instance of '{"d": ["aa"]}' + '{"d": "bb"}'. Yeah, that's what I would expect too. In fact, I would count on replacement. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Partitioning WIP patch
On 02/25/2015 07:15 PM, Amit Langote wrote: On 26-02-2015 AM 05:15, Josh Berkus wrote: On 02/24/2015 12:13 AM, Amit Langote wrote: Here is an experimental patch that attempts to implement this. This looks awesome. Thanks! I would love to have it for 9.5, but I guess the patch isn't nearly baked enough for that? I'm not quite sure what would qualify as baked enough for 9.5 though we can surely try to reach some consensus on various implementation aspects and perhaps even get it ready in time for 9.5. Well, we don't have long at all to do that. I guess I'm asking what kind of completeness of code we have; is this basically done pending API changes and bugs, or are there major bits (like, say, pg_dump and EXPLAIN support) which are completely unimplemented? where key_spec consists of partition key column names and optional operator class per column. Currently, there are restrictions on the key_spec such as allowing only column names (not arbitrary expressions of them), only one column for list strategy, etc. What's the obstacle to supporting expressions and/or IMMUTABLE functions? I think it's fine to add this feature without them initially, I'm just asking about the roadmap for eventually supporting expressions in the key spec. Only one concern I can remember someone had raised is that having to evaluate an expression for every row during bulk-inserts may end up being pretty expensive. Though, we might have to live with that. Well, it's not more expensive than having to materialize the value from a trigger and store it on disk. The leading one here would be functions over timestamp; for example, the data has a timestamptz, but you want to partition by week. I think one idea is to learn from ability to use expressions in indexes. Sure. So a feature to implement for the 2nd release. -- Josh Berkus PostgreSQL Experts Inc. 
http://pgexperts.com
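[Editor's note] The partition-by-week case Josh raises is essentially routing each row through an immutable expression over the key column. A rough Python sketch of such expression-based tuple routing (all names invented; the patch under discussion only supports plain column names so far, which is exactly the restriction being debated):

```python
from datetime import date

def week_partition_key(ts):
    """Immutable expression over the key column: truncate to the Monday
    of the containing week, akin to date_trunc('week', ts)."""
    return ts.fromordinal(ts.toordinal() - ts.weekday())

partitions = {}

def route_insert(row):
    # The cost concern from the thread: this expression runs once per
    # inserted row, which is what bulk-insert routing has to pay.
    key = week_partition_key(row["created"])
    partitions.setdefault(key, []).append(row)

route_insert({"id": 1, "created": date(2015, 2, 24)})  # Tuesday
route_insert({"id": 2, "created": date(2015, 2, 26)})  # Thursday
route_insert({"id": 3, "created": date(2015, 3, 2)})   # next week
```

The first two rows land in the same weekly partition (keyed by Monday 2015-02-23) while the third starts a new one, without materializing the truncated value as a stored column.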
Re: [HACKERS] Reduce pinning in btree indexes
Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/15/2015 02:19 AM, Kevin Grittner wrote: Interestingly, the btree README points out that using the old TID with a new tuple poses no hazard for a scan using an MVCC snapshot, because the new tuple would not be visible to a snapshot created that long ago. The first question is: Do we really need that interlock for the non-MVCC snapshots either? We don't need the interlock for non-MVCC snapshots if the conditions we limit or filter on at the index level are rechecked when we get to the heap tuple; otherwise we do. If we do: For non-MVCC snapshots, we need to ensure that all index scans that started before the VACUUM did complete before the VACUUM does. Isn't it enough to be sure that if we're not going to recheck any index tests against the heap tuple that any process-local copies of index entries which exist at the start of the index scan phase of the vacuum are not used after the vacuum enters the phase of freeing line pointers -- that is, any scan which has read a page when the vacuum starts to scan the index moves on to another page before vacuum finishes scanning that index? I wonder if we could find a different mechanism to enforce that. Using the pin-interlock for that has always seemed a bit silly to me. It certainly is abusing the semantics of the pin to treat it as a type of lock on the contents. Perhaps grab a new heavy-weight lock on the table whenever a non-MVCC index scan on the table begins, and have VACUUM wait on it. -1 The problem I'm most interested in fixing based on user feedback is a vacuum blocking when a cursor which is using an index scan is sitting idle. Not only does the vacuum of that table stall, but if it is an autovacuum worker, that worker is prevented from cleaning up any other tables. I have seen all autovacuum workers blocked in exactly this way, leading to massive bloat. What you suggest is just using a less awkward way to keep the problem. 
I found that the LP_DEAD hinting would be a problem with an old TID, but I figured we could work around that by storing the page LSN into the scan position structure when the page contents were read, and only doing hinting if that matched when we were ready to do the hinting. That wouldn't work for an index which was not WAL-logged, so the patch still holds pins for those. Or you could use GetFakeLSNForUnloggedRel(). I'm unfamiliar with that, but will take a look. (I will be back at my usual development machine late tomorrow.) Robert pointed out that the visibility information for an index-only scan wasn't checked while the index page READ lock was held, so those scans also still hold the pins. Why does an index-only scan need to hold the pin? Robert already answered this one, but there is a possible solution -- provide some way to check the heap's visibility map for the TIDs read from an index page before releasing the read lock on that page. Finally, there was an "optimization" for marking buffer position for possible restore that was incompatible with releasing the pin. I use quotes because the "optimization" adds overhead to every move to the next page in order to set some variables in a structure when a mark is requested instead of running two fairly small memcpy() statements. The two-day benchmark of the customer showed no performance hit, and looking at the code I would be amazed if the "optimization" yielded a measurable benefit. In general, optimization by adding overhead to moving through a scan to save time in a mark operation seems dubious. Hmm. Did your test case actually exercise mark/restore? The memcpy()s are not that small. Especially if it's an index-only scan, you're copying a large portion of the page. Some scans call markpos on every tuple, so that could add up. I have results from the `make world` regression tests and a 48-hour customer test. Unfortunately I don't know how heavily either of those exercise this code.
Do you have a suggestion for a way to test whether there is a serious regression for something approaching a worst case? At some point we could consider building on this patch to recheck index conditions for heap access when a non-MVCC snapshot is used, check the visibility map for referenced heap pages when the TIDs are read for an index-only scan, and/or skip LP_DEAD hinting for non-WAL-logged indexes. But all those are speculative future work; this is a conservative implementation that just didn't modify pinning where there were any confounding factors. Understood. Still, I'd like to see if we can easily get rid of the pinning altogether. That would be great, but I felt that taking care of the easy cases in on patch and following with patches to handle the trickier cases as separate follow-on patches made more sense than trying to do everything at once. Assuming we sort out the mark/restore issues for the initial patch, it provides infrastructure to keep the
Re: [HACKERS] Partitioning WIP patch
On 2015-02-26 12:15:17 +0900, Amit Langote wrote: On 26-02-2015 AM 05:15, Josh Berkus wrote: On 02/24/2015 12:13 AM, Amit Langote wrote: Here is an experimental patch that attempts to implement this. I would love to have it for 9.5, but I guess the patch isn't nearly baked enough for that? I'm not quite sure what would qualify as baked enough for 9.5 though we can surely try to reach some consensus on various implementation aspects and perhaps even get it ready in time for 9.5. I think it's absolutely unrealistic to get this into 9.5. There's barely been any progress on the current (last!) commitfest - where on earth should the energy come to make this patch ready? And why would that be fair against all the others that have submitted in time? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] How to create virtual indexes on postgres
On 2/26/15 6:17 AM, Sreerama Manoj wrote: But it runs with the Postgres 9.1 version... I use 9.4, so I think I can't use that. Or, as an alternative, is there any provision in Postgres to know the usefulness (increase in performance) of an index before creating that index? No. It might not be too hard to port the hypothetical index work to 9.4 though. Also, just to let you know, this is really a topic for pgsql-general, not pgsql-hackers. It's also best to reply to list emails in-line, or at the bottom, not at the top. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] plpgsql versus domains
On 2015-02-26 13:53:18 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Hm. A bit sad to open code that in plpgsql instead of making it available more generally. The actual checking wouldn't be open-coded, it would come from domain_check() (or possibly a modified version of that). It is true that plpgsql would know more about domains than it does today, but It'd still teach plpgsql more about types than it knows right now. But on second thought that ship has sailed long ago - and the amount of knowledge seems relatively small. There's much more stuff about composites there already. and I'm not planning to fight two compatibility battles concurrently ;-) Heh ;) You're probably going to kill me because of the increased complexity, but how about making typecache.c smarter, more in the vein of relcache.c, and store the relevant info in there? I thought that's what I was proposing in point #1. Sure, but my second point was to avoid duplicating that information into ->fcinfo and such and instead reference the typecache entry from there. So that if the typecache entry is being rebuilt (a new mechanism I'm afraid), whatever uses ->fcinfo gets the updated functions/definition. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] plpgsql versus domains
Hi, On 2015-02-26 12:31:46 -0500, Tom Lane wrote: It's really slow, because any assignment to a domain variable goes through the slow double-I/O-conversion path in exec_cast_value, even if no actual work is needed. Hm, that's surely not nice for some types. Doing syscache lookups every time can't help either. And I noticed that the domain's constraints get looked up only once per function per session; for example this script gets unexpected results: The reason for this is that domain_in looks up the domain's constraints and caches them under the calling FmgrInfo struct. That's probably far from the only such case we have :( 1. Arrange to cache the constraints for domain types in typcache.c, so as to reduce the number of trips to the actual catalogs. I think typcache.c will have to flush these cache items upon seeing any sinval change in pg_constraint, but that's still cheaper than searching pg_constraint for every query. That sounds sane. 2. In plpgsql, explicitly model a domain type as being its base type plus some constraints --- that is, stop depending on domain_in() to enforce the constraints, and do it ourselves. This would mean that the FmgrInfo in a PLpgSQL_type struct would reference the base type's input function rather than domain_in, so we'd not have to be afraid to cache that. The constraints would be represented as a separate list, which we'd arrange to fetch from the typcache once per transaction. Hm. A bit sad to open code that in plpgsql instead of making it available more generally. 3. We could still have domains.c responsible for most of the details; the domain_check() API may be good enough as-is, though it seems to lack any simple way to force re-lookup of the domain constraints once per xact. Thoughts, better ideas? You're probably going to kill me because of the increased complexity, but how about making typecache.c smarter, more in the vein of relcache.c, and store the relevant info in there? 
And then, to avoid repeated lookups, store a reference to that in fcinfo? The lifetime of objects wouldn't be entirely trivial, but it doesn't sound impossible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] plpgsql versus domains
2015-02-26 19:53 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 12:31:46 -0500, Tom Lane wrote: 2. In plpgsql, explicitly model a domain type as being its base type plus some constraints --- that is, stop depending on domain_in() to enforce the constraints, and do it ourselves. This would mean that the FmgrInfo in a PLpgSQL_type struct would reference the base type's input function rather than domain_in, so we'd not have to be afraid to cache that. The constraints would be represented as a separate list, which we'd arrange to fetch from the typcache once per transaction. Hm. A bit sad to open code that in plpgsql instead of making it available more generally. The actual checking wouldn't be open-coded, it would come from domain_check() (or possibly a modified version of that). It is true that plpgsql would know more about domains than it does today, but I'm not sure I see another use-case for this type of logic. To some extent this is a workaround for the fact that plpgsql does type conversions the way it does (ie, by I/O rather than by using the parser's cast mechanisms). We've talked about changing that, but people seem to be afraid of the compatibility consequences, and I'm not planning to fight two compatibility battles concurrently ;-) I understand, but in this case, a compatibility can be enforced simply - we can support a only know cast mode and IO cast mode. IO cast is simple for lot of people, but there is a lot of performance issues and few errors related to this topic. But this is offtopic now. But, what can be related - for plpgsql_check is necessary some simple check of a assigning - that should to return error or warning This part of plpgsql_check is too complex now. Regards Pavel You're probably going to kill me because of the increased complexity, but how about making typecache.c smarter, more in the vein of relcache.c, and store the relevant info in there? 
I thought that's what I was proposing in point #1. regards, tom lane
Re: [HACKERS] logical column ordering
On 2/23/15 5:09 PM, Tomas Vondra wrote: Over the time I've heard various use cases for this patch, but in most cases it was quite speculative. If you have an idea where this might be useful, can you explain it here, or maybe point me to a place where it's described? For better or worse, table structure is a form of documentation for a system. As such, it's very valuable to group related fields in a table together. When creating a table, that's easy, but as soon as you need to alter it, your careful ordering can easily end up out the window. Perhaps to some that just sounds like pointless window dressing, but my experience is that on a complex system, the less organized things are, the more bugs you get due to overlooking something. The other reason for this patch (which it maybe doesn't support anymore?) is to allow Postgres to use an optimal physical ordering of fields on a page to reduce space wasted on alignment, as well as taking nullability and varlena into account. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] plpgsql versus domains
Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 12:31:46 -0500, Tom Lane wrote: 2. In plpgsql, explicitly model a domain type as being its base type plus some constraints --- that is, stop depending on domain_in() to enforce the constraints, and do it ourselves. This would mean that the FmgrInfo in a PLpgSQL_type struct would reference the base type's input function rather than domain_in, so we'd not have to be afraid to cache that. The constraints would be represented as a separate list, which we'd arrange to fetch from the typcache once per transaction. Hm. A bit sad to open code that in plpgsql instead of making it available more generally. The actual checking wouldn't be open-coded, it would come from domain_check() (or possibly a modified version of that). It is true that plpgsql would know more about domains than it does today, but I'm not sure I see another use-case for this type of logic. To some extent this is a workaround for the fact that plpgsql does type conversions the way it does (ie, by I/O rather than by using the parser's cast mechanisms). We've talked about changing that, but people seem to be afraid of the compatibility consequences, and I'm not planning to fight two compatibility battles concurrently ;-) You're probably going to kill me because of the increased complexity, but how about making typecache.c smarter, more in the vein of relcache.c, and store the relevant info in there? I thought that's what I was proposing in point #1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Composite index and min()
On 2/26/15 1:34 AM, James Sewell wrote: I have the following table: I'm moving this discussion to -general. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Tue, Feb 24, 2015 at 6:46 PM, Syed, Rahila rahila.s...@nttdata.com wrote: Hello, I've not read this logic yet, but ISTM there is a bug in that new WAL format because I got the following error and the startup process could not replay any WAL records when I set up replication and enabled wal_compression. LOG: record with invalid length at 0/3B0 LOG: record with invalid length at 0/3000518 LOG: Invalid block length in record 0/30005A0 LOG: Invalid block length in record 0/3000D60 ... Please find attached a patch which replays WAL records. Even with this patch it doesn't work correctly. The standby emits the following error messages. LOG: invalid block_id 255 at 0/3B0 LOG: record with invalid length at 0/30017F0 LOG: invalid block_id 255 at 0/3001878 LOG: record with invalid length at 0/30027D0 LOG: record with invalid length at 0/3002E58 ... Regards, -- Fujii Masao
Re: [HACKERS] Partitioning WIP patch
On 26-02-2015 PM 05:20, Jim Nasby wrote: On 2/25/15 7:57 PM, Amit Langote wrote: AIUI, as far as we stay with a design where partitions (children) are individually planned, that might be OK. But, I guess things will get more complicated. I think the role of a parent in planning would remain limited to drive partition-pruning. Am I missing something? Isn't the point of adding explicit partitioning to make it faster than plain inheritance? Presumably as part of that we'll eventually want to NOT plan children individually. Yes, we'd definitely want to get to a point where planning children individually is not necessary. But I am afraid we will have to get there a step at a time. IMHO, solving one problem of partition-pruning would be a good start. And that will definitely be part of parent's planning using partition bounds list (not pruning children one-by-one with relation_excluded_by_constraints()). I would certainly prefer that we support the capabilities of inheritance along with partitioning (because in some cases you want both). But it's going to limit what we can do internally. Just to clarify are you referring to inheritance relationship between a partitioned table and partitions? Yes. If it helps, the exact use-case I have in mind is using list-based partitioning + additional columns in some/all children (different between children). For example, if you need to track different types of customer payment methods, you'd have a payment parent table, a list partition for credit debit cards, a different list partition for bank accounts, etc. The reason I'd like to do this with partitioning vs plain inheritance is presumably as we build out partitioning we'll get very useful things like the ability to have FKs to properly partitioned tables. Insert tuple routing could also be useful. Unless I'm missing something again, isn't allowing partitions to have heterogeneous rowtypes a problem in the long run? 
I'm afraid I'm confused as to your stand regarding inheritance vs. new partitioning. To be specific, children with heterogeneous schemas sounds much like what inheritance would be good for, as you say. But then isn't that why we have to plan children individually, which you said new partitioning should get away from? With explicit partitioning, shouldn't we go in a direction where we remove some restrictions imposed by inheritance (think multiple inheritance)? I recall the link with which Alvaro had started the discussion; that link points to a remark of Tom's about something very related: http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us That post looks like Tom figured out a way to eliminate a problem that hurts inheritance, so that's good. My fear is that at some point we'll hit a problem with partitioning that we can't solve in the inheritance model. If we allow inheritance features into partitioning now, we'll be painted into a corner. If we disallow those features now, we can always re-enable them if we get to the point where we're in the clear. Does that make sense? Yes, it does. In fact, I do intend to keep them separate, the first attempt of which is to choose to NOT transform a PARTITION OF parent clause into INHERITS parent. Any code that may look like it's trying to do that is because the patch is not fully baked yet. Regards, Amit
[HACKERS] How to create virtual indexes on postgres
Hi, I use a Postgres 9.4 database. I am optimizing queries using the results of EXPLAIN and EXPLAIN ANALYZE, and sometimes I create indexes to speed them up. But I was not always successful: sometimes, even after I create an index, the planner does not use it. So my question is: can we know whether the planner will use an index before actually creating a real one? Can we create virtual or hypothetical indexes that are visible only to the planner and not to the user? Or is there any alternative way to do this? If so, please share it with me.
Re: [HACKERS] Transactions involving multiple postgres foreign servers
Added to the 2015-06 commitfest to attract some reviews and comments. On Tue, Feb 17, 2015 at 2:56 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: Hi All, Here are the steps and infrastructure for achieving atomic commits across multiple foreign servers. I have tried to address most of the concerns raised in this mail thread before. Let me know if I have left something out. Attached is a WIP patch implementing the same for postgres_fdw. I have tried to make it FDW-independent. A. Steps during transaction processing 1. When an FDW connects to a foreign server and starts a transaction, it registers that server with a boolean flag indicating whether that server is capable of participating in a two-phase commit. In the patch this is implemented using the function RegisterXactForeignServer(), which raises an error, thus aborting the transaction, if there is at least one foreign server incapable of 2PC in a multiserver transaction. This error is thrown as early as possible. If all the foreign servers involved in the transaction are capable of 2PC, the function just updates the information. As of now, in the patch the function is in the form of a stub. Whether a foreign server is capable of 2PC can be a. an FDW-level decision: e.g. file_fdw, as of now, is incapable of 2PC, but it can build the capabilities which can be used for all the servers using file_fdw b. a decision based on server version, type etc.: the FDW can decide that by looking at the server properties for each server c. a user decision, where the FDW can allow a user to specify it in the form of a CREATE/ALTER SERVER option. Implemented in the patch. For a transaction involving only a single foreign server, the current code remains unaltered as two-phase commit is not needed. The rest of the discussion pertains to a transaction involving more than one foreign server. At commit or abort time, the FDW receives callbacks with the appropriate events. The FDW then takes the following actions on each event. 2.
On XACT_EVENT_PRE_COMMIT event, the FDW coins one prepared transaction id per foreign server involved and saves it along with xid, dbid, foreign server id and user mapping and foreign transaction status = PREPARING in-memory. The prepared transaction id can be anything represented as byte string. Same information is flushed to the disk to survive crashes. This is implemented in the patch as prepare_foreign_xact(). Persistent and in-memory storages and their usages are discussed later in the mail. FDW then prepares the transaction on the foreign server. If this step is successful, the foreign transaction status is changed to PREPARED. If the step is unsuccessful, the local transaction is aborted and each FDW will receive XACT_EVENT_ABORT (discussed later). The updates to the foreign transaction status need not be flushed to the disk, as they can be inferred from the status of local transaction. 3. If the local transaction is committed, the FDW callback will get XACT_EVENT_COMMIT event. Foreign transaction status is changed to COMMITTING. FDW tries to commit the foreign transaction with the prepared transaction id. If the commit is successful, the foreign transaction entry is removed. If the commit is unsuccessful because of local/foreign server crash or network failure, the foreign prepared transaction resolver takes care of the committing it at later point of time. 4. If the local transaction is aborted, the FDW callback will get XACT_EVENT_ABORT event. At this point, the FDW may or may not have prepared a transaction on foreign server as per step 1 above. If it has not prepared the transaction, it simply aborts the transaction on foreign server; a server crash or network failure doesn't alter the ultimate result in this case. If FDW has prepared the foreign transaction, it updates the foreign transaction status as ABORTING and tries to rollback the prepared transaction. If the rollback is successful, the foreign transaction entry is removed. 
If the rollback is not successful, the foreign prepared transaction resolver takes care of aborting it at later point of time. B. Foreign prepared transaction resolver --- In the patch this is implemented as a built-in function pg_fdw_resolve(). Ideally the functionality should be run by a background worker process frequently. The resolver looks at each entry and invokes the FDW routine to resolve the transaction. The FDW routine returns boolean status: true if the prepared transaction was resolved (committed/aborted), false otherwise. The resolution is as follows - 1. If foreign transaction status is COMMITTING or ABORTING, commits or aborts the prepared transaction resp through the FDW routine. If the transaction is successfully resolved, it removes the foreign transaction entry. 2. Else, it checks if the local transaction was committed or
Re: [HACKERS] Partitioning WIP patch
On 2015-02-26 02:20:21 -0600, Jim Nasby wrote: The reason I'd like to do this with partitioning vs plain inheritance is presumably as we build out partitioning we'll get very useful things like the ability to have FKs to properly partitioned tables. Insert tuple routing could also be useful. The problem there imo isn't so much inheritance, but lack of working unique checks across partitions. That's something we can implement independent of this, it's just not trivial. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 05:41, Stephen Frost sfr...@snowman.net wrote: Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Here's an updated patch with a new test for this bug. I've been developing the fixes for these RLS issues as one big patch, but I suppose it would be easy to split up, if that's preferred. Thanks for working on all of this! I've brought this up to date with master and addressed the little bit of bitrot. I'm still reviewing it but I'm generally happy with the approach and would certainly welcome any additional thoughts from you or feedback from others. Thanks for doing that. I had been meaning to update it myself, but kept getting distracted by my day job. I think this should probably be committed as 2 separate patches, because there are really 2 separate issues being fixed here, and the commit comment only mentions the second one: 1). The planner changes and the first new regression test (update with from clause self join). This is a fix to the inheritance planner code, which wasn't properly handling the case of a query containing both target and non-target relations with RLS and inheritance. 2). The rewriter changes and the other regression tests (S.b. view on top of Row-level security). This is more than just a code tidy-up -- there's a real bug there. If you run those tests against HEAD, the update against the s.b. view fails to apply the RLS policy of the underlying table because the old recursion detection code thinks it has already handled that RTE. I should get more time to look at this over the weekend. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS
On 26 February 2015 at 05:43, Stephen Frost sfr...@snowman.net wrote: Dean, * Dean Rasheed (dean.a.rash...@gmail.com) wrote: Attached is a patch to make RLS checks run before attempting to insert/update any data rather than afterwards. Excellent, this I really like and it's a pretty straight-forward change. I wonder if there are some documentation updates which need to be done for this also? I'm planning to look as I vaguely recall mentioning the ordering of operations somewhere along the way. I also addressed the bitrot from the column-priv leak patch. Would be great to have you take a look at the latest and let me know if you have any further comments or suggestions. I'm definitely looking forward to getting these changes in soon. Thanks for the update. I don't recall any mention of ordering in the docs, but if there isn't anything there it makes sense to add something. I haven't thought about this patch in a while, but I still like the look of it. It's definitely neater to do these checks earlier and it's good to be able to get RLS-specific errors too. I'll take a look at the latest updates. Regards, Dean
Re: [HACKERS] Refactoring GUC unit conversions
On Fri, Feb 13, 2015 at 10:26 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: In the redesigned checkpoint_segments patch, Robert suggested keeping the setting's base unit as a number of segments, but allowing conversions from MB, GB etc. I started looking into that and found that adding a new unit to guc.c is quite messy. The conversions are done with complicated if-switch-case constructs. Attached is a patch to refactor that, making the conversions table-driven. This doesn't change functionality, it just makes the code nicer. Isn't it a good idea to allow even wal_keep_segments to be converted from MB, GB, etc.? Regards, -- Fujii Masao
Re: [HACKERS] Review of GetUserId() Usage
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation: tested, passed I have reviewed the patch. The patch is in excellent shape, does what is expected and discussed, and the changes are straightforward, so it looks good to go in. However, I have one question: what is the motive for splitting the function return value from SIGNAL_BACKEND_NOPERMISSION into SIGNAL_BACKEND_NOSUPERUSER and SIGNAL_BACKEND_NOPERMISSION? Is that required for some other upcoming patches, or just for simplicity? Currently we have a combined error for both, which is simply split into two. No issue as such, just curious, as it does not go well with the subject. You can mark this as ready for committer. The new status of this patch is: Waiting on Author
Re: [HACKERS] How about to have relnamespace and relrole?
Hello, thank you for reviewing. The attached are the third version of this patch. 0001-Add-regrole_v3.patch 0002-Add-regnamespace_v3.patch - Rearranged into a regrole patch and a regnamespace patch as seen above; each of them consists of changes for code, docs, regtests. The regnamespace patch depends on the other patch. - Removed the irrelevant change and corrected mistakes in comments. - Renamed role_or_oid to role_name_or_oid in regrolein(). - Changed the example name for regrole in the documentation to 'smithee' as an impossible-in-reality-but-well-known name, from 'john', the most common in the US (according to Wikipedia). At Tue, 24 Feb 2015 16:30:44 +0530, Jeevan Chalke jeevan.cha...@enterprisedb.com wrote in CAM2+6=v-fwqfkwf0pi3dacq-zry5wxcplvluqfkf2mf57_6...@mail.gmail.com I agree with Tom's concern on the MVCC issue, but we already hit that when we introduced regclass and the others. So I see no additional issue with these changes as such. About planner slowness, I guess the updated documentation looks perfect for that. So I went ahead reviewing these patches. All patches are straightforward, and since similar code already exists, I did not find any issue at the code level. They are consistent with other functions. One concern is about the arbitrary allocation of OIDs for the new objects - types, functions. They are isolated from other similar reg* types, but there's no help for it without global renumbering of static (system) OIDs. Patches apply with patch -p1. make, make install, make check have no issues. Testing was fine too. However, here are a few review comments on the patches attached: Review points on 0001-Add-regrole.patch --- 1. +#include "utils/acl.h" Can you please add it at the appropriate place, as the #include list is an ordered list? regrolein calls reg_role_oid in acl.c, which is declared in acl.h. 2. +char *role_or_oid = PG_GETARG_CSTRING(0); Can we have the variable named role_name_or_oid, like other similar functions? I might somehow have thought it a bit long. Fixed. 3.
+/* + * Normal case: parse the name into components and see if it matches any + * pg_role entries in the current search path. + */ I believe roles are never searched per search path. Thus the comments here need to be updated. Ouch. I forgot to edit it properly. I edited it. The correct catalog name is pg_authid. + /* Normal case: see if the name matches any pg_authid entry. */ I also edited the comments for regnamespacein. Review points on 0002-Add-regnamespace.patch --- 1. + * regnamespacein - converts classname to class OID Typos. Should be nspname instead of classname and namespace OID instead of class OID. Thank you for pointing that out. I fixed the same mistake in regrolein; p_ts_dict was there instead of pg_authid.. The names of OID kinds appear in multiple forms, that is, oprname and opr_name. Although I don't understand the reason, I followed the convention. Review points on 0003-Check-newpatch --- 1. @@ -1860,6 +1860,7 @@ StoreAttrDefault(Relation rel, AttrNumber attnum, Oid attrdefOid; ObjectAddress colobject, defobject; + Oid exprtype; This seems unrelated. Please remove. It's a trace of the previous code for ruling out the new OID types. Removed. Apart from this, it will be good if you have ONLY two patches, (1) for regrole and (2) for regnamespace, with all related changes in one patch. I mean, all role-related changes, i.e. 0001 + 0003 role-related changes + 0004 role-related changes and docs update, AND 0002 + 0003 nsp-related changes + 0004 nsp-related changes. I prudently separated it since I wasn't confident of the pertinence of the ruling out. I rearranged them as you advised, including docs. regards, -- Kyotaro Horiguchi NTT Open Source Software Center From 4d56a68e2bf2b7ee0da0447ad9f82f0b46277133 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Wed, 18 Feb 2015 14:38:32 +0900 Subject: [PATCH 1/2] Add regrole Add new OID alias type regrole.
The new type has the scope of the whole database cluster, so concerning object dependency it doesn't behave the same as the existing OID alias types, which have database scope. To get rid of confusion, inhibit constants of the new type from appearing where dependencies are made involving it. Documentation about this new type is added and some existing descriptions are modified according to the restriction of the type. In addition, a note is put about possible MVCC violation and optimization issues, which are general to all reg* types. --- doc/src/sgml/datatype.sgml| 55 +--- src/backend/bootstrap/bootstrap.c | 2 + src/backend/catalog/dependency.c | 22 src/backend/utils/adt/regproc.c | 98 +++ src/backend/utils/adt/selfuncs.c | 2 + src/backend/utils/cache/catcache.c
Re: [HACKERS] How about to have relnamespace and relrole?
Sorry, I fixed a silly typo in the documentation in the previous version. - of theses types has a significance... + of these types has a significance... # My fingers frequently slip as above.. I incremented the version of this revised patch to get rid of confusion. === Hello, thank you for reviewing. The attached are the fourth version of this patch. 0001-Add-regrole_v4.patch 0002-Add-regnamespace_v4.patch - Rearranged into a regrole patch and a regnamespace patch as seen above; each of them consists of changes for code, docs, regtests. The regnamespace patch depends on the other patch. - Removed the irrelevant change and corrected mistakes in comments. - Renamed role_or_oid to role_name_or_oid in regrolein(). - Changed the example name for regrole in the documentation to 'smithee' as an impossible-in-reality-but-well-known name, from 'john', the most common in the US (according to Wikipedia).
regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Bug in pg_dump
On Wed, Feb 25, 2015 at 7:46 AM, Gilles Darold gilles.dar...@dalibo.com wrote: This is a far better patch and the test to export/import the postgis_topology extension works great for me. Thanks for the work. Attached is a patch that uses an even better approach by querying only once all the FK dependencies of tables in extensions whose data is registered as dumpable by getExtensionMembership(). Could you check that it works correctly? My test case passes but an extra check would be nice. The patch footprint is pretty low so we may be able to backport this patch easily. -- Michael From 72e369ee725301003cf065b0bf9875724455dac1 Mon Sep 17 00:00:00 2001 From: Michael Paquier mich...@otacoo.com Date: Tue, 17 Feb 2015 07:39:23 + Subject: [PATCH 1/3] Fix ordering of tables part of extensions linked with constraints Additional checks are done on FK constraints that potentially link extension objects to each other, and data dump ordering is ensured. Note that this does not take into account foreign keys of tables that are not part of an extension linking to an extension table. --- src/bin/pg_dump/pg_dump.c | 80 +-- 1 file changed, 78 insertions(+), 2 deletions(-) diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c index 2b53c72..bbcd600 100644 --- a/src/bin/pg_dump/pg_dump.c +++ b/src/bin/pg_dump/pg_dump.c @@ -15236,7 +15236,8 @@ dumpRule(Archive *fout, DumpOptions *dopt, RuleInfo *rinfo) } /* - * getExtensionMembership --- obtain extension membership data + * getExtensionMembership --- obtain extension membership data and check FK + * dependencies among extension tables.
*/ void getExtensionMembership(Archive *fout, DumpOptions *dopt, ExtensionInfo extinfo[], @@ -15368,7 +15369,9 @@ getExtensionMembership(Archive *fout, DumpOptions *dopt, ExtensionInfo extinfo[] parsePGArray(extcondition, extconditionarray, nconditionitems) nconfigitems == nconditionitems) { - int j; + int j, i_conrelid, i_confrelid; + PQExpBuffer query2; + bool first_elt = true; for (j = 0; j nconfigitems; j++) { @@ -15423,6 +15426,79 @@ getExtensionMembership(Archive *fout, DumpOptions *dopt, ExtensionInfo extinfo[] } } } + + /* + * Now that all the TableInfoData objects have been created for + * all the extensions, check their FK dependencies and register + * them to ensure correct data ordering. Note that this is not + * a problem for user tables not included in an extension + * referencing with a FK tables in extensions as their constraint + * is declared after dumping their data. In --data-only mode the + * table ordering is ensured as well thanks to + * getTableDataFKConstraints(). + */ + query2 = createPQExpBuffer(); + + /* + * Query all the foreign key dependencies for all the extension + * tables found previously. Only tables whose data need to be + * have to be tracked. 
+ */ + appendPQExpBuffer(query2, + SELECT conrelid, confrelid + FROM pg_catalog.pg_constraint + WHERE contype = 'f' AND conrelid IN (); + + for (j = 0; j nconfigitems; j++) + { +TableInfo *configtbl; +Oid configtbloid = atooid(extconfigarray[j]); + +configtbl = findTableByOid(configtbloid); +if (configtbl == NULL || configtbl-dataObj == NULL) + continue; + +if (first_elt) +{ + appendPQExpBuffer(query2, %u, configtbloid); + first_elt = false; +} +else + appendPQExpBuffer(query2, , %u, configtbloid); + } + appendPQExpBuffer(query2, );); + + res = ExecuteSqlQuery(fout, query2-data, PGRES_TUPLES_OK); + ntups = PQntuples(res); + + i_conrelid = PQfnumber(res, conrelid); + i_confrelid = PQfnumber(res, confrelid); + + /* Now get the dependencies and register them */ + for (j = 0; j ntups; j++) + { +Oid conrelid, confrelid; +TableInfo *reftable, *contable; + +conrelid = atooid(PQgetvalue(res, j, i_conrelid)); +confrelid = atooid(PQgetvalue(res, j, i_confrelid)); +contable = findTableByOid(conrelid); +reftable = findTableByOid(confrelid); + +if (reftable == NULL || + reftable-dataObj == NULL || + contable == NULL || + contable-dataObj == NULL) + continue; + +/* + * Make referencing TABLE_DATA object depend on the + * referenced table's TABLE_DATA object. + */ +addObjectDependency(contable-dataObj-dobj, + reftable-dataObj-dobj.dumpId); + } + resetPQExpBuffer(query2); } if (extconfigarray) free(extconfigarray); -- 2.3.0 From 1d8fe1c39ec5049c39810f94153d347971738616 Mon Sep 17 00:00:00 2001 From: Michael Paquier mich...@otacoo.com Date: Tue, 17 Feb 2015 22:29:28 +0900 Subject: [PATCH 2/3] Make prove_check install contents of current directory as well This is useful for example for TAP tests in extensions. --- src/Makefile.global.in | 1 + 1 file changed, 1 insertion(+) diff --git a/src/Makefile.global.in
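Written out for two hypothetical config-table OIDs (16384 and 16385 are placeholders), the query the patch assembles is simply:

```sql
SELECT conrelid, confrelid
FROM pg_catalog.pg_constraint
WHERE contype = 'f' AND conrelid IN (16384, 16385);
```

Each returned (conrelid, confrelid) pair then becomes an addObjectDependency() call, so the referenced table's data is dumped before the referencing table's data.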
[HACKERS] BDR Multiple database
Hi all, I'm using PostgreSQL BDR 9.4.1 to test BDR capability right now.

$ psql --version
psql (PostgreSQL) 9.4.1

We want to use BDR with multiple databases, but none of the documentation shows an example of how to configure BDR with multiple databases. We've tried many combinations, such as the ones below, but still no luck. Can anyone point us in the right direction?

1st combination
bdr.connections = 'bdrnode02db1'
bdr.bdrnode02db1_dsn = 'dbname=db1 host=172.17.42.1 port=49264 user=postgres'
bdr.connections = 'bdrnode02db2'
bdr.bdrnode02db2_dsn = 'dbname=db2 host=172.17.42.1 port=49264 user=postgres'

2nd combination
bdr.connections = 'bdrnode02'
bdr.bdrnode02_dsn = 'dbname=db1 host=172.17.42.1 port=49264 user=postgres'
bdr.bdrnode02_dsn = 'dbname=db2 host=172.17.42.1 port=49264 user=postgres'

Regards, Jirayut
Re: [HACKERS] Partitioning WIP patch
On 2/25/15 7:57 PM, Amit Langote wrote: On 26-02-2015 AM 10:31, Jim Nasby wrote: On 2/25/15 7:24 PM, Amit Langote wrote: Does ALTER TABLE parent_monthly_x_201401 ADD COLUMN foo still operate the same as today? I'd like to see us continue to support that, but perhaps it would be wise to not paint ourselves into that corner just yet. Nothing prevents that from working, at least at the moment. Ok, but is that what we really want? If we release it that way we'll be stuck with it forever. AIUI, as long as we stay with a design where partitions (children) are individually planned, that might be OK. But, I guess things will get more complicated. I think the role of a parent in planning would remain limited to driving partition-pruning. Am I missing something? Isn't the point of adding explicit partitioning to make it faster than plain inheritance? Presumably as part of that we'll eventually want to NOT plan children individually. I would certainly prefer that we support the capabilities of inheritance along with partitioning (because in some cases you want both). But it's going to limit what we can do internally. Just to clarify, are you referring to the inheritance relationship between a partitioned table and its partitions? Yes. If it helps, the exact use-case I have in mind is using list-based partitioning + additional columns in some/all children (different between children). For example, if you need to track different types of customer payment methods, you'd have a payment parent table, a list partition for credit/debit cards, a different list partition for bank accounts, etc. The reason I'd like to do this with partitioning vs plain inheritance is that presumably, as we build out partitioning, we'll get very useful things like the ability to have FKs to properly partitioned tables. Insert tuple routing could also be useful. With explicit partitioning, shouldn't we go in a direction where we remove some restrictions imposed by inheritance (think multiple inheritance)?
I recall the link Alvaro started the discussion with; I think it points to a remark of Tom's about something very related: http://www.postgresql.org/message-id/1598.1399826...@sss.pgh.pa.us That post looks like Tom figured out a way to eliminate a problem that hurts inheritance, so that's good. My fear is that at some point we'll hit a problem with partitioning that we can't solve in the inheritance model. If we allow inheritance features into partitioning now, we'll be painted into a corner. If we disallow those features now, we can always re-enable them if we get to the point where we're in the clear. Does that make sense? -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to create virtual indexes on postgres
On Thu, Feb 26, 2015 at 6:20 PM, Sreerama Manoj manoj.sreerama...@gmail.com wrote: So my question was: can we know whether the planner will use the index before actually creating a real index? Or can we create a virtual or hypothetical index that is known only to the planner and not the user? Or is there any alternative to do it? If present, share with me. No, the index needs to be created to allow the planner to use it. This reminds me of this project though, which if I recall correctly has patched Postgres to allow the use of hypothetical indexes: https://sourceforge.net/projects/hypotheticalind/ -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New CF app deployment
Hi, This thread seems relevant. Please guide me on how I can access older CF pages. E.g. the thread http://www.postgresql.org/message-id/flat/51f19059.7050...@pgexperts.com#51f19059.7050...@pgexperts.com mentions the following link: The MSVC portion of this fix got completely lost in the void: https://commitfest.postgresql.org/action/patch_view?id=1330 The above link results in the following error: "Not found. The specified URL was not found." Please do let me know if I missed something. Thanks. Regards, Muhammad Asif Naeem On Tue, Feb 24, 2015 at 11:59 PM, Peter Eisentraut pete...@gmx.net wrote: On 2/22/15 3:12 PM, Magnus Hagander wrote: Would you suggest removing the automated system completely, or keep it around and just make it possible to override it (either by removing the note that something is a patch, or by making something that's not listed as a patch become marked as such)? I would remove it completely. It has been demonstrated to not work. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How to create virtual indexes on postgres
But it runs with the Postgres 9.1 version, and I use 9.4, so I think I can't use that. Alternatively, is there any provision in Postgres to know the use (increase in performance) of an index before creating that index? On Thu, Feb 26, 2015 at 5:37 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Feb 26, 2015 at 6:20 PM, Sreerama Manoj manoj.sreerama...@gmail.com wrote: So my question was: can we know whether the planner will use the index before actually creating a real index? Or can we create a virtual or hypothetical index that is known only to the planner and not the user? Or is there any alternative to do it? If present, share with me. No, the index needs to be created to allow the planner to use it. This reminds me of this project though, which if I recall correctly has patched Postgres to allow the use of hypothetical indexes: https://sourceforge.net/projects/hypotheticalind/ -- Michael
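One patch-free workaround on any version: create the index inside a transaction, look at the plan, and roll back. The index really is built (so this costs build time and takes a lock that blocks writes to the table), but nothing survives the rollback. A sketch, assuming a hypothetical table t(a int):

```sql
BEGIN;
CREATE INDEX tmp_a_idx ON t (a);       -- really built, but only until ROLLBACK
EXPLAIN SELECT * FROM t WHERE a = 42;  -- see whether the planner picks tmp_a_idx
ROLLBACK;                              -- index and its cost disappear
```

This doesn't avoid the build cost the way a true hypothetical index would, but it does answer the "will the planner use it?" question without leaving anything behind.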
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Tom Lane wrote: Thoughts? Any objections to pushing this? Is there any reason at all to keep MemoryContextResetButPreserveChildren()? Since your patch doesn't add any callers, it seems pretty likely that there's none anywhere. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] GSoC idea - Simulated annealing to search for query plans
Dear Hackers, I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of Computer Science at WUT, Poland. Last year I was quite involved with evolutionary algorithms, and during my research I found out about GEQO in Postgres. I also found out that there are plans to try a different approach to finding optimal query plans and thought it could be a good project for GSoC. I'm interested in one of the old TODO items related to the optimizer: 'Consider compressed annealing to search for query plans'. I believe this would be potentially beneficial to Postgres, to check whether such a heuristic could really choose better query plans than GEQO does. Judging by the results that simulated annealing gives on the travelling salesman problem, it looks like a simpler and potentially more effective way of doing combinatorial optimization. As deliverables of such a project I would provide a simple implementation of a basic simulated-annealing optimizer and some form of quantitative comparison with GEQO. I see that this may be considerably bigger than most GSoC projects, but I would like to know your opinion. Do you think that this would be beneficial enough to make a proper GSoC project? I would also like to know if you have any additional ideas about this project. Best regards, Grzegorz Parka
Re: [HACKERS] Precedence of standard comparison operators
Andres Freund and...@anarazel.de writes: On February 26, 2015 10:29:18 PM CET, Peter Eisentraut pete...@gmx.net wrote: My suggestion was to treat this like the standard_conforming_string change. That is, warn for many years before changing. I don't think scs is a good example to follow. Yeah. For one thing, there wouldn't be any way to suppress the warning other than to parenthesize your code, which I would find problematic because it would penalize standard-conforming queries. I'd prefer an arrangement whereby once you fix your code to be standard-conforming, you're done. A possible point of compromise would be to leave the warning turned on by default, at least until we get a better sense of how this would play out in the real world. I continue to suspect that we're making a mountain out of, if not a molehill, at least a hillock. I think most sane people would have parenthesized their queries to start with rather than go look up whether IS DISTINCT FROM binds tighter than = ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
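To make the hillock concrete: under the 9.4-era grammar, = binds less tightly than <, so an unparenthesized chain silently gets one particular grouping; under the proposed change the comparison operators would share a single nonassociative precedence level. A hedged sketch (the "proposed" behavior describes the change under discussion, not anything committed):

```sql
SELECT 2 < 3 = true;    -- old grammar: parsed as (2 < 3) = true
SELECT (2 < 3) = true;  -- standard-conforming form; unambiguous under either grammar
-- Under the proposed precedence, the first unparenthesized form would be
-- rejected with a syntax error rather than silently grouped.
```

This is why Tom's point about a one-time fix holds: once the query is parenthesized, it behaves identically before and after the change, with no warning to suppress.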
Re: [HACKERS] logical column ordering
Jim Nasby jim.na...@bluetreble.com writes: On 2/26/15 4:01 PM, Alvaro Herrera wrote: Josh Berkus wrote: Oh, I didn't realize there weren't commands to change the LCO. Without at least SQL syntax for LCO, I don't see why we'd take it; this sounds more like a WIP patch. The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then let the rest to be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows. +1. This patch is already several years old; lets not delay it further. I tend to agree with that, but how are we going to test things if there's no mechanism to create a table in which the orderings are different? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of standard comparison operators
On 2/26/15 4:09 PM, Tom Lane wrote: Andres Freund and...@anarazel.de writes: On February 26, 2015 10:29:18 PM CET, Peter Eisentraut pete...@gmx.net wrote: My suggestion was to treat this like the standard_conforming_string change. That is, warn for many years before changing. I don't think scs is a good example to follow. Yeah. For one thing, there wouldn't be any way to suppress the warning other than to parenthesize your code, which I would find problematic because it would penalize standard-conforming queries. I'd prefer an arrangement whereby once you fix your code to be standard-conforming, you're done. A possible point of compromise would be to leave the warning turned on by default, at least until we get a better sense of how this would play out in the real world. I continue to suspect that we're making a mountain out of, if not a molehill, at least a hillock. I think most sane people would have parenthesized their queries to start with rather than go look up whether IS DISTINCT FROM binds tighter than = ... Question of sanity aside, I can tell you that many business queries are written with little more sophistication than monkeys with typewriters, where you keep the monkeys fed with bananas and paper until your query results (not the SQL) look like what someone expects them to look like. Then the output of that version is held as sacrosanct, and any future SQL changes are wrong unless they produce the expected result changes. In my experience this happens because some poor business person just needs to get their job done and either isn't allowed to bother the data people or the data people are just too busy, so they're stuck doing it themselves. From what I've seen, SQL written by developers is often a bit better than this... but not a lot. And part of that salvation is because application queries tend to be a lot less complex than reporting/BI ones. I don't have a great feel for how much of an impact this specific change would have on that...
the examples so far have all been pretty esoteric. I suspect most people writing such wonderful SQL don't know about IS DISTINCT FROM, nor would they try doing things like bool_a > bool_b = bool_c. So there may actually be very little code to fix, but I think we at least need a way for users to verify that. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 17:01:53 -0500, Tom Lane wrote: We've discussed doing $SUBJECT off and on for nearly ten years, the oldest thread I could find about it being here: http://www.postgresql.org/message-id/flat/1186435268.16321.37.ca...@dell.linuxdev.us.dell.com It's come up again every time we found another leak of dead child contexts, which happened twice last year for example. And that patch I'm pushing for expanded out-of-line objects really needs this to become the default behavior anywhere that we can detoast objects. I don't object to the behavioural change per se, rather the contrary, but I find it a pretty bad idea to change the meaning of an existing functioname this way. That's pretty much the whole point I think. Or are you arguing for an alternative proposal in which we remove MemoryContextReset (or at least rename it to something new) and thereby intentionally break all code that uses MemoryContextReset? I can't say that I find that an improvement. ... Without a compiler erroring out people won't notice that suddenly MemoryContextReset deletes much more; leading to possibly hard to find errors. Context resets frequently are in error paths, and those won't necessarily be hit when running with assertions enabled. With all due respect, that's utterly wrong. I have looked at every single MemoryContextReset call in the codebase, and as far as I can see the *only* one that is in an error path is elog.c:336, which I'm quite certain is wrong anyway (the other reset of ErrorContext in that file uses MemoryContextResetAndDeleteChildren, this one should too). I see no reason whatever to believe that this change is likely to do anything except fix heretofore-unnoticed memory leaks. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Feb 27, 2015 at 6:54 AM, Rahila Syed rahilasye...@gmail.com wrote: Hello, Even this patch doesn't work fine. The standby emits the following error messages. Yes, this bug remains unsolved. I am still working on resolving this. The following chunk IDs have been added in the attached patch as suggested upthread.

+#define XLR_CHUNK_BLOCK_REFERENCE 0x10
+#define XLR_CHUNK_BLOCK_HAS_IMAGE 0x04
+#define XLR_CHUNK_BLOCK_HAS_DATA 0x08

XLR_CHUNK_BLOCK_REFERENCE denotes the chunk ID of block references. XLR_CHUNK_BLOCK_HAS_IMAGE is a replacement of BKPBLOCK_HAS_IMAGE and XLR_CHUNK_BLOCK_HAS_DATA a replacement of BKPBLOCK_HAS_DATA. Before sending a new version, be sure that this gets fixed by, for example, building up a master with a standby replaying WAL and running make installcheck-world or similar. If the standby does not complain at all, you have good chances of not having bugs. You could also build with WAL_DEBUG to check record consistency. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of standard comparison operators
On 2015-02-26 20:13:34 +, Simon Riggs wrote: On 26 February 2015 at 15:56, Tom Lane t...@sss.pgh.pa.us wrote: I think the way to do this is to have a pluggable parser, so users can choose 1) old parser, 2) new, better parser, 3) any other parser they fancy that they maintain to ensure application compatibility. We've got a pluggable executor and optimizer, so why not a parser too? Implementing that in the same way we have done for executor and optimizer is a 1 day patch, so easily achievable for 9.5. I don't find that particularly attractive either. It seems quite unlikely to me that anyone would actually use such a hook; replacing the whole parser would be essentially unmaintainable. Nobody uses the hooks you mention to replace the whole planner or whole executor --- there are wrappers out there, which is a different thing altogether. But you could not undo the issue at hand here without at the very least a whole new copy of gram.y, which would need to be updated constantly if you wanted it to keep working across more than one release. I can see a point in having the ability to plug in a parser - I just don't think it has much to do with this topic. It'd be a nightmare to maintain two versions of our parser; I don't think anybody wants to patch more than one. Whole planner replacement is used for extensions by CitusData and NTT, and will definitely be used in the future for other apps. s/planner/parser/? Because replacing the planner can already be done as a whole without much problems. Whole executor replacement is also used by one extension to produce DDL triggers. Hm? In any case, whole planner replacement would be very desirable for people trying to minimize code churn in their applications. As soon as it exists, I will use to make some MySQL-only apps work seamlessly against PostgreSQL, such as WordPress. It doesn't need to be 100% perfect MySQL, it just needs to be enough to make the app work. 
Maintaining a code base that can work against multiple databases is hard. Writing it against one main database and maintaining a parser layer would be much easier. Assuming you meant parser: Maybe. I have my doubts somebody will actually invest the significant amount of time to develop something like that. But I don't have a problem providing the capability; there seems little point in allowing the planner to be replaced but not the parser. I just don't see that it has much to do with this discussion. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 2/26/15 4:01 PM, Alvaro Herrera wrote: Josh Berkus wrote: On 02/26/2015 01:54 PM, Alvaro Herrera wrote: This patch decouples these three things so that they can changed freely -- but provides no user interface to do so. I think that trying to only decouple the thing we currently have in two pieces, and then have a subsequent patch to decouple again, is additional conceptual complexity for no gain. Oh, I didn't realize there weren't commands to change the LCO. Without at least SQL syntax for LCO, I don't see why we'd take it; this sounds more like a WIP patch. The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then let the rest to be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows. +1. This patch is already several years old; lets not delay it further. Plus, I suspect that you could hack the catalog directly if you really wanted to change LCO. Worst case you could create a C function to do it. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
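For anyone wanting to poke at this today: attnum currently serves as both physical position and display position, which is exactly the coupling the patch splits apart. The current ordering can at least be inspected from the catalog (which is also where any direct hack would have to happen); an inspection sketch for a hypothetical table some_table:

```sql
SELECT attname, attnum
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass
  AND attnum > 0        -- skip system columns (ctid, xmin, ...)
ORDER BY attnum;        -- today: physical order and display order coincide
```

Actually updating attnum by hand is unsafe (dependencies, defaults, and indexes reference it), which is why a C function or proper syntax is the realistic route for testing.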
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Hi, On 2015-02-26 17:01:53 -0500, Tom Lane wrote: We've discussed doing $SUBJECT off and on for nearly ten years, the oldest thread I could find about it being here: http://www.postgresql.org/message-id/flat/1186435268.16321.37.ca...@dell.linuxdev.us.dell.com It's come up again every time we found another leak of dead child contexts, which happened twice last year for example. And that patch I'm pushing for expanded out-of-line objects really needs this to become the default behavior anywhere that we can detoast objects. I don't object to the behavioural change per se, rather the contrary, but I find it a pretty bad idea to change the meaning of an existing function name this way. Without a compiler erroring out, people won't notice that suddenly MemoryContextReset deletes much more, leading to possibly hard-to-find errors. Context resets frequently are in error paths, and those won't necessarily be hit when running with assertions enabled. Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Andres Freund and...@2ndquadrant.com writes: I'd really not even be surprised if a committer backpatches a MemoryContextReset() addition, not realizing it behaves differently in the back branches. As far as that goes, the only consequence would be a possible memory leak in the back branches; it would not be a really fatal problem. I'd rather live with that risk than with the sort of intentional API break (and ensuing back-patch pain) that you're proposing. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 17:45:26 -0500, Tom Lane wrote: With all due respect, that's utterly wrong. I have looked at every single MemoryContextReset call in the codebase, and as far as I can see the *only* one that is in an error path is elog.c:336, which I'm quite certain is wrong anyway (the other reset of ErrorContext in that file uses MemoryContextResetAndDeleteChildren, this one should too). Sure, in the pg codebase. But there definitely are extensions using memory contexts. And there's code that has to work across several branches. Backpatching alone imo presents a risk; there's nothing to warn you three years down the road that the MemoryContextReset() you just backpatched doesn't work the same in the backbranches. If the changes breaks some code it's likely actually a good thing: Because, as you say, using MemoryContextReset() will likely be the wrong thing, and they'll want to fix it for the existing branches as well. Is that likely to happen? I doubt it. They'll just mutter under their breath about useless API breakage and move on. Basically, this is a judgment call, and I disagree with your judgment. I think changing the behavior of MemoryContextReset is exactly what we want to have happen. (And I've got to say that I'm losing patience with backwards-compatibility arguments taken to this degree. We might as well stop development entirely.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning WIP patch
On Fri, Feb 27, 2015 at 3:24 AM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-26 12:15:17 +0900, Amit Langote wrote: On 26-02-2015 AM 05:15, Josh Berkus wrote: On 02/24/2015 12:13 AM, Amit Langote wrote: Here is an experimental patch that attempts to implement this. I would love to have it for 9.5, but I guess the patch isn't nearly baked enough for that? I'm not quite sure what would qualify as baked enough for 9.5 though we can surely try to reach some consensus on various implementation aspects and perhaps even get it ready in time for 9.5. I think it's absolutely unrealistic to get this into 9.5. There's barely been any progress on the current (last!) commitfest - where on earth should the energy come to make this patch ready? And why would that be fair against all the others that have submitted in time? +1. There are many other patches pending the in CF app waiting for feedback, while this one showed up after the last CF deadline for 9.5 and needs design and spec decisions that should not be taken lightly at the end of a major release development cycle. Please let's not rush into something we may regret. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
We've discussed doing $SUBJECT off and on for nearly ten years, the oldest thread I could find about it being here: http://www.postgresql.org/message-id/flat/1186435268.16321.37.ca...@dell.linuxdev.us.dell.com It's come up again every time we found another leak of dead child contexts, which happened twice last year for example. And that patch I'm pushing for expanded out-of-line objects really needs this to become the default behavior anywhere that we can detoast objects. So attached is a patch to do it. I've verified that this passes make check-world, not that that proves all that much :-( I did not make an attempt to s/MemoryContextResetAndDeleteChildren/MemoryContextReset/ globally, as it's certainly unnecessary to do that for testing purposes. I'm not sure whether we should do so, or skip the code churn (there's about 30 occurrences in HEAD). We'd want to keep the macro in any case to avoid unnecessary breakage of 3rd-party code. Thoughts? Any objections to pushing this?

regards, tom lane

diff --git a/src/backend/utils/mmgr/README b/src/backend/utils/mmgr/README
index 45e610d..e01bcd4 100644
*** a/src/backend/utils/mmgr/README
--- b/src/backend/utils/mmgr/README
*** lifetimes that only partially overlap ca
*** 125,133
  from different trees of the context forest (there are some examples in
  the next section).

! For convenience we will also want operations like reset/delete all
! children of a given context, but don't reset or delete that context
! itself.

  Globally Known Contexts
--- 125,137
  from different trees of the context forest (there are some examples in
  the next section).

! Actually, it turns out that resetting a given context should almost
! always imply deleting (not just resetting) any child contexts it has.
! So MemoryContextReset() means that, and if you really do want a forest of
! empty contexts you need to call MemoryContextResetButPreserveChildren().
!
! For convenience we also provide operations like reset/delete all children
! of a given context, but don't reset or delete that context itself.

  Globally Known Contexts
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 202bc78..f40c33e 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*** MemoryContextInit(void)
*** 132,145
  /*
   * MemoryContextReset
   *		Release all space allocated within a context and its descendants,
   *		but don't delete the contexts themselves.
   *
   * The type-specific reset routine handles the context itself, but we
   * have to do the recursion for the children.
   */
  void
! MemoryContextReset(MemoryContext context)
  {
  	AssertArg(MemoryContextIsValid(context));
--- 132,176
  /*
   * MemoryContextReset
+  *		Release all space allocated within a context and delete all its
+  *		descendant contexts (but not the named context itself).
+  *
+  * The type-specific reset routine handles the context itself, but we
+  * have to do the recursion for the children.
+  */
+ void
+ MemoryContextReset(MemoryContext context)
+ {
+ 	AssertArg(MemoryContextIsValid(context));
+
+ 	/* save a function call in common case where there are no children */
+ 	if (context->firstchild != NULL)
+ 		MemoryContextDeleteChildren(context);
+
+ 	/* Nothing to do if no pallocs since startup or last reset */
+ 	if (!context->isReset)
+ 	{
+ 		(*context->methods->reset) (context);
+ 		context->isReset = true;
+ 		VALGRIND_DESTROY_MEMPOOL(context);
+ 		VALGRIND_CREATE_MEMPOOL(context, 0, false);
+ 	}
+ }
+
+ /*
+  * MemoryContextResetButPreserveChildren
   *		Release all space allocated within a context and its descendants,
   *		but don't delete the contexts themselves.
   *
+  * Note: this was formerly the behavior of plain MemoryContextReset(), but
+  * we found out that you almost always want to delete children not keep them.
+  * This API may indeed have no use at all, but we keep it for the moment.
+  *
   * The type-specific reset routine handles the context itself, but we
   * have to do the recursion for the children.
   */
  void
! MemoryContextResetButPreserveChildren(MemoryContext context)
  {
  	AssertArg(MemoryContextIsValid(context));
*** MemoryContextResetChildren(MemoryContext
*** 171,177
  	AssertArg(MemoryContextIsValid(context));

  	for (child = context->firstchild; child != NULL; child = child->nextchild)
! 		MemoryContextReset(child);
  }

  /*
--- 202,208
  	AssertArg(MemoryContextIsValid(context));

  	for (child = context->firstchild; child != NULL; child = child->nextchild)
! 		MemoryContextResetButPreserveChildren(child);
  }

  /*
*** MemoryContextDeleteChildren(MemoryContex
*** 226,248
  }

  /*
- * MemoryContextResetAndDeleteChildren
- *		Release all space allocated within a context and delete all
- *		its descendants.
- *
- * This is a common combination case where we want to preserve
Re: [HACKERS] json_populate_record issue - TupleDesc reference leak
Andrew Dunstan and...@dunslane.net writes: This doesn't look quite right. Shouldn't we unconditionally release the Tupledesc before the returns at lines 2118 and 2127, just as we do at the bottom of the function at line 2285?

I think Pavel's patch is probably OK as-is, because the tupdesc returned by get_call_result_type isn't reference-counted; but I agree the code would look cleaner your way. If the main exit isn't bothering to distinguish this then the early exits should not either.

What I'm wondering about, though, is this bit at line 2125:

	/* same logic as for json */
	if (!have_record_arg && rec)
		PG_RETURN_POINTER(rec);

If that's supposed to be the same logic as in the other path, then how is it that !have_record_arg has anything to do with whether the JSON object is empty? Either the code is broken, or the comment is.

			regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
On 2015-02-26 16:16:54 -0600, Jim Nasby wrote: On 2/26/15 4:01 PM, Alvaro Herrera wrote: The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then leave the rest to be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows.

I agree that it's a sane order of developing things. But: I don't think it makes sense to commit it without the capability to change the order. Not so much because it'll not serve a purpose at that point, but rather because it'll be essentially untestable. And this is a patch that needs a fair amount of changes, both automated, and manual.

At least during development I'd even add a function that randomizes the physical order of columns, and keeps the logical one. Running the regression tests that way seems likely to catch a fair number of bugs.

+1. This patch is already several years old; let's not delay it further. Plus, I suspect that you could hack the catalog directly if you really wanted to change LCO. Worst case you could create a C function to do it.

I don't think that's sufficient for testing purposes. Not only for the patch itself, but also for getting it right in new code.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
On 2015-02-26 18:05:46 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2015-02-26 17:45:26 -0500, Tom Lane wrote: If the changes breaks some code it's likely actually a good thing: Because, as you say, using MemoryContextReset() will likely be the wrong thing, and they'll want to fix it for the existing branches as well. Is that likely to happen? I doubt it. They'll just mutter under their breath about useless API breakage and move on. Maybe. Basically, this is a judgment call, and I disagree with your judgment. Right. That's ok, happens all the time. And I've got to say that I'm losing patience with backwards-compatibility arguments taken to this degree. We might as well stop development entirely. Meh, normally you're on the side I'm on right now... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] MemoryContext reset/delete callbacks
We discussed this idea a couple weeks ago. The core of it is that when a memory context is being deleted, you might want something extra to happen beyond just pfree'ing everything in the context. I'm thinking in particular that this might provide a nice solution to the problem we discussed earlier today of managing cached information about a domain's CHECK constraints: a function could make a reference-counted link to some data in its FmgrInfo cache, and use a memory context callback on the context containing the FmgrInfo to ensure that the reference count gets decremented when needed and not before.

Attached is a draft patch for this. I've not really tested it more than seeing that it compiles, but it's so simple that there are unlikely to be bugs as such in it. There are some design decisions that could be questioned though:

1. I used ilists for the linked list of callback requests. This creates a dependency from memory contexts to lib/ilist. That's all right at the moment since lib/ilist does no memory allocation, but it might be logically problematic. We could just use explicit struct foo * links with little if any notational penalty, so I wonder if that would be better.

2. I specified that callers have to allocate the storage for the callback structs. This saves a palloc cycle in just about every use-case I've thought of, since callers generally will need to palloc some larger struct of their own and they can just embed the MemoryContextCallback struct in that. It does seem like this offers a larger surface for screwups, in particular putting the callback struct in the wrong memory context --- but there's plenty of surface for that type of mistake anyway, if you put whatever the void *arg is pointing at in the wrong context. So I'm OK with this but could also accept a design in which MemoryContextRegisterResetCallback takes just a function pointer and a void *arg and palloc's the callback struct for itself in the target context. I'm unsure whether that adds enough safety to justify a separate palloc.

3. Is the void *arg API for the callback func sufficient? I considered also passing it the MemoryContext, but couldn't really find a use-case to justify that.

4. I intentionally didn't provide a MemoryContextDeregisterResetCallback API. Since it's just a singly-linked list, that could be an expensive operation and so I'd rather discourage it. In the use cases I've thought of, you'd want the callback to remain active for the life of the context anyway, so there would be no need. (And, of course, there is slist_delete for the truly determined ...)

Comments?

			regards, tom lane

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 202bc78..624d08a 100644
*** a/src/backend/utils/mmgr/mcxt.c
--- b/src/backend/utils/mmgr/mcxt.c
*** MemoryContext CurTransactionContext = NU
*** 54,59 ****
--- 54,60 ----
  /* This is a transient link to the active portal's memory context: */
  MemoryContext PortalContext = NULL;
  
+ static void MemoryContextCallResetCallbacks(MemoryContext context);
  static void MemoryContextStatsInternal(MemoryContext context, int level);
  
  /*
*** MemoryContextReset(MemoryContext context
*** 150,155 ****
--- 151,157 ----
  	/* Nothing to do if no pallocs since startup or last reset */
  	if (!context->isReset)
  	{
+ 		MemoryContextCallResetCallbacks(context);
  		(*context->methods->reset) (context);
  		context->isReset = true;
  		VALGRIND_DESTROY_MEMPOOL(context);
*** MemoryContextDelete(MemoryContext contex
*** 196,201 ****
--- 198,211 ----
  	MemoryContextDeleteChildren(context);
  
  	/*
+ 	 * It's not entirely clear whether 'tis better to do this before or after
+ 	 * delinking the context; but an error in a callback will likely result in
+ 	 * leaking the whole context (if it's not a root context) if we do it
+ 	 * after, so let's do it before.
+ 	 */
+ 	MemoryContextCallResetCallbacks(context);
+ 
+ 	/*
  	 * We delink the context from its parent before deleting it, so that if
  	 * there's an error we won't have deleted/busted contexts still attached
  	 * to the context tree. Better a leak than a crash.
*** MemoryContextResetAndDeleteChildren(Memo
*** 243,248 ****
--- 253,308 ----
  }
  
  /*
+  * MemoryContextRegisterResetCallback
+  *		Register a function to be called before next context reset/delete.
+  *		Such callbacks will be called in reverse order of registration.
+  *
+  * The caller is responsible for allocating a MemoryContextCallback struct
+  * to hold the info about this callback request, and for filling in the
+  * func and arg fields in the struct to show what function to call with
+  * what argument. Typically the callback struct should be allocated within
+  * the specified context, since that means it will automatically be freed
+  * when no longer needed.
+  *
+  * There is no API for deregistering a callback once registered. If you
+  * want
Re: [HACKERS] Partitioning WIP patch
On 27-02-2015 AM 03:18, Josh Berkus wrote: On 02/25/2015 07:15 PM, Amit Langote wrote: I'm not quite sure what would qualify as baked enough for 9.5 though we can surely try to reach some consensus on various implementation aspects and perhaps even get it ready in time for 9.5.

Well, we don't have long at all to do that. I guess I'm asking what kind of completeness of code we have; is this basically done pending API changes and bugs, or are there major bits (like, say, pg_dump and EXPLAIN support) which are completely unimplemented?

I would say I am not entirely sure/satisfied about some decisions I have made (or not) when writing even the basic patch. Yes, pg_dump/EXPLAIN/psql, etc. are not touched. So, it seems it might not be fair to claim it's actually something for 9.5. Let me just call it WIP for a while, while I keep working on it and receiving feedback.

Only one concern I can remember someone had raised is that having to evaluate an expression for every row during bulk-inserts may end up being pretty expensive. Though, we might have to live with that.

Well, it's not more expensive than having to materialize the value from a trigger and store it on disk. The leading one here would be functions over timestamp; for example, the data has a timestamptz, but you want to partition by week.

I think one idea is to learn from the ability to use expressions in indexes.

Sure. So a feature to implement for the 2nd release.

Actually, I'm trying to add that and see how it works. I will post an updated patch soon if it looks good enough.

Thanks, Amit
Re: [HACKERS] logical column ordering
Josh Berkus wrote: On 02/26/2015 01:54 PM, Alvaro Herrera wrote: This patch decouples these three things so that they can be changed freely -- but provides no user interface to do so.

I think that trying to only decouple the thing we currently have in two pieces, and then have a subsequent patch to decouple again, is additional conceptual complexity for no gain.

Oh, I didn't realize there weren't commands to change the LCO. Without at least SQL syntax for LCO, I don't see why we'd take it; this sounds more like a WIP patch.

The reason for doing it this way is that changing the underlying architecture is really hard, without having to bear an endless hackers bike shed discussion about the best userland syntax to use. It seems a much better approach to do the actually difficult part first, then leave the rest to be argued to death by others and let those others do the easy part (and take all the credit along with that); that way, that discussion does not kill other possible uses that the new architecture allows.

-- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: Thoughts? Any objections to pushing this? Is there any reason at all to keep MemoryContextResetButPreserveChildren()? Since your patch doesn't add any callers, it seems pretty likely that there's none anywhere. The only reason to keep it is to have an out if it turns out that some third-party code actually needs that behavior. On reflection, maybe a better API to offer for that eventuality is a function named something like MemoryContextResetOnly(), which would leave child contexts completely alone. Then, if you want the old functionality, you could get it with MemoryContextResetOnly plus MemoryContextResetChildren. BTW, the original thread discussed the idea of moving context bookkeeping blocks into the immediate parent context, but the usefulness of MemoryContextSetParent() negates the thought that that would be a good plan. So there's no real issue here other than potential backwards compatibility for external code. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
On 2015-02-26 23:31:16 +0100, Andres Freund wrote: Without a compiler erroring out people won't notice that suddenly MemoryContextReset deletes much more; leading to possibly hard to find errors. Context resets frequently are in error paths, and those won't necessarily be hit when running with assertions enabled. I'd really not even be surprised if a committer backpatches a MemoryContextReset() addition, not realizing it behaves differently in the back branches. How about MemoryContextReset(bool reset_children)? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] logical column ordering
Tomas Vondra tomas.von...@2ndquadrant.com wrote: Over the time I've heard various use cases for this patch, but in most cases it was quite speculative. If you have an idea where this might be useful, can you explain it here, or maybe point me to a place where it's described? One use case is to be able to suppress default display of columns that are used for internal purposes. For example, incremental maintenance of materialized views will require storing a count(t) column, and sometimes state information for aggregate columns, in addition to what the users explicitly request. At the developers' meeting there was discussion of whether and how to avoid displaying these by default, and it was felt that when we have this logical column ordering it would be good to have a way to suppress default display. Perhaps this could be as simple as a special value for logical position. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
On 2015-02-26 17:45:26 -0500, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: Or are you arguing for an alternative proposal in which we remove MemoryContextReset (or at least rename it to something new) and thereby intentionally break all code that uses MemoryContextReset? Yes, that I am. After a bit of thought I just sent the suggestion to add a parameter to MemoryContextReset(). That way the compiler will warn. ... Without a compiler erroring out people won't notice that suddenly MemoryContextReset deletes much more; leading to possibly hard to find errors. Context resets frequently are in error paths, and those won't necessarily be hit when running with assertions enabled. With all due respect, that's utterly wrong. I have looked at every single MemoryContextReset call in the codebase, and as far as I can see the *only* one that is in an error path is elog.c:336, which I'm quite certain is wrong anyway (the other reset of ErrorContext in that file uses MemoryContextResetAndDeleteChildren, this one should too). Sure, in the pg codebase. But there definitely are extensions using memory contexts. And there's code that has to work across several branches. Backpatching alone imo presents a risk; there's nothing to warn you three years down the road that the MemoryContextReset() you just backpatched doesn't work the same in the backbranches. If the changes breaks some code it's likely actually a good thing: Because, as you say, using MemoryContextReset() will likely be the wrong thing, and they'll want to fix it for the existing branches as well. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Renaming MemoryContextResetAndDeleteChildren to MemoryContextReset
Andres Freund and...@2ndquadrant.com writes: ... Without a compiler erroring out people won't notice that suddenly MemoryContextReset deletes much more; leading to possibly hard to find errors. BTW, so far as *data* is concerned, the existing call deletes all data in the child contexts already. The only not-already-buggy operation you could perform before that would no longer work is to allocate fresh data in one of those child contexts, assuming you still had a pointer to such a context. I've not seen any coding pattern in which that's likely. The problem is exactly that whoever's resetting the parent context isn't aware of child contexts having been attached to it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partitioning WIP patch
On 27-02-2015 AM 03:24, Andres Freund wrote: On 2015-02-26 12:15:17 +0900, Amit Langote wrote: On 26-02-2015 AM 05:15, Josh Berkus wrote: I would love to have it for 9.5, but I guess the patch isn't nearly baked enough for that? I'm not quite sure what would qualify as baked enough for 9.5 though we can surely try to reach some consensus on various implementation aspects and perhaps even get it ready in time for 9.5. I think it's absolutely unrealistic to get this into 9.5. There's barely been any progress on the current (last!) commitfest - where on earth should the energy come to make this patch ready? And why would that be fair against all the others that have submitted in time? I realize and I apologize that it was irresponsible of me to have said that; maybe got a bit too excited. I do not want to unduly draw people's time on something that's not quite ready while there are other things people have worked hard on to get in time. In all earnestness, I say we spend time perfecting those things. I'll add this into CF-JUN'15. I will keep posting updates meanwhile so that when that commitfest finally starts, we will have something worth considering. Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] MemoryContext reset/delete callbacks
Hi, On 2015-02-26 19:28:50 -0500, Tom Lane wrote: We discussed this idea a couple weeks ago. Hm, didn't follow that discussion. The core of it is that when a memory context is being deleted, you might want something extra to happen beyond just pfree'ing everything in the context. I've certainly wished for this a couple times... Attached is a draft patch for this. I've not really tested it more than seeing that it compiles, but it's so simple that there are unlikely to be bugs as such in it. There are some design decisions that could be questioned though: 1. I used ilists for the linked list of callback requests. This creates a dependency from memory contexts to lib/ilist. That's all right at the moment since lib/ilist does no memory allocation, but it might be logically problematic. We could just use explicit struct foo * links with little if any notational penalty, so I wonder if that would be better. Maybe I'm partial here, but I don't see a problem. Actually the reason I started the ilist stuff was that I wrote a different memory context implementation ;). Wish I'd time/energy to go back to that... 2. I specified that callers have to allocate the storage for the callback structs. This saves a palloc cycle in just about every use-case I've thought of, since callers generally will need to palloc some larger struct of their own and they can just embed the MemoryContextCallback struct in that. It does seem like this offers a larger surface for screwups, in particular putting the callback struct in the wrong memory context --- but there's plenty of surface for that type of mistake anyway, if you put whatever the void *arg is pointing at in the wrong context. So I'm OK with this but could also accept a design in which MemoryContextRegisterResetCallback takes just a function pointer and a void *arg and palloc's the callback struct for itself in the target context. I'm unsure whether that adds enough safety to justify a separate palloc. I'm unworried about this. 
Yes, one might be confused for a short while, but compared to the complexity of using any such facility sanely it seems barely relevant.

3. Is the void *arg API for the callback func sufficient? I considered also passing it the MemoryContext, but couldn't really find a use-case to justify that.

Hm, seems sufficient to me.

4. I intentionally didn't provide a MemoryContextDeregisterResetCallback API. Since it's just a singly-linked list, that could be an expensive operation and so I'd rather discourage it. In the use cases I've thought of, you'd want the callback to remain active for the life of the context anyway, so there would be no need. (And, of course, there is slist_delete for the truly determined ...)

Yea, that seems fine. If you don't want the callback to do anything anymore, it's easy enough to just set a flag somewhere.

 /*
*** typedef struct MemoryContextData
*** 59,72 ****
  	MemoryContext firstchild;	/* head of linked list of children */
  	MemoryContext nextchild;	/* next child of same parent */
  	char	   *name;			/* context name (just for debugging) */
  	bool		isReset;		/* T = no space alloced since last reset */
  #ifdef USE_ASSERT_CHECKING
! 	bool		allowInCritSection; /* allow palloc in critical section */
  #endif
  } MemoryContextData;

It's a bit sad to push AllocSetContextData onto four cachelines from the current three... That stuff is hot. But I don't really see a way around it right now. And it seems like it'd give us more ammunition to improve things than the small loss of speed it implies.

I guess it might need a warning about being careful what you directly free if you use this. Might also better fit in a layer above this...

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical column ordering
On 2/26/15 1:49 PM, Jim Nasby wrote: On 2/23/15 5:09 PM, Tomas Vondra wrote: Over the time I've heard various use cases for this patch, but in most cases it was quite speculative. If you have an idea where this might be useful, can you explain it here, or maybe point me to a place where it's described?

For better or worse, table structure is a form of documentation for a system. As such, it's very valuable to group related fields in a table together. When creating a table that's easy, but as soon as you need to alter it, your careful ordering can easily end up out the window. Perhaps to some that just sounds like pointless window dressing, but my experience is that on a complex system the less organized things are, the more bugs you get due to overlooking something.

I agree with Jim's comments. I've generally followed column ordering that goes something like:

1) primary key
2) foreign keys
3) flags
4) other programmatic data fields (type, order, etc.)
5) non-programmatic data fields (name, description, etc.)

The immediate practical benefit of this is that users are more likely to see fields that they need without scrolling right. Documentation is also clearer because fields tend to go from most to least important (left to right, top to bottom). Also, if you are consistent enough then users just *know* where to look.

I wrote a function a while back that reorders columns in tables (it not only deals with reordering, but triggers, constraints, indexes, etc., though there are some caveats). It's painful and not very efficient, but easy to use. Most dimension tables that I've worked with are in the millions of rows so reordering is not a problem. With fact tables, I assess on a case-by-case basis. It would be nice to not have to do that triage.

The function is attached if anyone is interested.
-- 
- 
David Steele
da...@pgmasters.net

/* CATALOG_TABLE_COLUMN_MOVE Function */
create or replace function _utility.catalog_table_column_move
(
    strSchemaName text,
    strTableName text,
    strColumnName text,
    strColumnNameBefore text
)
returns void as $$
declare
    rIndex record;
    rConstraint record;
    rColumn record;
    strSchemaTable text = strSchemaName || '.' || strTableName;
    strDdl text;
    strClusterIndex text;
begin
    -- Raise notice that a reorder is in progress
    raise notice 'Reorder columns in table %.% (% before %)',
                 strSchemaName, strTableName, strColumnName, strColumnNameBefore;

    -- Get the cluster index
    select pg_index.relname
      into strClusterIndex
      from pg_namespace
           inner join pg_class
                on pg_class.relnamespace = pg_namespace.oid
               and pg_class.relname = strTableName
           inner join pg_index pg_index_map
                on pg_index_map.indrelid = pg_class.oid
               and pg_index_map.indisclustered = true
           inner join pg_class pg_index
                on pg_index.oid = pg_index_map.indexrelid
     where pg_namespace.nspname = strSchemaName;

    if strClusterIndex is null then
        raise exception 'Table %.% must have a cluster index before reordering',
                        strSchemaName, strTableName;
    end if;

    -- Disable all user triggers
    strDdl = 'alter table ' || strSchemaTable || ' disable trigger user';
    raise notice 'Disable triggers [%]', strDdl;
    execute strDdl;

    -- Create temp table to hold ddl
    create temp table temp_catalogtablecolumnreorder
    (
        type text not null,
        name text not null,
        ddl text not null
    );

    -- Save index ddl in a temp table
    raise notice 'Save indexes';

    for rIndex in
        with index as
        (
            select _utility.catalog_index_list_get(strSchemaName, strTableName) as name
        ),
        index_ddl as
        (
            select index.name,
                   _utility.catalog_index_create_get(_utility.catalog_index_get(strSchemaName, index.name)) as ddl
              from index
        )
        select index.name,
               index_ddl.ddl
          from index
               left outer join index_ddl
                    on index_ddl.name = index.name
                   and index_ddl.ddl not like '%[function]%'
    loop
        raise notice 'Save %', rIndex.name;

        insert into temp_catalogtablecolumnreorder values ('index', rIndex.name, rIndex.ddl);
    end loop;

    -- Save constraint ddl in a temp table
    raise notice 'Save constraints';

    for rConstraint in
        with constraint_list as
        (
            select _utility.catalog_constraint_list_get(strSchemaName, strTableName, '{p,u,f,c}') as name
        ),
        constraint_ddl as
        (
            select constraint_list.name,
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
On 02/26/2015 01:59 PM, Grzegorz Parka wrote: Dear Hackers, I'm Grzegorz Parka, BSc Engineer of Technical Physics and student of Computer Science at WUT, Poland. Last year I've been a bit into evolutionary algorithms and during my research I found out about GEQO in Postgres. I also found out that there are plans to try a different attempt to find optimal query plans and thought it could be a good thing to use it as a project for GSoC. I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. I believe this would be potentially beneficial to Postgres to check if such a heuristic could really choose better query plans than GEQO does. Judging by the results that simulated annealing gives on the travelling salesman problem, it looks like a simpler and potentially more effective way of combinatorial optimization. As deliverables of such a project I would provide a simple implementation of basic simulated annealing optimizer and some form of quantitative comparison with GEQO. You might look at the earlier attempt to make the GEQO replacement pluggable. That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
Josh Berkus j...@agliodbs.com writes: On 02/26/2015 01:59 PM, Grzegorz Parka wrote: I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. You might look at the earlier attempt to make the GEQO replacement pluggable. That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] contrib/fuzzystrmatch/dmetaphone.c license
On Thu, Feb 26, 2015 at 8:56 AM, Stephen Frost sfr...@snowman.net wrote: * Jim Nasby (jim.na...@bluetreble.com) wrote: On 2/25/15 4:10 PM, Andrew Dunstan wrote: On 02/25/2015 11:59 AM, Joe Conway wrote: It's largely because of such uncertainties that I have been advised in the past (by those with appropriate letters after their names) to stop using the Artistic licence. This is why I spent nearly a year working on changing pgAdmin to the PostgreSQL licence.

I committed this (1 July 2004), but cannot remember any details about a license discussion. And I searched the list archives and curiously cannot find any email at all about it either. Maybe Andrew remembers something. I doubt we want to rip it out without some suitable replacement -- do we?

That's more than 10 years ago. I remember creating this for my then work at the North Carolina State Highway Patrol and sending it to Joe, but that's about the extent of my recollection.

If the Artistic License isn't acceptable, I guess we'd have to try to get the code relicensed, or reimplement the function ourselves. There are numerous implementations out there we could copy from or use as a basis for reimplementation, including several licensed under the Apache 2.0 license - is that compatible with ours?

Perhaps a company large enough to have in-house counsel (EnterpriseDB?) could get a quick legal opinion on the license before we start pursuing other things? Perhaps this is just a non-issue...

For my 2c (IANAL), I'm not convinced that it's an issue either.. I've certainly not heard of anyone complaining about it either, so.. That said, we could also go through SPI, which might get us a bit of pro-bono work, if we really wanted to pursue this. Just a hunch, but as they tend to be conservative (lawyers in general), I expect the answer we'd get is yes, it might conflict and if you want to avoid any issues you wouldn't include it.
Exactly :), and I just had a discussion with some legal folks about that because it has been an issue raised internally. So the boring stuff being... The Perl License has two meanings: GPL v2.0 or Artistic License 1.0, and there can be problems if fuzzystrmatch.so links with something that has portions licensed as Apache v2 (because Apache v2 and GPL v2.0 are incompatible), or with anything who-knows-what incompatible with Apache v2. By choosing the Artistic license there are visibly no problems, at least for the case I have been pinged about. To that end, I'd suggest -core simply formally ask the authors about it. Once we have that answer, we can figure out what to do. In my experience, at least, it's a lot better to go that route and figure out what the original authors really *intended* than to go get a lawyer to weigh in on it. One of those approaches is both free and gives a clear answer, while the other is expensive and doesn't provide any real certainty. :) I would go this way as well, aka ask the authors and see if it is possible to remove the license notice and keep everything licensed under the PostgreSQL license to avoid any future problems... -- Michael
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-26 20:23:33 -0500, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: On 02/26/2015 01:59 PM, Grzegorz Parka wrote: I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. You might look at the earlier attempt to make the GEQO replacement pluggable. That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. Yea, it was Jan Urbański (CCed). And the project link: https://github.com/wulczer/saio Regards, -- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL Timbira: http://www.timbira.com.br Blog: http://fabriziomello.github.io Linkedin: http://br.linkedin.com/in/fabriziomello Twitter: http://twitter.com/fabriziomello Github: http://github.com/fabriziomello
Re: [HACKERS] Partitioning WIP patch
On 27-02-2015 AM 03:01, Jim Nasby wrote: On 2/26/15 3:09 AM, Amit Langote wrote: Unless I'm missing something again, isn't allowing partitions to have heterogeneous rowtypes a problem in the long run? I'm afraid I'm confused as to your stand regarding inheritance vs. new partitioning. To be specific, children with heterogeneous schemas sounds much like what inheritance would be good for as you say. But then isn't that why we have to plan children individually which you said new partitioning should get away from? Apologies if I haven't been clear enough. What I'd like to see is the best of both worlds; fast partitioning when not using inheritance, and perhaps somewhat slower when using inheritance, but still with the features partitioning gives you. I get the distinction, thanks. Actually I wasn't quite thinking of altering the way any part of the current partitioning based on inheritance works nor am I proposing to get rid of it. It all stays as is. Not sure how we could say if it will support features of the new partitioning before those features actually begin to materialize. But my bigger concern from a project standpoint is that we not put ourselves in a position of supporting something that we really don't want to support (a partitioning system that's got inheritance mixed in). As much as I'd personally like to have both features together, I think it would be bad for the community to go down that road without careful thought. Yes, it does. In fact, I do intend to keep them separate the first attempt of which is to choose to NOT transform a PARTITION OF parent clause into INHERITS parent. Any code that may look like it's trying to do that is because the patch is not fully baked yet. Ok, good to know. That's why I was asking about ALTER TABLE ADD COLUMN on a partition. If we release something without that being restricted it'll probably cause trouble later on. Yes, I agree. 
More generally, I think the patch/approach is in need of a clear separation of internal implementation concerns and user-facing notions even at this point. This may be one of them. For example, with the patch, a partition is defined as: CREATE TABLE name PARTITION OF parent ... Unless that turns into something like: CREATE PARTITION name OF parent ... we may not be able to put all the restrictions we'd want to put on a partition for the sake of what would be partitioning internals. Thanks, Amit
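To make the distinction concrete, here is a hypothetical sketch of the two shapes being weighed — neither is committed syntax, the table names are invented, and the partition-bound clauses are elided just as in the discussion above:

```sql
-- Shape 1 (the WIP patch): a partition is itself a table, so generic
-- table commands such as ALTER TABLE ... ADD COLUMN reach it by default
-- unless each one is individually restricted.
CREATE TABLE measurement_2015 PARTITION OF measurement ...

-- Shape 2: a partition as a distinct object kind, which would let the
-- system refuse table-level operations on partitions outright.
CREATE PARTITION measurement_2015 OF measurement ...
```

The design question is then where the restrictions live: in the ALTER TABLE machinery (Shape 1) or in the object model itself (Shape 2).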
Re: [HACKERS] Odd behavior of updatable security barrier views on foreign tables
On 2015/02/26 11:38, Stephen Frost wrote: I've pushed an update for this to master and 9.4 and improved the comments and the commit message as discussed. Would be great if you could test and let me know if you run into any issues! OK, thanks! Best regards, Etsuro Fujita
[HACKERS] Add CINE for ALTER TABLE ... ADD COLUMN
Hi all,

This simple patch adds CINE (IF NOT EXISTS) support for ALTER TABLE ... ADD COLUMN. So now we can:

ALTER TABLE foo ADD COLUMN IF NOT EXISTS c1 integer;

and/or ...

ALTER TABLE foo ADD COLUMN IF NOT EXISTS c1 integer, ADD COLUMN c2 integer;

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index b3a4970..aba7ec0 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -36,7 +36,7 @@ ALTER TABLE ALL IN TABLESPACE <replaceable class="PARAMETER">name</replaceable>
 <phrase>where <replaceable class="PARAMETER">action</replaceable> is one of:</phrase>

-ADD [ COLUMN ] <replaceable class="PARAMETER">column_name</replaceable> <replaceable class="PARAMETER">data_type</replaceable> [ COLLATE <replaceable class="PARAMETER">collation</replaceable> ] [ <replaceable class="PARAMETER">column_constraint</replaceable> [ ... ] ]
+ADD [ COLUMN ] [ IF NOT EXISTS ] <replaceable class="PARAMETER">column_name</replaceable> <replaceable class="PARAMETER">data_type</replaceable> [ COLLATE <replaceable class="PARAMETER">collation</replaceable> ] [ <replaceable class="PARAMETER">column_constraint</replaceable> [ ... ] ]
 DROP [ COLUMN ] [ IF EXISTS ] <replaceable class="PARAMETER">column_name</replaceable> [ RESTRICT | CASCADE ]
 ALTER [ COLUMN ] <replaceable class="PARAMETER">column_name</replaceable> [ SET DATA ] TYPE <replaceable class="PARAMETER">data_type</replaceable> [ COLLATE <replaceable class="PARAMETER">collation</replaceable> ] [ USING <replaceable class="PARAMETER">expression</replaceable> ]
 ALTER [ COLUMN ] <replaceable class="PARAMETER">column_name</replaceable> SET DEFAULT <replaceable class="PARAMETER">expression</replaceable>
@@ -96,11 +96,12 @@ ALTER TABLE ALL IN TABLESPACE <replaceable class="PARAMETER">name</replaceable>
    <variablelist>
     <varlistentry>
-     <term><literal>ADD COLUMN</literal></term>
+     <term><literal>ADD COLUMN [ IF NOT EXISTS ]</literal></term>
      <listitem>
       <para>
        This form adds a new column to the table, using the same syntax as
-       <xref linkend="SQL-CREATETABLE">.
+       <xref linkend="SQL-CREATETABLE">. If <literal>IF NOT EXISTS</literal>
+       is specified and the column already exists, no error is thrown.
       </para>
      </listitem>
     </varlistentry>
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f5d5b63..5ecb438 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -328,8 +328,8 @@ static void ATPrepAddColumn(List **wqueue, Relation rel, bool recurse, bool recursing,
 				AlterTableCmd *cmd, LOCKMODE lockmode);
 static void ATExecAddColumn(List **wqueue, AlteredTableInfo *tab,
 				Relation rel, ColumnDef *colDef, bool isOid,
-				bool recurse, bool recursing, LOCKMODE lockmode);
-static void check_for_column_name_collision(Relation rel, const char *colname);
+				bool recurse, bool recursing, bool if_not_exists, LOCKMODE lockmode);
+static bool check_for_column_name_collision(Relation rel, const char *colname, bool if_not_exists);
 static void add_column_datatype_dependency(Oid relid, int32 attnum, Oid typid);
 static void add_column_collation_dependency(Oid relid, int32 attnum, Oid collid);
 static void ATPrepAddOids(List **wqueue, Relation rel, bool recurse,
@@ -2283,7 +2283,7 @@ renameatt_internal(Oid myrelid,
 				 oldattname)));

 	/* new name should not already exist */
-	check_for_column_name_collision(targetrelation, newattname);
+	check_for_column_name_collision(targetrelation, newattname, false);

 	/* apply the update */
 	namestrcpy(&(attform->attname), newattname);
@@ -3399,11 +3399,11 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
 		case AT_AddColumnToView:	/* add column via CREATE OR REPLACE VIEW */
 			ATExecAddColumn(wqueue, tab, rel, (ColumnDef *) cmd->def,
-							false, false, false, lockmode);
+							false, false, false, cmd->missing_ok, lockmode);
 			break;
 		case AT_AddColumnRecurse:
 			ATExecAddColumn(wqueue, tab, rel, (ColumnDef *) cmd->def,
-							false, true, false, lockmode);
+							false, true, false, cmd->missing_ok, lockmode);
 			break;
 		case AT_ColumnDefault:	/* ALTER COLUMN DEFAULT */
 			ATExecColumnDefault(rel, cmd->name, cmd->def, lockmode);
@@ -3500,13 +3500,13 @@ ATExecCmd(List **wqueue, AlteredTableInfo *tab, Relation rel,
 			/* Use the ADD COLUMN code, unless prep decided to do nothing */
 			if (cmd->def != NULL)
 				ATExecAddColumn(wqueue, tab, rel, (ColumnDef *) cmd->def,
-								true, false, false, lockmode);
+								true, false, false, cmd->missing_ok, lockmode);
 			break;
 		case AT_AddOidsRecurse:	/* SET WITH OIDS */
 			/* Use the ADD COLUMN code, unless prep decided to do nothing */
 			if (cmd->def != NULL)
 				ATExecAddColumn(wqueue, tab, rel, (ColumnDef *) cmd->def,
-								true, true,
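A quick usage sketch of the behavior the patch aims for (hypothetical session; the exact server message wording may differ):

```sql
CREATE TABLE foo (id integer);

ALTER TABLE foo ADD COLUMN IF NOT EXISTS c1 integer;  -- c1 is added
ALTER TABLE foo ADD COLUMN IF NOT EXISTS c1 integer;  -- c1 exists: skipped, no error

ALTER TABLE foo ADD COLUMN c1 integer;
-- ERROR:  column "c1" of relation "foo" already exists
```

The second statement is the whole point: it makes schema-migration scripts idempotent, matching the existing DROP COLUMN IF EXISTS form.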
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
On 02/26/2015 05:50 PM, Fabrízio de Royes Mello wrote: On Thu, Feb 26, 2015 at 10:27 PM, Andres Freund and...@2ndquadrant.com wrote: On 2015-02-26 20:23:33 -0500, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: On 02/26/2015 01:59 PM, Grzegorz Parka wrote: I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. You might look at the earlier attempt to make the GEQO replacement pluggable. That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. Yea, it was Jan Urbański (CCed). And the project link: https://github.com/wulczer/saio So what we're saying, Grzegorz, is that we would love to see someone pick this up and get it to the point of making it a feature as a GSOC project. I think if you can start from where Jan left off, you could actually complete it. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Feb 27, 2015 at 8:01 AM, Michael Paquier michael.paqu...@gmail.com wrote: On Fri, Feb 27, 2015 at 6:54 AM, Rahila Syed rahilasye...@gmail.com wrote: Even this patch doesn't work fine. The standby emits the following error messages. Yes this bug remains unsolved. I am still working on resolving this. Following chunk IDs have been added in the attached patch as suggested upthread. +#define XLR_CHUNK_BLOCK_REFERENCE 0x10 +#define XLR_CHUNK_BLOCK_HAS_IMAGE 0x04 +#define XLR_CHUNK_BLOCK_HAS_DATA 0x08 XLR_CHUNK_BLOCK_REFERENCE denotes the chunk ID of block references. XLR_CHUNK_BLOCK_HAS_IMAGE is a replacement of BKPBLOCK_HAS_IMAGE and XLR_CHUNK_BLOCK_HAS_DATA a replacement of BKPBLOCK_HAS_DATA. Before sending a new version, be sure that this gets fixed by, for example, building up a master with a standby replaying WAL, and running make installcheck-world or similar. If the standby does not complain at all, you have good chances to not have bugs. You could also build with WAL_DEBUG to check record consistency. It would be good to get those problems fixed first. Could you send an updated patch? I'll look into it in more details. For the time being I am switching this patch to Waiting on Author. -- Michael
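For reference, the validation loop Michael suggests might look roughly like this — paths, ports, and options are illustrative only, not a prescribed recipe:

```sh
# build and install the patched master (optionally with WAL_DEBUG)
./configure --prefix=$HOME/pgsql CFLAGS="-O2 -DWAL_DEBUG" && make install

# primary
initdb -D /tmp/primary
cat >> /tmp/primary/postgresql.conf <<EOF
wal_level = hot_standby
max_wal_senders = 2
EOF
echo "local replication $USER trust" >> /tmp/primary/pg_hba.conf
pg_ctl -D /tmp/primary -l /tmp/primary.log start

# standby replaying the primary's WAL
pg_basebackup -D /tmp/standby -R
echo "hot_standby = on" >> /tmp/standby/postgresql.conf
pg_ctl -D /tmp/standby -o "-p 5433" -l /tmp/standby.log start

# exercise WAL heavily, then check that the standby never complained
make installcheck-world
grep -iE "fatal|panic|incorrect" /tmp/standby.log
```

If the grep at the end comes back empty after the full regression run, the compressed-FPW records at least replay cleanly.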
Re: [HACKERS] GSoC idea - Simulated annealing to search for query plans
On 2015-02-26 20:23:33 -0500, Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: On 02/26/2015 01:59 PM, Grzegorz Parka wrote: I'm interested in one of old TODO items related to the optimizer - 'Consider compressed annealing to search for query plans'. You might look at the earlier attempt to make the GEQO replacement pluggable. That project failed to complete sufficiently to be a feature though, but it did enough to show that our current GEQO implementation was suboptimal. I'm currently searching for this project ... it was a GSOC project, but I think before they required posting to Google Code. I seem to recall somebody demo'ing a simulated-annealing GEQO replacement at PGCon a couple years back. It never got to the point of being a submitted patch though. Yea, it was Jan Urbański (CCed). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical column ordering
On 27/02/15 14:08, David Steele wrote: [...] I agree with Jim's comments. I've generally followed column ordering that goes something like: 1) primary key 2) foreign keys 3) flags 4) other programmatic data fields (type, order, etc.) 5) non-programmatic data fields (name, description, etc.) The immediate practical benefit of this is that users are more likely to see fields that they need without scrolling right. Documentation is also clearer because fields tend to go from most to least important (left to right, top to bottom). Also, if you are consistent enough then users just *know* where to look. I wrote a function a while back that reorders columns in tables (it not only deals with reordering, but triggers, constraints, indexes, etc., though there are some caveats). It's painful and not very efficient, but easy to use. Most dimension tables that I've worked with are in the millions of rows, so reordering is not a problem. With fact tables, I assess on a case-by-case basis. It would be nice to not have to do that triage. The function is attached if anyone is interested. I've never formally written down the order of how I define fields^H^H^H^H^H^H columns in a table, but David's list is the same order I use. The only additional ordering I do is to put fields that are likely to be long text fields at the end of the record. So I would certainly appreciate having my logical ordering be the natural order in which they appear. Cheers, Gavin
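Spelled out on a hypothetical table (the table and column names are made up to illustrate the convention, not taken from the thread):

```sql
CREATE TABLE order_item (
    order_item_id bigint PRIMARY KEY,              -- 1) primary key
    order_id      bigint NOT NULL,                 -- 2) foreign keys
    is_gift       boolean NOT NULL DEFAULT false,  -- 3) flags
    item_type     text NOT NULL,                   -- 4) programmatic data
    display_name  text,                            -- 5) non-programmatic data
    description   text                             -- long text last
);
```

Logical column ordering would let this display order be maintained independently of the physical order chosen for alignment or migration reasons.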
Re: [HACKERS] MemoryContext reset/delete callbacks
Andres Freund and...@2ndquadrant.com writes: It's a bit sad to push AllocSetContextData onto four cachelines from the current three... That stuff is hot. But I don't really see a way around it right now. And it seems like it'd give us more ammunition to improve things than the small loss of speed it implies. Meh. I doubt it would make any difference, especially seeing that the struct isn't going to be aligned on any special boundary. regards, tom lane