Re: [HACKERS] pg_rewind in contrib
On Mon, Mar 9, 2015 at 11:59 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Heikki Linnakangas wrote: Attached is a new patch version, fixing all the little things you listed. I believe this is pretty much ready for commit. I'm going to read it through myself one more time before committing, but I don't have anything in mind now that needs fixing anymore. I do -- it's obvious that you've given some consideration to translatability, but it's not complete: anything that's using fprintf() and friends is not marked for translation with _(). There are lots of things using pg_fatal etc. and those are probably fine. One big thing that's not currently translated is usage(), but there are others. Definitely. On top of that, I had an extra look at this patch, testing pg_rewind on OSX, Linux, MinGW-32 and MSVC. Particularly on Windows, I have been able to rewind a node and to reconnect it to a promoted standby using this utility compiled on both MinGW-32 and MSVC (nothing fancy: two services, one master and a standby on a local server). I think nobody had tested that until now though... A minor comment: + <para> +  <application>pg_rewind</> is a tool for synchronizing a PostgreSQL cluster +  with another copy of the same cluster, after the clusters' timelines have "PostgreSQL" needs the <productname> markup. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance improvement for joins where outer side is unique
Sorry for the delay, I've returned to this. Performance: I looked at the performance gain this patch gives. For several in-memory joins, I saw gains of about 3% for merge join, 5% for hash join, and 10% for nestloop (@CentOS6), for simple 1-level joins with aggregation similar to what you mentioned in a previous mail, like this. =# SELECT count(*) FROM t1 JOIN t2 USING (id); Of course, the gain would be trivial when returning many tuples, or when scans go to disk. That's true, but joins where rows are filtered after the join condition will win here too: Yes, when not many tuples are returned, the gain remains, in inverse proportion to the number of returned rows. A heavy (time-consuming) in-memory join returning only several rows should gain from this. Also queries with a GROUP BY clause. (Since grouping is always performed after the join :-( ) And in many cases weighed down by additional sorts :p So many factors affect performance that improvements which give only a linear performance gain are often difficult to get approved. I suppose we need more evidence of how and in what situations we will see the gain. I haven't measured the loss from the additional computation when the query is not regarded as a single join. I think this would be hard to measure, but if it is measurable then you'd likely want to look at just planning time, rather than planning and execution time. From another point of view, the patch looks a bit large for the gain (to me). In addition, it loops over many levels. [mark_unique_joins()] foreach (joinlist) [eclassjoin_is_unique_join()] foreach(joinlist) foreach(root->eq_classes) foreach(root->ec_members) The innermost loop could roughly be said to iterate about 10^3*2 times for a join of 10 tables, all of which have an index and where no eclass is shared among them. From my experience, we should be able to achieve the same effect with far fewer iteration levels. This is caused by a query like the following, as a bad case.
create table t1 (a int, b int); create index i_t1 on t1 (a); create table t2 (like t1 including indexes); create table t10 (.); insert into t1 (select a, a * 10 from generate_series(0, 100) a); insert into t2 (select a * 10, a * 20 from generate_series(0, 100) a); ... insert into t10 (select a * 90, a * 100 from generate_series(0, 100) a); explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on (t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join t6 on (t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on (t8.b = t9.a) join t10 on (t9.b = t10.a); HEAD takes 3 ms for planning and the patched version takes around 5 ms, while pure execution time is 1 ms. I think that is too long an extra time. Explain representation: I don't like that 'Unique Join:' occupies its own line in the result of explain; moreover, it doesn't convey the meaning clearly. What about a representation like the following? Or it might not be necessary to appear there at all. Nested Loop ... Output: - Unique Join: Yes - Seq Scan on public.t2 (cost = ... - - Seq Scan on public.t1 (cost = + - Seq Scan on public.t1 (unique) (cost = Yeah, I'm not too big a fan of this either. At one evolution of the patch I had Unique Left Join in the join node's line in the explain output, but I hated that more and changed it to appear just in the VERBOSE output, and after I did that I didn't want to change the join node's line only in verbose mode. I do quite like that it's a separate item in the XML and JSON explain output. That's quite useful when the explain output must be processed by software. I'm totally open to better ideas on names, but I just don't have any at the moment. Unique Left Join looks too bold. Anyway, changing that is rather easy. Coding: The style looks OK. It could be applied on master. It looks to work as you expected for some cases. Random comments follow.
- Looking at specialjoin_is_unique_join, you seem to assume that !delay_upper_joins is a sufficient condition for not being a unique join. The former simply indicates that it doesn't commute with upper OJs, and the latter seems to indicate that the RHS is unique. Could you tell me what kind of relationship there is between them? The rationalisation around that is from the (now changed) version of join_is_removable(), where the code read: /* * Must be a non-delaying left join to a single baserel, else we aren't * going to be able to do anything with it. */ if (sjinfo->jointype != JOIN_LEFT || sjinfo->delay_upper_joins) return false; I have to admit that I didn't go and investigate why delayed upper joins cannot be removed by the left join removal code; I really just assumed that we're unable to prove that a join to such a relation won't match more than one of the outer side's rows. I kept
Re: [HACKERS] TABLESAMPLE patch
On 10/03/15 10:54, Amit Kapila wrote: On Tue, Mar 10, 2015 at 3:03 PM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com wrote: Ok, now I think I finally understand what you are suggesting - you are saying let's go over the whole page while tsmnexttuple returns something, do the visibility check and other stuff in that code block under the buffer lock, cache the resulting valid tuples in some array, and then return those tuples one by one from that cache? Yes, this is what I am suggesting. And if the caller tries to do it in one step and cache the visibility info, then we'll end up with pretty much the same structure as rs_vistuples - there isn't a saner way to cache this info than an ordered vector of tuple offsets, unless we assume that most pages have close to MaxOffsetNumber tuples, which they don't; so why not just use heapgetpage directly and do the binary search over rs_vistuples? The downside of doing it via heapgetpage is that it will do visibility tests for tuples which we might not even need (I think we should do visibility tests only for tuples returned by tsmnexttuple). Well, heapgetpage can either read visibility data for the whole page or not, depending on whether we want pagemode reading. So we can use pagemode for sampling methods where it's feasible (like system) and not use pagemode where it's not (like bernoulli), and then either use rs_vistuples or call HeapTupleSatisfiesVisibility individually, depending on whether the method uses pagemode. Yeah, but as mentioned above, this has some downside, but go for it only if you feel that the above suggestion is making the code complex, which I think should not be the case as we are doing something similar in acquire_sample_rows(). I think your suggestion is actually simpler code-wise; I am just somewhat worried by the fact that no other scan node uses that kind of caching, and there is probably a reason for that.
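The rs_vistuples idea being discussed — cache the visible tuples as an ordered vector of offsets and binary-search it per returned tuple — can be illustrated with a small self-contained sketch (plain C with a stand-in OffsetNumber typedef; this is not the actual heapam code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint16_t OffsetNumber;   /* stand-in for PostgreSQL's OffsetNumber */

/*
 * Check whether a tuple offset is present in a sorted vector of visible
 * tuple offsets (the rs_vistuples pattern).  Since real pages hold far
 * fewer than MaxOffsetNumber tuples, a small sorted vector plus a binary
 * search per tuple is cheaper than re-running the visibility check.
 */
static bool
offset_is_visible(const OffsetNumber *vistuples, int ntuples, OffsetNumber off)
{
    int lo = 0, hi = ntuples - 1;

    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (vistuples[mid] == off)
            return true;
        if (vistuples[mid] < off)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return false;
}
```

The vector is filled once per page under a single buffer lock; each tuple the sampling method asks for then costs only O(log n) lookups with no further locking.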
-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Reduce pinning in btree indexes
Hello, I found no other problems, including performance issues, in the patch attached to the last mail, as far as I can understand. So I'll mark this as ready for committer after a few days with no objection, once these comments are addressed. - If (BTScanPosIsPinned(so->currPos)). As I mention below for nbtsearch.c, the page acquired in the if-then block may be vacuumed, so the LSN check done in the if-else block is needed in the if-then block as well. That can be accomplished simply by moving the end of the if-else block. I'm not sure I agree on this. Sorry, it is largely because of my poor composition. If the page is pinned it should have been pinned continuously since we initially read it, so the line pointers we have could not have been removed by any vacuum process. The tuples may have been pruned away in the heap, but that doesn't matter. Index entries may have been added and the index page may have been split, but that was true before this patch and _bt_killitems will deal with those things the same as it always has. Yes, I think so. I don't see any benefit to doing the LSN compare in this case; if we've paid the costs of holding the pin through to this point, we might as well flag any dead entries we can. I thought of the case where the buffer has been pinned by another backend after this backend unpinned it. I looked again at the definitions of BTScanPosIsPinned and BTScanPosUnpinIfPinned, and understood that the pin should be mine if BTScanPosIsPinned. Would you mind rewriting the comment there like this? - /* The buffer is still pinned, but not locked. Lock it now. */ + /* I still hold the pin on the buffer, but it is not locked. Lock it now. */ | LockBuffer(so->currPos.buf, BT_READ); Or would you mind renaming the macro to BTScanPosIsPinnedByMe or something like that, or any way to notify the reader that the pin should be mine there?
- _bt_killitems is called without a pin when rescanning from before, so I think the previous code has already considered the unpinned case (if (ItemPointerEquals(ituple... does that). Vacuum rearranges line pointers after deleting LP_DEAD items, so the previous check seems enough for the purpose. The previous code is more efficient because the matching items will be processed even after a vacuum. I'm not following you on this one; could you rephrase it? Sorry, I incorrectly read btrescan as calling _bt_killitems() *after* releasing the buffer. Please forget it. Finally, I'd like to timidly comment on this.. + To handle the cases where it is safe to release the pin after + reading the index page, the LSN of the index page is read along + with the index entries before the shared lock is released, and we + do not attempt to flag index tuples as dead if the page is not + pinned and the LSN does not match. I suppose that a sentence like the following describes the mechanism more directly and makes it easier to find the corresponding code, if it is correct. To handle the cases where an index page has been unpinned when trying to mark unused index tuples on the page as dead, the LSN of the index page is remembered when reading the index page for index tuples, and we do not attempt to flag index tuples as dead if the page is not pinned and the LSN does not match. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
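The rule being discussed — a continuously held pin makes it safe to flag dead index entries, and otherwise an unchanged page LSN does — can be sketched as follows (simplified stand-in types, not the actual nbtree code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;   /* stand-in for the real WAL pointer type */

typedef struct ScanPos
{
    bool       pinned;         /* do we still hold our own pin on the page? */
    XLogRecPtr lsn;            /* page LSN remembered when the page was read */
} ScanPos;

/*
 * Decide whether it is safe to mark index tuples dead.  If we kept our
 * pin, vacuum cannot have removed the line pointers we remembered.
 * Otherwise the page may only be trusted if its LSN is unchanged since
 * we read it.
 */
static bool
safe_to_kill_items(const ScanPos *pos, XLogRecPtr current_page_lsn)
{
    if (pos->pinned)
        return true;
    return pos->lsn == current_page_lsn;
}
```

This also shows why the LSN compare buys nothing in the pinned case: holding the pin already guarantees the stability that the LSN check would otherwise have to prove.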
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
On Mon, Mar 9, 2015 at 7:26 PM, Andreas Karlsson andr...@proxel.se wrote: - I do not like how \d handles the toast tablespace. Having TOAST in pg_default and the table in another tablespace looks the same as if there were no TOAST table at all. I think we should always print both tablespaces if either of them is not pg_default. If we do it that way, we should always show the tablespace, even if it's pg_default, in any case, to be consistent. I'm pretty sure that we don't want that. I think we will have to agree to disagree here. I think it should be obvious when the toast table is in the default tablespace for tables outside it. I'm not sure about the details here, but that seems like a pretty sound principle. - Would it be interesting to add syntax for moving the toast index to a separate tablespace? -1, we just want to be able to move the TOAST heap, which is supposed to contain a lot of data and which we want to move to a different storage device / filesystem. I think we should allow moving the indexes for consistency. With this patch we can move everything except TOAST indexes. It might make sense to always put the TOAST index with the TOAST table, but it seems strange to put the TOAST index with the heap and the TOAST table someplace else. Or at least, that's how it seems to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] TABLESAMPLE patch
On 10/03/15 04:43, Amit Kapila wrote: On Mon, Mar 9, 2015 at 3:08 PM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com wrote: On 09/03/15 04:51, Amit Kapila wrote: On Sat, Mar 7, 2015 at 10:37 PM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com Double-checking for tuple visibility is the only downside I can think of. That will happen if we use heapgetpage the way the code is currently written in the patch; however, we can easily avoid double-checking if we don't call heapgetpage and instead do the required work at the caller's place. What's the point of pagemode then, if the caller code still does the visibility checks one by one on each call? I thought one of the points of pagemode was to do this in one step (and one buffer lock). You only need one buffer lock when doing it at the caller's location as well, something like we do in acquire_sample_rows(). Ok, now I think I finally understand what you are suggesting - you are saying let's go over the whole page while tsmnexttuple returns something, do the visibility check and other stuff in that code block under the buffer lock, cache the resulting valid tuples in some array, and then return those tuples one by one from that cache? And if the caller tries to do it in one step and cache the visibility info, then we'll end up with pretty much the same structure as rs_vistuples - there isn't a saner way to cache this info than an ordered vector of tuple offsets, unless we assume that most pages have close to MaxOffsetNumber tuples, which they don't; so why not just use heapgetpage directly and do the binary search over rs_vistuples? The downside of doing it via heapgetpage is that it will do visibility tests for tuples which we might not even need (I think we should do visibility tests only for tuples returned by tsmnexttuple). Well, heapgetpage can either read visibility data for the whole page or not, depending on whether we want pagemode reading.
So we can use pagemode for sampling methods where it's feasible (like system) and not use pagemode where it's not (like bernoulli), and then either use rs_vistuples or call HeapTupleSatisfiesVisibility individually, depending on whether the method uses pagemode. This is how sequential scan does it, afaik. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal: knowing detail of config files via SQL
* Stephen Frost (sfr...@snowman.net) wrote: --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -414,6 +414,11 @@ CREATE RULE pg_settings_n AS GRANT SELECT, UPDATE ON pg_settings TO PUBLIC; +CREATE VIEW pg_file_settings AS + SELECT * FROM pg_show_all_file_settings() AS A; + +REVOKE ALL on pg_file_settings FROM public; + Err, and further, I realize that you're not actually changing the permissions on the actual function at all, which means that they're the default, which is executable by anyone. This will also need a REVOKE EXECUTE on pg_show_all_file_settings() FROM public; or someone could simply run the function instead of using the view to see the data returned. Thanks, Stephen
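A sketch of the combined fix Stephen is suggesting, assuming the view and function names from the patch — the REVOKE EXECUTE is the addition, since revoking on the view alone leaves the underlying function callable by anyone:

```sql
CREATE VIEW pg_file_settings AS
  SELECT * FROM pg_show_all_file_settings() AS A;

REVOKE ALL ON pg_file_settings FROM public;
-- Without this, any user can bypass the view and call the function directly:
REVOKE EXECUTE ON FUNCTION pg_show_all_file_settings() FROM public;
```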
Re: [HACKERS] TABLESAMPLE patch
On Tue, Mar 10, 2015 at 3:03 PM, Petr Jelinek p...@2ndquadrant.com wrote: On 10/03/15 04:43, Amit Kapila wrote: On Mon, Mar 9, 2015 at 3:08 PM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com wrote: On 09/03/15 04:51, Amit Kapila wrote: On Sat, Mar 7, 2015 at 10:37 PM, Petr Jelinek p...@2ndquadrant.com mailto:p...@2ndquadrant.com Double-checking for tuple visibility is the only downside I can think of. That will happen if we use heapgetpage the way the code is currently written in the patch; however, we can easily avoid double-checking if we don't call heapgetpage and instead do the required work at the caller's place. What's the point of pagemode then, if the caller code still does the visibility checks one by one on each call? I thought one of the points of pagemode was to do this in one step (and one buffer lock). You only need one buffer lock when doing it at the caller's location as well, something like we do in acquire_sample_rows(). Ok, now I think I finally understand what you are suggesting - you are saying let's go over the whole page while tsmnexttuple returns something, do the visibility check and other stuff in that code block under the buffer lock, cache the resulting valid tuples in some array, and then return those tuples one by one from that cache? Yes, this is what I am suggesting. And if the caller tries to do it in one step and cache the visibility info, then we'll end up with pretty much the same structure as rs_vistuples - there isn't a saner way to cache this info than an ordered vector of tuple offsets, unless we assume that most pages have close to MaxOffsetNumber tuples, which they don't; so why not just use heapgetpage directly and do the binary search over rs_vistuples? The downside of doing it via heapgetpage is that it will do visibility tests for tuples which we might not even need (I think we should do visibility tests only for tuples returned by tsmnexttuple).
Well, heapgetpage can either read visibility data for the whole page or not, depending on whether we want pagemode reading. So we can use pagemode for sampling methods where it's feasible (like system) and not use pagemode where it's not (like bernoulli), and then either use rs_vistuples or call HeapTupleSatisfiesVisibility individually, depending on whether the method uses pagemode. Yeah, but as mentioned above, this has some downside, but go for it only if you feel that the above suggestion is making the code complex, which I think should not be the case as we are doing something similar in acquire_sample_rows(). With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] pg_rewind in contrib
Michael Paquier wrote: On top of that, I had an extra look at this patch, testing pg_rewind on OSX, Linux, MinGW-32 and MSVC. Particularly on Windows, I have been able to rewind a node and to reconnect it to a promoted standby using this utility compiled on both MinGW-32 and MSVC (nothing fancy: two services, one master and a standby on a local server). I think nobody had tested that until now though... Sweet! -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] proposal: searching in array function - array_position
On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too, so I will not hold that against you. (Maybe somebody should write a separate patch that puts all that duplicated code into a common function?) I think this patch is ready for committer, so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appears to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. I am certainly in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we could get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
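The my_extra pattern being discussed — do the expensive type lookup once and reuse it across calls as long as the element type stays the same — looks roughly like this generic sketch (plain C with illustrative names; the real code hangs the cache off fcinfo's fn_extra and calls the type cache, which this sketch deliberately does not reproduce):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-call scratch struct (my_extra). */
typedef struct TypeCacheInfo
{
    int element_type;    /* type the cached info was built for */
    int lookups_done;    /* counts "expensive" lookups, for illustration */
} TypeCacheInfo;

/*
 * Lazily set up type information and reuse it across calls while the
 * element type stays the same -- the duplicated setup each array
 * function carries, and which the review suggests factoring into one
 * common helper.
 */
static TypeCacheInfo *
get_type_cache(TypeCacheInfo **extra, int element_type)
{
    TypeCacheInfo *info = *extra;

    if (info == NULL)
    {
        info = malloc(sizeof(TypeCacheInfo));
        info->element_type = -1;   /* force a lookup on first use */
        info->lookups_done = 0;
        *extra = info;
    }
    if (info->element_type != element_type)
    {
        /* the expensive catalog lookup would happen here */
        info->element_type = element_type;
        info->lookups_done++;
    }
    return info;
}
```

With a helper like this, each array function shrinks to one call instead of repeating the allocate-and-check boilerplate.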
Re: [HACKERS] One question about security label command
Kohei KaiGai wrote: The attached patch revises the error message when a security label is specified on an unsupported object. getObjectTypeDescription() may be better than the oid of the catalog. Agreed. postgres=# SECURITY LABEL FOR selinux ON ROLE kaigai postgres-# IS 'system_u:object_r:unlabeled_t:s0'; ERROR: sepgsql provider does not support labels on role I'd make it "sepgsql provider does not support labels on objects of type %s" And perhaps make it an ereport also, with errcode etc. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Performance improvement for joins where outer side is unique
On 10 March 2015 at 19:19, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: From another point of view, the patch looks a bit large for the gain (to me). In addition, it loops over many levels. [mark_unique_joins()] foreach (joinlist) [eclassjoin_is_unique_join()] foreach(joinlist) foreach(root->eq_classes) foreach(root->ec_members) The innermost loop could roughly be said to iterate about 10^3*2 times for a join of 10 tables, all of which have an index and where no eclass is shared among them. From my experience, we should be able to achieve the same effect with far fewer iteration levels. This is caused by a query like the following, as a bad case. create table t1 (a int, b int); create index i_t1 on t1 (a); create table t2 (like t1 including indexes); create table t10 (.); insert into t1 (select a, a * 10 from generate_series(0, 100) a); insert into t2 (select a * 10, a * 20 from generate_series(0, 100) a); ... insert into t10 (select a * 90, a * 100 from generate_series(0, 100) a); explain select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on (t2.b = t3.a) join t4 on (t3.b = t4.a) join t5 on (t4.b = t5.a) join t6 on (t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on (t8.b = t9.a) join t10 on (t9.b = t10.a); HEAD takes 3 ms for planning and the patched version takes around 5 ms, while pure execution time is 1 ms. I think that is too long an extra time. This smells quite fishy to me. I'd be quite surprised if your machine took an extra 2 ms to do this. I've run what I think is the same test on my 5-year-old i5 laptop and attached the .sql file which I used to generate the same schema as you've described. I've also attached the explain analyze "Planning Time:" results from patched and unpatched using your test case. I was unable to notice any difference in plan times between the two versions. In fact master came out slower, which is likely just noise in the results.
Just to see how long mark_unique_joins() takes with your test case, I changed query_planner() to call mark_unique_joins() 1 million times: { int x; for (x = 0; x < 1000000; x++) mark_unique_joins(root, joinlist); } I also got rid of the fast-path test which bails out if the join is already marked as unique: /* check if we've already marked this join as unique on a previous call */ /* if (idxrel->is_unique_join) return true; */ On my machine after making these changes, it takes 800 ms to plan the query. So it seems that's around 800 nanoseconds for a single call to mark_unique_joins(). Perhaps you've accidentally compiled the patched version with debug asserts? Are you able to retest this? Regards David Rowley create table t1 (a int, b int); create index i_t1 on t1 (a); create table t2 (like t1 including indexes); create table t3 (like t1 including indexes); create table t4 (like t1 including indexes); create table t5 (like t1 including indexes); create table t6 (like t1 including indexes); create table t7 (like t1 including indexes); create table t8 (like t1 including indexes); create table t9 (like t1 including indexes); create table t10 (like t1 including indexes); insert into t1 (a,b) select a, a * 10 from generate_series(0, 100) a(a); insert into t2 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t3 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t4 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t5 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t6 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t7 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t8 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t9 (select a * 10, a * 20 from generate_series(0, 100) a); insert into t10 (select a * 10, a * 20 from generate_series(0, 100) a); explain analyze select t1.a, t10.b from t1 join t2 on (t1.b = t2.a) join t3 on (t2.b = t3.a) join t4 on (t3.b =
t4.a) join t5 on (t4.b = t5.a) join t6 on (t5.b = t6.a) join t7 on (t6.b = t7.a) join t8 on (t7.b = t8.a) join t9 on (t8.b = t9.a) join t10 on (t9.b = t10.a); [Attachment: unijoin_plan_benchmark_results.xlsx]
Re: [HACKERS] Relation ordering in FROM clause causing error related to missing entry... Or not.
Michael Paquier wrote: Hi all, Today while playing with some queries I bumped into the following thing: =# with count_query as (select generate_series(0,1) as a) select b from count_query, generate_series(1, count_query.a) as b; b --- 1 (1 row) =# with count_query as (select generate_series(0,1) as a) select b from generate_series(1, count_query.a) as b, count_query; ERROR: 42P01: missing FROM-clause entry for table count_query LINE 1: ...eries(0,1) as a) select b from generate_series(1, count_quer... ^ LOCATION: errorMissingRTE, parse_relation.c:2850 I have been a little bit surprised by the fact that different entry orderings in the FROM clause of the main query had different effects. Perhaps there is something I am missing? This seems natural to me -- in your second example, by the time you reference count_query it hasn't yet been declared and thus it's not available in the namespace. This is how I expect a LATERAL reference to work: an RTE can reference previous entries, but not ones that come later. (SRFs in FROM become lateral references automatically, as I recall. Without LATERAL, you wouldn't have been able to refer to count_query at all.) -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
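The ordering rule Álvaro describes can be made concrete by spelling the implicit LATERAL out explicitly (same queries as above; the dependency must still look leftward in the FROM list either way):

```sql
-- Works: count_query is declared before it is referenced, so the SRF
-- picks it up as an implicit LATERAL reference.
with count_query as (select generate_series(0,1) as a)
select b from count_query, generate_series(1, count_query.a) as b;

-- The same dependency spelled out explicitly:
with count_query as (select generate_series(0,1) as a)
select b from count_query, lateral generate_series(1, count_query.a) as b;

-- Still an error: a LATERAL reference can only see FROM items to its left.
-- with count_query as (select generate_series(0,1) as a)
-- select b from generate_series(1, count_query.a) as b, count_query;
```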
Re: [HACKERS] One question about security label command
The attached patch revises the error message when a security label is specified on an unsupported object. getObjectTypeDescription() may be better than the oid of the catalog. postgres=# SECURITY LABEL FOR selinux ON ROLE kaigai postgres-# IS 'system_u:object_r:unlabeled_t:s0'; ERROR: sepgsql provider does not support labels on role 2015-03-09 23:55 GMT+09:00 Robert Haas robertmh...@gmail.com: On Tue, Mar 3, 2015 at 5:01 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote: From the standpoint of SQL syntax, yes, the SECURITY LABEL command supports the object types below; however, it fully depends on the security label provider, sepgsql.so in this case. At this moment, database, schema, function, table and column are supported by sepgsql. So, it is expected behavior. If the core system supports labels on other object types and sepgsql does not, it should give a better error for those cases, like: ERROR: sepgsql provider does not support labels on roles Errors like ERROR: unsupported object type: 1260 are a good way to report a failure that is never expected to happen, but they shouldn't be used as user-facing error messages. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- KaiGai Kohei kai...@kaigai.gr.jp [Attachment: security-label-errmsg.patch]
Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace
Robert Haas wrote: On Mon, Mar 9, 2015 at 7:26 PM, Andreas Karlsson andr...@proxel.se wrote: I think we should allow moving the indexes for consistency. With this patch we can move everything except for TOAST indexes. It might make sense to always put the TOAST index with the TOAST table, but it seems strange to put the TOAST index with the heap and the TOAST table someplace else. Or at least, that's how it seems to me. Agreed. It doesn't seem necessary to allow moving the toast index to a tablespace other than the one containing the toast table. In other words, if you move the toast table, the index always follows it, and there's no option to move it independently. -- Álvaro Herrera  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
[HACKERS] Relation ordering in FROM clause causing error related to missing entry... Or not.
Hi all, Today while playing with some queries I bumped into the following thing: =# with count_query as (select generate_series(0,1) as a) select b from count_query, generate_series(1, count_query.a) as b; b --- 1 (1 row) =# with count_query as (select generate_series(0,1) as a) select b from generate_series(1, count_query.a) as b, count_query; ERROR: 42P01: missing FROM-clause entry for table count_query LINE 1: ...eries(0,1) as a) select b from generate_series(1, count_quer... ^ LOCATION: errorMissingRTE, parse_relation.c:2850 I have been a little bit surprised by the fact that different entry orderings in the FROM clause of the main query had different effects. Perhaps there is something I am missing? I haven't looked at the code, but if this happens to be a bug I am happy to submit a patch. Regards, -- Michael
Re: [HACKERS] Reduce pinning in btree indexes
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: I found no other problem, including the performance issue, in the patch attached to the last mail as far as I can understand. So I'll mark this as ready for commit after a few days with no objection, once these comments are addressed. Thanks for the reviews! I don't see any benefit to doing the LSN compare in this case; if we've paid the costs of holding the pin through to this point, we might as well flag any dead entries we can. I was thinking of the case where the buffer has been pinned by another backend after this backend unpinned it. I looked again at the definitions of BTScanPosIsPinned and BTScanPosUnpinIfPinned, and understood that the pin should be mine if BTScanPosIsPinned. Would you mind rewriting the comment there like this? - /* The buffer is still pinned, but not locked. Lock it now. */ + /* I still hold the pin on the buffer, but not locked. Lock it now. */ | LockBuffer(so->currPos.buf, BT_READ); Or would you mind renaming the macro to BTScanPosIsPinnedByMe or something like that, or anything that notifies the reader that the pin should be mine there? I see your point, although those first-person singular pronouns used like that make me a little uncomfortable; I'll change the comment and/or macro name, but I'll work on the name some more. Finally, I'd like to timidly comment on this.. + To handle the cases where it is safe to release the pin after + reading the index page, the LSN of the index page is read along + with the index entries before the shared lock is released, and we + do not attempt to flag index tuples as dead if the page is not + pinned and the LSN does not match. I suppose that a sentence like the following describes the mechanism more directly and makes it easier to find the corresponding code, if it is correct.
To handle the cases where an index page has been unpinned when trying to mark the unused index tuples on the page as dead, the LSN of the index page is remembered when reading the index page for index tuples, and we do not attempt to flag index tuples as dead if the page is not pinned and the LSN does not match. Will reword that part to try to make it clearer. Thanks! -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Mon, Mar 9, 2015 at 9:08 PM, Michael Paquier wrote: On Mon, Mar 9, 2015 at 4:29 PM, Fujii Masao wrote: Thanks for updating the patch! Attached is the refactored version of the patch. Fujii-san and I had a short chat about tuning the PGLZ strategy a bit, which is now PGLZ_strategy_default in the patch (at least 25% of compression, etc.). In particular min_input_size, which is now set at 32B, is too low, and knowing that the minimum fillfactor of a relation page is 10% this looks really too low. For example, using the extension attached to this email, able to compress and decompress bytea strings, that I have developed after pglz was moved to libpgcommon (contains as well a function able to get a relation page without its hole, feel free to use it), I am seeing that we can gain quite a lot of space even with some incompressible data like UUID or some random float data (pages are compressed without their hole): 1) Float table: =# create table float_tab (id float); CREATE TABLE =# insert into float_tab select random() from generate_series(1, 20); INSERT 0 20 =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('float_tab'::regclass, 0, false); -[ RECORD 1 ]+ compress_size| 329 raw_size_no_hole | 744 =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('float_tab'::regclass, 0, false); -[ RECORD 1 ]+- compress_size| 1753 raw_size_no_hole | 4344 So that's more or less 60% saved...
2) UUID table =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('uuid_tab'::regclass, 0, false); -[ RECORD 1 ]+ compress_size| 590 raw_size_no_hole | 904 =# insert into uuid_tab select gen_random_uuid() from generate_series(1, 100); INSERT 0 100 =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('uuid_tab'::regclass, 0, false); -[ RECORD 1 ]+- compress_size| 3338 raw_size_no_hole | 5304 And in this case we are close to 40% saved... At least, knowing that with the header there are at least 24B used on a page, what about increasing min_input_size to something like 128B or 256B? I don't think that this is a blocker for this patch as most of the relation pages are going to have far more data than that so they will be unconditionally compressed, but there is definitely something we could do in this area later on, perhaps even improving the other parameters like the compression rate. So that's something to keep in mind... -- Michael compress_test.tar.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
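Michael's threshold point can be illustrated without touching the FPW code path at all: TOAST uses the same pglz compressor, and short values are stored verbatim while long compressible values shrink dramatically. A minimal sketch (the table and values are hypothetical, and exact sizes vary by version and platform):

```sql
-- Short values stay roughly at their raw size; the long repetitive
-- value is compressed by pglz once the tuple crosses the TOAST
-- threshold, mirroring the page-level savings shown above.
CREATE TABLE compress_demo (v text);
INSERT INTO compress_demo VALUES (repeat('x', 16)), (repeat('x', 4000));
SELECT length(v) AS raw_len, pg_column_size(v) AS stored_len
FROM compress_demo;
```

The tiny input's stored size barely differs from its raw size, which is why raising min_input_size costs little, while the 4000-byte value stores at a small fraction of its length.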
Re: [HACKERS] Relation ordering in FROM clause causing error related to missing entry... Or not.
Michael, * Michael Paquier (michael.paqu...@gmail.com) wrote: Today while playing with some queries I bumped into the following thing: =# with count_query as (select generate_series(0,1) as a) select b from count_query, generate_series(1, count_query.a) as b; b --- 1 (1 row) The above results in an implicit LATERAL being done. =# with count_query as (select generate_series(0,1) as a) select b from generate_series(1, count_query.a) as b, count_query; ERROR: 42P01: missing FROM-clause entry for table count_query LINE 1: ...eries(0,1) as a) select b from generate_series(1, count_quer... ^ LOCATION: errorMissingRTE, parse_relation.c:2850 This doesn't work, because the generate_series() comes first -- where would it get the count_query relation? I have been a little bit surprised by the fact that different entry ordering in the FROM clause of the main query had different effects. Perhaps there is something I am missing? I haven't looked at the code but if this happens to be a bug I am fine to submit a patch. Yeah, it's simply because we can turn one into an implicit LATERAL, but we can't do that for the other. Thanks, Stephen signature.asc Description: Digital signature
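The implicit LATERAL that Stephen describes can be made explicit by writing the keyword; a LATERAL item may only reference FROM items to its left, which is why only the first ordering works. A sketch:

```sql
-- Equivalent to the working query, with LATERAL spelled out; the
-- function call may reference count_query because count_query appears
-- earlier in the FROM list.
WITH count_query AS (SELECT generate_series(0, 1) AS a)
SELECT b
FROM count_query,
     LATERAL generate_series(1, count_query.a) AS b;
-- Reversing the two FROM items fails even with an explicit LATERAL,
-- since LATERAL references can only look leftward.
```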
Re: [HACKERS] CATUPDATE confusion?
On Sat, Mar 7, 2015 at 6:40 PM, Adam Brightwell adam.brightw...@crunchydatasolutions.com wrote: pg_shadow, pg_user and pg_group were added when role support was added, specifically for backwards compatibility. I don't believe there was ever discussion about keeping them because filtering pg_roles based on rolcanlogin was too onerous. That said, we already decided recently that we wanted to keep them updated to match the actual attributes available (note that the replication role attribute modified those views) and I think that makes sense on the removal side as well as the new-attribute side. I continue to feel that we really should officially deprecate those views as I don't think they're actually all that useful any more and maintaining them ends up bringing up all these questions, discussion, and ends up being largely busy-work if no one really uses them. Deprecation would certainly seem like an appropriate path for 'usecatupd' (and the mentioned views). Though, is there a 'formal' deprecation policy/process that the community follows? I certainly understand and support giving client/dependent tools the time and opportunity to update accordingly. Therefore, I think having them read from 'rolsuper' for the time being would be ok, provided that they are actually removed at the next possible opportunity. Otherwise, I'd probably lean towards just removing them now and getting it over with. Unless some popular tool like pgAdmin is using these views, I'd say we should just nuke them outright. If it breaks somebody's installation, then they can always recreate the view on their particular system. That seems like an adequate workaround to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
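If the views were nuked, the workaround Robert describes is straightforward; a sketch of recreating a minimal pg_user-style view over pg_roles (view name hypothetical, column list abbreviated):

```sql
-- Site-local replacement for a removed compatibility view.
CREATE VIEW my_pg_user AS
    SELECT rolname        AS usename,
           oid            AS usesysid,
           rolsuper       AS usesuper,
           rolreplication AS userepl
    FROM pg_roles
    WHERE rolcanlogin;
```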
Re: [HACKERS] moving from contrib to bin
Andres Freund wrote: Rebased (fair amount of trivial conflicts due to the copyright year bump) and attached as -MC style format-patch. If you look at the content of the patches you can see that the diff makes more sense now. Ah --- I just realized that the change Peter is proposing from backslash to forward slashes in the MSVC build stuff is related to testability of this patch set in non-Windows environments. Can we get that patch done please? It seems there's only a small bit missing there. How do we go about getting these patches pushed? Note that each utility we move to src/bin will create a new translatable module for 9.5. It would be better to do it soon rather than waiting at the very end of the cycle, so that translators have time to work through the bunch of extra files. -- Álvaro Herrerahttp://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
Michael Paquier wrote: On Tue, Mar 10, 2015 at 10:48 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: How do we go about getting these patches pushed? I think that one of the last points raised (by Andres) was whether the Makefiles in src/bin should depend on pgxs.mk or not. FWIW, I think as well that it should be a subsequent patch not related to this one as it is a different debate to have stuff of src/bin depend on contrib infrastructure. I don't think we care one bit whether these modules use pgxs, at least not currently. If we find any issues later on, it should be an easy fix anyway. Now, the pushing plan was to have the stuff of pg_upgrade done in a separate commit. Note that I am fine helping out wrapping up things particularly on Windows if that helps. I vote for moving one module per commit. Probably the first commit will be the hardest one to get right, so let's pick an easy module -- say, pgbench. That way there are fewer gotchas hopefully. Once we have that working everywhere we look into moving another one. Note that each utility we move to src/bin will create a new translatable module for 9.5. It would be better to do it soon rather than waiting at the very end of the cycle, so that translators have time to work through the bunch of extra files. That's a point. I'm a translator ;-) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
Michael Paquier wrote: On Wed, Mar 11, 2015 at 10:06 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Now, the pushing plan was to have the stuff of pg_upgrade done in a separate commit. Note that I am fine helping out wrapping up things particularly on Windows if that helps. I vote for moving one module per commit. Probably the first commit will be the hardest one to get right, so let's pick an easy module -- say, pgbench. That way there are fewer gotchas hopefully. Once we have that working everywhere we look into moving another one. Fine for me. You want a cross-OS patch? Yep, I'm willing to test and push such a patch. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: searching in array function - array_position
On Tue, Mar 10, 2015 at 5:53 PM, Jim Nasby jim.na...@bluetreble.com wrote: I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? Not if you want the opr_sanity tests to pass. (But I'm seriously starting to wonder if that's actually a smart rule for us to be enforcing. It seems to be something of a pain in the neck, and I'm unclear as to whether it is preventing any real problem.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Question about lazy_space_alloc() / linux over-commit
I'm okay with any of the proposed designs or with dropping the idea. Closing the loop on a few facts: On Sat, Mar 07, 2015 at 04:34:41PM -0600, Jim Nasby wrote: If we go that route, does it still make sense to explicitly use repalloc_huge? It will just cut over to that at some point (128M?) anyway, and if you're vacuuming a small relation presumably it's not worth messing with. repalloc_huge() differs from repalloc() only in the size ceiling beyond which they raise errors. repalloc() raises errors for requests larger than ~1 GiB, while repalloc_huge() is practically unconstrained on 64-bit and permits up to ~2 GiB on 32-bit. On Mon, Mar 09, 2015 at 05:12:22PM -0500, Jim Nasby wrote: Speaking of which... people have referenced allowing 1GB of dead tuples, which means allowing maintenance_work_mem MAX_KILOBYTES. The comment for that says: /* upper limit for GUC variables measured in kilobytes of memory */ /* note that various places assume the byte size fits in a long variable */ So I'm not sure how well that will work. I think that needs to be a separate patch. On LP64 platforms, MAX_KILOBYTES already covers maintenance_work_mem values up to ~2 TiB. Raising the limit on ILP32 platforms is not worth the trouble. Raising the limit on LLP64 platforms is a valid but separate project. nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit
2015-03-11 0:24 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On the technical side, I do agree that the requirement to allocate and zero an array for every new simple expression is unfortunate, but I'm not convinced that repeatedly invoking the hook-function is a good way to solve that problem. Calling the hook-function figures to be significantly more expensive than an array-fetch, so you can almost think of the ParamListInfo entries themselves as a cache of data retrieved from the hook function. In that view of the world, the problem here is that initializing the cache is too expensive. Your solution to that is to rip the cache out completely, but what would be better is to keep the cache and make initializing it cheaper - e.g. by sharing one ParamListInfo across all expressions in the same scope; or by not zeroing the memory and instead tracking that the first N slots are the only ones we've initialized; or $your_idea_here. Getting back to the actual merits of this patch --- you're right, it was not a good idea as proposed. I'd been thinking about the scalar expression case only, and forgot that the same code is used to pass parameters to queries, which might evaluate those parameters quite a large number of times. So any savings from reducing the setup time are sure to be swamped if the hook-function gets called repeatedly. Another problem is that for ROW and REC datums, exec_eval_datum() pallocs memory that won't get freed till the end of the statement, so repeat evaluation would cause memory leaks. So we really need evaluate-at-most-once semantics in there. However, this attempt did confirm that we don't need to create a separate ParamListInfo struct for each evaluation attempt. So the attached, greatly stripped-down patch just allocates a ParamListInfo once at function startup, and then zeroes it each time.
This does nothing for the memset overhead but it does get rid of the palloc traffic, which seems to be good for a few percent on simple-assignment-type statements. AFAICS this is an unconditional win compared to HEAD. What's more, it provides at least a basis for doing better later: if we could think of a reasonably cheap way of tracking which datums get invalidated by an assignment, we might be able to reset only selected entries in the array rather than blindly blowing away all of them. I propose to apply this and leave it at that for now. +1 Regards Pavel regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of standard comparison operators
On Tue, Mar 10, 2015 at 9:37 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 10, 2015 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Do we have consensus on doing this? Should we have the warning on by default, or off? I vote for defaulting the warning to off. If that proves to be too problematic, I'd take that as a sign that this whole change is not as low-impact as we're hoping, and maybe consider a rethink. Do we want to have three states? On, Off, and Auto? We can then change what Auto means in a point-release while letting people who have chosen On or Off have their wish. Auto could also consider some other data - like how long ago the database was initialized... I would vote for Auto meaning On in the .0 release. David J.
Re: [HACKERS] Precedence of standard comparison operators
On Tue, Mar 10, 2015 at 12:45 PM, David G. Johnston david.g.johns...@gmail.com wrote: On Tue, Mar 10, 2015 at 9:37 AM, Robert Haas robertmh...@gmail.com wrote: On Tue, Mar 10, 2015 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Do we have consensus on doing this? Should we have the warning on by default, or off? I vote for defaulting the warning to off. If that proves to be too problematic, I'd take that as a sign that this whole change is not as low-impact as we're hoping, and maybe consider a rethink. Do we want to have three states? On, Off, and Auto? We can then change what Auto means in a point-release while letting people who have chosen On or Off have their wish. Auto could also consider some other data - like how long ago the database was initialized... I would vote for Auto meaning On in the .0 release. I don't think users will appreciate an auto value whose meaning might change at some point, and I doubt we'd have much luck identifying the correct point, either. Users will upgrade over the course of years, not months, and will not necessarily complete their application retesting within any particular period of time thereafter. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of standard comparison operators
On Tue, Mar 10, 2015 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: Andres Freund and...@anarazel.de writes: On February 26, 2015 10:29:18 PM CET, Peter Eisentraut pete...@gmx.net wrote: My suggestion was to treat this like the standard_conforming_string change. That is, warn for many years before changing. I don't think scs is a good example to follow. Yeah. For one thing, there wouldn't be any way to suppress the warning other than to parenthesize your code, which I would find problematic because it would penalize standard-conforming queries. I'd prefer an arrangement whereby once you fix your code to be standard-conforming, you're done. A possible point of compromise would be to leave the warning turned on by default, at least until we get a better sense of how this would play out in the real world. I continue to suspect that we're making a mountain out of, if not a molehill, at least a hillock. I think most sane people would have parenthesized their queries to start with rather than go look up whether IS DISTINCT FROM binds tighter than = ... This thread seems to have died off without any clear resolution. I'd hoped somebody would try the patch on some nontrivial application to see if it broke anything or caused any warnings, but it doesn't seem like that is happening. Do we have consensus on doing this? Should we have the warning on by default, or off? I vote for defaulting the warning to off. If that proves to be too problematic, I'd take that as a sign that this whole change is not as low-impact as we're hoping, and maybe consider a rethink. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Precedence of standard comparison operators
On Tue, Mar 10, 2015 at 1:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Mar 10, 2015 at 12:45 PM, David G. Johnston david.g.johns...@gmail.com wrote: I would vote for Auto meaning On in the .0 release. I don't think users will appreciate an auto value whose meaning might change at some point, and I doubt we've have much luck identifying the correct point, either. Users will upgrade over the course of years, not months, and will not necessarily complete their application retesting within any particular period of time thereafter. Yeah, I think that's too cute by far. And people do not like things like this changing in point releases. If we do this, I envision it as being on by default in 9.5 and then changing the default in 9.6 or 9.7 or so. Another possibility is to leave it on through beta testing with the intent to turn it off before 9.5 final; that would give us more data about whether there are real issues than we're likely to get otherwise. To my mind, the fact that we're doing this at all is largely predicated on the fact that there won't be many real issues. So I think the goal of the debugging messages ought to be to let those people who discover that they do have issues track them down more easily, not to warn people. Warning is sort of closing the barn door after the horse has got out: hey, by the way, I just broke your app. Another thing to consider is that if it becomes common to run with these warnings on, then everybody will have to pretty much write their code with full parenthesization anyway, at least if they plan to publish their code on PGXN or anywhere that it might get run on some system other than the one it was written for. That seems like an annoying gotcha for an issue that we're not expecting to be common. 
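For reference, a sketch of the kind of expression affected: under the 9.5 rules the comparison operators share one non-associative precedence level, so a chained form like x < y = y < z is no longer accepted, and full parenthesization keeps a query valid and unambiguous under both old and new rules:

```sql
-- Fully parenthesized: independent of comparison-operator precedence,
-- so it parses identically before and after the change.
SELECT (x < y) = (y < z) AS consistent
FROM (VALUES (1, 2, 3)) AS t(x, y, z);
```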
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: disallow operator = and use it for named parameters
On Tue, Mar 10, 2015 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Thu, Feb 19, 2015 at 3:15 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am marking this as Ready For Committer, the patch is trivial and works as expected, there is nothing to be added to it IMHO. The = operator was deprecated for several years so it should not be too controversial either. Committed with a few documentation tweaks. Was there any consideration given to whether ruleutils should start printing NamedArgExprs with =? Or do we think that needs to wait? I have to admit that I didn't consider that. What do you think? I guess I'd be tentatively in favor of changing that to match, but I could be convinced otherwise. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: disallow operator = and use it for named parameters
Robert Haas robertmh...@gmail.com writes: On Tue, Mar 10, 2015 at 11:50 AM, Tom Lane t...@sss.pgh.pa.us wrote: Was there any consideration given to whether ruleutils should start printing NamedArgExprs with =? Or do we think that needs to wait? I have to admit that I didn't consider that. What do you think? I guess I'd be tentatively in favor of changing that to match, but I could be convinced otherwise. Well, as said upthread, the argument for not changing would be that it would make it easier to dump views and reload them into older PG versions. I'm not sure how big a consideration that is, or whether it outweighs possible cross-DBMS compatibility benefits of dumping the more standard syntax. Presumably we are going to change it at some point; maybe we should just do it rather than waiting another 5 years. IOW, I guess I lean mildly towards changing, but I've been beaten up enough lately about backwards-compatibility worries that I'm not going to fight for changing this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
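For context, the notation in question looks like this once the user-definable => operator is removed; ruleutils currently deparses named arguments with the older := spelling, which is the question above. A sketch (function name hypothetical):

```sql
CREATE FUNCTION concat_pair(a text, b text DEFAULT '?')
    RETURNS text LANGUAGE sql
    AS $$ SELECT a || '-' || b $$;

-- SQL-standard named-argument notation, newly allowed:
SELECT concat_pair(b => 'right', a => 'left');

-- The PostgreSQL-specific spelling remains accepted:
SELECT concat_pair(b := 'right', a := 'left');
```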
Re: [HACKERS] moving from contrib to bin
On Wed, Mar 11, 2015 at 12:00 PM, Peter Eisentraut pete...@gmx.net wrote: Here is a rebase of my previous patch set. I have integrated the various minor fixes from Michael and Andres. I have dropped moving pg_standby, because no one seemed to like that. I wasn't able to do anything with Michael's Mkvcbuild.pm patch, since that appeared to include a significant refactoring along with the actual move. Maybe the refactoring can be done separately first? Yes. I was just looking at the refactoring part so as it could be applied before the rest first. Otherwise, I suggest we start with the first patch, pg_archivecleanup, fix up the Windows build system for it, and commit it, and repeat. I'd rather vote for having the Windows-side stuff integrated with each patch. Mind if I rebase what you just sent with the Windows things added? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
Here is a rebase of my previous patch set. I have integrated the various minor fixes from Michael and Andres. I have dropped moving pg_standby, because no one seemed to like that. I wasn't able to do anything with Michael's Mkvcbuild.pm patch, since that appeared to include a significant refactoring along with the actual move. Maybe the refactoring can be done separately first? Otherwise, I suggest we start with the first patch, pg_archivecleanup, fix up the Windows build system for it, and commit it, and repeat. From a4bb7c5276a9db10f06259f11a3a8f8a0dd49645 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut peter_e@gmx.net Date: Tue, 10 Mar 2015 22:33:23 -0400 Subject: [PATCH 1/7] Move pg_archivecleanup from contrib/ to src/bin/ --- contrib/Makefile | 1 - contrib/pg_archivecleanup/Makefile | 18 -- doc/src/sgml/contrib.sgml | 1 - doc/src/sgml/filelist.sgml | 1 - doc/src/sgml/ref/allfiles.sgml | 1 + doc/src/sgml/{ = ref}/pgarchivecleanup.sgml | 10 +- doc/src/sgml/reference.sgml| 1 + src/bin/Makefile | 1 + {contrib = src/bin}/pg_archivecleanup/.gitignore | 0 src/bin/pg_archivecleanup/Makefile | 14 ++ .../bin}/pg_archivecleanup/pg_archivecleanup.c | 2 +- 11 files changed, 19 insertions(+), 31 deletions(-) delete mode 100644 contrib/pg_archivecleanup/Makefile rename doc/src/sgml/{ = ref}/pgarchivecleanup.sgml (97%) rename {contrib = src/bin}/pg_archivecleanup/.gitignore (100%) create mode 100644 src/bin/pg_archivecleanup/Makefile rename {contrib = src/bin}/pg_archivecleanup/pg_archivecleanup.c (99%) diff --git a/contrib/Makefile b/contrib/Makefile index 195d447..c56050e 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -28,7 +28,6 @@ SUBDIRS = \ oid2name \ pageinspect \ passwordcheck \ - pg_archivecleanup \ pg_buffercache \ pg_freespacemap \ pg_prewarm \ diff --git a/contrib/pg_archivecleanup/Makefile b/contrib/pg_archivecleanup/Makefile deleted file mode 100644 index ab52390..000 --- a/contrib/pg_archivecleanup/Makefile +++ /dev/null @@ -1,18 +0,0 @@ -# 
contrib/pg_archivecleanup/Makefile - -PGFILEDESC = pg_archivecleanup - cleans archive when used with streaming replication -PGAPPICON = win32 - -PROGRAM = pg_archivecleanup -OBJS = pg_archivecleanup.o $(WIN32RES) - -ifdef USE_PGXS -PG_CONFIG = pg_config -PGXS := $(shell $(PG_CONFIG) --pgxs) -include $(PGXS) -else -subdir = contrib/pg_archivecleanup -top_builddir = ../.. -include $(top_builddir)/src/Makefile.global -include $(top_srcdir)/contrib/contrib-global.mk -endif diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml index a698d0f..f21fa14 100644 --- a/doc/src/sgml/contrib.sgml +++ b/doc/src/sgml/contrib.sgml @@ -202,7 +202,6 @@ titleServer Applications/title part of the core productnamePostgreSQL/productname distribution. /para - pgarchivecleanup; pgstandby; pgtestfsync; pgtesttiming; diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index f03b72a..5e3c34b 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -125,7 +125,6 @@ !ENTITY pageinspect SYSTEM pageinspect.sgml !ENTITY passwordcheck SYSTEM passwordcheck.sgml !ENTITY pgbench SYSTEM pgbench.sgml -!ENTITY pgarchivecleanup SYSTEM pgarchivecleanup.sgml !ENTITY pgbuffercache SYSTEM pgbuffercache.sgml !ENTITY pgcryptoSYSTEM pgcrypto.sgml !ENTITY pgfreespacemap SYSTEM pgfreespacemap.sgml diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index 7aa3128..cbe4611 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -179,6 +179,7 @@ !ENTITY dropuser SYSTEM dropuser.sgml !ENTITY ecpgRefSYSTEM ecpg-ref.sgml !ENTITY initdb SYSTEM initdb.sgml +!ENTITY pgarchivecleanup SYSTEM pgarchivecleanup.sgml !ENTITY pgBasebackup SYSTEM pg_basebackup.sgml !ENTITY pgConfig SYSTEM pg_config-ref.sgml !ENTITY pgControldata SYSTEM pg_controldata.sgml diff --git a/doc/src/sgml/pgarchivecleanup.sgml b/doc/src/sgml/ref/pgarchivecleanup.sgml similarity index 97% rename from doc/src/sgml/pgarchivecleanup.sgml rename to 
doc/src/sgml/ref/pgarchivecleanup.sgml index fdf0cbb..779159d 100644 --- a/doc/src/sgml/pgarchivecleanup.sgml +++ b/doc/src/sgml/ref/pgarchivecleanup.sgml @@ -1,4 +1,4 @@ -!-- doc/src/sgml/pgarchivecleanup.sgml -- +!-- doc/src/sgml/ref/pgarchivecleanup.sgml -- refentry id=pgarchivecleanup indexterm zone=pgarchivecleanup @@ -194,14 +194,6 @@ titleExamples/title /refsect1 refsect1 - titleAuthor/title - - para - Simon Riggs emailsimon@2ndquadrant.com/email - /para - /refsect1 - - refsect1 titleSee Also/title simplelist type=inline diff --git
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Mon, Mar 9, 2015 at 9:08 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Mar 9, 2015 at 4:29 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Mar 5, 2015 at 10:08 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Mar 5, 2015 at 9:14 PM, Syed, Rahila rahila.s...@nttdata.com wrote: Please find attached a patch. As discussed, a flag to denote compression and the presence of a hole in the block image has been added in XLogRecordImageHeader rather than the block header. Thanks for updating the patch! Attached is the refactored version of the patch. Cool. Thanks! I have some minor comments: Thanks for the comments! +Turning this parameter on can reduce the WAL volume without Turning valueon/ this parameter That tag is not used in any other place in config.sgml, so I'm not sure if it's really necessary. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
On Tue, Mar 10, 2015 at 8:07 PM, Michael Paquier wrote: I'd rather vote for having the Windows-side stuff integrated with each patch. Mind if I rebase what you just sent with the Windows things added? And here is the rebased series with the MSVC changes included for each module in its individual patch. 0001 is the refactoring patch for the MSVC scripts, which should be applied before the rest. -- Michael

From 4a15216e282d700a409b7dcad74b553122a534a3 Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Mon, 15 Dec 2014 22:16:36 -0800
Subject: [PATCH 1/8] Refactor MSVC scripts for modules of src/bin similarly to contrib/

This refactoring is preparation work for the integration of many contrib/ modules into src/bin, and permits any new binary added to src/bin/ to be detected by default by the build process.
---
 src/tools/msvc/Mkvcbuild.pm | 104 +---
 1 file changed, 69 insertions(+), 35 deletions(-)

diff --git a/src/tools/msvc/Mkvcbuild.pm b/src/tools/msvc/Mkvcbuild.pm
index 5dc8426..bd63193 100644
--- a/src/tools/msvc/Mkvcbuild.pm
+++ b/src/tools/msvc/Mkvcbuild.pm
@@ -29,6 +29,7 @@
 my $libpgcommon;
 my $postgres;
 my $libpq;
 
+# Set of variables for contrib modules
 my $contrib_defines = { 'refint' => 'REFINT_VERBOSE' };
 my @contrib_uselibpq =
   ('dblink', 'oid2name', 'pgbench', 'pg_upgrade', 'postgres_fdw', 'vacuumlo');
@@ -48,11 +49,25 @@
 my $contrib_extralibs = { 'pgbench' => ['ws2_32.lib'] };
 my $contrib_extraincludes =
   { 'tsearch2' => ['contrib/tsearch2'], 'dblink' => ['src/backend'] };
 my $contrib_extrasource = {
-	'cube' => [ 'cubescan.l', 'cubeparse.y' ],
-	'pgbench' => [ 'exprscan.l', 'exprparse.y' ],
-	'seg' => [ 'segscan.l', 'segparse.y' ], };
+	'cube' => [ 'contrib\cube\cubescan.l', 'contrib\cube\cubeparse.y' ],
+	'pgbench' =>
+	  [ 'contrib\pgbench\exprscan.l', 'contrib\pgbench\exprparse.y' ],
+	'seg' => [ 'contrib\seg\segscan.l', 'contrib\seg\segparse.y' ], };
 my @contrib_excludes = ('pgcrypto', 'intagg', 'sepgsql');
 
+# Set of variables for frontend modules
+my $frontend_defines = { 'initdb' => 'FRONTEND' };
+my @frontend_uselibpq = ('pg_ctl', 'psql');
+my $frontend_extralibs = {
+	'initdb' => ['ws2_32.lib'],
+	'pg_restore' => ['ws2_32.lib'],
+	'psql' => ['ws2_32.lib'] };
+my $frontend_extraincludes = {
+	'initdb' => ['src\timezone'],
+	'psql' => ['src\bin\pg_dump', 'src\backend'] };
+my $frontend_extrasource = {
+	'psql' => [ 'src\bin\psql\psqlscan.l' ] };
+my @frontend_excludes = ('pgevent', 'pg_basebackup', 'pg_dump',
+	'scripts');
+
 sub mkvcbuild
 {
 	our $config = shift;
@@ -380,11 +395,15 @@
 	$pgregress_isolation->AddReference($libpgcommon, $libpgport);
 
 	# src/bin
-	my $initdb = AddSimpleFrontend('initdb');
-	$initdb->AddIncludeDir('src\interfaces\libpq');
-	$initdb->AddIncludeDir('src\timezone');
-	$initdb->AddDefine('FRONTEND');
-	$initdb->AddLibrary('ws2_32.lib');
+	my $D;
+	opendir($D, 'src/bin') || croak "Could not opendir on src/bin!\n";
+	while (my $d = readdir($D))
+	{
+		next if ($d =~ /^\./);
+		next unless (-f "src/bin/$d/Makefile");
+		next if (grep { /^$d$/ } @frontend_excludes);
+		AddSimpleFrontend($d);
+	}
 
 	my $pgbasebackup = AddSimpleFrontend('pg_basebackup', 1);
 	$pgbasebackup->AddFile('src\bin\pg_basebackup\pg_basebackup.c');
@@ -400,14 +419,6 @@
 	$pgrecvlogical->AddFile('src\bin\pg_basebackup\pg_recvlogical.c');
 	$pgrecvlogical->AddLibrary('ws2_32.lib');
 
-	my $pgconfig = AddSimpleFrontend('pg_config');
-
-	my $pgcontrol = AddSimpleFrontend('pg_controldata');
-
-	my $pgctl = AddSimpleFrontend('pg_ctl', 1);
-
-	my $pgreset = AddSimpleFrontend('pg_resetxlog');
-
 	my $pgevent = $solution->AddProject('pgevent', 'dll', 'bin');
 	$pgevent->AddFiles('src\bin\pgevent', 'pgevent.c', 'pgmsgevent.rc');
 	$pgevent->AddResourceFile('src\bin\pgevent', 'Eventlog message formatter',
@@ -416,12 +427,6 @@
 	$pgevent->UseDef('src\bin\pgevent\pgevent.def');
 	$pgevent->DisableLinkerWarnings('4104');
 
-	my $psql = AddSimpleFrontend('psql', 1);
-	$psql->AddIncludeDir('src\bin\pg_dump');
-	$psql->AddIncludeDir('src\backend');
-	$psql->AddFile('src\bin\psql\psqlscan.l');
-	$psql->AddLibrary('ws2_32.lib');
-
 	my $pgdump = AddSimpleFrontend('pg_dump', 1);
 	$pgdump->AddIncludeDir('src\backend');
 	$pgdump->AddFile('src\bin\pg_dump\pg_dump.c');
@@ -532,7 +537,6 @@
 	my $mf = Project::read_file('contrib/pgcrypto/Makefile');
 	GenerateContribSqlFiles('pgcrypto', $mf);
 
-	my $D;
 	opendir($D, 'contrib') || croak "Could not opendir on contrib!\n";
 	while (my $d = readdir($D))
 	{
@@ -652,6 +656,9 @@
 		$p->AddIncludeDir('src\interfaces\libpq');
 		$p->AddReference($libpq);
 	}
+
+	# Adjust module definition using frontend variables
+	AdjustFrontendProj($p);
 	return $p;
 }
@@ -744,45 +751,72 @@
 sub AdjustContribProj
 {
	my $proj = shift;
-	my $n = $proj->{name};
+	AdjustModule($proj, $contrib_defines, \@contrib_uselibpq,
Re: [HACKERS] moving from contrib to bin
On Wed, Mar 11, 2015 at 10:06 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Michael Paquier wrote: On Tue, Mar 10, 2015 at 10:48 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: How do we go about getting these patches pushed? I think that one of the last points raised (by Andres) was whether the Makefiles in src/bin should depend on pgxs.mk or not. FWIW, I think as well that it should be a subsequent patch not related to this one, as it is a different debate to have stuff in src/bin depend on contrib infrastructure. I don't think we care one bit whether these modules use pgxs, at least not currently. If we find any issues later on, it should be an easy fix anyway. Now, the pushing plan was to have the stuff of pg_upgrade done in a separate commit. Note that I am fine helping out wrapping things up, particularly on Windows, if that helps. I vote for moving one module per commit. Probably the first commit will be the hardest one to get right, so let's pick an easy module -- say, pgbench. That way there are fewer gotchas, hopefully. Once we have that working everywhere we look into moving another one. Fine for me. You want a cross-OS patch? -- Michael
Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit
On Mon, Mar 09, 2015 at 03:06:24PM -0400, Robert Haas wrote: What I do care about is that we as a project have to at some point be willing to begin closing the spigot on new patches and focusing on polishing and shipping the patches we've already got. I think it's perfectly reasonable to worry about where we are on that continuum when it's already several weeks after the start of the last CommitFest, but if I'm in the minority on that, so be it. I am in your minority on that point.
Re: [HACKERS] pg_rewind in contrib
On Wed, Mar 11, 2015 at 3:44 AM, Heikki Linnakangas hlinn...@iki.fi wrote: On 03/10/2015 07:46 AM, Amit Kapila wrote: Isn't it possible, in case of async replication, that the old cluster has some blocks which the new cluster doesn't have? What will it do in such a case? Sure, that's certainly possible. If the source cluster doesn't have some blocks that exist in the target, IOW a file in the source cluster is shorter than the same file in the target, that means that the relation was truncated in the source.

Can't that happen if the source database (new master) hasn't received all of the data from the target database (old master) at the time of promotion? If yes, then the source database won't have WAL for the truncation, and the way the current mechanism works is a must. Now I think for such a case doing the truncation in the target database is the right solution; however, should we warn the user in some way (either by mentioning it in the docs, or in the pg_rewind utility after it does the truncation) that some of the data belonging to the old master will be overwritten by this operation, so that he can keep a backup copy of it if he wants?

I have tried to test some form of such a case and it seems to fail with the below error:

pg_rewind.exe -D ..\..\Data\ --source-pgdata=..\..\Database1
The servers diverged at WAL position 0/16DE858 on timeline 1.
Rewinding from last common checkpoint at 0/16B8A70 on timeline 1
could not open file ..\..\Data\/base/12706/16391 for truncation: No such file or directory
Failure, exiting

Hmm, could that be just because of the funny business with the Windows path separators? Does it work if you use -D ..\..\Data instead, without the last backslash?

I have tried without the backslash as well, but it still returns the same error.

pg_rewind.exe -D ..\..\Data --source-pgdata=..\..\Database1
The servers diverged at WAL position 0/1769BD8 on timeline 5.
Rewinding from last common checkpoint at 0/1769B30 on timeline 5
could not open file ..\..\Data/base/12706/16394 for truncation: No such file or directory
Failure, exiting

I have even tried with the complete path:

pg_rewind.exe -D E:\WorkSpace\PostgreSQL\master\Data --source-pgdata=E:\WorkSpace\PostgreSQL\master\Database1
The servers diverged at WAL position 0/1782830 on timeline 6.
Rewinding from last common checkpoint at 0/1782788 on timeline 6
could not open file E:\WorkSpace\PostgreSQL\master\Data/base/12706/16395 for truncation: No such file or directory
Failure, exiting

Another point is that after the above error, the target database gets corrupted: basically, the target database contains extra data from the source database along with part of its own data. I think that's because the truncation didn't happen. On retry it gives the below message:

pg_rewind.exe -D ..\..\Data --source-pgdata=..\..\Database1
source and target cluster are on the same timeline
Failure, exiting

I think the message displayed in this case is okay; however, reporting it as 'Failure' looks slightly odd.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
[HACKERS] Re: In-core regression tests for replication, cascading, archiving, PITR, etc.
On Sun, Mar 08, 2015 at 08:19:39PM +0900, Michael Paquier wrote: So I am planning to seriously focus soon on this stuff, basically using the TAP tests as base infrastructure for this regression test suite. First, does using the TAP tests sound fine? Yes. Off the top of my mind, I have the following items that should be tested:
- WAL replay: from archive, from stream
- hot standby and read-only queries
- node promotion
- recovery targets and their interferences when multiple targets are specified (XID, name, timestamp, immediate)
- timelines
- recovery_target_action
- recovery_min_apply_delay (check that WAL is fetched from a source at some correct interval; can use a special restore_command for that)
- archive_cleanup_command (check that the command is kicked at each restart point)
- recovery_end_command (check that the command is kicked at the end of recovery)
- timeline jump of a standby after reconnecting to a promoted node
Those sound good. The TAP suites still lack support for any Windows target. If you're inclined to fix that, it would be a great contribution. The more tests we accrue before doing that, the harder it will be to dig out.
Re: [HACKERS] proposal: disallow operator => and use it for named parameters
On Thu, Feb 19, 2015 at 3:15 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am marking this as Ready For Committer; the patch is trivial and works as expected, and there is nothing to be added to it IMHO. The => operator was deprecated for several years, so it should not be too controversial either. Committed with a few documentation tweaks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Reduce pinning in btree indexes
Kevin Grittner kgri...@ymail.com wrote: Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Would you mind rewriting the comment there like this?

- /* The buffer is still pinned, but not locked. Lock it now. */
+ /* I still hold the pin on the buffer, but not locked. Lock it now. */

Or would you mind renaming the macro to BTScanPosIsPinnedByMe or something like that, or any way to notify the reader that the pin should be mine there? I see your point, although first-person singular pronouns used like that make me a little uncomfortable; I'll change the comment and/or macro name, but I'll work on the name some more. After thinking it over, I think that the BTScanPos part of the macro name is enough of a hint that it is concerned with the actions of this scan; it is the comment that needs the change. I went with:

/*
 * We have held the pin on this page since we read the index tuples,
 * so all we need to do is lock it.  The pin will have prevented
 * re-use of any TID on the page, so there is no need to check the
 * LSN.
 */

+ To handle the cases where it is safe to release the pin after
+ reading the index page, the LSN of the index page is read along
+ with the index entries before the shared lock is released, and we
+ do not attempt to flag index tuples as dead if the page is not
+ pinned and the LSN does not match.

I suppose that a sentence like the following describes the mechanism more directly and makes it easier to find the corresponding code, if it is correct: To handle the cases where an index page has been unpinned when trying to mark the unused index tuples on the page as dead, the LSN of the index page is remembered when reading the index page for index tuples, and we do not attempt to flag index tuples as dead if the page is not pinned and the LSN does not match. Will reword that part to try to make it clearer. That sentence was full of passive voice, which didn't help any.
I changed it to:

| So that we can handle the cases where we attempt LP_DEAD flagging
| for a page after we have released its pin, we remember the LSN of
| the index page when we read the index tuples from it; we do not
| attempt to flag index tuples as dead if we didn't hold the
| pin the entire time and the LSN has changed.

Do these changes seem clear? Because these changes are just to a comment and a README file, I'm posting a patch-on-patch v3a (to be applied on top of v3). -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Attachment: bt-nopin-v3a.patch
Re: [HACKERS] proposal: disallow operator => and use it for named parameters
On 10/03/15 17:01, Pavel Stehule wrote: 2015-03-10 16:50 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: Committed with a few documentation tweaks. Was there any consideration given to whether ruleutils should start printing NamedArgExprs with =>? Or do we think that needs to wait? I didn't think about it. I don't see any reason why it has to use the deprecated syntax. There is one: loading the output into an older version of Postgres. I don't know if that's an important one, though. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Precedence of standard comparison operators
Tom Lane wrote: Do we have consensus on doing this? Should we have the warning on by default, or off? This is the tough decision, isn't it. I had thought it would default to off and people would only turn it on in their testing procedure prior to the actual upgrade of the production systems, but how are they going to find out they need to turn it on in the first place? We could have a big fat red blinking warning in the release notes and a picture of a dancing elephant in a tutu next to it, and we can be certain that many will miss it anyway. I think we should have an expires value: it means ON unless the system's initdb is older than one month from the current date, in which case it turns itself off. This is of course just a silly joke, but as you said there are probably valid constructs that are going to raise warnings for no reason, so having it default to ON would be pointlessly noisy in systems that have been developed with the new rules. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] proposal: disallow operator => and use it for named parameters
2015-03-10 16:50 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Robert Haas robertmh...@gmail.com writes: On Thu, Feb 19, 2015 at 3:15 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am marking this as Ready For Committer; the patch is trivial and works as expected, and there is nothing to be added to it IMHO. The => operator was deprecated for several years, so it should not be too controversial either. Committed with a few documentation tweaks. Was there any consideration given to whether ruleutils should start printing NamedArgExprs with =>? Or do we think that needs to wait? I didn't think about it. I don't see any reason why it has to use the deprecated syntax. Regards Pavel regards, tom lane
[HACKERS] Stateful C-language function with state managed by third-party library
Hello, My function needs to call a third-party library which would create a state, and that state should then be kept for the duration of the current query. The library can deallocate that state in a correct way. I understand that fn_extra is normally used for this, and usually the state is created in a memory context which is deallocated at the end of the query, so normally it is not an issue. However, I cannot make that library use PostgreSQL utilities for memory management. I am afraid that for long-running sessions it may cause serious memory leaks if the state is not deallocated correctly and in a timely manner. Is there a mechanism for adding a finalizer hook which would be called and passed that pointer after the query is complete? Or perhaps there is another mechanism? I looked in the documentation and in the source but I do not see it mentioned. Thank you. With kind regards, Denys Rtveliashvili
Re: [HACKERS] proposal: disallow operator => and use it for named parameters
Robert Haas robertmh...@gmail.com writes: On Thu, Feb 19, 2015 at 3:15 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I am marking this as Ready For Committer; the patch is trivial and works as expected, and there is nothing to be added to it IMHO. The => operator was deprecated for several years, so it should not be too controversial either. Committed with a few documentation tweaks. Was there any consideration given to whether ruleutils should start printing NamedArgExprs with =>? Or do we think that needs to wait? regards, tom lane
Re: [HACKERS] Precedence of standard comparison operators
I wrote: Andres Freund and...@anarazel.de writes: On February 26, 2015 10:29:18 PM CET, Peter Eisentraut pete...@gmx.net wrote: My suggestion was to treat this like the standard_conforming_strings change. That is, warn for many years before changing. I don't think scs is a good example to follow. Yeah. For one thing, there wouldn't be any way to suppress the warning other than to parenthesize your code, which I would find problematic because it would penalize standard-conforming queries. I'd prefer an arrangement whereby once you fix your code to be standard-conforming, you're done. A possible point of compromise would be to leave the warning turned on by default, at least until we get a better sense of how this would play out in the real world. I continue to suspect that we're making a mountain out of, if not a molehill, at least a hillock. I think most sane people would have parenthesized their queries to start with rather than go look up whether IS DISTINCT FROM binds tighter than = ... This thread seems to have died off without any clear resolution. I'd hoped somebody would try the patch on some nontrivial application to see if it broke anything or caused any warnings, but it doesn't seem like that is happening. Do we have consensus on doing this? Should we have the warning on by default, or off? regards, tom lane
Re: [HACKERS] Precedence of standard comparison operators
On 3/10/15 1:12 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Mar 10, 2015 at 12:45 PM, David G. Johnston david.g.johns...@gmail.com wrote: I would vote for Auto meaning On in the .0 release. I don't think users will appreciate an auto value whose meaning might change at some point, and I doubt we'd have much luck identifying the correct point, either. Users will upgrade over the course of years, not months, and will not necessarily complete their application retesting within any particular period of time thereafter. Yeah, I think that's too cute by far. And people do not like things like this changing in point releases. If we do this, I envision it as being on by default in 9.5 and then changing the default in 9.6 or 9.7 or so. Well, I point again to standard_conforming_strings: leave the warning off for one release (or more), then default to on for one (or more), then change the behavior. We can change the timeline, but I don't think the approach was unsound.
Re: [HACKERS] get_object_address support for additional object types
Stephen, thanks for the comments. Stephen Frost wrote: I don't really care for how all the get_object_address stuff uses lists for arguments instead of using straightforward arguments, but it's how it's been done and I can kind of see the reasoning behind it. I'm not following why you're switching this case (get_object_address_type) to using a ListCell though..? The reason for changing get_object_address_type is that it's a helper subroutine of get_object_address and we can do whatever is more convenient. It turns out that it's more convenient, for the amop/amproc cases, to pass a ListCell instead of a List. get_object_address gets its stuff directly from the parser in some cases, or from pg_get_object_address otherwise, which constructs things to mimic exactly what the parser does. We can change what the parser emits, and as long as we keep get_object_address in agreement, there is no issue. There might be code churn of course (because some of those object representations travel from the parser through other parts of the backend), but that doesn't really matter much. We have now exposed that interface through the SQL-callable pg_get_object_address function, but I'm pretty sure we could change that easily and it wouldn't be a tremendous problem -- as long as we do it before 9.5 is released. For instance we might want to decide that we want to have three lists instead of two; or two lists and a numeric quantity (which would also help the large object case, currently a bit ugly); or that we want something akin to what the parser does to types using struct TypeName for opclass-related objects. Anyway I'm not hot on changing anything here. It's a bit cumbersome an interface to use, but it's not excessively exposed to the user unless they use event triggers, and even then it is perfectly reliable anyhow. I thought the rest of it looked alright.
I agree it's a bit odd how the opfamily is handled, but I agree with your assessment that there's not much better we can do with this object representation. Great, thanks. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] get_object_address support for additional object types
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote: Anyway I'm not hot on changing anything here. It's a bit cumbersome an interface to use, but it's not excessively exposed to the user unless they use event triggers, and even then it is perfectly reliable anyhow. Works for me. Thanks! Stephen
Re: [HACKERS] Precedence of standard comparison operators
Peter Eisentraut pete...@gmx.net writes: Well, I point again to standards_conforming_strings: Leave the warning off for one release (or more), then default to on for one (or more), then change the behavior. We can change the timeline, but I don't think the approach was unsound. I'm not excited about that approach, for the reasons that were stated upthread, mainly that there is little reason to think that anybody paid any attention to the s_c_s transition till they were forced to. Waiting to make the change will just allow more non-spec-compliant SQL code to accumulate in the wild, without significantly reducing the pain involved. There's one more reason, too: the code I have is designed to give correct warnings within the context of a parser that parses according to the spec-compliant rules. It's possible that a similar approach could be used to generate correct warnings within a parsetree built according to the old rules, but it would be entirely different in detail and would need a lot of additional work to develop. I don't particularly want to do that additional work. regards, tom lane
Re: [HACKERS] Parallel Seq Scan
On Tue, Mar 3, 2015 at 7:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: I have modified the patch to introduce a Funnel node (and left child as PartialSeqScan node). Apart from that, some other noticeable changes based on feedback include:
a) The master backend forms and sends the planned stmt to each worker; the earlier patch used to send individual elements and form the planned stmt in each worker.
b) Tuples are passed directly via the tuple queue instead of going via the FE-BE protocol.
c) Removed the restriction on expressions in the target list.
d) Introduced a parallelmodeneeded flag in the plannerglobal structure and set it for the Funnel plan.
There is still some work left, like integrating with the access-parallel-safety patch (use the parallelmodeok flag to decide whether a parallel path can be generated; Enter/Exit parallel mode is still done during execution of the funnel node). I think these are minor points which can be fixed once we decide on the other major parts of the patch. Find the modified patch attached with this mail. This is definitely progress. I do think you need to integrate it with the access-parallel-safety patch. Other comments: - There's not much code left in shmmqam.c. I think that the remaining logic should be integrated directly into nodeFunnel.c, with the two bools in worker_result_state becoming part of the FunnelState. It doesn't make sense to have a separate structure for two booleans and 20 lines of code. If you were going to keep this file around, I'd complain about its name and its location in the source tree, too, but as it is I think we can just get rid of it altogether. - Something is deeply wrong with the separation of concerns between nodeFunnel.c and nodePartialSeqscan.c. nodeFunnel.c should work correctly with *any arbitrary plan tree* as its left child, and that is clearly not the case right now. shm_getnext() can't just do heap_getnext(). Instead, it's got to call ExecProcNode() on its left child and let the left child decide what to do about that.
The logic in InitFunnelRelation() belongs in the parallel seq scan node, not the funnel. ExecReScanFunnel() cannot be calling heap_parallel_rescan(); it needs to *not know* that there is a parallel scan under it. The comment in FunnelRecheck is a copy-and-paste from elsewhere that is not applicable to a generic funnel mode. - The comment in execAmi.c says Backward scan is not suppotted for parallel sequiantel scan. Sequential is mis-spelled here, but I think you should just nuke the whole comment. The funnel node is not, in the long run, just for parallel sequential scan, so putting that comment above it is not right. If you want to keep the comment, it's got to be more general than that somehow, like parallel nodes do not support backward scans, but I'd just drop it. - Can we rename create_worker_scan_plannedstmt to create_parallel_worker_plannedstmt? - I *strongly* suggest that, for the first version of this, we remove all of the tts_fromheap stuff. Let's make no special provision for returning a tuple stored in a tuple queue; instead, just copy it and store it in the slot as a pfree-able tuple. That may be slightly less efficient, but I think it's totally worth it to avoid the complexity of tinkering with the slot mechanism. - InstrAggNode claims that we only need the master's information for statistics other than buffer usage and tuple counts, but is that really true? The parallel backends can be working on the parallel part of the plan while the master is doing something else, so the amount of time the *master* spent in a particular node may not be that relevant. We might need to think carefully about what it makes sense to display in the EXPLAIN output in parallel cases. - The header comment on nodeFunnel.h capitalizes the filename incorrectly.
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Mon, Dec 22, 2014 at 7:34 PM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 22, 2014 at 5:50 AM, Robert Haas robertmh...@gmail.com wrote: Looking over the latest patch, I think we could simplify the code so that you don't need multiple FuzzyAttrMatchState objects. Instead of creating a separate one for each RTE and then merging them, just have one. When you find an inexact-RTE name match, set a field inside the FuzzyAttrMatchState -- maybe with a name like rte_penalty -- to the Levenshtein distance between the RTEs. Then call scanRTEForColumn() and pass down the same state object. Now let updateFuzzyAttrMatchState() work out what it needs to do based on the observed inter-column distance and the currently-in-force RTE penalty. I'm afraid I don't follow. I think doing things that way makes things less clear. Merging is useful because it allows us to consider that an exact match might exist, which this searchRangeTableForCol() is already tasked with today. We now look for the best match exhaustively, or magically return immediately in the event of an exact match, without caring about the alias correctness or distance. Having a separate object makes this pattern apparent from the top level, within searchRangeTableForCol(). I feel that's better. updateFuzzyAttrMatchState() is the wrong place to put that, because that task rightfully belongs in searchRangeTableForCol(), where the high level diagnostic-report-generating control flow lives. To put it another way, creating a separate object obfuscates scanRTEForColumn(), since it's the only client of updateFuzzyAttrMatchState(). scanRTEForColumn() is a very important function, and right now I am only making it slightly less clear by tasking it with caring about distance of names on top of strict binary equality of attribute names. I don't want to push it any further. I don't buy this. 
What you're essentially doing is using the FuzzyAttrMatchState object in two ways that are not entirely compatible with each other - updateFuzzyAttrMatchState doesn't set the RTE fields, so searchRangeTableForCol has to do it. So there's an unspoken contract that in some parts of the code, you can rely on those fields being set, and in others, you can't. That pretty much defeats the whole point of making the state its own object, AFAICS. Furthermore, you end up with two copies of the state-combining logic, one in FuzzyAttrMatchState and a second in searchRangeTableForCol. That's ugly and unnecessary. I decided to rework this patch myself today; my version is attached. I believe that this version is significantly easier to understand than yours, both as to the code and the comments. I put quite a bit of work into both. I also suspect it's more efficient, because it avoids computing the Levenshtein distances for column names when we already know that those column names can't possibly be sufficiently-good matches for us to care about the details; and when it does compute the Levenshtein distance it keeps the max-distance threshold as low as possible. That may not really matter much, but it can't hurt. More importantly, essentially all of the fuzzy-matching logic is now isolated in FuzzyAttrMatchState(); the volume of change in scanRTEForColumn is the same as in your version, but the volume of change in searchRangeTableForCol is quite a bit less, so the patch is smaller overall. I'm prepared to commit this version if nobody finds a problem with it. It passes the additional regression tests you wrote. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company column-hint-rmh.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
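As an aside, the "keeps the max-distance threshold as low as possible" idea above can be sketched as a bounded Levenshtein computation (illustrative Python only, not the C from the patch; the function name is made up). Once every entry of the current DP row exceeds the threshold, no later row can come back under it, so we can bail out without finishing the matrix:

```python
def levenshtein_within(a, b, max_d):
    """Levenshtein distance between a and b, or None if it exceeds max_d.

    Bounding the distance lets us exit early: if every entry in the
    current DP row already exceeds max_d, no suffix can reduce it.
    """
    if abs(len(a) - len(b)) > max_d:
        return None               # lengths alone rule out a close match
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        if min(cur) > max_d:
            return None           # early exit: threshold already blown
        prev = cur
    return prev[-1] if prev[-1] <= max_d else None
```

This mirrors why skipping clearly-bad candidates is cheap: most non-matches are rejected after a few rows rather than a full distance computation.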
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 19:50 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Pavel Stehule pavel.steh...@gmail.com writes: 2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. The common part is the following code: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. I don't have a good feeling about it either. If we want to enhance this area, we probably need a specific flinfo field and flags to specify more precisely the context of the cached information. my_extra should stay reserved for generic usage, but there is still fairly large room for setting some less common fields like proc. With an extra field in flinfo we could have an interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo->flinfo->typecache->typlen, etc. I agree with Robert that this could be nice, but it needs more time for design :( Regards Pavel regards, tom lane
Re: [HACKERS] One question about security label command
On Tue, Mar 10, 2015 at 9:41 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And perhaps make it an ereport also, with errcode etc. Yeah, definitely. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] proposal: searching in array function - array_position
On Tue, Mar 10, 2015 at 3:43 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I don't have a good feeling about it either. If we want to enhance this area, we probably need a specific flinfo field and flags to specify more precisely the context of the cached information. my_extra should stay reserved for generic usage, but there is still fairly large room for setting some less common fields like proc. With an extra field in flinfo we could have an interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo->flinfo->typecache->typlen, etc. I agree with Robert that this could be nice, but it needs more time for design :( OK. If I'm in the minority, I'll desist. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Tue, Mar 10, 2015 at 11:28 AM, Robert Haas robertmh...@gmail.com wrote: I'm prepared to commit this version if nobody finds a problem with it. It passes the additional regression tests you wrote. Looks good to me. Thanks. -- Peter Geoghegan
Re: [HACKERS] proposal: searching in array function - array_position
2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? The documentation should discuss what happens if the array is multi-dimensional. The code for array_offset and for array_offset_start appear to be byte-for-byte identical. There's no comment explaining why, but I bet it's to make the opr_sanity test pass. How about adding a comment? The comment for array_offset_common refers to array_offset_start as array_offset_startpos. yes, it is a reason. I'll comment it. I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. 
The common part is the following code:

    my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
    if (my_extra == NULL)
    {
        fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
                                                      sizeof(ArrayMetaState));
        my_extra = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
        my_extra->element_type = ~element_type;
    }

and

    if (my_extra->element_type != element_type)
    {
        get_typlenbyvalalign(element_type,
                             &my_extra->typlen,
                             &my_extra->typbyval,
                             &my_extra->typalign);
        my_extra->element_type = element_type;
    }

so we can design a function like

    ArrayMetaState *
    GetCachedArrayMetaState(FunctionCallInfo fcinfo, Oid element_type, bool *reused)
    {
        ArrayMetaState *state = (ArrayMetaState *) fcinfo->flinfo->fn_extra;

        if (state == NULL)
        {
            fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
                                                          sizeof(ArrayMetaState));
            state = (ArrayMetaState *) fcinfo->flinfo->fn_extra;
            state->element_type = ~element_type;
        }
        if (state->element_type != element_type)
        {
            get_typlenbyvalalign(element_type,
                                 &state->typlen,
                                 &state->typbyval,
                                 &state->typalign);
            state->element_type = element_type;
            *reused = false;
        }
        else
            *reused = true;

        return state;
    }

Usage in code:

    array_state = GetCachedArrayMetaState(fcinfo, element_type, &reused);
    if (!reused)
    {
        /* some other initialization */
    }

The content is relatively clean, but the biggest problem (as usual) is naming. Comments? Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Precedence of standard comparison operators
Robert Haas robertmh...@gmail.com writes: On Tue, Mar 10, 2015 at 1:12 PM, Tom Lane t...@sss.pgh.pa.us wrote: Another possibility is to leave it on through beta testing with the intent to turn it off before 9.5 final; that would give us more data about whether there are real issues than we're likely to get otherwise. To my mind, the fact that we're doing this at all is largely predicated on the fact that there won't be many real issues. So I think the goal of the debugging messages ought to be to let those people who discover that they do have issues track them down more easily, not to warn people. Warning is sort of closing the barn door after the horse has got out: hey, by the way, I just broke your app. Agreed, but in the near term we need to *find out* whether there will be many real issues. Perhaps having the warnings on by default would help that, or perhaps not; I'm not sure. Another thing to consider is that if it becomes common to run with these warnings on, then everybody will have to pretty much write their code with full parenthesization anyway, at least if they plan to publish their code on PGXN or anywhere that it might get run on some system other than the one it was written for. That seems like an annoying gotcha for an issue that we're not expecting to be common. Hm, well, people who are publishing code will likely want it to work on both old and new PG releases, so I suspect they'd need to make it run warning-free anyway. regards, tom lane
Re: [HACKERS] proposal: disallow operator = and use it for named parameters
2015-03-10 19:02 GMT+01:00 Tom Lane t...@sss.pgh.pa.us: Kevin Grittner kgri...@ymail.com writes: Tom Lane t...@sss.pgh.pa.us wrote: Presumably we are going to change it at some point; maybe we should just do it rather than waiting another 5 years. +1 It has been deprecated long enough that I don't see the point of waiting. Uh, just to clarify, this has nothing to do with how long the operator has been deprecated. The issue is whether pg_dump should dump a function-call syntax that will not be recognized by any pre-9.5 release, when there is an alternative that will be recognized back to 9.0. BTW, I just noticed another place that probably should be changed: regression=# select foo(x = 1); ERROR: 42883: function foo(x := integer) does not exist LINE 1: select foo(x = 1); ^ HINT: No function matches the given name and argument types. You might need to add explicit type casts. LOCATION: ParseFuncOrColumn, parse_func.c:516 1. funcname_signature_string 2. get_rule_expr regards, tom lane
Re: [HACKERS] proposal: disallow operator = and use it for named parameters
On Tue, Mar 10, 2015 at 2:32 PM, Pavel Stehule pavel.steh...@gmail.com wrote: 1. funcname_signature_string 2. get_rule_expr Thanks. Patch attached. I'll commit this if there are no objections. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company named-expr-fixes.patch Description: binary/octet-stream
Re: [HACKERS] proposal: searching in array function - array_position
Pavel Stehule pavel.steh...@gmail.com writes: 2015-03-10 15:30 GMT+01:00 Robert Haas robertmh...@gmail.com: I am sure in agreement with the idea that it would be good to factor out the common typecache code (for setting up my_extra). Any chance we get a preliminary patch that does that refactoring, and then rebase the main patch on top of it? I agree that it's not really this patch's job to solve that problem, but it would be nice. The common part is the following code: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. regards, tom lane
Re: [HACKERS] Precedence of standard comparison operators
On Tue, Mar 10, 2015 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: I wrote: This thread seems to have died off without any clear resolution. I'd hoped somebody would try the patch on some nontrivial application to see if it broke anything or caused any warnings, but it doesn't seem like that is happening. I tried it on a fairly typical web application. Around 5000 or so distinct statements according to pg_stat_statements. With operator_precedence_warning = on, no warnings yet.
Re: [HACKERS] Precedence of standard comparison operators
Alex Hunsaker bada...@gmail.com writes: On Tue, Mar 10, 2015 at 10:11 AM, Tom Lane t...@sss.pgh.pa.us wrote: This thread seems to have died off without any clear resolution. I'd hoped somebody would try the patch on some nontrivial application to see if it broke anything or caused any warnings, but it doesn't seem like that is happening. I tried it on a fairly typical web application. Around 5000 or so distinct statements according to pg_stat_statements. With operator_precedence_warning = on, no warnings yet. Thanks! Much appreciated. regards, tom lane
Re: [HACKERS] pg_rewind in contrib
On Mon, Mar 9, 2015 at 7:32 PM, Heikki Linnakangas hlinn...@iki.fi wrote: Attached is a new patch version, fixing all the little things you listed. I believe this is pretty much ready for commit. I'm going to read it through myself one more time before committing, but I don't have anything mind now that needs fixing anymore. I just pushed the change to split dbcommands.h into dbcommands.h and dbcommands_xlog.h, as that seems like a nice-to-have anyway. Few assorted comments: 1. +step + para + Copy all those changed blocks from the new cluster to the old cluster. + /para +/step Isn't it possible incase of async replication that old cluster has some blocks which new cluster doesn't have, what will it do in such a case? I have tried to test some form of such a case and it seems to be failing with below error: pg_rewind.exe -D ..\..\Data\ --source-pgdata=..\..\Database1 The servers diverged at WAL position 0/16DE858 on timeline 1. Rewinding from last common checkpoint at 0/16B8A70 on timeline 1 could not open file ..\..\Data\/base/12706/16391 for truncation: No such file or directory Failure, exiting Exact scenario is: Node -1 (master): Step-1 Create table t1(c1 int, c2 char(500)) with (fillfactor=10); insert into t1 values(generate_series(1,110),''); Stop Node-1 (pg_ctl stop ..) Step-2 Copy manually the data-directory (it contains WAL log as well) to new location say Database1 Node-2 (standby) Step-3 Change settings to make it stand-by (recovery.conf and change postgresql.conf) Start Node and verify all data exists. Step-4 use pg_ctl promote to make Node-2 as master Step-5 Start Node-1 insert few more records insert into t1 values(generate_series(110,115),''); Step-6 Node-2 Now insert one more records in table t1 insert into t1 values(116,''); Stop both the nodes. Now if I run pg_rewind on old-master(Node-1), it will lead to above error. I think above scenario can be possible in async replication. 
If I insert more records (greater than what I inserted in Step-5) in Step-6, then pg_rewind works fine. 2. diff --git a/src/bin/pg_rewind/RewindTest.pm b/src/bin/pg_rewind/RewindTest.pm +# To run a test, the test script (in t/ subdirectory) calls the functions What do you mean by t/ subdirectory? 3. + applicationpg_rewind/ was run, and thereforce could not be copied by typo /thereforce 4. +static void +sanityChecks(void) +{ + /* Check that there's no backup_label in either cluster */ I could not see such a check in code. Am I missing anything? 5. + /* + * TODO: move old file out of the way, if any. And perhaps create the + * file with temporary name first and rename in place after it's done. + */ + snprintf(BackupLabelFilePath, MAXPGPATH, + %s/backup_label /* BACKUP_LABEL_FILE */, datadir_target); + There are couple of other TODO's in the patch, are these for future? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] initdb -S and tablespaces
At 2015-01-15 14:32:45 +0100, and...@2ndquadrant.com wrote: What I am thinking of is that, currently, if you start the server for initial loading with fsync=off, and then restart it, you're open to data loss. So when the current config file setting is changed from off to on, we should fsync the data directory. Even if there was no crash restart. Patch attached. Changes: 1. Renamed perform_fsync to fsync_recursively (otherwise it would read fsync_pgdata(pg_data)) 2. Added ControlData-fsync_disabled to record whether fsync was ever disabled while the server was running (tested in CreateCheckPoint) 3. Run fsync_recursively at startup only if fsync is enabled 4. Run it if we're doing crash recovery, or fsync was disabled 5. Use pg_flush_data in pre_sync_fname 6. Issue fsync on directories too 7. Tested that it works if pg_xlog is a symlink (no changes). (In short, everything you mentioned in your earlier mail.) Note that I set ControlData-fsync_disabled to false in BootstrapXLOG, but it gets set to true during a later CreateCheckPoint(). This means we run fsync again at startup after initdb. I'm not sure what to do about that. Is this about what you had in mind? -- Abhijit From bb2b5f130525dd44a10eab6829b9802b8f6f7eed Mon Sep 17 00:00:00 2001 From: Abhijit Menon-Sen a...@2ndquadrant.com Date: Thu, 6 Nov 2014 00:45:56 +0530 Subject: Recursively fsync PGDATA at startup if needed It's needed if we need to perform crash recovery or if fsync was disabled at some point while the server was running earlier (and we must store that information in the control file). This is so that we don't lose older unflushed writes in the event of a power failure after crash recovery, where more recent writes are preserved. See 20140918083148.ga17...@alap3.anarazel.de for details. 
---
 src/backend/access/transam/xlog.c       | 171
 src/bin/pg_controldata/pg_controldata.c |   2 +
 src/include/catalog/pg_control.h        |   8 +-
 3 files changed, 180 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 71cbe0e..75a6862 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -833,6 +833,12 @@ static void WALInsertLockAcquireExclusive(void);
 static void WALInsertLockRelease(void);
 static void WALInsertLockUpdateInsertingAt(XLogRecPtr insertingAt);
 
+static void pre_sync_fname(char *fname, bool isdir);
+static void walkdir(char *path, void (*action) (char *fname, bool isdir));
+static void walktblspc_links(char *path,
+				 void (*action) (char *fname, bool isdir));
+static void fsync_recursively(char *pg_data);
+
 /*
  * Insert an XLOG record represented by an already-constructed chain of data
  * chunks. This is a low-level routine; to construct the WAL record header
@@ -4721,6 +4727,7 @@ BootStrapXLOG(void)
 	ControlFile->checkPoint = checkPoint.redo;
 	ControlFile->checkPointCopy = checkPoint;
 	ControlFile->unloggedLSN = 1;
+	ControlFile->fsync_disabled = false;
 
 	/* Set important parameter values for use when replaying WAL */
 	ControlFile->MaxConnections = MaxConnections;
@@ -5787,6 +5794,27 @@ StartupXLOG(void)
 		ereport(FATAL,
 				(errmsg("control file contains invalid data")));
 
+	/*
+	 * If we need to perform crash recovery, we issue an fsync on the
+	 * data directory and its contents to try to ensure that any data
+	 * written before the crash are flushed to disk. Otherwise a power
+	 * failure in the near future might cause earlier unflushed writes
+	 * to be lost, even though more recent data written to disk from
+	 * here on would be persisted.
+	 *
+	 * We also do this if the control file indicates that fsync was
+	 * disabled at some point while the server was running earlier.
+	 */
+	if (enableFsync &&
+		(ControlFile->fsync_disabled ||
+		 (ControlFile->state != DB_SHUTDOWNED &&
+		  ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)))
+	{
+		fsync_recursively(data_directory);
+		ControlFile->fsync_disabled = false;
+	}
+
 	if (ControlFile->state == DB_SHUTDOWNED)
 	{
 		/* This is the expected case, so don't be chatty in standalone mode */
@@ -8137,6 +8165,8 @@ CreateCheckPoint(int flags)
 	/* crash recovery should always recover to the end of WAL */
 	ControlFile->minRecoveryPoint = InvalidXLogRecPtr;
 	ControlFile->minRecoveryPointTLI = 0;
+	if (!enableFsync)
+		ControlFile->fsync_disabled = true;
 
 	/*
 	 * Persist unloggedLSN value. It's reset on crash recovery, so this goes
@@ -11008,3 +11038,144 @@ SetWalWriterSleeping(bool sleeping)
 	XLogCtl->WalWriterSleeping = sleeping;
 	SpinLockRelease(&XLogCtl->info_lck);
 }
+
+/*
+ * Hint to the OS that it should get ready to fsync() this file.
+ *
+ * Adapted from pre_sync_fname in initdb.c
+ */
+static void
+pre_sync_fname(char *fname, bool isdir)
+{
+	int			fd;
+
+	fd = open(fname, O_RDONLY | PG_BINARY);
+
+	/*
+	 * Some OSs don't allow us to open directories at all (Windows returns
+	 * EACCES)
+	 */
+	if (fd < 0 && isdir
Re: [HACKERS] pg_trgm Memory Allocation logic
Hello, If you manually set RPADDING 2 in trgm.h, then it will, but the allocation probably should use LPADDING/RPADDING to get it right, rather than assume the max values. Yes, you are right. For RPADDING = 2 the current formula is suitable, but for RPADDING = 1 a lot of extra space is allocated. IIUC, for each word the total number of trigrams is (word_length + LPADDING + RPADDING - 2). Also, the maximum number of words a string can have is (slen + 1)/2 (considering each word has length 1). I think (slen + (slen + 1)/2 * (LPADDING + RPADDING - 3) + 1) allocates only the necessary memory for the different values of RPADDING. Regards, Beena Emerson
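For what it's worth, the arithmetic above can be sanity-checked with a small sketch (illustrative Python; the helper names are made up, and LPADDING/RPADDING stand for the trgm.h padding constants):

```python
def trigrams_for_words(word_lens, lpad, rpad):
    # A word of length n, padded by LPADDING blanks on the left and
    # RPADDING on the right, yields n + LPADDING + RPADDING - 2 trigrams.
    return sum(n + lpad + rpad - 2 for n in word_lens)

def proposed_bound(slen, lpad, rpad):
    # Proposed allocation for a string of length slen: covers the worst
    # case of (slen + 1)/2 one-character words separated by single spaces.
    return slen + (slen + 1) // 2 * (lpad + rpad - 3) + 1
```

For LPADDING = 2, RPADDING = 1 the proposed bound reduces to slen + 1, which is why it avoids the extra space the max-padding assumption allocates.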
Re: [HACKERS] Parallel Seq Scan
On Tue, Mar 10, 2015 at 10:23 AM, Haribabu Kommi kommi.harib...@gmail.com wrote: On Tue, Mar 10, 2015 at 3:09 PM, Amit Kapila amit.kapil...@gmail.com wrote: I have currently modelled it based on the existing rescan for seqscan (ExecReScanSeqScan()), which means it will begin the scan again. Basically, if the workers were already started/initialized by the previous scan, then re-initialize them (refer to function ExecReScanFunnel() in the patch). Can you elaborate more if you think the current handling is not sufficient for some case? From the ExecReScanFunnel function it seems that the re-scan waits till all the workers have finished before starting the next scan. Will the workers stop their current ongoing task? Otherwise this may decrease performance instead of improving it, I feel. Okay, performance-wise it might affect such a case, but I think we can handle it by not calling WaitForParallelWorkersToFinish(), as DestroyParallelContext() will automatically terminate all the workers. I am not sure whether the case is already handled where a worker is waiting to pass the results while the backend is trying to start the re-scan? I think stopping/terminating the workers should handle such a case. Thanks for pointing out this case; I will change it in the next update. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 10:53 AM, Jim Nasby wrote: On 3/10/15 9:30 AM, Robert Haas wrote: On Sat, Mar 7, 2015 at 1:06 PM, Petr Jelinek p...@2ndquadrant.com wrote: You still duplicate the type cache code, but many other array functions do that too so I will not hold that against you. (Maybe somebody should write separate patch that would put all that duplicate code into common function?) I think this patch is ready for committer so I am going to mark it as such. The documentation in this patch needs some improvements to the English. Can anyone help with that? I'll take a look at it. The documentation should discuss what happens if the array is multi-dimensional. This is the wording I have to describe what happens with a multi-dimensional array. I'm not thrilled with it; anyone have better ideas? Note: multi-dimensional arrays are squashed to one dimension before searching. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
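To illustrate the "squashed to one dimension" wording, here is a sketch of the documented behaviour (hypothetical Python, not the C implementation; note that SQL's IS NOT DISTINCT FROM comparison is only approximated by `==` here):

```python
def array_offset(arr, target):
    """Return the 1-based position of the first occurrence of target,
    treating a (possibly nested) list as if flattened to one dimension,
    or None when the value is not found."""
    def flatten(a):
        for x in a:
            if isinstance(x, list):
                yield from flatten(x)   # squash nested dimensions
            else:
                yield x
    for i, x in enumerate(flatten(arr), start=1):
        if x == target:
            return i
    return None
```

So searching a 2x2 array for an element in the second row reports its position in the flattened sequence, which is what the proposed doc note tries to convey.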
Re: [HACKERS] pg_rewind in contrib
On 03/10/2015 07:46 AM, Amit Kapila wrote: On Mon, Mar 9, 2015 at 7:32 PM, Heikki Linnakangas hlinn...@iki.fi wrote: Attached is a new patch version, fixing all the little things you listed. I believe this is pretty much ready for commit. I'm going to read it through myself one more time before committing, but I don't have anything mind now that needs fixing anymore. I just pushed the change to split dbcommands.h into dbcommands.h and dbcommands_xlog.h, as that seems like a nice-to-have anyway. Few assorted comments: 1. +step + para + Copy all those changed blocks from the new cluster to the old cluster. + /para +/step Isn't it possible incase of async replication that old cluster has some blocks which new cluster doesn't have, what will it do in such a case? Sure, that's certainly possible. If the source cluster doesn't have some blocks that exist in the target, IOW a file in the source cluster is shorter than the same file in the target, that means that the relation was truncated in the source. pg_rewind will truncate the file in the target to match the source's size, although that's not strictly necessary as there will also be a WAL record in the source about the truncation. That will be replayed on the first startup after pg_rewind and would do the truncation anyway. I have tried to test some form of such a case and it seems to be failing with below error: pg_rewind.exe -D ..\..\Data\ --source-pgdata=..\..\Database1 The servers diverged at WAL position 0/16DE858 on timeline 1. Rewinding from last common checkpoint at 0/16B8A70 on timeline 1 could not open file ..\..\Data\/base/12706/16391 for truncation: No such file or directory Failure, exiting Hmm, could that be just because of the funny business with the Windows path separators? Does it work if you use -D ..\..\Data instead, without the last backslash? +# To run a test, the test script (in t/ subdirectory) calls the functions What do you mean by t/ subdirectory? 
There is a directory, src/bin/pg_rewind/t, which contains the regression tests. - Heikki
Re: [HACKERS] proposal: searching in array function - array_position
On 3/10/15 2:43 PM, Pavel Stehule wrote: There is not all that much commonality; many functions don't bother to populate all of the ArrayMetaState fields because they don't need all of them. (The ones you quote don't, in fact.) You are either going to end up with a subroutine that does extra syscache lookups to populate the whole struct every time, or a confusing collection of ad-hoc subroutines. I'm not convinced that there's a lot of mileage to be gained here. I don't have a good feeling about it either. If we want to enhance this area, we probably need a specific flinfo field and flags to specify more precisely the context of the cached information. my_extra should stay reserved for generic usage, but there is still fairly large room for setting some less common fields like proc. With an extra field in flinfo we could have an interface like bool set_flinfo_type_cache(fcinfo, type, flags); and usage fcinfo->flinfo->typecache->typlen, etc. I'm not following what you intended there, but in any case I don't think we'd need to change all that much, because there are only 3 cases: 1) get only the base type info; 2) get the base type info and equality operator; 3) get IO info for one IOFunc. Passing the function an enum (and perhaps keeping it in ArrayMetaState) would be enough to distinguish between the 3 cases. You'd also need to pass in an IOFuncSelector. That said, this pattern with fn_extra is repeated a lot, even just in the backend (not counting contrib or extensions). It would be nice if there was generic support for this.
decibel@decina:[17:15]~/pgsql/HEAD/src/backend (array_offset_v4 $)$pg_grep fn_extra|cut -d: -f1|uniq -c
  14 access/gist/gistscan.c
   7 executor/functions.c
   1 executor/nodeWindowAgg.c
  14 utils/adt/array_userfuncs.c
  31 utils/adt/arrayfuncs.c
   4 utils/adt/domains.c
   2 utils/adt/enum.c
   1 utils/adt/int.c
   6 utils/adt/jsonfuncs.c
   1 utils/adt/oid.c
   4 utils/adt/orderedsetaggs.c
   7 utils/adt/rangetypes.c
  24 utils/adt/rowtypes.c
   8 utils/adt/varlena.c
(utils/fmgr/* doesn't count) -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] proposal: searching in array function - array_position
Jim Nasby jim.na...@bluetreble.com writes: That said, this pattern with fn_extra is repeated a lot, even just in the backend (not counting contrib or extensions). It would be nice if there was generic support for this. What do you mean by generic support? Most of those functions are doing quite different things with fn_extra --- granted, nearly all of them are caching something there, but I don't see very much that a generic infrastructure could bring to the table. regards, tom lane
Re: [HACKERS] proposal: searching in array function - array_position
On 2/22/15 5:19 AM, Pavel Stehule wrote: 2015-02-22 3:00 GMT+01:00 Petr Jelinek p...@2ndquadrant.com: On 28/01/15 08:15, Pavel Stehule wrote: 2015-01-28 0:01 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/27/15 4:36 AM, Pavel Stehule wrote: It is only partially identical - I would like to use a cache for array_offset, but it is not necessary for array_offsets .. it depends on how we would modify the current API to support externally cached data. Externally cached data? Some of these functions have their own caches to minimize access to the typcache (array_map and array_cmp, for example). In the first case I am trying to push this information through fn_extra; in the second case I don't, because I don't expect a repeated call (and I expect the type cache will be enough). You actually do caching via fn_extra in both cases and I think that's the correct way, and yes, that part can be moved to a common function. I also see that the documentation does not say what is returned by array_offset if nothing is found (it's documented in code but not in sgml). rebased + fixed docs I don't think we need both array_offset and array_offset_start; can't both SQL functions just call one C function? It might be worth combining the array and non-array versions of this, by having a _common function that accepts a boolean and then just run one or the other of the while loops. Most of the code seems to be shared between the two versions. What is this comment supposed to mean? There is no 'width_array'... * Biggest difference against width_array is unsorted input array. I've attached my doc changes, both alone and with the code. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble!
http://BlueTreble.com diff --git a/doc/src/sgml/array.sgml b/doc/src/sgml/array.sgml index 9ea1068..d90266f 100644 --- a/doc/src/sgml/array.sgml +++ b/doc/src/sgml/array.sgml @@ -600,6 +600,15 @@ SELECT * FROM sal_emp WHERE pay_by_quarter ARRAY[1]; index, as described in xref linkend=indexes-types. /para + para + You can also search for a value in an array using the functionarray_offset/ + function. It returns the position of the first occurrence of a value in an array: + +programlisting +SELECT array_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon'); +/programlisting + /para + tip para Arrays are not sets; searching for specific array elements diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index c198bea..311f2fe 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -11480,6 +11480,9 @@ SELECT NULLIF(value, '(none)') ... primaryarray_lower/primary /indexterm indexterm +primaryarray_offset/primary + /indexterm + indexterm primaryarray_prepend/primary /indexterm indexterm @@ -11599,6 +11602,37 @@ SELECT NULLIF(value, '(none)') ... row entry literal + functionarray_offset/function(typeanyarray/type, typeanyelement/type optional, typeint/type/optional) + /literal +/entry +entrytypeint/type/entry +entryreturns the offset of the first occurrence of a value in an +array. It uses the literalIS NOT DISTINCT FROM/ operator for +comparation. The optional third argument specifies an initial offset to +begin the search at. Returns NULL when the value is not found. Note: +multi-dimensional arrays are squashed to one dimension before +searching./entry + entryliteralarray_offset(ARRAY['sun','mon','tue','wen','thu','fri','sat'], 'mon')/literal/entry +entryliteral2/literal/entry + /row + row +entry + literal + functionarray_offsets/function(typeanyarray/type, typeanyelement/type) + /literal +/entry +entrytypeint[]/type/entry +entryreturns an array of offsets of all occurrences of a value in a array. 
It uses +the literalIS NOT DISTINCT FROM/ operator for comparison. Returns an empty array +when there are no occurrences of the value in the array. Note: +multi-dimensional input arrays are squashed to one dimension before +searching./entry +entryliteralarray_offsets(ARRAY['A','A','B','A'], 'A')/literal/entry +entryliteral{1,2,4}/literal/entry + /row + row +entry + literal functionarray_prepend/function(typeanyelement/type, typeanyarray/type) /literal /entry diff --git a/src/backend/utils/adt/array_userfuncs.c b/src/backend/utils/adt/array_userfuncs.c index 6679333..f7b7932 100644 ---
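The documented behavior of the proposed functions can be modeled outside the backend. This is an illustrative Python sketch of the semantics described in the patch's documentation (1-based positions, NULL-safe comparison via IS NOT DISTINCT FROM, multi-dimensional arrays squashed to one dimension), not the patch's C implementation; the `begin` parameter name is invented here:

```python
def _flatten(a):
    # Multi-dimensional arrays are squashed to one dimension before searching.
    for x in a:
        if isinstance(x, list):
            yield from _flatten(x)
        else:
            yield x


def not_distinct(a, b):
    # IS NOT DISTINCT FROM: like '=', except that NULL matches NULL.
    return a == b or (a is None and b is None)


def array_offset(arr, value, begin=1):
    # 1-based position of the first occurrence, or None (SQL NULL) if absent.
    # The optional third argument is the initial offset to begin the search at.
    for i, x in enumerate(_flatten(arr), start=1):
        if i >= begin and not_distinct(x, value):
            return i
    return None


def array_offsets(arr, value):
    # Positions of all occurrences; an empty array when there are none.
    return [i for i, x in enumerate(_flatten(arr), start=1)
            if not_distinct(x, value)]


print(array_offset(['sun', 'mon', 'tue', 'wen', 'thu', 'fri', 'sat'], 'mon'))  # 2
print(array_offsets(['A', 'A', 'B', 'A'], 'A'))  # [1, 2, 4]
```

Both outputs match the examples in the proposed func.sgml entries above.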
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Hello, I have some minor comments The comments have been implemented in the attached patch. I think that extra parenthesis should be used for the first expression with BKPIMAGE_HAS_HOLE. Parenthesis have been added to improve code readability. Thank you, Rahila Syed On Mon, Mar 9, 2015 at 5:38 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Mon, Mar 9, 2015 at 4:29 PM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, Mar 5, 2015 at 10:08 PM, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Mar 5, 2015 at 9:14 PM, Syed, Rahila rahila.s...@nttdata.com wrote: Please find attached a patch. As discussed, flag to denote compression and presence of hole in block image has been added in XLogRecordImageHeader rather than block header. Thanks for updating the patch! Attached is the refactored version of the patch. Cool. Thanks! I have some minor comments: +The default value is literaloff/ Dot at the end of this sentence. +Turning this parameter on can reduce the WAL volume without Turning valueon/ this parameter +but at the cost of some extra CPU time by the compression during +WAL logging and the decompression during WAL replay. Isn't a verb missing here, for something like that: but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay. + * This can reduce the WAL volume, but at some extra cost of CPU time + * by the compression during WAL logging. Er, similarly some extra cost of CPU spent on the compression + if (blk-bimg_info BKPIMAGE_HAS_HOLE + (blk-hole_offset == 0 || +blk-hole_length == 0 || I think that extra parenthesis should be used for the first expression with BKPIMAGE_HAS_HOLE. + if (blk-bimg_info BKPIMAGE_IS_COMPRESSED + blk-bimg_len == BLCKSZ) + { Same here. + /* +* cross-check that hole_offset == 0 and hole_length == 0 +* if the HAS_HOLE flag is set. +*/ I think that you mean here that this happens when the flag is *not* set. 
+ /* +* If BKPIMAGE_HAS_HOLE and BKPIMAGE_IS_COMPRESSED, +* an XLogRecordBlockCompressHeader follows +*/ Maybe a struct should be added for an XLogRecordBlockCompressHeader struct. And a dot at the end of the sentence should be added? Regards, -- Michael Support-compression-full-page-writes-in-WAL_v25.patch Description: Binary data
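To make the flag interplay discussed above concrete, here is a hedged sketch of how a full-page image with a hole might be packed and restored. The backend uses its own pglz compressor and C structs; this model substitutes Python's zlib and plain values, keeping only the logic under review: strip the hole, compress only if it helps, and cross-check that the hole fields are zero when the HAS_HOLE flag is *not* set:

```python
import zlib

BLCKSZ = 8192
BKPIMAGE_HAS_HOLE = 0x01
BKPIMAGE_IS_COMPRESSED = 0x02


def build_block_image(page, hole_offset, hole_length, compress=True):
    # Strip the free-space "hole" in the middle of the page, then
    # optionally compress what remains (the backend uses pglz; zlib here).
    assert len(page) == BLCKSZ
    data = page[:hole_offset] + page[hole_offset + hole_length:]
    info = BKPIMAGE_HAS_HOLE if hole_length > 0 else 0
    if compress:
        compressed = zlib.compress(data)
        if len(compressed) < len(data):  # keep only if it actually shrinks
            info |= BKPIMAGE_IS_COMPRESSED
            data = compressed
    return info, hole_offset, hole_length, data


def restore_block_image(info, hole_offset, hole_length, data):
    # Cross-check: hole fields must be zero when HAS_HOLE is *not* set.
    if not (info & BKPIMAGE_HAS_HOLE):
        assert hole_offset == 0 and hole_length == 0
    if info & BKPIMAGE_IS_COMPRESSED:
        data = zlib.decompress(data)
    # Reinsert the hole as zeroes to rebuild the full page.
    return data[:hole_offset] + b"\x00" * hole_length + data[hole_offset:]


page = b"H" * 100 + b"\x00" * 8000 + b"T" * 92
info, off, ln, img = build_block_image(page, 100, 8000)
assert restore_block_image(info, off, ln, img) == page
```

The "compress only if it shrinks" branch mirrors why the review checks `bimg_len == BLCKSZ` together with the compression flag: a compressed image should never be stored at full block size.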
Re: [HACKERS] One question about security label command
ERRCODE_FEATURE_NOT_SUPPORTED is a suitable error code here. Please see the attached one. Thanks, 2015-03-11 4:34 GMT+09:00 Robert Haas robertmh...@gmail.com: On Tue, Mar 10, 2015 at 9:41 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: And perhaps make it an ereport also, with errcode etc. Yeah, definitely. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- KaiGai Kohei kai...@kaigai.gr.jp security-label-errmsg.v2.patch Description: Binary data
Re: [HACKERS] Relation ordering in FROM clause causing error related to missing entry... Or not.
On Tue, Mar 10, 2015 at 10:30 PM, Stephen Frost sfr...@snowman.net wrote: Yeah, it's simply because we can turn one into an implicit LATERAL, but we can't do that for the other. Ah, yes, thanks. I forgot that it was changed to an implicit LATERAL. Just wondering where my mind was last night... -- Michael
Re: [HACKERS] Rethinking the parameter access hooks for plpgsql's benefit
Robert Haas robertmh...@gmail.com writes: On the technical side, I do agree that the requirement to allocate and zero an array for every new simple expression is unfortunate, but I'm not convinced that repeatedly invoking the hook-function is a good way to solve that problem. Calling the hook-function figures to be significantly more expensive than an array-fetch, so you can almost think of the ParamListInfo entries themselves as a cache of data retrieved from the hook function. In that view of the world, the problem here is that initializing the cache is too expensive. Your solution to that is to rip the cache out completely, but what would be better is to keep the cache and make initializing it cheaper - e.g. by sharing one ParamListInfo across all expressions in the same scope; or by not zeroing the memory and instead tracking that the first N slots are the only ones we've initialized; or $your_idea_here. Getting back to the actual merits of this patch --- you're right, it was not a good idea as proposed. I'd been thinking about the scalar expression case only, and forgot that the same code is used to pass parameters to queries, which might evaluate those parameters quite a large number of times. So any savings from reducing the setup time is sure to be swamped if the hook-function gets called repeatedly. Another problem is that for ROW and REC datums, exec_eval_datum() pallocs memory that won't get freed till the end of the statement, so repeat evaluation would cause memory leaks. So we really need evaluate-at-most-once semantics in there. However, this attempt did confirm that we don't need to create a separate ParamListInfo struct for each evaluation attempt. So the attached, greatly stripped-down patch just allocates a ParamListInfo once at function startup, and then zeroes it each time. This does nothing for the memset overhead but it does get rid of the palloc traffic, which seems to be good for a few percent on simple-assignment-type statements.
AFAICS this is an unconditional win compared to HEAD. What's more, it provides at least a basis for doing better later: if we could think of a reasonably cheap way of tracking which datums get invalidated by an assignment, we might be able to reset only selected entries in the array rather than blindly blowing away all of them. I propose to apply this and leave it at that for now. regards, tom lane diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c index f24f55a..0ad32f7 100644 *** a/src/pl/plpgsql/src/pl_exec.c --- b/src/pl/plpgsql/src/pl_exec.c *** exec_stmt_forc(PLpgSQL_execstate *estate *** 2197,2206 elog(ERROR, could not open cursor: %s, SPI_result_code_string(SPI_result)); - /* don't need paramlist any more */ - if (paramLI) - pfree(paramLI); - /* * If cursor variable was NULL, store the generated portal name in it */ --- 2197,2202 *** plpgsql_estate_setup(PLpgSQL_execstate * *** 3169,3174 --- 3165,3180 estate-datums = palloc(sizeof(PLpgSQL_datum *) * estate-ndatums); /* caller is expected to fill the datums array */ + /* initialize ParamListInfo with one entry per datum, all invalid */ + estate-paramLI = (ParamListInfo) + palloc0(offsetof(ParamListInfoData, params) + + estate-ndatums * sizeof(ParamExternData)); + estate-paramLI-paramFetch = plpgsql_param_fetch; + estate-paramLI-paramFetchArg = (void *) estate; + estate-paramLI-parserSetup = (ParserSetupHook) plpgsql_parser_setup; + estate-paramLI-parserSetupArg = NULL; /* filled during use */ + estate-paramLI-numParams = estate-ndatums; + /* set up for use of appropriate simple-expression EState */ if (simple_eval_estate) estate-simple_eval_estate = simple_eval_estate; *** plpgsql_estate_setup(PLpgSQL_execstate * *** 3179,3185 estate-eval_processed = 0; estate-eval_lastoid = InvalidOid; estate-eval_econtext = NULL; - estate-cur_expr = NULL; estate-err_stmt = NULL; estate-err_text = NULL; --- 3185,3190 *** exec_stmt_execsql(PLpgSQL_execstate *est *** 3495,3503 (rc == SPI_OK_SELECT) ? 
errhint(If you want to discard the results of a SELECT, use PERFORM instead.) : 0)); } - if (paramLI) - pfree(paramLI); - return PLPGSQL_RC_OK; } --- 3500,3505 *** exec_stmt_open(PLpgSQL_execstate *estate *** 3864,3871 if (curname) pfree(curname); - if (paramLI) - pfree(paramLI); return PLPGSQL_RC_OK; } --- 3866,3871 *** exec_assign_c_string(PLpgSQL_execstate * *** 4050,4056 /* -- ! * exec_assign_value Put a value into a target field * * Note: in some code paths, this will leak memory in the eval_econtext; * we assume that will be cleaned up later by exec_eval_cleanup. We cannot --- 4050,4056 /* -- ! * exec_assign_value Put a value into a target
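The scheme Tom settles on — treat the ParamListInfo entries as a cache that is allocated once at function startup, cheaply reset per statement, and filled lazily by the fetch hook at most once per datum — can be sketched as follows. The class and names here are illustrative stand-ins, not plpgsql's actual structures (the backend marks unset entries differently rather than using a sentinel value):

```python
class ParamCache:
    """Sketch of the patch's scheme: one parameter array allocated at
    function entry, reset (not reallocated) before each statement, and
    filled lazily by a fetch hook -- at most once per datum."""

    def __init__(self, ndatums, fetch_hook):
        self.slots = [None] * ndatums  # allocated once, like paramLI
        self.fetch_hook = fetch_hook
        self.fetches = 0               # instrumentation for the sketch

    def reset(self):
        # Cheap per-statement reset; stands in for zeroing the array.
        for i in range(len(self.slots)):
            self.slots[i] = None

    def get(self, i):
        # Cache miss: invoke the (expensive) hook exactly once per datum.
        if self.slots[i] is None:
            self.fetches += 1
            self.slots[i] = self.fetch_hook(i)
        return self.slots[i]


datums = [10, 20, 30]
cache = ParamCache(3, lambda i: datums[i])
cache.reset()
# A query may evaluate the same parameter many times...
total = sum(cache.get(1) for _ in range(1000))
print(total, cache.fetches)  # 20000 1 -- the hook ran once, not 1000 times
```

This also shows why the evaluate-at-most-once property matters: the hook runs once per datum per statement no matter how many times the query touches the parameter, which is exactly what repeated hook invocation would have broken.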
Re: [HACKERS] Proposal : REINDEX xxx VERBOSE
On 3/9/15 9:43 PM, Sawada Masahiko wrote: On Fri, Mar 6, 2015 at 11:07 AM, Jim Nasby jim.na...@bluetreble.com wrote: On 3/2/15 10:58 AM, Sawada Masahiko wrote: On Wed, Feb 25, 2015 at 4:58 PM, Jim Nasby jim.na...@bluetreble.com wrote: On 2/24/15 8:28 AM, Sawada Masahiko wrote: According to the above discussion, VACUUM and REINDEX should have trailing options. Tom seems (to me) to be suggesting SQL-style (bare word preceded by WITH) options and Jim suggesting '()' style options? (Anyway VACUUM gets the *third additional* option style, but that is a different discussion from this) Well, almost everything does a trailing WITH. We need to either stick with that for consistency, or add leading () as an option to those WITH commands. Does anyone know why those are WITH? Is it ANSI? As a refresher, current commands are: VACUUM (ANALYZE, VERBOSE) table1 (col1); REINDEX INDEX index1 FORCE; COPY table1 FROM 'file.txt' WITH (FORMAT csv); CREATE MATERIALIZED VIEW mv1 WITH (storageparam, ...) AS qry WITH DATA; CREATE EXTENSION ext1 WITH SCHEMA s1 VERSION v1 FROM over; CREATE ROLE role WITH LOGIN; GRANT WITH GRANT OPTION; CREATE VIEW v1 AS qry WITH CASCADED CHECK OPTION; ALTER DATABASE db1 WITH CONNECTION LIMIT 50; DECLARE c1 INSENSITIVE SCROLL CURSOR WITH HOLD; BTW, I'm fine with Tom's bare-word with WITH idea. That seems to be the most consistent with everything else. Is there a problem with doing that? I know getting the syntax right is one of the hard parts of new features, but it seems like we reached consensus here... Attached is the latest version of the patch based on Tom's idea, as follows. REINDEX { INDEX | ... } name WITH ( options [, ...] ) Are the parentheses necessary? No other WITH option requires them, other than create table/matview (COPY doesn't actually require them). We have discussed this option including the FORCE option, but I think there are no users who want to use both the FORCE and VERBOSE options at the same time. I find that very hard to believe...
I would expect a primary use case for VERBOSE to be I ran REINDEX, but it doesn't seem to have done anything... what's going on? and that's exactly when you might want to use FORCE. In the current code, nothing happens even if the FORCE option is specified. This option exists purely for backward compatibility. But this patch adds new syntax including the FORCE option for now. I forgot that. There's no reason to support it with the new stuff then. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] How about to have relnamespace and relrole?
Thank you for the correction. At Wed, 4 Mar 2015 01:01:48 -0600, Jim Nasby jim.na...@bluetreble.com wrote in 54f6addc.8030...@bluetreble.com On 3/3/15 8:04 PM, Kyotaro HORIGUCHI wrote: Note: The OID alias types don't sctrictly comply the transaction isolation rules so do not use them where exact transaction isolation on the values of these types has a significance. Likewise, since they look as simple constants to planner so you might get slower plans than the queries joining the system tables correnspond to the OID types. Might I suggest: Note: The OID alias types do not completely follow transaction isolation rules. The planner also treats them as simple constants, which may result in sub-optimal planning. Looks far simpler and is enough. The note has been replaced with your sentence in the attached patch. regards, -- Kyotaro Horiguchi NTT Open Source Software Center From e31c326fa8e8ee294f003258233bf4be1410fdd4 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Wed, 18 Feb 2015 14:38:32 +0900 Subject: [PATCH 1/2] Add regrole Add a new OID alias type, regrole. The new type is scoped to the whole database cluster, so it does not behave the same as the existing OID alias types, which have database scope, concerning object dependency. To get rid of confusion, inhibit constants of the new type from appearing where dependencies are made involving it. Documentation about this new type is added and some existing descriptions are modified according to the restriction of the type. In addition, put a note about possible MVCC violation and optimization issues, which apply to all reg* types generally.
--- doc/src/sgml/datatype.sgml| 53 --- src/backend/bootstrap/bootstrap.c | 2 + src/backend/catalog/dependency.c | 22 src/backend/utils/adt/regproc.c | 98 +++ src/backend/utils/adt/selfuncs.c | 2 + src/backend/utils/cache/catcache.c| 1 + src/include/catalog/pg_cast.h | 7 +++ src/include/catalog/pg_proc.h | 10 src/include/catalog/pg_type.h | 5 ++ src/include/utils/builtins.h | 5 ++ src/test/regress/expected/regproc.out | 26 +- src/test/regress/sql/regproc.sql | 7 +++ 12 files changed, 219 insertions(+), 19 deletions(-) diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml index edf636b..f4e82f5 100644 --- a/doc/src/sgml/datatype.sgml +++ b/doc/src/sgml/datatype.sgml @@ -4319,8 +4319,9 @@ SET xmloption TO { DOCUMENT | CONTENT }; an object identifier. There are also several alias types for typeoid/: typeregproc/, typeregprocedure/, typeregoper/, typeregoperator/, typeregclass/, -typeregtype/, typeregconfig/, and typeregdictionary/. -xref linkend=datatype-oid-table shows an overview. +typeregtype/, typeregrole/, typeregconfig/, and +typeregdictionary/. xref linkend=datatype-oid-table shows +an overview. /para para @@ -4429,6 +4430,13 @@ SELECT * FROM pg_attribute /row row +entrytyperegrole//entry +entrystructnamepg_authid//entry +entryrole name/entry +entryliteralsmithee//entry + /row + + row entrytyperegconfig//entry entrystructnamepg_ts_config//entry entrytext search configuration/entry @@ -4446,29 +4454,38 @@ SELECT * FROM pg_attribute /table para -All of the OID alias types accept schema-qualified names, and will -display schema-qualified names on output if the object would not -be found in the current search path without being qualified. 
-The typeregproc/ and typeregoper/ alias types will only -accept input names that are unique (not overloaded), so they are -of limited use; for most uses typeregprocedure/ or +All of the OID alias types for objects grouped by namespace accept +schema-qualified names, and will display schema-qualified names on output +if the object would not be found in the current search path without being +qualified. The typeregproc/ and typeregoper/ alias types will +only accept input names that are unique (not overloaded), so they are of +limited use; for most uses typeregprocedure/ or typeregoperator/ are more appropriate. For typeregoperator/, unary operators are identified by writing literalNONE/ for the unused operand. /para para -An additional property of the OID alias types is the creation of -dependencies. If a -constant of one of these types appears in a stored expression -(such as a column default expression or view), it creates a dependency -on the referenced object. For example, if a column has a default -expression literalnextval('my_seq'::regclass)/, -productnamePostgreSQL/productname -understands that the default expression depends on the sequence -literalmy_seq/; the
Re: [HACKERS] pg_basebackup may fail to send feedbacks.
Hi, the attached is the v5 patch. - Do feGetCurrentTimestamp() only when necessary. - Rebased to current master At Mon, 2 Mar 2015 20:21:36 +0900, Fujii Masao masao.fu...@gmail.com wrote in cahgqgwg1tjhpg03ozgwokxt5wyd5v4s3hutgsx7rotbhhnj...@mail.gmail.com On Tue, Feb 24, 2015 at 6:44 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hello, the attached is the v4 patch that checks feedback timing every WAL segments boundary. .. I said that checking whether to send feedback every boundary between WAL segments seemed too long but after some rethinking, I changed my mind. - The most large possible delay source in the busy-receive loop is fsyncing at closing WAL segment file just written, this can take several seconds. Freezing longer than the timeout interval could not be saved and is not worth saving anyway. - 16M bytes-disk-writes intervals between gettimeofday() seems to be gentle enough even on platforms where gettimeofday() is rather heavy. Sounds reasonable to me. So we don't need to address the problem in walreceiver side so proactively because it tries to send the feedback every after flushing the WAL records. IOW, the problem you observed is less likely to happen. Right? +now = feGetCurrentTimestamp(); +if (standby_message_timeout 0 Surely it would hardly happen, especially on ordinary configuration. However, the continuous receiving of the replication stream is a quite normal behavior even if hardly happens. So the receiver should guarantee the feedbacks to be sent by logic as long as it is working normally, as long as the code for the special case won't be too large and won't take too long time:). The current walreceiver doesn't look trying to send feedbacks after flushing every wal records. HandleCopyStream will restlessly process the records in a gapless replication stream, sending feedback only when keepalive packet arrives. It is the fact that the response to the keepalive would help greatly but it is not what the keepalives are for. 
It is intended to be used to confirm if a silent receiver is still alive. Even with this fix, the case where one write or flush takes longer than the feedback interval cannot be saved, but that would be OK since it should be regarded as a disorder. Minor comment: should feGetCurrentTimestamp() be called after the check of standby_message_timeout 0, to avoid unnecessary calls of that? Ah, you're right. I'll fix it. ProcessXLogDataMsg(PGconn *conn, char *copybuf, int len, XLogRecPtr *blockpos, uint32 timeline, char *basedir, stream_stop_callback stream_stop, - char *partial_suffix, bool mark_done) + char *partial_suffix, bool mark_done, + int standby_message_timeout, int64 *last_status) Maybe it's time to refactor this ugly coding (i.e., currently many arguments need to be given to each function. Looks ugly)... I'm increasing the ugliness:( XLog stuff seems to need to share many states widely to work. But the parameter list of the function looks bearable to this extent, to me:). regards, -- Kyotaro Horiguchi NTT Open Source Software Center From ef7b04c9ddf351ca99736d9ec9fa1954383cd124 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp Date: Tue, 24 Feb 2015 17:52:01 +0900 Subject: [PATCH] Make an effort to send feedback regularly under heavy load. pg_basebackup and pg_receivexlog might be forced to omit sending feedback for a long time by a continuous replication stream caused by heavy load on the receiver side. Keepalives from the server could be delayed in such a situation. This patch lets them make an effort to send feedback in such a situation. At every boundary between WAL segments, send feedback if the time has come, just after syncing and closing the segment that was just finished.
--- src/bin/pg_basebackup/receivelog.c | 28 +--- 1 file changed, 25 insertions(+), 3 deletions(-) diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c index 8caedff..df51f9d 100644 --- a/src/bin/pg_basebackup/receivelog.c +++ b/src/bin/pg_basebackup/receivelog.c @@ -45,7 +45,8 @@ static bool ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len, static bool ProcessXLogDataMsg(PGconn *conn, char *copybuf, int len, XLogRecPtr *blockpos, uint32 timeline, char *basedir, stream_stop_callback stream_stop, - char *partial_suffix, bool mark_done); + char *partial_suffix, bool mark_done, + int standby_message_timeout, int64 *last_status); static PGresult *HandleEndOfCopyStream(PGconn *conn, char *copybuf, XLogRecPtr blockpos, char *basedir, char *partial_suffix, XLogRecPtr *stoppos, bool mark_done); @@ -906,7 +907,8 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, { if
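The timing rule in the patch — check the clock only at WAL segment boundaries, skip the check entirely when feedback is disabled, and send a status update when the interval has elapsed — can be modeled with a short sketch. Function and variable names here are illustrative, not the actual receivelog.c symbols (which work in microseconds via feGetCurrentTimestamp):

```python
def segment_boundary_hook(now, last_status, standby_message_timeout):
    """Decide, at a WAL segment boundary, whether to send a standby
    status update.  Returns (send, new_last_status)."""
    if standby_message_timeout <= 0:
        # Feedback disabled: never send, and (per the review comment)
        # don't even read the clock.
        return False, last_status
    if now - last_status >= standby_message_timeout:
        return True, now          # interval elapsed: send feedback now
    return False, last_status     # not yet time


# A busy receive loop crossing three segment boundaries 1.0 s apart,
# with a 2.5 s feedback timeout: only the third boundary sends.
last = 0.0
sent = []
for boundary_time in (1.0, 2.0, 3.0):
    send, last = segment_boundary_hook(boundary_time, last, 2.5)
    sent.append(send)
print(sent)  # [False, False, True]
```

Checking only every 16 MB of WAL written keeps the gettimeofday() overhead negligible, which is the trade-off Horiguchi and Fujii agree on above.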
Re: [HACKERS] moving from contrib to bin
On Tue, Mar 10, 2015 at 10:48 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Andres Freund wrote: Rebased (fair amount of trivial conflicts due to the copyright year bump) and attached as -MC style format-patch. If you look at the content of the patches you can see that the diff makes more sense now. Ah --- I just realized that the change Peter is proposing from backslash to forward slashes in the MSVC build stuff is related to testability of this patch set in non-Windows environments. Can we get that patch done please? It seems there's only a small bit missing there. How do we go about getting these patches pushed? I think that one of the last points raised (by Andres) was if the Makefiles in src/bin should depend on pgxs.mk or not. FWIW, I think as well that it should be a subsequent patch not related to this one as it is a different debate to have stuff of src/bin depend on contrib infrastructure. Now, the pushing plan was to have the stuff of pg_upgrade done in a separate commit. Note that I am fine helping out wrapping up things particularly on Windows if that helps. Note that each utility we move to src/bin will create a new translatable module for 9.5. It would be better to do it soon rather than waiting at the very end of the cycle, so that translators have time to work through the bunch of extra files. That's a point. -- Michael