[HACKERS] Confusing comment in tidbitmap.c
The following comment above #define PAGES_PER_CHUNK in tidbitmap.c appears to be incorrect:

  But we also want PAGES_PER_CHUNK to be a power of 2 to avoid expensive
  integer remainder operations. So, define it like this:

I don't quite follow this as it does nothing of the kind. Check tbm_page_is_lossy() where we do:

  bitno = pageno % PAGES_PER_CHUNK;

Or am I missing something about the compiler optimizing that to:

  bitno = pageno & 255;

?

Regards

David Rowley
Re: [HACKERS] speedup tidbitmap patch: cache page
On 23 October 2014 at 00:52, Teodor Sigaev teo...@sigaev.ru wrote:

  In specific workload postgres could spend a lot of time in tbm_add_tuples, up to 50% of query time. hash_search call is expensive and called twice for each ItemPointer to insert. Suggested patch tries to cache last PagetableEntry pointer in hope that next ItemPointer will be on the same relation page. Patch is rather trivial and I don't believe that it could cause performance degradation. But it could increase performance on some workload. An example:

  # create extension btree_gin;
  # select (v / 10)::int4 as i into t from generate_series(1, 500) as v;
  # create index idx on t using gin (i);
  # set enable_seqscan = off;
  # explain analyze select * from t where i = 0;
  without patch: Execution time: 2427.692 ms
  with patch:    Execution time: 2058.718 ms
  # explain analyze select * from t where i >= 100 and i <= 100;
  without patch: Execution time: 524.441 ms
  with patch:    Execution time: 195.381 ms

This seems like quite a promising performance boost.

I've been having a look at this and I'm wondering about a certain scenario: In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block, and on the next iteration of the loop we have the same block again, have you benchmarked any caching code to store whether tbm_page_is_lossy() returned true for that block on the previous iteration of the loop? This would save from having to call tbm_page_is_lossy() again for the same block. Or are you just expecting that tbm_page_is_lossy() returns true so rarely that you'll end up caching the page most of the time, and gain on skipping both hash lookups on the next loop, since page will be set in this case?

It would be nice to see a comment to explain why it might be a good idea to cache the page lookup. Perhaps something like:

  +       /*
  +        * We'll cache this page as it's quite likely that on the next loop
  +        * we'll be seeing the same page again. This will save from having
  +        * to lookup the page in the hashtable again.
  +        */
  +       page = tbm_get_pageentry(tbm, blk);

I also wondered if there might be a small slowdown in the case where the index only finds 1 matching tuple. So I tried the following:

  select v::int4 as i into t1 from generate_series(1, 500) as v;
  create index t1_idx on t1 using gin (i);
  select count(*) from t1 where i = 0;

In this case tbm_add_tuples is called with ntids == 1, so there's only 1 loop performed per call. I wondered if the page == NULL check would add much overhead to the function. Times are in milliseconds:

        Patch       Master
        2413.183    2339.979
        2352.564    2428.025
        2369.118    2359.498
        2338.442    2350.974
        2373.330    2400.942
        2383.883    2342.461
        2393.804    2359.49
        2334.998    2359.884
        2428.156    2520.842
        2336.978    2356.995
  avg.  2372.4456   2381.909   99.6%
  med.  2371.224    2359.494   100.5%

It appears that if it does, then it's not very much.

Regards

David Rowley
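The caching idea being reviewed can be sketched outside of PostgreSQL. In the sketch below, lookup_page() is a hypothetical stand-in for the hash_search-based tbm_get_pageentry(), and the lookups counter exists only to make the saving visible; the real patch keeps the cached pointer in the TIDBitmap struct rather than in a static variable:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for tidbitmap.c's PagetableEntry. */
typedef struct PagetableEntry { int blockno; } PagetableEntry;

static PagetableEntry table[8]; /* pretend hashtable storage */
static int lookups = 0;         /* counts "expensive" hash lookups */

/* Stand-in for the hash_search-based tbm_get_pageentry(). */
static PagetableEntry *lookup_page(int blk)
{
    lookups++;
    table[blk].blockno = blk;
    return &table[blk];
}

/*
 * The patch's idea: remember the entry returned for the previous
 * ItemPointer, and skip the hash lookup entirely when consecutive
 * ItemPointers fall on the same heap block - the common case when
 * an index returns many tuples per page.
 */
static PagetableEntry *get_page_cached(int blk)
{
    static PagetableEntry *page = NULL; /* cached from previous call */

    if (page == NULL || page->blockno != blk)
        page = lookup_page(blk);
    return page;
}
```

With this shape, three consecutive calls for the same block perform only one lookup, which is where the roughly 2.6x win on the range-scan example above comes from.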
Re: [HACKERS] Commitfest problems
On 13/12/14 09:37, Craig Ringer wrote: Speaking as the originator of commitfests, they were *always* intended to be a temporary measure, a step on the way to something else like continuous integration. I'd really like to see the project revisit some of the underlying assumptions that're being made in this discussion: - Patches must be email attachments to a mailing list - Changes must be committed by applying a diff ... and take a look at some of the options a git-based workflow might offer, especially in combination with some of the tools out there that help track working branches, run CI, etc. Having grown used to push/pull workflows with CI integration I find the PostgreSQL patch workflow very frustrating, especially for larger patches. It's particularly annoying to see a patch series squashed into a monster patch whenever it changes hands or gets rebased, because it's being handed around as a great honking diff not a git working branch. Is it time to stop using git like CVS? While not so active these days with PostgreSQL, I believe there is still great scope for improving the workflow of reviewing and applying patches. And while the commitfests were a good idea when they started, it's obvious that in their current form they do not scale to meet the current rate of development. If I could name just one thing that I think would improve things it would be submission of patches to the list in git format-patch format. Why? Because it enables two things: 1) by definition patches are small-chunked into individually reviewable pieces and 2) subject lines allow list members to filter things that aren't relevant to them. 
Here's an example of how the workflow works for the QEMU project which I'm involved in, which also has similar issues in terms of numbers of developers and supported platforms:

1) Developer submits a patch to the -devel list with git-format-patch

- For subsystems with listed maintainers in the MAINTAINERS file, the maintainer must be CCd on the patchset (this is so that even if developers decide to filter PATCH emails to the list, they still get the CC copy)
- Patchsets must be iterative and fully bisectable, with clear comments supplied in the commit message
- Any patchset with >1 patch MUST have a cover letter
- All patches have a Signed-off-by indicating that the developer has accepted the terms of the project license
- Each subject line should be in the format "[PATCH] subsystem: one line comment" to enable people to determine its relevance

2) Other developers start reviewing patches

There are several mini-workflows that tend to occur here so I'll try and give some popular examples.

- If a patch is a build fix, one of the core committers will verify and apply directly to the master repository
- If a maintainer is happy with the whole patchset, they reply to the cover letter with a Reviewed-by and a line stating which subtree the patchset has been applied to. Maintainers add a Signed-off-by to all patches accepted via their subtree.
- A maintainer may indicate that the patch itself is fine, but the commit message is unclear or incorrect and should be changed before being resubmitted
- Any member of the mailing list may reply/review an individual patch by hitting reply in their mail client. Patch comments are included inline. If a reviewer is happy with an individual patch, they reply to the patch and add a Reviewed-by line; anyone can provide a review, even if they are not a maintainer
- For complex patches, a maintainer may request that the patchset be split into further patchsets.
In general patchsets of >20-30 patches tend to be frowned upon, and will often immediately result in a reply saying "please break this down into smaller chunks"

- A maintainer may reply and say that while the patchset in its final form needs more work, some clean-ups introduced by the patch are ready to be applied, e.g. "please resubmit patches 1-4, 7 and 8 as a separate patchset" which can then be applied directly to a subtree

3) Patchset tidy-up and submission

After the initial review, new versions of the patchset are generally reposted to the list within a few days.

- The patch version is included in the header with the same subject line, e.g. [PATCHv2] Add new feature
- The cover letter contains a mini-changelog giving the changes between v1 and v2, e.g.
    v2: Fix spelling mistake pointed out by Peter
        Change lock ordering requested by Andreas
- Reviewed-by tags for individual patches sent to the list are appended to the commit message for that patch before resubmission

Once a maintainer has accepted a patch into their subtree, they send a pull request to the core maintainers to apply their trees to the main repository. I appreciate that currently merge commits aren't used in the PostgreSQL git repository, so a pull request effectively becomes "rebase this patchset onto master and push".

So why does this help patch review? From my experience I can
Re: [HACKERS] Commitfest problems
  #2 is solved by my previous comments about giving the CFM/C the authority. Core could do that, they are in charge of release.

I don't think authority is the solution. Or certainly not one that would work with an open source project like ours. What *would* work is to identify and fix the friction points that prevent people from joining, make the work harder than it needs to be, and make people stop reviewing. I could quickly identify a handful of things, primarily among them the awful link-to-the-archives-to-gather-up-all-the-patches process. We have git, let's use it as it was intended.

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201412141011
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
  I ran a test using postgres-US.fo built in the PostgreSQL source tree, which is 38 MB, and ran

    select unnest(xpath('//fo:bookmark-title', b, array[array['fo', 'http://www.w3.org/1999/XSL/Format']])) from data;

  (Table contains one row only.) The timings were basically indistinguishable between the three code versions. I'll try to reproduce your test. How big is your file? Do you have a link to the actual file? Could you share your load script?

I use this xml sample: http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml

Basically I loaded the xml to table u 100 times. Load script attached.

Regards,
--
Ali Akbar
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
On Sun, Dec 14, 2014 at 11:54 AM, Amit Kapila amit.kapil...@gmail.com wrote:

  On Sat, Dec 13, 2014 at 10:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:

    Andrew Dunstan and...@dunslane.net writes:

      On 11/20/2014 02:27 AM, Amit Kapila wrote: Now the part where I would like to receive feedback before revising the patch is on the coding style. It seems to me from Tom's comments that he is not happy with the code, now I am not sure which part of the patch he thinks needs change. Tom if possible, could you be slightly more specific about your concern w.r.t code?

      In view of the request above for comments from Tom, I have moved this back to Needs Review.

    Sorry, I was not paying very close attention to this thread and missed the request for comments. A few such: 1. The patch is completely naive about what might be in the symlink path string; eg embedded spaces in the path would break it. On at least some platforms, newlines could be in the path as well. I'm not sure about how to guard against this while maintaining human readability of the file.

  I will look into this and see what best can be done.

One way to deal with this could be to append a delimiter (one that is not allowed in a tablespace path, like a quote (\')) at the end of the tablespace path while writing the same to the symlink label file, and then use that as an end marker while reading it from the file. I think that might defeat the human-readable aspect of this file to an extent, but I am not sure if it is too important to keep it human readable. I think even if we want to provide some information to the user from internal files, it is always better to provide it via some utility/tool.

Do we support newline in tablespace path?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Commitfest problems
On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:

  Compare this to say, for example, huge patches such as RLS.

I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form.

It's not like development on a patch series is difficult. You commit small fixes and changes, then you 'git rebase -i' and reorder them to apply to the appropriate changesets. Or you can do a 'rebase -i' and in 'e'dit mode make amendments to individual commits. Or you can commit 'fixup!'s that get auto-squashed.

This is part of my grumbling about the use of git like it's still CVS.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Commitfest problems
On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote:

  If I could name just one thing that I think would improve things it would be submission of patches to the list in git format-patch format. Why? Because it enables two things: 1) by definition patches are small-chunked into individually reviewable pieces and 2) subject lines allow list members to filter things that aren't relevant to them.

Personally I do just that. Even though the official docs say context diff is preferred, I think most people now use unified diffs in practice, from 'git diff'. I'm surprised most people don't use git format-patch.

Note that we can't just git am a patch from git format-patch at the moment, because policy is that the committer should be the Author field. The project would have to define a transition point where we started using the git Author field for the author and the separate Committer field, like git-am does by default. I'm not sure that'll ever actually happen, or that it's even desirable.

  - Patchsets must be iterative and fully bisectable, with clear comments supplied in the commit message

Fully support this one. Also in the smallest reasonable size divisions.

  - If a maintainer is happy with the whole patchset, they reply to the cover letter with a Reviewed-by and a line stating which subtree the patchset has been applied to. Maintainers add a Signed-off-by to all patches accepted via their subtree.

Sounds a lot like the kernel workflow (though in practice it seems like the kernel folks tend to bypass all this mailing list guff and just use direct pulls/pushes for lots of things).

  - Any member of the mailing list may reply/review an individual patch by hitting reply in their mail client. Patch comments are included inline.
  If a reviewer is happy with an individual patch, they reply to the patch and add a Reviewed-by line; anyone can provide a review, even if they are not a maintainer

This is fun with many mail clients that like to linewrap text bodies and don't permit inline replies to attached text.

  6) Smaller patches are applied more often By breaking large patches down into smaller chunks, even if the main feature itself isn't ready to be applied to the main repository then a lot of the preceding patches laying down the groundwork can.

I think PostgreSQL does OK on smaller patches - at least sometimes. There can be a great deal of bikeshedding and endless arguments, back-and-forth about utility, in-tree users, etc, but no formal process is ever going to change that. The process of getting an uncontroversial patch into Pg is generally painless, if often rather slow unless a committer sees it and commits it immediately upon it being posted.

  The benefits here are that people with large out-of-tree patches

I'm not sure we have all that many people in that category - though I'm a part of a team that most certainly does fit the bill. Even the small patches for BDR have been challenging and often contentious though.

  Since as large features get implemented as a series of smaller patches

A big barrier here is the "we must have in-tree users" thing, which makes it hard to do iterative work on a finer grained scale. Some C-level unit tests that let us exercise small units of functionality might ease those complaints. Right now the closest we have to that is contrib/test_[blah] which is only useful for public API and even then quite limited.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Custom timestamp format in logs
On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier michael.paqu...@gmail.com wrote:

  Hi all, This week, we heard about a user willing to use a custom timestamp format across a set of services to improve the debuggability of the whole set, Postgres being one of them. Unfortunately datestyle does not take into account the logs. Would it be worth adding a new GUC able to control the timestamp format in the logs? We could still redirect the logs with syslog and have a custom timestamp format there, but in the case of this particular user syslog was a no-go. Looking at the code, the timestamp format is now hardcoded in setup_formatted_log_time and setup_formatted_start_time when calling pg_strftime @ elog.c, so it doesn't seem to be too complicated to do. Opinions? This thread is here for that.

A separate GUC seems kind of weird. Wouldn't it be better with something like %(format)t or such in the log_line_prefix itself in that case? That could also be expanded to other parameters, should we need them?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
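To make the proposal concrete: elog.c currently hardcodes the pattern handed to pg_strftime, and either a GUC or a %(format)t escape would amount to feeding a user-supplied pattern into that same call. A minimal sketch of the idea, with plain C strftime() standing in for pg_strftime() and the function name being purely illustrative:

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Sketch: format a log timestamp with a caller-supplied strftime pattern,
 * the way a hypothetical log_timestamp_format GUC (or a %(format)t
 * log_line_prefix escape) would. UTC is used here for reproducibility;
 * the real code formats in the server's log timezone.
 */
static void format_log_time(char *buf, size_t len,
                            time_t stamp, const char *fmt)
{
    struct tm *tm = gmtime(&stamp);   /* broken-down UTC time */
    strftime(buf, len, fmt, tm);
}
```

For example, format_log_time(buf, sizeof buf, now, "%Y-%m-%dT%H:%M:%SZ") would yield the ISO 8601 style the hypothetical user wanted, without touching anything else in the log pipeline.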
Re: [HACKERS] Custom timestamp format in logs
Magnus Hagander mag...@hagander.net writes:

  On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier michael.paqu...@gmail.com wrote:

    This week, we heard about a user willing to use a custom timestamp format across a set of services to improve the debuggability of the whole set, Postgres being one of them. Unfortunately datestyle does not take into account the logs. Would it be worth adding a new GUC able to control the timestamp format in the logs?

  A separate GUC seems kind of weird. Wouldn't it be better with something like %(format)t or such in the log_line_prefix itself in that case? That could also be expanded to other parameters, should we need them?

TBH, my answer to the rhetorical question is no. There is nothing weird about the timestamps %t emits now, and no reason why they should need to be configurable, except that somebody thinks it's easier to lobby us to complicate our software than to fix whatever they have that can't consume standard timestamp format.

regards, tom lane
Re: [HACKERS] Commitfest problems
On 14/12/14 15:51, Craig Ringer wrote:

  On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: Compare this to say, for example, huge patches such as RLS. I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form. It's not like development on a patch series is difficult. You commit small fixes and changes, then you 'git rebase -i' and reorder them to apply to the appropriate changesets. Or you can do a 'rebase -i' and in 'e'dit mode make amendments to individual commits. Or you can commit 'fixup!'s that get auto-squashed. This is part of my grumbling about the use of git like it's still CVS.

I just want to add that I'm always grateful for the time that yourself and all committers put into PostgreSQL, and my example of RLS was really to signify "big feature X" rather than to pick on a particular aspect of that patch. I apologise if you thought I was in any way criticising the work that you, or anyone else, put into this feature.

The argument I wanted to make is that if someone starts seeing a problem with a current build of PostgreSQL and git bisects it down to a particular commit, then smaller, iterative commits are extremely helpful in this regard. When searching for a needle in a haystack, there is a very big difference between an "add big feature X" haystack and a "change struct alignment for Y" haystack, the latter of which is implicit in having smaller, iterative patchsets.

ATB,

Mark.
Re: [HACKERS] Commitfest problems
Craig Ringer cr...@2ndquadrant.com writes:

  On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: Compare this to say, for example, huge patches such as RLS. I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form.

TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture.

Also, speaking of the larger picture: the current Postgres revision history amounts to 37578 commits (as of sometime yesterday) --- and that's just in the HEAD branch. If we'd made an effort to break feature patches into bite-size chunks like you're recommending here, we'd probably have easily half a million commits in the mainline history. That would not be convenient to work with, and I really doubt that it would be more useful for git bisect purposes, and I'll bet a large amount of money that most of them would not have had commit messages composed with any care at all.

regards, tom lane
Re: [HACKERS] Confusing comment in tidbitmap.c
David Rowley dgrowle...@gmail.com writes:

  The following comment above #define PAGES_PER_CHUNK in tidbitmap.c appears to be incorrect: "But we also want PAGES_PER_CHUNK to be a power of 2 to avoid expensive integer remainder operations. So, define it like this:" I don't quite follow this as it does nothing of the kind. Check tbm_page_is_lossy() where we do: bitno = pageno % PAGES_PER_CHUNK; Or am I missing something about the compiler optimizing that to: bitno = pageno & 255; ?

Exactly. Any C compiler ever written will do that.

regards, tom lane
Re: [HACKERS] proposal: plpgsql - Assert statement
Hi

any comments to last proposal and patch?

Pavel

2014-11-26 19:52 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

  Hi

  2014-11-26 16:46 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

    2014-11-26 13:31 GMT+01:00 Marko Tiikkaja ma...@joh.to:

      On 11/26/14 8:55 AM, Pavel Stehule wrote: * should be assertions globally enabled/disabled? - I have no personal preference in this question.

      I think so. The way I would use this function is to put expensive checks into strategic locations which would only run when developing locally (and additionally perhaps in one of the test environments.) And in that case I'd like to globally disable them for the live environment.

    ok

      * can be ASSERT exception handled ? - I prefer this be unhandled exception - like query_canceled because It should not be masked by plpgsql EXCEPTION WHEN OTHERS ...

      I don't care much either way, as long as we get good information about what went wrong. A stack trace and hopefully something like print_strict_params for parameters to the expr.

    There is more ways, I can live with both

  here is proof of concept, what do you think about it?

  Regards

  Pavel
Re: [HACKERS] Commitfest problems
On 14/12/14 15:57, Craig Ringer wrote:

  On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: If I could name just one thing that I think would improve things it would be submission of patches to the list in git format-patch format. Why? Because it enables two things: 1) by definition patches are small-chunked into individually reviewable pieces and 2) subject lines allow list members to filter things that aren't relevant to them.

  Personally I do just that. Even though the official docs say context diff is preferred, I think most people now use context diff in practice, from 'git diff'. I'm surprised most people don't use git format-patch. Note that we can't just git am a patch from git format-patch at the moment, because policy is that the committer should be the Author field. The project would have to define a transition point where we started using the git Author field for the author and the separate Committer field, like git-am does by default. I'm not sure that'll ever actually happen, or that it's even desirable.

Sure, I appreciate that. The part I particularly wanted to emphasise here was the use of a system identifier as part of the subject line, e.g. say if there was a hypothetical patchset which sped up PostgreSQL by 20% then you could see subjects like this:

  [PATCH 1/x] lmgr: reduce lock strength for LWLock
  [PATCH 2/x] wal: change lock ordering for WAL writes
  [PATCH 3/x] optimiser: exit early if lock unobtainable

While these are obviously unrealistic, what I hope here is that people with interest in these areas would take note of individual patches even if they aren't interested in the entire patchset. So maybe since Andreas did some recent locking work, he spots lmgr in the subject and then makes a note to review that part of the patch. Similarly Heikki could spot wal and make a note to look at that, whilst Tom would zero in on the optimiser changes.
The aim here is that no one person needs to sit and initially review a complete patch before returning feedback to the developer. - Patchsets must be iterative and fully bisectable, with clear comments supplied in the commit message Fully support this one. Also in the smallest reasonable size divisions. I should add here that the QEMU folk do tend to go to great lengths to preserve bisectability; often intermediate compatibility APIs are introduced early in the patchset and then removed at the very end when the final feature is implemented. - If a maintainer is happy with the whole patchset, they reply to the cover letter with a Reviewed-by and a line stating which subtree the patchset has been applied to. Maintainers add a Signed-off-by to all patches accepted via their subtree. Sounds a lot like the kernel workflow (though in practice it seems like the kernel folks tend bypass all this mailing list guff and just use direct pulls/pushes for lots of things). Yes indeed - mainly from the people on the KVM module side. Again I'm not saying that this workflow is correct for PostgreSQL, I was trying to use this an example to explain *why* the workflow is done in this manner and how it helps developers such as myself. - Any member of the mailing list may reply/review an individual patch by hitting reply in their mail client. Patch comments are included inline. If a reviewer is happy with an individual patch, they reply to the patch and add a Reviewed-by line; anyone can provide a review, even if they are not a maintainer This is fun with many mail clients that like to linewrap text bodies and don't permit inline replies to attached text. Wow really? I can't say I've seen this in practice for a long time but I defer to your experience here. 
6) Smaller patches are applied more often By breaking large patches down into smaller chunks, even if the main feature itself isn't ready to be applied to the main repository then a lot of the preceding patches laying down the groundwork can. I think PostgreSQL does OK on smaller patches - at least sometimes. There can be a great deal of bikeshedding and endless arguments, back-and-forth about utility, in-tree users, etc, but no formal process is ever going to change that. The process of getting an uncontroversial patch into Pg is generally painless, if often rather slow unless a committer sees it and commits it immediately upon it being posted. I agree that trivial patches do tend to get picked up reasonably quickly. It strikes me that it may prevent patches slipping through the list if someone were officially nominated as a trivial patch monitor, i.e. someone for developers to ping if their patch has been ignored but it sounds like this is not such an issue in practice. The benefits here are that people with large out-of-tree patches I'm not sure we have all that many people in that category - though I'm a part of a team that most certainly does fit the bill. Even the small patches for BDR have been challenging and often contentious
Re: [HACKERS] Commitfest problems
On 12/14/2014 12:05 PM, Tom Lane wrote: Craig Ringer cr...@2ndquadrant.com writes: On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: Compare this to say, for example, huge patches such as RLS. I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form. TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture. Also, speaking of the larger picture: the current Postgres revision history amounts to 37578 commits (as of sometime yesterday) --- and that's just in the HEAD branch. If we'd made an effort to break feature patches into bite-size chunks like you're recommending here, we'd probably have easily half a million commits in the mainline history. That would not be convenient to work with, and I really doubt that it would be more useful for git bisect purposes, and I'll bet a large amount of money that most of them would not have had commit messages composed with any care at all. I have tried to stay away from this thread, but ... I'm also quite dubious about this suggested workflow, partly for the reasons Tom gives, and partly because it would constrain the way I work. 
I tend to commit with little notes to myself in the commit logs, notes that are never intended to become part of the public project history. I should be quite sad to lose that. As for using git bisect, usually when I do this each iteration is quite expensive. Multiplying the number of commits by a factor between 10 and 100, which is what I think this would involve, would just make git bisect have to do between 3 and 7 more iterations, ISTM. That's not a win. On the larger issue, let me just note that I don't believe we have what is fundamentally a technological problem, and while technological changes can of course sometimes make things easier, they can also blind us to the more basic problems we are facing. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
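A quick sanity check of Andrew's arithmetic (an editor's sketch, not from the thread): git bisect halves the candidate range each step, so it needs roughly log2(N) iterations, and multiplying the commit count by a factor k adds about log2(k) extra iterations regardless of N.

```python
import math

# Multiplying the commit count by k adds ~log2(k) bisect steps,
# independent of the starting N -- hence "between 3 and 7 more"
# for a blowup factor between 10 and 100.
for k in (10, 100):
    print(f"factor {k}: ~{math.log2(k):.1f} extra bisect steps")
```

For k = 10 this is about 3.3 extra steps, and for k = 100 about 6.6, matching the "between 3 and 7 more iterations" estimate.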
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
Amit Kapila wrote: One way to deal with this could be to append a delimiter (one that is not allowed in a tablespace path, like a quote (\')) at the end of the tablespace path while writing it to the symlink label file, and then use that as an end marker while reading it back from the file. Some GNU tools such as xargs and find use a null char as the item delimiter; see find -print0 and xargs -0. IIRC one of our tools also allows that (psql?). Doing the same here would make human reading a bit more difficult, but not completely impossible. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
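The appeal of the NUL delimiter is that a path may contain spaces, quotes, even newlines, but never a NUL byte, so a NUL-terminated stream (the same convention as `find -print0`) always parses back unambiguously. A minimal sketch with hypothetical tablespace paths:

```python
# Writer side: terminate each path with a NUL byte; reader side:
# split on NUL.  No path content can collide with the delimiter.
paths = ["/srv/pg/ts one", "/srv/pg/it's", "/srv/pg/two\nlines"]
stream = b"".join(p.encode() + b"\0" for p in paths)        # writer
decoded = [c.decode() for c in stream.split(b"\0") if c]    # reader
assert decoded == paths
```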
Re: [HACKERS] Custom timestamp format in logs
Tom Lane wrote: Magnus Hagander mag...@hagander.net writes: On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier michael.paqu...@gmail.com wrote: This week, we heard about a user willing to use a custom timestamp format across a set of services to improve the debuggability of the whole set, Postgres being one of them. Unfortunately datestyle does not take the logs into account. Would it be worth adding a new GUC able to control the timestamp format in the logs? A separate GUC seems kind of weird. Wouldn't it be better with something like %(format)t or such in the log_line_prefix itself in that case? That could also be expanded to other parameters, should we need them? TBH, my answer to the rhetorical question is no. There is nothing weird about the timestamps %t emits now, and no reason why they should need to be configurable, except that somebody thinks it's easier to lobby us to complicate our software than to fix whatever they have that can't consume the standard timestamp format. I imagine pgBadger/pgFouine wouldn't be happy with the timestamp being infinitely configurable. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
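For reference, log_line_prefix already offers two fixed-format timestamp escapes; a hedged postgresql.conf sketch (values are illustrative):

```
# %t = timestamp without milliseconds, %m = timestamp with milliseconds,
# %p = process ID
log_line_prefix = '%m [%p] '   # e.g. 2014-12-14 12:05:00.123 CET [4711]
```

The thread's question is whether a third, user-defined format (e.g. a hypothetical %(format)t) is worth the cost to log-parsing tools.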
Re: [HACKERS] proposal: plpgsql - Assert statement
Pavel Stehule wrote: Hi, any comments on the last proposal and patch? WTH is that pstrdup("hint is null") thing? Debugging leftover? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Commitfest problems
On 14/12/14 17:05, Tom Lane wrote: Craig Ringer cr...@2ndquadrant.com writes: On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: Compare this to say, for example, huge patches such as RLS. I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form. TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture. The way in which this is normally handled is via the cover letter which is going to be nicely captured in the mail archives; typically a cover letter for a patchset consists of several paragraphs along the lines of patches 1-3 do some re-org work, patch 4 reworks the Y API for new feature X implemented in patches 9-12. As an example take a look at the cover letter for this patch submitted over the past few days: http://lists.gnu.org/archive/html/qemu-devel/2014-12/msg01990.html. It seems to me that this would give the level of detail which you are looking for. I agree that definitely in some cases patches could be broken into too fine pieces, but then I think it's perfectly okay for committers to turn around and request the series be re-submitted in more sensible chunks if they feel things have gone too far the other way. 
Also, speaking of the larger picture: the current Postgres revision history amounts to 37578 commits (as of sometime yesterday) --- and that's just in the HEAD branch. If we'd made an effort to break feature patches into bite-size chunks like you're recommending here, we'd probably have easily half a million commits in the mainline history. That would not be convenient to work with, and I really doubt that it would be more useful for git bisect purposes, and I'll bet a large amount of money that most of them would not have had commit messages composed with any care at all. Having worked on a few kernel patches myself, git is surprisingly effective on huge repositories once the cache is warmed up, so I don't feel this would be an issue with a PostgreSQL git repository, which will be many orders of magnitude smaller. In terms of commit messages, I don't know if you missed that part of my original response to Craig, but it's considered normal for a maintainer to reject a patch because of an incorrect/irrelevant commit message. So if I submitted a patch that fixed a particular bug but you as a committer weren't happy with the commit message, then I'm perfectly okay with you asking me to resubmit with updated/corrected commit message wording. ATB, Mark.
Re: [HACKERS] Custom timestamp format in logs
Alvaro Herrera alvhe...@2ndquadrant.com writes: Tom Lane wrote: TBH, my answer to the rhetorical question is no. There is nothing weird about the timestamps %t emits now, and no reason why they should need to be configurable, except that somebody thinks it's easier to lobby us to complicate our software than to fix whatever they have that can't consume standard timestamp format. I imagine pgBadger/pgFouine wouldn't be happy with the timestamp being infinitely configurable. Yeah, if the repercussions were confined to the server code that would be one thing, but there are a number of external tools that would be affected as well. The complication cost has to be thought about that way rather than evaluated in isolation. You could argue that pgBadger et al could just document that they don't support nonstandard timestamp formats ... but then it's really unclear why we're shifting the complexity burden in this direction rather than asking why the one proprietary application that wants the other thing can't cope with the existing format choice. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On Sun, Dec 14, 2014 at 9:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture. It sounds like they didn't use the technique effectively, then. I think it will be useful to reviewers that I've broken out the mechanism through which the ON CONFLICT UPDATE patch accepts the EXCLUDED.* pseudo-alias into a separate commit (plus docs in a separate commit, as well as tests) - it's a non-trivial additional piece of code that it wouldn't be outrageous to omit in a release, and isn't particularly anticipated by earlier cumulative commits. Maybe people don't have a good enough sense of what sort of subdivision is appropriate yet. I think that better style will emerge over time. Of course, if you still prefer a monolithic commit, it's pretty easy to produce one from a patch series. It's not easy to go in the other direction, though. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest problems
On 14/12/14 17:30, Andrew Dunstan wrote: On 12/14/2014 12:05 PM, Tom Lane wrote: Craig Ringer cr...@2ndquadrant.com writes: On 12/14/2014 10:35 PM, Mark Cave-Ayland wrote: Compare this to say, for example, huge patches such as RLS. I specifically objected to that being flattened into a single monster patch when I saw that'd been done. If you look at my part in the work on the row security patch, while I was ultimately unsuccessful in getting the patch mergeable I spent quite a bit of time splitting it up into a logical patch-series for sane review and development. I am quite annoyed that it was simply flattened back into an unreviewable, hard-to-follow blob and committed in that form. TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture. Also, speaking of the larger picture: the current Postgres revision history amounts to 37578 commits (as of sometime yesterday) --- and that's just in the HEAD branch. If we'd made an effort to break feature patches into bite-size chunks like you're recommending here, we'd probably have easily half a million commits in the mainline history. That would not be convenient to work with, and I really doubt that it would be more useful for git bisect purposes, and I'll bet a large amount of money that most of them would not have had commit messages composed with any care at all. I have tried to stay away from this thread, but ... I'm also quite dubious about this suggested workflow, partly for the reasons Tom gives, and partly because it would constrain the way I work. 
I tend to commit with little notes to myself in the commit logs, notes that are never intended to become part of the public project history. I should be quite sad to lose that. Interestingly enough, I tend to work in a very similar way to this. When submitting patches upstream, I tend to rebase on a new branch and then squash/rework as required. One thing I do tend to find is that once I start rebasing the patches for upstream, I find that many of my personal notes can be rewritten as part of the patch comment (I would like to think that comments useful to myself will be useful to other developers some day). Once the rebase is complete, often I find that I have no non-public comments left so that problem becomes almost non-existent. As for using git bisect, usually when I do this each iteration is quite expensive. Multiplying the number of commits by a factor between 10 and 100, which is what I think this would involve, would just make git bisect have to do between 3 and 7 more iterations, ISTM. That's not a win. I find that this isn't too bad in practice. If you think about monolithic patches during a commitfest, you can imagine that most of them will touch one of the core .h files which will require most things to be rebuilt once applied during bisection. With smaller iterative patchsets, each patch tends to only touch a handful of files and so make install generally only needs to rebuild and link a very small number of files which is reasonably quick. Since all of the changes for global header files are contained in 1-2 patches then the number of complete rebuilds isn't as many as you might expect. On the larger issue, let me just note that I don't believe we have what is fundamentally a technological problem, and while technological changes can of course sometimes make things easier, they can also blind us to the more basic problems we are facing. Absolutely. 
I firmly believe that the right tools for the job are available, it's really trying to find a workflow solution that works well for everyone involved in the project. ATB, Mark. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - Assert statement
2014-12-14 18:57 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com: Pavel Stehule wrote: Hi, any comments on the last proposal and patch? WTH is that pstrdup("hint is null") thing? Debugging leftover? No, it is just a poorly worded error message. I propose the syntax ASSERT boolean_expression [, text_expression ] -- the text expression is a hint for assertion debugging. This text expression should not be null when it is used. I am not sure what the correct behavior is - so when the assertion fails and the text expression is null, I use the message "hint is null" as the hint. Regards Pavel -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
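To make the proposed semantics concrete, here is a hedged Python sketch of the intended behavior; the function name is hypothetical (the real feature is PL/pgSQL statement syntax, not a function), and the NULL-hint fallback models the `pstrdup("hint is null")` being discussed.

```python
def plpgsql_assert(condition, hint=None):
    """Sketch of ASSERT boolean_expression [, text_expression]:
    raise when the condition is false; if the optional hint
    expression is NULL (None here), fall back to the placeholder
    message the patch currently hardcodes."""
    if not condition:
        raise AssertionError(hint if hint is not None else "hint is null")

plpgsql_assert(1 + 1 == 2)      # true condition: passes silently
msg = None
try:
    plpgsql_assert(False)       # false condition, no hint given
except AssertionError as e:
    msg = str(e)
print(msg)
```

With an explicit hint the message is used as-is; without one, the placeholder appears, which is exactly the wart Alvaro spotted.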
Re: [HACKERS] Commitfest problems
On 14/12/14 18:24, Peter Geoghegan wrote: On Sun, Dec 14, 2014 at 9:05 AM, Tom Lane t...@sss.pgh.pa.us wrote: TBH, I'm not really on board with this line of argument. I don't find broken-down patches to be particularly useful for review purposes. An example I was just fooling with this week is the GROUPING SETS patch, which was broken into three sections for no good reason at all. (The fourth and fifth subpatches, being alternative solutions to one problem, are in a different category of course.) Too often, decisions made in one subpatch don't make any sense until you see the larger picture. It sounds like they didn't use the technique effectively, then. I think it will be useful to reviewers that I've broken out the mechanism through which the ON CONFLICT UPDATE patch accepts the EXCLUDED.* pseudo-alias into a separate commit (plus docs in a separate commit, as well as tests) - it's a non-trivial additional piece of code that it wouldn't be outrageous to omit in a release, and isn't particularly anticipated by earlier cumulative commits. Maybe people don't have a good enough sense of what sort of subdivision is appropriate yet. I think that better style will emerge over time. For me this is a great example of how to get more developers involved in patch review. Imagine that I'm an experienced developer with little previous exposure to PostgreSQL, but with a really strong flex/bison background. If someone posts a patch to the list as a single grouping sets patch, then I'm going to look at that and think I have no way of understanding this and do nothing with it. However if it were posted as part of patchset with a subject of [PATCH] gram.y: add EXCLUDED pseudo-alias to bison grammar then this may pique my interest enough to review the changes to the grammar rules, with the barrier for entry reduced to understanding just the PostgreSQL-specific parts. 
What should happen over time is that developers like this would earn the trust of the committers, so committers can spend more time reviewing the remaining parts of the patchset. And of course the project has now engaged a new developer into the community who otherwise may not have not participated. Of course, if you still prefer a monolithic commit, it's pretty easy to produce one from a patch series. It's not easy to go in the other direction, though. Agreed. ATB, Mark. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN range operator class
I thought we could do better than minmax for the inet data type, and ended up with a generalized opclass supporting both inet and range types. Patch based on minmax-v20 attached. It works well except for a few small problems. I will improve the patch and add it to a commitfest after the BRIN framework is committed. I wanted to send a new version before the commitfest to get some feedback, but it is still work in progress. Patch attached, rebased to the current HEAD. This version supports more operators, and box from the geometric data types. The opclasses are renamed to inclusion_ops to be more generic. The problems I mentioned remain because I couldn't solve them without touching the BRIN framework. To support more operators I needed to change amstrategies and amsupport in the catalog. It would be nice if amsupport could be set to 0 like amstrategies. I think it would be nicer to get the functions from the operators using the strategy numbers instead of adding them directly as support functions. I looked around a bit but couldn't find a sensible way to support it. Is it possible without adding them to the RelationData struct? The inet data type accepts both IP version 4 and version 6 addresses. It isn't possible to represent the union of addresses from different versions with a valid inet value. So, I made the union function return NULL in this case. Then, I tried to store whether the returned value is NULL or not in column->values[] as a boolean, but it failed on the pfree() inside brin_dtuple_initialize(). It doesn't seem right to free the values based on attr->attbyval. This problem remains. There is also a similar problem with the range types, namely empty ranges. There should be special cases for them in some of the strategies. I tried to solve the problems in several different ways, but got a segfault at one line or another. This makes me think that the BRIN framework doesn't support storing types different from the indexed column's type in the values array.
For example, brin_deform_tuple() iterates over the values array and copies the entries using the length of the attr of the index, not the length of the type defined by the OpcInfo function. If storing other types isn't supported, why is it required to return OIDs from the OpcInfo function? I am confused. I didn't try to support geometric types other than box, as I couldn't manage to store a different type in the values array, but it would be nice to get some feedback about the overall design. I was thinking of adding a STORAGE parameter to the index to support other geometric types. I am not sure that adding a STORAGE parameter to be used by the opclass implementation is the right way. It wouldn't be the actual thing that is stored by the index; it would be an element in the values array. Maybe data-type-specific opclasses are the way to go, not a generic one as I am trying. brin-inclusion-v02.patch Description: Binary data
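The version-mixing problem is easy to see outside the patch. An illustration with Python's ipaddress module (unrelated to the patch's C code; the function below is a hypothetical model, with None standing in for SQL NULL): an inclusion summary must be one value per column, and no single inet-style value covers both an IPv4 and an IPv6 address.

```python
import ipaddress

v4 = ipaddress.ip_address("10.1.2.3")
v6 = ipaddress.ip_address("2001:db8::1")

def summary_union(a, b):
    """Inclusion-summary union sketch: return None (modelling the
    patch's decision to return NULL) when the two addresses cannot
    share a single inet summary value."""
    if a.version != b.version:
        return None            # IPv4 and IPv6 have no common summary
    lo, hi = sorted((a, b))    # same version: a (lo, hi) range works
    return (lo, hi)

assert summary_union(v4, v6) is None
assert summary_union(v4, ipaddress.ip_address("10.9.9.9")) is not None
```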
Re: [HACKERS] Commitfest problems
On 12/15/2014 02:46 AM, Mark Cave-Ayland wrote: Interestingly enough, I tend to work in a very similar way to this. When submitting patches upstream, I tend to rebase on a new branch and then squash/rework as required. Same here, and I find it works really well ... when I do it properly. I usually have a private development branch that's full of fixup! commits, WIPs, awful commit messages, etc. Then I have the tree that's been rebased, squashed, and tidied up. I periodically tidy my latest work and replace the old clean tree with it, then start my new development branch on top of the new clean tree. This gives me a somewhat nice looking, well ordered patch series to work on top of, while retaining the flexibility to do lots of quick fixes etc. Admittedly, sometimes the development tree gets a bit large and it takes a while before I give it a proper cleanup. My current project being very much in that category. Still - it works well in general. I find that this isn't too bad in practice. If you think about monolithic patches during a commitfest, you can imagine that most of them will touch one of the core .h files which will require most things to be rebuilt once applied during bisection. It's not build time, it's test-run time. Especially if you can't automate the test, or it isn't practical to. That's a legitimate concern - but I don't think we'd see a blowout of patch counts to quite the same degree. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Commitfest problems
On 14/12/14 20:07, Craig Ringer wrote: On 12/15/2014 02:46 AM, Mark Cave-Ayland wrote: Interestingly enough, I tend to work in a very similar way to this. When submitting patches upstream, I tend to rebase on a new branch and then squash/rework as required. Same here, and I find it works really well ... when I do it properly. I usually have a private development branch that's full of fixup! commits, WIPs, awful commit messages, etc. Then I have the tree that's been rebased, squashed, and tided up. I periodically tidy my latest work and replace the old clean tree with it, then start my new development branch on top of the new clean tree. This gives me a somewhat nice looking, well ordered patch series to work on top of, while retaining the flexibility to do lots of quick fixes etc. Admittedly, sometimes the development tree gets a bit large and it takes a while before I give it a proper cleanup. My current project being very much in that category. Still - it works well in general. That describes my workflow fairly well too. I find that this isn't too bad in practice. If you think about monolithic patches during a commitfest, you can imagine that most of them will touch one of the core .h files which will require most things to be rebuilt once applied during bisection. It's not build time, it's test-run time. Especially if you can't automate the test, or it isn't practical to. That's a legitimate concern - but I don't think we'd see a blowout of patch counts to quite the same degree. At the end of the day, each project finds its own level as to how complex individual patches should be and what should comprise a patchset. Over the past couple of years the QEMU process has evolved into its current form as maintainers and developers have figured out what works best for them, and I don't see that PostgreSQL would be any different in this respect - it takes time to adapt to a new workflow and get it right. 
Having worked on various parts for PostGIS for around 10 years, I've had my head stuck into various parts of the GiST code and have got to know a few parts of it really well. What I find frustrating is that I've come back from a workflow where I've been reviewing/testing patches within months of joining a project because the barrier for entry has been so low, and yet even with much longer experience of the PostgreSQL codebase I feel I can't do the same for patches submitted to the commitfest. And why is this? It's because while I know very specific areas of the code well, many patches span different areas of the source tree of which I have little and/or no experience, which when supplied as a single monolithic patch makes it impossible for me to give meaningful review to all but a tiny proportion of them. I believe that this is one of the main reasons why people struggle to find patch reviewers, with the consequence being that the majority of work falls back onto the CF manager and the committers which is why we have the current situation. ATB, Mark. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Custom timestamp format in logs
On Mon, Dec 15, 2014 at 1:36 AM, Magnus Hagander mag...@hagander.net wrote: On Sat, Dec 13, 2014 at 3:50 PM, Michael Paquier michael.paqu...@gmail.com wrote: Hi all, This week, we heard about a user willing to use a custom timestamp format across a set of services to improve the debuggability of the whole set, Postgres being one of them. Unfortunately datestyle does not take the logs into account. Would it be worth adding a new GUC able to control the timestamp format in the logs? We could still redirect the logs with syslog and have a custom timestamp format there, but in the case of this particular user syslog was a no-go. Looking at the code, the timestamp format is now hardcoded in setup_formatted_log_time and setup_formatted_start_time when calling pg_strftime @ elog.c, so it doesn't seem too complicated to do. Opinions? This thread is here for that. A separate GUC seems kind of weird. Check. Wouldn't it be better with something like %(format)t or such in the log_line_prefix itself in that case? That could also be expanded to other parameters, should we need them? Possible. I am not sure if we will be able to have a new parameter in log_line_prefix as flexible as timestamps for formatting, though. -- Michael
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
2014-12-14 22:18 GMT+07:00 Ali Akbar the.ap...@gmail.com: I ran a test using postgres-US.fo built in the PostgreSQL source tree, which is 38 MB, and ran select unnest(xpath('//fo:bookmark-title', b, array[array['fo', 'http://www.w3.org/1999/XSL/Format']])) from data; (Table contains one row only.) The timings were basically indistinguishable between the three code versions. I'll try to reproduce your test. How big is your file? Do you have a link to the actual file? Could you share your load script? I use this xml sample: http://w3.energistics.org/schema/prodml_v1.2.0_data/XML_Examples/productVolume_no_xsl.xml Basically I loaded the xml into table u 100 times. Load script attached. Peter, while reviewing the better-performing patch myself, I now think the patch needs more work before it can be committed. The structuring of the method will be confusing in the long term. I'll restructure the patch in the next commitfest. So I propose to break the patch up: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work on the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as a PoC of the expected performance. Regards, -- Ali Akbar
Re: [HACKERS] Status of CF 2014-10 and upcoming 2014-12
On Sun, Dec 14, 2014 at 12:29 AM, Andrew Dunstan and...@dunslane.net wrote: Opinions? I will look at clearing out a few of the Ready For Committer patches today and tomorrow. Thank you. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()
On Tue, Dec 9, 2014 at 2:52 AM, Peter Geoghegan p...@heroku.com wrote: On Mon, Dec 8, 2014 at 9:43 AM, Peter Geoghegan p...@heroku.com wrote: I think it's very possible that the wrong alias may be provided by the user, and that we should consider that when providing a hint. Note that the existing mechanism (the mechanism that I'm trying to improve) only ever shows this error message: There is a column named \%s\ in table \%s\, but it cannot be referenced from this part of the query. I think it's pretty clear that this general class of user error is common. Moving this patch to CF 2014-12 as work is still going on, note that it is currently marked with Robert as reviewer and that its current status is Needs review. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Thu, Dec 11, 2014 at 3:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: I don't really have any comments on the algorithms yet, having spent too much time trying to figure out underdocumented data structures to get to the algorithms. However, noting the addition of list_intersection_int() made me wonder whether you'd not be better off reducing the integer lists to bitmapsets a lot sooner, perhaps even at parse analysis. list_intersection_int() is going to be O(N^2) by nature. Maybe N can't get large enough to matter in this context, but I do see places that seem to be concerned about performance. I've not spent any real effort looking at gsp2.patch yet, but it seems even worse off comment-wise: if there's any explanation in there at all of what a chained aggregate is, I didn't find it. I'd also counsel you to find some other way to do it than putting bool chain_head fields in Aggref nodes; that looks like a mess, eg, it will break equal() tests for expression nodes that probably should still be seen as equal. I took a quick look at gsp-u.patch. It seems like that approach should work, with of course the caveat that using CUBE/ROLLUP as function names in a GROUP BY list would be problematic. I'm not convinced by the commentary in ruleutils.c suggesting that extra parentheses would help disambiguate: aren't extra parentheses still going to contain grouping specs according to the standard? Forcibly schema-qualifying such function names seems like a less fragile answer on that end. Based on those comments, I am marking this patch as Returned with Feedback on the CF app for 2014-10. Andrew, feel free to move this entry to CF 2014-12 if you are planning to continue working on it, so that it can get additional review. (Note that this patch's status was Waiting on Author at the time of writing.) Regards, -- Michael
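Tom's O(N^2) point about list_intersection_int() is easy to illustrate: intersecting plain lists scans one list once per element of the other, while a bitmapset/hash does a single pass over each. A small Python sketch (an analogy, not the patch's code):

```python
a = list(range(0, 2000, 2))   # multiples of 2
b = list(range(0, 2000, 3))   # multiples of 3

# list_intersection_int-style: for each x in a, linearly scan b -> O(N*M)
quadratic = [x for x in a if x in b]

# bitmapset-style: hashed membership, one pass of each input -> O(N + M)
linear = sorted(set(a) & set(b))

assert quadratic == linear    # both are the multiples of 6 below 2000
```

Same result, very different cost curve, which is why converting the integer lists to bitmapsets early (even at parse analysis) is attractive if N can grow.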
Re: [HACKERS] inherit support for foreign tables
On Thu, Dec 11, 2014 at 2:54 PM, Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: On Thu, Dec 11, 2014 at 8:39 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: Hi Ashutosh, Thanks for the review! (2014/12/10 14:47), Ashutosh Bapat wrote: We haven't heard anything from Horiguchi-san and Hanada-san for almost a week. So, I am fine marking it as ready for committer. What do you say? Moving this patch to CF 2014-12 with the same status. Let's get a committer having a look at it. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GiST kNN search queue (Re: KNN-GiST with recheck)
On Thu, Dec 11, 2014 at 12:50 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/28/2014 04:12 PM, Alexander Korotkov wrote: 3. A binary heap would be a better data structure to buffer the rechecked values. A Red-Black tree allows random insertions and deletions, but in this case you need to insert arbitrary values but only remove the minimum item. That's exactly what a binary heap excels at. We have a nice binary heap implementation in the backend that you can use, see src/backend/lib/binaryheap.c. Hmm. For me binary heap would be a better data structure for KNN-GiST at all :-) I decided to give this a shot, replacing the red-black tree in GiST with the binary heap we have in lib/binaryheap.c. It made the GiST code somewhat simpler, as the binaryheap interface is simpler than the red-black tree one. Unfortunately, performance was somewhat worse. That was quite surprising, as insertions and deletions are both O(log N) in both data structures, but the red-black tree implementation is more complicated. I implemented another data structure called a Pairing Heap. It's also a fairly simple data structure, but insertions are O(1) instead of O(log N). It also performs fairly well in practice. With that, I got a small but measurable improvement. To test, I created a table like this: create table gisttest (id integer, p point); insert into gisttest select id, point(random(), random()) from generate_series(1, 100) id; create index i_gisttest on gisttest using gist (p); And I ran this query with pgbench: select id from gisttest order by p <-> '(0,0)' limit 1000; With unpatched master, I got about 650 TPS, and with the patch 720 TPS. That's a nice little improvement, but perhaps more importantly, the pairing heap implementation consumes less memory. To measure that, I put a MemoryContextStats(so->queueCtx) call into gistendscan.
With the above query, but without the limit clause, on master I got: GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998 chunks); 21296 used And with the patch: GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502 chunks); 21072 used That's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice to reduce that memory consumption, as there is no hard upper bound on how much might be needed. If the GiST tree is really disorganized for some reason, a query might need a lot more. So all in all, I quite like this patch, even though it doesn't do anything too phenomenal. It adds some code, in the form of the new pairing heap implementation, but it makes the GiST code a little bit simpler. And it gives a small performance gain, and reduces memory usage a bit. Hum. It looks like this patch using a binary heap is intended as a replacement for the red-black tree method. Any reason why it isn't added to the CF to track it? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
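For readers unfamiliar with the structure Heikki mentions, a pairing heap is compact enough to sketch in full. The following is a hypothetical Python illustration of the data structure only, not the C implementation discussed in the thread: insert (and find-min) cost a single comparison, and delete-min does the two-pass "pairing" merge that gives the heap its name.

```python
# Minimal pairing-heap sketch. Insert and find-min are O(1); delete-min is
# amortized O(log n) via the two-pass pairing merge of the root's children.
class PairingHeap:
    class Node:
        __slots__ = ("key", "children")
        def __init__(self, key):
            self.key = key
            self.children = []

    def __init__(self):
        self.root = None
        self.size = 0

    def _meld(self, a, b):
        # Merge two heaps: the larger-keyed root becomes a child of the other.
        if a is None: return b
        if b is None: return a
        if b.key < a.key:
            a, b = b, a
        a.children.append(b)
        return a

    def push(self, key):
        # O(1): meld a singleton node with the current root.
        self.root = self._meld(self.root, self.Node(key))
        self.size += 1

    def pop_min(self):
        # Pass 1: meld the children pairwise, left to right.
        # Pass 2: fold the resulting heaps together, right to left.
        node, self.size = self.root, self.size - 1
        kids = node.children
        paired = [self._meld(kids[i], kids[i + 1] if i + 1 < len(kids) else None)
                  for i in range(0, len(kids), 2)]
        merged = None
        for h in reversed(paired):
            merged = self._meld(h, merged)
        self.root = merged
        return node.key

import random
h = PairingHeap()
data = [random.random() for _ in range(1000)]
for x in data:
    h.push(x)
out = [h.pop_min() for _ in range(len(data))]
print(out == sorted(data))  # True
```

The cheap insert is what pays off in a KNN search queue, where many candidate items are pushed but only the nearest ones are ever popped.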
Re: [HACKERS] KNN-GiST with recheck
On Tue, Jan 28, 2014 at 10:54 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: 1. This patch introduces a new polygon <-> point operator. That seems useful on its own, with or without this patch. This patch is tracked with this entry in the commit fest app and is marked as Ready for committer. Hence I am moving this specific part to 2014-12 to keep track of it: 3. A binary heap would be a better data structure to buffer the rechecked values. A Red-Black tree allows random insertions and deletions, but in this case you need to insert arbitrary values but only remove the minimum item. That's exactly what a binary heap excels at. We have a nice binary heap implementation in the backend that you can use, see src/backend/lib/binaryheap.c. Based on those comments, I am marking this entry as returned with feedback: https://commitfest.postgresql.org/action/patch_view?id=1367 Heikki has also sent a new patch to use a binary heap method instead of the red-black tree here: http://www.postgresql.org/message-id/54886bb8.9040...@vmware.com IMO this last patch should be added in the CF app, which is not the case now. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
Here is a patch series to move pg_archivecleanup pg_standby pg_test_fsync pg_test_timing pg_upgrade pg_xlogdump pgbench from contrib/ to src/bin/. Most people didn't like moving oid2name and vacuumlo, which I understand. So I'll leave those for now. I have also included a patch to move pg_upgrade_support into the backend, as discussed. Open issues/discussion points: - no Windows build system support yet - I didn't move the source of the man pages to .../sgml/ref. This could be done. - I didn't rename the SGML ids of the man pages to fit the pattern of the other applications, because that would change the name of the HTML pages. - We have traditionally kept an individual author acknowledgement in the man pages for contrib items. Should those be removed now? From 0a4b9a4eaf930b8e14b4b3e2077ca7706a65ab6a Mon Sep 17 00:00:00 2001 From: Peter Eisentraut peter_e@gmx.net Date: Sun, 14 Dec 2014 20:41:58 -0500 Subject: [PATCH 1/9] Sort SUBDIRS variable in src/bin/Makefile The previous order appears to have been historically grown randomness. --- src/bin/Makefile | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/src/bin/Makefile b/src/bin/Makefile index f03cc42..f0589f3 100644 --- a/src/bin/Makefile +++ b/src/bin/Makefile @@ -13,8 +13,16 @@ subdir = src/bin top_builddir = ../..
include $(top_builddir)/src/Makefile.global -SUBDIRS = initdb pg_ctl pg_dump \ - psql scripts pg_config pg_controldata pg_resetxlog pg_basebackup +SUBDIRS = \ + initdb \ + pg_basebackup \ + pg_config \ + pg_controldata \ + pg_ctl \ + pg_dump \ + pg_resetxlog \ + psql \ + scripts ifeq ($(PORTNAME), win32) SUBDIRS += pgevent -- 2.2.0 From 403955c76136960d7d444e946ba2c20110d7e263 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut peter_e@gmx.net Date: Sun, 14 Dec 2014 20:41:58 -0500 Subject: [PATCH 2/9] Move pg_archivecleanup from contrib/ to src/bin/ --- contrib/Makefile | 1 - contrib/pg_archivecleanup/Makefile | 18 -- doc/src/sgml/contrib.sgml | 1 - doc/src/sgml/reference.sgml | 1 + src/bin/Makefile | 1 + {contrib => src/bin}/pg_archivecleanup/.gitignore | 0 src/bin/pg_archivecleanup/Makefile | 14 ++ .../bin}/pg_archivecleanup/pg_archivecleanup.c | 2 +- 8 files changed, 17 insertions(+), 21 deletions(-) delete mode 100644 contrib/pg_archivecleanup/Makefile rename {contrib => src/bin}/pg_archivecleanup/.gitignore (100%) create mode 100644 src/bin/pg_archivecleanup/Makefile rename {contrib => src/bin}/pg_archivecleanup/pg_archivecleanup.c (99%) diff --git a/contrib/Makefile b/contrib/Makefile index 195d447..c56050e 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -28,7 +28,6 @@ SUBDIRS = \ oid2name \ pageinspect \ passwordcheck \ - pg_archivecleanup \ pg_buffercache \ pg_freespacemap \ pg_prewarm \ diff --git a/contrib/pg_archivecleanup/Makefile b/contrib/pg_archivecleanup/Makefile deleted file mode 100644 index ab52390..000 --- a/contrib/pg_archivecleanup/Makefile +++ /dev/null @@ -1,18 +0,0 @@ -# contrib/pg_archivecleanup/Makefile - -PGFILEDESC = pg_archivecleanup - cleans archive when used with streaming replication -PGAPPICON = win32 - -PROGRAM = pg_archivecleanup -OBJS = pg_archivecleanup.o $(WIN32RES) - -ifdef USE_PGXS -PG_CONFIG = pg_config -PGXS := $(shell $(PG_CONFIG) --pgxs) -include $(PGXS) -else -subdir = contrib/pg_archivecleanup -top_builddir = ../..
-include $(top_builddir)/src/Makefile.global -include $(top_srcdir)/contrib/contrib-global.mk -endif diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml index a698d0f..f21fa14 100644 --- a/doc/src/sgml/contrib.sgml +++ b/doc/src/sgml/contrib.sgml @@ -202,7 +202,6 @@ <title>Server Applications</title> part of the core <productname>PostgreSQL</productname> distribution. </para> - &pgarchivecleanup; &pgstandby; &pgtestfsync; &pgtesttiming; diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 10c9a6d..62267db 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -257,6 +257,7 @@ <title>PostgreSQL Server Applications</title> </partintro> &initdb; + &pgarchivecleanup; &pgControldata; &pgCtl; &pgResetxlog; diff --git a/src/bin/Makefile b/src/bin/Makefile index f0589f3..0e7b183 100644 --- a/src/bin/Makefile +++ b/src/bin/Makefile @@ -15,6 +15,7 @@ include $(top_builddir)/src/Makefile.global SUBDIRS = \ initdb \ + pg_archivecleanup \ pg_basebackup \ pg_config \ pg_controldata \ diff --git a/contrib/pg_archivecleanup/.gitignore b/src/bin/pg_archivecleanup/.gitignore similarity index 100% rename from contrib/pg_archivecleanup/.gitignore rename to src/bin/pg_archivecleanup/.gitignore diff --git a/src/bin/pg_archivecleanup/Makefile b/src/bin/pg_archivecleanup/Makefile new file mode 100644 index
Re: [HACKERS] pgcrypto: PGP signatures
On Wed, Nov 12, 2014 at 7:05 AM, Jeff Janes jeff.ja...@gmail.com wrote: On Sat, Nov 1, 2014 at 7:52 AM, Marko Tiikkaja ma...@joh.to wrote: Hi, I discovered a problem with the lack of MDC handling in the signature info extraction code, so I've fixed that and added a test message. v9 here. Hi Marko, I get a segfault when the length of the message is exactly 16308 bytes, see attached perl script. I can't get a backtrace; for some reason it acts as if there were no debug symbols, despite that I built with them. I've not seen that before. I get this whether or not the bug 11905 patch is applied, so the problem seems to be related but different. This patch's status was Ready for committer but it still visibly has some bugs, and it has not been updated in a while, as pointed out by Jeff. So I am switching it to Returned with feedback. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] moving from contrib to bin
On Mon, Dec 15, 2014 at 10:59 AM, Peter Eisentraut pete...@gmx.net wrote: - no Windows build system support yet Do you need some help here? I am getting worried about potential breakage with the version information built with MSVC and MinGW.. - We have traditionally kept an individual author acknowledgement in the man pages for contrib items. Should those be removed now? +1. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}
On Fri, Dec 12, 2014 at 6:39 AM, Robert Haas robertmh...@gmail.com wrote: It's probably something we should add, but there's enough to do getting the basic feature working first. Moving this patch to CF 2014-12 as work is still going on. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On partitioning
Alvaro wrote: Claudio Freire wrote: Fair enough, but that's not the same as not requiring easy proofs. The planner might not be the one doing the proofs, but you still need proofs. Even if the proving method is hardcoded into the partitioning method, as in the case of list or range partitioning, it's still a proof. With arbitrary functions (which is what prompted me to mention proofs) you can't do that. A function works very well for inserting, but not for selecting. I could be wrong though. Maybe there's a way to turn SQL functions into analyzable things? But it would still be very easy to shoot yourself in the foot by writing one that is too complex. Arbitrary SQL expressions (including functions) are not the thing to use for partitioning -- at least that's how I understand this whole discussion. I don't think you want to do proofs as such -- they are expensive. This means if a user puts arbitrary expressions in a partition definition, say, ... FOR VALUES extract(month from current_date) TO extract(month from current_date + interval '3 months'), we make sure that those expressions are pre-computed to literal values. The exact time when that happens is open for discussion I guess. It could be either DDL time or, if feasible, during relation cache building, when we compute the value from the pg_node_tree of this expression, which we may choose to store in the partition definition catalog. The former entails an obvious challenge of figuring out how we store the computed value into the catalog (pg_node_tree of a Const?). To make this discussion a bit clearer, there are two things to distinguish: one is routing tuples, when an INSERT or COPY command references the partitioned table, into the individual partitions (ingress); the other is deciding which partitions to read when a SELECT query wants to read tuples from the partitioned table (egress). On ingress, what you want is something like being able to do something on the tuple that tells you which partition it belongs in.
Ideally this is something much lighter than running an expression; if you can just apply an operator to the partitioning column values, that should be plenty fast. This requires no proof. And I am thinking this is all executor stuff. On egress you need some direct way to compare the scan quals with the partitioning values. I would imagine this to be similar to how scan quals are compared to the values stored in a BRIN index: each scan qual has a corresponding operator strategy and a scan key, and you can say aye or nay based on a small set of operations that can be run cheaply, again without any proof or running arbitrary expressions. My knowledge of this is far from perfect, though. To clear up any confusion: as far as planning is concerned, I could not imagine how the index access method way of pruning partitions could be made to work. Of course, I may be missing something. When you say a scan qual has a corresponding operator strategy, I'd think that is part of the scan key in the executor, no? Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
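The ingress/egress split described above can be made concrete with a toy sketch. This is hypothetical Python for illustration only (invented names; simple range partitioning with an assumed global lower bound of 0), not a proposed design: routing applies one cheap comparison per tuple, while pruning compares a scan qual against each partition's stored bounds, with no expression proofs anywhere.

```python
import bisect

# Toy range partitioning: bounds[i] is the exclusive upper bound of
# partition i, so the ranges are [0,10), [10,20), [20,30).
bounds = [10, 20, 30]

def route(value):
    """Ingress: pick the target partition with one binary search."""
    idx = bisect.bisect_right(bounds, value)
    if idx == len(bounds):
        raise ValueError("no partition for value %r" % value)
    return idx

def prune(op, value):
    """Egress: given a scan qual like (col < 25), keep only partitions
    whose range can contain matching tuples -- one cheap bound comparison
    per partition, similar in spirit to consulting BRIN summary values."""
    lowers = [0] + bounds[:-1]
    keep = []
    for i, (lo, hi) in enumerate(zip(lowers, bounds)):
        if op == "<" and lo < value:
            keep.append(i)
        elif op == ">=" and hi > value:
            keep.append(i)
    return keep

print(route(15))        # 1
print(prune("<", 25))   # [0, 1, 2]
print(prune(">=", 20))  # [2]
```

Note that nothing here evaluates a user expression at scan time; the bounds are pre-computed literals, which is exactly what the pre-computation argument above is after.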
Re: [HACKERS] [PATCH] add ssl_protocols configuration option
On Thu, Nov 27, 2014 at 8:51 PM, Alex Shulgin a...@commandprompt.com wrote: Dag-Erling Smørgrav d...@des.no writes: Alex Shulgin a...@commandprompt.com writes: OK, looks like I've come up with something workable: I've added sslprotocol connection string keyword similar to pre-existing sslcompression, etc. Please see attached v2 of the original patch. I'm having doubts about the name of openssl.h header though, libpq-openssl.h? Perhaps ssloptions.[ch], unless you plan to add non-option-related code there later? I don't think anything else than common options-related code fits in there, so ssloptions.c makes sense to me. BTW, there is no Regent code in your openssl.c, so the copyright statement is incorrect. Good catch, I just blindly copied that from some other file. There have been arguments in favor and against this patch... But marking it as returned with feedback because of a lack of activity in the last couple of weeks. Feel free to update if you think that's not correct, and please move this patch to commit fest 2014-12. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: dynahash replacement for buffer table
There hasn't been much activity on this patch these days, and Andres has provided some feedback, hence switching to Returned with feedback. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote: Peter, while reviewing the better performing patch myself, now i think the patch needs more work to be committed. The structuring of the method will be confusing in the long term. I think I'll restructure the patch in the next commitfest. So i propose to break the patch: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work with the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as PoC of the expected performance. Ali, are you currently working on that? Would you mind re-creating new entries in the commit fest app for the new set of patches that you are planning to do? For now I am switching this patch as returned with feedback. Thanks, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add generate_series(numeric, numeric)
Fujii == Fujii Masao masao.fu...@gmail.com writes: Fujii Pushed. Bug found: regression=# select count(*) from generate_series(1::numeric,10) v, generate_series(1,v) w; count --- 0 (1 row) regression=# select count(*) from generate_series(1::numeric,10) v, generate_series(1,v+0) w; count --- 55 (1 row) The error is in the use of PG_GETARG_NUMERIC and init_var_from_num when setting up the multi-call state; init_var_from_num points at the original num's digits rather than copying them, but start_num and stop_num have just been (potentially) detoasted in the per-call context, in which case the storage will have been freed by the next call. Obviously this could also be fixed by not detoasting the input until after switching to the multi-call context, but it looks to me like that would be unnecessarily complex. Suggested patch attached. (Is it also worth putting an explicit warning about this kind of thing in the SRF docs?) -- Andrew (irc:RhodiumToad) diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index c73f9bc..d841b6f 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -1325,11 +1325,16 @@ generate_series_step_numeric(PG_FUNCTION_ARGS) /* * Use fctx to keep state from call to call. Seed current with the - * original start value. + * original start value. We must copy the start_num and stop_num + * values rather than pointing to them, since we may have detoasted + * them in the per-call context. 
*/ - init_var_from_num(start_num, &fctx->current); - init_var_from_num(stop_num, &fctx->stop); + init_var(&fctx->current); + init_var(&fctx->stop); init_var(&fctx->step); + + set_var_from_num(start_num, &fctx->current); + set_var_from_num(stop_num, &fctx->stop); set_var_from_var(&steploc, &fctx->step); funcctx->user_fctx = fctx; diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out index ee6cb50..9d68145 100644 --- a/src/test/regress/expected/numeric.out +++ b/src/test/regress/expected/numeric.out @@ -1461,3 +1461,41 @@ select (i / (10::numeric ^ 131071))::numeric(1,0) 9 (4 rows) +-- Check usage with variables +select * from generate_series(1::numeric, 3::numeric) i, generate_series(i,3) j; + i | j +---+--- + 1 | 1 + 1 | 2 + 1 | 3 + 2 | 2 + 2 | 3 + 3 | 3 +(6 rows) + +select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,i) j; + i | j +---+--- + 1 | 1 + 2 | 1 + 2 | 2 + 3 | 1 + 3 | 2 + 3 | 3 +(6 rows) + +select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,5,i) j; + i | j +---+--- + 1 | 1 + 1 | 2 + 1 | 3 + 1 | 4 + 1 | 5 + 2 | 1 + 2 | 3 + 2 | 5 + 3 | 1 + 3 | 4 +(10 rows) + diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql index a7e92ac..1633e4c 100644 --- a/src/test/regress/sql/numeric.sql +++ b/src/test/regress/sql/numeric.sql @@ -854,3 +854,7 @@ select (i / (10::numeric ^ 131071))::numeric(1,0) from generate_series(6 * (10::numeric ^ 131071), 9 * (10::numeric ^ 131071), 10::numeric ^ 131071) as a(i); +-- Check usage with variables +select * from generate_series(1::numeric, 3::numeric) i, generate_series(i,3) j; +select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,i) j; +select * from generate_series(1::numeric, 3::numeric) i, generate_series(1,5,i) j; -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
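The underlying bug pattern described above — multi-call state pointing at storage that only lives for one call — is a general one, not specific to numeric.c. A hypothetical Python analogue (invented names, illustration only) of pointing versus copying:

```python
# Analogue of the SRF bug: a per-call buffer is recycled between calls,
# so state that must survive across calls has to copy it, not alias it.
def make_state_aliasing(buf):
    return {"digits": buf}           # points at the caller's buffer (the bug)

def make_state_copying(buf):
    return {"digits": list(buf)}     # takes a private copy (the fix)

percall = [1, 0]                     # pretend this is detoasted storage
bad = make_state_aliasing(percall)
good = make_state_copying(percall)
percall[:] = [9, 9]                  # per-call context recycled before next call
print(bad["digits"])                 # [9, 9] -- state silently corrupted
print(good["digits"])                # [1, 0] -- still the original start value
```

This is why the patch switches from init_var_from_num (which points at the Numeric's digit array) to init_var plus set_var_from_num (which copies it).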
Re: [HACKERS] alter user/role CURRENT_USER
On Fri, Nov 14, 2014 at 5:39 PM, Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp wrote: Hi, this is the revised version. Kyotaro HORIGUCHI wrote: - Storage for new information The new struct NameId stores an identifier which tells what it logically is, using the new enum NameIdTypes. I think NameId is a bad name for this. My point is that NameId, as it stands, might be a name for anything, not just a role; and the object it identifies is not an Id either. Maybe RoleSpec? Yeah! I felt it was no good even if it were a generic type for various names of something or their oids. RoleSpec sounds much better. Do we need a public_ok argument to get_nameid_oid() (get a better name for this function too) Maybe get_rolespec_oid() as a name after its parameter type? so that callers don't have to check for InvalidOid argument? I think the arrangement you propose is not very convenient; it'd be best to avoid duplicating the check for InvalidOid in all callers of the new function, particularly where there was no check before. I agree that it'd be better to keep away from duplicated InvalidOid checks, but public_ok seems a bit myopic. Since there's no reasonable border between functions accepting 'public' and others, such a solution would not be reasonable. What about checking whether it is PUBLIC or not *before* calling get_rolespec_oid()? The attached patch is modified in the following points. - rename NameId to RoleSpec and NameIdType to RoleSpecTypes. - rename get_nameid_oid() to get_rolespec_oid(). - rename roleNamesToIds() to roleSpecsToIds(). - some struct members are changed, such as authname to authrole. - check if the rolespec is public or not before calling get_rolespec_oid() - ExecAlterDefaultPrivilegesStmt and ExecuteGrantStmt do slightly different things about ACL_ID_PUBLIC but I unified it to the latter. - rebased to the current master A new patch has been sent here but no review occurred, hence moving this item to CF 2014-12.
-- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
Michael == Michael Paquier michael.paqu...@gmail.com writes: Michael Based on those comments, I am marking this patch as Michael Returned with Feedback on the CF app for 2014-10. Andrew, Michael feel free to move this entry to CF 2014-12 if you are Michael planning to continue working on it so as it would get Michael additional review. (Note that this patch status was Waiting Michael on Author when writing this text). Moved it to 2014-12 and set it back to waiting on author. We expect to submit a revised version, though I have no timescale yet. -- Andrew (irc:RhodiumToad) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
Amit Kapila amit.kapil...@gmail.com writes: On Mon, Dec 15, 2014 at 5:39 AM, Tom Lane t...@sss.pgh.pa.us wrote: What makes you think quote isn't allowed in tablespace paths? Below part of code makes me think that quote is not allowed. Oid CreateTableSpace(CreateTableSpaceStmt *stmt) { .. /* disallow quotes, else CREATE DATABASE would be at risk */ if (strchr(location, '\'')) ereport(ERROR, (errcode(ERRCODE_INVALID_NAME), errmsg("tablespace location cannot contain single quotes"))); } Hm, I think that's left over from defending a *very* ancient version of CREATE DATABASE. In any case, as I mentioned, any limitations we might be putting on tablespace paths during SQL-level operation are pretty much a dead letter. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Final Patch for GROUPING SETS
On Mon, Dec 15, 2014 at 12:28 PM, Andrew Gierth and...@tao11.riddles.org.uk wrote: Michael == Michael Paquier michael.paqu...@gmail.com writes: Michael Based on those comments, I am marking this patch as Michael Returned with Feedback on the CF app for 2014-10. Andrew, Michael feel free to move this entry to CF 2014-12 if you are Michael planning to continue working on it so as it would get Michael additional review. (Note that this patch status was Waiting Michael on Author when writing this text). Moved it to 2014-12 and set it back to waiting on author. We expect to submit a revised version, though I have no timescale yet. OK thanks for the update. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Escaping from blocked send() reprised.
On Thu, Oct 30, 2014 at 10:27 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 10/03/2014 06:29 PM, Heikki Linnakangas wrote: [blah] About patch 3: Looking closer, this design still looks OK to me. As you said yourself, the comments need some work (e.g. the step 5. in the top comment in async.c needs updating). And then there are a couple of FIXME and XXX comments that need to be addressed. Those patches have not been updated in a while, and I am seeing some feedback from several people, hence I am marking this as Returned with feedback. Horiguchi-san, feel free to add new entries on the CF app in 2014-12 or move this entry if you feel otherwise. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] alter user set local_preload_libraries.
On Tue, Dec 9, 2014 at 11:56 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Dec 8, 2014 at 9:18 PM, Tom Lane t...@sss.pgh.pa.us wrote: Barring someone committing to spend the time to improve that situation (time that would be poorly invested IMO), I don't think that we want to open up ignore_system_indexes as USERSET, or do anything else to encourage its use. If we're intent on removing PGC_BACKEND then I'd be okay with reclassifying ignore_system_indexes as SUSET; but I'm not exactly convinced that we should be trying to get rid of PGC_BACKEND. Well, if you want to discourage its use, I think there's an argument that marking it as SUSET would be more restrictive than what we have at present, since it would altogether prohibit non-superuser use. I'm not wedded to the idea of getting rid of PGC_BACKEND, but I do like it. Peter's survey of the landscape seems to show that there's very little left in that category and the stuff that is there is somewhat uninspiring. And simplifying things is always nice. Documentation fixes for the use of local_preload_libraries have been pushed, and now there has been some wider discussion about changing the mode of a couple of parameters since PGC_SU_BACKEND was introduced. Any objections to switching this patch to Returned with feedback? The discussion here is wider than the simple use of local_preload_libraries in any case. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RLS with check option - surprised design
On Sat, Nov 22, 2014 at 5:59 AM, Stephen Frost sfr...@snowman.net wrote: * Peter Geoghegan (p...@heroku.com) wrote: On Fri, Nov 21, 2014 at 6:57 AM, Stephen Frost sfr...@snowman.net wrote: [blah] [re-blah] [re-re-blah] This patch has already been committed, but there are still many concerns shared, so moving it to CF 2014-12 as it needs more review. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Status of CF 2014-10 and upcoming 2014-12
On Sun, Dec 14, 2014 at 12:00 AM, Michael Paquier michael.paqu...@gmail.com wrote: In any case, I will be able to do that on Monday. Opinions? So I have been through all the remaining 20~25 patches of CF 2014-10, and roughly moved them to the next CF 2014-12 or returned them with feedback depending on their state and their stale period. If you feel that your patch has not been treated correctly or has faced an injustice, feel free to complain on this thread or on the thread of the related patch. In any case, let's now move on to CF 2014-12, and please be sure to check the status of the patch you submitted previously if it is still in the works. PS: Could someone close CF 2014-10, btw? Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com: On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote: Peter, while reviewing the better performing patch myself, now i think the patch needs more work to be committed. The structuring of the method will be confusing in the long term. I think I'll restructure the patch in the next commitfest. So i propose to break the patch: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work with the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as PoC of the expected performance. Ali, are you currently working on that? Would you mind re-creating new entries in the commit fest app for the new set of patches that you are planning to do? For now I am switching this patch as returned with feedback. Thanks, What i mean, the last patch (v7 patch) as it is is enough to fix the bug (nested xpath namespace problem). I think the performance regression is still acceptable (in my case it's ~20%), because the bug is severe. Currently, xpath can return invalid xml because the namespace isn't included in the output! What i'll be working is the v4 patch, because it turns out the v4 patch has better performance (~10%, but Peter's test shows it isn't the case). But, the problem is the v4 patch is organized wrongly, and hacks around the libxml's xml node structure (duplicating the namespace on the existing structure). I'll work on that, but it will affects the performance benefit. So what i propose is, we close the longstanding bug in this comitfest with the v7 patch. I'll work on improving the performance, without compromising good code structure. If the result is good, i'll submit the patch. Thanks Regards, -- Ali Akbar
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
2014-12-15 11:02 GMT+07:00 Ali Akbar the.ap...@gmail.com: 2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com: On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote: Peter, while reviewing the better performing patch myself, now i think the patch needs more work to be committed. The structuring of the method will be confusing in the long term. I think I'll restructure the patch in the next commitfest. So i propose to break the patch: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work with the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as PoC of the expected performance. Ali, are you currently working on that? Would you mind re-creating new entries in the commit fest app for the new set of patches that you are planning to do? For now I am switching this patch as returned with feedback. Thanks, What i mean, the last patch (v7 patch) as it is is enough to fix the bug (nested xpath namespace problem). I think the performance regression is still acceptable (in my case it's ~20%), because the bug is severe. Currently, xpath can return invalid xml because the namespace isn't included in the output! Sorry, typo here. What i mean isn't invalid xml, but incomplete xml. Hence the next call to xpath with the previous result as its input will become a problem because the namespace will not match. What i'll be working is the v4 patch, because it turns out the v4 patch has better performance (~10%, but Peter's test shows it isn't the case). But, the problem is the v4 patch is organized wrongly, and hacks around the libxml's xml node structure (duplicating the namespace on the existing structure). I'll work on that, but it will affects the performance benefit. So what i propose is, we close the longstanding bug in this comitfest with the v7 patch. 
I'll work on improving the performance, without compromising good code structure. If the result is good, I'll submit the patch. Thanks Regards, -- Ali Akbar
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar the.ap...@gmail.com wrote: 2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com: On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote: Peter, while reviewing the better performing patch myself, now i think the patch needs more work to be committed. The structuring of the method will be confusing in the long term. I think I'll restructure the patch in the next commitfest. So i propose to break the patch: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work with the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as PoC of the expected performance. Ali, are you currently working on that? Would you mind re-creating new entries in the commit fest app for the new set of patches that you are planning to do? For now I am switching this patch as returned with feedback. Thanks, What i mean, the last patch (v7 patch) as it is is enough to fix the bug (nested xpath namespace problem). I think the performance regression is still acceptable (in my case it's ~20%), because the bug is severe. Currently, xpath can return invalid xml because the namespace isn't included in the output! What i'll be working is the v4 patch, because it turns out the v4 patch has better performance (~10%, but Peter's test shows it isn't the case). But, the problem is the v4 patch is organized wrongly, and hacks around the libxml's xml node structure (duplicating the namespace on the existing structure). I'll work on that, but it will affects the performance benefit. So what i propose is, we close the longstanding bug in this comitfest with the v7 patch. I'll work on improving the performance, without compromising good code structure. If the result is good, i'll submit the patch. OK. Could you then move this patch to the new CF with Needs Review with Peter as reviewer then? 
He seems to be looking at it anyway seeing the update from 12/11. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions
2014-12-15 11:06 GMT+07:00 Michael Paquier michael.paqu...@gmail.com: On Mon, Dec 15, 2014 at 1:02 PM, Ali Akbar the.ap...@gmail.com wrote: 2014-12-15 10:19 GMT+07:00 Michael Paquier michael.paqu...@gmail.com: On Mon, Dec 15, 2014 at 9:05 AM, Ali Akbar the.ap...@gmail.com wrote: Peter, while reviewing the better performing patch myself, now i think the patch needs more work to be committed. The structuring of the method will be confusing in the long term. I think I'll restructure the patch in the next commitfest. So i propose to break the patch: 1. We apply the current patch which uses xmlNodeCopy, so that the long-standing bug will be fixed in postgres. 2. I'll work with the performance enhancement in the next commitfest. Maybe for (2), the current better-performing patch can be viewed as PoC of the expected performance. Ali, are you currently working on that? Would you mind re-creating new entries in the commit fest app for the new set of patches that you are planning to do? For now I am switching this patch as returned with feedback. Thanks, What i mean, the last patch (v7 patch) as it is is enough to fix the bug (nested xpath namespace problem). I think the performance regression is still acceptable (in my case it's ~20%), because the bug is severe. Currently, xpath can return invalid xml because the namespace isn't included in the output! What i'll be working is the v4 patch, because it turns out the v4 patch has better performance (~10%, but Peter's test shows it isn't the case). But, the problem is the v4 patch is organized wrongly, and hacks around the libxml's xml node structure (duplicating the namespace on the existing structure). I'll work on that, but it will affects the performance benefit. So what i propose is, we close the longstanding bug in this comitfest with the v7 patch. I'll work on improving the performance, without compromising good code structure. If the result is good, i'll submit the patch. OK. 
Could you then move this patch to the new CF with Needs Review with Peter as reviewer then? He seems to be looking at it anyway seeing the update from 12/11. OK. Moved to the new CF. Regards, -- Ali Akbar
Re: [HACKERS] pg_basebackup vs. Windows and tablespaces
On Sat, Dec 13, 2014 at 9:41 PM, Andrew Dunstan and...@dunslane.net wrote: On 11/20/2014 02:27 AM, Amit Kapila wrote: On Wed, Nov 19, 2014 at 11:46 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Nov 18, 2014 at 9:19 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Right, but they provide the same functionality as symlinks, and now we are even planning to provide this feature for both Linux and Windows, as both Tom and Robert seem to feel it's better that way. Anyhow, I think naming any entity generally differs based on an individual's perspective, so we can go with the name which appeals to more people. In case nobody else has any preference, I will change it to what both of us can agree upon (either 'tablespace catalog', 'tablespace_info' ...). Well, I have made my argument. Since you're the submitter, feel free to select what you think is the best name. For what it's worth, I, too, dislike having symlink in the name. Maybe tablespace_map? Sounds good to me as well. To summarize the situation of this patch, I have received the below comments, on which I am planning to work:
1. Change the name of the file containing tablespace path information.
2. Store the tablespace name as well, along with oid and path, to make the information human-readable.
3. Make the code generic (remove the #ifdef Win32 macros, change the comments referring to this functionality as Windows-only, and see if any more changes are required to make it work on Linux).
Now the part where I would like to receive feedback before revising the patch is the coding style. It seems to me from Tom's comments that he is not happy with the code, but I am not sure which part of the patch he thinks needs change. Tom, if possible, could you be slightly more specific about your concern w.r.t. the code? I have attached a rebased (on top of commit 8d7af8f) patch, just in case someone wants to apply and check it.
In view of the request above for comments from Tom, I have moved this back to Needs Review. I am working on fixing the review comments, but I think I won't be able to handle all of them, as there is still discussion going on for some of them; I intend to work on it for the CF starting today. So I have moved this patch to CF (2014-12). With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
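As a concrete, purely hypothetical sketch of items 1 and 2 above: whatever the file ends up being called (tablespace_map or otherwise), a recovery-side reader for a one-entry-per-line "oid path" format could look like the function below. The file name, field order, and separator here are assumptions for illustration, not what the patch implements:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical on-disk line format for the proposed tablespace_map file:
 * one "<oid> <tablespace path>" pair per line.  The real file name, field
 * set, and separator were still under discussion in this thread. */
typedef struct
{
	unsigned int oid;
	char		path[1024];
} TablespaceMapEntry;

/* Parse one line; returns 1 on success, 0 on malformed input. */
int
parse_tablespace_map_line(const char *line, TablespaceMapEntry *e)
{
	char	   *end;
	unsigned long oid = strtoul(line, &end, 10);
	size_t		len;

	if (end == line || *end != ' ')
		return 0;				/* no oid, or missing separator */
	end++;						/* skip the separator */
	len = strcspn(end, "\n");
	if (len == 0 || len >= sizeof e->path)
		return 0;				/* empty or overlong path */
	memcpy(e->path, end, len);
	e->path[len] = '\0';
	e->oid = (unsigned int) oid;
	return 1;
}
```

A symlink-free restore would then iterate over such entries instead of calling readlink() on pg_tblspc, which is the point of the whole exercise.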
Re: [HACKERS] Add generate_series(numeric, numeric)
2014-12-15 10:25 GMT+07:00 Andrew Gierth and...@tao11.riddles.org.uk:

>>>>> "Fujii" == Fujii Masao masao.fu...@gmail.com writes:

 Fujii> Pushed.

Bug found:

regression=# select count(*) from generate_series(1::numeric,10) v, generate_series(1,v) w;
 count
-------
     0
(1 row)

regression=# select count(*) from generate_series(1::numeric,10) v, generate_series(1,v+0) w;
 count
-------
    55
(1 row)

The error is in the use of PG_GETARG_NUMERIC and init_var_from_num when setting up the multi-call state; init_var_from_num points at the original num's digits rather than copying them, but start_num and stop_num have just been (potentially) detoasted in the per-call context, in which case the storage will have been freed by the next call. Oops. Obviously this could also be fixed by not detoasting the input until after switching to the multi-call context, but it looks to me like that would be unnecessarily complex. Suggested patch attached.

Thanks.

(Is it also worth putting an explicit warning about this kind of thing in the SRF docs?)

I think yes, it would be good. The alternative is restructuring this paragraph in the SRF docs:

The memory context that is current when the SRF is called is a transient context that will be cleared between calls. This means that you do not need to call pfree on everything you allocated using palloc; it will go away anyway. However, if you want to allocate any data structures to live across calls, you need to put them somewhere else. The memory context referenced by multi_call_memory_ctx is a suitable location for any data that needs to survive until the SRF is finished running. In most cases, this means that you should switch into multi_call_memory_ctx while doing the first-call setup.

The important part, "However, if you want to allocate any data structures to live across calls, you need to put them somewhere else.", is buried in the docs. But I think an explicit warning is more noticeable for a new developer like me.

Regards,
-- Ali Akbar
[HACKERS] Commit fest 2014-12, let's begin!
Hi all,

Let's begin the commit fest of December: https://commitfest.postgresql.org/action/commitfest_view?id=25

Looking at the stats at the top of the above page, there are currently 56 patches to be dealt with:
- Needs Review: 48
- Waiting on Author: 5
- Ready for Committer: 3

In short, we are in cruel need of reviewers to make things move on.

To reviewers:
- This commit fest may be a good occasion for you to study new parts of the code; there are many simple patches in need of love.
- If you are marked as a reviewer for a given patch, but you know that you won't have time to look at it, it is advised that you remove your name to give room to someone else.
- Be sure to keep the information on the patch you are reviewing fresh in the CF app. This needs at most 2 minutes.
- Feel free to ping the patch submitter if what you reviewed does not get updated. People have their daily jobs and are busy, so try not to be too pushy. A delay of one week may give a good base, but it's holiday season in many places.

And to patch submitters:
- Please review one patch of equivalent difficulty to your own, to help keep a balance between reviews and patches.
- Be sure to keep the information about your patch up to date.
- Feel free to ping your reviewer if you have no news and need more feedback.

To committers: there are 2 patches marked as ready for committer from the previous commit fest:
- Foreign table inheritance
- Point to polygon distance operator

This is going to be a period of vacations for many people here, so let's do our best.

Regards,
-- Michael
Re: [HACKERS] Add generate_series(numeric, numeric)
>>>>> "Ali" == Ali Akbar the.ap...@gmail.com writes:

 Ali> I think yes, it will be good. The alternative is restructuring
 Ali> this paragraph in the SRF docs:

The memory context that is current when the SRF is called is a transient context that will be cleared between calls. This means that you do not need to call pfree on everything you allocated using palloc; it will go away anyway. However, if you want to allocate any data structures to live across calls, you need to put them somewhere else. The memory context referenced by multi_call_memory_ctx is a suitable location for any data that needs to survive until the SRF is finished running. In most cases, this means that you should switch into multi_call_memory_ctx while doing the first-call setup.

 Ali> The important part "However, if you want to allocate any data
 Ali> structures to live across calls, you need to put them somewhere
 Ali> else." is buried in the docs.

 Ali> But i think explicit warning is more noticeable for new
 Ali> developer like me.

I was thinking something like this, added just after that para:

   <warning>
    <para>
     While the actual arguments to the function remain unchanged between
     calls, if you detoast the argument values (which is normally done
     transparently by the
     <function>PG_GETARG_<replaceable>xxx</replaceable></function> macro)
     in the transient context then the detoasted copies will be freed on
     each cycle.  Accordingly, if you keep references to such values in
     your <structfield>user_fctx</structfield>, you must either copy them
     into the <structfield>multi_call_memory_ctx</structfield> after
     detoasting, or ensure that you detoast the values only in that
     context.
    </para>
   </warning>

-- Andrew (irc:RhodiumToad)
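The hazard being documented here can be illustrated outside the backend with a toy resettable arena standing in for the memory contexts. Everything below (Arena, detoast, keep_across_calls) is a simulation, not PostgreSQL code; in a real SRF the safe copy would be made under MemoryContextSwitchTo(funcctx->multi_call_memory_ctx):

```c
#include <string.h>

/* Toy stand-in for a PostgreSQL memory context: a bump arena that can be
 * reset wholesale, clobbering everything allocated in it (the backend
 * similarly clears the transient context between SRF calls). */
typedef struct
{
	char		buf[4096];
	size_t		used;
} Arena;

void *
arena_alloc(Arena *a, size_t n)
{
	void	   *p = a->buf + a->used;

	a->used += n;				/* no overflow check: sketch only */
	return p;
}

void
arena_reset(Arena *a)
{
	memset(a->buf, 0xDD, sizeof a->buf);	/* poison freed memory */
	a->used = 0;
}

/* "Detoast": materialize a copy of the value in the given context. */
char *
detoast(Arena *ctx, const char *toasted)
{
	char	   *p = arena_alloc(ctx, strlen(toasted) + 1);

	strcpy(p, toasted);
	return p;
}

/* The fix: copy a detoasted value into the long-lived context so it
 * survives resets of the per-call context. */
char *
keep_across_calls(Arena *longctx, const char *val)
{
	char	   *p = arena_alloc(longctx, strlen(val) + 1);

	strcpy(p, val);
	return p;
}
```

A pointer obtained from detoast() into the per-call arena dangles after arena_reset(); the copy made by keep_across_calls() does not, which is exactly what the proposed docs warning tells extension authors to do.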
[HACKERS] Logical Replication Helpers WIP for discussion
Hello, we've made a few helper functions for making logical replication easier. I bundled them into a contrib module, as this is mainly for discussion at this time (I don't expect this to get committed any time soon, but it is a good way to iron out the protocol, etc.). I created a sample logical decoding plugin that uses those functions and which can be used for passing DML changes in a platform/version-independent (hopefully) format. I will post a sample apply BG worker also, once I get some initial feedback about this. It's hard to write tests for this, as the binary changes contain transaction ids and timestamps, so the data changes constantly. This is of course based on the BDR work Andres, Craig and myself have been doing.

-- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/contrib/lrep/Makefile b/contrib/lrep/Makefile
new file mode 100644
index 000..d3f7ba3
--- /dev/null
+++ b/contrib/lrep/Makefile
@@ -0,0 +1,20 @@
+# contrib/lrep/Makefile
+
+MODULE_big = lrep
+OBJS = lrep_utils.o lrep_output.o $(WIN32RES)
+PG_CPPFLAGS = -I$(libpq_srcdir)
+
+#EXTENSION = lrep
+#REGRESS = lrep
+#REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/lrep/logical.conf
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/lrep
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/lrep/lrep.h b/contrib/lrep/lrep.h
new file mode 100644
index 000..e2910c4
--- /dev/null
+++ b/contrib/lrep/lrep.h
@@ -0,0 +1,108 @@
+/*-
+ *
+ * lrep.h
+ *		LREP public interfaces
+ *
+ * Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		contrib/lrep/lrep.h
+ *
+ *-
+ */
+#ifndef LREP_H
+#define LREP_H
+
+#include <libpq-fe.h>
+
+#include <nodes/parsenodes.h>
+
+#include <replication/logical.h>
+#include <replication/output_plugin.h>
+
+#include <storage/lock.h>
+
+#define LREP_PROTO_VERSION_NUM 1
+#define LREP_PROTO_MIN_REMOTE_VERSION_NUM 1
+
+typedef struct
+{
+	MemoryContext context;
+
+	bool		allow_binary_protocol;
+	bool		allow_sendrecv_protocol;
+	bool		int_datetime_mismatch;
+	bool		forward_changesets;
+
+	uint32		client_pg_version;
+	uint32		client_pg_catversion;
+	uint32		client_proto_version;
+	uint32		client_min_proto_version;
+	size_t		client_sizeof_int;
+	size_t		client_sizeof_long;
+	size_t		client_sizeof_datum;
+	size_t		client_maxalign;
+	bool		client_bigendian;
+	bool		client_float4_byval;
+	bool		client_float8_byval;
+	bool		client_int_datetime;
+	char	   *client_db_encoding;
+
+	void	   *extra;			/* Additional data */
+} LREPOutputData;
+
+
+typedef struct LREPTupleData
+{
+	Datum		values[MaxTupleAttributeNumber];
+	bool		nulls[MaxTupleAttributeNumber];
+	bool		changed[MaxTupleAttributeNumber];
+} LREPTupleData;
+
+
+extern void lrep_write_begin(StringInfo out, ReorderBufferTXN *txn, int flags,
+				 StringInfo extradata);
+extern void lrep_write_commit(StringInfo out, ReorderBufferTXN *txn,
+				  XLogRecPtr commit_lsn, int flags,
+				  StringInfo extradata);
+
+extern void lrep_write_insert(LREPOutputData *data, StringInfo out,
+				  Relation rel, HeapTuple newtuple);
+extern void lrep_write_update(LREPOutputData *data, StringInfo out,
+				  Relation rel, HeapTuple oldtuple,
+				  HeapTuple newtuple);
+extern void lrep_write_delete(LREPOutputData *data, StringInfo out,
+				  Relation rel, HeapTuple oldtuple);
+
+extern void lrep_write_rel(StringInfo out, Relation rel);
+extern void lrep_write_tuple(LREPOutputData *data, StringInfo out, Relation rel,
+				 HeapTuple tuple);
+
+extern int	lrep_read_begin(StringInfo in, XLogRecPtr *origlsn,
+				TimestampTz *committime, TransactionId *remote_xid);
+extern int	lrep_read_commit(StringInfo in, XLogRecPtr *commit_lsn,
+				 XLogRecPtr *end_lsn, TimestampTz *committime);
+extern Relation lrep_read_insert(StringInfo in, LOCKMODE lockmode,
+				 LREPTupleData *tuple);
+extern Relation lrep_read_update(StringInfo in, LOCKMODE lockmode,
+				 LREPTupleData *oldtuple,
+				 LREPTupleData *newtuple, bool *pkeysent);
+extern Relation lrep_read_delete(StringInfo in, LOCKMODE lockmode,
+				 LREPTupleData *tuple, bool *pkeysent);
+
+extern void lrep_read_tuple_parts(StringInfo s, TupleDesc desc,
+					  LREPTupleData *tuple);
+extern RangeVar *lrep_read_rel(StringInfo s);
+
+extern bool lrep_send_feedback(PGconn *conn, XLogRecPtr recvpos,
+				   XLogRecPtr writepos, XLogRecPtr flushpos,
+				   int64 now, bool force);
+
+extern void lrep_opt_parse_notnull(DefElem *elem, const char *paramtype);
+extern void lrep_opt_parse_uint32(DefElem *elem, uint32 *res);
+extern void lrep_opt_parse_size_t(DefElem *elem, size_t *res);
+extern void
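To make the wire-protocol discussion concrete, here is a self-contained sketch of the write/read symmetry the header implies, modeled loosely on the lrep_write_begin/lrep_read_begin pair. The buffer type (a stand-in for StringInfo), the tag byte, and the field set are invented for illustration; this is not the actual lrep format:

```c
#include <stdint.h>

/* Minimal stand-in for StringInfo: a fixed buffer with write and read
 * cursors.  No bounds checking: sketch only. */
typedef struct
{
	uint8_t		buf[256];
	int			len;			/* write position */
	int			cursor;			/* read position */
} Buf;

void
buf_put_u8(Buf *b, uint8_t v)
{
	b->buf[b->len++] = v;
}

void
buf_put_u32(Buf *b, uint32_t v)
{
	/* big-endian on the wire, so both ends agree regardless of host order */
	for (int i = 3; i >= 0; i--)
		buf_put_u8(b, (uint8_t) (v >> (8 * i)));
}

uint8_t
buf_get_u8(Buf *b)
{
	return b->buf[b->cursor++];
}

uint32_t
buf_get_u32(Buf *b)
{
	uint32_t	v = 0;

	for (int i = 0; i < 4; i++)
		v = (v << 8) | buf_get_u8(b);
	return v;
}

/* write/read pair for a hypothetical BEGIN message: a tag byte, the remote
 * xid, and a commit-time value.  Field set is illustrative only. */
void
write_begin(Buf *out, uint32_t xid, uint32_t committime)
{
	buf_put_u8(out, 'B');
	buf_put_u32(out, xid);
	buf_put_u32(out, committime);
}

int
read_begin(Buf *in, uint32_t *xid, uint32_t *committime)
{
	if (buf_get_u8(in) != 'B')
		return -1;				/* protocol violation */
	*xid = buf_get_u32(in);
	*committime = buf_get_u32(in);
	return 0;
}
```

Keeping every writer next to its reader like this (as the lrep header does with its lrep_write_*/lrep_read_* pairs) is what makes a binary protocol practical to evolve: any change to one side is visibly mirrored on the other.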
Re: [HACKERS] Commit fest 2014-12, let's begin!
Michael Paquier michael.paqu...@gmail.com writes: Let's begin the commit fest of December: https://commitfest.postgresql.org/action/commitfest_view?id=25 ... To committers: there are 2 patches marked as ready for committer from previous commit fest: - Foreign table inheritance I should probably take this one. - Point to polygon distance operator I looked at that briefly during the last fest, but was unsure whether it was too entangled with the GiST patches that Heikki was looking at. regards, tom lane
Re: [HACKERS] [PATCH] Add transforms feature
Peter Eisentraut pete...@gmx.net writes: On 4/4/14 6:21 PM, Andres Freund wrote: + /* dependency on transform used by return type, if any */ + if ((trfid = get_transform_oid(returnType, languageObjectId, true))) Should be compared to InvalidOid imo, rather than implicitly assuming that InvalidOid evaluates to false. I think it's widely assumed that InvalidOid is false. That's not the point; the point is that some nonzero number of compilers (and probably Coverity too) will warn about this construct, on the not unreasonable grounds that = might be a typo for ==. (Those extra parens might satisfy gcc, but not other tools.) Please put in an explicit comparison of the assignment result, as is done in approximately 100% of the other places where this idiom appears in Postgres. regards, tom lane
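A minimal self-contained illustration of Tom's point, with a stub standing in for get_transform_oid() (the OID values are made up): keep the assignment inside the if, but compare its result explicitly against InvalidOid:

```c
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Stub standing in for get_transform_oid(): returns a valid OID for one
 * known (type, language) pair, InvalidOid otherwise.  All OIDs here are
 * invented for illustration. */
Oid
get_transform_oid_stub(Oid typid, Oid langid)
{
	return (typid == 25 && langid == 13) ? 16384 : InvalidOid;
}

/* Returns 1 if a dependency would be recorded, 0 otherwise. */
int
record_transform_dependency(Oid typid, Oid langid)
{
	Oid			trfid;

	/* Explicit comparison: makes clear the assignment is intentional,
	 * silencing compilers and static analyzers that would otherwise flag
	 * `if ((x = f()))` as a possible typo for `==`. */
	if ((trfid = get_transform_oid_stub(typid, langid)) != InvalidOid)
	{
		(void) trfid;			/* recordDependencyOn(&trfid, ...) would go here */
		return 1;
	}
	return 0;
}
```

Both spellings compile to the same code; the `!= InvalidOid` form only changes what tools and readers can infer about intent.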
Re: [HACKERS] Commit fest 2014-12, let's begin!
On Mon, Dec 15, 2014 at 3:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Michael Paquier michael.paqu...@gmail.com writes: - Point to polygon distance operator I looked at that briefly during the last fest, but was unsure whether it was too entangled with the GiST patches that Heikki was looking at. Recalling my memories of this morning, things are rather independent. -- Michael
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
Note: this patch has been moved to CF 2014-12, and I marked myself as an author, if that's fine... I ended up being really involved in it. -- Michael
Re: [HACKERS] On partitioning
On Sun, Dec 14, 2014 at 11:12 PM, Amit Langote langote_amit...@lab.ntt.co.jp wrote: On egress you need some direct way to compare the scan quals with the partitioning values. I would imagine this to be similar to how scan quals are compared to the values stored in a BRIN index: each scan qual has a corresponding operator strategy and a scan key, and you can say aye or nay based on a small set of operations that can be run cheaply, again without any proof or running arbitrary expressions. My knowledge of this is far from perfect, though; to clear any confusion: as far as planning is concerned, I could not imagine how the index-access-method way of pruning partitions could be made to work. Of course, I may be missing something. Let me be overly verbose; don't take it as patronizing, I'm just answering in lots of detail why this could be a good idea to try. Normal indexes store a pointer for each key value, of sorts. So B-Tree gets you a set of tids for each key, and so do GIN and hash. But BRIN is different in that it does the mapping differently. BRIN stores a compact, approximate representation of the set of keys within a page range. It can tell with some degree of (in)accuracy whether a key or key range could be part of that page range or not. The way it does this is abstracted out, but at its core, it stores a compressed representation of the key set that takes a constant number of bits to store, and no more, no matter how many elements. What changes as the set it represents grows is its accuracy. Currently, BRIN only supports min-max representations. It will store, for each page range, the minimum and maximum of some columns, and when you query it, you can compare range vs range and discard whole page ranges. Now, what are partitions, if not page ranges? A BRIN tuple is a min-max pair. But BRIN is more general: it could use other data structures to hold that compressed representation, if someone implemented them. Like bloom filters [0].
A BRIN index is a complex data structure because it has to account for physically growing tables, but all the complexities vanish when you fix a block range (the BR in BRIN) to a partition. Then, a mere array of BRIN tuples would suffice. BRIN already contains the machinery to turn quals into something that filters out entire partitions, if you provide the BRIN tuples. And you could even effectively maintain a BRIN index for the partitions (just a BRIN tuple per partition, dynamically updated with every insertion). If you do that, you start with empty partitions, and each insert updates the BRIN tuple. Avoiding concurrency loss in this case would be tricky, but in theory this could allow very general partition exclusion. In fact it could even work with constraint exclusion right now: you'd have a single-tuple BRIN index for each partition and benefit from it. But you don't need to pay the price of updating BRIN indexes, as min-max tuples for each partition can be produced while creating the partitions if the syntax already provides the information. Then, it's just a matter of querying this meta-data, which just happens to have the form of a BRIN tuple for each partition. [0] http://en.wikipedia.org/wiki/Bloom_filter
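For reference [0] above, a toy Bloom filter in C shows why it fits the BRIN mould: a constant-size summary per partition (or page range) that can only answer "definitely absent" or "maybe present". This is a sketch with two fixed hash functions, not BRIN's actual opclass machinery:

```c
#include <stdint.h>

/* A tiny Bloom filter as a per-partition key summary: a fixed-size bitmap,
 * independent of how many keys it summarizes, exactly the shape a BRIN
 * tuple needs.  Sizes and hash functions are arbitrary choices here. */
#define BLOOM_BITS 1024

typedef struct
{
	uint8_t		bits[BLOOM_BITS / 8];
} Bloom;

static uint32_t
hash1(uint32_t k)
{
	k *= 2654435761u;			/* Knuth multiplicative hash */
	return k % BLOOM_BITS;
}

static uint32_t
hash2(uint32_t k)
{
	k ^= k >> 16;
	k *= 0x45d9f3b;
	return k % BLOOM_BITS;
}

void
bloom_add(Bloom *b, uint32_t key)
{
	b->bits[hash1(key) / 8] |= (uint8_t) (1 << (hash1(key) % 8));
	b->bits[hash2(key) / 8] |= (uint8_t) (1 << (hash2(key) % 8));
}

/* 0 => the key cannot be in the partition, so skip it entirely;
 * 1 => the key may be there (false positives possible), so scan it. */
int
bloom_maybe_contains(const Bloom *b, uint32_t key)
{
	return ((b->bits[hash1(key) / 8] >> (hash1(key) % 8)) & 1)
		&& ((b->bits[hash2(key) / 8] >> (hash2(key) % 8)) & 1);
}
```

An equality qual against a partitioning column would call bloom_maybe_contains() once per partition; a "no" prunes the partition without running any expressions, matching the cheap aye-or-nay test described above.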
[HACKERS] Comment typo in typedef struct BrinTuple
Hi,

Find attached a patch that does:

-	 * mt_info is laid out in the following fashion:
+	 * bt_info is laid out in the following fashion:
	<snip comment>
	uint8		bt_info;
 } BrinTuple;

Thanks,
Amit

BrinTuple-typo-fix.patch Description: Binary data
ctidscan as an example of custom-scan (Re: [HACKERS] [v9.5] Custom Plan API)
This attached patch adds a code example of the custom-scan interface. This custom-scan provider (ctidscan) performs almost the same as the built-in SeqScan plan, but can produce the same results with fewer page scans in the case when the qualifier of the relation scan has inequality operators (<, <=, >, or >=) on the ctid system column; the range to be scanned is then smaller than in a full-table scan. Below is an example:

postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid < '(10,0)'::tid AND b like '%abc%';
                        QUERY PLAN
-----------------------------------------------------------
 Seq Scan on t1  (cost=0.00..10.00 rows=1 width=37)
   Filter: ((ctid < '(10,0)'::tid) AND (b ~~ '%abc%'::text))
(2 rows)

Once ctidscan is loaded, it can provide a cheaper cost to scan the t1 table than SeqScan, so the planner chooses the custom logic.

postgres=# LOAD 'ctidscan';
LOAD
postgres=# EXPLAIN SELECT * FROM t1 WHERE ctid < '(10,0)'::tid AND b like '%abc%';
                          QUERY PLAN
---------------------------------------------------------------
 Custom Scan (ctidscan) on t1  (cost=0.00..5.00 rows=1 width=37)
   Filter: ((ctid < '(10,0)'::tid) AND (b ~~ '%abc%'::text))
   ctid quals: (ctid < '(10,0)'::tid)
(3 rows)

Like other query execution logic, it also provides an enable_ctidscan parameter to turn this feature on/off. I'm not certain whether we should have this functionality in contrib from the perspective of the workloads it can help, but its major worth is as an example of the custom-scan interface.

Thanks,
-- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: Robert Haas [mailto:robertmh...@gmail.com]
Sent: Thursday, December 11, 2014 11:58 AM
To: Simon Riggs
Cc: Kaigai Kouhei(海外 浩平); Thom Brown; Kohei KaiGai; Tom Lane; Alvaro Herrera; Shigeru Hanada; Stephen Frost; Andres Freund; PgHacker; Jim Mlodgenski; Peter Eisentraut
Subject: ##freemail## Re: [HACKERS] [v9.5] Custom Plan API

On Tue, Dec 9, 2014 at 3:24 AM, Simon Riggs si...@2ndquadrant.com wrote: Feedback I am receiving is that the API is unusable. That could be because it is impenetrable, or because it is unusable. I'm not sure it matters which.
It would be nice to hear what someone is trying to use it for and what problems that person is encountering. Without that, it's pretty much impossible for anyone to fix anything. As for sample code, KaiGai had a working example, which of course got broken when Tom changed the API, but it didn't look to me like Tom's changes would have made anything impossible that was possible before. I'm frankly kind of astonished by the tenor of this entire conversation; there is certainly plenty of code in the backend that is less self-documenting than this is; and KaiGai did already put up a wiki page with documentation as you requested. From his response, it sounds like he has updated the ctidscan code, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

pgsql-v9.5-contrib-ctidscan.v1.patch Description: pgsql-v9.5-contrib-ctidscan.v1.patch
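Setting the API debate aside, the pruning arithmetic a provider like ctidscan relies on is simple. Assuming ctids order first by block then by line offset (with offsets starting at 1), a qual of the form "ctid < bound" bounds how many blocks are worth scanning. The function below is an illustration under those assumptions, not code from the patch:

```c
#include <stdint.h>

/* A TID as (block number, line offset), mirroring ctid's '(b,o)' form. */
typedef struct
{
	uint32_t	block;
	uint16_t	offset;
} Tid;

/* Given a "ctid < bound" qual, return one past the last block that can
 * contain matching tuples, so a scan can stop early instead of walking
 * the whole table.  (Sketch only; the real provider would derive this
 * from the plan's ctid quals at scan startup, and handle <=, >, >= and
 * combinations of bounds as well.) */
uint32_t
blocks_to_scan_lt(Tid bound, uint32_t nblocks)
{
	/* A bound like '(10,0)' excludes all of block 10, because real line
	 * offsets start at 1; a bound like '(10,2)' still admits (10,1). */
	uint32_t	limit = bound.block + (bound.offset > 1 ? 1 : 0);

	return limit < nblocks ? limit : nblocks;
}
```

For the example in the thread, a qual of ctid < '(10,0)' means only blocks 0 through 9 need visiting, which is where the cheaper cost estimate in the EXPLAIN output comes from.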