Re: [HACKERS] Last gasp
On Thu, Apr 12, 2012 at 05:49, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> On Wed, Apr 11, 2012 at 5:36 PM, Peter Eisentraut pete...@gmx.net wrote:
>>> I'd still review it, but I'd be able to spend say 3 minutes on review
>>> and 30 seconds on committing it, versus 3 minutes on review, 3 minutes
>>> on research, and 8 minutes on bookkeeping.
>>
>> Well, I am not averse to figuring out a better workflow, or some better
>> tools. In practice, I think it's going to be hard to reduce the time to
>> review a trivial patch much below 5-10 minutes, which is what it takes
>> me now, because you've got to read the email, download the patch, check
>> that it doesn't break the build, review, commit, and push, and I can't
>> really see any of those steps going away. But that doesn't mean we
>> shouldn't make the attempt, because I've got to admit that the current
>> workflow seems a little cumbersome to me, too. I'm not sure I have a
>> better idea, though. git remotes seem useful for collaborating on topic
>> branches, but I don't think they can really be expected to save much of
>> anything during the final commit process - which is basically all the
>> process there is, when the patch is trivial.
>>
>> Now what would be sort of neat is if we had a way to keep all the
>> versions of patch X plus author and reviewer information, links to
>> reviews and discussion, etc. in some sort of centralized place. The
>> CommitFest app was actually designed to track a lot of this
>> information, but it's obviously not completely succeeding in tracking
>> everything that people care about - it only contains links to patches
>> and not patches themselves; it doesn't have any place to store proposed
>> commit messages; etc. There might be room for improvement there,
>> although getting consensus on what improvement looks like may not be
>> totally straightforward, since I think Tom's ideal process for
>> submitting a patch starts with attaching a file to an email, and many
>> other people, I think, would like to see it start with a pull request.
>> This is not entirely a tools issue, of course, but it's in there
>> somewhere.
>
> It strikes me that there are two different scenarios being discussed
> here, and we'd better be sure we keep them straight: small-to-trivial
> patches, and complex patches. I think that for the trivial case, what
> we need is less tooling, not more. Entering a patch in the CF app,
> updating and closing it will add a not-small percentage to the total
> effort required to deal with a small patch (as Peter already noted, and
> he wasn't even counting the time to put the patch into the CF
> initially). The only reason to even consider doing that is to make
> sure the patch doesn't get forgotten. Perhaps we could have some
> lighter-weight method of tracking such things? If we were actually
> using git branches for it, the CF app could automatically close entries
> when they were committed. But that requires them to be committed
> *unmodified*, and I'm not sure that's reasonable.

I also think requiring a git branch for the *simple* changes is adding
more tooling and not less, and thus fails on that suggestion. It might
be helpful (if the CF app had a trivial API) to have a small tool that
could run from a git hook (or manual script or alias) that would prompt
for "which CF entry, if any, did this commit close?"

> At the other end of the scale, I think it's true that the CF app could
> be more helpful than it is for tracking the state of complex patches.
> I don't really have any concrete suggestions, other than that I've seen
> far too many cases where the latest version of a patch was not linked
> into the CF entry. Somehow we've got to make that more robust. Maybe
> the answer is to tie things more directly into git workflows, though
> I'm not sure about details. I am concerned about losing traceability
> of submissions if all that ever shows up in the list archives is a URL.

I've suggested before that it would be a good idea to be able to
register a git repo + branch name in the commitfest app, and have it
track that. If it were smart enough to figure out that for something
like github or bitbucket it could also add a web link (but keep the git
link for whoever wants to pull it remotely) with the full diff against
master, that would make *some* of those issues go away. (Certainly not
all - it's not a magic solution - but I believe it would be a tool that
could help.) I've pretty much given up on that happening, though.

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
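[Editor's note: the "small tool that could run from a git hook" idea above could be sketched roughly as below. The "Closes-CF" trailer and the CF-app API call are pure inventions for illustration - no such API existed at the time of this thread.]

```python
# Hypothetical post-commit helper for the CF-app idea discussed above:
# look for a "Closes-CF: <id>" trailer in a commit message and report
# that entry to an imagined commitfest-app API.  Both the trailer name
# and the API are assumptions made for this sketch.
import re
import subprocess

CF_TRAILER = re.compile(r"^Closes-CF:\s*(\d+)\s*$", re.MULTILINE)

def parse_cf_entry(commit_message):
    """Return the CF entry id named in the commit message, or None."""
    m = CF_TRAILER.search(commit_message)
    return int(m.group(1)) if m else None

def head_commit_message():
    """Fetch the message of HEAD via git (call from inside a repo)."""
    return subprocess.check_output(["git", "log", "-1", "--pretty=%B"],
                                   universal_newlines=True)

# From a post-commit hook one would do something like:
#   entry = parse_cf_entry(head_commit_message())
#   if entry is not None:
#       notify_cf_app(entry)   # hypothetical API call
print(parse_cf_entry("Fix typo in docs\n\nCloses-CF: 234\n"))  # 234
```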
Re: [HACKERS] Why can't I use pgxs to build a plpgsql plugin?
On 08.04.2012 11:59, Guillaume Lelarge wrote:
> Hi,
>
> I recently wrote a plpgsql plugin. I wanted to enable the use of pgxs,
> to make it easier to compile the plugin, but I eventually found that I
> can't do that because the plpgsql.h file is not available in the
> include directory. I'm wondering if we shouldn't put the header files
> of plpgsql source code in the include directory. It would help
> compiling the PL/pgsql debugger and profiler (and of course my own
> plugin).

Yep, I just bumped into this myself, while trying to make the
pldebugger module compilable with pgxs.

> There could be a good reason which would explain why we can't (or
> don't want to) do this, but I don't see it right now.

Me neither, except a general desire to keep internals hidden. I propose
the attached.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index 751a98d..f21d28e 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -27,19 +27,26 @@ all: all-lib
 
 include $(top_srcdir)/src/Makefile.shlib
 
-install: all install-lib install-data
+install: all install-lib install-data install-headers
 
 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(datadir)/extension'
 
-uninstall: uninstall-lib uninstall-data
+uninstall: uninstall-lib uninstall-data uninstall-headers
 
 install-data: installdirs
 	$(INSTALL_DATA) $(addprefix $(srcdir)/, $(DATA)) '$(DESTDIR)$(datadir)/extension/'
 
+# The plpgsql.h header file is needed by instrumentation plugins
+install-headers: installdirs
+	$(INSTALL_DATA) '$(srcdir)/plpgsql.h' '$(DESTDIR)$(includedir_server)'
+
 uninstall-data:
 	rm -f $(addprefix '$(DESTDIR)$(datadir)/extension'/, $(notdir $(DATA)))
 
+uninstall-headers:
+	rm -f '$(DESTDIR)$(includedir_server)/plpgsql.h'
+
 .PHONY: install-data uninstall-data
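[Editor's note: with plpgsql.h installed under the server include directory as the patch above arranges, a plugin author could drive the build through PGXS with a minimal Makefile along these lines. The module name "my_plugin" is invented for illustration; this is a sketch, not a tested build recipe.]

```makefile
# Hypothetical Makefile for a plpgsql plugin built via PGXS.
# Assumes plpgsql.h has been installed into $(includedir_server),
# which is what the proposed patch above provides.
MODULES = my_plugin

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```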
[HACKERS] docs: WITH queries and VALUES
The SELECT manpage has:

    and with_query is:

    with_query_name [ ( column_name [, ...] ) ] AS ( select | insert | update | delete )

Should that list that you can use values as well? Or is it something we
generally consider "wherever select works, you can use values"?

(I ran into it because it's what comes up when you do \h WITH, so I got
the question of "why is values not supported for with?" - but it is.)

-- 
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] docs: WITH queries and VALUES
* Magnus Hagander (mag...@hagander.net) wrote:
> with_query_name [ ( column_name [, ...] ) ] AS ( select | insert | update | delete )
>
> Should that list that you can use values as well? Or is it something
> we generally consider wherever select works you can use values?
>
> (I ran into it because it's what comes up when you do \h WITH, so I
> got the question of why is values not supported for with. but it is..)

TABLE also works there, and here:

    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]

and here:

    ( select ) [ AS ] alias [ ( column_alias [, ...] ) ]

Not sure if it's worth fixing, just something I noticed.

Thanks,

Stephen
Re: [HACKERS] Last gasp
On Tue, Apr 10, 2012 at 08:43:12PM -0400, Greg Smith wrote:
> The main reason I worry about this is because of a very real
> chicken/egg problem here that I keep banging into. Since the commit
> standards for so many other open-source projects are low, there are a
> non-trivial number of business people who assume
> !committer == ![trusted|competent]. That makes having such a limited
> number of people who can commit both a PR issue (this project must not
> be very important if there are only 19 committers) and one limiting
> sponsorship (I'm not going to pay someone to work on this feature
> who's been working on it for years but isn't even a committer). There
> are a significant number of companies who are willing to sponsor
> committers to open-source projects; there are almost none who will
> sponsor reviewers or contributors of any stature unless they're
> already deep into the PostgreSQL community. That's one of the many
> reasons it's easier for a committer to attract funding for core
> PostgreSQL work, be it in the form of a full-time job or
> project-oriented funding. The corresponding flip side is that the
> small number of committers is limiting the scope of funding the
> project can accumulate.

I want to caution against adjusting things to improve funding
possibilities. There is nothing wrong with increasing funding
possibilities, per se, but such changes often distort behavior in
unforeseen ways that adversely affect our community process.

-- 
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Why can't I use pgxs to build a plpgsql plugin?
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
> On 08.04.2012 11:59, Guillaume Lelarge wrote:
>> There could be a good reason which would explain why we can't (or
>> don't want to) do this, but I don't see it right now.
>
> Me neither, except a general desire to keep internals hidden. I
> propose the attached.

Shouldn't the new targets be marked .PHONY?

			regards, tom lane
Re: [HACKERS] docs: WITH queries and VALUES
Stephen Frost sfr...@snowman.net writes:
> * Magnus Hagander (mag...@hagander.net) wrote:
>> with_query_name [ ( column_name [, ...] ) ] AS ( select | insert | update | delete )
>>
>> Should that list that you can use values as well? Or is it something
>> we generally consider wherever select works you can use values?
>
> TABLE also works there, and here:

Well, "TABLE foo" is defined as a shorthand for "SELECT * FROM foo", so
ISTM it's not too surprising that you can use it wherever you can use
SELECT. I'm not sure that people have a similar view of VALUES, though.
It might be worth adding VALUES to the WITH syntax.

			regards, tom lane
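[Editor's note: the form under discussion - a bare VALUES list as the body of a WITH query - is standard enough that SQLite's in-memory engine accepts the same spelling, which makes for a self-contained demonstration; in psql the query text would be identical.]

```python
# Demonstrates VALUES as the body of a WITH query.  SQLite's in-memory
# engine is used here only to keep the example self-contained; the SQL
# itself is the same form being discussed for the PostgreSQL docs.
import sqlite3

conn = sqlite3.connect(":memory:")
rows = conn.execute(
    "WITH t(x, y) AS (VALUES (1, 'one'), (2, 'two')) "
    "SELECT x, y FROM t ORDER BY x"
).fetchall()
print(rows)  # [(1, 'one'), (2, 'two')]
```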
Re: [HACKERS] Last gasp
Excerpts from Tom Lane's message of jue abr 12 00:49:38 -0300 2012:
> At the other end of the scale, I think it's true that the CF app could
> be more helpful than it is for tracking the state of complex patches.
> I don't really have any concrete suggestions, other than that I've
> seen far too many cases where the latest version of a patch was not
> linked into the CF entry. Somehow we've got to make that more robust.
> Maybe the answer is to tie things more directly into git workflows,
> though I'm not sure about details. I am concerned about losing
> traceability of submissions if all that ever shows up in the list
> archives is a URL.

Two suggestions:

1. It might be convenient to have the patch author attach a suggested
commit message to the patch entry in the commitfest site. Would save
some jiffies for the trivial-patch case, I hope.

2. Instead of just sending a URL to the list, maybe it'd be better if
the patch is uploaded to the CF site, and the CF site sends it to
pgsql-hackers for archival and reference, with appropriate In-Reply-To
headers so that it is appropriately linked to the thread. But since the
patch has been registered in the CF, the site can additionally present
a link to download the patch directly instead of sending you to the
archives. So: redundant storage, for convenience. (Alternatively, the
CF app could reach into the archives to grab the patch file. With some
appropriate ajaxy stuff this shouldn't be particularly hard.)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
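[Editor's note: suggestion 2 above hinges on the CF site setting threading headers so its archival copy lands in the right thread. A sketch of that mail construction, using Python's standard email library - all addresses and message-ids here are invented:]

```python
# Sketch of how a CF-app relay could thread its archival copy of a
# patch into the original discussion: set In-Reply-To/References to the
# submission's message-id.  Addresses and ids are invented examples.
from email.message import EmailMessage

orig_msgid = "<submission-1234@example.org>"  # id of the patch email

msg = EmailMessage()
msg["From"] = "commitfest-app@example.org"
msg["To"] = "pgsql-hackers@postgresql.org"
msg["Subject"] = "Re: [HACKERS] example patch (CF entry 234)"
msg["In-Reply-To"] = orig_msgid   # ties the copy into the thread
msg["References"] = orig_msgid
msg.set_content("Patch v2 registered in the commitfest app.")
msg.add_attachment(b"fake patch bytes",
                   maintype="text", subtype="x-diff",
                   filename="example-v2.patch")

print(msg["In-Reply-To"])  # <submission-1234@example.org>
```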
Re: [HACKERS] Last gasp
On 12 April 2012 13:45, Bruce Momjian br...@momjian.us wrote:
> I want to caution against adjusting things to improve funding
> possibilities. There is nothing wrong with increasing funding
> possibilities, per se, but such changes often distort behavior in
> unforeseen ways that adversely affect our community process.

Funding is a necessary component of what we do. So, for example, while
I'm very glad that EnterpriseDB affords various people the opportunity
to work on community stuff for a significant proportion of their time -
I do, after all, indirectly benefit from it - it is rather obviously
the case that the particular things those people work on are influenced
to some degree by management. That is an assessment that isn't based on
any particular observation about the things that EDB people work on;
it's just common sense. This generally isn't a bad thing, since I think
that the goals of the Postgres companies are broadly aligned with those
of the community. When you get right down to it, though, as Tom said,
we are a herd of cats, and it isn't particularly obvious that we've
zeroed in on some specific vision that we all agree on that must be
pursued without diversion.

Given the extensibility of Postgres, it isn't usually necessary for
anyone to pursue development of a feature that is clearly of niche
interest, that we don't really want to have to support. I cannot think
of any example of a proposed patch that mostly just scratched some
particular organisation's itch. No one is able to hoodwink the
community like that. People have always wanted to get their patches
accepted, and we've always had high standards. The fact that there
might be an additional financial incentive to do so doesn't seem to
fundamentally alter that dynamic.

It is not a coincidence that I did not send any code to -hackers prior
to joining 2ndQuadrant. I certainly had the enthusiasm for it, but I
could not afford to dedicate sufficient time. With the kind of
dedication required to make a noticeable contribution, this is hardly
surprising. There are some good counter-examples of this, of course -
one in particular that comes to mind is Greg Smith's work on the
background writer that made it into 8.3. However, the general trend is
that somebody has to pay for this work for it to be maintainable over
months and years, even with the level of dedication that we all have.

Something that I would suggest is that those who are receiving funding
be transparent about it. It isn't essential, of course, but to do any
less might lead to the perception of a conflict of interests in some
people's minds, which is best avoided.

I am conscious of the fact that I've expressed lots of opinions in this
thread on our processes and so on, some of which, if followed through
on, would be quite large departures. I hope that they were received as
modest suggestions.

-- 
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Last gasp
On 04/11/2012 10:24 AM, Tom Lane wrote:
> Greg Smith g...@2ndquadrant.com writes:
>> I'd like to dump around 50 pages of new material into the docs as a
>> start, but I don't want to take so much time away from the code
>> oriented committers to chew on that much.
>
> Well, with all due respect, that does not sound like a change that
> doesn't need review.

I wasn't trying to suggest large changes should be made without review.
I'd just like some new paths for work to progress without one of the
more coding-oriented committers being compelled to join and keep up
with everything. The quality level I aimed for in my book wouldn't have
been possible without Kevin Grittner, Scott Marlowe, and Jim Mlodgenski
as reviewers; it didn't require anyone with commit bits, though.

-- 
Greg Smith  2ndQuadrant US  g...@2ndquadrant.com  Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
Re: [HACKERS] Last gasp
> If we were actually using git branches for it, the CF app could
> automatically close entries when they were committed. But that
> requires them to be committed *unmodified*, and I'm not sure that's
> reasonable.
>
> I also think requiring a git branch for the *simple* changes is adding
> more tooling and not less, and thus fails on that suggestion.

Well actually, the other advantage of using branches is that it would
encourage committers to bounce a patch back to the submitter for
modification *instead of* doing it themselves. This would both save
time for the committer and do a better job of teaching submitters how
to craft patches which don't need to be modified. Ultimately, we need
to train new major contributors in order to get past the current
bottleneck.

Of course, this doesn't work as well for contributors who *can't*
improve their patches, such as folks who have a language barrier with
the comments. But it's something to think about.

--Josh
Re: [HACKERS] Last gasp
> I want to caution against adjusting things to improve funding
> possibilities. There is nothing wrong with increasing funding
> possibilities, per se, but such changes often distort behavior in
> unforeseen ways that adversely affect our community process.

I don't see this as much of a problem. If somewhat arbitrary labels and
powers allow the project to succeed, we should think long and hard
before rejecting the idea. It's not like we are going to make anyone
who asks a committer, like MediaWiki does. Indeed, we have been super
cautious about handing out both commit bits and labels (e.g. Major
Developer). One wrinkle is the subsystems: there are some people who
only work on certain parts, yet have a commit bit (with the
understanding that they won't start editing core or other parts). From
an outside perspective, however, a Postgres committer [of certain
subsystems] is a Postgres committer.

One thing I think would help potential and current developers, and act
as a further code review and safety valve, is to have a mailing list
that actually shows the committed diffs. Links to a webpage showing the
diff are just not the same. pgsql-commit-di...@postgresql.org, anyone?

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8
Re: [HACKERS] CREATE FOREGIN TABLE LACUNA
On 23 March 2012 19:07, David Fetter da...@fetter.org wrote:
> On Fri, Mar 23, 2012 at 11:38:56AM -0700, David Fetter wrote:
>> How about this one?
>
> Oops, forgot to put the latest docs in.

I think the docs need some additional supporting content. The LIKE
clause and its source_table parameter aren't explained on the CREATE
FOREIGN TABLE page. There's no mention of the like_option parameter
either, which should be valid since you can specify whether it includes
comments (among less relevant options).

Also, you appear to have modified the documented command definition so
that OPTIONS can't be applied per-column any more: the [, ... ] is now
placed before the column's OPTIONS clause.

The patch works for me, though, and allows tables, foreign tables,
views and composite types to be used in the LIKE clause.

Thom
Re: [HACKERS] Last gasp
On Thu, Apr 12, 2012 at 03:34:31PM +0100, Peter Geoghegan wrote:
> Something that I would suggest is that those that are receiving
> funding be transparent about it. It isn't essential of course, but to
> do any less might lead to the perception of there being a conflict of
> interests in some people's minds, which is best avoided.
>
> I am conscious of the fact that I've expressed lots of opinions on
> this thread on our processes and so on, some of which, if followed
> through on, would be quite large departures. I hope that they were
> received as modest suggestions.

I appreciate everything everyone said in this thread, and I can't think
of an example off the top of my head where vendors adversely affected
our process. I think the _big_ reason for that is that our community
members have always acted with a "community first" attitude that has
insulated us from many of the pressures vendors can place on the
development process. I am sure that protection will continue --- I just
wanted to point out that it is a necessary protection, so we can all be
proud of our released code and feature set, and continue working as a
well-coordinated team.

The specific suggestion that vendors are not taking contributors
seriously unless they have commit bits is perhaps something that
requires education of vendors, or perhaps my blogging about this will
help. Greg Smith's analysis really hit home with me:

> a non trivial number of business people who assume
> !committer == ![trusted|competent]. That makes having such a limited
> number of people who can commit both a PR issue (this project must
> not be very important if there are only 19 committers) and one
> limiting sponsorship (I'm not going to pay someone to work on this
> feature who's been working on it for years but isn't even a
> committer).

I think the big take-away, education-wise, is that for our project,
committer == grunt work. Remember, I used to be the big committer of
non-committer patches --- need I say more. ;-) LOL

-- 
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] man pages for contrib programs
On ons, 2012-04-11 at 22:10 +0100, Thom Brown wrote:
> On 11 April 2012 21:58, Peter Eisentraut pete...@gmx.net wrote:
>> On ons, 2012-04-11 at 21:42 +0100, Thom Brown wrote:
>>> Could you clarify what you're defining to be a client application
>>> and a server application? This could be confusing as we already have
>>> sections under Reference called PostgreSQL Client Applications and
>>> PostgreSQL Server Applications, visible in the root table of
>>> contents.
>>
>> By the same criteria as the main reference: client applications can
>> run anywhere and connect to a server; server applications run on the
>> same host as the database server.
>
> Fair enough. So will you be classifying things like auto_explain and
> auth_delay as extensions? (i.e. things which aren't installed via
> CREATE EXTENSION)

Good question. I guess we could keep the original name ... Modules ...
for that chapter.
Re: [HACKERS] Last gasp
> I think the big take-away, education-wise, is that for our project,
> committer == grunt work. Remember, I used to be the big committer of
> non-committer patches --- need I say more. ;-) LOL

Well, promoting several people to committer specifically and publicly
because of their review work would send that message a lot more
strongly than your blog would. It would also provide an incentive for a
few of our major contributors to do more review work, if it got them to
committer.

--Josh Berkus
Re: [HACKERS] Last gasp
On Thu, Apr 12, 2012 at 11:34:48AM -0400, Bruce Momjian wrote:
> On Thu, Apr 12, 2012 at 03:34:31PM +0100, Peter Geoghegan wrote:
>> Something that I would suggest is that those that are receiving
>> funding be transparent about it. It isn't essential of course, but to
>> do any less might lead to the perception of there being a conflict of
>> interests in some people's minds, which is best avoided.
>
> I appreciate everything everyone said in this thread, and I can't
> think of an example off the top of my head where vendors adversely
> affected our process. I think the _big_ reason for that is that our
> community members have always acted with a "community first" attitude
> that has insulated us from many of the pressures vendors can place on
> the development process. I am sure that protection will continue ---
> I just wanted to point out that it is a necessary protection so we
> can all be proud of our released code and feature set, and continue
> working as a well-coordinated team.

Let me add one more thing. As someone who has been funded for Postgres
work since 2000, I am certainly pro-funding! Since our community
members have a "community first" attitude, it is the community's
responsibility to help them get funding. We have thrown around a few
ideas in this thread, but perhaps someone should start a new email
thread with first-hand suggestions of how we can help people get
funding. I am certainly ready to help however I can. My reason for
replying to this thread was to highlight our valuable "community first"
attitude.

-- 
Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB   http://enterprisedb.com

+ It's impossible for everything to be true. +
Re: [HACKERS] Why can't I use pgxs to build a plpgsql plugin?
On 12.04.2012 16:59, Tom Lane wrote:
> Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
>> On 08.04.2012 11:59, Guillaume Lelarge wrote:
>>> There could be a good reason which would explain why we can't (or
>>> don't want to) do this, but I don't see it right now.
>>
>> Me neither, except a general desire to keep internals hidden. I
>> propose the attached.
>
> Shouldn't the new targets be marked .PHONY?

Umm ... me reads up on what .PHONY means ... yes, yes they should.

-- 
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Last gasp
On 04/12/2012 11:34 AM, Bruce Momjian wrote:
> The specific suggestion that vendors are not taking contributors
> seriously unless they have commit-bits is perhaps something that
> requires education of vendors, or perhaps my blogging about this will
> help.

I'm glad I managed to vent my frustration in this area in a way that
was helpful. Just recognize that any experienced person at pitching
software solutions will tell you never to wander down this path at all.
If you have to tell someone a story and make them admit they're wrong
about something as an early step toward adoption, you've just dumped a
home-made FUD bomb on them. It's not a high-percentage path toward
credibility.

-- 
Greg Smith  2ndQuadrant US  g...@2ndquadrant.com  Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
Re: [HACKERS] Parameterized-path cost comparisons need some work
I wrote:
> So I'm back to thinking we need to look explicitly at the rowcount
> comparison as well as all the existing conditions in add_path. One
> annoying thing about that is that it will reduce the usefulness of
> add_path_precheck, because that's called before we compute the
> rowcount estimates (and indeed not having to make the rowcount
> estimates is one of the major savings from the precheck). I think
> what we'll have to do is assume that a difference in parameterization
> could result in a difference in rowcount, and hence only a dominant
> path with exactly the same parameterization can result in failing the
> precheck.

I've been experimenting some more with this, and have observed that in
the test cases I'm using, adding rowcount as an additional criterion in
add_path doesn't cost much of anything: it doesn't seem to affect the
runtime significantly, and it only seldom changes the keep/reject
decisions. So that's good news.

Unfortunately, the precheck situation is actually worse than I thought:
there are plenty of cases where parameterized paths can have the exact
same parameterization (that is, the same sets of required outer rels)
and yet have different row estimates, because one might use different
join clauses than the other. All you need to be at risk is more than
one join clause between the same two rels, with those clauses matching
different indexes or index columns. This entirely destroys the logic of
add_path_precheck as currently constituted, because it implies we can
never reject a parameterized path before computing its rowcount.

I said upthread that I wouldn't cry if we got rid of add_path_precheck
again, but it still looks like that would cost us a noticeable hit in
planning speed. I've considered three other alternatives:

1. Lobotomize add_path_precheck so it always returns true for a
parameterized path. This sounds horrid, but in the test cases I'm using
it seems that this only results in doing the full path construction for
a very small number of additional paths.

2. Refactor so that we obtain the row estimate during the first, not
the second, cost estimation step. This doesn't look promising; I have
not actually coded and tested it, but eyeballing gprof numbers for the
current code suggests it would give back a considerable percentage of
the savings from having a precheck at all.

3. Rearrange plan generation so that a parameterized path always uses
all join clauses available from the specified outer rels. (Any that
don't work as indexquals would have to be applied as filter
conditions.) If we did that, then we would be back to a situation where
all paths with the same parameterization should yield the same
rowcount, thus justifying letting add_path_precheck work as it does
now.

#3 would amount to pushing quals that would otherwise be checked at the
nestloop join node down to the lowest inner-relation level where they
could be checked. This is something I'd suspected would be a good idea
to start with, but hadn't gotten around to implementing for non-index
quals. It had not occurred to me that it might simplify cost estimation
to always do that.

I'm going to take a closer look at #3, but it may not be practical to
try to squeeze it into 9.2; if not, I think #1 will do as a stopgap.

			regards, tom lane
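[Editor's note: a toy model of the dominance test under discussion - rowcount added alongside total and startup cost as a comparison criterion, so a new path is rejected only if some kept path is no worse on every criterion. This is a deliberately simplified sketch, not PostgreSQL's actual add_path code; the real function also compares pathkeys and parameterization.]

```python
# Toy model of an add_path-style dominance check - NOT the real
# PostgreSQL planner code.  Per the proposal above, estimated rowcount
# joins total and startup cost as a comparison criterion.
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    total_cost: float
    startup_cost: float
    rows: float          # estimated rowcount

def dominates(a, b):
    """True if path a is at least as good as path b on all criteria."""
    return (a.total_cost <= b.total_cost
            and a.startup_cost <= b.startup_cost
            and a.rows <= b.rows)

def add_path(kept, new):
    """Keep `new` unless dominated; drop kept paths `new` dominates."""
    if any(dominates(old, new) for old in kept):
        return kept
    return [old for old in kept if not dominates(new, old)] + [new]

paths = []
paths = add_path(paths, Path(100.0, 10.0, rows=1000))
# Cheaper overall but a higher rowcount: kept, since neither dominates.
paths = add_path(paths, Path(80.0, 20.0, rows=5000))
# Worse than the first path on every criterion: rejected.
paths = add_path(paths, Path(120.0, 30.0, rows=2000))
print(len(paths))  # 2
```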
Re: [HACKERS] Last gasp
On Wed, Apr 11, 2012 at 12:00:39PM -0300, Alvaro Herrera wrote:
> remote in their main PG tree, and so changesets could be pulled into
> the same clone and cherry-picked into the master branch.

If you're talking about a way of using git to support reviewing, the
Gerrit tool has an interesting workflow.  Essentially, anything you want
reviewed you push to a magic ref, refs/for/master, which always creates
a new branch.  As such you have a repository which contains every patch
ever submitted, but it simultaneously tracks the parents, so you know
which version of the tree each patch was against.

In the case of Postgres, each entry in the CF app would have its own ref
(say refs/cf/234), and each push to it would create a new patch version
for that entry.  In the end, accepted patches are cherry-picked onto the
real tree.  But because all patches are now in the same place, you can
build tooling around it more easily, like testing: does this patch
cherry-pick cleanly, or is there a conflict?  No merge commits, just
using git purely as patch storage.

(Note: to make this work Gerrit has a git server emulation, which may or
may not be easy to do, but it's just a thought about workflow.)

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he
> does not attach much importance to his own thoughts.
  -- Arthur Schopenhauer
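[Editorial sketch: the push-to-a-per-entry-ref workflow described above,
using plain git with a local bare repository standing in for the
Gerrit-style server; the refs/cf/234 name follows the hypothetical
CF-app convention, and the file names and messages are made up.]

```shell
set -e
# Stand-in "server": a bare repo (Gerrit emulates this over its own protocol).
srv=$(mktemp -d)/cf-server.git
git init --bare "$srv"

# Contributor clone with an initial tree on master.
work=$(mktemp -d)/work
git clone "$srv" "$work"
cd "$work"
git config user.email hacker@example.org
git config user.name  "A Hacker"
git checkout -B master
echo 'base' > file.c
git add file.c
git commit -m "initial import"
git push origin master

# Submit a patch for review: commit on a topic branch, then push to a
# per-CF-entry ref rather than to a shared branch.  Every submitted
# version accumulates on the server under refs/cf/*.
git checkout -b fix-typo
echo 'fixed' >> file.c
git commit -am "fix comment typo"
git push origin HEAD:refs/cf/234

# A committer fetches that ref and cherry-picks it onto the real tree.
git checkout master
git fetch origin refs/cf/234
git cherry-pick FETCH_HEAD
git push origin master
```

Whether the cherry-pick applies cleanly is exactly the kind of check
that could be automated per CF entry, as suggested above.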
Re: [HACKERS] Last gasp
Alvaro Herrera wrote:
> Now what would be sort of neat is if we had a way to keep all the
> versions of patch X plus author and reviewer information, links to
> reviews and discussion, etc. in some sort of centralized place.

FWIW: y'all might have discussed this to death during the git migration,
so *please* do not let me derail you if so... but github does a great
job of exactly this.  You open an issue, you reference it from commits,
all the related commits are listed in (and browseable from) the issue,
you can comment on specific lines of a commit, it integrates with email,
it has an API to write tools (both workflow and archival) against, etc.

Rather than extend the CF app into a trivial-patch workflow app, it
might be worth looking at integrating it with github.

Jay Levitt
Re: [HACKERS] Last gasp
On Thu, Apr 12, 2012 at 6:11 PM, Jay Levitt jay.lev...@gmail.com wrote:
> Rather than extend the CF app into a trivial-patch workflow app, it
> might be worth looking at integrating it with github.

There's a reluctance to require a proprietary component that could
disappear on us without notice.  The existence of git itself is a result
of *exactly* that circumstance: Linux kernel developers had gotten
dependent on BitKeeper, whereupon the owner decided to take his toys
home, at which point they were left bereft of their SCM tool.
http://kerneltrap.org/node/4966

I expect that it would be more worthwhile to look into enhancements to
the git workflow, such as Gerrit http://code.google.com/p/gerrit/.
I don't know that Gerrit is THE answer, but there are certainly projects
that have found it of value, and it doesn't have the "oops, it's
proprietary" problem.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
Re: [HACKERS] [Patch] Fix little typo in a comment
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
> Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
>> This is a little patch to fix a typo in contrib/file_fdw.
>
> I think that comment is fine as-is.
>
> 			regards, tom lane

OK, thanks.

Best regards,
Etsuro Fujita
Re: [HACKERS] Memory usage during sorting
On Sun, 2012-03-18 at 11:25 -0400, Tom Lane wrote:
> Yeah, that was me, and it came out of actual user complaints ten or
> more years back.  (It's actually not 2X growth but more like 4X growth,
> according to the comments in logtape.c, though I no longer remember the
> exact reasons why.)  We knew when we put in the logtape logic that we
> were trading off speed for space, and we accepted that.

I skimmed through TAOCP, and I didn't find the 4X number you are
referring to, and I can't think what would cause that, either.  The
exact wording in the comment in logtape.c is "4X the actual data
volume", so maybe that's just referring to per-tuple overhead?

However, I also noticed that section 5.4.4 (Vol 3, p. 299) starts
discussing the idea of running the tapes backwards and forwards.  That
doesn't directly apply, because a disk seek is cheaper than rewinding a
tape, but perhaps it could be adapted to get the block-freeing behavior
we want.  The comments in logtape.c say:

  "Few OSes allow arbitrary parts of a file to be released back to the
  OS, so we have to implement this space-recycling ourselves within a
  single logical file."

But if we alternate between reading in forward and reverse order, we can
make all of the deallocations at the end of the file, and then just
truncate to free the space.

I would think that the OS could be more intelligent about block
allocations and deallocations to avoid too much fragmentation, and it
would probably be a net reduction in code complexity.  Again, the
comments in logtape.c have something to say about it:

  "...but the seeking involved should be comparable to what would happen
  if we kept each logical tape in a separate file"

But I don't understand why that is the case.

On another topic, quite a while ago Len Shapiro (PSU professor)
suggested to me that we implement Algorithm F, forecasting (Vol 3,
p. 321).  Right now, as tuplesort.c says:

  "In the current code we determine the number of tapes M on the basis
  of workMem: we want workMem/M to be large enough that we read a fair
  amount of data each time we preread from a tape"

But with forecasting, we can be a little smarter about which tapes we
preread from if the data in the runs is not random.  That means we could
potentially merge more runs at once with the same work_mem without
sacrificing adequate buffers for prefetching.

I'm not sure whether this is a practical problem today, and I'm also not
sure what to do if we start merging a lot more runs and then determine
that forecasting doesn't work as well as we'd hoped (e.g. that the data
in the runs really is random).  But I thought it was worth mentioning.

Regards,
	Jeff Davis
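[Editorial sketch: the workMem/M preread policy quoted above, reduced to
Python.  heapq stands in for tuplesort's merge heap, runs are in-memory
lists standing in for tapes, and buffer sizes are counted in tuples
rather than bytes -- all deliberate simplifications, not the actual
tuplesort.c code.]

```python
import heapq

def external_merge(runs, work_mem):
    """Merge M sorted runs, prereading work_mem // M tuples per run.

    The trade-off discussed above: the more runs merged at once, the
    smaller each run's preread buffer, hence the more (random) reads
    per tuple.  Forecasting would instead size or schedule these
    prereads based on which tape will run dry next.
    """
    m = len(runs)
    per_tape = max(1, work_mem // m)          # workMem/M tuples per tape
    buffers = [run[:per_tape] for run in runs]
    positions = [per_tape] * m                # next offset to preread from
    heap = [(buf[0], tape, 0) for tape, buf in enumerate(buffers) if buf]
    heapq.heapify(heap)

    out = []
    while heap:
        value, tape, i = heapq.heappop(heap)
        out.append(value)
        if i + 1 < len(buffers[tape]):        # more left in this buffer
            heapq.heappush(heap, (buffers[tape][i + 1], tape, i + 1))
        else:                                 # buffer exhausted: preread again
            pos = positions[tape]
            buffers[tape] = runs[tape][pos:pos + per_tape]
            positions[tape] = pos + per_tape
            if buffers[tape]:
                heapq.heappush(heap, (buffers[tape][0], tape, 0))
    return out
```

With work_mem fixed, doubling the number of runs halves per_tape, which
is why merging more runs at once sacrifices prefetch buffering unless
something like forecasting makes the prereads smarter.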