Re: [HACKERS] Improving our clauseless-join heuristics
Amit Kapila writes:
>> I'm afraid I'm still not following you very well. Perhaps you could
>> submit a proposed patch?

> Before that, can you please explain in a little more detail (if possible with
> a small example) the idea you mentioned in the original mail: "is there any
> join clause that both these relations participate in?"

Um ... wasn't that well enough explained already? I think there are basically
two cases. You can have a join clause that is immediately useful for joining
two relations, say

	select ... from a, b where a.x = b.y;

This is "immediate" in the sense that you can apply it when joining a to b,
regardless of any other relations involved in the query. Or you can have a
case like

	select ... from a, b, c where (a.x + b.y) = c.z;

This clause is not immediately useful for joining any two of the three
relations in the query. It will be useful when we get to level 3,
particularly so if we chose to join a and b first and there's an index on
c.z. But we would have had to accept doing a cartesian join of a and b to
arrive at that situation.

In this example, we have no alternative except to do some cartesian join at
level 2 --- but as soon as we add some more tables and join clauses to the
example, we could get distracted from the possibility that a cartesian join
of a and b might be a good idea.

Given that make_rels_by_joins doesn't (and shouldn't IMO) have any detailed
understanding of the semantics of particular join clauses, I would not
expect it to realize that joining a to b is the most likely option out of
the three possible clauseless joins that are available at level 2 in this
query. It's going to have to generate all 3, and then costing at the next
level will figure out what's best to do. However, I think it *does* need to
understand that clauses relating 3 or more relations can work like this. In
the code as it stood before last week, it would actively reject joining a to
b if there were any additional relations in the query. That's just not right.
regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
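[Editor's note] The two cases described above can be sketched abstractly: treat each join clause as the set of relations it references, and a clause is applicable at a given join only when the join covers all of them. This is an illustrative Python sketch of the distinction, not the planner's actual C code (the real logic lives around make_rels_by_clause_joins):

```python
def clause_applicable(clause_rels, join_rels):
    # A clause can be evaluated at a join only when the join covers
    # every relation the clause references.
    return clause_rels <= join_rels

# Clauses from the two example queries:
simple = frozenset({"a", "b"})          # a.x = b.y
three_way = frozenset({"a", "b", "c"})  # (a.x + b.y) = c.z

# Level 2: joining a to b directly.
print(clause_applicable(simple, {"a", "b"}))     # True: immediately useful
print(clause_applicable(three_way, {"a", "b"}))  # False: a x b must be cartesian

# Level 3: once (a x b) exists, joining it to c can use the clause,
# e.g. with an indexscan on c.z.
print(clause_applicable(three_way, {"a", "b", "c"}))  # True
```

With three base relations and only the three-way clause, no pair of relations makes `clause_applicable` true at level 2, which is exactly why some cartesian join must be generated there.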
Re: [HACKERS] Bug tracker tool we need
First of all, I think a bug tracker will never be a system that is somehow
"loved". Bugs are things which developers hate. Bugs in a cute nice small
bug house aren't nice, are they?

Question: is there a reasonable chance of getting from, say, Tom Lane's idea
of a wiki to an improved (define "improved") version which PG developers
think may be useful?

Wolfgang

From: Robert Haas
To: Magnus Hagander
CC: Jay Levitt; Alex; Dimitri Fontaine; Alvaro Herrera; Peter Eisentraut; Tom Lane; Greg Smith; Pg Hackers
Sent: Wednesday, 18 April 2012, 3:07
Subject: Re: [HACKERS] Bug tracker tool we need

On Tue, Apr 17, 2012 at 1:47 AM, Magnus Hagander wrote:
> That's probably one reason people aren't jumping on this. Because
> there is no tracker out there that people actually *like*...

I think this is a point worth serious thought. The bug trackers I've used
have been mostly terrible; saying that they are to bug tracking what CVS is
to version control is insulting CVS. They're more like what editing RCS
files in vi is to version control - i.e. worse than not having version
control.

To put that in practical terms: I think everyone (including people like Tom
and me) who (a) is an old curmudgeon, or anyway a middle-aged curmudgeon,
and (b) would spend much more time in bed with any system that we adopted
than the average hacker, would agree that the current system is kind of a
pain. But there is no point in replacing it with something else unless that
new thing is going to be significantly better than what we are doing now.
And it's not entirely clear that such a thing exists.

There are certainly people out there, and even on this list, who will tell
you that system ABC is great. But for any given ABC there are also people
who will tell you that it's got significant problems. We don't need to
change anything to get a system that's got significant problems; we already
have one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Bug tracker tool we need
Magnus Hagander writes:
> I think this clearly outlines that we need to remember that there are
> *two* different patterns that people are trying to solve with the
> bugtracker.

Yeah, remember we drifted to this topic from discussion of management of CF
patches, which might be yet a third use-case. It's not obvious that it's the
same as tracking unfixed bugs, at least; though maybe the requirements end
up the same.

> Any tool we'd go for should aim to cover *both* usecases.

Not convinced that we should expect one tool to be good at both (or all
three) things.

regards, tom lane
Re: [HACKERS] Bug tracker tool we need
On Wed, Apr 18, 2012 at 07:52, Tom Lane wrote:
> Magnus Hagander writes:
>> On Wed, Apr 18, 2012 at 04:30, Tom Lane wrote:
>>> So when I read Andrew's recent suggestion that we use
>>> Bugzilla, my immediate reaction was "egad, can't we do better?".
>>> Maybe we can't :-(.
>
>> Personally, I'd say we *already* do better than that...
>
> Just meditating a little ... one of my big beefs with Bugzilla is that
> it shows basically a historical record of how a bug was discovered and
> dealt with. While that surely has, er, historical value, it's not that
> useful to read when you want to know which bug matches your symptoms,
> what the possible consequences are, which versions it was fixed in,
> etc. One particularly nasty point is that (AFAIK) it's impossible to
> delete or edit incorrect comments, only to add new ones.
>
> I wonder whether a better model would be a wiki page per bug, with
> an editable description and some links to reports, commits, etc.
> Not but what I hate every wiki I've ever used too ... but at least
> they let you fix the information when it's wrong or unhelpful.

There's no reason why a bugtracker couldn't and shouldn't make it possible
to do that. I think the reason Bugzilla doesn't goes back to the "all
comments are concatenated into a huge text field" design, which means you'd
have to parse that text field to get the actual data back, instead of
having it easily available...

The discussion should in that case be about whether history should be kept
for each individual comment, or just thrown away...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Bug tracker tool we need
Magnus Hagander writes:
> On Wed, Apr 18, 2012 at 04:30, Tom Lane wrote:
>> So when I read Andrew's recent suggestion that we use
>> Bugzilla, my immediate reaction was "egad, can't we do better?".
>> Maybe we can't :-(.

> Personally, I'd say we *already* do better than that...

Just meditating a little ... one of my big beefs with Bugzilla is that it
shows basically a historical record of how a bug was discovered and dealt
with. While that surely has, er, historical value, it's not that useful to
read when you want to know which bug matches your symptoms, what the
possible consequences are, which versions it was fixed in, etc. One
particularly nasty point is that (AFAIK) it's impossible to delete or edit
incorrect comments, only to add new ones.

I wonder whether a better model would be a wiki page per bug, with an
editable description and some links to reports, commits, etc. Not but what
I hate every wiki I've ever used too ... but at least they let you fix the
information when it's wrong or unhelpful.

regards, tom lane
Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Regarding the schema: I'm afraid the schema cannot be changed at this point,
though I appreciate the suggestions.

Regarding an INSERT performance test, what kind of table shape would you
like me to exercise? The patch as submitted may actually shave some cycles
off of the insertion of rows with trailing nulls even when there are fewer
than 64 columns, because it avoids iterating over the null columns a 2nd
time in heap_fill_tuple(), so I want to be sure that I pick something that
you feel is properly representative.

Thanks.
-Jamie

From: Tom Lane
To: Jameison Martin
Cc: "pgsql-hackers@postgresql.org"
Sent: Tuesday, April 17, 2012 9:57 PM
Subject: Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap

Jameison Martin writes:
> The use-case I'm targeting is a schema that has multiple tables with ~800
> columns, most of which have only the first 50 or so values set. 800 columns
> would require 800 bits in a bitmap which equates to 100 bytes. With 8-byte
> alignment the row bitmap would take up 104 bytes with the current
> implementation. If only the first 50 or so columns are actually non-null,
> then the minimum bitmap size wouldn't need to be more than 8 bytes, which
> means the proposed change would save 96 bytes. For the data set I have in
> mind roughly 90% of the rows would fall into the category of needing only 8
> bytes for the null bitmap.

I can't help thinking that (a) this is an incredibly narrow use-case, and
(b) you'd be well advised to rethink your schema design anyway. There are a
whole lot of inefficiencies associated with having that many columns; the
size of the null bitmap is probably one of the smaller ones. I don't really
want to suggest an EAV design, but perhaps some of the columns could be
collapsed into arrays, or something like that?

> What kind of test results would prove that this is a net win (or not a net
> loss) for typical cases? Are you interested in some insert performance
> tests?
> Also, how would you define a typical case (e.g. what kind of data shape)?

Hmm, well, most of the tables I've seen have fewer than 64 columns, so that
the probability of win is exactly zero. Which would mean that you've got to
demonstrate that the added overhead is unmeasurably small. Which maybe you
can do, because there's certainly plenty of cycles involved in a tuple
insertion, but we need to see the numbers. I'd suggest an INSERT/SELECT
into a temp table as probably stressing tuple formation speed the most. Or
maybe you could write a C function that just exercises heap_form_tuple
followed by heap_freetuple in a tight loop --- if there's no slowdown
measurable in that context, then a fortiori we don't have to worry about it
in the real world.

regards, tom lane
Re: [HACKERS] Bug tracker tool we need
On Wed, Apr 18, 2012 at 05:44, Tom Lane wrote:
> Greg Smith writes:
>> Rather than talk about adopting one of the available torture devices,
>> I'd happily consider the simplest thing possible that would be useful
>> here instead. Here's my proposed tiny tracker:
>
> Wasn't Jay just muttering about "writing your own bug tracker" being an
> anti-pattern? But still, there's something in what you say, because ...

The case where it would work to do that is if we agreed it'd be a "tiny
tracker", and not actually try to do too much. As in, basically a neater
frontend over the mailinglist archives.

I actually started on a demo of that at one point. It's currently blocked
behind the fact that we have to fix the mailinglist archives as well; in
particular, the break-on-month-boundary kills any attempt to do such a
thing. Which is also being worked on, but backlogged as usual :S

I'll admit that one reason it's been sitting fairly low on the prio list is
the guaranteed months of bikeshedding that it would bring along ;)

>> -Make commits that fix a bug reference it in one of the standard ways
>> that's done by every one of these bug trackers. Just throw "Fixes
>> #6596" into the commit message. These will probably work if a more
>> serious tool is adopted, too.
>
> ... I think you'll find a lot of that data could be mined out of our
> historical commit logs already. I know I make a practice of mentioning
> "bug #" whenever there is a relevant bug number, and I think other
> committers do too. It wouldn't be 100% coverage, but still, if we could
> bootstrap the tracker with a few hundred old bugs, we might have
> something that was immediately useful, instead of starting from scratch
> and hoping it would eventually contain enough data to be useful.

I have always considered that a *requirement*, not a neat addon.

> At the same time, I think we'd likely be a lot better off squirting this
> data into bugzilla or another standard tracker, instead of building our
> own infrastructure.
I'm somewhat doubtful. As a first step, we should at least stick in a
tracker with a reasonable SQL schema so it's possible to extract and do
something with the data.

IIRC, bugzilla (and others) at least used to just concatenate all comments
into a single text field, and not even keep them apart, for example.
Because clearly this whole JOIN thing is evil and difficult. They may have
fixed that by now, but what I've seen from most trackers shows signs of
people who have never seen a database beyond an Excel sheet...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Improving our clauseless-join heuristics
>> I'm afraid I'm still not following you very well. Perhaps you could
>> submit a proposed patch?

Before that, can you please explain in a little more detail (if possible
with a small example) the idea you mentioned in the original mail: "is
there any join clause that both these relations participate in?"

I wanted to know the details of that idea, to see whether what I am
proposing has any merit as compared to it.

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Wednesday, April 18, 2012 10:36 AM
To: Amit Kapila
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Improving our clauseless-join heuristics

Amit Kapila writes:
>> I might still be misunderstanding, but I think what you are suggesting
>> is that in the loop in make_rels_by_clause_joins, if we find that the
>> old_rel doesn't have a join clause/restriction with the current
>> other_rel, we check to see whether other_rel has any join clauses at
>> all, and force the join to occur anyway if it doesn't.

> It is on similar lines, but the only difference is that it will try to join
> old_rel with the other_rel list in case old_rel is not able to join with
> any of the other_rels in the list with a proper join clause between them.

I'm afraid I'm still not following you very well. Perhaps you could submit
a proposed patch?

regards, tom lane
Re: [HACKERS] Bug tracker tool we need
On Tue, Apr 17, 2012 at 19:59, Greg Smith wrote:
> On 04/17/2012 09:20 AM, Jay Levitt wrote:
> Let's pick a real example from the last week of my life, where having a bug
> tracker would have helped me out. This appears in a log:
>
> ERROR: missing chunk number 0 for toast value 1167375 in pg_toast_2619
>
> What I should be able to do here is search the bug tracker for these words
> and have it spit out an issue that looks like this

I'm snipping the actual usecase, because it's nice and detailed, but: I
think this clearly outlines that we need to remember that there are *two*
different patterns that people are trying to solve with the bugtracker.

One is the simple "someone reported a bug; track it until someone fixes it,
to make sure we don't miss it. Possibly even flag who is currently working
on/responsible for this bug".

The other one is for the *outsider* (sorry, Greg, in this scenario you get
to represent an outsider for once), who comes in and wants to know if the
problem he/she has exists before, if it's fixed, and in which versions it's
fixed.

There is some overlap, but the general usecase is drastically different.
And we do a decent enough job of the first one today, partially because we
have a few people who are very good at remembering these things and then
finding them in the list archives. It works. However, we do very badly on
the second one, IMHO.

Any tool we'd go for should aim to cover *both* usecases.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] Bug tracker tool we need
On Wed, Apr 18, 2012 at 04:30, Tom Lane wrote:
> Robert Haas writes:
>> On Tue, Apr 17, 2012 at 1:47 AM, Magnus Hagander wrote:
>>> That's probably one reason people aren't jumping on this. Because
>>> there is no tracker out there that people actually *like*...
>
>> I think this is a point worth serious thought.
>
> Indeed. The only one I've got extensive experience with is Bugzilla
> (because Red Hat uses it) and I do cordially hate it. At least some
> of that is due to bureaucratic practices RH has evolved, like cloning
> bugs N times for N affected releases, but I think the tool encourages
> such things. So when I read Andrew's recent suggestion that we use
> Bugzilla, my immediate reaction was "egad, can't we do better?".
> Maybe we can't :-(.

Personally, I'd say we *already* do better than that...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join
Robert Haas writes:
> On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis wrote:
>> ... Only one side really needs the mark and restore logic, but it was easier
>> to write the pseudocode in a symmetrical way (except step 7).

> I'm actually not sure these are equivalent formulations. Suppose one
> side has [i,i] where i ranges from 1 to 1000 and the other side
> the exact same thing plus [1,1000]. That one really big range
> will come up second on the right side, and you will not be able to
> discard it until you reach the end of the join. If you just keep
> rewinding the right side, you're going to end up with O(n^2) behavior,
> whereas if you can discard tuples "from the middle" on the right side,
> then you will get O(n) behavior, which is a big difference. In other
> words, in your original algorithm the tuples that you discard in step
> 4 or 5 need not be the first remaining tuple on whichever side of the
> join we're talking about.

It would be a pretty weird implementation of mergejoin that could "discard
tuples from the middle" of an input stream. Or to be more specific, it
wouldn't be the mergejoin itself that could do that at all --- you'd need
the input plan node to be some sort of tweaked version of tuplestore or
tuplesort that could respond to a request like that.

I can't escape the feeling that Jeff has chosen the wrong basis for his
thought experiment, and that what he really ought to be thinking about is
hashjoin, which keeps an in-memory table that it could easily modify on the
fly if it chose to. The multi-batch aspect of hybrid hashjoin could be
useful too (IOW, when you're out of better ideas, throw the tuple back in
the water to process later).

This is just handwaving of course. I think some digging in the spatial-join
literature would likely find ideas better than any of these.

regards, tom lane
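[Editor's note] Robert's wide-range scenario can be demonstrated with a toy merge-style overlap join (an illustrative Python sketch under simplified assumptions, not Jeff's actual pseudocode): both inputs are sorted by lower bound, and tuples may only be discarded from the front of the inner side. One wide range such as (1, 1000) pins the discard pointer, so every later outer tuple rescans from it:

```python
def range_overlap_join(outer, inner):
    """Both inputs sorted by lower bound; ranges are (lo, hi) pairs.
    Inner tuples can be discarded only from the front: once a wide
    range like (1, 1000) is reached, 'start' stops advancing and each
    outer tuple rescans everything after it -- the O(n^2) case."""
    results = []
    start = 0                              # discard pointer into inner
    for o_lo, o_hi in outer:
        # discard inner ranges that end before this outer range begins
        while start < len(inner) and inner[start][1] < o_lo:
            start += 1
        i = start                          # "restore" to the mark
        while i < len(inner) and inner[i][0] <= o_hi:
            if inner[i][1] >= o_lo:        # overlap test
                results.append(((o_lo, o_hi), inner[i]))
            i += 1
    return results

outer = [(i, i) for i in range(1, 6)]
inner = [(1, 1), (1, 1000)] + [(i, i) for i in range(2, 6)]
# Each (i,i) on the outer side matches itself and (1,1000) -- 10 pairs --
# but 'start' never moves past (1,1000), so later outer tuples keep
# rescanning from it.
print(len(range_overlap_join(outer, inner)))
```

Being able to discard the already-matched point ranges sitting *behind* the wide range is exactly the "discard from the middle" capability Tom argues a mergejoin input stream cannot reasonably provide.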
Re: [HACKERS] Improving our clauseless-join heuristics
Amit Kapila writes:
>> I might still be misunderstanding, but I think what you are suggesting
>> is that in the loop in make_rels_by_clause_joins, if we find that the
>> old_rel doesn't have a join clause/restriction with the current
>> other_rel, we check to see whether other_rel has any join clauses at
>> all, and force the join to occur anyway if it doesn't.

> It is on similar lines, but the only difference is that it will try to join
> old_rel with the other_rel list in case old_rel is not able to join with
> any of the other_rels in the list with a proper join clause between them.

I'm afraid I'm still not following you very well. Perhaps you could submit
a proposed patch?

regards, tom lane
Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Jameison Martin writes:
> The use-case I'm targeting is a schema that has multiple tables with ~800
> columns, most of which have only the first 50 or so values set. 800 columns
> would require 800 bits in a bitmap which equates to 100 bytes. With 8-byte
> alignment the row bitmap would take up 104 bytes with the current
> implementation. If only the first 50 or so columns are actually non-null,
> then the minimum bitmap size wouldn't need to be more than 8 bytes, which
> means the proposed change would save 96 bytes. For the data set I have in
> mind roughly 90% of the rows would fall into the category of needing only 8
> bytes for the null bitmap.

I can't help thinking that (a) this is an incredibly narrow use-case, and
(b) you'd be well advised to rethink your schema design anyway. There are a
whole lot of inefficiencies associated with having that many columns; the
size of the null bitmap is probably one of the smaller ones. I don't really
want to suggest an EAV design, but perhaps some of the columns could be
collapsed into arrays, or something like that?

> What kind of test results would prove that this is a net win (or not a net
> loss) for typical cases? Are you interested in some insert performance
> tests?
> Also, how would you define a typical case (e.g. what kind of data shape)?

Hmm, well, most of the tables I've seen have fewer than 64 columns, so that
the probability of win is exactly zero. Which would mean that you've got to
demonstrate that the added overhead is unmeasurably small. Which maybe you
can do, because there's certainly plenty of cycles involved in a tuple
insertion, but we need to see the numbers. I'd suggest an INSERT/SELECT
into a temp table as probably stressing tuple formation speed the most. Or
maybe you could write a C function that just exercises heap_form_tuple
followed by heap_freetuple in a tight loop --- if there's no slowdown
measurable in that context, then a fortiori we don't have to worry about it
in the real world.
regards, tom lane
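[Editor's note] The arithmetic in the quoted use-case checks out; here is a sketch of the null-bitmap sizing (one bit per column, rounded up to whole bytes and then to 8-byte MAXALIGN). This is a simplification of the real heap-tuple header layout, which rounds the header and bitmap together, but it reproduces the numbers in the thread:

```python
def null_bitmap_bytes(natts, maxalign=8):
    """One bit per attribute, rounded up to whole bytes, then to MAXALIGN.
    (Simplified: the real tuple header rounds header+bitmap as one unit.)"""
    raw_bytes = (natts + 7) // 8
    return ((raw_bytes + maxalign - 1) // maxalign) * maxalign

print(null_bitmap_bytes(800))  # 104 bytes with the current implementation
print(null_bitmap_bytes(64))   # 8 bytes once trailing nulls are truncated
print(null_bitmap_bytes(800) - null_bitmap_bytes(64))  # 96 bytes saved per row
```

This also shows why tables with fewer than 64 columns see no win at all: their bitmap already fits in the first 8-byte alignment quantum.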
Re: [HACKERS] Bug tracker tool we need
On 18 April 2012 13:44, Tom Lane wrote:
> ... I think you'll find a lot of that data could be mined out of our
> historical commit logs already. I know I make a practice of mentioning
> "bug #" whenever there is a relevant bug number, and I think other
> committers do too. It wouldn't be 100% coverage, but still, if we could
> bootstrap the tracker with a few hundred old bugs, we might have
> something that was immediately useful, instead of starting from scratch
> and hoping it would eventually contain enough data to be useful.

Just as a data point, git tells me that there are 387 commits where the
commit log message matches '#\d+', and 336 where it matches 'bug #\d+'.

Cheers,
BJ
Re: [HACKERS] Bug tracker tool we need
On 04/17/2012 11:44 PM, Tom Lane wrote:
> At the same time, I think we'd likely be a lot better off squirting this
> data into bugzilla or another standard tracker, instead of building our
> own infrastructure.

Perhaps. It just struck me that a lot of the custom bits needed here
regardless could be built/added to the usual workflow bottom-up. You don't
have to assume that either a full tracker or something like my simple POC
idea will show up at the end to get started. Making goal #1 involve just
mining for the data that's already around and squirting it onto a web page
would be a useful exercise, one that's likely to benefit any tracker
adoption.

--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
Re: [HACKERS] Bug tracker tool we need
Greg Smith writes:
> Rather than talk about adopting one of the available torture devices,
> I'd happily consider the simplest thing possible that would be useful
> here instead. Here's my proposed tiny tracker:

Wasn't Jay just muttering about "writing your own bug tracker" being an
anti-pattern? But still, there's something in what you say, because ...

> -Make commits that fix a bug reference it in one of the standard ways
> that's done by every one of these bug trackers. Just throw "Fixes
> #6596" into the commit message. These will probably work if a more
> serious tool is adopted, too.

... I think you'll find a lot of that data could be mined out of our
historical commit logs already. I know I make a practice of mentioning
"bug #" whenever there is a relevant bug number, and I think other
committers do too. It wouldn't be 100% coverage, but still, if we could
bootstrap the tracker with a few hundred old bugs, we might have something
that was immediately useful, instead of starting from scratch and hoping it
would eventually contain enough data to be useful.

At the same time, I think we'd likely be a lot better off squirting this
data into bugzilla or another standard tracker, instead of building our own
infrastructure.

regards, tom lane
Re: [HACKERS] Bug tracker tool we need
On 04/17/2012 10:30 PM, Tom Lane wrote:
> Indeed. The only one I've got extensive experience with is Bugzilla
> (because Red Hat uses it) and I do cordially hate it. At least some
> of that is due to bureaucratic practices RH has evolved, like cloning
> bugs N times for N affected releases, but I think the tool encourages
> such things.

That's along the same lines as my comments toward Jay Levitt, that bugs
where the fixes span multiple releases are the part nobody seems to handle
very well.

Rather than talk about adopting one of the available torture devices, I'd
happily consider the simplest thing possible that would be useful here
instead. Here's my proposed tiny tracker:

-If a bug is found in a released version, but it didn't originate on
pgsql-bugs, send a message to that list so it gets assigned a bug id.

-Write something that consumes pgsql-bugs and dumps all bug numbers and
their subject lines into a database. Start them with a state of
"Unconfirmed".

-Make commits that fix a bug reference it in one of the standard ways
that's done by every one of these bug trackers. Just throw "Fixes #6596"
into the commit message. These will probably work if a more serious tool is
adopted, too.

-Update src/tools/git_changelog to understand these messages and produce a
delimited file about every bug fix discovered. Import new entries into
another table with the bug id as the foreign key.

-Provide a command line tool to change bug state, basically a thin wrapper
around an UPDATE statement. Make it easy to change ranges that are
currently "Unconfirmed" to "Not a bug", for the bug reports that weren't
really bugs.

-When point release tagging happens, run another script that looks for bug
fix commits since the last one, and then save that version number into a
"fixed in" table.

-Generate a web page out of the database.

I think I've outlined that in a way that would make useful steps toward
adopting one of the full packages, too.
--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.com
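[Editor's note] The commit-message convention in the tiny-tracker proposal (and the "bug #" habit Tom describes mining from historical logs) is simple enough to extract with a regex. A hypothetical sketch; the function and pattern names are illustrative, not part of src/tools/git_changelog:

```python
import re

# Matches both the proposed "Fixes #6596" convention and the existing
# "bug #6596" habit found in historical commit messages.
BUG_REF = re.compile(r"(?:[Ff]ixes|[Bb]ug)\s+#(\d+)")

def referenced_bugs(commit_message):
    """Return the bug ids a commit message references, in order."""
    return [int(m) for m in BUG_REF.findall(commit_message)]

msg = "Fix TOAST chunk lookup.  Fixes #6596; related to bug #6510."
print(referenced_bugs(msg))  # [6596, 6510]
```

In the proposed workflow, each extracted id would become a foreign key into the bug table, and a point-release script would record the fixing version.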
Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't override arguments set via options with positional argumen
On Tue, Apr 17, 2012 at 10:53 PM, Tom Lane wrote:
> I can see both sides of this. I agree that the old behavior is buggy,
> but what I imagine Robert is worried about is scripts that accidentally
> work okay today and would stop working once the PG programs are fixed
> to complain about bogus usage. People tend not to like it if you make
> that kind of change in a minor release.

Exactly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Bug tracker tool we need
On Tue, Apr 17, 2012 at 11:07 PM, Greg Sabino Mullane wrote:
> Let's not let perfect be the enemy of good. In this case, *anything*
> that actually tracks bugs (and they are all quite good at that,
> if nothing else) is an improvement over what we have now, and thus,
> quite good. :)

I respectfully disagree. I would rather be chained to an angry cat for a
day than have to use Bugzilla on a regular basis. RT is better, but the UI
is still laughably bad. Try sticking a 74-email-long thread into an RT
ticket and then try to do anything with it. Now try in Gmail (even the new,
revised, slightly-harder-to-use Gmail). It's not close.

> Personally, I'm okay with, and have extensively hacked on, Bugzilla
> and RT, but anything should be fine as long as we have someone
> to take ownership.

Therein lies another problem...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Gsoc2012 idea, tablesample
> Date: Wed, 18 Apr 2012 02:45:09 +0300
> Subject: Re: [HACKERS] Gsoc2012 idea, tablesample
> From: a...@cybertec.at
> To: cbbro...@gmail.com
> CC: sfr...@snowman.net; pgsql-hackers@postgresql.org
>
> On Tue, Apr 17, 2012 at 7:33 PM, Christopher Browne wrote:
>> Well, there may be cases where the quality of the sample isn't
>> terribly important, it just needs to be "reasonable."
>>
>> I browsed an article on the SYSTEM/BERNOULLI representations; they
>> both amount to simple picks of tuples.
>>
>> - BERNOULLI implies picking tuples with a specified probability.
>>
>> - SYSTEM implies picking pages with a specified probability. (I think
>> we mess with this in ways that'll be fairly biased in view that tuples
>> mayn't be of uniform size, particularly if Slightly Smaller strings
>> stay in the main pages, whilst Slightly Larger strings get TOASTed...)

Looking at the definition of the BERNOULLI method, it means scanning all
the tuples, and I have always had a question: what is the difference
between using the BERNOULLI method and using "select * where rand() < 0.1"?
They will both go through all the tuples and cost a seq scan.

If the answer to the above question is "no difference", I have a proposal
for another implementation of BERNOULLI. For a relation, we can have all
its tuples assigned a unique and continuous ID (we may use ctid or others).
Then for each number in the set of IDs, we assign a random number and check
whether that is smaller than the sampling percentage. If it is smaller, we
retrieve the tuple corresponding to that ID. This method will not seq scan
all the tuples, but it can still sample by picking tuples.

Thanks.

Best Regards and Thanks
Huang Qi Victor
Computer Science of National University of Singapore
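[Editor's note] A hypothetical sketch of the contrast being drawn above (not the proposed implementation): given the same per-row random draws, the "WHERE rand() < p" scan and the pick-by-id scheme select exactly the same rows; the difference is only that the former must visit every tuple while the latter fetches just the chosen ids:

```python
import random

def where_rand_sample(rows, p, rng):
    """'select * where rand() < p': visits every row (a seq scan)."""
    return [row for row in rows if rng.random() < p]

def pick_by_id_sample(n_rows, p, rng, fetch):
    """Draw per row id first; fetch only the chosen tuples."""
    return [fetch(i) for i in range(n_rows) if rng.random() < p]

rows = [f"tuple-{i}" for i in range(1000)]
a = where_rand_sample(rows, 0.1, random.Random(7))
b = pick_by_id_sample(len(rows), 0.1, random.Random(7), lambda i: rows[i])
print(a == b)  # True: identical sample, different access pattern
```

The catch in a real heap, of course, is obtaining "unique and continuous" ids cheaply: ctids are neither dense nor stable, which is why this remains a sketch of the idea rather than of the patch.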
Re: [HACKERS] Bug tracker tool we need
> But for any > given ABC there are also people who will tell you that it's got > significant problems. We don't need to change anything to get a > system that's got significant problems; we already have one. Let's not let perfect be the enemy of good. In this case, *anything* that actually tracks bugs (and they are all quite good at that, if nothing else) is an improvement over what we have now, and thus, quite good. :) Personally, I'm okay with, and have extensively hacked on, Bugzilla and RT, but anything should be fine as long as we have someone to take ownership. -- Greg Sabino Mullane g...@turnstep.com PGP Key: 0x14964AC8 201204172131 http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't override arguments set via options with positional argumen
Andrew Dunstan writes: > You know, I could have sworn it was discussed, but when I look back I > see it wasn't. I must have been remembering the recent logging protocol bug. > I'll revert it if people want, although I still think it's a bug. I think we discussed it to the extent of agreeing it was a bug, but the question of whether to back-patch was not brought up. I can see both sides of this. I agree that the old behavior is buggy, but what I imagine Robert is worried about is scripts that accidentally work okay today and would stop working once the PG programs are fixed to complain about bogus usage. People tend not to like it if you make that kind of change in a minor release. Against that you have to set the probability of mistaken interactive usage being caught, or not, by a repaired program. Stopping a disastrous command-line typo seems potentially worth any pain from having to fix scripts that would have to be fixed eventually anyway. If you had patched only HEAD I would have been fine with that, but seeing that you've done the work to back-patch I'm kind of inclined to let it stand. regards, tom lane
Re: [HACKERS] Bug tracker tool we need
Robert Haas writes: > On Tue, Apr 17, 2012 at 1:47 AM, Magnus Hagander wrote: >> That's probably one reason people aren't jumping on this. Because >> there is no tracker out there that people actually *like*... > I think this is a point worth serious thought. Indeed. The only one I've got extensive experience with is Bugzilla (because Red Hat uses it) and I do cordially hate it. At least some of that is due to bureaucratic practices RH has evolved, like cloning bugs N times for N affected releases, but I think the tool encourages such things. So when I read Andrew's recent suggestion that we use Bugzilla, my immediate reaction was "egad, can't we do better?". Maybe we can't :-(. regards, tom lane
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
> > Hmm. StandbyMode is a local variable which cannot be accessed in > > checkpointer. But WalRcvInProgress() which shows if wal receiver > > is running seems to be usable to ENABLE governing progress by > > checkpoint_segments. > > Even when walreceiver is not running and WAL files are read from the archive, > checkpoint_segments can trigger a restartpoint. In this case, ISTM a > restartpoint > should be scheduled according to checkpoint_segments, so I don't think that > checking WalRcvInProgress() for that purpose is the right thing. Instead, what > about > sharing the StandbyMode flag among processes via shared memory like XLogCtl? I tried that at first. But I suppose the requirement here is 'if the segments being read arrive via the replication stream, enable throttling by checkpoint_segments', and WalRcvInProgress() seems a good fit to check that. Plus, adding SharedStartupStandbyMode into XLogCtlData seems to bring some annoyances that would not pay off. By the way, do you have any advice about GetStandbyFlushRecPtr() and the ordering of the locations? I'm confused by that... > > I basically agree with that. But I've gotten confused looking into > > GetStandbyFlushRecPtr(). > > > > | if (XLByteLT(receivePtr, replayPtr)) > > | return XLByteLT(replayPtr, restorePtr) ? restorePtr : replayPtr; > > | else > > | return XLByteLT(receivePtr, restorePtr) ? restorePtr : receivePtr; > - receivePtr seems always to be updated just after syncing received xlog. > - replayPtr is updated just BEFORE the xlog_redo operation, and > - restorePtr is updated AFTER xlog_redo(). > - And, replayPtr seems not to exceed receivePtr. > > These seem quite reasonable, and these conditions imply the following > ordering: > > restorePtr <= replayPtr <= receivePtr > > But then XLByteLT(receivePtr, replayPtr) should not return true > under the condition above... Is something wrong in my assumption? regards, -- Kyotaro Horiguchi NTT Open Source Software Center == My e-mail address has been changed since Apr. 1, 2012.
Re: [HACKERS] Bug tracker tool we need
On Tue, Apr 17, 2012 at 1:47 AM, Magnus Hagander wrote: > That's probably one reason people aren't jumping on this. Because > there is no tracker out there that people actually *like*... I think this is a point worth serious thought. The bug trackers I've used have been mostly terrible; saying that they are to bug tracking what CVS is to version control is insulting CVS. They're more like what editing RCS files in vi is to version control - i.e. worse than not having version control. To put that in practical terms, I think everyone (including people like Tom and I) who (a) are old curmudgeons or anyway middle-aged curmudgeons and (b) would spend much more time in bed with any system that we adopted than the average hacker would agree that the current system is kind of a pain. But there is no point in replacing it with something else unless that new thing is going to be significantly better than what we are doing now. And it's not entirely clear that such a thing exists. There are certainly people out there, and even on this list, who will tell you that system ABC is great. But for any given ABC there are also people who will tell you that it's got significant problems. We don't need to change anything to get a system that's got significant problems; we already have one. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Gsoc2012 idea, tablesample
On Tue, Apr 17, 2012 at 7:33 PM, Christopher Browne wrote: > Well, there may be cases where the quality of the sample isn't > terribly important, it just needs to be "reasonable." > > I browsed an article on the SYSTEM/BERNOULLI representations; they > both amount to simple picks of tuples. > > - BERNOULLI implies picking tuples with a specified probability. > > - SYSTEM implies picking pages with a specified probability. (I think > we mess with this in ways that'll be fairly biased in view that tuples > mayn't be of uniform size, particularly if Slightly Smaller strings > stay in the main pages, whilst Slightly Larger strings get TOASTed...) Dealing with non uniform sizes isn't too hard. analyze.c already does that. Given a table with B blocks it takes a uniform sample of b blocks, and runs Vitter's reservoir sampling algorithm over the resulting blocks to get s random tuples in a single pass. It's quite easy to prove that this results in each tuple having an equal probability to appear in the final table. However, it isn't fully independent sampling - depending on the value of b compared to s and B, there is a slightly higher probability to see multiple tuples picked from one page. I'm too lazy to do the math, but for the analyze case of b = s it probably isn't significant for most practical purposes, even if B is really large. And it seems to me that when b >= s the reservoir sampling thresholds could be tweaked at block boundaries to even out the dependencies. The ratio of b to s could be tweaked to get lower quality sampling (samples are more spatially clumped) in exchange for less random I/O. > Possibly the forms of sampling that people *actually* need, most of > the time, are more like Dollar Unit Sampling, which are pretty > deterministic, in ways that mandate that they be rather expensive > (e.g. - guaranteeing Seq Scan). 
I have a gut feeling that Dollar Unit Sampling and other weighted samples can be done with a similar approach of uniformly sampling blocks and then running weighted reservoir sampling [1] over the result. [1]: http://utopia.duth.gr/~pefraimi/research/data/2007EncOfAlg.pdf Cheers, Ants Aasma -- Cybertec Schönig & Schönig GmbH Gröhrmühlgasse 26 A-2700 Wiener Neustadt Web: http://www.postgresql-support.de
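For readers following along, both algorithms mentioned in this thread can be sketched compactly. This is an illustrative Python sketch under simplifying assumptions, not the analyze.c code: plain Vitter "Algorithm R" reservoir sampling over a stream, plus the Efraimidis & Spirakis "A-Res" weighted variant from the cited paper, which keys each item by u**(1/w) and keeps the s largest keys:

```python
import heapq
import random

def reservoir_sample(stream, s, rng):
    """Vitter's Algorithm R: one pass over the stream; each item ends
    up in the s-item reservoir with equal probability s/N."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < s:
            reservoir.append(item)
        else:
            j = rng.randrange(i + 1)  # replace a slot with prob s/(i+1)
            if j < s:
                reservoir[j] = item
    return reservoir

def weighted_reservoir_sample(stream, s, rng):
    """Efraimidis-Spirakis A-Res: key each (item, weight) pair by
    u**(1/w) and keep the s largest keys in a min-heap."""
    heap = []  # min-heap of (key, item)
    for item, w in stream:
        key = rng.random() ** (1.0 / w)
        if len(heap) < s:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]

rng = random.Random(1)
uniform = reservoir_sample(range(1000), 5, rng)
weighted = weighted_reservoir_sample([(i, i + 1) for i in range(1000)], 5, rng)
print(uniform, weighted)
```

The block-level trick described above would feed pre-sampled blocks into these loops instead of the whole relation; the single-pass, bounded-memory property is what makes either variant attractive there.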
Re: [HACKERS] Re: [COMMITTERS] pgsql: Don't override arguments set via options with positional argumen
On 04/17/2012 07:19 PM, Andrew Dunstan wrote: On 04/17/2012 07:08 PM, Robert Haas wrote: On Tue, Apr 17, 2012 at 6:39 PM, Andrew Dunstan wrote: Don't override arguments set via options with positional arguments. A number of utility programs were rather careless about parameters that can be set via both an option argument and a positional argument. This leads to results which can violate the Principle of Least Astonishment. These changes refuse to use positional arguments to override settings that have been made via option arguments. The changes are backpatched to all live branches. Branch -- REL8_3_STABLE Uh, isn't it kind of a bad idea to back-patch something like this? It seems like a behavior change. It was discussed. I think the previous behaviour is a bug. It can't be sane to be allowed to do: initdb -D foo bar You know, I could have sworn it was discussed, but when I look back I see it wasn't. I must have been remembering the recent logging protocol bug. I'll revert it if people want, although I still think it's a bug. cheers andrew
[HACKERS] Re: [COMMITTERS] pgsql: Don't override arguments set via options with positional argumen
On 04/17/2012 07:08 PM, Robert Haas wrote: On Tue, Apr 17, 2012 at 6:39 PM, Andrew Dunstan wrote: Don't override arguments set via options with positional arguments. A number of utility programs were rather careless about parameters that can be set via both an option argument and a positional argument. This leads to results which can violate the Principle of Least Astonishment. These changes refuse to use positional arguments to override settings that have been made via option arguments. The changes are backpatched to all live branches. Branch -- REL8_3_STABLE Uh, isn't it kind of a bad idea to back-patch something like this? It seems like a behavior change. It was discussed. I think the previous behaviour is a bug. It can't be sane to be allowed to do: initdb -D foo bar cheers andrew
Re: [HACKERS] Bug tracker tool we need
On Tue, Apr 17, 2012 at 4:15 PM, Jay Levitt wrote: > That's a great point. Both GitHub and git itself have no real concept of > releases, and can't tell you when a commit made it in. Those factors likely play together in this. Git is a tool, not a workflow, and intentionally allows its users to use it in a variety of ways. (Which includes some very interesting pathologies visible with stuff like git-annex, git-mail.) It's not a bug that git can do things differently than the Postgres project wants things done. Likewise, it's not a bug that github, which intends to support all kinds of users using git, does not enforce the preferred Postgres workflow. I think it's pretty *normal* that we'd need tooling that won't be identical to (say) GitHub, and we shouldn't get too exercised about this. I wonder if our "fix" instead involves: a) Adding an ArchiveOpteryx instance to archive mailing list traffic; http://archiveopteryx.org/ b) A bit more tooling to make it easier to link to threads in that instance c) Perhaps some management tools based on debbugs to ease scripting of issues being tracked That's not prescriptive; just ideas. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
[HACKERS] Re: [COMMITTERS] pgsql: Don't override arguments set via options with positional argumen
On Tue, Apr 17, 2012 at 6:39 PM, Andrew Dunstan wrote: > Don't override arguments set via options with positional arguments. > > A number of utility programs were rather careless about parameters > that can be set via both an option argument and a positional > argument. This leads to results which can violate the Principle > of Least Astonishment. These changes refuse to use positional > arguments to override settings that have been made via option > arguments. The changes are backpatched to all live branches. > > Branch > -- > REL8_3_STABLE Uh, isn't it kind of a bad idea to back-patch something like this? It seems like a behavior change. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Bug tracker tool we need
On 04/17/2012 04:38 PM, Tom Lane wrote: Jay Levitt writes: Greg Smith wrote: Tracking when and how a bug is backported to older versions is one hard part of the problem here. That's a great point. Both GitHub and git itself have no real concept of releases, and can't tell you when a commit made it in. We do actually have a somewhat-workable solution for that, see src/tools/git_changelog. It relies on cooperation of the committers to commit related patches with the same commit message and more or less the same commit time, but that fits fairly well with our practices anyway. If we did have an issue tracker I could see expecting commit messages to include a reference to the issue number, and then it would not be hard to adapt this program to key on that instead of matching commit message texts. Yeah, that would be good. BTW, since we're discussing trackers yet again, let me put in a plug for Bugzilla, which has mature Postgres support, is written in Perl (which a large number of hackers are familiar with and which we use extensively), has a long history and a large organization behind it (Mozilla) and last but not least has out-of-the-box support for creating, updating and closing bugs via email (I just set up an instance of the latest release with this enabled to assure myself that it works, and it does.) It also has XML-RPC and JSON-RPC interfaces, as well as standard browser support, although I have not tested the RPC interfaces. cheers andrew
Re: [HACKERS] extension allocating shared memory
Alvaro Herrera wrote: > Excerpts from Kevin Grittner's message: >> What is the best way for an extension to allocate shared memory >> and to access it from every backend? > RequestAddinShmemSpace Perfect! That's exactly what I wanted. I see that the pg_stat_statements extension is already using it, so I even have a good working example to look at. Thanks, -Kevin
Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Thanks for the response. The use-case I'm targeting is a schema that has multiple tables with ~800 columns, most of which have only the first 50 or so values set. 800 columns would require 800 bits in a bitmap which equates to 100 bytes. With 8-byte alignment the null bitmap would take up 104 bytes with the current implementation. If only the first 50 or so columns are actually non-null, then the minimum bitmap size wouldn't need to be more than 8 bytes, which means the proposed change would save 96 bytes. For the data set I have in mind roughly 90% of the rows would fall into the category of needing only 8 bytes for the null bitmap. What kind of test results would prove that this is a net win (or not a net loss) for typical cases? Are you interested in some insert performance tests? Also, how would you define a typical case (e.g. what kind of data shape)? Thanks. -jamie From: Tom Lane To: Jameison Martin Cc: "pgsql-hackers@postgresql.org" Sent: Tuesday, April 17, 2012 9:38 AM Subject: Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap Jameison Martin writes: > The following patch truncates trailing null attributes from heap rows to > reduce the size of the null bitmap. This has been discussed before, but it always seemed that the cost-benefit ratio was exceedingly questionable. You don't get any savings whatsoever unless you reduce the size of the null bitmap across a MAXALIGN boundary, which more and more often is 64 bits, so that the frequency with which the optimization wins anything doesn't look likely to be that high. And on the other side of the coin, you're adding cycles to every single tuple-construction operation to make this work. The introduction of bugs doesn't seem improbable either. (Just because tuples in user tables might have unexpected natts values doesn't mean that the code is, or should be, prepared to tolerate that in system tables or plan-constructed tuples.)
So what I'd like to see is some concrete test results proving that this is a net win, or at least not a net loss, for typical cases. Just asserting that it might be a win for certain usage patterns doesn't do it for me. regards, tom lane
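The space accounting in the use case above checks out; here is a quick sketch of the arithmetic (assuming a 64-bit MAXALIGN of 8 bytes and, as in the figures quoted above, treating the bitmap's alignment in isolation from the 23-byte tuple header):

```python
MAXALIGN = 8  # assuming 64-bit maxalign, as discussed in the thread

def maxalign(n):
    """Round n up to the next MAXALIGN boundary."""
    return (n + MAXALIGN - 1) // MAXALIGN * MAXALIGN

def bitmap_bytes(natts):
    """Aligned size in bytes of a null bitmap covering natts attributes."""
    return maxalign((natts + 7) // 8)

full = bitmap_bytes(800)      # bitmap sized for all 800 columns
truncated = bitmap_bytes(50)  # bitmap truncated after the last non-null
print(full, truncated, full - truncated)  # → 104 8 96
```

This reproduces the 104-byte vs 8-byte figures from the proposal, and also shows Tom's point: a table with, say, 60 trailing nulls out of 64 columns saves nothing, because both bitmaps round up to the same MAXALIGN boundary.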
Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Greg Stark writes: > On Tue, Apr 17, 2012 at 5:38 PM, Tom Lane wrote: >> This has been discussed before, but it always seemed that the >> cost-benefit ratio was exceedingly questionable. You don't get any >> savings whatsoever unless you reduce the size of the null bitmap across >> a MAXALIGN boundary, which more and more often is 64 bits, so that the >> frequency with which the optimization wins anything doesn't look likely >> to be that high. > There is the usage pattern where (brace yourself) people have > thousands of columns in which they have all but a handful be null. > They might be pretty happy about this. Oh, I don't doubt that there are *some* use cases for this. I'm just dubious about how much we'd be slowing things down for everybody else. As I said, what I'd like to see are some benchmarks, and not just benchmarks that are tuned to match the sort of case where this wins. regards, tom lane
Re: [HACKERS] Bug tracker tool we need
Jay Levitt writes: > Greg Smith wrote: >> Tracking when and how a bug is backported to older versions is one hard part >> of the problem here. > That's a great point. Both GitHub and git itself have no real concept of > releases, and can't tell you when a commit made it in. We do actually have a somewhat-workable solution for that, see src/tools/git_changelog. It relies on cooperation of the committers to commit related patches with the same commit message and more or less the same commit time, but that fits fairly well with our practices anyway. If we did have an issue tracker I could see expecting commit messages to include a reference to the issue number, and then it would not be hard to adapt this program to key on that instead of matching commit message texts. regards, tom lane
Re: [HACKERS] Parameterized-path cost comparisons need some work
Robert Haas writes: > On Tue, Apr 17, 2012 at 3:05 PM, Tom Lane wrote: >>> Personally, I find required_outer more clear. YMMV. >> Perhaps. What's bothering me is the potential for confusion with outer >> joins; the parameter-supplying rels are *not* necessarily on the other >> side of an outer join. Anybody else have an opinion about that? > Well, we also use the words "inner" and "outer" to refer to the sides > of any join, regardless of type. True. > The thing I don't like about "param_relids" is that "param" can refer > to an awful lot of different things. Fair enough. I'll leave required_outer alone then, and adjust some names in the new patch to be consistent with that. As far as the other naming issue goes, it struck me that instead of join_clause_is_parameterizable_xxx, we could call those functions join_clause_is_movable_xxx, assuming it's okay to commandeer the notion of "movable" for this particular usage. It seems a bit less generic than "parameterizable" anyway. The "for" and "within" bits don't fit with that though. The first one could reasonably be called "join_clause_is_movable_to", since we're checking if it's okay to push the clause to precisely that base relation, but I'm a bit at a loss for a modifier for the other one. "into" would be appropriate, but "to" and "into" are so close together that people might get confused. regards, tom lane
Re: [HACKERS] Gsoc2012 idea, tablesample
Josh, * Josh Berkus (j...@agliodbs.com) wrote: > FWIW, the PostGIS folks would *really* love to have a TABLESAMPLE which > worked with geographic indexes. This would be tremendously useful for > constructing low-resolution "zoom out" tiles on maps and similar. I'm familiar with the concept of 'zoom out' tiles and PostGIS, but I don't actually see the connection between that and TABLESAMPLE. Perhaps I'm missing something, but I've never seen a case where you create 'zoom out' tiles by just grabbing some portion of the data at random, as that would end up creating empty spots or missing pieces. My experience has been with running an algorithm on each record in the data set to reduce the number of points on a given record (which ends up reducing the size of the record, etc). You could also filter out records which wouldn't be visible due to the small 'size' of the record. Again, neither of those would benefit from a TABLESAMPLE command. Perhaps they're thinking it's going to use a GIST index to only pull out records matching a certain condition..? Except we have WHERE for that.. Thanks, Stephen
[HACKERS] Re: patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
On Tue, Apr 17, 2012 at 5:38 PM, Tom Lane wrote: > This has been discussed before, but it always seemed that the > cost-benefit ratio was exceedingly questionable. You don't get any > savings whatsoever unless you reduce the size of the null bitmap across > a MAXALIGN boundary, which more and more often is 64 bits, so that the > frequency with which the optimization wins anything doesn't look likely > to be that high. There is the usage pattern where (brace yourself) people have thousands of columns in which they have all but a handful be null. They might be pretty happy about this. I'm not sure if that's a use case that makes sense to optimize for though -- even for them the space overhead would be noticeable but not a showstopper. -- greg
Re: [HACKERS] Bug tracker tool we need
Greg Smith wrote: On 04/17/2012 09:20 AM, Jay Levitt wrote: Antispam is (in the large) a technically unsolvable problem; even in the '90s, we'd see hackers start poking at our newest countermeasures within the hour. GitHub is a giant target, and PG probably benefits here from NOT being one. Everyone who deals with list moderation and spam issues around PostgreSQL just got a belly laugh from that comment. Hint: the PostgreSQL lists had already been around and therefore were being targeted by spammers for over ten years before GitHub even existed. Hehe. OK, we will have to battle this out over drinks if I ever make it to PGCon.. but teaser: I've bankrupted Sanford Wallace and taught the DOJ what spam was. Pedantic note/fun fact: There was no email antispam in 1994 I like it when Magnus really gets the details perfect when making a deadpan joke. Dammit. I *fail*. Anyway, back to serious talk, I believe GitHub is a dead end here because the "primary key" as it were for issues is a repo. A bug tracker for PostgreSQL would need to have issues broken down per branch and include information similar to the release notes for each minor point release. Tracking when and how a bug is backported to older versions is one hard part of the problem here. That's a great point. Both GitHub and git itself have no real concept of releases, and can't tell you when a commit made it in. Although.. there's some sort of new release-note functionality. Maybe I'll play and see if it'd be applicable here. Jay
Re: [HACKERS] libpq URI and regression testing
Excerpts from Andrew Dunstan's message of mar abr 17 16:03:50 -0300 2012: > >> That's one reason for that, but there are probably others in the way of > >> making this fully portable and automatable. > > This test setup also appears to labor under the illusion that we live in > a Unix-only world. And for no good reason that I can tell. The shell > script should be ripped out and replaced by a perl script which could > actually be used on any windows build host. The MSVC build system also > needs adjusting to make it build the test driver, at least. I'll see about it. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] extension allocating shared memory
Excerpts from Kevin Grittner's message of mar abr 17 16:27:21 -0300 2012: > What is the best way for an extension to allocate shared memory and to > access it from every backend? Or, if there is no support existing for > that, what advice do people have if I want to make that happen? I don't > need a lot (probably 1KB would do). RequestAddinShmemSpace -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Tue, Apr 17, 2012 at 3:05 PM, Tom Lane wrote: > Well, we already made a policy decision that we weren't going to try > very hard to support merge joins inside parameterized subtrees, because > the potential growth in planning time looked nasty. My thought was that > we might resurrect the parameterized MergeAppendPath code when and if > we reverse that decision. At the moment, in fact, I believe that > add_path is pretty nearly guaranteed to discard a parameterized > MergeAppendPath immediately upon submission, because the fact that it's > sorted isn't given any weight for add_path comparisons, and cost-wise > it's going to look worse than the similarly parameterized plain Append > that we would have submitted just before. OK. >>> Second, I've gotten dissatisfied >>> with the terminology "required_outer" that was used in the original param >>> plans patch. I'm considering a global search and replace with >>> "param_relids" or some variant of that. > >> Personally, I find required_outer more clear. YMMV. > > Perhaps. What's bothering me is the potential for confusion with outer > joins; the parameter-supplying rels are *not* necessarily on the other > side of an outer join. Anybody else have an opinion about that? Well, we also use the words "inner" and "outer" to refer to the sides of any join, regardless of type. Maybe we could call it "requires_nestloop_with" or "requires_join_to" or something like that. The thing I don't like about "param_relids" is that "param" can refer to an awful lot of different things. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [COMMITTERS] pgsql: Add new replication mode synchronous_commit = 'write'.
On Tue, Apr 17, 2012 at 9:52 PM, Thom Brown wrote: > On 16 April 2012 17:21, Fujii Masao wrote: >> On Sun, Apr 15, 2012 at 12:13 AM, Thom Brown wrote: >>> No, that's not what I was referring to. If you don't have a standby >>> (i.e. a single, isolated database cluster with no replication), and >>> its synchronous_commit is set to 'remote_write', what effect does that >>> have? >> >> It's the same effect as 'on' and 'local' do, i.e., transaction commit waits >> for only local WAL flush. This behavior is not documented explicitly... >> How should we change the document? What about adding the following >> into the explanation of synchronous_commit parameter (maybe the end >> of second paragraph of that)? >> >> - >> If synchronous_standby_names is not set, on, remote_write and local >> provide the same synchronization level; transaction commit only waits for >> local flush. >> - > > Yes, that sounds fine. Okay, patch attached. Regards, -- Fujii Masao [Attachment: synchronous_commit_doc_v1.patch]
[HACKERS] extension allocating shared memory
What is the best way for an extension to allocate shared memory and to access it from every backend? Or, if there is no support existing for that, what advice do people have if I want to make that happen? I don't need a lot (probably 1KB would do). If this just "can't be done" I guess I could start a little daemon to run alongside PostgreSQL and talk to it through a pipe or TCP on localhost, but it seems cleaner to have it run within PostgreSQL if feasible. -Kevin
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
On Tue, Apr 17, 2012 at 11:50 PM, Kyotaro HORIGUCHI wrote: > Hmm. StandbyMode is a local variable which cannot be accessed in > checkpointer. But WalRcvInProgress() which shows if wal receiver > is running seems to be usable to ENABLE governing progress by > checkpoint_segments. Even when walreceiver is not running and WAL files are read from the archive, checkpoint_segments can trigger a restartpoint. In this case, ISTM a restartpoint should be scheduled according to checkpoint_segments, so I don't think that checking WalRcvInProgress() for that purpose is the right thing. Instead, what about sharing the StandbyMode flag among processes via shared memory, like XLogCtl? Regards, -- Fujii Masao
Re: [HACKERS] Last gasp
On Sun, Apr 15, 2012 at 8:23 AM, Simon Riggs wrote: > On Sat, Apr 7, 2012 at 9:51 PM, Robert Haas wrote: >> I think this basically just boils down to too many patches and not >> enough people. I was interested in Command Triggers from the >> beginning of this CommitFest, and I would have liked to pick it up >> sooner, but there were a LOT of patches to work on for this >> CommitFest. The first three CommitFests of this cycle each had >> between 52 and 60 patches, while this one had 106 which included >> several very complex and invasive patches, command triggers among >> them. So there was just a lot more to do, and a number of the people >> who submitted all of those patches didn't do a whole lot to help >> review them, sometimes because they were still furiously rewriting >> their submissions. It's not surprising that more patches + fewer >> reviewers = each patch getting less attention, or getting it later. > > This is a good point. The current process lacks inherent scalability. > > I would really like us to enforce a policy of 1 patch => 1 review. > That way we automatically have enough review time, no matter how many > patches we get. If we don't enforce that, then patch sponsors are more > likely to take the attitude that review isn't something they need to > pay for, just the dev work. I would be generally in favor of that policy, but I would relax it for people who have only ever submitted a handful of patches, so as to continue encouraging them to become involved in the community. It's also worth noting that not all reviews are created equal. It takes a lot more time to review command triggers than it does to review pg_archivecleanup extension-skipping. It's not important that the review effort is *exactly* proportional to the size of what has been submitted, but it *is* important that when we're having a CommitFest, people are spending the majority of their time on reviewing, rather than continuing to spend it developing patches that aren't done yet. 
It's perfectly right to say that people should be given a chance to finish up patches that are almost there, but our definition of "almost there" has a tendency to expand to the point where it's laughable when the last CommitFest of the cycle rolls along. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
On Tue, Apr 17, 2012 at 3:50 PM, Kyotaro HORIGUCHI wrote: > These seem quite reasonable. These conditions make the following > conditional expression. > > restorePtr <= replayPtr <= receivePtr > > But XLByteLT(receivePtr, replayPtr) should not return true > under the condition above. Something wrong in my assumption? When walreceiver is not running, i.e., the startup process reads the WAL files from the archival area, the replay location would get bigger than the receive one. Regards, -- Fujii Masao
Re: [HACKERS] Parameterized-path cost comparisons need some work
Robert Haas writes: > On Tue, Apr 17, 2012 at 12:14 PM, Tom Lane wrote: >> BTW, after writing the code for it I decided to remove creation of >> parameterized MergeAppendPaths from allpaths.c, though there is still some >> support for them elsewhere. On reflection it seemed to me that the code >> was willing to create far too many of these, much more than their >> potential usefulness could justify (remember that parameterized paths must >> be on the inside of a nestloop, so their sort ordering is typically of >> marginal use). We can put that back if we can think of a more restrictive >> heuristic for when to create them. > I guess the case in which this would matter is if you wrote something > like A LJ (B IJ C) where B and/or C has child tables and the best > method of joining them to each other is a marge join. Well, we already made a policy decision that we weren't going to try very hard to support merge joins inside parameterized subtrees, because the potential growth in planning time looked nasty. My thought was that we might resurrect the parameterized MergeAppendPath code when and if we reverse that decision. At the moment, in fact, I believe that add_path is pretty nearly guaranteed to discard a parameterized MergeAppendPath immediately upon submission, because the fact that it's sorted isn't given any weight for add_path comparisons, and cost-wise it's going to look worse than the similarly parameterized plain Append that we would have submitted just before. >> Second, I've gotten dissatisfied >> with the terminology "required_outer" that was used in the original param >> plans patch. I'm considering a global search and replace with >> "param_relids" or some variant of that. > Personally, I find required_outer more clear. YMMV. Perhaps. What's bothering me is the potential for confusion with outer joins; the parameter-supplying rels are *not* necessarily on the other side of an outer join. Anybody else have an opinion about that? 
regards, tom lane
Re: [HACKERS] libpq URI and regression testing
On 04/17/2012 02:47 PM, Alvaro Herrera wrote: Excerpts from Peter Eisentraut's message of mar abr 17 15:41:04 -0300 2012: On tis, 2012-04-17 at 10:47 -0300, Alvaro Herrera wrote: What's the preferred way to make it automatically tested as much as possible? I know the buildfarm does not run "installcheck-world", so if we want it there, it'd need a bit more code on the client side. I think it would be wise to have it also run on installcheck-world. It was agreed during the patch discussion that it shouldn't be run automatically. Oh, okay. I didn't notice that. I guess we should issue a call-for-testing, then, so that we ensure it works (FSVO works) in all (FSVO all) platforms. Hmm. Just had this thought: not all platforms support the same socket types. Maybe we should have separated the .in file in three parts: IPv4, IPv6, unix-domain socket. That way each platform would only run tests that pertain to it. Right now there's a single "regress.in" file that lists all the tests. That's one reason for that, but there are probably others in the way of making this fully portable and automatable. This test setup also appears to labor under the illusion that we live in a Unix-only world. And for no good reason that I can tell. The shell script should be ripped out and replaced by a Perl script which could actually be used on any Windows build host. The MSVC build system also needs adjusting to make it build the test driver, at least. cheers andrew
Re: [HACKERS] Parameterized-path cost comparisons need some work
Excerpts from Robert Haas's message of mar abr 17 15:46:23 -0300 2012: > On Tue, Apr 17, 2012 at 12:14 PM, Tom Lane wrote: > > The core of the patch is in the new functions get_baserel_parampathinfo > > and get_joinrel_parampathinfo, which look up or construct ParamPathInfos, > > and join_clause_is_parameterizable_for and > > join_clause_is_parameterizable_within, which control > > movement of parameterizable join clauses. (I'm not that thrilled with the > > names of the latter two functions, anybody got a better idea?) The rest > > of it is pretty much boilerplate changes and replacing ad-hoc logic with > > uses of this stuff. > > Parameterizable is such a mouthful. I wish we had a better word. P13e ? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] libpq URI and regression testing
Excerpts from Peter Eisentraut's message of mar abr 17 15:41:04 -0300 2012: > On tis, 2012-04-17 at 10:47 -0300, Alvaro Herrera wrote: > > What's the preferred way to make it automatically tested as much as > > possible? I know the buildfarm does not run "installcheck-world", so if > > we want it there, it'd need a bit more code on the client side. I think > > it would be wise to have it also run on installcheck-world. > > It was agreed during the patch discussion that it shouldn't be run > automatically. Oh, okay. I didn't notice that. I guess we should issue a call-for-testing, then, so that we ensure it works (FSVO works) in all (FSVO all) platforms. > > Hmm. Just had this thought: not all platforms support the same socket > > types. Maybe we should have separated the .in file in three parts: > > IPv4, IPv6, unix-domain socket. That way each platform would only run > > tests that pertain to it. Right now there's a single "regress.in" file > > that lists all the tests. > > That's one reason for that, but there are probably others in the way of > making this fully portable and automatable. Hmm. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Tue, Apr 17, 2012 at 12:14 PM, Tom Lane wrote: > I've been hacking away on a patch to do this, and attached is something > that I think is pretty close to committable. It needs another going-over > and some new regression test cases, but it seems to work, and it fixes a > number of things besides the above-mentioned issue. In particular, this > has a much more principled approach than HEAD does to the problem of where > to place parameterizable join clauses in the plan tree; that can be seen > in the one change in the existing regression tests, where we no longer > generate a redundant upper-level copy of an OR join clause that the old > code wasn't bright enough to get rid of. > > The patch is a bit large because I chose to revise the data representation. > Instead of each Path having its own required_outer, rows, and > param_clauses fields, now a parameterized Path has a pointer to a > ParamPathInfo struct that it shares with other Paths for the same rel and > the same parameterization. This guarantees that such paths will have the > same estimated rowcount, because we only compute that once per > parameterization, which should save some work as well as making the world > safe for add_path_precheck. Seems reasonable. > The only place where this approach proved a bit tricky was in handling > AppendPaths and MergeAppendPaths, which didn't surprise me because that > was a rough spot for the old way too (and indeed they aren't handled > completely correctly in HEAD). A parameterized path is now *required* > to enforce all clauses that the join clause movement rules assign to it; > but Append and MergeAppend don't do qual checking, and I didn't feel like > changing that. The method that I have settled on is to require all child > paths of a parameterized append to have the exact same parameterization, > IOW we push the qual checks down below the append. 
Now the interesting > point about that is that we want to support Appends wherein some children > are seqscans and some are indexscans (consider a partitioned table where > the parent is a dummy empty table with no indexes). The "raw" situation > there is that we'll have a plain seqscan path for the parent and then a > collection of similarly-parameterized indexscan paths for the live > partition children. To make it possible to convert that case into a > parameterized append path, I added parameterization support to seqscans > and then wrote "reparameterize_path", which changes a Path to increase > its parameterization level (and thereby assign it more pushed-down join > clauses to check at runtime). That allows us to reconfigure the seqscan > to match the other children. I've also added such support to > SubqueryScan, on the grounds that the other common use of append paths is > UNION ALL across subqueries. We might later want to add parameterization > support to other path types, but this seemed like enough for the moment. OK. > BTW, after writing the code for it I decided to remove creation of > parameterized MergeAppendPaths from allpaths.c, though there is still some > support for them elsewhere. On reflection it seemed to me that the code > was willing to create far too many of these, much more than their > potential usefulness could justify (remember that parameterized paths must > be on the inside of a nestloop, so their sort ordering is typically of > marginal use). We can put that back if we can think of a more restrictive > heuristic for when to create them. I guess the case in which this would matter is if you wrote something like A LJ (B IJ C) where B and/or C has child tables and the best method of joining them to each other is a merge join. That doesn't seem all that likely; normally a hash join or nested loop will be better. 
On the flip side I can't see that generating a bunch of extra paths is going to hurt you much there either; they will fall away pretty quickly if they aren't useful for merging. Now, if you have something like A IJ B or A LJ B, where B is partitioned, it's clearly a waste of time to generate parameterized paths with pathkeys. > The core of the patch is in the new functions get_baserel_parampathinfo > and get_joinrel_parampathinfo, which look up or construct ParamPathInfos, > and join_clause_is_parameterizable_for and > join_clause_is_parameterizable_within, which control > movement of parameterizable join clauses. (I'm not that thrilled with the > names of the latter two functions, anybody got a better idea?) The rest > of it is pretty much boilerplate changes and replacing ad-hoc logic with > uses of this stuff. Parameterizable is such a mouthful. I wish we had a better word. > I have a couple of other ideas in mind in the way of mop-up, but they are > not in this patch to keep it from bloating even more. First, I'm thinking > we should get rid of RelOptInfo.baserestrictcost, thus forcing all scan > cost estimators to invoke cost_qual_eval explicitly. That field has been > vestigial from a
Re: [HACKERS] libpq URI and regression testing
On tis, 2012-04-17 at 10:47 -0300, Alvaro Herrera wrote: > What's the preferred way to make it automatically tested as much as > possible? I know the buildfarm does not run "installcheck-world", so if > we want it there, it'd need a bit more code on the client side. I think > it would be wise to have it also run on installcheck-world. It was agreed during the patch discussion that it shouldn't be run automatically. > Hmm. Just had this thought: not all platforms support the same socket > types. Maybe we should have separated the .in file in three parts: > IPv4, IPv6, unix-domain socket. That way each platform would only run > tests that pertain to it. Right now there's a single "regress.in" file > that lists all the tests. That's one reason for that, but there are probably others in the way of making this fully portable and automatable.
Re: [HACKERS] Gsoc2012 idea, tablesample
Qi, Hackers: FWIW, the PostGIS folks would *really* love to have a TABLESAMPLE which worked with geographic indexes. This would be tremendously useful for constructing low-resolution "zoom out" tiles on maps and similar. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join
On Mon, Apr 16, 2012 at 7:05 PM, Tom Lane wrote: >> Hmm. This sounds like something that Tom's recent work on >> parameterized plans ought to have fixed, or if not, it seems closely >> related. > > Not really. It's still going to be a nestloop, and as such not terribly > well suited for queries where there are a lot of matchable rows on both > sides. The work I've been doing is really about making nestloops usable > in cases where join order restrictions formerly prevented it --- but > Jeff's complaint has nothing to do with that. (This thought also makes > me a bit dubious about the nearby suggestions that more indexes will > fix it.) I thought Jeff was parenthetically complaining about cases like A LEFT JOIN (B INNER JOIN C ON b.y = c.y) ON a.x && b.x. That presumably would require the parameterized-path stuff to have any chance of doing partial index scans over B. However, I understand that's not the main issue here. One thing that I think needs some analysis is when the range join idea is better or worse than a nested loop with inner index-scan, because potentially those are the options the planner has to choose between, and the costing model had better know enough to make the right thing happen. It strikes me that the nested loop with inner index-scan is likely to be a win when there are large chunks of the indexed relation that the nestloop never needs to visit at all - imagine small JOIN big ON small.a && big.a, for example. I suppose the really interesting question is how much we can save when the entirety of both relations has to be visited anyway - it seems promising, but I guess we won't know for sure without testing it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join
On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis wrote: > On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote: >> Jeff Davis writes: >> > 1. Order the ranges on both sides by the lower bound, then upper bound. >> > Empty ranges can be excluded entirely. >> > 2. Left := first range on left, Right := first range on right >> > 3. If Left or Right is empty, terminate. >> > 4. If lower(Left) > upper(Right), discard Right, goto 2 >> > 5. If lower(Right) > upper(Left), discard Left, goto 2 >> > 6. return (Left, Right) as joined tuple >> > 7. Right := next range on right >> > 8. goto 3 >> >> This is surely not correct in detail. As written, it will be impossible >> for any tuple on the right side to be joined to more than one left-side >> tuple. You will need something analogous to the mark-and-restore >> rewinding logic in standard mergejoin to make this work. > > Every time you discard a tuple on the left, you go to step 2, which > rewinds the right side back to the first non-discarded tuple. > > So, implemented using mark and restore: > > * start off with the marks at the first tuple on each side > * "discard" means move the mark down a tuple > * setting it back to the first range means restoring to the mark > * going to the next range (step 7) just means getting another > tuple, without changing the mark > > Only one side really needs the mark and restore logic, but it was easier > to write the pseudocode in a symmetrical way (except step 7). I'm actually not sure these are equivalent formulations. Suppose one side has [i,i] where i ranges from 1 to 1000 and the other side the exact same thing plus [1,1000]. That one really big range will come up second on the right side, and you will not be able to discard it until you reach the end of the join. If you just keep rewinding the right side, you're going to end up with O(n^2) behavior, whereas if you can discard tuples "from the middle" on the right side, then you will get O(n) behavior, which is a big difference. 
In other words, in your original algorithm the tuples that you discard in step 4 or 5 need not be the first remaining tuple on whichever side of the join we're talking about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
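To make the mark-vs-discard distinction concrete, here is a hedged Python sketch of the sorted-input overlap join under discussion (function and variable names invented; ranges are inclusive (lower, upper) pairs, empties pre-filtered). The mark only advances past right-side ranges whose upper bound is below the current left lower bound, since those can never match again once left lower bounds stop being that small. A wide range like [1,1000] pins the mark near the start, so later non-matching ranges between the mark and the scan position are re-scanned rather than discarded from the middle, which is exactly the O(n^2) behavior Robert describes.

```python
def range_overlap_join(left, right):
    """Join two lists of inclusive (lower, upper) ranges on overlap,
    using a single right-side mark in the spirit of Jeff's steps 1-8."""
    left = sorted(left)          # step 1: order by (lower, upper)
    right = sorted(right)
    out = []
    mark = 0                     # first right range not yet discarded
    for ll, lu in left:
        # step 4 analog: rights wholly below this Left are below every
        # later Left too (left lower bounds are nondecreasing) -> discard
        while mark < len(right) and right[mark][1] < ll:
            mark += 1
        j = mark
        # step 5 analog: once lower(Right) > upper(Left), stop scanning
        while j < len(right) and right[j][0] <= lu:
            rl, ru = right[j]
            if ru >= ll:
                out.append(((ll, lu), (rl, ru)))   # step 6: emit pair
            # non-matching rights past the mark are skipped, NOT discarded
            j += 1
    return out
```

Running this on Robert's adversarial input (right side = unit ranges plus one [1,n] range) shows the mark never moving past the wide range, so every left tuple rescans from near the start.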
Re: [HACKERS] Bug tracker tool we need
On 04/17/2012 09:20 AM, Jay Levitt wrote: Antispam is (in the large) a technically unsolvable problem; even in the '90s, we'd see hackers start poking at our newest countermeasures within the hour. GitHub is a giant target, and PG probably benefits here from NOT being one. Everyone who deals with list moderation and spam issues around PostgreSQL just got a belly laugh from that comment. Hint: the PostgreSQL lists had already been around and therefore were being targeted by spammers for over ten years before GitHub even existed. Pedantic note/fun fact: There was no email antispam in 1994 I like it when Magnus really gets the details perfect when making a deadpan joke. Anyway, back to serious talk, I believe GitHub is a dead end here because the "primary key" as it were for issues is a repo. A bug tracker for PostgreSQL would need to have issues broken down per branch and include information similar to the release notes for each minor point release. Tracking when and how a bug is backported to older versions is one hard part of the problem here. For example, Trac, Redmine, and Github all have ways to make a commit message reference an issue, something like "fixes #X". That's fine for projects that don't have a complicated backport policy, but I haven't been able to figure out how to make it work well enough for a PostgreSQL bug tracker, to end up saving any work here. In some cases, a bug shouldn't be closed until it's been backported to all supported releases. Others will only fix in relevant releases. Let's pick a real example from the last week of my life, where having a bug tracker would have helped me out. 
This appears in a log: ERROR: missing chunk number 0 for toast value 1167375 in pg_toast_2619 What I should be able to do here is search the bug tracker for these words and have it spit out an issue that looks like this === Bug: Fix race condition during toast table access from stale syscache entries Impact: Transient query failures Fixed in: 9.2.0: http://archives.postgresql.org/pgsql-committers/2011-11/msg00012.php 9.1.2: http://archives.postgresql.org/pgsql-committers/2011-11/msg00016.php 9.0.6: http://archives.postgresql.org/pgsql-committers/2011-11/msg00014.php 8.4.10: http://archives.postgresql.org/pgsql-committers/2011-11/msg00013.php 8.3.17: http://archives.postgresql.org/pgsql-committers/2011-11/msg00017.php 8.2.23: http://archives.postgresql.org/pgsql-committers/2011-11/msg00015.php === Note that the "fixed in" version information doesn't show up until some time *after* the bug fix is committed, because they normally get rolled into the next minor release in bulk. A bug tracking system for PostgreSQL will start looking attractive when it makes life easier for the people who do these backports and the associated release notes. Start looking at the problem from their perspective if you want to figure out how to make that happen. I don't have a good answer to that; I just know that Trac, Redmine, and GitHub haven't felt like a good fit, having used every one of that trio for multiple years now at some point. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: [HACKERS] Gsoc2012 idea, tablesample
On Tue, Apr 17, 2012 at 5:33 PM, Christopher Browne wrote: > I get the feeling that this is a somewhat-magical feature (in that > users haven't much hope of understanding in what ways the results are > deterministic) that is sufficiently "magical" that anyone serious > about their result sets is likely to be unhappy to use either SYSTEM > or BERNOULLI. These both sound pretty useful. "BERNOULLI" is fine for cases where you aren't worried about time dependency on your data. If you're looking for the average or total value of some column for example. SYSTEM just means "I'm willing to trade some unspecified amount of speed for some unspecified amount of accuracy" which presumably is only good if you trust the database designers to make a reasonable trade-off for cases where speed matters and the accuracy requirements aren't very strict. > Possibly the forms of sampling that people *actually* need, most of > the time, are more like Dollar Unit Sampling, which are pretty > deterministic, in ways that mandate that they be rather expensive > (e.g. - guaranteeing Seq Scan). I don't know about that but the cases I would expect to need other distributions would be ones where you're looking at the tuples in a non-linear way. Things like "what's the average gap between events" or "what's the average number of instances per value". These might require a full table scan but might still be useful if the data is going to be subsequently aggregated or joined in ways that would be too expensive on the full data set. But we shouldn't let best be the enemy of the good here. Having SYSTEM and BERNOULLI would solve most use cases and having those would make it easier to add more later. -- greg
Re: [HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
Jameison Martin writes: > The following patch truncates trailing null attributes from heap rows to > reduce the size of the row bitmap. This has been discussed before, but it always seemed that the cost-benefit ratio was exceedingly questionable. You don't get any savings whatsoever unless you reduce the size of the null bitmap across a MAXALIGN boundary, which more and more often is 64 bits, so that the frequency with which the optimization wins anything doesn't look likely to be that high. And on the other side of the coin, you're adding cycles to every single tuple-construction operation to make this work. The introduction of bugs doesn't seem improbable either. (Just because tuples in user tables might have unexpected natts values doesn't mean that the code is, or should be, prepared to tolerate that in system tables or plan-constructed tuples.) So what I'd like to see is some concrete test results proving that this is a net win, or at least not a net loss, for typical cases. Just asserting that it might be a win for certain usage patterns doesn't do it for me. regards, tom lane
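The MAXALIGN point can be seen with a little arithmetic. Assuming a 23-byte fixed tuple header before the null bitmap and 8-byte alignment (typical of a 64-bit build; both constants here are illustrative assumptions, not taken from the patch), the aligned header size only shrinks when dropping trailing nulls moves the bitmap back across an alignment boundary:

```python
FIXED_HEADER = 23   # assumed bytes before the null bitmap (64-bit build)
MAXALIGN = 8        # assumed alignment boundary

def header_size(natts):
    """Aligned header size for a tuple with nulls tracking `natts` attributes."""
    bitmap = (natts + 7) // 8                       # one null bit per attribute
    return (FIXED_HEADER + bitmap + MAXALIGN - 1) // MAXALIGN * MAXALIGN
```

Under these assumptions, truncating a row from 9 tracked attributes down to 8 saves 8 bytes (32 vs 24), but truncating from 72 down to 9 saves nothing (both round to 32), which is why the win is claimed to be infrequent for rows that are merely somewhat wide.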
Re: [HACKERS] Gsoc2012 idea, tablesample
On Tue, Apr 17, 2012 at 11:27 AM, Stephen Frost wrote: > Qi, > > * Qi Huang (huangq...@hotmail.com) wrote: >> > Doing it 'right' certainly isn't going to be simply taking what Neil did >> > and updating it, and I understand Tom's concerns about having this be >> > more than a hack on seqscan, so I'm a bit nervous that this would turn >> > into something bigger than a GSoC project. >> >> As Christopher Browne mentioned, for this sampling method, it is not >> possible without scanning the whole data set. It improves the sampling >> quality but increases the sampling cost. I think it should also be using >> only for some special sampling types, not for general. The general sampling >> methods, as in the SQL standard, should have only SYSTEM and BERNOULLI >> methods. > > I'm not sure what sampling method you're referring to here. I agree > that we need to be looking at implementing the specific sampling methods > listed in the SQL standard. How much information is provided in the > standard about the requirements placed on these sampling methods? Does > the SQL standard only define SYSTEM and BERNOULLI? What do the other > databases support? What does SQL say the requirements are for 'SYSTEM'? Well, there may be cases where the quality of the sample isn't terribly important, it just needs to be "reasonable." I browsed an article on the SYSTEM/BERNOULLI representations; they both amount to simple picks of tuples. - BERNOULLI implies picking tuples with a specified probability. - SYSTEM implies picking pages with a specified probability. (I think we mess with this in ways that'll be fairly biased in view that tuples mayn't be of uniform size, particularly if Slightly Smaller strings stay in the main pages, whilst Slightly Larger strings get TOASTed...) 
I get the feeling that this is a somewhat-magical feature (in that users haven't much hope of understanding in what ways the results are deterministic) that is sufficiently "magical" that anyone serious about their result sets is likely to be unhappy to use either SYSTEM or BERNOULLI. Possibly the forms of sampling that people *actually* need, most of the time, are more like Dollar Unit Sampling, which are pretty deterministic, in ways that mandate that they be rather expensive (e.g. - guaranteeing Seq Scan). -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
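The two standard methods can be caricatured in a few lines of Python (a toy model, with invented function names, not the eventual executor code): BERNOULLI flips a coin per tuple, while SYSTEM flips a coin per page and returns every tuple on a chosen page, which is where the row-size bias creeps in when pages hold variable numbers of tuples.

```python
import random

def bernoulli_sample(rows, fraction, rng=random):
    # BERNOULLI: keep each row independently with probability `fraction`
    return [row for row in rows if rng.random() < fraction]

def system_sample(pages, fraction, rng=random):
    # SYSTEM: keep whole pages with probability `fraction`; every row on a
    # chosen page is returned, so pages packed with many small rows are
    # over-represented relative to pages of few large (TOASTed) rows
    sample = []
    for page in pages:
        if rng.random() < fraction:
            sample.extend(page)
    return sample
```

SYSTEM's attraction is that it touches only the chosen pages (block-level I/O), whereas BERNOULLI still has to visit every tuple; the price is that the sample is only unbiased if tuples are distributed uniformly across pages.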
[HACKERS] patch submission: truncate trailing nulls from heap rows to reduce the size of the null bitmap
The following patch truncates trailing null attributes from heap rows to reduce the size of the row bitmap. Applications often have wide rows in which many of the trailing column values are null. On an insert/update, all of the trailing null columns are tracked in the row bitmap. This can add a substantial overhead for very wide rows. This change truncates heap rows such that the trailing nulls are elided. The intuition for this change is that ALTER TABLE t ADD COLUMN c type NULL is a metadata only change. Postgres works fine when a row's metadata (tuple descriptor) is inconsistent with the actual row data: extra columns are assumed to be null. This change just adjusts the number of attributes for a row and the row bitmap to only track up to the last non-null attribute. Thanks. -Jamie Martin 0001-Truncate-trailing-null-attributes-from-heap-rows-to-.patch Description: Binary data
Re: [HACKERS] Gsoc2012 idea, tablesample
Qi,

* Qi Huang (huangq...@hotmail.com) wrote:
> > Doing it 'right' certainly isn't going to be simply taking what Neil did
> > and updating it, and I understand Tom's concerns about having this be
> > more than a hack on seqscan, so I'm a bit nervous that this would turn
> > into something bigger than a GSoC project.
>
> As Christopher Browne mentioned, for this sampling method, it is not possible
> without scanning the whole data set. It improves the sampling quality but
> increases the sampling cost. I think it should also be using only for some
> special sampling types, not for general. The general sampling methods, as in
> the SQL standard, should have only SYSTEM and BERNOULLI methods.

I'm not sure what sampling method you're referring to here. I agree that we need to be looking at implementing the specific sampling methods listed in the SQL standard. How much information is provided in the standard about the requirements placed on these sampling methods? Does the SQL standard only define SYSTEM and BERNOULLI? What do the other databases support? What does SQL say the requirements are for 'SYSTEM'?

Thanks, Stephen
Re: [HACKERS] Gsoc2012 idea, tablesample
> > 2. It's not very useful if it's just a dummy replacement for "WHERE
> > random() < ?". It has to be more advanced than that. Quality of the
> > sample is important, as is performance. There was also an
> > interesting idea on implementing monetary unit sampling.
>
> In reviewing this, I got the impression (perhaps mistaken..) that
> different sampling methods are defined by the SQL standard and that it
> would simply be up to us to implement them according to what the standard
> requires.
>
> > I think this would be a useful project if those two points are taken
> > care of.
>
> Doing it 'right' certainly isn't going to be simply taking what Neil did
> and updating it, and I understand Tom's concerns about having this be
> more than a hack on seqscan, so I'm a bit nervous that this would turn
> into something bigger than a GSoC project.

As Christopher Browne mentioned, this sampling method is not possible without scanning the whole data set. It improves the sampling quality but increases the sampling cost. I think it should be used only for some special sampling types, not in general. The general sampling methods, as in the SQL standard, should be only the SYSTEM and BERNOULLI methods.

Best Regards and Thanks
Huang Qi Victor
Computer Science, National University of Singapore
Re: [HACKERS] Gsoc2012 idea, tablesample
On Tue, Apr 17, 2012 at 2:49 PM, Stephen Frost wrote:
> * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
>> 1. We probably don't want the SQL syntax to be added to the grammar.
>> This should be written as an extension, using custom functions as
>> the API, instead of extra SQL syntax.
>
> Err, I missed that, and don't particularly agree with it.. Is there a
> serious issue with the grammar defined in the SQL standard? The other
> DBs which provide this- do they use the SQL grammar or something else?
>
> I'm not sure that I particularly *like* the SQL grammar, but if we're
> going to implement this, we should really do it 'right'.

I think the danger is that fiddling with the grammar can be a rat's nest of learning how to deal with bison grammars, learning how to interpret the ANSI standard, and bikeshedding. These are all parts of the open source world, so maybe an argument could be made that they should be part of a GSoC project, but I fear they would swallow the whole project.

But I think I agree that doing it as an external module would be strange and not very useful. I picture it instead as a new scan type which is basically a copy of heapscan or tidscan and uses various algorithms to decide which tuples to return. For a first-cut proof of concept I would leave out the grammar and just have a GUC that enabled replacing the heap scan with a sample scan everywhere. But that would have to be done as a patch to Postgres to add the new scan type. It wouldn't make it much easier to have a hook that replaced one scan type with another, I don't think.

-- greg
Re: [HACKERS] 9.3 Pre-proposal: Range Merge Join
On Mon, Apr 16, 2012 at 10:20 PM, Simon Riggs wrote:
> The thing I like most about temp indexes is that they needn't be temporary.
>
> I'd like to see something along the lines of demand-created optional
> indexes, that we reclaim space/maintenance overhead on according to
> some cache management scheme. The more space you have, the more of the
> important ones hang around. Roughly the same idea applies to
> materialised views.

I find this a) really scary, b) a huge increase in scope, and c) kind of pointless.

a) DDL and shared caches require all kinds of additional synchronization that would be really annoying to incur in the middle of DML queries unexpectedly. If these indexes are to be useful for other transactions, they would also impose a hidden and uncontrollable cost on all updates and inserts on the table.

b) I think it would be a lot harder to get working cleanly. You would need to invent a UI to control the lifetime and resources used by these objects, and deal with duplication between manually created and dynamically created objects. It's fundamentally a huge shift in the design of the database, changing what things the user maintains as part of their schema and what things the database maintains transparently. This is a big project on its own, aside from the technical side of things.

c) If these indexes are useful for many queries then the user can choose to create them themselves. The indexes you're proposing would be just as expensive as any of those indexes and offer few advantages aside from their automaticness. The same could be accomplished with some external daemon that just looked at the query logs and determined which indexes to create or not. The main advantage of creating dynamic indexes as part of the query would be lost, namely that these would be local data structures that don't need to be synchronized with other transactions and don't need to be updated by other transactions.
They're just part of the query execution, the way spilled hash tables and tuplesort tapes are. You could build indexes on materialized data resulting from earlier joins or aggregates and so on. The point is that if you make them equivalent to normal indexes, just dynamically maintained, then all you've done is change the way normal indexes work but you haven't really changed the set of queries they're useful for. That might be a neat UI feature, but if you want to change the set of queries Postgres can handle efficiently at all, then you need something that's fundamentally different from a table index.

As an aside, I think some of what you're looking for could be better handled with some kind of query result cache that could keep around the materialized data from plan nodes until an event happens that changes the data. This might be related to the underlying infrastructure needed for materialized views, but I haven't thought too deeply about it. It seems like a lot of work and a big change from the current very isolated per-process execution model.

-- greg
Re: [HACKERS] Bug tracker tool we need
Alex Shulgin wrote:
> Jay Levitt writes:
>> (A quick Google shows redmine and especially Trac having spam issues of their own.)
> Ugh, redmine (or trac for that matter) has nothing to do with handling spam. I believe a typical bug tracker doesn't handle spam itself, it lets the mailing system do that. Surely you can throw in some captcha plugins to try to reduce the spam posted from the web UI.

Maybe I'm confused - Magnus et al, are we talking spammy issues/issue comments/etc, or are we talking more about exposed email addresses? I assumed we meant spammy issues, like blog comments - spammers post issues and comments with links, to get PageRank to their sites. Email defenses wouldn't help here.

Captchas are fairly pointless nowadays, assuming you have someone dedicated enough to write a spambot against your bug tracker. Most of them (even reCAPTCHA!) can be >80% defeated by software - many 99% - and there are millions of humans hanging out on Mechanical Turk who'll solve them for you 100%. Modern anti-spam ends up being a machine learning and systems exercise... but that's another mailing list :) I think Google gets more use out of reCAPTCHA for OCR tweaking than for anti-spam.

Jay
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Hello, attached is a patch which has not been tested yet; it is meant to show the approach.

On Tue, Apr 17, 2012 at 9:38 PM, Kyotaro HORIGUCHI wrote:
> But I think referring to checkpoint_segments in such a case should be
> inhibited, and I suppose it is possible using StandbyMode in
> IsCheckpointOnSchedule().
>
> I will correct the patch later.

Hmm. StandbyMode is a local variable which cannot be accessed in the checkpointer. But WalRcvInProgress(), which shows whether the wal receiver is running, seems usable to ENABLE governing progress by checkpoint_segments.

| IsCheckpointOnSchedule(double progress)
| {
|     /*
|      * Inhibit governing progress by segments in archive recovery.
|      */
|     recovery_in_progress = RecoveryInProgress();
|     if (!recovery_in_progress || WalRcvInProgress())
|     {
|         recptr = recovery_in_progress ? GetXLogReplayRecPtr(NULL) :
|                  GetInsertRecPtr();

How about this?

regards,
-- Kyotaro Horiguchi NTT Open Source Software Center
== My e-mail address has been changed since Apr. 1, 2012.

standby_checkpoint_segments_9.2dev_fix_20120417v2.patch
Re: [HACKERS] Gsoc2012 idea, tablesample
Stephen Frost writes: > * Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote: >> Another idea that Robert Haas suggested was to add support doing a >> TID scan for a query like "WHERE ctid< '(501,1)'". That's not >> enough work for GSoC project on its own, but could certainly be a >> part of it. > I don't think Robert's suggestion would be part of a 'tablesample' > patch. Yeah, I don't see the connection either. It seems more like this would be a localized hack in tidpath.c and nodeTidscan.c. I think it'd be a neat beginning project for somebody, but it's not really related to the GSoC project as proposed. regards, tom lane
Re: [HACKERS] [JDBC] Regarding GSoc Application
On Tue, Apr 17, 2012 at 7:37 PM, Andrew Dunstan wrote:
>
> On 04/17/2012 09:12 AM, Atri Sharma wrote:
>>
>> I just had a small doubt I wanted to clarify. I initially said in my
>> proposal that I would be using SPI for getting the FDW API to call Pl/Java
>> functions, but now, after discussion with the community, I have changed the
>> approach and I will be using the JNI Invocation API instead for the same
>> purpose. I wanted to know if I need to send a modified proposal for the same
>> or the original proposal shall still hold good.
>
> These are not the right forums to ask this question. You need to ask on the
> GSoC-specific forum.
>
> cheers
>
> andrew

Thanks, Andrew. I'll keep that in mind.

-- Regards, Atri l'apprenant
Re: [HACKERS] [JDBC] Regarding GSoc Application
On 04/17/2012 09:12 AM, Atri Sharma wrote:
> I just had a small doubt I wanted to clarify. I initially said in my proposal that I would be using SPI for getting the FDW API to call Pl/Java functions, but now, after discussion with the community, I have changed the approach and I will be using the JNI Invocation API instead for the same purpose. I wanted to know if I need to send a modified proposal for the same or the original proposal shall still hold good.

These are not the right forums to ask this question. You need to ask on the GSoC-specific forum.

cheers

andrew
Re: [HACKERS] Last gasp
Jay Levitt writes:
>>> No meaningful search, eh? Works for me.
>
> Redmine searches return partial-word matches, and there's no way to
> disable that. Searching for "test" finds "latest". To me, that's
> broken.

Well, I believe one can plug in a different search engine, like Lucene or Xapian. However, it doesn't look like anyone has done so yet for ticket/wiki history, though there is a Xapian search plugin[1] to index attachments.

> Also, the UI is very 5 years ago; e.g., "compare revisions" uses the
> same columns-of-radio-buttons approach as MediaWiki. If the goal is a
> tool to reduce friction and increase involvement, you want a smoother
> UX.

Nothing that could not be tweaked with a plugin or core code modification here either. Not sure about the magnitude of the effort required, though.

-- Alex

[1] https://github.com/xelkano/redmine_xapian
Re: [HACKERS] Gsoc2012 idea, tablesample
* Heikki Linnakangas (heikki.linnakan...@enterprisedb.com) wrote:
> 1. We probably don't want the SQL syntax to be added to the grammar.
> This should be written as an extension, using custom functions as
> the API, instead of extra SQL syntax.

Err, I missed that, and don't particularly agree with it.. Is there a serious issue with the grammar defined in the SQL standard? The other DBs which provide this- do they use the SQL grammar or something else?

I'm not sure that I particularly *like* the SQL grammar, but if we're going to implement this, we should really do it 'right'.

> 2. It's not very useful if it's just a dummy replacement for "WHERE
> random() < ?". It has to be more advanced than that. Quality of the
> sample is important, as is performance. There was also an
> interesting idea on implementing monetary unit sampling.

In reviewing this, I got the impression (perhaps mistaken..) that different sampling methods are defined by the SQL standard and that it would simply be up to us to implement them according to what the standard requires.

> I think this would be a useful project if those two points are taken
> care of.

Doing it 'right' certainly isn't going to be simply taking what Neil did and updating it, and I understand Tom's concerns about having this be more than a hack on seqscan, so I'm a bit nervous that this would turn into something bigger than a GSoC project.

> Another idea that Robert Haas suggested was to add support doing a
> TID scan for a query like "WHERE ctid < '(501,1)'". That's not
> enough work for GSoC project on its own, but could certainly be a
> part of it.

I don't think Robert's suggestion would be part of a 'tablesample' patch. Perhaps a completely different project which was geared towards allowing hidden columns to be used in various ways in a WHERE clause..
Of course, we'd need someone to actually define that; I don't think someone relatively new to the project is going to know what experienced hackers want to do with system columns.

Thanks, Stephen
Re: [HACKERS] Gsoc2012 idea, tablesample
On 17.04.2012 14:55, Qi Huang wrote:
> Hi, Heikki
> Thanks for your advice. I will change my plan accordingly. But I have a few questions.
>> 1. We probably don't want the SQL syntax to be added to the grammar.
>> This should be written as an extension, using custom functions as
>> the API, instead of extra SQL syntax.
> 1. "This should be written as an extension, using custom functions as the API". Could you explain a bit more what this means?

I mean, it won't be integrated into the PostgreSQL server code. Rather, it will be a standalone module that can be distributed as a separate .tar.gz file, and installed on a server. PostgreSQL has some facilities to help you package code as extensions that can be easily distributed and installed.

>> 2. It's not very useful if it's just a dummy replacement for "WHERE
>> random() < ?". It has to be more advanced than that. Quality of the
>> sample is important, as is performance. There was also an interesting
>> idea on implementing monetary unit sampling.
> 2. In the plan, I mentioned using optimizer statistics to improve the quality of sampling.

Yeah, that's one approach. Would be nice to hear more about that, how exactly you can use optimizer statistics to help the sampling.

> I may emphasize that point. I will read about monetary unit sampling and add into the plan the possibility of implementing this idea.

Ok, sounds good.

>> Another idea that Robert Haas suggested was to add support doing a TID
>> scan for a query like "WHERE ctid < '(501,1)'". That's not enough work
>> for GSoC project on its own, but could certainly be a part of it.
> 3. I read about the replies on using ctid. But I don't quite understand how that might help. ctid is just the physical location of a row version within the table. If I do "WHERE ctid < '(501,1)'", what is actually happening?

At the moment, if you do "WHERE ctid = '(501,1)'", you get an access plan with a TidScan, which quickly fetches the row from that exact physical location.
But if you do "WHERE ctid < '(501,1)'", you get a SeqScan, which scans the whole table. That's clearly wasteful: you know the physical range of pages you need to scan, everything up to page 501, but the SeqScan will scan pages > 501, too. The idea is to improve that so that you'd only scan the pages up to page 501.

> Can I add this in as an optional implementation? I think I can check how to do this if I can have enough time in this project.

Yeah, that sounds reasonable.

> Besides, I saw the GSoC site editing has been closed. Should I just submit through this mailing list with an attachment?

Just post the updated details to this mailing list. Preferably inline, not as an attachment. You don't need to post the contact details, biography, etc, just the updated inch-stones and project details parts.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
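In other words, the ctid bound gives a physical upper limit on which pages need to be visited. A minimal sketch of the difference (plain Python illustrating only the page-selection logic, not the planner code):

```python
def seqscan_pages(total_pages):
    # Current plan for "WHERE ctid < '(501,1)'": read every page and
    # recheck the ctid qual tuple by tuple.
    return list(range(total_pages))

def tid_range_scan_pages(total_pages, bound_page):
    # Proposed behavior: the bound ctid's block number caps the scan,
    # so only pages 0 .. bound_page can contain qualifying tuples.
    return list(range(min(bound_page + 1, total_pages)))
```

On a 10000-page table, the bounded scan touches 502 pages instead of all 10000; the per-tuple qual recheck still applies on the boundary page.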
[HACKERS] libpq URI and regression testing
Hi,

When I committed Alex Shulgin's patch to add URI support to libpq, I included the test harness as well. However, due to it being in a separate subdirectory that did not previously have tests, it's not being run by the buildfarm. It's not considered in "make installcheck-world" either.

What's the preferred way to make it automatically tested as much as possible? I know the buildfarm does not run "installcheck-world", so if we want it there, it'd need a bit more code on the client side. I think it would be wise to have it also run on installcheck-world.

Hmm. Just had this thought: not all platforms support the same socket types. Maybe we should have separated the .in file into three parts: IPv4, IPv6, and Unix-domain socket. That way each platform would only run the tests that pertain to it. Right now there's a single "regress.in" file that lists all the tests.

Opinions?

-- Álvaro Herrera
Re: [HACKERS] Slow temporary tables when using sync rep
On 17 April 2012 14:35, Heikki Linnakangas wrote: > On 17.04.2012 14:10, Thom Brown wrote: >> >> On 17 April 2012 11:30, Heikki Linnakangas >> wrote: >>> >>> What happens is that we write the commit record if the transaction >>> accesses >>> a temporary table, but we don't flush it. However, we still wait until >>> it's >>> replicated to the standby. The obvious fix is to not wait for that, see >>> attached. >> >> >> Tested patch. Yes, that fixes the problem. Thanks. > > > Ok, committed. Thanks Heikki. -- Thom
Re: [HACKERS] Slow temporary tables when using sync rep
On 17.04.2012 14:10, Thom Brown wrote: On 17 April 2012 11:30, Heikki Linnakangas wrote: What happens is that we write the commit record if the transaction accesses a temporary table, but we don't flush it. However, we still wait until it's replicated to the standby. The obvious fix is to not wait for that, see attached. Tested patch. Yes, that fixes the problem. Thanks. Ok, committed. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Bug tracker tool we need
Jay Levitt writes:
> (A quick Google shows redmine and especially Trac having spam issues
> of their own.)

Ugh, redmine (or trac for that matter) has nothing to do with handling spam. I believe a typical bug tracker doesn't handle spam itself; it lets the mailing system do that. Surely you can throw in some captcha plugins to try to reduce the spam posted from the web UI.

-- Alex
Re: [HACKERS] Bug tracker tool we need
Magnus Hagander wrote: On Mon, Apr 16, 2012 at 23:48, Jay Levitt wrote: - Familiarity: Many developers already have a GitHub account and use it Most of the more senior developers don't use github. Other than possibly as a place to store a plain git repository. So that's not really relevant. I meant outside developers - the folks you'd like to see more involved in the process. - Patch commenting and git integration encourage actual review-resubmit cycles instead of "Here, look, I fixed it for you" reviews The amount of spam coming through that system, and the inability/unwillingness of github to even care about it is a killer argument *against* github. We have working antispam for email. The github antispam is somewhere around where email antispam was in 1994. Interesting; I haven't run into this but you're the second person to mention it here. Antispam is (in the large) a technically unsolvable problem; even in the '90s, we'd see hackers start poking at our newest countermeasures within the hour. GitHub is a giant target, and PG probably benefits here from NOT being one. (A quick Google shows redmine and especially Trac having spam issues of their own.) Pedantic note/fun fact: There was no email antispam in 1994; Canter & Siegel posted their infamous USENET Green Card spam that year, but it didn't really spread to email for another year or two. Once it did, there were fervent debates about whether it should be called "velveeta" to distinguish from the USENET variety. GitHub could well be a non-starter, but if third-party-dependence is really the holdup, I'd volunteer to write the tools - in fact, a google of [export issues from github] shows a few that might already suffice. It *is* a non-starter, because (a) it's a third party dependency, and (b) AFAIK they don't provide *data access* to the issue trackers. 
Sure they do: http://developer.github.com/v3/issues/

Jay
Re: [HACKERS] [JDBC] Regarding GSoc Application
On Wed, Apr 11, 2012 at 8:17 PM, Atri Sharma wrote: > > On Tue, Apr 10, 2012 at 10:41 PM, Atri Sharma wrote: Well. maybe I spoke too soon...JNI is probably the best route. Since SPI is off the table, all we're really pulling in from pl/java is the (non-trivial) proper installation of a jvm into a postgres process. pl/java is essentially a wrapper to JNI that does postgres to jni type conversion and builds a database driven class lookup system that allows building the java environment inside the database. That part is not needed at all; since we're not going to install our routines as pl/java installed they will be installed into a separate and more typical class path from a jvm point of view. There are no exposed routines in pljava.so that allow for easy manual invocation of a java routine -- it's all pretty much hardwired to the language function call handler system. This begs the question about why pl/java should be used at all. Well, as I said the jvm insertion into postgres is non-trivial so that alone is worth something. Maybe though we should just rip out the jvm guts that we really need (mostly some parts of backend.c and some of the type files) and do a completely standalone jni wrapper. pl/java's build system is a mess anyways so we're not losing all that much in trying to go off on our own and do something. OTOH, throwing a build system on top of the growing pile of things to do is turning this into a daunting project vs the (admittedly cobbled together) approach I was hoping to do earlier. Anyways, it's really Atri's call if he's comfortable proceeding. >>> >>> Hi All, >>> >>> I think we are back on the initial approach I proposed(hooking directly > into >>> the JVM and executing Java code that calls JDBC).I think the best way to > do >>> this is create a JVM that executes the Java code and give the control of > the >>> JVM to the native API. 
>>> I agree, the only need of Pl/Java that is apparent here is the need of the
>>> Java internals (JDK et al). If we set them up independently, then we can have
>>> the FDW wrapping JDBC directly through JNI. JNI would call pure Java
>>> functions to connect to JDBC.
>>>
>>> I think we can proceed with this. Once we are done with the API calling Java
>>> functions, I think the rest of the path is easily mapped (writing Java
>>> functions to connect to JDBC).
>>
>> yeah -- it sounds plausible. I think the next step is to pull one of
>> the fdw projects that is working and currently builds with pgxs. Make
>> sure it is bsd licensed and that you retain the original copyright in
>> the code (feel free to add your own). The mysql_fdw project for
>> example is a good candidate. After that we need to strip out all the
>> mysql specific parts so that we have a basic skeleton of the fdw
>> wrapper that still builds. From there we will need to build in the
>> jni calls as well as jvm initialization code we are going to more or
>> less directly copy from pl/java, but one thing at a time: IMO,
>> getting the jvm installed and executing a 'hello world' jni is a good
>> milestone to reach before getting into all the mechanics of the remote
>> querying.
>>
>> The jvm initialization code in pl/java we are going to use is in
>> src/C/pljava/backend.c. A lot of the initialization work done there
>> is irrelevant to what we are trying to do but there are some very
>> important bits like the on_proc_exit handler that does cleanup when
>> the postgres process exits. Ideally we can drop just the stuff we
>> need from there and get our project to compile.
>
> I agree, atm I'll work on getting JNI set up, building a JVM using it, and
> getting the "Hello, World" running in it in PostgreSQL.
> I'll keep you posted,
>
> Atri

Hi all,

I just had a small doubt I wanted to clarify. I initially said in my proposal that I would be using SPI for getting the FDW API to call Pl/Java functions, but now, after discussion with the community, I have changed the approach and I will be using the JNI Invocation API instead for the same purpose. I wanted to know if I need to send a modified proposal for the same or the original proposal shall still hold good.

Atri

-- Regards, Atri l'apprenant
Re: [HACKERS] [COMMITTERS] pgsql: Add new replication mode synchronous_commit = 'write'.
On 16 April 2012 17:21, Fujii Masao wrote:
> On Sun, Apr 15, 2012 at 12:13 AM, Thom Brown wrote:
>> No, that's not what I was referring to. If you don't have a standby
>> (i.e. a single, isolated database cluster with no replication), and
>> its synchronous_commit is set to 'remote_write', what effect does that
>> have?
>
> It's the same effect as 'on' and 'local', i.e., transaction commit waits
> for only local WAL flush. This behavior is not documented explicitly...
> How should we change the document? What about adding the following
> into the explanation of the synchronous_commit parameter (maybe at the
> end of its second paragraph)?
>
> -
> If synchronous_standby_names is not set, on, remote_write and local
> provide the same synchronization level; transaction commit only waits for
> local flush.
> -

Yes, that sounds fine.

-- Thom
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Sorry, I wrote something wrong.

>> The reason we haven't historically obeyed checkpoint_segments
>> during recovery is that it slows down the recovery
>> unnecessarily if you're restoring from a backup and you replay,
>
> The variable StandbyMode is false on archive recovery, so no
> checkpoint is triggered during then.

Nevertheless, checkpoints will be triggered by checkpoint_timeout and may run at the higher speed governed by checkpoint_segments. This is undesirable behavior from such a point of view. But I think referring to checkpoint_segments in such a case should be inhibited, and I suppose it is possible using StandbyMode in IsCheckpointOnSchedule().

I will correct the patch later.

regards,
-- Kyotaro Horiguchi NTT Open Source Software Center
== My e-mail address has been changed since Apr. 1, 2012.
Re: [HACKERS] Memory usage during sorting
On Mon, Apr 16, 2012 at 10:42 PM, Peter Geoghegan wrote:
> All but 4 regression tests pass, but they don't really count
> as failures, since they're down to an assumption in the tests that the
> order certain tuples appear should be the same as our current
> quicksort implementation returns them, even though, in these
> problematic cases, that is partially dictated by implementation - our
> quicksort isn't stable, but timsort is.

This is an interesting point. If we use a stable sort we'll probably be stuck with stable sorts indefinitely. People will start depending on the stability, and then we'll break their apps if we find a faster sort that isn't stable.

Notably, though, tapesort is not stable (because heapsort is not stable, so neither the runs nor the merge steps are stable). So people's apps would appear to work when their sorts fit in memory and fail only on large data sets. It's easily possible for a user's query to never need to go to tape, though.

We don't have the luxury of having separate sort and stable_sort operations, though, due to the ORDER BY clause. All in all I think it's handier to have a stable ORDER BY sort than an unstable one. So I'm not necessarily opposed to it even if it means we're stuck using a stable sort indefinitely.

-- greg
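The stability point is easy to demonstrate with a toy example (plain Python, not the tuplesort code): the built-in sorted() is stable, while a heap-based sort that compares only on the key, analogous to heapsort run formation, need not preserve input order among equal keys.

```python
import heapq

def heapsort_by_key(items, key):
    # Heap-based sort: when keys compare equal, the tiebreak here
    # falls through to comparing the items themselves, so equal-keyed
    # items can come out in an order unrelated to their input order.
    # In other words, this sort is NOT stable.
    decorated = [[key(x), x] for x in items]
    heapq.heapify(decorated)
    return [heapq.heappop(decorated)[1] for _ in items]

rows = [(1, 'second'), (1, 'first')]       # equal sort keys on column 1
stable = sorted(rows, key=lambda r: r[0])  # stable: keeps input order
unstable = heapsort_by_key(rows, key=lambda r: r[0])
```

Here `stable` preserves the input order of the two equal-keyed rows while `unstable` reorders them; an application relying on the former would break the first time its sort spilled to tape.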
Re: [HACKERS] Gsoc2012 idea, tablesample
Besides, I saw that the GSoC site editing has been closed. Should I just submit through this mailing list with an attachment?

Best Regards and Thanks,
Huang Qi Victor
Computer Science, National University of Singapore

> Date: Tue, 17 Apr 2012 09:16:29 +0300
> From: heikki.linnakan...@enterprisedb.com
> To: j...@agliodbs.com
> CC: huangq...@hotmail.com; pgsql-hackers@postgresql.org; and...@anarazel.de;
> alvhe...@commandprompt.com; neil.con...@gmail.com; dan...@heroku.com;
> cbbro...@gmail.com; kevin.gritt...@wicourts.gov
> Subject: [HACKERS] Gsoc2012 idea, tablesample
>
> On 24.03.2012 22:12, Joshua Berkus wrote:
> > Qi,
> >
> > Yeah, I can see that. That's a sign that you had a good idea for a
> > project, actually: your idea is interesting enough that people want to
> > debate it. Make a proposal on Monday and our potential mentors will help
> > you refine the idea.
>
> Yep. The discussion withered, so let me try to summarize:
>
> 1. We probably don't want the SQL syntax to be added to the grammar.
> This should be written as an extension, using custom functions as the
> API, instead of extra SQL syntax.
>
> 2. It's not very useful if it's just a dummy replacement for "WHERE
> random() < ?". It has to be more advanced than that. Quality of the
> sample is important, as is performance. There was also an interesting
> idea of implementing monetary unit sampling.
>
> I think this would be a useful project if those two points are taken
> care of.
>
> Another idea that Robert Haas suggested was to add support for doing a TID
> scan for a query like "WHERE ctid < '(501,1)'". That's not enough work
> for a GSoC project on its own, but could certainly be a part of it.
>
> --
> Heikki Linnakangas
> EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Gsoc2012 idea, tablesample
Hi, Heikki

Thanks for your advice. I will change my plan accordingly. But I have a few questions.

> 1. We probably don't want the SQL syntax to be added to the grammar.
> This should be written as an extension, using custom functions as the
> API, instead of extra SQL syntax.

1. "This should be written as an extension, using custom functions as the API." Could you explain a bit more what this means?

> 2. It's not very useful if it's just a dummy replacement for "WHERE
> random() < ?". It has to be more advanced than that. Quality of the
> sample is important, as is performance. There was also an interesting
> idea of implementing monetary unit sampling.

2. In the plan, I mentioned using optimizer statistics to improve the quality of sampling. I will emphasize that point. I will also read about monetary unit sampling and add the possibility of implementing that idea to the plan.

> Another idea that Robert Haas suggested was to add support for doing a TID
> scan for a query like "WHERE ctid < '(501,1)'". That's not enough work
> for a GSoC project on its own, but could certainly be a part of it.

3. I read the replies about using ctid, but I don't quite understand how it might help. ctid is just the physical location of a row version within the table. If I do "WHERE ctid < '(501,1)'", what actually happens? Can I add this as an optional part of the implementation? I will look into it if I have enough time in the project.
Best Regards and Thanks,
Huang Qi Victor
Computer Science Department, National University of Singapore
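The intuition behind the ctid question can be sketched with a toy model. A heap table is a sequence of fixed-size pages, and a ctid is a (block, offset) pair; a qual like `ctid < '(501,1)'` lets a scan stop after block 500 instead of reading the whole table, which makes block-based sampling cheap. This is an illustrative simulation only, not PostgreSQL code:

```python
# Toy model of a TID-range scan: tids are (block, offset) pairs,
# compared lexicographically, and the scan can stop as soon as it
# reaches the upper bound instead of reading every page.
def tid_range_scan(pages, upper_bound):
    """pages: tuples-per-page counts; yield tids strictly below upper_bound."""
    for block, tuples_on_page in enumerate(pages):
        for offset in range(1, tuples_on_page + 1):
            tid = (block, offset)
            if tid >= upper_bound:
                return          # everything after this point is out of range
            yield tid

# A 4-page "table" with 2 tuples per page; the scan stops at block 2,
# never touching the last two pages.
table = [2, 2, 2, 2]
print(list(tid_range_scan(table, (2, 1))))
# [(0, 1), (0, 2), (1, 1), (1, 2)]
```

That early stop is the whole benefit: a sampling extension could pick random blocks and read only those, rather than filtering every row with `random()`.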
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Hello,

> The reason we haven't historically obeyed checkpoint_segments
> during recovery is that it slows down the recovery
> unnecessarily if you're restoring from a backup and you replay,

The variable StandbyMode is false during archive recovery, so no checkpoint is triggered during that time.

xlog.c:10026 (in some version around 9.2):

| /*
|  * Request a restartpoint if we've replayed too much
|  * xlog since the last one.
|  */
| if (StandbyMode && bgwriterLaunched)
| {
|     if (XLogCheckpointNeeded(readId, readSeg))

> You could argue that you should obey checkpoint_segments in a
> standby server that's caught up with the master, but AFAICS the
> patch doesn't try to detect that.

Concerning checkpoints, there seems to be no need for the standby to know whether it has caught up with its master or not. Checkpoints are already kicked off by checkpoint_timeout regardless of the sync_state.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.
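The logic the quoted xlog.c fragment gates on StandbyMode can be paraphrased in a few lines. This is a simplified Python sketch with illustrative names and a made-up default; the real XLogCheckpointNeeded() works on WAL segment identifiers, not a plain counter:

```python
# Simplified model of the restartpoint request in the quoted snippet:
# only when running as a standby with the bgwriter launched do we ask
# whether enough WAL segments have been replayed since the last
# restartpoint to warrant a new one (checkpoint_segments-style pacing).
def restartpoint_requested(standby_mode: bool, bgwriter_launched: bool,
                           segs_since_last: int, checkpoint_segments: int = 10) -> bool:
    if standby_mode and bgwriter_launched:
        # Stand-in for XLogCheckpointNeeded(readId, readSeg).
        return segs_since_last >= checkpoint_segments
    # Archive recovery (StandbyMode false): never triggered this way.
    return False

print(restartpoint_requested(True, True, 12))   # True  - standby, past the limit
print(restartpoint_requested(False, True, 12))  # False - archive recovery
```

This makes the asymmetry Horiguchi describes visible: during archive recovery the segment-count path is skipped entirely, and only checkpoint_timeout remains.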
Re: [HACKERS] Slow temporary tables when using sync rep
On 17 April 2012 11:30, Heikki Linnakangas wrote:
> What happens is that we write the commit record if the transaction accesses
> a temporary table, but we don't flush it. However, we still wait until it's
> replicated to the standby. The obvious fix is to not wait for that, see
> attached.

Tested patch. Yes, that fixes the problem. Thanks.

--
Thom
Re: [HACKERS] Slow temporary tables when using sync rep
On 17.04.2012 02:54, Michael Nolan wrote:
> On Mon, Apr 16, 2012 at 6:27 PM, Thom Brown wrote:
>> Hi, I've noticed that when using synchronous replication (on 9.2devel at
>> least), temporary tables become really slow:
>
> Since temporary tables are only present until the session ends (or possibly
> only until a commit), why are they replicated at all?

They're not replicated. What happens is that we write the commit record if the transaction accesses a temporary table, but we don't flush it. However, we still wait until it's replicated to the standby. The obvious fix is to not wait for that, see attached.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e22bdac..d8523f3 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -1168,7 +1168,8 @@ RecordTransactionCommit(void)
 	 * Note that at this stage we have marked clog, but still show as running
 	 * in the procarray and continue to hold locks.
 	 */
-	SyncRepWaitForLSN(XactLastRecEnd);
+	if (wrote_xlog)
+		SyncRepWaitForLSN(XactLastRecEnd);
 
 	/* Reset XactLastRecEnd until the next transaction writes something */
 	XactLastRecEnd.xrecoff = 0;
Re: [HACKERS] Last gasp
> If the feature set is desirable, though, I wonder if Postgres is
> big/high profile enough for them to figure out some sort of better
> arrangement. They *love* it when big open-source projects use GitHub
> as their public repo - they'll email and blog announcements about
> it - and if there's interest I'd be happy to open a conversation
> with them.

No need to wonder; we've been in contact with them before, and they are very pro-Postgres.

> I've looked at it in conjunction with Jenkins CI; it looked nice but was way
> too heavy-weight for a four-person startup (what's code review?). It's
> probably much more suitable for this sized project.

> Gerrit's a full-featured code review app with a tolerable UI; MediaWiki is
> just starting to jump into git/Gerrit and there are definitely a lot of
> rough edges in that workflow still.

...

Someone mentioned upthread that GitHub spam was a problem. I'm not sure I see the issue here - wouldn't mail from them still go through our lists and our current anti-spam measures anyway?

--
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201204170623
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] Clobbered parameter names via DECLARE in PL/PgSQL
Hello,

Here is a WIP patch for plpgsql_check_function that supports this warning.

Regards
Pavel

2012/4/15 Pavel Stehule:
> 2012/4/15 Tom Lane:
>> Pavel Stehule writes:
>>> We can raise a warning from CREATE OR REPLACE FUNCTION - but I would
>>> like to have plpgsql_check_function inside core - and it is a better
>>> place for this and similar issues.
>>
>> I agree. This is a perfectly legal use of nested declaration scopes,
>> so it would be totally inappropriate to complain about it in normal
>> use of a plpgsql function. On the other hand, it would probably be
>> sane and useful for CHECK FUNCTION to flag any case where an inner
>> declaration shadows an outer-scope name (not only the specific case
>> of topmost block vs function parameter).
>
> yes, it is a very simple check there. There should be "levels" of
> warnings in the future, including performance or semantic warnings.
>
> But we don't need to increase the complexity of CHECK FUNCTION now. The
> design of CHECK FUNCTION was rich enough for this purpose. And we need
> to find a way to push plpgsql_check_function into core first.
>
> Regards
>
> Pavel
>
>> regards, tom lane

Attachment: plpgsql_check_function-2012-04-17-1.patch.gz (GNU Zip compressed data)
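The shadowing check Tom describes is conceptually simple. As a rough, language-neutral illustration (not PL/pgSQL's actual namespace machinery), a checker just walks the declaration scopes outermost-first and flags any inner declaration that reuses an enclosing name:

```python
# Toy model of the CHECK FUNCTION warning being discussed: detect an
# inner DECLARE that shadows a name from any enclosing scope,
# including the function's parameter list (the outermost "scope").
def find_shadowed(scopes):
    """scopes: outermost-first lists of declared names; return shadowed ones."""
    seen, shadowed = set(), []
    for scope in scopes:
        for name in scope:
            if name in seen:
                shadowed.append(name)  # redeclared in an inner scope
            seen.add(name)
    return shadowed

# Parameters (x, y), then a nested DECLARE block that reuses "x":
print(find_shadowed([["x", "y"], ["x", "z"]]))  # ['x']
```

Since shadowing is legal PL/pgSQL, this belongs in the static checker as a warning, not in normal function execution — which is exactly the argument for putting it in plpgsql_check_function.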
Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
On 17.04.2012 09:50, Kyotaro HORIGUCHI wrote:
> This is a new version of the patch. I replaced GetStandbyFlushRecPtr
> with GetXLogReplayRecPtr to check the progress of the checkpoint,
> following Fujii's suggestion.

The reason we haven't historically obeyed checkpoint_segments during recovery is that it slows down the recovery unnecessarily if you're restoring from a backup and you replay, say, one week's worth of WAL files. If for example you have checkpoint_segments=10 and checkpoint_timeout='15 mins' in the server that generated the WAL, you would be constantly performing a restartpoint if you trigger one every 10 segments.

You could argue that you should obey checkpoint_segments in a standby server that's caught up with the master, but AFAICS the patch doesn't try to detect that.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
Re: [HACKERS] Why can't I use pgxs to build a plpgsql plugin?
2012/4/17 Heikki Linnakangas:
> On 17.04.2012 07:56, Pavel Stehule wrote:
>> 2012/4/16 Heikki Linnakangas:
>>> Ok, committed. I fixed the .PHONY line as Tom pointed out, and changed
>>> MSVC install.pm to also copy the header file.
>>
>> Hello,
>>
>> it doesn't work for modules from the contrib directory
>>
>> pavel ~/src/postgresql/contrib/check_plpgsql $ make
>> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
>> -Wdeclaration-after-statement -Wendif-labels
>> -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
>> -fwrapv -fexcess-precision=standard -g -fpic -I. -I.
>> -I../../src/include -D_GNU_SOURCE -c -o check_plpgsql.o
>> check_plpgsql.c
>> check_plpgsql.c:16:21: fatal error: plpgsql.h: No such file or directory
>> compilation terminated.
>
> Hmm, the makefile rule I added copies the plpgsql.h file to the
> include/server directory when you do "make install". That makes the file
> available when you build with USE_PGXS=1, without access to the source
> tree, but doesn't change the situation when you build inside contrib.
>
> If you plop the module directly into contrib, I guess you'll have to do
>
> CFLAGS += -I$(top_srcdir)/src/pl/plpgsql/src
>
> That's what pldebugger has always done.

ok, thank you

Pavel

> --
> Heikki Linnakangas
> EnterpriseDB   http://www.enterprisedb.com