Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
* Tom Lane (t...@sss.pgh.pa.us) wrote: After a bit of study of the code, it appears to me that it's not too difficult to fix: we just have to use the expression's result type rather than the index column's atttypid in the subsequent processing. ANALYZE never actually looks at the index column contents (indeed cannot, since our index AMs provide no API for extracting the contents), I don't think it'll be an issue, but just in case- is there any reason this will cause trouble for the possible index-only quals/scans work? So that seems to make it not practical to back-patch. But we could get it into 9.0.. I thought of an ugly hack that would avoid an API/ABI break: since VacAttrStats.attr is a copy of the pg_attribute data, we could scribble on it, changing atttypid, attlen, attbyval, etc to match the index expression result type. This seems pretty grotty, but it would allow the fix to be back-patched into existing branches. Yeah, that's rather nasty. :/ Comments? I'm not sure which way to jump here. For my 2c- let's get it fixed for 9.0 cleanly and encourage people who run into this to upgrade to that once it's released. Perhaps I've missed it, but I don't recall seeing many complaints about this. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] review: xml_is_well_formed
On fre, 2010-07-30 at 12:50 +0100, Mike Fowler wrote: * xml_is_well_formed returns true for simple text postgres=# SELECT xml_is_well_formed(''); xml_is_well_formed t (1 row) it is probably wrong result - is it ok?? Yes this is OK, pure text is valid XML content. Are you speaking of XML content fragments that SQL/XML defines? Well-formedness should probably only allow XML documents. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
On Sat, Jul 31, 2010 at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: Another misbehavior that I noted while playing with Artur's example is that, while GIN index build seems to adhere pretty well to whatever maintenance_work_mem limit you give it in 8.4, it blows right by that limit in 9.0 and HEAD --- I observed it happily eating something north of 128MB with a maintenance_work_mem of 70MB. It looks to me like the reason is the new rbtree.c code, which adds a whole lot of data structure that's not being counted against the maintenance_work_mem limit. Now the first question that might be asked is what we'd need to do to rbtree.c's API to allow its memory consumption to be tracked. However, I think there's a considerably more pressing question, which is what is it that rbtree.c is doing that justifies a 2X bloat factor in GIN index build --- which is pretty much what it's costing us, given the observation above. We know that the amount of memory available for the build has an enormous effect on build time. If we just do the obvious thing of including the rbtree data structure in the maintenance_work_mem calculation, what we're going to get is a very substantial slowdown in build speed for the same maintenance_work_mem setting, compared to the way 8.4 worked. So, I would like somebody to show cause why that whole module shouldn't be ripped out and the code reverted to where it was in 8.4. My recollection is that the argument for adding it was to speed things up in corner cases, but what I think it's actually going to do is slow things down in every case. I've always been a bit suspicious of this code, too, even though I didn't think about the memory consumption issue. But see here: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00307.php I think there may be a better way to avoid the pathological cases, but I'm not sure what it is. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] More fun with GIN lossy-page pointers
I fixed the problem I was on about earlier with ginget.c doing the wrong thing for keystreams like this: ... ... 42/642/65535 42/7... ... (To recap, after reporting the potential lossy match for 42/6, the previous code would advance both keystreams and thus fail to see the potential lossy match for 42/7.) But in the shower this morning I realized there was another case I had not considered: ... ... 42/642/7 50/142/65535 ... ... After deciding there is no possible match at 42/6, both the original code and my patch will choose to advance the first keystream. This means they will never compare 42/6 to the lossy pointer and recognize there's a potential lossy match. So far as I can see, it's impossible to handle this situation when examining only one TID per stream with no lookahead. Choosing to advance the second stream would obviously fail in many other cases, so there is no correct action. The only reasonable way out is to forbid the case --- that is, decree that a keystream may *not* contain both lossy and nonlossy pointers to the same page. Now it looks to me like there's no actual bug so far as keyGetItem is concerned, because in fact that's how entryGetItem will behave: it can only return a lossy pointer when a partial match bitmap has gone lossy, and then the bitmap won't return any real pointers to that page. However it *is* possible for the revised version of keyGetItem to return such a keystream, thus breaking scanGetItem. The problematic case is where the consistentFn has OR logic and we have a situation like ... ... 42/642/65535 50/1... ... We'll correctly report 42/6 as a lossy result, but then the next call will advance only the first keystream, and decide that it needs to test nothing-and-42/65535 as a possible match. Given an OR query, that'll succeed, and we'll return 42/65535 next, thus breaking the rule that needs to hold for scanGetItem. So this looks like a bit of a mess to resolve. I think what we have to do is first see if the consistentFn will succeed when only the lossy stream is TRUE, and if so report the whole-page pointer as a lossy match (and advance over it as well as all the non-lossy pointers for the same page on the next call). Otherwise continue as now. In any case the code comments are all wet because they state that a keystream with both lossy and nonlossy pointers on the same page is OK. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
Stephen Frost sfr...@snowman.net writes: * Tom Lane (t...@sss.pgh.pa.us) wrote: After a bit of study of the code, it appears to me that it's not too difficult to fix: we just have to use the expression's result type rather than the index column's atttypid in the subsequent processing. ANALYZE never actually looks at the index column contents (indeed cannot, since our index AMs provide no API for extracting the contents), I don't think it'll be an issue, but just in case- is there any reason this will cause trouble for the possible index-only quals/scans work? Not that I can see. What ANALYZE and the planner want to do is provide stats on the behavior of the indexed expression. The fact that certain index opclasses don't actually store that value is of no interest. Even if it did become of interest later on, we're under no compulsion to not redefine the contents of pg_statistic when we need to. So that seems to make it not practical to back-patch. But we could get it into 9.0.. Hmm. I was thinking it was a bit late for 9.0 too, since it'd invalidate any testing anyone's done of their custom typanalyze functions. However, since we're still only in beta, maybe that's acceptable. Comments? I'm not sure which way to jump here. For my 2c- let's get it fixed for 9.0 cleanly and encourage people who run into this to upgrade to that once it's released. Perhaps I've missed it, but I don't recall seeing many complaints about this. It's only been very recently that we had any useful stats capability that could apply in such situations (in fact I think we still haven't actually shipped a non-bogus version of tsvector typanalyze :-(). So I'm not sure anyone would have realized they were missing something. But it would be nice to have this working in 8.4, and I'll be quite unhappy if we don't have it in 9.0. If people feel it's not too late for an ABI break in 9.0 then I'm willing to compromise on backpatching the clean fix. I'm not entirely convinced it's that much cleaner than the hack solution though. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
On Jul 31, 2010, at 11:00 AM, Tom Lane t...@sss.pgh.pa.us wrote: It's only been very recently that we had any useful stats capability that could apply in such situations (in fact I think we still haven't actually shipped a non-bogus version of tsvector typanalyze :-(). So I'm not sure anyone would have realized they were missing something. But it would be nice to have this working in 8.4, and I'll be quite unhappy if we don't have it in 9.0. If people feel it's not too late for an ABI break in 9.0 then I'm willing to compromise on backpatching the clean fix. I'm not entirely convinced it's that much cleaner than the hack solution though. I think this whole discussion is starting with the wrong premise. This is not a bug fix; therefore, it's 9.1 material. If anyone else had suggested slipping this into 9.0, let alone 8.4, we would have punted it in a heartbeat. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: So, I would like somebody to show cause why that whole module shouldn't be ripped out and the code reverted to where it was in 8.4. My recollection is that the argument for adding it was to speed things up in corner cases, but what I think it's actually going to do is slow things down in every case. I've always been a bit suspicious of this code, too, even though I didn't think about the memory consumption issue. But see here: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00307.php I did a bit of experimentation and confirmed my fears: HEAD is willing to eat about double the specified maintenance_work_mem. If you cut back the setting so that its actual memory use is no more than 8.4's, it's about 33% slower on non-pathological data (I'm testing the dataset from Artur Dabrowski here). The referenced tests showing significant speedup over 8.4 were presumably all done without controlling for memory usage. I'm not sure how much those results need to be discounted, but they can't be taken at face value. I think there may be a better way to avoid the pathological cases, but I'm not sure what it is. After a bit of further study I think maybe something could be done towards making rbtree.c less memory-hungry: it's basically been coded with zero attention to memory efficiency. The RBNode structs are individually palloc'd, which means that on a 64-bit machine they take about 80 bytes apiece. The EntryAccumulator structs they are frontends for are only 32 bytes apiece (plus the probably-pass-by-reference Datum value, which we can't easily do anything with), and they are allocated in groups to avoid palloc overhead. EntryAccumulator did get 16 bytes smaller since 8.4 as a result of removing the left/right pointers that the rbtree structure is substituting for, but that doesn't make up for this. I'm tempted to suggest that making RBNode be a hidden struct containing a pointer to somebody else's datum is fundamentally the wrong way to go about things, because the extra void pointer is pure overhead, and we aren't ever going to be using these things in a context where memory usage isn't of concern. If we refactored the API so that RBNode was intended to be the first field of some larger struct, as is done in dynahash tables for instance, we could eliminate the void pointer and the palloc inefficiency. The added storage compared to what 8.4 used would be a parent link and the iteratorState/color fields, which would end up costing us 16 more bytes per EntryAccumulator rather than 64. Still not great but at least it's not a 2X penalty, and the memory allocation would become the caller's problem not rbtree's, so the problem of tracking usage would be no different from before. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for check constraints using multiple inheritance
On Fri, Jul 30, 2010 at 4:38 PM, Yeb Havinga yebhavi...@gmail.com wrote: I'm looking at ATPrepAddColumn right now, where there is actually some comments about getting the right attinhcount in the case of multiple inherited children, but it's the first time I'm looking at this part of PostgreSQL and it needs some time to sink in. It looks a bit like at the place of the comment /* child should see column as singly inherited */, maybe the recursion could be stopped there as well, i.e. not only setting inhcount to 1, but actually adding the child relation one time to the wqueue. I believe the crux is to stop double recursion from a parent in ATOneLevelRecursion. I did a quick test by adding a globally defined static List *visitedrels = NIL; I agree that's the crux of the problem, but I can't see solving it with a global variable. I realize you were just testing... PS: forgot to say thanks for looking into this problem earlier in the thread. Thanks! yw -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
Robert Haas robertmh...@gmail.com writes: I think this whole discussion is starting with the wrong premise. This is not a bug fix; therefore, it's 9.1 material. Failing to store stats isn't a bug? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:40 AM, Tom Lane t...@sss.pgh.pa.us wrote: So, I would like somebody to show cause why that whole module shouldn't be ripped out and the code reverted to where it was in 8.4. My recollection is that the argument for adding it was to speed things up in corner cases, but what I think it's actually going to do is slow things down in every case. I've always been a bit suspicious of this code, too, even though I didn't think about the memory consumption issue. But see here: http://archives.postgresql.org/pgsql-hackers/2010-02/msg00307.php I did a bit of experimentation and confirmed my fears: HEAD is willing to eat about double the specified maintenance_work_mem. If you cut back the setting so that its actual memory use is no more than 8.4's, it's about 33% slower on non-pathological data (I'm testing the dataset from Artur Dabrowski here). That seems like a pretty serious regression. I'm tempted to suggest that making RBNode be a hidden struct containing a pointer to somebody else's datum is fundamentally the wrong way to go about things, because the extra void pointer is pure overhead, and we aren't ever going to be using these things in a context where memory usage isn't of concern. If we refactored the API so that RBNode was intended to be the first field of some larger struct, as is done in dynahash tables for instance, we could eliminate the void pointer and the palloc inefficiency. The added storage compared to what 8.4 used would be a parent link and the iteratorState/color fields, which would end up costing us 16 more bytes per EntryAccumulator rather than 64. Still not great but at least it's not a 2X penalty, and the memory allocation would become the caller's problem not rbtree's, so the problem of tracking usage would be no different from before. Even if we do that, is it still going to be too much of a performance regression overall? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm tempted to suggest that making RBNode be a hidden struct containing a pointer to somebody else's datum is fundamentally the wrong way to go about things, because the extra void pointer is pure overhead, and we aren't ever going to be using these things in a context where memory usage isn't of concern. If we refactored the API so that RBNode was intended to be the first field of some larger struct, as is done in dynahash tables for instance, we could eliminate the void pointer and the palloc inefficiency. Even if we do that, is it still going to be too much of a performance regression overall? Looking back, EntryAccumulator was 56 bytes on 64-bit machines in 8.4 (it should have been smaller, but a poor choice of field ordering left a lot of pad space). Right now it's 32 bytes, and if we stick an RBNode field in the front it'd be 64. So that'd be a 14% penalty compared to 8.4, as opposed to the present situation which is a 100% penalty (32+80 bytes per entry). On 32-bit machines the numbers are 32 bytes (8.4) versus 20+40 (HEAD) versus 36 bytes (my proposal), so 12.5% penalty versus 87.5%. (All of these numbers should be discounted by whatever space you want to assume the pass-by-reference key datum takes.) So it'd definitely be a lot better than now. Maybe there'd be some remaining performance gap for non-pathological cases, but I think we would be willing to pay that in order to avoid bad behavior for the pathological cases. It's difficult to say for sure of course without going to the trouble of coding and testing it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:02 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm tempted to suggest that making RBNode be a hidden struct containing a pointer to somebody else's datum is fundamentally the wrong way to go about things, because the extra void pointer is pure overhead, and we aren't ever going to be using these things in a context where memory usage isn't of concern. If we refactored the API so that RBNode was intended to be the first field of some larger struct, as is done in dynahash tables for instance, we could eliminate the void pointer and the palloc inefficiency. Even if we do that, is it still going to be too much of a performance regression overall? Looking back, EntryAccumulator was 56 bytes on 64-bit machines in 8.4 (it should have been smaller, but a poor choice of field ordering left a lot of pad space). Right now it's 32 bytes, and if we stick an RBNode field in the front it'd be 64. So that'd be a 14% penalty compared to 8.4, as opposed to the present situation which is a 100% penalty (32+80 bytes per entry). On 32-bit machines the numbers are 32 bytes (8.4) versus 20+40 (HEAD) versus 36 bytes (my proposal), so 12.5% penalty versus 87.5%. (All of these numbers should be discounted by whatever space you want to assume the pass-by-reference key datum takes.) So it'd definitely be a lot better than now. Maybe there'd be some remaining performance gap for non-pathological cases, but I think we would be willing to pay that in order to avoid bad behavior for the pathological cases. It's difficult to say for sure of course without going to the trouble of coding and testing it. Well, it sounds like a reasonable thing to try, then. You going to take a crack at it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: So it'd definitely be a lot better than now. Maybe there'd be some remaining performance gap for non-pathological cases, but I think we would be willing to pay that in order to avoid bad behavior for the pathological cases. It's difficult to say for sure of course without going to the trouble of coding and testing it. Well, it sounds like a reasonable thing to try, then. You going to take a crack at it? Yeah, I'll have at it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
On Sat, Jul 31, 2010 at 12:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I think this whole discussion is starting with the wrong premise. This is not a bug fix; therefore, it's 9.1 material. Failing to store stats isn't a bug? Well, it kind of sounds more like you're removing a known limitation than fixing a bug. It's not as if the behavior fails to match the comment. I'm pretty hesitant to see us making any changes to 9.0 that aren't necessary to fix existing bugs or new regressions. What I want to do with 9.0 is get it stable and ship it. I'm not really terribly concerned about the possibility of an ABI break even at this late date, but I *am* concerned about the possibility either of (1) unforeseen consequences necessitating further patching or (2) getting distracted from the business of getting the release out the door. We've been in feature freeze for more than five months, so I think it's certainly time try to reduce to an absolute minimum the number of changes that need to be made before release. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: xml_is_well_formed
On Sat, Jul 31, 2010 at 8:10 AM, Peter Eisentraut pete...@gmx.net wrote: On fre, 2010-07-30 at 12:50 +0100, Mike Fowler wrote: * xml_is_well_formed returns true for simple text postgres=# SELECT xml_is_well_formed(''); xml_is_well_formed t (1 row) it is probably wrong result - is it ok?? Yes this is OK, pure text is valid XML content. Are you speaking of XML content fragments that SQL/XML defines? Well-formedness should probably only allow XML documents. I think the point of this function is to determine whether a cast to xml will throw an error. The behavior should probably match exactly whatever test would be applied there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
Robert Haas 07/31/10 12:33 PM Tom Lane wrote: Robert Haas writes: I think this whole discussion is starting with the wrong premise. This is not a bug fix; therefore, it's 9.1 material. Failing to store stats isn't a bug? Well, it kind of sounds more like you're removing a known limitation than fixing a bug. It's operating as designed and documented. There is room for enhancement, but the only thing which could possibly justify this as 9.0 material is if there was a demonstrated performance regression in 9.0 for which this was the safest cure. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review patch: Distinguish between unique indexes and unique constraints
On Thu, Jul 29, 2010 at 7:56 PM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: The patch is in context diff format and applies cleanly. No doc changes were included. Arguably there should be a mention in the documentation for psql's \d+ commend, but since the number of child tables and the display of reloptions aren't mentioned, perhaps we're not trying to list *all* the differences the + makes. If those don't merit mention, this doesn't. The patch implements what it's supposed to, there was consensus on the list that we want it, we don't already have it, the SQL spec doesn't apply to psql's backslash commands, and it was added just for the verbose listing as requested (with no objection). There are no pg_dump issues. The only danger is that someone is depending on the current \d+ format and will be surprised at the new distinction between unique indexes and unique constraints. All bases seem to me to be covered. The feature works as advertised, I saw no problem corner cases, no assertion failures or crashes. The patch causes no noticeable performance change, doesn't claim to affect performance, and will not slow down anything else. The patch does not follow coding guidelines, as it places the opening brace for an if block on the same line as the if. There are no portability issues; it should work everywhere that current backslash commands work. The purpose of the code is obvious enough to not require any comment lines. It does what it says, correctly. It doesn't produce compiler warnings and does not crash. It fits together coherently with all else, and introduces no new interdependencies. I am attaching a new version of the patch which fixes the formatting issue and rearranges the code slightly in a way which I find more readable. (I leave it to the committer to determine which arrangement better suits.) I am marking this Ready for Committer. I have committed this patch with a few changes. First, I felt that there was little point in showing this detail only in verbose mode; indeed, it seems like that could be confusing in some circumstances. (I thought I checked this was an index not a constraint? Oh, crap, I forgot to say \d+). So I removed that check. Second, I felt that UNIQUE, CONSTRAINT, btree (a) was one too many commas, so I made it just say UNIQUE CONSTRAINT, btree (a) instead. I didn't see any discussion of either of these points in the previous discussion of this patch, so hopefully neither change is objectionable (but if it is, we can revisit it, if someone thinks I'm all wet!). Thanks to Josh for the patch and Kevin for the review; I think this is a good change. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
* Kevin Grittner (kevin.gritt...@wicourts.gov) wrote: Robert Haas 07/31/10 12:33 PM Tom Lane wrote: Failing to store stats isn't a bug? Well, it kind of sounds more like you're removing a known limitation than fixing a bug. It's operating as designed and documented. There is room for enhancement, but the only thing which could possibly justify this as 9.0 material is if there was a demonstrated performance regression in 9.0 for which this was the safest cure. I have to disagree with this, to be honest. The fact that we've documented what is completely unexpected and frustrating behaviour doesn't mean we get to say it's not a bug. Not collecting stats, at all, is a pretty bad bug, in my view. Stats are an important part of the system which needs to work at least decently. Perhaps before it was pretty rare that we'd have the situation described (before we brought in tsearch2), but it's not any longer and we need to support it as we would the other types. The only reason I'm against backpatching it to the beginning is that it's either an ABI change or some rather grotty code, and even then it wouldn't be hard to push me to accepting the grotty code if we make the cleaner change for 9.0 and going forward, especially as we have people in the wild being affected by it. Certain other databases have done a very good job of documenting their bugs and in some cases even calling them features. I'd rather we not go down that path. I don't see the lack of stats collecting to be a simple 'limitation'. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] More fun with GIN lossy-page pointers
Excerpts from Tom Lane's message of sáb jul 31 09:57:13 -0400 2010: So far as I can see, it's impossible to handle this situation when examining only one TID per stream with no lookahead. Choosing to advance the second stream would obviously fail in many other cases, so there is no correct action. The only reasonable way out is to forbid the case --- that is, decree that a keystream may *not* contain both lossy and nonlossy pointers to the same page. Would it make sense to order the streams differently? I mean, what if whole-page pointers in the lossy stream are processed before regular ones? (I am thoroughly unfamiliar with this stuff, so forgive me if I've gotten the concepts and terminology all wrong) -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] More fun with GIN lossy-page pointers
Alvaro Herrera alvhe...@commandprompt.com writes: Excerpts from Tom Lane's message of sáb jul 31 09:57:13 -0400 2010: So far as I can see, it's impossible to handle this situation when examining only one TID per stream with no lookahead. Choosing to advance the second stream would obviously fail in many other cases, so there is no correct action. The only reasonable way out is to forbid the case --- that is, decree that a keystream may *not* contain both lossy and nonlossy pointers to the same page. Would it make sense to order the streams differently? I mean, what if whole-page pointers in the lossy stream are processed before regular ones? Hmm ... interesting thought. I'm not sure what the implications are, but it's definitely worth considering. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost sfr...@snowman.net wrote: * Kevin Grittner (kevin.gritt...@wicourts.gov) wrote: Robert Haas 07/31/10 12:33 PM Tom Lane wrote: Failing to store stats isn't a bug? Well, it kind of sounds more like you're removing a known limitation than fixing a bug. It's operating as designed and documented. There is room for enhancement, but the only thing which could possibly justify this as 9.0 material is if there was a demonstrated performance regression in 9.0 for which this was the safest cure. I have to disagree with this, to be honest. The fact that we've documented what is completely unexpected and frustrating behaviour doesn't mean we get to say it's not a bug. Not collecting stats, at all, is a pretty bad bug, in my view. I guess I'd appreciate it if someone could explain in more detail in what cases we fail to collect stats. Do we have a typanalyze function here that can't possibly work for anything, ever? Or is it just some subset of the cases? (Apologies if this has been discussed on the original thread; I was unable to find it in the archives.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: psql: edit function, show function commands patch
On Sun, Jul 25, 2010 at 11:42 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I'm setting this as ready for committer. Thank you very much I took a look at this tonight and am a bit mystified by the following bit: + /* +* PL doesn't calculate first row of function's body +* when first row is empty. So checks first row, and +* correct lineno when it is necessary. +*/ Is that true of any PL, or just some particular PL? Is the behavior described here a bug that we should fix, or is it, for some reason, considered correct? Is it documented in our documentation? The implementation of first_row_is_empty() looks pretty kludgey, too. It seems to me that it will fail completely if the text of the function definition happens to contain $function$. CREATE OR REPLACE FUNCTION boom() RETURNS text LANGUAGE plpgsql AS $$ BEGIN SELECT '$function$'; END $$; -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rbtree code breaks GIN's adherence to maintenance_work_mem
Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 12:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: So it'd definitely be a lot better than now. Maybe there'd be some remaining performance gap for non-pathological cases, but I think we would be willing to pay that in order to avoid bad behavior for the pathological cases. It's difficult to say for sure of course without going to the trouble of coding and testing it. Well, it sounds like a reasonable thing to try, then. You going to take a crack at it? This worked out pretty well, so I've applied the attached patch. I observed the following timings running the index build from Artur Dobrowski's example: 8.4 9.0b4 HEAD maintenance_work_mem setting100MB 60MB100MB actual peak process image size 118M126M112M elapsed time964s1289s 937s so the memory bloat problem is definitely fixed and the speed seems satisfactory. It might be worth commenting that after I'd rewritten the rbtree code, I spent awhile pulling my hair out because the code *still* blew past the expected memory usage. It turned out that there were some significant leaks in ginEntryInsert and subsidiary routines, causing memory usage to continue to grow even once we realized we'd hit maintenance_work_mem and started to dump data to disk. These leaks were there before, but were masked in 8.4 because the old ginGetEntry code incrementally pfreed itempointer-list storage as it walked the data structure; this caused storage to be released faster than the leaks would consume it. The new code doesn't release any of that storage until the MemoryContextReset after the dump loop, so any leakage in the dump loop becomes a visible increase in the peak space consumption. I wasn't able to remove all of the leakage --- there's some fairly ugly coding associated with index page splits that leaks an indextuple or so per split, and I'm not sure where the extra tuple could safely be pfree'd. That seems to be small enough to live with though; it's less than 0.5% of the total memory usage in the example I'm working with. I also cleaned up some other stuff that would've bit us eventually, like unnecessary use of recursion in the rbtree iteration code. regards, tom lane Index: src/backend/access/gin/ginbtree.c === RCS file: /cvsroot/pgsql/src/backend/access/gin/ginbtree.c,v retrieving revision 1.15 diff -c -r1.15 ginbtree.c *** src/backend/access/gin/ginbtree.c 2 Jan 2010 16:57:33 - 1.15 --- src/backend/access/gin/ginbtree.c 1 Aug 2010 01:44:08 - *** *** 267,272 --- 267,274 /* * Insert value (stored in GinBtree) to tree described by stack + * + * NB: the passed-in stack is freed, as though by freeGinBtreeStack. */ void ginInsertValue(GinBtree btree, GinBtreeStack *stack) *** *** 308,317 PageSetTLI(page, ThisTimeLineID); } ! UnlockReleaseBuffer(stack-buffer); END_CRIT_SECTION(); ! freeGinBtreeStack(stack-parent); return; } else --- 310,320 PageSetTLI(page, ThisTimeLineID); } ! LockBuffer(stack-buffer, GIN_UNLOCK); END_CRIT_SECTION(); ! freeGinBtreeStack(stack); ! return; } else *** *** 325,331 */ newlpage = btree-splitPage(btree, stack-buffer, rbuffer, stack-off, rdata); - ((ginxlogSplit *) (rdata-data))-rootBlkno = rootBlkno; parent = stack-parent; --- 328,333 *** *** 341,347 ((ginxlogSplit *) (rdata-data))-isRootSplit = TRUE; ((ginxlogSplit *) (rdata-data))-rrlink = InvalidBlockNumber; - page = BufferGetPage(stack-buffer); lpage = BufferGetPage(lbuffer); rpage = BufferGetPage(rbuffer); --- 343,348 *** *** 375,384 UnlockReleaseBuffer(rbuffer); UnlockReleaseBuffer(lbuffer); ! UnlockReleaseBuffer(stack-buffer); ! END_CRIT_SECTION(); return; } else --- 376,386 UnlockReleaseBuffer(rbuffer); UnlockReleaseBuffer(lbuffer); ! LockBuffer(stack-buffer, GIN_UNLOCK); END_CRIT_SECTION(); + freeGinBtreeStack(stack); + return; } else Index: src/backend/access/gin/ginbulk.c === RCS file: /cvsroot/pgsql/src/backend/access/gin/ginbulk.c,v retrieving revision 1.19 diff -c -r1.19 ginbulk.c *** src/backend/access/gin/ginbulk.c 26 Feb 2010 02:00:33 - 1.19 --- src/backend/access/gin/ginbulk.c 1 Aug 2010 01:44:08 - *** *** 19,35 #include utils/memutils.h ! #define DEF_NENTRY 2048 ! #define DEF_NPTR 4 - static void * - ginAppendData(void *old, void *new, void *arg) - { - EntryAccumulator *eo = (EntryAccumulator *) old, - *en =
Re: [HACKERS] review: psql: edit function, show function commands patch
Robert Haas robertmh...@gmail.com writes: I took a look at this tonight and am a bit mystified by the following bit: + /* +* PL doesn't calculate first row of function's body +* when first row is empty. So checks first row, and +* correct lineno when it is necessary. +*/ Is that true of any PL, or just some particular PL? plpgsql has an old bit of logic that deliberately ignores an initial newline in the function body: /*-- * Hack: skip any initial newline, so that in the common coding layout *CREATE FUNCTION ... AS $$ *code body *$$ LANGUAGE plpgsql; * we will think line 1 is what the programmer thinks of as line 1. *-- */ if (*cur_line_start == '\r') cur_line_start++; if (*cur_line_start == '\n') cur_line_start++; None of the other standard PLs do that AFAIK. Is it documented in our documentation? I don't think so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ANALYZE versus expression indexes with nondefault opckeytype
Robert Haas robertmh...@gmail.com writes: On Sat, Jul 31, 2010 at 9:16 PM, Stephen Frost sfr...@snowman.net wrote: * Kevin Grittner (kevin.gritt...@wicourts.gov) wrote: Robert Haas 07/31/10 12:33 PM Tom Lane wrote: Failing to store stats isn't a bug? Well, it kind of sounds more like you're removing a known limitation than fixing a bug. It's operating as designed and documented. I have to disagree with this, to be honest. The fact that we've documented what is completely unexpected and frustrating behaviour doesn't mean we get to say it's not a bug. Not collecting stats, at all, is a pretty bad bug, in my view. I'm a bit bemused by the claim that this behavior is documented. One comment buried deep in the bowels of the source is not user-visible documentation in my book. I guess I'd appreciate it if someone could explain in more detail in what cases we fail to collect stats. Do we have a typanalyze function here that can't possibly work for anything, ever? Or is it just some subset of the cases? ANALYZE normally collects stats for any expression that there is an expression index for. However, it will punt and fail to collect stats if the expression index uses an opclass whose opckeytype (ie, storage datatype) is different from the actual expression datatype. A quick look into the system catalogs shows that that applies to these opclasses: amname | opcname | opcintype | opckeytype +--+---+- btree | name_ops | name | cstring gist | point_ops| point | box gist | poly_ops | polygon | box gist | circle_ops | circle| box gin| _int4_ops| integer[] | integer gin| _text_ops| text[]| text gin| _abstime_ops | abstime[] | abstime gin| _bit_ops | bit[] | bit gin| _bool_ops| boolean[] | boolean gin| _bpchar_ops | character[] | character gin| _bytea_ops | bytea[] | bytea gin| _char_ops| char[] | char gin| _cidr_ops| cidr[]| cidr gin| _date_ops| date[]| date gin| _float4_ops | real[]| real gin| _float8_ops | double precision[]| double precision gin| _inet_ops| inet[]| inet gin| _int2_ops| smallint[]| smallint gin| _int8_ops| bigint[] | bigint gin| _interval_ops| interval[]| interval gin| _macaddr_ops | macaddr[] | macaddr gin| _name_ops| name[]| name gin| _numeric_ops | numeric[] | numeric gin| _oid_ops | oid[] | oid gin| _oidvector_ops | oidvector[] | oidvector gin| _time_ops| time without time zone[] | time without time zone gin| _timestamptz_ops | timestamp with time zone[]| timestamp with time zone gin| _timetz_ops | time with time zone[] | time with time zone gin| _varbit_ops | bit varying[] | bit varying gin| _varchar_ops | character varying[] | character varying gin| _timestamp_ops | timestamp without time zone[] | timestamp without time zone gin| _money_ops | money[] | money gin| _reltime_ops | reltime[] | reltime gin| _tinterval_ops | tinterval[] | tinterval gist | tsvector_ops | tsvector | gtsvector gin| tsvector_ops | tsvector | text gist | tsquery_ops | tsquery | bigint (37 rows) Now, of the above the only cases where we'd be likely to be able to do anything very useful with stats on the expression value are the name case, which isn't that exciting in practice, and the tsvector cases. For tsvector it was only with 8.4 that we had non-toy stats code, so while the limitation is ancient it's only recently that it started to be meaningful. I don't think this can be claimed to be a corner case. If you set up an FTS index according to the first alternative offered in http://developer.postgresql.org/pgdocs/postgres/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX you will find that the system fails to collect stats for it and so you get stupid default estimates for your FTS queries. If this were a documented limitation I'd expect to see a big red warning there to *not* do it that
Re: [HACKERS] review: psql: edit function, show function commands patch
2010/8/1 Robert Haas robertmh...@gmail.com: On Sun, Jul 25, 2010 at 11:42 AM, Pavel Stehule pavel.steh...@gmail.com wrote: I'm setting this as ready for committer. Thank you very much I took a look at this tonight and am a bit mystified by the following bit: + /* + * PL doesn't calculate first row of function's body + * when first row is empty. So checks first row, and + * correct lineno when it is necessary. + */ Is that true of any PL, or just some particular PL? Is the behavior described here a bug that we should fix, or is it, for some reason, considered correct? Is it documented in our documentation? it is primary plpgsql issue. The implementation of first_row_is_empty() looks pretty kludgey, too. It seems to me that it will fail completely if the text of the function definition happens to contain $function$. CREATE OR REPLACE FUNCTION boom() RETURNS text LANGUAGE plpgsql AS $$ BEGIN SELECT '$function$'; END $$; I can enhance algorithm on client side - but it will not be a pretty code - it better do it on server side - for example pg_get_formated_functiondef ... CREATE OR REPLACE FUNCTION pg_get_formated_function_def(in oid, linenum bool:= false, OUT funcdef text, OUT first_line_isignored bool) this can remove any ugly code Regards Pavel -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] merge command - GSoC progress
2010/7/31 Greg Smith g...@2ndquadrant.com Boxuan Zhai wrote: I create a clean patch file (no debug clauses). See the attachment. Some general coding feedback for you on this. Thanks for your consideration! It's not obvious to people who might want to try this out what exactly they are supposed to do. Particularly for complicated patches like this, where only a subset of the final feature might actually be implemented, some sort of reviewer guide suggesting what should and shouldn't work would be extremely helpful. I recall there was some sort of patch design guide in an earlier version of this patch; it doesn't seem to be there anymore. Don't remember if that had any examples in it. I am now working on fixing a bug which makes the system unable to initdb. I will update my page in postgres Wiki with a detailed instruction of my implementation and testing examples soon, with my next patch file. Ultimately, the examples of working behavior for this patch will need to be put into code. The way that happens is by adding working examples into the regression tests for the database. If you had those done for this patch, I wouldn't have to ask for code examples; I could just look at the source to the regression output and see how to use it against the standard database the regression samples create, and then execute against. This at least lets you avoid having to generate some test data, because there will already be some in the regression database for you to use. There is an intro this topic at http://wiki.postgresql.org/wiki/Regression_test_authoring Another common way to generate test data is to run pgbench which creates 4 tables and populates them with data. I will try to add my testing examples to the gregression folder. As far as the patch itself goes, you have some work left on cleaning that up still you'll need to do eventually. What I would suggest is actually reading the patch itself; don't just generate it and send it, read through the whole thing like someone new to it would do. One way you can improve what you've done already is to find places where you have introduced changes to the code structure just through editing. Here's an example of what I'm talking about, from line 499 of your patch: -break; +break; This happened because you added two invisible tabs to the end of this line. This makes the patch larger for no good reason and tends to infuriate people who work on the code. There's quite a bit of extra lines added in here too that should go. You should consider reducing the ultimate size of the patch in terms of lines a worthwhile use of your time, even if it doesn't change how things work. There's lots of examples in this one where you put two or three lines between two statements when a single one would match the look of the code in that file. A round of reading the diff looking for that sort of problem would be useful. Another thing you should do is limit how long each line is when practical. You have lots of seriously wide comment lines here right now in particular. While there are some longer lines in the PostgreSQL code right now, that's code, not comments. And when you have a long line and a long comment, don't tack the comment onto the end. Put it on the line above instead. Also, don't use // in the middle of comments the way you've done in a few places here. Sorry for these mistakes, again. I promise that the same thing will not happen in my next patch. Getting some examples sorted out and starting on regression tests is more important than the coding style parts I think, just wanted to pass those along while I noticed them reading the patch, so you could start looking out for them more as you continue to work on it. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us http://www.2ndquadrant.us/