Re: [HACKERS] Query optimization problem
Tom Lane <t...@sss.pgh.pa.us> writes:
> In the example, we do have d1.id and d2.basedon grouped in an
> equivalence class.  So in principle you could substitute d1.id into the
> WHERE clause in place of d2.basedon, once you'd checked that it was
> being used with an operator that's compatible with the specific
> equivalence class (ie it's in one of the eclass's opfamilies, I think).
> The problem is to recognize that such a rewrite would be a win --- it
> could just as easily be a big loss.

Ok, that was my feeling too.

> Even if we understood how to direct the rewriting process, I'm really
> dubious that it would win often enough to justify the added planning
> time.  The particular problem here seems narrow enough that solving it
> on the client side is probably a whole lot easier and cheaper than
> trying to get the planner to do it.

My overly naive, uneducated idea here would be to produce both plans and let the planner evaluate their respective costs. Maybe that's what you mean by "how to direct the rewriting process". Then again, we don't want to generate too many useless plans when there are lots of eclasses around.

This brings back the idea of somehow weighting the optimiser effort spent on solving a query plan. As with gcc, we can pick different effort targets, and we don't know for sure beforehand whether -O3 will produce faster code than -O2; all we know is that it will try harder. Is it possible to imagine a plan_eclass_permutations setting, defaulting to false, that would activate the behavior discussed here? Ok, I'm not sure what form such a setting should take, but clearly there's a need to be able to influence the optimiser's effort.

Regards,
--
dim

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
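[Editor's note] The substitution under discussion can be spelled out against the example query from this thread. The rewritten variant below is hypothetical: it is what an eclass-driven rewrite (the proposed plan_eclass_permutations behavior) might try, not anything the planner currently does.

```sql
-- Original query: the join clause d2.BasedOn = d1.ID puts both
-- columns into one equivalence class.
SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d1.ID = 234409763) OR (d2.ID = 234409763);

-- Hypothetical eclass-driven rewrite: d2.BasedOn substituted for
-- d1.ID in the WHERE clause, making the whole disjunction refer to
-- d2 only.  Whether this is a win or a big loss depends entirely on
-- the data and indexes, which is exactly Tom's point.
SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d2.BasedOn = 234409763) OR (d2.ID = 234409763);
```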
Re: [HACKERS] merge command - GSoC progress
2010/7/28 Robert Haas <robertmh...@gmail.com>:
>> On Tue, Jul 27, 2010 at 1:04 AM, Boxuan Zhai <bxzhai2...@gmail.com> wrote:
>>> I have got an edition of the merge command that can run. It accepts the
>>> standard MERGE command and can do UPDATE, INSERT and DELETE actions now.
>>> But we cannot put additional qualifications on actions yet. There are
>>> some bugs when we try to evaluate the quals which make the system quit.
>>> I will fix them soon.
>
> This patch doesn't compile.  You're using zbxprint() from a bunch of
> places where it's not defined.  I get compile warnings for all of
> those files and then a link failure at the end.  You might find it
> useful to create src/Makefile.custom in your local tree and put
> COPT=-Werror in there; it tends to prevent problems of this kind.
>
> Undefined symbols:
>   _zbxprint, referenced from:
>       _transformStmt in analyze.o
>       _ExecInitMergeAction in nodeModifyTable.o
>       _ExecModifyTable in nodeModifyTable.o
>       _ExecInitModifyTable in nodeModifyTable.o
>       _merge_action_planner in planner.o

Sorry, this is a debug function defined by me. It may not be included in the patch. I added a line of #define zbxprint printf somewhere in the system.

> Not that it's as high-priority as getting this fully working, but you
> should revert the useless changes in this patch - e.g. the one-line
> change to heaptuple.c is obvious debugging leftovers, and all of the
> changes to execQual.c and execUtils.c are whitespace-only.  You should
> also try to make your code and comments conform to project style
> guidelines.  In general, you'll find it easier to keep track of your
> changes (and you'll have fewer spurious changes) if you use git diff
> master...yourbranch instead of marking comments, etc. with ZBX or
> similar.

I will clean all these up in my next patch.

I am now very confused by the failure of action qualification. I looked through the whole process of a query, from parser to executor. In my opinion, the qualification transformed in the analyzer will be processed by preprocess_qual_conditions() in the planner, and then by the ExecInitExpr() function in the executor start phase (in the InitPlan() function). Then the qual is ready to be used in ExecQual(). Am I correct?

I have done these steps on the merge action qual, but when I pass it into ExecQual(), the server just closes abnormally. I don't know if I missed any steps in preparing the qual expressions.

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
Re: [HACKERS] merge command - GSoC progress
On Wed, Jul 28, 2010 at 6:08 AM, Boxuan Zhai <bxzhai2...@gmail.com> wrote:
>>> On Tue, Jul 27, 2010 at 1:04 AM, Boxuan Zhai <bxzhai2...@gmail.com> wrote:
>>>> I have got an edition of the merge command that can run. It accepts
>>>> the standard MERGE command and can do UPDATE, INSERT and DELETE
>>>> actions now. But we cannot put additional qualifications on actions
>>>> yet. There are some bugs when we try to evaluate the quals which make
>>>> the system quit. I will fix them soon.
>>
>> This patch doesn't compile.  You're using zbxprint() from a bunch of
>> places where it's not defined.  I get compile warnings for all of
>> those files and then a link failure at the end.  You might find it
>> useful to create src/Makefile.custom in your local tree and put
>> COPT=-Werror in there; it tends to prevent problems of this kind.
>>
>> Undefined symbols:
>>   _zbxprint, referenced from:
>>       _transformStmt in analyze.o
>>       _ExecInitMergeAction in nodeModifyTable.o
>>       _ExecModifyTable in nodeModifyTable.o
>>       _ExecInitModifyTable in nodeModifyTable.o
>>       _merge_action_planner in planner.o
>
> Sorry, this is a debug function defined by me. It may not be included in
> the patch. I added a line of #define zbxprint printf somewhere in the
> system.

Yeah, but it's not included in all the places that are needed to make everything compile.  You might move this to postgres.h or something.

>> Not that it's as high-priority as getting this fully working, but you
>> should revert the useless changes in this patch - e.g. the one-line
>> change to heaptuple.c is obvious debugging leftovers, and all of the
>> changes to execQual.c and execUtils.c are whitespace-only.  You should
>> also try to make your code and comments conform to project style
>> guidelines.  In general, you'll find it easier to keep track of your
>> changes (and you'll have fewer spurious changes) if you use git diff
>> master...yourbranch instead of marking comments, etc. with ZBX or
>> similar.
>
> I will clean all these up in my next patch.
>
> I am now very confused by the failure of action qualification. I looked
> through the whole process of a query, from parser to executor. In my
> opinion, the qualification transformed in the analyzer will be processed
> by preprocess_qual_conditions() in the planner, and then by the
> ExecInitExpr() function in the executor start phase (in the InitPlan()
> function). Then the qual is ready to be used in ExecQual(). Am I correct?

I'm not sure, sorry.

> I have done these steps on the merge action qual, but when I pass it into
> ExecQual(), the server just closes abnormally. I don't know if I missed
> any steps in preparing the qual expressions.

Have you tried attaching a debugger?  Try SELECT pg_backend_pid() and then use gdb -p <the_pid> from another window.  Hit continue.  Then do whatever it is that's crashing.  That way you can get a stack backtrace, and poke around at the data structures.  Using pprint() on node-type data structures, either in debugging code or actually straight from the debugger via "call", is also very helpful, oftentimes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
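[Editor's note] Robert's debugging recipe, written out as a session sketch. The PID value and the node pointer name are illustrative placeholders, not output from a real session:

```
-- in a psql session connected to the development server:
SELECT pg_backend_pid();        -- suppose it returns 12345

# in another terminal:
$ gdb -p 12345
(gdb) continue

-- back in psql, run the MERGE statement that crashes;
-- gdb traps the fault in the attached backend:
(gdb) backtrace
(gdb) print *some_node_ptr           -- poke around at the data structures
(gdb) call pprint(some_node_ptr)     -- pretty-print a Node tree
```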
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Andrew Dunstan wrote:
> The real problem has nothing to do with any of the analysis, as you say.
> It is this: they have an override file for PGXS and it uses
> $(mkinstalldirs), which we got rid of about a year ago. So apparently
> they haven't been testing much against any of our alphas or betas, or
> they would have seen this long ago.
>
> The correct fix is to do the following in the PostGIS source root:
>
>     sed -i -e 's/mkinstalldirs/MKDIR_P/' postgis/Makefile.pgxs
>
> cheers
>
> andrew

Hmmm, that's totally wrong - the override in Makefile.pgxs should only ever be loaded for PostgreSQL 8.3 and 8.4, and not PostgreSQL 9.0, since it already has the correct installation paths. What I suspect is that you're actually getting bitten by this:

http://markmail.org/message/k7iolbazhrqhijfk#query:pg_config%20jun%202007+page:1+mid:rqk6ux2e7npqbrzf+state:results

Or, in other words, configure is picking up the wrong pg_config. Since the path fix in that thread was not backported to 8.3, the presence of another pg_config for PostgreSQL 8.3 in the PATH will break things :(

ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs
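[Editor's note] Andrew's one-liner is a plain textual substitution, so it is easy to sanity-check in isolation. The Makefile fragment below is a made-up stand-in for the real postgis/Makefile.pgxs (note that `sed -i` as written assumes GNU sed):

```shell
# Create a stand-in Makefile fragment that uses the removed macro.
printf 'installdirs:\n\t$(mkinstalldirs) "$(DESTDIR)$(bindir)"\n' > Makefile.pgxs

# The proposed fix: swap the obsolete macro for the current one.
sed -i -e 's/mkinstalldirs/MKDIR_P/' Makefile.pgxs

cat Makefile.pgxs    # the rule now invokes $(MKDIR_P)
```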
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 12:23 AM, Jeff Davis <pg...@j-davis.com> wrote:
> On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote:
>> On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote:
>>>> My first concern with that idea was that it may create an
>>>> inconsistency between the primary and the standby. The primary could
>>>> have a bunch of zero pages that never make it to the standby.
>>>
>>> Maybe I'm slow on the uptake here, but don't the pages start out
>>> all-zeroes on the standby just as they do on the primary?  The only
>>> way it seems like this would be a problem is if a page that
>>> previously contained data on the primary was subsequently zeroed
>>> without writing a WAL record - or am I confused?
>>
>> The case I was concerned about is when you have a table on the primary
>> with a bunch of zero pages at the end. Then you SET TABLESPACE, and
>> none of the copied pages (or even the fact that they exist) would be
>> sent to the standby, but they would exist on the primary. And later
>> pages may have data, so the standby may see page N but not N-1.
>>
>> Generally, most of the code is not expecting to read or write past the
>> end of the file, unless it's doing an extension. However, I think
>> everything is fine during recovery, because it looks like it's
>> designed to create zero pages as needed. So your idea seems safe to
>> me, although I do still have some doubts because of my lack of
>> knowledge in this area; particularly hot standby conflict
>> detection/resolution.
>
> My idea was different: still log the zero page, just don't set LSN or
> TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't
> as clean as your idea, but I'm a little more confident that it is
> correct.
>
> Both potential fixes attached, and both appear to work.
>
> fix1 -- Only call PageSetLSN/TLI inside log_newpage() and
> heap_xlog_newpage() if the page is not zeroed.
>
> fix2 -- Don't call log_newpage() at all if the page is zeroed.
>
> Please review. I don't have a strong opinion about which one should be
> applied.

Looks good.  I still prefer fix2; it seems cleaner to me.  It has another advantage, too, which is that copy_relation_data() is used ONLY by ALTER TABLE ... SET TABLESPACE.  So if I stick to patching that function, that's the only thing I can possibly break, whereas log_newpage() is used in a bunch of other places.  I don't think either fix is going to break anything at all, but considering that this is going to need back-patching, I'd rather be conservative.

Speaking of back-patching, the subject line describes this as an 8.3+ problem, but it looks to me like this goes all the way back to 8.0.  The code is a little different in 8.2 and prior, but ISTM it's vulnerable to the same issue.  Don't we need a modified version of this patch for the 8.0 - 8.2 branches?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
On Wed, Jul 28, 2010 at 3:45 AM, Dimitri Fontaine <dfonta...@hi-media.com> wrote:
>> Even if we understood how to direct the rewriting process, I'm really
>> dubious that it would win often enough to justify the added planning
>> time.  The particular problem here seems narrow enough that solving it
>> on the client side is probably a whole lot easier and cheaper than
>> trying to get the planner to do it.
>
> My overly naive, uneducated idea here would be to produce both plans and
> let the planner evaluate their respective costs. Maybe that's what you
> mean by "how to direct the rewriting process". Then again, we don't want
> to generate too many useless plans when there are lots of eclasses
> around.

The way the planner is set up, you'd have to plan with qual A, then repeat the entire process with qual B, and then just for good measure repeat the process with both quals A and B.  ISTM you'd triple the planning time if there were even just one case of this in a particular query.  If you have different ways of generating the same output for a given rel, you can just throw them all into a bucket and let the planner work it out.  But here you want to have different paths for the same relation that generate *different output*, and the planner doesn't understand that concept.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
Robert Haas <robertmh...@gmail.com> writes:
> But here you want to have different paths for the same relation that
> generate *different output*, and the planner doesn't understand that
> concept.

Sorry? I thought what the Equivalence Class machinery provides is proof that using this qualification or the other will *not* affect the output.

--
dim
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, 2010-07-27 at 21:23 -0700, Jeff Davis wrote:
> Both potential fixes attached, and both appear to work.
>
> fix1 -- Only call PageSetLSN/TLI inside log_newpage() and
> heap_xlog_newpage() if the page is not zeroed.
>
> fix2 -- Don't call log_newpage() at all if the page is zeroed.
>
> Please review. I don't have a strong opinion about which one should be
> applied.

ISTM we should just fix an uninitialized page first, using code from VACUUM similar to

	if (PageIsNew(page))
	{
		ereport(WARNING,
				(errmsg("relation \"%s\" page %u is uninitialized --- fixing",
						relname, blkno)));
		PageInit(page, BufferGetPageSize(buf), 0);
	}

then continue as before.

We definitely shouldn't do anything that leaves the standby different to the primary.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Mark Cave-Ayland wrote:
> Andrew Dunstan wrote:
>> The real problem has nothing to do with any of the analysis, as you
>> say. It is this: they have an override file for PGXS and it uses
>> $(mkinstalldirs), which we got rid of about a year ago. So apparently
>> they haven't been testing much against any of our alphas or betas, or
>> they would have seen this long ago.
>>
>> The correct fix is to do the following in the PostGIS source root:
>>
>>     sed -i -e 's/mkinstalldirs/MKDIR_P/' postgis/Makefile.pgxs
>
> Hmmm, that's totally wrong - the override in Makefile.pgxs should only
> ever be loaded for PostgreSQL 8.3 and 8.4, and not PostgreSQL 9.0,
> since it already has the correct installation paths. What I suspect is
> that you're actually getting bitten by this:
>
> http://markmail.org/message/k7iolbazhrqhijfk#query:pg_config%20jun%202007+page:1+mid:rqk6ux2e7npqbrzf+state:results
>
> Or, in other words, configure is picking up the wrong pg_config.

No, the configure test is wrong. Here's what's in configure.ac:

	dnl Temporary hack until minimum PostgreSQL version is 8.5:
	dnl If PostgreSQL 8.5 is detected, trigger the inclusion of the new versioned PGXS targets
	PGXSOVERRIDE=0
	if test ! $PGSQL_MINOR_VERSION -ge 5; then
		PGXSOVERRIDE=1
	fi

Of course, we don't have any such thing as 8.5.

cheers

andrew
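[Editor's note] The failure mode is easy to reproduce outside of autoconf. The helper function below wraps the configure.ac logic verbatim; the function name is made up, and the version numbers assume the minor digit is extracted the way PostGIS's configure does it:

```shell
check_override() {
    PGSQL_MINOR_VERSION=$1
    PGXSOVERRIDE=0
    # Verbatim test from configure.ac: meant to say "pre-8.5 needs the
    # override", but it only ever looks at the minor version digit.
    if test ! "$PGSQL_MINOR_VERSION" -ge 5; then
        PGXSOVERRIDE=1
    fi
    echo "$PGXSOVERRIDE"
}

check_override 4   # 8.4 -> prints 1: override enabled, as intended
check_override 0   # 9.0 -> minor version is 0, so this also prints 1:
                   #        the stale Makefile.pgxs gets used against 9.0
```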
Re: [HACKERS] Query optimization problem
On Wed, Jul 28, 2010 at 6:55 AM, Dimitri Fontaine <dfonta...@hi-media.com> wrote:
>> But here you want to have different paths for the same relation that
>> generate *different output*, and the planner doesn't understand that
>> concept.
>
> Sorry? I thought what the Equivalence Class machinery provides is proof
> that using this qualification or the other will *not* affect the output.

In a query like...

SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d1.ID = 234409763) OR (d2.ID = 234409763)

...you're going to scan d1, scan d2, and then join the results.  The scan of d1 is going to produce different results depending on whether or not you evaluate d1.ID=234409763, and the scan of d2 is going to produce different results depending on whether or not you evaluate d2.BasedOn=234409763.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
Robert Haas wrote:
> On Wed, Jul 28, 2010 at 6:55 AM, Dimitri Fontaine <dfonta...@hi-media.com> wrote:
>> Sorry? I thought what the Equivalence Class machinery provides is proof
>> that using this qualification or the other will *not* affect the output.
>
> In a query like...
>
> SELECT d1.ID, d2.ID
> FROM DocPrimary d1
>   JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
> WHERE (d1.ID = 234409763) OR (d2.ID = 234409763)
>
> ...you're going to scan d1, scan d2, and then join the results.  The
> scan of d1 is going to produce different results depending on whether
> or not you evaluate d1.ID=234409763, and the scan of d2 is going to
> produce different results depending on whether or not you evaluate
> d2.BasedOn=234409763.

Wouldn't it be relatively easy to rewrite the filter expression by adding expressions, instead of replacing constants, in the disjunctive case? The example at hand would then become:

WHERE (d1.ID=234409763) or (d2.ID=234409763)
  AND (d2.BasedOnID=234409763) or (d2.ID=234409763)

regards,
Yeb Havinga
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Andrew Dunstan wrote:
> No, the configure test is wrong. Here's what's in configure.ac:
>
> 	dnl Temporary hack until minimum PostgreSQL version is 8.5:
> 	dnl If PostgreSQL 8.5 is detected, trigger the inclusion of the new versioned PGXS targets
> 	PGXSOVERRIDE=0
> 	if test ! $PGSQL_MINOR_VERSION -ge 5; then
> 		PGXSOVERRIDE=1
> 	fi
>
> Of course, we don't have any such thing as 8.5.

Ah wait - I see. The fix is already in the 1.5 branch, it just hasn't hit a release yet :(

http://trac.osgeo.org/postgis/changeset/5421/branches/1.5/configure.ac

Looks like we need to push a new release in time for 9.0...

ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 7:02 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Tue, 2010-07-27 at 21:23 -0700, Jeff Davis wrote:
>> Both potential fixes attached, and both appear to work.
>>
>> fix1 -- Only call PageSetLSN/TLI inside log_newpage() and
>> heap_xlog_newpage() if the page is not zeroed.
>>
>> fix2 -- Don't call log_newpage() at all if the page is zeroed.
>>
>> Please review. I don't have a strong opinion about which one should be
>> applied.
>
> ISTM we should just fix an uninitialized page first, using code from
> VACUUM similar to
>
> 	if (PageIsNew(page))
> 	{
> 		ereport(WARNING,
> 				(errmsg("relation \"%s\" page %u is uninitialized --- fixing",
> 						relname, blkno)));
> 		PageInit(page, BufferGetPageSize(buf), 0);
> 	}
>
> then continue as before.

As Jeff Davis pointed out upthread, you don't know that 0 is the correct special size for the relation being copied.  In the VACUUM path, that code is only applied to heaps, but that's not necessarily the case here.

> We definitely shouldn't do anything that leaves the standby different
> to the primary.

Obviously.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
On Wed, Jul 28, 2010 at 7:24 AM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>>> Sorry? I thought what the Equivalence Class machinery provides is
>>> proof that using this qualification or the other will *not* affect
>>> the output.
>>
>> In a query like...
>>
>> SELECT d1.ID, d2.ID
>> FROM DocPrimary d1
>>   JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
>> WHERE (d1.ID = 234409763) OR (d2.ID = 234409763)
>>
>> ...you're going to scan d1, scan d2, and then join the results.  The
>> scan of d1 is going to produce different results depending on whether
>> or not you evaluate d1.ID=234409763, and the scan of d2 is going to
>> produce different results depending on whether or not you evaluate
>> d2.BasedOn=234409763.
>
> Wouldn't it be relatively easy to rewrite the filter expression by
> adding expressions, instead of replacing constants, in the disjunctive
> case? The example at hand would then become:
>
> WHERE (d1.ID=234409763) or (d2.ID=234409763)
>   AND (d2.BasedOnID=234409763) or (d2.ID=234409763)

Yeah, that could be done, but it's not necessarily a win from a performance standpoint.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Query optimization problem
Robert Haas <robertmh...@gmail.com> writes:
> In a query like...
>
> SELECT d1.ID, d2.ID
> FROM DocPrimary d1
>   JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
> WHERE (d1.ID = 234409763) OR (d2.ID = 234409763)
>
> ...you're going to scan d1, scan d2, and then join the results.  The
> scan of d1 is going to produce different results depending on whether
> or not you evaluate d1.ID=234409763, and the scan of d2 is going to
> produce different results depending on whether or not you evaluate
> d2.BasedOn=234409763.

Well, I just realised you can't use d2.BasedOn in scanning d1 here. I don't know what exactly I had in mind previously, but in any case, sorry for the noise.

I hope the optimiser-effort-control idea still holds water nonetheless…

--
dim
[HACKERS] patch saved in commitfest application isn't actual now
Hello,

Could you send a current version, please? I looked at the git repository, but you have made more changes since.

Thank you

Pavel Stehule
Re: [HACKERS] Query optimization problem
Robert Haas wrote:
> On Wed, Jul 28, 2010 at 7:24 AM, Yeb Havinga <yebhavi...@gmail.com> wrote:
>> Wouldn't it be relatively easy to rewrite the filter expression by
>> adding expressions, instead of replacing constants, in the disjunctive
>> case? The example at hand would then become:
>>
>> WHERE (d1.ID=234409763) or (d2.ID=234409763)
>>   AND (d2.BasedOnID=234409763) or (d2.ID=234409763)
>
> Yeah, that could be done, but it's not necessarily a win from a
> performance standpoint.

Not necessarily a win, but on the test case there is no significant increase in planning time. It somehow feels like a good idea to give the planner as much information as possible, i.e. as many baserestrictinfos as possible for each rel. I earlier forgot parentheses; the correct query is

SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE ((d1.ID=234409763) or (d2.ID=234409763))
  AND ((d2.BasedOn=234409763) or (d2.ID=234409763));

By doing this in the rewrite step, triple planning would be avoided. I suspect that the cost of a copyObject of the expression plus an expression tree mutator call during rewrite is negligible compared to plan time. In this particular case there doesn't seem to be much difference in planning time between the three variants anyway.

I ran the script below a number of times; the third time in each group is the one with the expanded expression:

Time: 0.820 ms
Time: 0.859 ms
Time: 0.877 ms
---
Time: 0.617 ms
Time: 0.662 ms
Time: 0.737 ms
---
Time: 0.817 ms
Time: 0.766 ms
Time: 0.826 ms
---
Time: 0.638 ms
Time: 0.700 ms
Time: 0.706 ms
---
Time: 0.463 ms
Time: 0.847 ms
Time: 0.793 ms
---
Time: 0.629 ms
Time: 0.671 ms
Time: 0.703 ms

This was the script (on the relation and index supplied by the OP):

-- warm catalog
explain SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d1.ID=234409763) or (d2.ID=234409763);

\timing

explain SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d1.ID=234409763) or (d2.ID=234409763);

explain SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE (d2.BasedOn=234409763) or (d2.ID=234409763);

explain SELECT d1.ID, d2.ID
FROM DocPrimary d1
  JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE ((d1.ID=234409763) or (d2.ID=234409763))
  AND ((d2.BasedOn=234409763) or (d2.ID=234409763));
Re: [HACKERS] ALTER TABLE SET STATISTICS requires AccessExclusiveLock
On tor, 2010-07-15 at 10:24 +0100, Simon Riggs wrote:
> Patch to reduce lock levels for ALTER TABLE, CREATE TRIGGER, CREATE RULE

Tried this out, but $subject is still the case. The problem is that ATRewriteCatalogs() calls AlterTableCreateToastTable() based on what it thinks the subcommands are, and AlterTableCreateToastTable() takes an AccessExclusiveLock. This could possibly be addressed by moving AT_SetStatistics to AT_PASS_MISC in order to avoid the TOAST table call.

In a related matter, assigning ShareUpdateExclusiveLock to AT_SetStorage doesn't work either, because the TOAST table call needs to be done in that case.

Perhaps this logic needs to be refactored a bit more so that there aren't any inconsistencies between the lock modes and the passes, which could lead to false expectations and deadlocks.
Re: [HACKERS] ALTER TABLE SET STATISTICS requires AccessExclusiveLock
On Wed, 2010-07-28 at 15:24 +0300, Peter Eisentraut wrote:
> Tried this out, but $subject is still the case. The problem is that
> ATRewriteCatalogs() calls AlterTableCreateToastTable() based on what it
> thinks the subcommands are, and AlterTableCreateToastTable() takes an
> AccessExclusiveLock. This could possibly be addressed by moving
> AT_SetStatistics to AT_PASS_MISC in order to avoid the TOAST table
> call.
>
> In a related matter, assigning ShareUpdateExclusiveLock to
> AT_SetStorage doesn't work either, because the TOAST table call needs
> to be done in that case.
>
> Perhaps this logic needs to be refactored a bit more so that there
> aren't any inconsistencies between the lock modes and the passes, which
> could lead to false expectations and deadlocks.

Thanks for your comments. Will reconsider and update.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
Re: [HACKERS] Query optimization problem
Yeb Havinga <yebhavi...@gmail.com> writes:
> Robert Haas wrote:
>>> Wouldn't it be relatively easy to rewrite the filter expression by
>>> adding expressions, instead of replacing constants, in the
>>> disjunctive case? The example at hand would then become:
>>>
>>> WHERE (d1.ID=234409763) or (d2.ID=234409763)
>>>   AND (d2.BasedOnID=234409763) or (d2.ID=234409763)
>>
>> Yeah, that could be done, but it's not necessarily a win from a
>> performance standpoint.
>
> Not necessarily a win, but on the test case no significant increase in
> planning time.

The problem is that it could cost you a lot in execution time, because of the useless extra filter conditions that will be applied.  The planner isn't going to notice that those conditions are redundant.  An even worse problem is that because it doesn't know that, it's going to underestimate the combined selectivity of the two WHERE conditions, resulting in drastic underestimates of the numbers of rows emitted, possibly resulting in horribly bad plan choices that kill whatever performance improvement you got at the bottom level.

What the EquivalenceClass machinery actually buys us is the ability to deal with a set of partially-redundant possible filter conditions and apply only enough of them to get a correct plan.  As an example, if the query has A=B and B=C, we could deduce A=C, but we don't want to apply all three equality conditions at runtime.  Instead we put all three variables into an EC, and then there is logic to determine which of the equality clauses implied by the EC should actually get applied where.  This avoids both the useless-checks-at-runtime problem and the problem of wrong selectivity estimates.

To do something like this without generating stupid plans, we'd need some sort of generalized EC mechanism that could figure out which variants of the clause made the most sense in different contexts.  Or maybe something else entirely --- but just generating a lot of variants of a clause and throwing them all into the existing mechanism is not workable.

			regards, tom lane
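[Editor's note] Tom's A=B, B=C example can be made concrete with a toy union-find structure: merging the equated variables yields one class, from which A=C follows without that clause ever being stored, and a spanning subset of the implied equalities (two clauses for three members) is enough to enforce the class. Nothing below is PostgreSQL code; the real machinery lives in src/backend/optimizer/path/equivclass.c.

```c
/* Toy equivalence classes over variables 0..N-1, via union-find. */
enum { N = 8 };
static int parent[N];

static void ec_init(void)
{
    for (int i = 0; i < N; i++)
        parent[i] = i;          /* every variable starts in its own class */
}

static int ec_find(int x)
{
    while (parent[x] != x)
        x = parent[x] = parent[parent[x]];  /* path halving */
    return x;
}

/* Record an equality clause such as A=B by merging the two classes. */
static void ec_merge(int a, int b)
{
    parent[ec_find(a)] = ec_find(b);
}

/* Is "a = b" provable from the clauses merged so far? */
static int ec_same(int a, int b)
{
    return ec_find(a) == ec_find(b);
}
```

With A=0, B=1, C=2: after ec_merge(0, 1) and ec_merge(1, 2), ec_same(0, 2) holds even though A=C was never asserted, and only the two merged clauses need checking at runtime, which is the redundancy avoidance Tom describes.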
Re: [HACKERS] Toward a column reorder solution
On Tue, 27 Jul 2010 19:55:18 -0700, "David E. Wheeler" <da...@kineticode.com> wrote:
> On Jul 27, 2010, at 3:01 PM, Joshua D. Drake wrote:
>> Correct. We are also hoping to get some sponsorship for it.
>> https://www.fossexperts.com/
>
> Frigging copycat.

Hah! I gave you kudos :P (you are in the FAQ)

JD

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
Re: [HACKERS] ALTER TABLE SET STATISTICS requires AccessExclusiveLock
On Wed, 2010-07-28 at 15:24 +0300, Peter Eisentraut wrote: On tor, 2010-07-15 at 10:24 +0100, Simon Riggs wrote: Patch to reduce lock levels for ALTER TABLE CREATE TRIGGER CREATE RULE

Tried this out, but $subject is still the case. The problem is that ATRewriteCatalogs() calls AlterTableCreateToastTable() based on what it thinks the subcommands are, and AlterTableCreateToastTable() takes an AccessExclusiveLock. This could possibly be addressed by moving AT_SetStatistics to AT_PASS_MISC in order to avoid the TOAST table call. In a related matter, assigning ShareUpdateExclusiveLock to AT_SetStorage doesn't work either, because the TOAST table call needs to be done in that case. Perhaps this logic needs to be refactored a bit more so that there aren't any inconsistencies between the lock modes and the passes, which could lead to false expectations and deadlocks.

Changes as suggested, plus tests to confirm lock levels for ShareUpdateExclusiveLock changes. Will commit soon, if no objection.

-- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services

diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index 516dbd2..77a3c57 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -376,7 +376,7 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose,
 	/* Check heap and index are valid to cluster on */
 	if (OidIsValid(indexOid))
-		check_index_is_clusterable(OldHeap, indexOid, recheck);
+		check_index_is_clusterable(OldHeap, indexOid, recheck, AccessExclusiveLock);
 
 	/* Log what we're doing (this could use more effort) */
 	if (OidIsValid(indexOid))
@@ -405,11 +405,11 @@ cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose,
  * definition can't change under us.
  */
 void
-check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck)
+check_index_is_clusterable(Relation OldHeap, Oid indexOid, bool recheck, LOCKMODE lockmode)
 {
 	Relation	OldIndex;
 
-	OldIndex = index_open(indexOid, AccessExclusiveLock);
+	OldIndex = index_open(indexOid, lockmode);
 
 	/*
 	 * Check that index is in fact an index on the given relation
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 9a7cd96..defb2e4 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2550,6 +2550,8 @@ AlterTableGetLockLevel(List *cmds)
 			case AT_DropCluster:
 			case AT_SetRelOptions:
 			case AT_ResetRelOptions:
+			case AT_SetOptions:
+			case AT_ResetOptions:
 			case AT_SetStorage:
 				cmd_lockmode = ShareUpdateExclusiveLock;
 				break;
@@ -2669,19 +2671,19 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
 			/* Performs own permission checks */
 			ATPrepSetStatistics(rel, cmd->name, cmd->def, lockmode);
-			pass = AT_PASS_COL_ATTRS;
+			pass = AT_PASS_MISC;
 			break;
 		case AT_SetOptions:		/* ALTER COLUMN SET ( options ) */
 		case AT_ResetOptions:	/* ALTER COLUMN RESET ( options ) */
 			ATSimplePermissionsRelationOrIndex(rel);
 			/* This command never recurses */
-			pass = AT_PASS_COL_ATTRS;
+			pass = AT_PASS_MISC;
 			break;
 		case AT_SetStorage:		/* ALTER COLUMN SET STORAGE */
 			ATSimplePermissions(rel, false);
 			ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode);
 			/* No command-specific prep needed */
-			pass = AT_PASS_COL_ATTRS;
+			pass = AT_PASS_MISC;
 			break;
 		case AT_DropColumn:		/* DROP COLUMN */
 			ATSimplePermissions(rel, false);
@@ -6906,7 +6908,7 @@ ATExecClusterOn(Relation rel, const char *indexName, LOCKMODE lockmode)
 						indexName, RelationGetRelationName(rel))));
 
 	/* Check index is valid to cluster on */
-	check_index_is_clusterable(rel, indexOid, false);
+	check_index_is_clusterable(rel, indexOid, false, lockmode);
 
 	/* And do the work */
 	mark_index_clustered(rel, indexOid);
diff --git a/src/include/commands/cluster.h b/src/include/commands/cluster.h
index 4f67930..74d4bd1 100644
--- a/src/include/commands/cluster.h
+++ b/src/include/commands/cluster.h
@@ -14,6 +14,7 @@
 #define CLUSTER_H
 
 #include "nodes/parsenodes.h"
+#include "storage/lock.h"
 #include "utils/relcache.h"
 
@@ -21,7 +22,7 @@ extern void cluster(ClusterStmt *stmt, bool isTopLevel);
 extern void cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose,
 			int freeze_min_age, int freeze_table_age);
 extern void check_index_is_clusterable(Relation OldHeap, Oid indexOid,
-					   bool recheck);
+					   bool recheck, LOCKMODE lockmode);
 extern void mark_index_clustered(Relation rel, Oid indexOid);
 extern Oid	make_new_heap(Oid OIDOldHeap, Oid NewTableSpace);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index 5aff44f..83e24fd 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -1473,6 +1473,127 @@
Re: [HACKERS] ERROR: argument to pg_get_expr() must come from system catalogs
On Wed, Jul 28, 2010 at 4:54 PM, Bruce Momjian br...@momjian.us wrote: Are we basically leaving pgAdmin in this state until we come up with a fix and need a new minor release? We pride ourselves in not introducing breakage in minor releases, but it has certainly happened in this case, and it is making pgAdmin look bad. Dave, is there a hack you can add to pgAdmin to work around the join issue until we can fix the backend?

It wouldn't make much difference if there was - the majority of people won't get it until they upgrade their server anyway.

-- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, 2010-07-28 at 06:40 -0400, Robert Haas wrote: fix1 -- Only call PageSetLSN/TLI inside log_newpage() and heap_xlog_newpage() if the page is not zeroed. fix2 -- Don't call log_newpage() at all if the page is zeroed. Please review. I don't have a strong opinion about which one should be applied.

Looks good. I still prefer fix2; it seems cleaner to me.

It has another advantage, too, which is that copy_relation_data() is used ONLY by ALTER TABLE SET TABLESPACE. So if I stick to patching that function, that's the only thing I can possibly break, whereas log_newpage() is used in a bunch of other places. I don't think either fix is going to break anything at all, but considering that this is going to need back-patching, I'd rather be conservative.

Sounds good to me. However, when Simon said We definitely shouldn't do anything that leaves standby different to primary. you said obviously. Fix2 can leave a difference between the two, because zeroed pages at the end of the heap file on the primary will not be sent to the standby (the standby will only create the zeroed pages if a higher block number is sent; which won't be the case if the zeroed pages are at the end). As we discussed before, that looks inconsequential, but I just want to make sure that it's understood.

Speaking of back-patching, the subject line describes this as an 8.3+ problem, but it looks to me like this goes all the way back to 8.0. The code is a little different in 8.2 and prior, but ISTM it's vulnerable to the same issue. Don't we need a modified version of this patch for the 8.0 - 8.2 branches?

That sounds right. I just saw that the code in question was introduced in 8.3.

Regards, Jeff Davis
Re: [HACKERS] Toward a column reorder solution
On Jul 28, 2010, at 7:57 AM, Joshua D. Drake wrote: Hah! I gave you kudos :P (you are in the FAQ) Ah, thanks. The link is missing a G: It's PGXN, not PXN. Best, David
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 25, 2010, at 8:01 AM, Kris Jurka wrote: The JDBC driver reads server messages for multiple reasons. One of them is indeed to do early failure detection. That's high quality. =) Another is to pickup NoticeResponse messages to avoid a network buffer deadlock. That's a good catch. I don't think psql/restore would often run into this as when COPY IN is in play, it's normally restoring a database. However, with JDBC, I imagine COPY would more often be used to do bulk loading into live tables that may very well cause a NOTICE. [Well, I reference psql/libpq because I don't recall it recognizing failure during COPY IN in the past, so I assume it's not receiving any data in that state.] hrm, I suppose a lazy way around that problem would be to suspend all client messages(client_min_messages) during COPY IN. Tho, I guess one would still have to contend with NotificationResponse, and ParameterStatus.. So this is possible to work around driver side by peeking into the network stream and delaying processing of the end of copy until the driver agrees that the copy is done, but I don't think you would have to peek in. If the interface were to always hold onto the last message or last n-bytes submitted to be sent, it would be able to send the possible CopyData(EOF) and CopyDone once the COPY operation (at the interface level) is closed/shutdown/terminated. Granted, this is dependent on CopyData(EOF) not being in the middle of regular CopyData, but I gather that that would end in an ErrorResponse anyways. I still maintain that this is a server bug. It is not OK for the server to assume that the client is done and move on, the client must tell the server what it wants done. I'm a bit torn here. While it would seem to be either a bug in the spec or a bug in the server, I'm inclined to call it a wart in the server's implementation of the spec. I don't see the fix as being dangerous, but I imagine an implementor would want to have the workaround in place regardless. 
I certainly would. I'd be in favor of seeing this fixed in 9.x, and the documentation updated to warn implementors about the wart in the older versions. That is, I don't see any reason why we can't get rid of this unsightly thing considering the workarounds would still work with a wart-free server. cheers, jwp
Re: [HACKERS] Toward a column reorder solution
On Wed, 2010-07-28 at 09:30 -0700, David E. Wheeler wrote: On Jul 28, 2010, at 7:57 AM, Joshua D. Drake wrote: Hah! I gave you kudos :P (you are in the FAQ) Ah, thanks. The link is missing a G: It's PGXN, not PXN. Yeah that is already fixed, just waiting for cache to clear (on the server). Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
Jeff Davis pg...@j-davis.com writes: However, when Simon said We definitely shouldn't do anything that leaves standby different to primary. you said obviously. Fix2 can leave a difference between the two, because zeroed pages at the end of the heap file on the primary will not be sent to the standby (the standby will only create the zeroed pages if a higher block number is sent; which won't be the case if the zeroed pages are at the end). As we discussed before, that looks inconsequential, but I just want to make sure that it's understood.

I understand it, and I don't like it one bit. I haven't caught up on this thread yet, but I think the only acceptable solution is one that leaves the slave in the *same* state as the master. Not a state that we hope will behave equivalently. I can think of too many corner cases where it might not. (In fact, having a zeroed page in a relation is already a corner case in itself, so the amount of testing you'd get for such behaviors is epsilon squared. You don't want to take that bet.)

regards, tom lane
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Wed, 28 Jul 2010, James William Pye wrote: hrm, I suppose a lazy way around that problem would be to suspend all client messages(client_min_messages) during COPY IN. Tho, I guess one would still have to contend with NotificationResponse, and ParameterStatus..

Technically you won't get NotificationResponse until transaction end, so you don't need to worry about that mid copy.

I don't think you would have to peek in. If the interface were to always hold onto the last message or last n-bytes submitted to be sent, it would be able to send the possible CopyData(EOF) and CopyDone once the COPY operation (at the interface level) is closed/shutdown/terminated. Granted, this is dependent on CopyData(EOF) not being in the middle of regular CopyData, but I gather that that would end in an ErrorResponse anyways.

One of the key points of confusion is that CopyData(EOF) does not result in an error. It results in ignoring any further data. The problem I have is that for text mode it waits for CopyDone, but in binary mode it ends the copy sequence immediately. Additionally the interface exposed by the JDBC driver lets the user write arbitrary CopyData bytes to the server, so without parsing all of that we don't know whether they've issued CopyData(EOF) or not.

Kris Jurka
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, 2010-07-28 at 12:36 -0400, Tom Lane wrote: Jeff Davis pg...@j-davis.com writes: However, when Simon said We definitely shouldn't do anything that leaves standby different to primary. you said obviously. Fix2 can leave a difference between the two, because zeroed pages at the end of the heap file on the primary will not be sent to the standby (the standby will only create the zeroed pages if a higher block number is sent; which won't be the case if the zeroed pages are at the end). As we discussed before, that looks inconsequential, but I just want to make sure that it's understood. I understand it, and I don't like it one bit. I haven't caught up on this thread yet, but I think the only acceptable solution is one that leaves the slave in the *same* state as the master. Not a state that we hope will behave equivalently. I can think of too many corner cases where it might not. (In fact, having a zeroed page in a relation is already a corner case in itself, so the amount of testing you'd get for such behaviors is epsilon squared. You don't want to take that bet.)

Ok, sounds like my original fix (fix1) is the way to go then. Log zero pages, but don't set LSN/TLI if it's a zero page (in log_newpage or heap_xlog_newpage).

Regards, Jeff Davis
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 12:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: Jeff Davis pg...@j-davis.com writes: However, when Simon said We definitely shouldn't do anything that leaves standby different to primary. you said obviously. Fix2 can leave a difference between the two, because zeroed pages at the end of the heap file on the primary will not be sent to the standby (the standby will only create the zeroed pages if a higher block number is sent; which won't be the case if the zeroed pages are at the end). As we discussed before, that looks inconsequential, but I just want to make sure that it's understood. I understand it, and I don't like it one bit. I haven't caught up on this thread yet, but I think the only acceptable solution is one that leaves the slave in the *same* state as the master. Not a state that we hope will behave equivalently. I can think of too many corner cases where it might not. (In fact, having a zeroed page in a relation is already a corner case in itself, so the amount of testing you'd get for such behaviors is epsilon squared. You don't want to take that bet.) I might be missing something here, but I don't see how you're going to manage that. In Jeff's original example, he crashes the database after extending the relation but before initializing and writing the new page. I believe that at that point no XLOG has been written yet, so the relation has been extended but there is no WAL to be sent to the standby. So now you have the exact situation you're concerned about - the relation has been extended on the master but not on the standby. As far as I can see, this is an unavoidable consequence of the fact that we don't XLOG the act of extending the relation. Worrying about it only in the specific context of ALTER TABLE .. SET TABLESPACE seems backwards; if there are any bugs there, we're in for it. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, 2010-07-28 at 13:18 -0400, Robert Haas wrote: In Jeff's original example, he crashes the database after extending the relation but before initializing and writing the new page. I believe that at that point no XLOG has been written yet, so the relation has been extended but there is no WAL to be sent to the standby. So now you have the exact situation you're concerned about - the relation has been extended on the master but not on the standby. As far as I can see, this is an unavoidable consequence of the fact that we don't XLOG the act of extending the relation. Worrying about it only in the specific context of ALTER TABLE .. SET TABLESPACE seems backwards; if there are any bugs there, we're in for it.

That's a very good point. Now I'm leaning more toward your fix.

Regards, Jeff Davis
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
Robert Haas robertmh...@gmail.com writes: On Wed, Jul 28, 2010 at 12:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I understand it, and I don't like it one bit. I haven't caught up on this thread yet, but I think the only acceptable solution is one that leaves the slave in the *same* state as the master. I might be missing something here, but I don't see how you're going to manage that. In Jeff's original example, he crashes the database after extending the relation but before initializing and writing the new page. I believe that at that point no XLOG has been written yet, so the relation has been extended but there is no WAL to be sent to the standby. So now you have the exact situation you're concerned about - the relation has been extended on the master but not on the standby. You're right that we cannot prevent that situation --- or at least, the cure would be worse than the disease. (The cure would be to XLOG the extension action, obviously, but then out-of-disk-space has to be a PANIC condition.) However, it doesn't follow that it's a good idea to make copy_relation_data *intentionally* make the slave and master different. I've caught up on the thread now, and I think that fix2 (skip logging the page) is extremely dangerous and has little if anything in its favor. fix1 seems reasonable given the structure of the page validity checks. However, what about Jeff's original comment : On second thought, why are PageSetLSN and PageSetTLI being called from : log_newpage(), anyway? I think it is appropriate to be setting the LSN/TLI in the case of a page that's been constructed by the caller as part of the WAL-logged action, but doing so in copy_relation_data seems rather questionable. We certainly didn't change the source page so changing its LSN seems rather wrong --- wouldn't it be better to just copy the source pages with their original LSNs? 
So perhaps the best fix is to add a bool parameter to log_newpage telling it whether to update LSN/TLI, and have copy_relation_data pass false while the other callers pass true. (Although I guess we'd need to propagate that flag in the WAL record, so maybe this is more trouble than it's worth.)

regards, tom lane
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Wed, Jul 28, 2010 at 12:36 PM, Tom Lane t...@sss.pgh.pa.us wrote: I understand it, and I don't like it one bit. I haven't caught up on this thread yet, but I think the only acceptable solution is one that leaves the slave in the *same* state as the master. I might be missing something here, but I don't see how you're going to manage that. In Jeff's original example, he crashes the database after extending the relation but before initializing and writing the new page. I believe that at that point no XLOG has been written yet, so the relation has been extended but there is no WAL to be sent to the standby. So now you have the exact situation you're concerned about - the relation has been extended on the master but not on the standby. You're right that we cannot prevent that situation --- or at least, the cure would be worse than the disease. (The cure would be to XLOG the extension action, obviously, but then out-of-disk-space has to be a PANIC condition.) Not to mention that performance would probably be atrocious. However, it doesn't follow that it's a good idea to make copy_relation_data *intentionally* make the slave and master different. I've caught up on the thread now, and I think that fix2 (skip logging the page) is extremely dangerous and has little if anything in its favor. Why do you think that? They will be different only in terms of whether the uninitialized bytes are before or after the nominal EOF, and we know we have to be indifferent to that case anyway. fix1 seems reasonable given the structure of the page validity checks. However, what about Jeff's original comment : On second thought, why are PageSetLSN and PageSetTLI being called from : log_newpage(), anyway? 
I think it is appropriate to be setting the LSN/TLI in the case of a page that's been constructed by the caller as part of the WAL-logged action, but doing so in copy_relation_data seems rather questionable. We certainly didn't change the source page so changing its LSN seems rather wrong --- wouldn't it be better to just copy the source pages with their original LSNs? So perhaps the best fix is to add a bool parameter to log_newpage telling it whether to update LSN/TLI, and have copy_relation_data pass false while the other callers pass true. (Although I guess we'd need to propagate that flag in the WAL record, so maybe this is more trouble than its worth.)

It seems like if log_newpage() were to set the LSN/TLI before calling XLogInsert() - or optionally not - then it wouldn't be necessary to set them also in heap_xlog_newpage(); the memcpy operation would by definition have copied the right information onto the page. That seems like it would be a cleaner design, but back-patching a change to the interpretation of WAL records that might already be on someone's disk seems dicey at best.

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] reducing NUMERIC size for 9.1
[ gradually catching up on email ] Robert Haas robertmh...@gmail.com writes: On Fri, Jul 16, 2010 at 2:39 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't like the way you did that either (specifically, not the kluge in NUMERIC_DIGITS()). It would probably work better if you declared two different structs, or a union of same, to represent the two layout cases. n_sign_dscale is now pretty inappropriately named, probably better to change the field name. This will also help to catch anything that's not using the macros. (Renaming the n_weight field, or at least burying it in an extra level of struct, would be helpful for the same reason.)

I'm not sure what you have in mind here. If we create a union of two structs, we'll still have to pick one of them to use to check the high bits of the first word, so I'm not sure we'll be adding all that much in terms of clarity.

No, you can do something like this:

typedef struct numeric_short
{
	uint16		word1;
	NumericDigit digits[1];
} numeric_short;

typedef struct numeric_long
{
	uint16		word1;
	int16		weight;
	NumericDigit digits[1];
} numeric_long;

typedef union numeric
{
	uint16		word1;
	numeric_short	short;
	numeric_long	long;
} numeric;

and then access word1 either directly or (after having identified which format it is) via one of the sub-structs. If you really wanted to get pedantic you could have a third sub-struct representing the format for NaNs, but since those are just going to be word1 it may not be worth the trouble.

It seems like you've handled the NAN case a bit awkwardly. Since the weight is uninteresting for a NAN, it's okay to not store the weight field, so I think what you should do is consider that the dscale field is still full-width, ie the format of the first word remains old-style not new-style. I don't remember whether dscale is meaningful for a NAN, but if it is, your approach is constraining what is possible to store, and is also breaking compatibility with old databases.

There is only one NaN value.
Neither weight nor dscale is meaningful. I think if the high two bits of the first word are 11 we never examine anything else - do you see somewhere that we're doing otherwise?

I hadn't actually looked. I think though that it's a mistake to break compatibility on both dscale and weight when you only need to break one. Also, weight is *certainly* uninteresting for NaNs since it's not even meaningful unless there are digits. dscale could conceivably be worth something.

The sign extension code in the NUMERIC_WEIGHT() macro seems a bit awkward; I wonder if there's a better way. One solution might be to offset the value (ie, add or subtract NUMERIC_SHORT_WEIGHT_MIN) rather than try to sign-extend per se.

Hmm... so, if the weight is X we store the value X-NUMERIC_SHORT_WEIGHT_MIN as an unsigned integer? That's kind of a funny representation - I *think* it works out to sign extension with the high bit flipped. I guess we could do it that way, but it might make it harder/more confusing to do bit arithmetic with the weight sign bit later on.

Yeah, it was just an idea. It seems like there should be an easier way to extract the sign-extended value, though.

regards, tom lane
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, 2010-07-28 at 14:50 -0400, Robert Haas wrote: It seems like if log_newpage() were to set the LSN/TLI before calling XLogInsert() - or optionally not - then it wouldn't be necessary to set them also in heap_xlog_newpage(); the memcpy operation would by definition have copied the right information onto the page. That seems like it would be a cleaner design, but back-patching a change to the interpretation of WAL records that might already be on someone's disk seems dicey at best.

How do you set the LSN before XLogInsert()?

Regards, Jeff Davis
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 3:08 PM, Jeff Davis pg...@j-davis.com wrote: On Wed, 2010-07-28 at 14:50 -0400, Robert Haas wrote: It seems like if log_newpage() were to set the LSN/TLI before calling XLogInsert() - or optionally not - then it wouldn't be necessary to set them also in heap_xlog_newpage(); the memcpy operation would by definition have copied the right information onto the page. That seems like it would be a cleaner design, but back-patching a change to the interpretation of WAL records that might already be on someone's disk seems dicey at best. How do you set the LSN before XLogInsert()?

Details, details...

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
Robert Haas robertmh...@gmail.com writes: On Wed, Jul 28, 2010 at 2:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: I've caught up on the thread now, and I think that fix2 (skip logging the page) is extremely dangerous and has little if anything in its favor. Why do you think that? They will be different only in terms of whether the uninitialized bytes are before or after the nominal EOF, and we know we have to be indifferent to that case anyway.

(1) You're assuming that the page will be zeroes on the slave without having forced it to be so. A really obvious case where this fails is where we're doing crash-and-restart on the master: a later action could have modified the page away from the all-zero state. (In principle that's OK but I think this might break torn-page protection.)

(2) On filesystems that support holes, the page will not have storage, whereas it (probably) does on the master. This could lead to a divergence in behavior later, ie slave runs out of disk space at a different point than the master.

(3) The position of the nominal EOF can drive choices about which page to put new tuples in, specifically that's where RelationGetBufferForTuple will go if FSM has no information. This could result in unexpected divergence in behavior after the slave goes live compared to what the master would have done. Maybe that's OK but it seems better to avoid it if we can, especially when you think about crash-and-restart on the master as opposed to a separate slave.

Now as I said earlier, these are all tiny corners of a corner case, and they *probably* shouldn't matter. But I see no good reason to expose ourselves to the possibility that there's some cases where they do matter. Especially when your argument for fix2 is a purely aesthetic judgment that I don't agree with anyway.
I think it is appropriate to be setting the LSN/TLI in the case of a page that's been constructed by the caller as part of the WAL-logged action, but doing so in copy_relation_data seems rather questionable. We certainly didn't change the source page so changing its LSN seems rather wrong --- wouldn't it be better to just copy the source pages with their original LSNs? It seems like if log_newpage() were to set the LSN/TLI before calling XLogInsert() - or optionally not - then it wouldn't be necessary to set them also in heap_xlog_newpage(); the memcpy operation would by definition have copied the right information onto the page.

Not possible because it is only after you've done XLogInsert that you know what LSN was assigned to the WAL record.

regards, tom lane
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
I wrote:
> I think it is appropriate to be setting the LSN/TLI in the case of a page that's been constructed by the caller as part of the WAL-logged action, but doing so in copy_relation_data seems rather questionable.

BTW, I thought of an argument that explains why that's sane: it marks the copied page as having been recently WAL-logged. If we do some action on the copied relation shortly after completing the copy_relation_data transaction, we will see that its LSN is later than the last checkpoint and know that we don't need to emit a full-page WAL image for it, which is correct because in case of crash+restart the HEAP_NEWPAGE record will provide the full-page image. If we left the source relation's page's LSN in there, we would frequently make the wrong decision and emit an unnecessary extra full-page image.

So never mind that distraction. I'm back to thinking that fix1 is the way to go.

regards, tom lane
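To make the decision described above concrete: a full-page image is emitted for a buffer only when the page's LSN shows it was last WAL-logged at or before the latest checkpoint's redo pointer. A minimal sketch of that rule (heavily simplified; the type and function names here are illustrative, not PostgreSQL's actual XLogInsert machinery, where the WAL pointer is a struct in this era):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for PostgreSQL's WAL position type. */
typedef uint64_t XLogRecPtr;

/*
 * A full-page image is needed only when the page was last WAL-logged
 * at or before the latest checkpoint's redo pointer.  Stamping a
 * freshly copied page with the HEAP_NEWPAGE record's LSN makes
 * page_lsn > redo_ptr, so the next modification of the page can skip
 * the redundant full-page image.
 */
static bool
needs_full_page_image(XLogRecPtr page_lsn, XLogRecPtr redo_ptr)
{
    return page_lsn <= redo_ptr;
}
```

With the source relation's (typically older) LSN left on the copied page, `page_lsn <= redo_ptr` would often hold and an unnecessary extra full-page image would be emitted, which is the waste the message above describes.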
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, Jul 28, 2010 at 3:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> (1) You're assuming that the page will be zeroes on the slave without having forced it to be so. A really obvious case where this fails is where we're doing crash-and-restart on the master: a later action could have modified the page away from the all-zero state. (In principle that's OK but I think this might break torn-page protection.)

Hmm, yeah, that does seem like it has the potential to be bad. I think this is sufficient reason to go with fix #1.

> (2) On filesystems that support holes, the page will not have storage, whereas it (probably) does on the master. This could lead to a divergence in behavior later, ie slave runs out of disk space at a different point than the master.

I can't get excited about this one.

> (3) The position of the nominal EOF can drive choices about which page to put new tuples in, specifically that's where RelationGetBufferForTuple will go if FSM has no information. This could result in unexpected divergence in behavior after the slave goes live compared to what the master would have done. Maybe that's OK but it seems better to avoid it if we can, especially when you think about crash-and-restart on the master as opposed to a separate slave.

You're still going to have that in the normal (not altering the tablespace) extension case, which is presumably far more common.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
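Point (2) is easy to demonstrate: on a filesystem with hole support, seeking past EOF and writing leaves the skipped range unallocated, so the nominal file size and the allocated storage diverge. A hedged POSIX-only sketch (whether `st_blocks` stays small is filesystem-dependent, so only the nominal size is a firm expectation):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, seek 1 MB past EOF, and write one byte.
 * The skipped megabyte becomes a hole on filesystems that support
 * them: st_size reports the nominal EOF, while st_blocks may show
 * far less allocated storage.  Returns the nominal size, or -1 on
 * any failure.
 */
static long long
nominal_size_after_sparse_write(void)
{
    char path[] = "/tmp/sparse_demo_XXXXXX";
    int fd = mkstemp(path);
    long long size = -1;
    struct stat st;

    if (fd < 0)
        return -1;
    if (lseek(fd, 1024 * 1024, SEEK_SET) == 1024 * 1024 &&
        write(fd, "x", 1) == 1 &&
        fstat(fd, &st) == 0)
    {
        printf("nominal size: %lld, 512-byte blocks allocated: %lld\n",
               (long long) st.st_size, (long long) st.st_blocks);
        size = (long long) st.st_size;
    }
    close(fd);
    unlink(path);
    return size;
}
```

On ext4, for example, the allocated block count typically stays near one filesystem block, while a standby that wrote real zeroes up to the same EOF would allocate the whole megabyte: the disk-usage divergence being discussed.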
Re: [HACKERS] [JDBC] Trouble with COPY IN
On Jul 28, 2010, at 9:53 AM, Kris Jurka wrote:
> Technically you won't get NotificationResponse until transaction end, so you don't need to worry about that mid copy.

Ah, thanks for noting that. It would appear my original reading of the async section didn't get far enough beyond "Frontends must be prepared to deal with these messages at any time, even when not engaged in a query." I see the note below clarifying NotificationResponse.

> One of the key points of confusion is that CopyData(EOF) does not result in an error. It results in ignoring any further data. The problem I have is that for text mode it waits for CopyDone, but in binary mode it ends the copy sequence immediately.

That is bothersome. :\

> Additionally the interface exposed by the JDBC driver lets the user write arbitrary CopyData bytes to the server, so without parsing all of that we don't know whether they've issued CopyData(EOF) or not.

Okay, so you can't know with absolute certainty without parsing the data, but the usual case would be handled by holding onto the last N bytes or so. Enough to fit the EOF and perhaps a little more for paranoia's sake.

That's not to say that I'm missing the problem. When (not if) the user feeds data past a CopyData(EOF), it's going to get interesting. [Thinking about the logic necessary to handle such a case and avoid network buffer deadlock...]

I would think the least invasive way to handle it would be to set the CommandComplete and ReadyForQuery messages aside when they are received if CopyDone hasn't been sent, continue the COPY operation as usual until it is shut down, send CopyDone and, finally, reinstate CommandComplete and RFQ as if they were just received. I don't think that really accommodates CopyFail, as the status in RFQ will need to be adjusted to match what was actually done..?

Well, I'm not sure you would need to worry about NoticeResponse after a premature CommandComplete, as INSERTs are no longer happening. Ugh. +1 for a fix.
Not directly regarding your patch, but while the discussion is in the general area: I think it would be wise to throw an error when non-empty CopyData messages are received after CopyData(EOF). Chances are that the user is making a mistake and should be notified of it.

cheers, jwp
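For reference, "holding onto the last N bytes" works because the binary COPY format ends with a fixed file trailer: a 16-bit integer word containing -1, transmitted big-endian as the bytes 0xFF 0xFF. A minimal sketch of the check (the function name is illustrative, not from the JDBC driver; as noted above it is a heuristic unless the stream is fully parsed, since a tuple field's payload could also happen to end in those bytes):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * The binary COPY format is terminated by a file trailer: a 16-bit
 * integer word holding -1, i.e. the bytes 0xFF 0xFF on the wire.  A
 * driver that remembers just the last two bytes the user supplied can
 * therefore catch the common case of data sent past the end marker
 * without parsing every tuple.
 */
static bool
ends_with_binary_copy_trailer(const uint8_t *tail, size_t len)
{
    return len >= 2 && tail[len - 2] == 0xFF && tail[len - 1] == 0xFF;
}
```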
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Wed, 2010-07-28 at 15:37 -0400, Tom Lane wrote:
> So nevermind that distraction. I'm back to thinking that fix1 is the way to go.

Agreed. It's uncontroversial to have a simple guard against corrupting an uninitialized page, and uncontroversial is good for things that will be back-patched.

Regards,
Jeff Davis
Re: [HACKERS] Patch to show individual statement latencies in pgbench output
Finally got around to taking a longer look at your patch; sorry about the delay here. The patch itself seems to work on simple tests anyway (more on the one suspect bit below). You didn't show what the output looks like, so let's start with that, because it is both kind of neat and not what I expected from your description. Here's the sort of extra stuff added to the end of the standard output when you toggle this feature on:

$ pgbench -S pgbench -T 10 -c 8 -j 4 -l -r
...
tps = 28824.943954 (excluding connections establishing)

command latencies
0.001487    \set naccounts 100000 * :scale
0.000677    \setrandom aid 1 :naccounts
0.273983    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

From the way you described the patch, I had thought that you were just putting this information into the log files or something like that. In fact, it's not in the log files; it just shows up in this summary at the end. I'm not sure if that's best or not. Some might want to see how the per-statement variation varies over time. Sort of torn on whether the summary alone is enough detail or not. Let me play with this some more and get back to you on that.
Here's what a standard test looks like:

tps = 292.468349 (excluding connections establishing)

command latencies
0.002120    \set nbranches 1 * :scale
0.000682    \set ntellers 10 * :scale
0.000615    \set naccounts 100000 * :scale
0.000723    \setrandom aid 1 :naccounts
0.000522    \setrandom bid 1 :nbranches
0.000553    \setrandom tid 1 :ntellers
0.000525    \setrandom delta -5000 5000
0.070307    BEGIN;
1.721631    UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
0.147854    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
11.894366   UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
4.761715    UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
0.643895    INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
7.452017    END;

I'm happy to see that the END here has a significant portion of time assigned to it, which means that it's correctly tracking the commit that happens there.

It's possible to ask the question "why add this feature when pg_stat_statements will get you the same data?" I think this gives a different enough view of things that it's worth adding anyway. Getting the END statements tracked, not having to use prepared statements to make it work, and not having to worry about overflowing the pg_stat_statements.max parameter that limits what that module tracks are all points in favor of this patch being useful even if you know about pg_stat_statements.

Now onto the nitpicking. With the standard Ubuntu compiler warnings on I get this:

pgbench.c:1588: warning: ‘i’ may be used uninitialized in this function

If you didn't see that, you may want to double-check how verbose you have your compiler set up to be; maybe you just missed it (which is of course what reviewers are here for). Regardless, the troublesome bit is this:

    int i;
    ...
    commands = process_commands(buf[i]);

Which is obviously not a good thing.
I'm not entirely sure what you're doing with the changes you made to process_file, but I'd suggest you double-check the logic and coding of that section, because there's at least one funny thing going on: i being used without initialization. I didn't try yet to figure out how this error might lead to a bug, but there probably is one.

This looks like a good feature to me; I'm just not sure if it's worth extending to produce even more output if people want to see it. We can always layer that on top later. I'll continue testing and try to get a firmer opinion. Please take a look at the problem I pointed out and produce a new patch when you get a chance that fixes that part, so at least we don't get stuck on that detail.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us
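The bookkeeping behind the "command latencies" report above is modest: stamp each command when it is sent and add the elapsed time to a per-command accumulator when its result arrives. A hedged sketch of that machinery (the struct and function names are illustrative, not the patch's actual fields):

```c
#include <stdio.h>
#include <sys/time.h>

/* Per-command accumulator, one per line of the script file. */
typedef struct
{
    const char *line;        /* command text, echoed in the report */
    long long   total_usec;  /* summed execution time */
    long long   count;       /* number of executions */
} CommandStats;

static long long
usec_diff(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000000LL
        + (end->tv_usec - start->tv_usec);
}

/* Called when a command's result arrives: charge the elapsed time. */
static void
account_command(CommandStats *cs,
                const struct timeval *start, const struct timeval *end)
{
    cs->total_usec += usec_diff(start, end);
    cs->count++;
}

/* One line of the "command latencies" report: mean latency in ms. */
static void
report_command(const CommandStats *cs)
{
    double mean_ms = cs->count ? cs->total_usec / 1000.0 / cs->count : 0.0;
    printf("%f\t%s\n", mean_ms, cs->line);
}
```

Note that the accumulation happens per script command rather than per SQL statement, which is how the patch is able to attribute time to BEGIN and END as shown above.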
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Robert Haas robertmh...@gmail.com writes:
>> Well, that's pretty much saying we won't release before September.
> Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch.

Actually, rewind. I see that you moved the user-mappings issue I was concerned about to "resolved after beta3"; I missed the fact that you'd committed a fix there. You also fixed the EPQ issue, and the heap_update_redo problem evaporated. So now we have the following issues remaining:

* page corruption after moving tablespace
* ExplainOnePlan handles snapshots differently than ProcessQuery
* name and comment of XLogSetAsyncCommitLSN() should be changed
* Documentation fails to build as PDF

...and I wouldn't necessarily regard any of those as forcing another beta; the first two are ancient, the third is cosmetic, and the last one is a build system issue rather than a code change. Obviously, it's too early to decide anything: we may yet discover more issues that need to be addressed. But I think we're in much better shape than it seemed 24 hours ago.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] string_to_array has to be stable?
On Tue, 2010-07-20 at 11:31 +0200, Pavel Stehule wrote:
> Hello
>
> I am working on to_array, to_string functions and I am looking at the string_to_array function. I am surprised that this function is marked as immutable:
>
> postgres=# select array_to_string(array[current_date],',');
>  array_to_string
> -----------------
>  2010-07-20
> (1 row)
>
> postgres=# set datestyle to German ;
> SET
> postgres=# select array_to_string(array[current_date],',');
>  array_to_string
> -----------------
>  20.07.2010
> (1 row)

What's wrong with that? current_date is the part that's changing, and it's being passed as an argument to the function. If the argument changes, an immutable function can return a different result.

Regards,
Jeff Davis
Re: [HACKERS] documentation for committing with git
On Wed, Jul 21, 2010 at 9:22 AM, Robert Haas robertmh...@gmail.com wrote:
> On the other hand, if you have technical corrections, or if you have suggestions on how to do the same things better (rather than suggestions on what to do differently), that would be greatly appreciated.

Somewhere in that wiki page there is some musing about the size of .git directories being somewhat dissatisfying when one feels compelled to have multiple check-outs materialized. There's a facility in git known as "alternates" that lets you fetch objects from some other pool. This is highly useful if you have the same project checked out many times, either for personal use or on a hosting server of some sort.

Because the object pool being referenced has no knowledge of other repositories referencing it, garbage collection of the underlying pool (should you be doing things that generate garbage, such as deleting branches and tags) can cause corruption in referencing repositories in the case where they reference objects that have since been deleted. This will never happen if the repository monotonically grows, as is often the case for an "authoritative" repository, such as the one at git.postgresql.org that only has major branches and release tags that will never go away (save the rare case of fixing up after a cvs race condition, which has occurred a few times in the past).

In practice, one can just make a clean clone of a project for the purposes of such an object pool and then let it sit for months or even years, as the size of each subsequent .git, even for considerable amounts of history, is marginal. Once in a while one can perform the clean-up task of catching up the object pool, if the .git directories have gotten unacceptably large.
Here's a brief section about it on a git wiki:

https://git.wiki.kernel.org/index.php/GitFaq#How_to_share_objects_between_existing_repositories.3F

fdr
Re: [HACKERS] [GENERAL] Incorrect FTS result with GIN index
Oleg Bartunov o...@sai.msu.su writes:
> you can download dump http://mira.sai.msu.su/~megera/tmp/search_tab.dump

Hmm ... I'm not sure why you're failing to reproduce it, because it's falling over pretty easily for me. After poking at it for awhile, I am of the opinion that scanGetItem's handling of multiple keys is fundamentally broken and needs to be rewritten completely. The particular case I'm seeing here is that one key returns this sequence of TIDs/lossy flags:

    ... 1085/4 0, 1086/65535 1, 1087/4 0, ...

while the other one returns this:

    ... 1083/11 0, 1086/6 0, 1086/10 0, 1087/10 0, ...

and what comes out of scanGetItem is just

    ... 1086/6 1 ...

because after returning that, on the next call it advances both input keystreams. So 1086/10 should be visited and is not.

I think that depending on the previous entryRes state to determine what to do is basically unworkable, and what should probably be done instead is to remember the last-returned TID and advance keystreams with TIDs <= that. I haven't quite thought through how that should interact with lossy-page TIDs, but it seems more robust than what we've got.

I'm also noticing that the ANDing behavior for the "ee:* dd:*" query style seems very much stupider than it needs to be --- it's returning lossy pages that very obviously don't need to be examined because the other keystream has no match at all on that page. But I haven't had time to probe into the reason why. I'm out of time for today; do you want to work on it?

regards, tom lane
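Ignoring lossy-page items, the proposed fix amounts to the classic sorted-merge intersection: emit a TID only when every keystream produces it, and advance only the stream that is behind, never both unconditionally. A simplified two-stream illustration (packing (block, offset) into one integer so plain comparison gives heap order; this is a sketch, not the ginget.c code):

```c
#include <stddef.h>
#include <stdint.h>

/* Pack a heap TID so that integer comparison matches heap order. */
#define TID(block, offset)  (((uint32_t) (block) << 16) | (offset))

/*
 * Intersect two sorted TID streams, as an AND of two GIN keys would.
 * Only the lagging stream advances; both streams move past a TID only
 * once it has been emitted.  Advancing both streams unconditionally
 * after each match is the bug described above: it can skip a TID such
 * as 1086/10 that follows a match in just one stream.  Lossy-page
 * items are ignored here.
 */
static size_t
intersect_tids(const uint32_t *a, size_t na,
               const uint32_t *b, size_t nb, uint32_t *out)
{
    size_t ia = 0, ib = 0, nout = 0;

    while (ia < na && ib < nb)
    {
        if (a[ia] < b[ib])
            ia++;
        else if (a[ia] > b[ib])
            ib++;
        else
        {
            out[nout++] = a[ia];
            ia++;
            ib++;
        }
    }
    return nout;
}
```

The real fix also has to decide what a lossy whole-page item matches against; the sketch shows only the exact-TID ordering invariant that scanGetItem currently violates.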
Re: [HACKERS] do we need to postpone beta4?
Robert Haas robertmh...@gmail.com writes:
> So now we have the following issues remaining:
>
> * page corruption after moving tablespace
> * ExplainOnePlan handles snapshots differently than ProcessQuery
> * name and comment of XLogSetAsyncCommitLSN() should be changed
> * Documentation fails to build as PDF
>
> ...and I wouldn't necessarily regard any of those as forcing another beta; the first two are ancient, the third is cosmetic, and the last one is a build system issue rather than a code change. Obviously, it's too early to decide anything: we may yet discover more issues that need to be addressed. But I think we're in much better shape than it seemed 24 hours ago.

Yeah. I'm off poking at the incorrect FTS result problem, but that is a pre-existing bug as well; it goes back at least to 8.4 and probably further.

regards, tom lane
Re: [HACKERS] string_to_array has to be stable?
Jeff Davis pg...@j-davis.com writes:
> On Tue, 2010-07-20 at 11:31 +0200, Pavel Stehule wrote:
>> I am working on to_array, to_string functions and I am looking on string_to_array function. I am surprised so this function is marked as immutable
> What's wrong with that? current_date is the part that's changing, and it's being passed as an argument to the function. If the argument changes, an immutable function can return a different result.

string_to_array() seems fine to me: it's a predictable transformation from text to text. However, I think that there really is an issue with array_to_string(), because that takes an anyarray and invokes the array element's type output function. Type output functions are not necessarily immutable, and if the input is of a type where that's not true, then the array_to_string() transformation isn't immutable either. An example is that date's output function produces different results depending on datestyle.

I can't remember offhand whether there are any volatile type output functions, but if there were, we'd really need to mark array_to_string() as volatile. That would be unpleasant for performance, though. I'd rather compromise on stable. Thoughts?

regards, tom lane
Re: [HACKERS] string_to_array has to be stable?
On Wed, 2010-07-28 at 20:25 -0400, Tom Lane wrote:
> string_to_array() seems fine to me: it's a predictable transformation from text to text. However, I think that there really is an issue with array_to_string(), because that takes an anyarray and invokes the array element's type output function.

Yes, I misread the problem because he used current_date rather than a date literal.

> I can't remember offhand whether there are any volatile type output functions, but if there were we'd really need to mark array_to_string() as volatile. That would be unpleasant for performance though. I'd rather compromise on stable. Thoughts?

Stable seems reasonable to me. A volatile type output function sounds like an edge case. Perhaps there are even grounds to force a type output function to be stable, similar to how we force the function for a functional index to be immutable.

Regards,
Jeff Davis
Re: [HACKERS] Patch to show individual statement latencies in pgbench output
On Jul 29, 2010, at 00:48, Greg Smith wrote:
> Finally got around to taking a longer look at your patch, sorry about the delay here. Patch itself seems to work on simple tests anyway (more on the one suspect bit below). You didn't show what the output looks like, so let's start with that because it is both kind of neat and not what I expected from your description. Here's the sort of extra stuff added to the end of the standard output when you toggle this feature on:
> [snipped output]
> From the way you described the patch, I had thought that you were just putting this information into the log files or something like that. In fact, it's not in the log files; it just shows up in this summary at the end. I'm not sure if that's best or not. Some might want to see how the per-statement variation varies over time. Sort of torn on whether the summary alone is enough detail or not. Let me play with this some more and get back to you on that.

I think the two features are pretty much orthogonal, even though they'd make use of the same per-statement instrumentation machinery. I created the patch to tune the wal_writer for the synchronous_commit=off case - the idea being that COMMIT should be virtually instantaneous if the wal_writer writes old wal buffers out fast enough.

I haven't yet used pgbench's log output feature, so I can't judge how useful the addition of per-statement data to that log is, or what the format should be. However, if you think it's useful and can come up with a sensible format, I'd be happy to add that feature to the patch.

> Now onto the nitpicking. With the standard Ubuntu compiler warnings on I get this:
> pgbench.c:1588: warning: ‘i’ may be used uninitialized in this function
> If you didn't see that, you may want to double-check how verbose you have your compiler setup to be; maybe you just missed it (which is of course what reviewers are here for). Regardless, the troublesome bit is this:
> int i;
> commands = process_commands(buf[i]);

Fixed.
That was a leftover of the trimming and comment-skipping logic, which my patch moves to process_command. An updated patch is attached.

Thanks for your extensive review.

best regards,
Florian Pflug

pgbench_statementlatency.patch
Description: Binary data