Re: [HACKERS] Canceling ROLLBACK statement
Pavan Deolasee writes: > If client is running a ROLLBACK statement and sends a statement cancel > signal to the server and if the cancel signal gets processed after the > transaction AbortTransaction() is completed, but before > CleanupTransaction() is called, I think the server may try to ABORT the > transaction again, This should be okay. If you can demonstrate a case where it causes problems, that would be a bug to fix, but I don't think it's likely to be anything fundamental. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Canceling ROLLBACK statement
Hi All,

While working on something in XC, I hit upon an assertion failure. While this is in XC code, I believe there may be a way of reproducing this on vanilla PostgreSQL as well. I could not do so even after several tries, unless I add some code or run it through a debugger. Here is the theory anyway.

If a client is running a ROLLBACK statement and sends a statement cancel signal to the server, and the cancel signal gets processed after AbortTransaction() has completed but before CleanupTransaction() is called, I think the server may try to ABORT the transaction again, especially if it raises an ereport(ERROR) reporting "canceling statement due to user request". That results in the server throwing a warning "AbortTransaction while in ABORT state" and subsequently an assertion failure in ProcArrayEndTransaction because proc->xid is not valid.

The reason I believe I am not able to reproduce this is that interrupts are usually not processed between AbortTransaction and CleanupTransaction. And when the server does get a chance to process the interrupts, DoingCommandRead is set and hence the cancel statement event is ignored. But it's not entirely unlikely that the server may get a chance to process the interrupt earlier, for example if CleanupTransaction() calls elog(), and there are a bunch of possible code paths where that can happen.

I am not sure if this is really an issue or if it's worth fixing, but I thought it would be good to share at the least.

Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
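For illustration, here is a rough C sketch of the window being described. This is pseudo-code rather than actual xact.c logic, and the CHECK_FOR_INTERRUPTS() call marks the assumed interrupt point, not one that exists in the current sources:

/* Illustrative sketch of the suspected race -- not actual PostgreSQL code. */
static void
rollback_race_sketch(void)
{
    AbortTransaction();         /* transaction is now in its abort state;
                                 * proc->xid has already been cleared */

    /*
     * Hypothetical interrupt point: suppose something in this window (say
     * an elog() path) ends up servicing the pending cancel request.
     */
    CHECK_FOR_INTERRUPTS();     /* would ereport(ERROR): "canceling statement
                                 * due to user request" */

    /*
     * The resulting ERROR makes the backend try to abort again, producing
     * the "AbortTransaction while in ABORT state" warning and then the
     * assertion failure in ProcArrayEndTransaction, since the xid is gone.
     */

    CleanupTransaction();       /* never reached in the problem scenario */
}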
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Mar 1, 2012 at 9:11 PM, Tom Lane wrote: > Robert Haas writes: >> One thing I'm not too sure about is how to extend the page format to >> handle optional features. For example, suppose we want to add 2 bytes >> to the page header for a checksum (or 4 bytes, or any other number). >> Ideally, I'd like to not use up those 2 bytes on pages that aren't >> checksummed. What do we do about that? > > I'm having a hard time getting excited about adding that requirement on > top of all the others that we have for this feature. In particular, if > we insist on that, there is exactly zero chance of turning checksums on > on-the-fly, because you won't be able to do it if the page is full. > > The scheme that seems to me to make the most sense for checksums, > if we grant Simon's postulate that a 2-byte checksum is enough, is > (1) repurpose the top N bits of pd_flags as a page version number, > for some N; > (2) repurpose pd_pagesize_version as the checksum, with the > understanding that you can't have a checksum in version-0 pages. > > (Simon's current patch seems to be an overengineered version of that. > Possibly we should also ask ourselves how much we really need pd_tli > and whether that couldn't be commandeered as the checksum field.) > > I see the page versioning stuff as mainly being of interest for changes > that are more invasive than this, for instance tuple-header or data > changes. Well, OK, so... it wouldn't bother me a bit to steal pd_tli for this, although Simon previously objected to stealing even half of it, when I suggested that upthread. But I don't see the point of your first proposal: keeping the page version right where it is is a good idea, and we should do it. We could steal some *high* order bits from that field without breaking anything, but moving it around seems like it will complicate life to no good purpose. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
On 1 March 2012 22:09, Josh Berkus wrote:
> On 3/1/12 1:57 PM, Daniel Farina wrote:
>> On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan wrote:
>>> My expectation is that this feature will make life a lot
>>> easier for a lot of Postgres users.
>>
>> Yes. It's hard to overstate the apparent utility of this feature in
>> the general category of visibility and profiling.
>
> More importantly, this is what pg_stat_statements *should* have been in
> 8.4, and wasn't.

It would probably be prudent to concentrate on getting the core infrastructure committed first. That way, we at least know that if this doesn't get into 9.2, we can work on getting it into 9.3 knowing that once committed, people won't have to wait over a year at the very least to be able to use the feature. The footprint of such a change is quite small:

 src/backend/nodes/copyfuncs.c        |  2 +
 src/backend/nodes/equalfuncs.c       |  4 +
 src/backend/nodes/outfuncs.c         |  6 +
 src/backend/nodes/readfuncs.c        |  5 +
 src/backend/optimizer/plan/planner.c |  1 +
 src/backend/parser/analyze.c         | 37 +-
 src/backend/parser/parse_coerce.c    | 12 +-
 src/backend/parser/parse_param.c     | 11 +-
 src/include/nodes/parsenodes.h       |  3 +
 src/include/nodes/plannodes.h        |  2 +
 src/include/parser/analyze.h         | 12 +
 src/include/parser/parse_node.h      |  3 +-

That said, I believe that the patch is pretty close to a committable state as things stand, and that all that is really needed is for a committer familiar with the problem space to conclude the work started by Daniel and others, adding:

 contrib/pg_stat_statements/pg_stat_statements.c | 1420 ++-

-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Robert Haas writes: > One thing I'm not too sure about is how to extend the page format to > handle optional features. For example, suppose we want to add 2 bytes > to the page header for a checksum (or 4 bytes, or any other number). > Ideally, I'd like to not use up those 2 bytes on pages that aren't > checksummed. What do we do about that? I'm having a hard time getting excited about adding that requirement on top of all the others that we have for this feature. In particular, if we insist on that, there is exactly zero chance of turning checksums on on-the-fly, because you won't be able to do it if the page is full. The scheme that seems to me to make the most sense for checksums, if we grant Simon's postulate that a 2-byte checksum is enough, is (1) repurpose the top N bits of pd_flags as a page version number, for some N; (2) repurpose pd_pagesize_version as the checksum, with the understanding that you can't have a checksum in version-0 pages. (Simon's current patch seems to be an overengineered version of that. Possibly we should also ask ourselves how much we really need pd_tli and whether that couldn't be commandeered as the checksum field.) I see the page versioning stuff as mainly being of interest for changes that are more invasive than this, for instance tuple-header or data changes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
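For reference, a sketch of how the existing page header could absorb a 16-bit checksum along the pd_tli lines mentioned above; the pd_checksum name is invented here, and none of this is committed code:

/*
 * Sketch of the kind of reinterpretation being discussed, based on the
 * current page header layout.  Only pd_tli changes meaning; whether the
 * page-format version lives in pd_flags or pd_pagesize_version is exactly
 * the open question in this thread.
 */
typedef struct PageHeaderData
{
    XLogRecPtr      pd_lsn;              /* LSN of last change to this page */
    uint16          pd_checksum;         /* was pd_tli; 16-bit page checksum */
    uint16          pd_flags;            /* flag bits; top N bits could carry
                                          * a page-format version instead */
    LocationIndex   pd_lower;            /* offset to start of free space */
    LocationIndex   pd_upper;            /* offset to end of free space */
    LocationIndex   pd_special;          /* offset to start of special space */
    uint16          pd_pagesize_version; /* page size and layout version */
    TransactionId   pd_prune_xid;        /* oldest prunable XID, or zero */
    ItemIdData      pd_linp[1];          /* line pointer array */
} PageHeaderData;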
Re: [HACKERS] pg_upgrade --logfile option documentation
On Thu, Mar 01, 2012 at 10:17:04PM +0200, Peter Eisentraut wrote: > On ons, 2012-02-29 at 22:47 -0500, Bruce Momjian wrote: > > Hey, that's a good idea. I would always write the pg_dump output to a > > log file. If the dump succeeds, I remove the file, if not, I tell > > users to read the log file for details about the failure --- good > > idea. > > But we also need the server log output somewhere. So I think this temp > file would need to cover everything that -l covers. OK, combining your and Robert's ideas, how about I have pg_upgrade write the server log to a file, and the pg_dump output to a file (with its stderr), and if pg_upgrade fails, I report the failure and mention those files. If pg_upgrade succeeds, I remove the files? pg_upgrade already creates temporary files that it removes on completion. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
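A minimal sketch of that flow, with invented file names and a plain system() call standing in for pg_upgrade's real helpers:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define DUMP_LOG_FILE   "pg_upgrade_dump.log"
#define SERVER_LOG_FILE "pg_upgrade_server.log"

static void
run_dump_step(void)
{
    char        cmd[1024];

    /* capture pg_dump's stdout and stderr together, so errors have context */
    snprintf(cmd, sizeof(cmd),
             "pg_dumpall --schema-only --binary-upgrade > \"%s\" 2>&1",
             DUMP_LOG_FILE);

    if (system(cmd) != 0)
    {
        fprintf(stderr,
                "dump step failed; see \"%s\" and \"%s\" for details\n",
                DUMP_LOG_FILE, SERVER_LOG_FILE);
        exit(1);
    }

    /* success: nobody needs the log, so remove it like other temp files */
    unlink(DUMP_LOG_FILE);
}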
Re: [HACKERS] pg_upgrade --logfile option documentation
On Thu, Mar 01, 2012 at 08:45:26AM -0500, Robert Haas wrote: > On Wed, Feb 29, 2012 at 6:02 PM, Bruce Momjian wrote: > >> Any ideas about improving the error reporting more generally, so that > >> when reloading the dump fails, the user can easily see what went > >> belly-up, even if they didn't use -l? > > > > The only idea I have is to write the psql log to a temporary file and > > report the last X lines from the file in case of failure. Does that > > help? > > Why not just redirect stdout but not stderr? If there are error > messages, surely we want the user to just see those. Well, I think sending the error messages to the user but stdout to a file will leave users confused because it will be unclear which SQL statement generated the error. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Mar 1, 2012 at 8:32 PM, Tom Lane wrote: > Alvaro Herrera writes: >> Excerpts from Robert Haas's message of jue mar 01 21:23:06 -0300 2012: >>> and that, further, you were arguing that we should not support >>> multiple page versions. > >> I don't think we need to support multiple page versions, if multiple >> means > 2. > > That's exactly the point here. We clearly cannot support on-line > upgrade unless, somewhere along the line, we are willing to cope with > two page formats more or less concurrently. What I don't want is for > that requirement to balloon to supporting N formats forever. If we > do not have a mechanism that allows certifying that you have no > remaining pages of format N-1 before you upgrade to a server that > supports (only) versions N and N+1, then we're going to be in the > business of indefinite backwards compatibility instead. > > I'm not entirely sure, but I think we may all be in violent agreement > about where this needs to end up. I was using multiple to mean N>1, so yeah, it's sounding like we're more or less on the same page here. One thing I'm not too sure about is how to extend the page format to handle optional features. For example, suppose we want to add 2 bytes to the page header for a checksum (or 4 bytes, or any other number). Ideally, I'd like to not use up those 2 bytes on pages that aren't checksummed. What do we do about that? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2012/3/2 Peter Eisentraut : >> with renaming to dblink_fdw_validator? > > Well, it's not the validator of the dblink_fdw, so maybe something like > basic_postgresql_fdw_validator. -1 for the same reason: it's not the validator of basic_postgresql_fdw either. Using "fdw" in the name of a validator that doesn't have an actual FDW might confuse users. Perhaps dblink_validator or libpq_option_validator would be better? Another possible idea is to create a dblink_fdw that uses the validator during "CREATE EXTENSION dblink", for users who store connection information in FDW objects. -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Alvaro Herrera writes: > Excerpts from Robert Haas's message of jue mar 01 21:23:06 -0300 2012: >> and that, further, you were arguing that we should not support >> multiple page versions. > I don't think we need to support multiple page versions, if multiple > means > 2. That's exactly the point here. We clearly cannot support on-line upgrade unless, somewhere along the line, we are willing to cope with two page formats more or less concurrently. What I don't want is for that requirement to balloon to supporting N formats forever. If we do not have a mechanism that allows certifying that you have no remaining pages of format N-1 before you upgrade to a server that supports (only) versions N and N+1, then we're going to be in the business of indefinite backwards compatibility instead. I'm not entirely sure, but I think we may all be in violent agreement about where this needs to end up. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review of: collation for (expr)
On Thu, Jan 19, 2012 at 1:50 PM, Peter Eisentraut wrote: > On tor, 2012-01-12 at 21:25 -0800, probabble wrote: >> Compiling on Ubuntu 10.04 LTS AMD64 on a GoGrid virtual machine from >> 2012-01-12 checkout. >> >> Bison upgraded to v2.5, and downgraded to v2.4.1 >> >> Make process for both versions resulted in the following errors: >> >> make[3]: Leaving directory `/usr/local/src/pgbuild/src/backend/catalog' >> make -C parser gram.h >> make[3]: Entering directory `/usr/local/src/pgbuild/src/backend/parser' >> /usr/local/bin/bison -d -o gram.c gram.y >> gram.y: conflicts: 370 reduce/reduce >> gram.y: expected 0 reduce/reduce conflicts >> gram.y:10482.27-10494.33: warning: rule useless in parser due to conflicts: >> func_expr: COLLATION FOR '(' a_expr ')' >> make[3]: *** [gram.c] Error 1 > > I can't reproduce that. > me neither > In the meantime, attached is a re-merged version of the patch; the old > version doesn't apply cleanly anymore. > besides a clash in the oid and the value of leakproof missing in the pg_proc entry, everything works fine. The only thing is that i don't see a reason for these includes in src/backend/utils/adt/misc.c: + #include "nodes/nodeFuncs.h" + #include "utils/lsyscache.h" -- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Excerpts from Robert Haas's message of jue mar 01 21:23:06 -0300 2012: > On Thu, Mar 1, 2012 at 4:08 PM, Tom Lane wrote: > > Robert Haas writes: > >>> So a relation can't have some pages in Version 9.2, and other pages in > >>> version 9.3? How will this work for 2TB tables? > > > >> Not very well, but better than Tom's proposal to require upgrading the > >> entire cluster in a single off-line operation. > > > > WTF? That was most certainly not what *I* was proposing; it's obviously > > unworkable. We need a process that can incrementally up-version a live > > database and keep track of the minimum version present, at some > > granularity smaller than "whole database". > > > > All of this was discussed and hashed out about two years ago, IIRC. > > We just haven't made any progress towards actually implementing those > > concepts. > > I am now officially confused. I thought you and Heikki were arguing > about 24 hours ago that we needed to shut down the old database, run a > pre-upgrade utility to convert all the pages, run pg_upgrade, and then > bring the new database on-line; Actually, nobody mentioned shutting the database down. This utility does not necessarily have to be something that runs independently from a running postmaster; in fact what I envisioned was that we would have it be run in a backend -- I've always imagined it's just a function to which you give a table OID, and it converts pages of that table into the new format. Maybe a user-friendly utility would connect to each database and call this function for each table. > and that, further, you were arguing that we should not support > multiple page versions. I don't think we need to support multiple page versions, if multiple means > 2. Obviously we need the server to support reading *two* page versions; though we can probably get away with having support for *writing* only the newest page version. (The pg_class column would mean "the oldest page version to be found in this table", so the table can actually contain a mixture of old pages and new pages.) -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
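As a rough illustration of that idea (the function name, NEW_PAGE_LAYOUT_VERSION, and convert_page_to_current() are invented here, and WAL logging plus the pg_class bookkeeping are glossed over), such a backend function might look like:

#include "postgres.h"
#include "fmgr.h"
#include "access/heapam.h"
#include "storage/bufmgr.h"
#include "storage/bufpage.h"
#include "utils/rel.h"

/* Illustrative sketch: rewrite every old-format page of one relation. */
Datum
upgrade_page_format(PG_FUNCTION_ARGS)
{
    Oid         relid = PG_GETARG_OID(0);
    Relation    rel = relation_open(relid, AccessExclusiveLock);
    BlockNumber nblocks = RelationGetNumberOfBlocks(rel);
    BlockNumber blkno;

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        Buffer      buf = ReadBuffer(rel, blkno);
        Page        page;

        LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
        page = BufferGetPage(buf);
        if (PageGetPageLayoutVersion(page) < NEW_PAGE_LAYOUT_VERSION)
        {
            convert_page_to_current(page);  /* hypothetical in-place rewrite */
            MarkBufferDirty(buf);
        }
        UnlockReleaseBuffer(buf);
    }

    relation_close(rel, AccessExclusiveLock);
    /* ... then record the new minimum page version for relid in pg_class ... */
    PG_RETURN_VOID();
}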
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Mar 1, 2012 at 4:08 PM, Tom Lane wrote: > Robert Haas writes: >>> So a relation can't have some pages in Version 9.2, and other pages in >>> version 9.3? How will this work for 2TB tables? > >> Not very well, but better than Tom's proposal to require upgrading the >> entire cluster in a single off-line operation. > > WTF? That was most certainly not what *I* was proposing; it's obviously > unworkable. We need a process that can incrementally up-version a live > database and keep track of the minimum version present, at some > granularity smaller than "whole database". > > All of this was discussed and hashed out about two years ago, IIRC. > We just haven't made any progress towards actually implementing those > concepts. I am now officially confused. I thought you and Heikki were arguing about 24 hours ago that we needed to shut down the old database, run a pre-upgrade utility to convert all the pages, run pg_upgrade, and then bring the new database on-line; and that, further, you were arguing that we should not support multiple page versions. Now you seem to be arguing the exact opposite - namely, that we should support multiple page versions, and that the conversion should be done on-line. So, to reiterate, I'm lost. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
[ sorry Tom, reply all this time... ] > What do you mean by "storing sequences as arrays"? So, a simple example is, for transcripts ( sequences of DNA that are turned into proteins ), we store each of the connected components as an array of the form:

    exon_type in [1,6]
    splice_type = [1,3]

and then the array elements are [ exon_type, splice_type, exon_type ]. ~99% of the elements are of the form [ [1,3], 1, [1,3] ], so I almost always get a hash or merge join ( correctly ) but for the rare junction types ( which are usually more interesting as well ) I correctly get nest loops with an index scan. > Can you demonstrate > that the existing stats are relevant at all to the query you're worried > about? Well, if we didn't have mcv's and just relied on ndistinct to estimate the '=' selectivities, either my low selectivity quals would use the index, or my high selectivity quals would use a table scan, either of which would be wrong. I guess I could wipe out the stats and get some real numbers tonight, but I can't see how the planner would be able to distinguish *without* mcv's... Best, Nathan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
Alvaro Herrera writes: > Excerpts from Tom Lane's message of jue mar 01 18:51:38 -0300 2012: >> How would we make it optional? There's noplace I can think of to stick >> such a knob ... > Uhm, attoptions? Oh, I had forgotten we had that mechanism already. Yeah, that might work. I'm a bit tempted to design the option setting so that you can select whether to keep the btree stats, the new stats, or both or neither --- after all, there are probably plenty of databases where nobody cares about the array-containment operators either. That leaves the question of which setting should be the default ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multi "-t" and adding "-n" to vacuumdb ?
Tom Lane wrote: > Why isn't your customer using autovacuum? If there are concrete > reasons why that doesn't get the job done for him, it would be > more useful in the long run to work on fixing that. FWIW, we're using autovacuum here, at slightly more aggressive settings from the default, and we still rely on manual vacuums for two main reasons: (1) VACUUM FREEZE ANALYZE after a bulk load to avoid the hint bit rewrite costs for end users and the unpredictable anti-wraparound autovacuum when all the loaded data suddenly hits the freeze threshold. (2) For base backups of databases across a slow WAN, we do a "diff" rsync against a copy of the hot standby here. (Well, actually, to save space we do it against a hard-link copy of the previous base backup against which we have run a "diff" rsync from the hot standby.) If we don't do a VACUUM FREEZE ANALYZE before each base backup, it at least doubles the size of base backups, due to the hint bit and xmin freeze changes that occur after the initial copy of a tuple is backed up. Simon's recent work, if it makes it in, will deal with (1), and it may be possible to deal with (2) using much more aggressive configurations for autovacuum, although I suspect it might take another tweak or two to the back end to really cover that without manual vacuums. -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Caching for stable expressions with constant arguments v6
On Sat, Feb 4, 2012 at 5:40 AM, Marti Raudsepp wrote: > On Sat, Feb 4, 2012 at 09:49, Jaime Casanova wrote: >> i little review... > > Thanks! By the way, you should update to the v7 patch. > just tried it and it fails when initializing during make check:

"""
creating information schema ... TRAP: FailedAssertion("!(*cachable == ((bool) 1))", File: "clauses.c", Line: 2295)
Aborted (core dumped)
child process exited with exit code 134
"""

-- Jaime Casanova www.2ndQuadrant.com Professional PostgreSQL: Soporte 24x7 y capacitación -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
Nathan Boley writes: > Maybe this is bad design, but I've gotten in the habit of storing > sequences as arrays and I commonly join on them. I looked through my > code this morning, and I only have one 'range' query ( of the form > described up-thread ), but there are tons of the form > SELECT att1, attb2 FROM rela, relb where rela.seq_array_1 = relb.seq_array; What do you mean by "storing sequences as arrays"? Can you demonstrate that the existing stats are relevant at all to the query you're worried about? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
>> What about MCV's? Will those be removed as well? > > Sure. Those seem even less useful. Ya, this will destroy the performance of several queries without some heavy tweaking. Maybe this is bad design, but I've gotten in the habit of storing sequences as arrays and I commonly join on them. I looked through my code this morning, and I only have one 'range' query ( of the form described up-thread ), but there are tons of the form SELECT att1, attb2 FROM rela, relb where rela.seq_array_1 = relb.seq_array; I can provide some examples if that would make my argument more compelling. Sorry to be difficult, Nathan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Allowing multi "-t" and adding "-n" to vacuumdb ?
"Jehan-Guillaume (ioguix) de Rorthais" writes: > One of our customer send us a patch he wrote for his needs (on > "src/bin/scripts/vacuumdb.c", no doc were included). > He's using one schema per application and would like to be able to run > vacuumdb on each of them independently so he added the "--schema|-n" > option and send us the patch. > Reviewing his patch, I thought it would be more useful to allow multi > "--table|-t" options on the command line first. It might be possible to > pass an array of tables to "vacuum_one_database" function instead of > just one. > Then, we could add the "--schema|-n" option which would fetch and build > the table list and call "vacuum_one_database". > But before I start writing this patch, I would like some opinion, pros / > cons. Do you think such a feature could be accepted in official > PostgreSQL code or should we keep this as an external script ? I think most of us see vacuumdb as a historical leftover. We keep it around in case anyone is still relying on it, but improving it seems like misdirected effort. Why isn't your customer using autovacuum? If there are concrete reasons why that doesn't get the job done for him, it would be more useful in the long run to work on fixing that. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
Excerpts from Tom Lane's message of jue mar 01 18:51:38 -0300 2012: > > Alvaro Herrera writes: > > Excerpts from Robert Haas's message of jue mar 01 12:00:08 -0300 2012: > >> On Wed, Feb 29, 2012 at 5:43 PM, Tom Lane wrote: > >> I confess I am nervous about ripping this out. I am pretty sure we > >> will get complaints about it. Performance optimizations that benefit > >> group A at the expense of group B are always iffy, and I'm not sure > >> the case of using an array as a path indicator is as uncommon as you > >> seem to think. > > > Maybe we should keep it as an option. > > How would we make it optional? There's noplace I can think of to stick > such a knob ... Uhm, attoptions? "alter table foo alter column bar set extended_array_stats to on" or something like that? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
On 3/1/12 1:57 PM, Daniel Farina wrote: > On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan wrote: >> My expectation is that this feature will make life a lot >> easier for a lot of Postgres users. > > Yes. It's hard to overstate the apparent utility of this feature in > the general category of visibility and profiling. More importantly, this is what pg_stat_statements *should* have been in 8.4, and wasn't. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
On Thu, Mar 1, 2012 at 1:27 PM, Peter Geoghegan wrote: > My expectation is that this feature will make life a lot > easier for a lot of Postgres users. Yes. It's hard to overstate the apparent utility of this feature in the general category of visibility and profiling. -- fdr -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
Alvaro Herrera writes: > Excerpts from Robert Haas's message of jue mar 01 12:00:08 -0300 2012: >> On Wed, Feb 29, 2012 at 5:43 PM, Tom Lane wrote: >> I confess I am nervous about ripping this out. I am pretty sure we >> will get complaints about it. Performance optimizations that benefit >> group A at the expense of group B are always iffy, and I'm not sure >> the case of using an array as a path indicator is as uncommon as you >> seem to think. > Maybe we should keep it as an option. How would we make it optional? There's noplace I can think of to stick such a knob ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
On 1 March 2012 00:48, Alvaro Herrera wrote:
> I'm curious about the LeafNode stuff. Is this something that could be
> done by expression_tree_walker? I'm not completely familiar with it so
> maybe there's some showstopper such as some node tags not being
> supported, or maybe it just doesn't help. But I'm curious.

Good question. The LeafNode function (which is a bit of a misnomer, as noted in a comment) looks rather like a walker function, as appears in the example in a comment in nodeFuncs.c:

 * expression_tree_walker() is designed to support routines that traverse
 * a tree in a read-only fashion (although it will also work for routines
 * that modify nodes in-place but never add/delete/replace nodes).
 * A walker routine should look like this:
 *
 * bool my_walker (Node *node, my_struct *context)
 * {
 *     if (node == NULL)
 *         return false;
 *     // check for nodes that special work is required for, eg:
 *     if (IsA(node, Var))
 *     {
 *         ... do special actions for Var nodes
 *     }
 *     else if (IsA(node, ...))
 *     {
 *         ... do special actions for other node types
 *     }
 *     // for any node type not specially processed, do:
 *     return expression_tree_walker(node, my_walker, (void *) context);
 * }

My understanding is that the expression-tree walking support is mostly useful for the majority of walker code, which only cares about a small subset of nodes, and hopes to avoid including boilerplate code just to walk those other nodes that it's actually disinterested in. This code, unlike most clients of expression_tree_walker(), is pretty much interested in everything, since its express purpose is to fingerprint all possible query trees.

Another benefit of expression_tree_walker is that if you miss a certain node being added (say a FuncExpr-like node), you get to automatically have that node walked over, to walk to the nodes that you do in fact care about (such as those within this new node's args List). That makes perfect sense in the majority of cases, but here you've already missed the fields within this new node that FuncExpr itself lacks, so you're already fingerprinting inaccurately. I suppose you could still at least get the nodetag and still have a warning about the fingerprinting being inadequate by going down the expression_tree_walker path, but I'm inclined to wonder if you aren't just better off directly walking the tree, if only to encourage the idea that this code needs to be maintained over time, and to cut down on the little extra bit of indirection that that imposes. It's not going to be any sort of burden to maintain it - it currently stands at a relatively meagre 800 lines of code for everything to do with tree walking - and the code that will have to be added with new nodes or refactored along with the existing tree structure is going to be totally trivial.

All of that said, I wouldn't mind making LeafNode into a walker, if that approach is judged to be better, and you don't mind documenting the order in which the tree is walked as deterministic, because the order now matters in a way apparently not really anticipated by expression_tree_walker, though that's probably not a problem.

My real concern now is that it's March 1st, and the last commitfest may end soon. Even though this patch has extensive regression tests, has been floating around for months, and, I believe, will end up being a timely and important feature, a committer has yet to step forward to work towards this patch getting committed. Can someone volunteer, please? My expectation is that this feature will make life a lot easier for a lot of Postgres users.

-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
Robert Haas writes: >> So a relation can't have some pages in Version 9.2, and other pages in >> version 9.3? How will this work for 2TB tables? > Not very well, but better than Tom's proposal to require upgrading the > entire cluster in a single off-line operation. WTF? That was most certainly not what *I* was proposing; it's obviously unworkable. We need a process that can incrementally up-version a live database and keep track of the minimum version present, at some granularity smaller than "whole database". All of this was discussed and hashed out about two years ago, IIRC. We just haven't made any progress towards actually implementing those concepts. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY with hints, rebirth
On 01.03.2012 18:40, Simon Riggs wrote:
> On Sun, Feb 26, 2012 at 7:16 PM, Heikki Linnakangas wrote:
>> On 24.02.2012 22:55, Simon Riggs wrote:
>>> What exactly does it do? Previously, we optimised COPY when it was
>>> loading data into a newly created table or a freshly truncated table.
>>> This patch extends that and actually sets the tuple header flag as
>>> HEAP_XMIN_COMMITTED during the load. Doing so is simple 2 lines of
>>> code. The patch also adds some tests for corner cases that would make
>>> that action break MVCC - though those cases are minor and typical data
>>> loads will benefit fully from this.
>>
>> This doesn't work with subtransactions: ...
>> The query should return the row copied in the same subtransaction.
>
> Thanks for pointing that out. New patch with corrected logic and test
> case attached.

It's still broken:

-- create test table and file
create table a as select 1 as id;
copy a to '/tmp/a';

-- start test
postgres=# begin;
BEGIN
postgres=# truncate a;
TRUNCATE TABLE
postgres=# savepoint sp1;
SAVEPOINT
postgres=# copy a from '/tmp/a';
COPY 1
postgres=# select * from a;
 id
----
  1
(1 row)

postgres=# rollback to savepoint sp1;
ROLLBACK
postgres=# select * from a;
 id
----
  1
(1 row)

That last select should not have seen the tuple.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade --logfile option documentation
On ons, 2012-02-29 at 22:47 -0500, Bruce Momjian wrote: > Hey, that's a good idea. I would always write the pg_dump output to a > log file. If the dump succeeds, I remove the file, if not, I tell > users to read the log file for details about the failure --- good > idea. But we also need the server log output somewhere. So I think this temp file would need to cover everything that -l covers. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] review: CHECK FUNCTION statement
Why does CollectCheckedFunctions skip trigger functions? My only guess is that at one point the checker was not supposed to know how to check them, and a later version learned about it and this bit wasn't updated; but maybe there's another reason? -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
>> So a relation can't have some pages in Version 9.2, and other pages in >> version 9.3? How will this work for 2TB tables? > > Not very well, but better than Tom's proposal to require upgrading the > entire cluster in a single off-line operation. Yes, but the result will be that anyone with a 2TB table will *never* convert it to the new format. Which means we can never deprecate that format, because lots of people will still be using it. I continue to assert that all of this sounds like 9.3 work to me. I'm really not keen on pushing through a hack which will make pushing in a long-term solution harder. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
On Thu, Mar 1, 2012 at 12:42 PM, Josh Berkus wrote: >>> And how would a DBA know that? >> >> We'd add a column to pg_class that tracks which page version is in use >> for each relation. > > So a relation can't have some pages in Version 9.2, and other pages in > version 9.3? How will this work for 2TB tables? Not very well, but better than Tom's proposal to require upgrading the entire cluster in a single off-line operation. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 16-bit page checksums for 9.2
>> And how would a DBA know that? > > We'd add a column to pg_class that tracks which page version is in use > for each relation. So a relation can't have some pages in Version 9.2, and other pages in version 9.3? How will this work for 2TB tables? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] incompatible pointer types with newer zlib
On fre, 2012-02-24 at 11:10 -0500, Tom Lane wrote: > Peter Eisentraut writes: > > On tor, 2012-02-23 at 10:17 -0500, Tom Lane wrote: > >> void * seems entirely reasonable given the two different usages, but > >> I would be happier if the patch added explicit casts whereever FH is > >> set to or used as one of these two types. > > > That would add about 70 casts all over the place. I don't think that > > will make things clearer or more robust. I think we either leave it as > > my first patch for now or find a more robust solution with a union or > > something. > > Hm. Could we just create two separate fields? It's not real clear to > me that forcing both these usages into a generic pointer buys much.

It's used in confusing ways. In pg_backup_files.c _PrintFileData(), it's a compile-time decision whether FH is a FILE or a gzFile:

#ifdef HAVE_LIBZ
    AH->FH = gzopen(filename, "rb");
#else
    AH->FH = fopen(filename, PG_BINARY_R);
#endif

But if we changed FH to be gzFile under HAVE_LIBZ, then tons of other places will complain that use fread(), fwrite(), fileno(), etc. directly on FH. Considering the volume of who complains in one way versus the other, I think _PrintFileData() is at fault. I think the best fix would be to rearrange _PrintFileData() so that it doesn't use FH at all. Instead, we could define a separate ArchiveHandle field IF that works more like OF, and then change ahwrite() to use that. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
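A sketch of what that could look like; the struct and field names here are illustrative, not the actual ArchiveHandle definition:

#include <stdio.h>
#ifdef HAVE_LIBZ
#include <zlib.h>
#endif

/*
 * Rough sketch of the proposal: FH stays a plain stdio stream so that
 * fread()/fwrite()/fileno() callers keep working, and a separate input
 * handle (called "IF" here, mirroring the existing OF output handle)
 * carries the possibly-compressed stream used by _PrintFileData()/ahwrite().
 */
typedef struct SketchArchiveHandle
{
    FILE       *FH;     /* main archive file, always uncompressed stdio */
#ifdef HAVE_LIBZ
    gzFile      IF;     /* data-member input stream when compression is used */
#else
    FILE       *IF;     /* same role, plain stdio when built without zlib */
#endif
} SketchArchiveHandle;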
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
On tor, 2012-03-01 at 20:56 +0900, Shigeru Hanada wrote: > How about moving postgresql_fdw_validator into dblink, That's probably a good move. If this were C++, we might try to subclass this whole thing a bit, to avoid code duplication, but I don't see an easy way to do that here. > with renaming to dblink_fdw_validator? Well, it's not the validator of the dblink_fdw, so maybe something like basic_postgresql_fdw_validator. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY with hints, rebirth
On Sun, Feb 26, 2012 at 7:16 PM, Heikki Linnakangas wrote: > On 24.02.2012 22:55, Simon Riggs wrote: >> >> A long time ago, in a galaxy far away, we discussed ways to speed up >> data loads/COPY. >> http://archives.postgresql.org/pgsql-hackers/2007-01/msg00470.php >> >> In particular, the idea that we could mark tuples as committed while >> we are still loading them, to avoid negative behaviour for the first >> reader. >> >> Simple patch to implement this is attached, together with test case. >> >> ... >> >> >> What exactly does it do? Previously, we optimised COPY when it was >> loading data into a newly created table or a freshly truncated table. >> This patch extends that and actually sets the tuple header flag as >> HEAP_XMIN_COMMITTED during the load. Doing so is simple 2 lines of >> code. The patch also adds some tests for corner cases that would make >> that action break MVCC - though those cases are minor and typical data >> loads will benefit fully from this. > > > This doesn't work with subtransactions: ... > The query should return the row copied in the same subtransaction.

Thanks for pointing that out. New patch with corrected logic and test case attached.

>> In the link above, Tom suggested reworking HeapTupleSatisfiesMVCC() >> and adding current xid to snapshots. That is an invasive change that I >> would wish to avoid at any time and explains the long delay in >> tackling this. The way I've implemented it, is just as a short test >> during XidInMVCCSnapshot() so that we trap the case when the xid == >> xmax and so would appear to be running. This is much less invasive and >> just as performant as Tom's original suggestion. > > > TransactionIdIsCurrentTransactionId() can be fairly expensive if you have a > lot of subtransactions open...

I've put in something to avoid that cost for the common case - just a boolean. This seems like the best plan rather than the explicit FREEZE option.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c910863..3899282 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2056,6 +2056,17 @@ heap_prepare_insert(Relation relation, HeapTuple tup, TransactionId xid,
 	tup->t_tableOid = RelationGetRelid(relation);
 
 	/*
+	 * If we are inserting into a new relation invisible as yet to other
+	 * backends and our session has no prior snapshots and no ready portals
+	 * then we can also set the hint bit for the rows we are inserting. The
+	 * last two restrictions ensure that HeapTupleSatisfiesMVCC() gives
+	 * the right answer if the current transaction inspects the data after
+	 * we load it.
+	 */
+	if (options & HEAP_INSERT_HINT_XMIN)
+		tup->t_data->t_infomask |= HEAP_XMIN_COMMITTED;
+
+	/*
 	 * If the new tuple is too big for storage or contains already toasted
 	 * out-of-line attributes from some other relation, invoke the toaster.
 	 */
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e22bdac..de7504c 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -146,6 +146,7 @@ typedef struct TransactionStateData
 	int			prevSecContext; /* previous SecurityRestrictionContext */
 	bool		prevXactReadOnly;	/* entry-time xact r/o state */
 	bool		startedInRecovery;	/* did we start in recovery? */
+	bool		maySeePreHintedTuples;	/* did subtrans write pre-hinted rows? */
 	struct TransactionStateData *parent;	/* back link to parent */
 } TransactionStateData;
 
@@ -175,6 +176,7 @@ static TransactionStateData TopTransactionStateData = {
 	0,							/* previous SecurityRestrictionContext */
 	false,						/* entry-time xact r/o state */
 	false,						/* startedInRecovery */
+	false,						/* maySeePreHintedTuples */
 	NULL						/* link to parent state block */
 };
 
@@ -704,6 +706,18 @@ TransactionStartedDuringRecovery(void)
 	return CurrentTransactionState->startedInRecovery;
 }
 
+bool
+TransactionMaySeePreHintedTuples(void)
+{
+	return CurrentTransactionState->maySeePreHintedTuples;
+}
+
+void
+TransactionMayWritePreHintedTuples(void)
+{
+	CurrentTransactionState->maySeePreHintedTuples = true;
+}
+
 /*
  * CommandCounterIncrement
  */
@@ -1689,6 +1703,7 @@ StartTransaction(void)
 		s->startedInRecovery = false;
 		XactReadOnly = DefaultXactReadOnly;
 	}
+	s->maySeePreHintedTuples = false;
 	XactDeferrable = DefaultXactDeferrable;
 	XactIsoLevel = DefaultXactIsoLevel;
 	forceSyncCommit = false;
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 110480f..1419e33 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -43,6 +43,7 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/portal.h"
 #include "utils/rel.h"
 #include "utils/snapmgr.h"
 
@@ -1922,6 +1923,13 @@ CopyFrom(CopyState cstate)
 hi_opt
Re: [HACKERS] performance results on IBM POWER7
On Thu, Mar 1, 2012 at 11:23 AM, Ants Aasma wrote: > On Thu, Mar 1, 2012 at 4:54 PM, Robert Haas wrote: >> ... After that I think maybe some testing of the >> remaining CommitFest patches might be in order (though personally I'd >> like to wrap this CommitFest up fairly soon) to see if any of those >> improve things. > > Besides performance testing, could you check how clocksources behave > on this kind of machine? > You can find pg_test_timing tool attached here: > http://archives.postgresql.org/pgsql-hackers/2012-01/msg00937.php > > To see which clocksources are available, you can do: > # cat /sys/devices/system/clocksource/clocksource0/available_clocksource > To switch the clocksource, just write the desired clocksource like this: > # echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource Sure, I'll check that as soon as it's back up. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Allowing multi "-t" and adding "-n" to vacuumdb ?
Hi, One of our customers sent us a patch he wrote for his needs (against "src/bin/scripts/vacuumdb.c"; no docs were included). He's using one schema per application and would like to be able to run vacuumdb on each of them independently, so he added the "--schema|-n" option and sent us the patch. Reviewing his patch, I thought it would be more useful to allow multiple "--table|-t" options on the command line first. It might be possible to pass an array of tables to the "vacuum_one_database" function instead of just one. Then, we could add the "--schema|-n" option, which would fetch and build the table list and call "vacuum_one_database". But before I start writing this patch, I would like some opinions, pros / cons. Do you think such a feature could be accepted in official PostgreSQL code, or should we keep this as an external script? Thank you, -- Jehan-Guillaume (ioguix) de Rorthais http://www.dalibo.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
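A sketch of the multi-table direction, assuming the existing executeMaintenanceCommand() helper in src/bin/scripts/common.c; the function name and signature here are illustrative, not the current vacuumdb.c interface:

#include "postgres_fe.h"
#include "common.h"
#include "pqexpbuffer.h"
#include "libpq-fe.h"

/* Illustrative only: vacuum a list of tables over one connection. */
static void
vacuum_tables(PGconn *conn, const char *const *tables, int ntables,
              bool full, bool analyze, bool echo)
{
    int         i;

    for (i = 0; i < ntables; i++)
    {
        PQExpBufferData sql;

        initPQExpBuffer(&sql);
        appendPQExpBuffer(&sql, "VACUUM%s%s %s;",
                          full ? " FULL" : "",
                          analyze ? " ANALYZE" : "",
                          tables[i]);
        if (!executeMaintenanceCommand(conn, sql.data, echo))
        {
            fprintf(stderr, "vacuum of table \"%s\" failed\n", tables[i]);
            exit(1);
        }
        termPQExpBuffer(&sql);
    }
}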
Re: [HACKERS] performance results on IBM POWER7
On Thu, Mar 1, 2012 at 4:54 PM, Robert Haas wrote: > ... After that I think maybe some testing of the > remaining CommitFest patches might be in order (though personally I'd > like to wrap this CommitFest up fairly soon) to see if any of those > improve things. Besides performance testing, could you check how clocksources behave on this kind of machine? You can find pg_test_timing tool attached here: http://archives.postgresql.org/pgsql-hackers/2012-01/msg00937.php To see which clocksources are available, you can do: # cat /sys/devices/system/clocksource/clocksource0/available_clocksource To switch the clocksource, just write the desired clocksource like this: # echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource Thanks, Ants Aasma -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Speed dblink using alternate libpq tuple storage
On Tue, Feb 28, 2012 at 05:04:44PM +0900, Kyotaro HORIGUCHI wrote: > > There is still one EOF in v3 getAnotherTuple() - > > pqGetInt(tupnfields), please turn that one also to > > protocolerror. > > pqGetInt() returns EOF only when it wants additional reading from > network if the parameter `bytes' is appropreate. Non-zero return > from it seems should be handled as EOF, not a protocol error. The > one point I had modified bugilly is also restored. The so-called > 'protocol error' has been vanished eventually.

But it's broken in the V3 protocol - getAnotherTuple() will be called only if the packet is fully read. If the packet contents do not agree with the packet header, it's a protocol error. The only valid EOF return in V3 getAnotherTuple() is when the row processor asks for early exit.

> Is there someting left undone?

* Convert old EOFs to protocol errors in V3 getAnotherTuple().

* V2 getAnotherTuple() can leak a PGresult when handling a custom error from the row processor.

* Remove the pqIsnonblocking(conn) check when the row processor returns 2. I missed that it's valid to call PQisBusy/PQconsumeInput/PQgetResult on a sync connection.

* It seems the return codes from the callback should be remapped; (0, 1, 2) is an unusual pattern. Better would be:

    -1 - error
     0 - stop parsing / early exit ("I'm not done yet")
     1 - OK ("I'm done with the row")

* Please drop PQsetRowProcessorErrMsg() / PQresultSetErrMsg(). The main problem is that it needs to be synced with error handling in the rest of libpq, which is unlike the rest of the row processor patch, which consists only of local changes. All solutions here are either ugly hacks or too complex to be part of this patch. Also, considering that we have working exceptions and PQgetRow, I don't see much need for custom error messages. If really needed, it should be introduced as a separate patch, as the area of code it affects is completely different.

Currently the custom error messaging seems to be the blocker for this patch, because of the raised complexity when implementing it and when reviewing it. Considering how unimportant the provided functionality is compared to the rest of the patch, I think we should simply drop it. My suggestion - check in getAnotherTuple whether resultStatus is already an error and do nothing then. This allows the internal pqAddRow to set a regular "out of memory" error. Otherwise give a generic "row processor error".

> By the way, I noticed that dblink always says that the current > connection is 'unnamed' in messages the errors in > dblink_record_internal@dblink. I could see that > dblink_record_internal defines the local variable conname = NULL > and pass it to dblink_res_error to display the error message. But > no assignment on it in the function. > > It seemed properly shown when I added the code to set conname > from PG_GETARG_TEXT_PP(0) if available, in other words do that > just after DBLINK_GET_CONN/DBLINK_GET_NAMED_CONN's. It seems the > dblink's manner... This is not included in this patch. > > Furthurmore dblink_res_error looks only into returned PGresult to > display the error and always says only `Error occurred on dblink > connection..: could not execute query'.. > > Is it right to consider this as follows? > > - dblink is wrong in error handling. A client of libpq should >see PGconn by PQerrorMessage() if (or regardless of whether?) >PGresult says nothing about error.

Yes, it seems like a bug. -- marko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
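For concreteness, a sketch of a callback using that remapped convention; the row-processor signature and PGdataValue type are the ones proposed in this patch series (so still subject to change), and my_state/store_row are hypothetical application code:

#include "libpq-fe.h"

/* hypothetical application state and helper -- not part of libpq */
struct my_state
{
    int     rows_seen;
    bool    want_pause;
};

static bool store_row(struct my_state *state, PGresult *res,
                      const PGdataValue *columns);

static int
my_row_processor(PGresult *res, const PGdataValue *columns,
                 const char **errmsgp, void *param)
{
    struct my_state *state = (struct my_state *) param;

    if (!store_row(state, res, columns))
        return -1;              /* error: give up on this result set */

    if (state->want_pause)
        return 0;               /* early exit: caller resumes via PQisBusy()
                                 * / PQconsumeInput() later */

    return 1;                   /* this row has been consumed; keep going */
}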
Re: [HACKERS] Collect frequency statistics for arrays
Excerpts from Robert Haas's message of jue mar 01 12:00:08 -0300 2012: > On Wed, Feb 29, 2012 at 5:43 PM, Tom Lane wrote: > > No, just that we'd no longer have statistics relevant to that, and would > > have to fall back on default selectivity assumptions. Do you think that > > such applications are so common as to justify bloating pg_statistic for > > everybody that uses arrays? > > I confess I am nervous about ripping this out. I am pretty sure we > will get complaints about it. Performance optimizations that benefit > group A at the expense of group B are always iffy, and I'm not sure > the case of using an array as a path indicator is as uncommon as you > seem to think. Maybe we should keep it as an option. I do think it's quite uncommon, but for those rare users, it'd be good to provide the capability while not bloating everyone else's stat catalog. The thing is, people using arrays as path indicators and such are likely using relatively small arrays; people storing real data are likely to store much bigger arrays. Just a hunch though. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
On Wed, Feb 29, 2012 at 5:43 PM, Tom Lane wrote: > Nathan Boley writes: >> On Wed, Feb 29, 2012 at 12:39 PM, Tom Lane wrote: >>> I am starting to look at this patch now. I'm wondering exactly why the >>> decision was made to continue storing btree-style statistics for arrays, >>> in addition to the new stuff. > >> If I understand you're suggestion, queries of the form > >> SELECT * FROM rel >> WHERE ARRAY[ 1,2,3,4 ] <= x >> AND x <=ARRAY[ 1, 2, 3, 1000]; > >> would no longer use an index. Is that correct? > > No, just that we'd no longer have statistics relevant to that, and would > have to fall back on default selectivity assumptions. Do you think that > such applications are so common as to justify bloating pg_statistic for > everybody that uses arrays? I confess I am nervous about ripping this out. I am pretty sure we will get complaints about it. Performance optimizations that benefit group A at the expense of group B are always iffy, and I'm not sure the case of using an array as a path indicator is as uncommon as you seem to think. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Collect frequency statistics for arrays
On Thu, Mar 1, 2012 at 1:19 AM, Alexander Korotkov wrote: > On Thu, Mar 1, 2012 at 1:09 AM, Tom Lane wrote: > >> That seems like a pretty narrow, uncommon use-case. Also, to get >> accurate stats for such queries that way, you'd need really enormous >> histograms. I doubt that the existing parameters for histogram size >> will permit meaningful estimation of more than the first array entry >> (since we don't make the histogram any larger than we do for a scalar >> column). >> >> The real point here is that the fact that we're storing btree-style >> stats for arrays is an accident, backed into by having added btree >> comparators for arrays plus analyze.c's habit of applying default >> scalar-oriented analysis functions to any type without an explicit >> typanalyze entry. I don't recall that we ever thought hard about >> it or showed that those stats were worth anything. >> > > OK. I don't object to removing btree stats from arrays. > What do you thinks about pg_stats view in this case? Should it combine > values histogram and array length histogram in single column like do for > MCV and MCELEM? > Btree statistics for arrays and the additional statistics slot are removed in the attached version of the patch. The pg_stats view is untouched for now. -- With best regards, Alexander Korotkov. arrayanalyze-0.13.patch.gz (GNU Zip compressed data) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] performance results on IBM POWER7
IBM has provided the PostgreSQL community with access to a couple of IBM
POWER7 machines through OSUOSL.  Simon has access to one, carved up into a
couple of LPARs, for replication work, and there's a buildfarm animal on
there as well, I think; I have access to the other, for performance testing.
I imagine we can get access for a few other people as well, though at the
moment the performance-testing machine is inaccessible and I'm not having
very much luck getting help from the very busy OSUOSL folks.

Anyway, before it bit the dust, I was able to do some basic pgbench tests
at various scale factors and client counts.  I used my usual settings:

shared_buffers = 8GB
maintenance_work_mem = 1GB
synchronous_commit = off
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
wal_writer_delay = 20ms

I did three five-minute runs at each of the scale factors 100, 300, 1000,
3000, and 1, with varying client counts: 1, 2, and all multiples of 4 up to
80.  I stopped and restarted the database after each run (but did not flush
the OS cache, so this is a warm-start test) and took the median of the three
results for each run.  Full results are attached herewith; pretty graphs are
on my blog at
http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html

When I get the machine back, my plan is to next run some read-write pgbench
tests.  Those will need to be longer, though.  Read performance doesn't seem
to be very sensitive to the length of the tests, but write performance is,
so I'll probably need at least 30-minute runs if not more to get an accurate
sense of what the performance is like.  After that I think maybe some
testing of the remaining CommitFest patches might be in order (though
personally I'd like to wrap this CommitFest up fairly soon) to see if any of
those improve things.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments: f.100, f.300, f.1000, f.3000, f.1 (binary data)
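One way the "median of the three results" reduction could be done in SQL,
assuming the raw numbers were loaded into a hypothetical table
pgbench_runs(scale int, clients int, tps numeric) holding exactly three rows
per combination:

    SELECT scale, clients,
           (SELECT r2.tps
              FROM pgbench_runs r2
             WHERE r2.scale = r.scale AND r2.clients = r.clients
             ORDER BY r2.tps
            OFFSET 1 LIMIT 1) AS median_tps   -- middle value of three runs
      FROM pgbench_runs r
     GROUP BY scale, clients
     ORDER BY scale, clients;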
Re: [HACKERS] Parameterized-path cost comparisons need some work
On Wed, Feb 29, 2012 at 6:01 PM, Tom Lane wrote:
>> Well, my "evidence" is that a parameterized path should pretty much
>> always include a parameterized path somewhere in there - otherwise,
>> what is parameterization doing for us?
>
> Well, yes, we know that much.

I didn't write what I meant to write there.  I meant to say: a parameterized
path is presumably going to contain a parameterized *index scan* somewhere
within.  So somewhere we're going to have something of the form

   ->  Index Scan blah on blah
         Index Cond: someattr = $1

And if that path weren't parameterized, we'd have to read the whole
relation, either with a full index scan, or a sequential scan.  Or, I mean,
maybe there's a filter condition, so that no path needs to retrieve the
*whole* relation, but even there the index cond is on top of that, and it's
probably doing something, though I suppose you're right that there might be
cases where it doesn't.

>> And that's going to reduce the
>> row count.  I may be missing something, but I'm confused as to why
>> this isn't nearly tautological.
>
> We don't know that --- I will agree it's likely, but that doesn't make
> it so certain that we can assume it without checking.  A join condition
> won't necessarily eliminate any rows.
>
> (... thinks about that for awhile ...)  One thing we could possibly do
> is have indxpath.c arbitrarily reject parameterizations that don't
> produce a smaller estimated number of rows than an unparameterized scan.
> Admittedly, this still doesn't *prove* the assumption for join
> relations, but maybe it brings the odds to where it's okay for add_path
> to make such an assumption.

That seems to make sense.

> (... thinks some more ...)  No, that doesn't get us there, because that
> doesn't establish that a more-parameterized path produces fewer rows
> than some path that requires less parameterization, yet not none at
> all.  You really want add_path carrying out those comparisons.  In your
> previous example, it's entirely possible that path D is dominated by B
> or C because of poor choices of join quals.

I'm not following this part.  Can you explain further?

It seems to me at any rate that we could get pretty far if we could just
separate parameterized paths and unparameterized paths into separate
buckets.  Even if we have to do some extra work when comparing parameterized
paths *to each other*, we'd gain a fair amount by avoiding comparing any of
them with the unparameterized paths.  Or at least, I hope so.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
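To make the shape concrete, here is the kind of plan being described, against
hypothetical tables; the EXPLAIN output shown is schematic, and whether the
planner actually picks a parameterized nested loop depends on statistics and
costs:

    EXPLAIN SELECT *
      FROM orders o
      JOIN order_lines l ON l.order_id = o.id
     WHERE o.placed_on = current_date;

    --  Nested Loop
    --    ->  Seq Scan on orders o
    --          Filter: (placed_on = current_date)
    --    ->  Index Scan using order_lines_order_id_idx on order_lines l
    --          Index Cond: (order_id = o.id)

Without the parameterization on the inner side, order_lines would have to be
read in full and joined by hash or merge instead.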
Re: [HACKERS] pg_upgrade --logfile option documentation
On Wed, Feb 29, 2012 at 6:02 PM, Bruce Momjian wrote:
>> Any ideas about improving the error reporting more generally, so that
>> when reloading the dump fails, the user can easily see what went
>> belly-up, even if they didn't use -l?
>
> The only idea I have is to write the psql log to a temporary file and
> report the last X lines from the file in case of failure.  Does that
> help?

Why not just redirect stdout but not stderr?  If there are error messages,
surely we want the user to just see those.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] 16-bit page checksums for 9.2
On Wed, Feb 29, 2012 at 5:52 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Wed, Feb 29, 2012 at 4:34 PM, Tom Lane wrote:
>>> Easier for who?  I don't care for the idea of code that has to cope with
>>> two page formats, or before long N page formats, because if we don't
>>> have some mechanism like this then we will never be able to decide that
>>> an old data format is safely dead.
>
>> Huh?  You can drop support for a new page format any time you like.
>> You just decree that version X+1 no longer supports it, and you can't
>> pg_upgrade to it until all traces of the old page format are gone.
>
> And how would a DBA know that?

We'd add a column to pg_class that tracks which page version is in use for
each relation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
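A loudly hypothetical sketch of what such a column would let a DBA do; no
relpageversion column exists, and the names below are made up purely to show
the shape of the idea:

    -- Hypothetical: find relations still carrying pages in a retired format.
    SELECT relname
      FROM pg_class
     WHERE relpageversion < 5;   -- 5 = made-up current page layout version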
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
(2012/03/01 0:33), Tom Lane wrote:
> I don't think that creating such a dependency is acceptable.
> Even if we didn't mind the dependency, you said yourself that
> contrib/postgresql_fdw's validator will accept stuff that's not
> appropriate for dblink.

Agreed.  I think that these two contrib modules (and indeed all FDW modules)
should each have their own validator, to avoid undesirable dependencies and
naming conflicts, and such validator functions should live inside each
module rather than in core.

How about moving postgresql_fdw_validator into dblink and renaming it to
dblink_fdw_validator?  The attached patch does that.  I've left
postgresql_fdw_validator in the foreign_data regression test section, so
that the foreign_data tests can still check whether the FDW DDL commands
invoke the validator function; I used the name "postgresql_fdw_validator"
for the test validator to keep the change as small as possible.

This change requires dblink to gain a new function, so its version should be
bumped to 1.1.

These changes have no direct relation to the PostgreSQL FDW, so this patch
can be applied by itself.  Once it has been applied, I'll rename pgsql_fdw
to postgresql_fdw, which spells the product name out in full.

--
Shigeru Hanada

diff --git a/contrib/dblink/Makefile b/contrib/dblink/Makefile
index ac63748..a27db88 100644
*** a/contrib/dblink/Makefile
--- b/contrib/dblink/Makefile
*** SHLIB_LINK = $(libpq)
*** 7,13
  SHLIB_PREREQS = submake-libpq
  EXTENSION = dblink
! DATA = dblink--1.0.sql dblink--unpackaged--1.0.sql
  REGRESS = dblink
--- 7,13
  SHLIB_PREREQS = submake-libpq
  EXTENSION = dblink
! DATA = dblink--1.1.sql dblink--1.0--1.1.sql dblink--unpackaged--1.0.sql
  REGRESS = dblink
diff --git a/contrib/dblink/dblink--1.0--1.1.sql b/contrib/dblink/dblink--1.0--1.1.sql
index ...3dd02a0 .
*** a/contrib/dblink/dblink--1.0--1.1.sql
--- b/contrib/dblink/dblink--1.0--1.1.sql
***
*** 0
--- 1,17
+ /* contrib/dblink/dblink--1.0--1.1.sql */
+
+ -- complain if script is sourced in psql, rather than via ALTER EXTENSION
+ \echo Use "ALTER EXTENSION dblink UPDATE" to load this file. \quit
+
+ /* First we have to create validator function */
+ CREATE FUNCTION dblink_fdw_validator(
+     options text[],
+     catalog oid
+ )
+ RETURNS boolean
+ AS 'MODULE_PATHNAME', 'dblink_fdw_validator'
+ LANGUAGE C STRICT;
+
+ /* Then we add the validator */
+ ALTER EXTENSION dblink ADD FUNCTION dblink_fdw_validator(text[], oid);
+
diff --git a/contrib/dblink/dblink--1.1.sql b/contrib/dblink/dblink--1.1.sql
index ...ec02498 .
*** a/contrib/dblink/dblink--1.1.sql
--- b/contrib/dblink/dblink--1.1.sql
***
*** 0
--- 1,231
+ /* contrib/dblink/dblink--1.1.sql */
+
+ -- complain if script is sourced in psql, rather than via CREATE EXTENSION
+ \echo Use "CREATE EXTENSION dblink" to load this file. \quit
+
+ -- dblink_connect now restricts non-superusers to password
+ -- authenticated connections
+ CREATE FUNCTION dblink_connect (text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_connect'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_connect (text, text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_connect'
+ LANGUAGE C STRICT;
+
+ -- dblink_connect_u allows non-superusers to use
+ -- non-password authenticated connections, but initially
+ -- privileges are revoked from public
+ CREATE FUNCTION dblink_connect_u (text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_connect'
+ LANGUAGE C STRICT SECURITY DEFINER;
+
+ CREATE FUNCTION dblink_connect_u (text, text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_connect'
+ LANGUAGE C STRICT SECURITY DEFINER;
+
+ REVOKE ALL ON FUNCTION dblink_connect_u (text) FROM public;
+ REVOKE ALL ON FUNCTION dblink_connect_u (text, text) FROM public;
+
+ CREATE FUNCTION dblink_disconnect ()
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_disconnect'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_disconnect (text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_disconnect'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_open (text, text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_open'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_open (text, text, boolean)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_open'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_open (text, text, text)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_open'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_open (text, text, text, boolean)
+ RETURNS text
+ AS 'MODULE_PATHNAME','dblink_open'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_fetch (text, int)
+ RETURNS setof record
+ AS 'MODULE_PATHNAME','dblink_fetch'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_fetch (text, int, boolean)
+ RETURNS setof record
+ AS 'MODULE_PATHNAME','dblink_fetch'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_fetch (text, text, int)
+ RETURNS setof record
+ AS 'MODULE_PATHNAME','dblink_fetch'
+ LANGUAGE C STRICT;
+
+ CREATE FUNCTION dblink_fetch (text, text, int, boolea