Re: [HACKERS] Removal of useless include references
* Bruce Momjian wrote: The attached patch removes unneeded include references, and marks some includes as needing to be skipped by pgrminclude. There are several unrelated changes to pg_upgrade in that patch, too. -- Christian -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
Sure, I like that more. On Thu, Aug 25, 2011 at 9:29 AM, Robert Haas wrote: > On Wed, Aug 24, 2011 at 7:33 AM, Dougal Sutherland > wrote: > > The attached change to postgresql.conf.sample makes it more clear at a > > glance that the default value of listen_addresses is 'localhost', not > > 'localhost, *'. This would have saved a friend an hour or two of fiddling > > tonight. > > How about: > ># defaults to 'localhost'; use '*' for all > > -- > Robert Haas > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company >
Re: [HACKERS] tsvector concatenation - backend crash
On 2011-08-26 05:28, Tom Lane wrote: Jesper Krogh writes: The attached SQL files give (at least in my hands) a reliable backend crash with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry I cannot provide a more trimmed-down set of vectors that reproduces the bug, thus the "obfuscated" dataset. But even deleting single terms in the vectors makes the bug go away. Hm ... I can reproduce this on one of my usual machines, but not another. What platform are you on exactly? 64-bit Ubuntu Lucid (amd64). -- Jesper -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffering GiST leaf pages too
On 26.08.2011 00:45, Jim Nasby wrote: I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free. What we currently do is even better than that. We take the lock once, and hold it while we do all the visibility checks. Then the lock is released, but the page is kept pinned so that it doesn't get evicted from the buffer cache. No memcpy() required. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice. We more or less do that too already. When an index scan steps on a leaf page, it scans the page for all matches, and copies them to backend-local memory. The page lock is then released. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
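In bufmgr terms, the pattern described above looks roughly like this - a simplified sketch of the heap scan's page-at-a-time mode (written from memory, not the actual heapam.c code; rel, blkno, snapshot and vistuples stand in for the scan state, and error handling is omitted):

/*
 * Lock the page once, remember which tuples are visible, then drop the
 * lock but keep the pin so the page stays in shared_buffers for the
 * rest of the scan.
 */
Buffer		buffer = ReadBuffer(rel, blkno);
Page		page;
OffsetNumber lineoff,
			maxoff;
int			ntup = 0;

LockBuffer(buffer, BUFFER_LOCK_SHARE);		/* one lock for the whole page */
page = BufferGetPage(buffer);
maxoff = PageGetMaxOffsetNumber(page);

for (lineoff = FirstOffsetNumber; lineoff <= maxoff; lineoff = OffsetNumberNext(lineoff))
{
	ItemId		lpp = PageGetItemId(page, lineoff);
	HeapTupleData loctup;

	if (!ItemIdIsNormal(lpp))
		continue;
	loctup.t_data = (HeapTupleHeader) PageGetItem(page, lpp);
	loctup.t_len = ItemIdGetLength(lpp);
	if (HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer))
		vistuples[ntup++] = lineoff;	/* remember the visible line pointers */
}

LockBuffer(buffer, BUFFER_LOCK_UNLOCK);		/* release the lock ... */
/* ... but not the pin; ReleaseBuffer(buffer) happens only when the scan
 * moves on to the next page */

All subsequent fetches from the page just read the remembered line pointers with no lock at all, which is why an extra memcpy() of the whole page would only add work.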
Re: [HACKERS] Removal of useless include references
Bruce Momjian writes: > It has been years since I ran src/tools/pginclude/pgrminclude to remove > unnecessary include files. (I have already fixed things so include > files can be compiled on their own.) > The attached patch removes unneeded include references, and marks some > includes as needing to be skipped by pgrminclude. > I am sure applying this patch will break builds on some platforms and > some option combinations so I will monitor the buildfarm when I apply it > and make adjustments. The last time you did this was in July 2006. It took us two weeks to mostly recover, but we were still dealing with some fallout in December, cf http://archives.postgresql.org/pgsql-hackers/2006-12/msg00491.php We had the buildfarm then, had had it for a couple years. The notion that watching the buildfarm is enough is fully disproven by history. Unless you have a better test plan than last time (which this isn't), I don't think this should be done at all. The benefits are microscopic and the pain real. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
OK, this was very helpful. I found out that there is a bug in current 9.0.X, 9.1.X, and HEAD that I introduced recently when I excluded temp tables. (The bug is not in any released version of pg_upgrade.) The attached, applied patches should fix it for you. I assume you are running 9.0.X, and not 9.0.4. --- hubert depesz lubaczewski wrote: > On Thu, Aug 25, 2011 at 04:43:02PM -0400, Bruce Momjian wrote: > > Please check the old cluster. > > Sure: > > =# SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > > > > reltoastrelid > --- > 82510395 > 71637071 > (2 rows) > > =# SELECT oid::regclass, reltoastrelid FROM pg_class WHERE relname = > 'actions'; > > > oid | reltoastrelid > ---+--- > x.actions | 82510395 > y.actions | 71637071 > (2 rows) > > =# select oid, relfilenode from pg_class where oid in (SELECT reltoastrelid > FROM pg_class WHERE relname = 'actions'); >oid| relfilenode > --+- > 82510395 |82510395 > 71637071 |71637071 > (2 rows) > > =# select oid from pg_database where datname = current_database(); >oid > -- > 71635381 > (1 row) > > $ ls -l /base/71635381/{71637071,82510395} > -rw--- 1 postgres postgres 0 2009-10-12 06:49 /base/71635381/71637071 > -rw--- 1 postgres postgres 0 2010-08-19 14:02 /base/71635381/82510395 > > > > > > One more thing - one of earlier tests actually worked through > > > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I > > > > > got > > > > > error about missing transaction/clog - don't remember exactly what it > > > > > was, though. > > > > THere was a bug in how how pg_upgrade worked in pre-9.0.4 --- could it > > > > have been that? > > > It was done definitely using 9.0.4. > > Good. > > Not sure if it's good, since it was after the clog error was fixed, and > I still got it :/ > > but anyway - the problem with 71637071 is more important now. > > Best regards, > > depesz -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + diff --git a/contrib/pg_upgrade/info.c b/contrib/pg_upgrade/info.c new file mode 100644 index ca357e7..1f5b7ae *** a/contrib/pg_upgrade/info.c --- b/contrib/pg_upgrade/info.c *** get_rel_infos(migratorContext *ctx, cons *** 328,336 " ON c.reltablespace = t.oid " "WHERE (( " /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! " n.nspname !~ '^pg_' " ! " AND n.nspname != 'information_schema' " ! " AND c.oid >= %u " " ) OR ( " " n.nspname = 'pg_catalog' " " AND relname IN " --- 328,338 " ON c.reltablespace = t.oid " "WHERE (( " /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! "n.nspname != 'pg_catalog' " ! "AND n.nspname !~ '^pg_temp_' " ! "AND n.nspname !~ '^pg_toast_temp_' " ! " AND n.nspname != 'information_schema' " ! " AND c.oid >= %u " " ) OR ( " " n.nspname = 'pg_catalog' " " AND relname IN " diff --git a/contrib/pg_upgrade/version_old_8_3.c b/contrib/pg_upgrade/version_old_8_3.c new file mode 100644 index 930f76d..6fcd61b *** a/contrib/pg_upgrade/version_old_8_3.c --- b/contrib/pg_upgrade/version_old_8_3.c *** old_8_3_check_for_name_data_type_usage(m *** 61,69 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.name'::pg_catalog.regtype AND " " c.relnamespace = n.oid AND " ! /* exclude pg_catalog and pg_temp_ (could be orphaned tables) */ ! " n.nspname !~ '^pg_' AND " ! " n.nspname != 'information_schema'"); ntups = PQntuples(res); i_nspname = PQfnumber(res, "nspname"); --- 61,71 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.name'::pg_catalog.regtype AND " " c.relnamespace = n.oid AND " ! 
/* exclude possibly orphaned temp tables */ ! " n.nspname != 'pg_catalog' AND " ! " n.nspname !~ '^pg_temp_' AND " ! " n.nspname !~ '^pg_toast_temp_' AND " ! " n.nspname != 'information_schema' "); ntups = PQntuples(res); i_nspname = PQfnumber(res, "nspname"); *** old_8_3_check_for_tsquery_usage(migrator *** 152,160 " NOT a.attisdropped AND " " a.atttypid = 'pg_catalog.tsquery'::pg_catalog.regtype AND " " c.relnamespace
[HACKERS] dropdb and dropuser: IF EXISTS
I noticed a few places where it would be handy if dropdb took a flag like "--if-exists" which would basically just add in the 'IF EXISTS' clause to the DROP DATABASE statement. For example, scripts like find_static or mbregress.sh use dropdb && createdb, but they generate noisy errors from dropdb when run for the first time since there's no --if-exists flag. (They could just pipe 'DROP DATABASE IF EXISTS ...' to psql, but what's the point of having dropdb if it's not used?) Attached is a very quick patch implementing the "--if-exists" or "-X" option for dropdb and dropuser. I didn't bother adding in a check to make sure the server version was 8.2+ since we're not even supporting 8.1 nowadays, though that'd be easy enough to add in. Josh diff --git a/doc/src/sgml/ref/dropdb.sgml b/doc/src/sgml/ref/dropdb.sgml index e20bcdb..2092bb6 100644 *** a/doc/src/sgml/ref/dropdb.sgml --- b/doc/src/sgml/ref/dropdb.sgml *** PostgreSQL documentation *** 87,92 --- 87,102 + -X + --if-exists + + +Don't report an error if the specified database does not exist. + + + + + -V --version diff --git a/doc/src/sgml/ref/dropuser.sgml b/doc/src/sgml/ref/dropuser.sgml index c158103..22580a4 100644 *** a/doc/src/sgml/ref/dropuser.sgml --- b/doc/src/sgml/ref/dropuser.sgml *** PostgreSQL documentation *** 89,94 --- 89,104 + -X + --if-exists + + + Don't report an error if the specified user does not exist. + + + + + -V --version diff --git a/src/bin/scripts/dropdb.c b/src/bin/scripts/dropdb.c index 4cec63e..187bf6c 100644 *** a/src/bin/scripts/dropdb.c --- b/src/bin/scripts/dropdb.c *** main(int argc, char *argv[]) *** 29,34 --- 29,35 {"password", no_argument, NULL, 'W'}, {"echo", no_argument, NULL, 'e'}, {"interactive", no_argument, NULL, 'i'}, + {"if-exists", no_argument, NULL, 'X'}, {NULL, 0, NULL, 0} }; *** main(int argc, char *argv[]) *** 43,48 --- 44,50 enum trivalue prompt_password = TRI_DEFAULT; bool echo = false; bool interactive = false; + bool if_exists = false; PQExpBufferData sql; *** main(int argc, char *argv[]) *** 54,60 handle_help_version_opts(argc, argv, "dropdb", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) != -1) { switch (c) { --- 56,62 handle_help_version_opts(argc, argv, "dropdb", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWeiX", long_options, &optindex)) != -1) { switch (c) { *** main(int argc, char *argv[]) *** 79,84 --- 81,89 case 'i': interactive = true; break; + case 'X': + if_exists = true; + break; default: fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname); exit(1); *** main(int argc, char *argv[]) *** 110,117 initPQExpBuffer(&sql); ! appendPQExpBuffer(&sql, "DROP DATABASE %s;\n", ! fmtId(dbname)); /* * Connect to the 'postgres' database by default, except have the --- 115,122 initPQExpBuffer(&sql); ! appendPQExpBuffer(&sql, "DROP DATABASE %s%s;\n", ! (if_exists ? 
"IF EXISTS " : ""), fmtId(dbname)); /* * Connect to the 'postgres' database by default, except have the *** help(const char *progname) *** 146,151 --- 151,157 printf(_("\nOptions:\n")); printf(_(" -e, --echoshow the commands being sent to the server\n")); printf(_(" -i, --interactive prompt before deleting anything\n")); + printf(_(" -X, --if-exists don't report error if database doesn't exist\n")); printf(_(" --helpshow this help, then exit\n")); printf(_(" --version output version information, then exit\n")); printf(_("\nConnection options:\n")); diff --git a/src/bin/scripts/dropuser.c b/src/bin/scripts/dropuser.c index 0949a5e..bf5196f 100644 *** a/src/bin/scripts/dropuser.c --- b/src/bin/scripts/dropuser.c *** main(int argc, char *argv[]) *** 29,34 --- 29,35 {"password", no_argument, NULL, 'W'}, {"echo", no_argument, NULL, 'e'}, {"interactive", no_argument, NULL, 'i'}, + {"if-exists", no_argument, NULL, 'X'}, {NULL, 0, NULL, 0} }; *** main(int argc, char *argv[]) *** 43,48 --- 44,50 enum trivalue prompt_password = TRI_DEFAULT; bool echo = false; bool interactive = false; + bool if_exists = false; PQExpBufferData sql; *** main(int argc, char *argv[]) *** 54,60 handle_help_version_opts(argc, argv, "dropuser", help); ! while ((c = getopt_long(argc, argv, "h:p:U:wWei", long_options, &optindex)) !
Re: [HACKERS] tsvector concatenation - backend crash
Jesper Krogh writes: > The attached SQL files give (at least in my hands) a reliable backend crash > with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry > I cannot provide a more trimmed-down set of vectors that reproduces the > bug, thus > the "obfuscated" dataset. But even deleting single terms in the vectors makes > the bug go away. Hm ... I can reproduce this on one of my usual machines, but not another. What platform are you on exactly? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Robert Haas writes: > On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane wrote: >> ISTM the general rule ought to be that we attempt to substitute for a >> colon-construct regardless of where it appears within an argument, as >> long as it's not within quotes. > My main thought is that I remember this code being pretty awful - > especially with respect to error handling - when I looked at it. A > lot of dubious behaviors were more or less compelled by the > impossibility of bailing out at an arbitrary point a la ereport(). At > least, barring a drastic refactoring of the code, which might not be a > bad idea either. What I had in mind to do was just to rearrange the flex rules --- the issues that I called out have to do with dubious choices about when to transition between different lexer states. I agree that the error handling isn't terribly friendly in unexpected cases like there not being a connection available to determine the literal-quoting rules, but that's not what I'm on about here. I'm just after consistent variable-expansion behavior in normal operation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why doesn't psql use the information schema to get ACL description ?
On Thu, Aug 25, 2011 at 5:51 PM, Jim Nasby wrote: > On Aug 23, 2011, at 1:57 PM, Peter Eisentraut wrote: >>> Ultimately my ulterior motive is to make sure the information schema >>> is actually useful and ensure that it maintains that status. >> >> The information schema only exposes privileges defined in the SQL >> standard on objects defined in the SQL standard. So psql or any tool >> that wants to provide information on PostgreSQL-specific features can't >> use that. But perhaps the JDBC interfaces only expose certain standard >> information anyway, so it could be useful. Depends on the details, >> though. > > Related to this is the newsysviews project, which was intended to present > more human-friendly info ala info_schema, but to also provide > PostgreSQL-specific information (which info_schema can't). I'll have to have a look at this when I get a moment, sounds promising. Will be even better if we get all clients using it. Dave Cramer dave.cramer(at)credativ(dot)ca http://www.credativ.ca -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_upgrade problem (fwd)
I am sending this report to hackers so all subscribers can read the original bug report. --- - Forwarded message from hubert depesz lubaczewski - hi I have an 8.3.11 database, ~600GB in size. I want to upgrade it to 9.0. First, I tried with 9.0.4, and when I hit the problem (the same one) I tried git, head of the 9.0 branch. So. I did pg_upgrade with -c, and it looked like this: $ time pg_upgrade -c -v -b /opt/pgsql-8.3.11-int/bin/ -B /opt/pgsql-9.0.5a-int/bin/ -d /var/postgresql// -D /var/postgresql/-9.0 -k -l pg_upgrade.log -p -P 4329 Running in verbose mode Performing Consistency Checks - Checking old data directory (/var/postgresql/) ok Checking old bin directory (/opt/pgsql-8.3.11-int/bin) ok Checking new data directory (/var/postgresql/-9.0) ok Checking new bin directory (/opt/pgsql-9.0.5a-int/bin) ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" -o "-p -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for reg* system oid user data typesok Checking for /contrib/isn with bigint-passing mismatch ok Checking for invalid 'name' user columnsok Checking for tsquery user columns ok Checking for tsvector user columns ok Checking for hash and gin indexes warning | Your installation contains hash and/or gin | indexes. These indexes have different | internal formats between your old and new | clusters so they must be reindexed with the | REINDEX command. After migration, you will | be given REINDEX instructions. Checking for bpchar_pattern_ops indexes ok Checking for large objects ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" stop >> "pg_upgrade.log" 2>&1 "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" -o "-p 4329 -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for presence of required libraries ok *Clusters are compatible* "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" stop >> "pg_upgrade.log" 2>&1 real0m6.417s user0m0.040s sys 0m0.060s All looks ok.
So I ran the upgrade without -c: $ time pg_upgrade -v -b /opt/pgsql-8.3.11-int/bin/ -B /opt/pgsql-9.0.5a-int/bin/ -d /var/postgresql// -D /var/postgresql/-9.0 -k -l pg_upgrade.log -p -P 4329 Running in verbose mode Performing Consistency Checks - Checking old data directory (/var/postgresql/) ok Checking old bin directory (/opt/pgsql-8.3.11-int/bin) ok Checking new data directory (/var/postgresql/-9.0) ok Checking new bin directory (/opt/pgsql-9.0.5a-int/bin) ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" -o "-p -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for reg* system oid user data typesok Checking for /contrib/isn with bigint-passing mismatch ok Checking for invalid 'name' user columnsok Checking for tsquery user columns ok Creating script to adjust sequences ok Checking for large objects ok Creating catalog dump "/opt/pgsql-9.0.5a-int/bin/pg_dumpall" --port --username "postgres" --schema-only --binary-upgrade > "/var/postgresql/pg_upgrade_dump_all.sql" ok "/opt/pgsql-8.3.11-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/" stop >> "pg_upgrade.log" 2>&1 "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" -o "-p 4329 -c autovacuum=off -c autovacuum_freeze_max_age=20" start >> "pg_upgrade.log" 2>&1 Checking for presence of required libraries ok | If pg_upgrade fails after this point, you must | re-initdb the new cluster before continuing. | You will also need to remove the ".old" suffix | from /var/postgresql//global/pg_control.old. Performing Migration Adding ".old" suffix to old global/pg_control ok Analyzing all rows in the new cluster "/opt/pgsql-9.0.5a-int/bin/vacuumdb" --port 4329 --username "postgres" --all --analyze >> "pg_upgrade.log" 2>&1 ok Freezing all rows on the new cluster "/opt/pgsql-9.0.5a-int/bin/vacuumdb" --port 4329 --username "postgres" --all --freeze >> "pg_upgrade.log" 2>&1 ok "/opt/pgsql-9.0.5a-int/bin/pg_ctl" -l "pg_upgrade.log" -D "/var/postgresql/-9.0" stop >> "pg_upgrade.log" 2>&1 Deleting new commit clogs
Re: [HACKERS] PATCH: regular logging of checkpoint progress
On 26 August 2011, 0:39, Tomas Vondra wrote: > On 26 August 2011, 0:18, Josh Berkus wrote: >> Tomas, >> >>> I'd like to propose a small patch that allows better checkpoint >>> progress >>> monitoring. The patch is quite simple - it adds a new integer GUC >>> "checkpoint_update_limit" and every time checkpoint writes this number >>> of >>> buffers, it does two things: >> >> I'd rather not have a new GUC if we can avoid it. What about just >> making this some reasonable value (like 1000 buffers) if log_checkpoints >> = on? > > I was thinking about that too, but I think no value can fit all cases > because the systems may have very different I/O subsystems. > > With one 7.2k drive I usually get about 25MB/s on average, with big arrays > / good controllers / fast drives you can write much faster. So either the > value will be too low (and the log will be infested with those messages) > or too high (so it won't be updated very often). Hmm, maybe we could use time instead of number of buffers? Something like "every 5 seconds, log the checkpoint progress" instead of "every time 1000 buffers are written ..." That should work regardless of the system's I/O performance. But although using time instead of number of buffers seems like a good idea, I don't think it eliminates the need for a new GUC. (a) Even with a time limit, I find it useful to be able to set the limits differently. (b) In some cases it may be useful to enable just basic (current behaviour) checkpoint logging using log_checkpoints, so using it for this new patch may not be a good idea. Although this could be fixed by allowing three values no/basic/detailed instead of just true/false. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
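For concreteness, a time-based check could be built from the existing timestamp helpers. A rough sketch (checkpoint_log_interval is a hypothetical GUC name used only for illustration - this is not code from the posted patch):

/* In BufferSync(), before the write loop: */
TimestampTz	last_report = GetCurrentTimestamp();

/* ... and inside the loop, after each buffer is written: */
if (checkpoint_log_interval > 0 &&
	TimestampDifferenceExceeds(last_report, GetCurrentTimestamp(),
							   checkpoint_log_interval))
{
	elog(LOG, "checkpoint status: %d of %d buffers written (%.1f%%)",
		 num_written, num_to_write,
		 100.0 * num_written / num_to_write);
	pgstat_send_bgwriter();		/* push the bgwriter stats out early, too */
	last_report = GetCurrentTimestamp();
}

The logging rate is then bounded by wall-clock time no matter how fast or slow the I/O subsystem is, which is exactly the property a buffer-count threshold lacks.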
Re: [HACKERS] PATCH: regular logging of checkpoint progress
On 26 August 2011, 0:18, Josh Berkus wrote: > Tomas, > >> I'd like to propose a small patch that allows better checkpoint progress >> monitoring. The patch is quite simple - it adds a new integer GUC >> "checkpoint_update_limit" and every time checkpoint writes this number >> of >> buffers, it does two things: > > I'd rather not have a new GUC if we can avoid it. What about just > making this some reasonable value (like 1000 buffers) if log_checkpoints > = on? I was thinking about that too, but I think no value can fit all cases because the systems may have very different I/O subsystems. With one 7.2k drive I usually get about 25MB/s on average, with big arrays / good controllers / fast drives you can write much faster. So either the value will be too low (and the log will be infested with those messages) or too high (so it won't be updated very often). And it depends on the number of shared buffers too. I thought it could be something like 5% of shared buffers but that really does not solve the issues. So I think there should be a new GUC. OTOH, it's probably a good idea to log this only when log_checkpoints is enabled, as those who tune checkpoints will enable log_checkpoints in the first place. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
On Aug 22, 2011, at 6:22 PM, Robert Haas wrote: > With respect to a general-purpose shared memory allocator, I think > that there are cases where that would be useful to have, but I don't > think there are as many of them as many people seem to think. I > wouldn't choose to implement this using a general-purpose allocator > even if we had it, both because it's undesirable to allow this or any > subsystem to consume an arbitrary amount of memory (nor can it fail... > especially in the abort path) and because a ring buffer is almost > certainly faster than a general-purpose allocator. We have enough > trouble with palloc overhead already. That having been said, I do > think there are cases where it would be nice to have... and it > wouldn't surprise me if I end up working on something along those > lines in the next year or so. It turns out that memory management is > a major issue in lock-free programming; you can't assume that it's > safe to recycle an object once the last pointer to it has been removed > from shared memory - because someone may have fetched the pointer just > before you removed it and still be using it to examine the object. An > allocator with some built-in capabilities for handling such problems > seems like it might be very useful. Actually, I wasn't thinking about the system dynamically sizing shared memory on its own... I was only thinking of providing the ability for a user to change something like shared_buffers and allow that change to take effect with a SIGHUP instead of requiring a full restart. I agree that we'd have to be very careful with allowing the code to start changing shared memory size on its own... -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
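To make the ring-buffer argument concrete, here is a minimal standalone sketch of the kind of structure being described (hypothetical code, not from any posted patch; locking and memory-ordering concerns are only noted in the comments):

#include <stdint.h>
#include <stdbool.h>

#define RING_SIZE 6500			/* on the order of 64 * MaxBackends */

typedef struct SnapshotRing
{
	uint64_t	head;			/* total entries ever written */
	uint32_t	xids[RING_SIZE];	/* TransactionId, in PostgreSQL terms */
} SnapshotRing;

/* Writer side; assumed to run under an exclusive lock or with barriers. */
static void
ring_write(SnapshotRing *ring, uint32_t xid)
{
	ring->xids[ring->head % RING_SIZE] = xid;
	ring->head++;
}

/*
 * Reader side: copy n entries starting at logical position 'start'.
 * Returns false if the writer lapped us meanwhile - the copy is then
 * torn and the caller must retry or fall back to a slower path.
 */
static bool
ring_read(const SnapshotRing *ring, uint64_t start, int n, uint32_t *dst)
{
	for (int i = 0; i < n; i++)
		dst[i] = ring->xids[(start + i) % RING_SIZE];
	return ring->head - start <= RING_SIZE;
}

Allocation is just advancing a counter, and the object-recycling problem Robert mentions reduces to one arithmetic check after the copy - no free lists, no per-chunk headers.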
Re: [HACKERS] cheaper snapshots redux
On Aug 25, 2011, at 8:24 AM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. For example, consider a buffer > with ~6500 entries, approximately 64 * MaxBackends, the approximate > size of the current subxip arrays taken in aggregate. I hypothesize > that a typical snapshot on a running system is going to be very small > - a handful of XIDs at most - because, on the average, transactions > are going to commit in *approximately* increasing XID order and, if > you take the regression tests as representative of a real workload, > only a small fraction of transactions will have more than one XID. So BTW, there's a way to actually gather some data on this by using PgQ (part of Skytools and used by Londiste). PgQ works by creating "ticks" at regular intervals, where a tick is basically just a snapshot of committed XIDs. Presumably Slony does something similar. I can provide you with sample data from our production systems if you're interested. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inputting relative datetimes
> Yes, but what I am trying to achieve is a way of entering such > relative timestamps using a single input value, so that absolute and > relative timestamps can both be bound to a SQL query using just one > variable. I think adding a function would be the way to go then. Maybe extending to_timestamp with a "relative" keyword. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: regular logging of checkpoint progress
Tomas, > I'd like to propose a small patch that allows better checkpoint progress > monitoring. The patch is quite simple - it adds a new integer GUC > "checkpoint_update_limit" and every time checkpoint writes this number of > buffers, it does two things: I'd rather not have a new GUC if we can avoid it. What about just making this some reasonable value (like 1000 buffers) if log_checkpoints = on? -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On 08/25/2011 06:05 PM, Jim Nasby wrote: On Aug 24, 2011, at 7:43 PM, Josh Berkus wrote: On 8/23/11 1:30 PM, Andrew Dunstan wrote: Attached is an undocumented patch that allows pg_restore to omit post-data items or omit all but post-data items. This has been discussed before, and Simon sent in a patch back on 2008, which has bitrotted some. I'm not sure why it was dropped at the time, but I think it's time to do this. This patch relies on some infrastructure that was added since Simon's patch, so it works a bit differently (and more simply). If it's not clear from Andrew's description, the purpose of this patch is to allow dividing your pgdump into 3 portions: 1. schema 2. data 3. constraints/indexes This allows users to implement a number of custom solutions for ad-hoc parallel dump, conditional loading, data munging and sampled databases. While doing so was possible before using the manifest from pg_restore -l, the manifest approach has been complex to automate and relies on obscure knowledge. I have immediate production use for this patch and may be backporting it. FWIW, I got around this by writing a perl script that calls pg_dump -s and watches for the end of table create statements (IIRC it specifically looks for the first CREATE INDEX). The advantage to that approach is that you don't have to first create a custom format dump and then run pg_restore against that. Well, notwithstanding my well known love of perl, that strikes me as spending a pound to save a penny. And custom format dumps rock ;-) Also, your recipe above is buggy, BTW. A CREATE INDEX statement might well not be the first item in the post-data section. But we could also add these switches to pg_dump too if people feel it's worthwhile. I haven't looked but the logic should not be terribly hard. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_restore --no-post-data and --post-data-only
On Aug 24, 2011, at 7:43 PM, Josh Berkus wrote: > On 8/23/11 1:30 PM, Andrew Dunstan wrote: >> >> Attached is an undocumented patch that allows pg_restore to omit >> post-data items or omit all but post-data items. This has been discussed >> before, and Simon sent in a patch back on 2008, which has bitrotted >> some. I'm not sure why it was dropped at the time, but I think it's time >> to do this. This patch relies on some infrastructure that was added >> since Simon's patch, so it works a bit differently (and more simply). > > If it's not clear from Andrew's description, the purpose of this patch > is to allow dividing your pgdump into 3 portions: > > 1. schema > 2. data > 3. constraints/indexes > > This allows users to implement a number of custom solutions for ad-hoc > parallel dump, conditional loading, data munging and sampled databases. > While doing so was possible before using the manifest from pg_restore > -l, the manifest approach has been complex to automate and relies on > obscure knowledge. > > I have immediate production use for this patch and may be backporting it. FWIW, I got around this by writing a perl script that calls pg_dump -s and watches for the end of table create statements (IIRC it specifically looks for the first CREATE INDEX). The advantage to that approach is that you don't have to first create a custom format dump and then run pg_restore against that. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why doesn't psql use the information schema to get ACL description ?
On Aug 23, 2011, at 1:57 PM, Peter Eisentraut wrote: >> Ultimately my ulterior motive is to make sure the information schema >> is actually useful and ensure that it maintains that status. > > The information schema only exposes privileges defined in the SQL > standard on objects defined in the SQL standard. So psql or any tool > that wants to provide information on PostgreSQL-specific features can't > use that. But perhaps the JDBC interfaces only expose certain standard > information anyway, so it could be useful. Depends on the details, > though. Related to this is the newsysviews project, which was intended to present more human-friendly info ala info_schema, but to also provide PostgreSQL-specific information (which info_schema can't). -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Buffering GiST leaf pages too
On Aug 23, 2011, at 2:03 AM, Heikki Linnakangas wrote: > While looking at Alexander's GiST fastbuild patch, which adds buffers to > internal nodes to avoid random I/O during index build, it occurred to me that > inserting the tuples to the leaf pages one at a time is quite inefficient > too, even if the leaf pages are in cache. There's still the overhead of > locking and WAL-logging each insertion separately. I think we could get a > nice further speedup if we attach a small buffer (one block or so) to every > leaf page we're currently writing tuples to, and update the leaf page in > bulk. Conveniently, the code to insert multiple tuples to a page already > exists in GiST code (because inserting a tuple sometimes splits the page into > more than two parts, so you need to insert multiple downlinks to the parent), > so this requires no changes to the low-level routines and WAL-logging. > > Let's finish off the main fastbuild patch first, but I wanted to get the idea > out there. I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice. -- Jim C. Nasby, Database Architect j...@nasby.net 512.569.9461 (cell) http://jim.nasby.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inputting relative datetimes
On Thu, Aug 25, 2011 at 5:08 AM, Dean Rasheed wrote: > On 25 August 2011 10:43, Vik Reykja wrote: >> On Thu, Aug 25, 2011 at 11:39, Dean Rasheed >> wrote: >>> >>> My first thought was to have some general way of adding or subtracting >>> an interval at the end of an input timestamp, eg. by adding another >>> couple of special values - "plus " and "minus ". >>> This would allow things like: >>> >>> TIMESTAMPTZ 'today minus 5 days' >>> TIMESTAMPTZ 'now plus 2 hours' >> >> Funny you should mention intervals... >> >> timestamptz 'today' - interval '5 days' >> timestamptz 'now' + interval '2 hours' >> > > Yes, but what I am trying to achieve is a way of entering such > relative timestamps using a single input value, so that absolute and > relative timestamps can both be bound to a SQL query using just one > variable. not gonna happen. as noted, intervals already solve the issue quite well and timestamp string parsing is already complicated enough as it is. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On 08/25/2011 02:45 PM, Tom Lane wrote: I wrote: Andrew Dunstan writes: We could just add -b unconditionally to the flex flags and then count the number of lines in lex.backup. If it's greater that 1 whine loudly, or even fail, otherwise remove lex.backup. Would that avoid locale dependencies? Hm, yeah, seems like that ought to work. Done in HEAD, but only for the Makefile-based build mechanism. Anybody want to add the comparable logic to the MSVC scripts? Done. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On Thu, Aug 25, 2011 at 5:12 PM, Tom Lane wrote: > While I'm looking at this ... the current implementation has got a > number of very inconsistent behaviors with respect to when it will > expand a variable reference within a psql meta-command argument. > Observe: > > regression=# \set foo 'value of foo' > regression=# \set bar 'value of bar' > regression=# \echo :foo > value of foo > regression=# \echo :foo@bar > value of foo @bar > > (there shouldn't be a space before the @, IMO --- there is because this > gets treated as two separate arguments, which seems bizarre) > > regression=# \echo :foo:bar > value of foo value of bar > > (again, why is this two arguments not one?) > > regression=# \echo :foo@:bar > value of foo @:bar > > (why isn't :bar expanded here, when it is in the previous case?) > > regression=# \echo foo:foo@:bar > foo:foo@:bar > > (and now neither one gets expanded) > > ISTM the general rule ought to be that we attempt to substitute for a > colon-construct regardless of where it appears within an argument, as > long as it's not within quotes. > > Thoughts? My main thought is that I remember this code being pretty awful - especially with respect to error handling - when I looked at it. A lot of dubious behaviors were more or less compelled by the impossibility of bailing out at an arbitrary point a la ereport(). At least, barring a drastic refactoring of the code, which might not be a bad idea either. No objection if you want to clean some of it up, but you may find it's a larger sinkhole than you anticipate. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
While I'm looking at this ... the current implementation has got a number of very inconsistent behaviors with respect to when it will expand a variable reference within a psql meta-command argument. Observe: regression=# \set foo 'value of foo' regression=# \set bar 'value of bar' regression=# \echo :foo value of foo regression=# \echo :foo@bar value of foo @bar (there shouldn't be a space before the @, IMO --- there is because this gets treated as two separate arguments, which seems bizarre) regression=# \echo :foo:bar value of foo value of bar (again, why is this two arguments not one?) regression=# \echo :foo@:bar value of foo @:bar (why isn't :bar expanded here, when it is in the previous case?) regression=# \echo foo:foo@:bar foo:foo@:bar (and now neither one gets expanded) ISTM the general rule ought to be that we attempt to substitute for a colon-construct regardless of where it appears within an argument, as long as it's not within quotes. Thoughts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] PATCH: regular logging of checkpoint progress
Hello, I'd like to propose a small patch that allows better checkpoint progress monitoring. The patch is quite simple - it adds a new integer GUC "checkpoint_update_limit" and every time checkpoint writes this number of buffers, it does two things: (a) logs a "checkpoint status" message into the server log, with info about total number of buffers to write, number of already written buffers, current and average write speed and estimate of remaining time (b) sends bgwriter stats (so that the buffers_checkpoint is updated) I believe this will make checkpoint tuning easier, especially with large shared buffers and when there's other write activity (so that it's difficult to see the checkpoint I/O). The default value (0) means this continuous logging is disabled. Tomas diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml new file mode 100644 index 67e722f..64d84b0 *** a/doc/src/sgml/config.sgml --- b/doc/src/sgml/config.sgml *** SET ENABLE_SEQSCAN TO OFF; *** 1863,1868 --- 1863,1885 + + checkpoint_update_limit (integer) + +checkpoint_update_limit configuration parameter + + + + Number of buffers written during a checkpoint between logging a status + (with total number of buffers to write, number of already written buffers, + average/current write and estimate of the remaining time) and updates bgwriter + stats at the same time. The default value 0 disables the continuous updates so + the stats are updated only at the end of the checkpoint. This parameter can only + be set in the postgresql.conf file or on the server command line. + + + + diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c new file mode 100644 index 4c7cfb0..b24ec93 *** a/src/backend/storage/buffer/bufmgr.c --- b/src/backend/storage/buffer/bufmgr.c *** *** 66,71 --- 66,72 bool zero_damaged_pages = false; int bgwriter_lru_maxpages = 100; doublebgwriter_lru_multiplier = 2.0; + int checkpoint_update_limit = 0; /* * How many buffers PrefetchBuffer callers should try to stay ahead of their *** BufferSync(int flags) *** 1175,1180 --- 1176,1192 int num_to_write; int num_written; int mask = BM_DIRTY; + + int num_since_update; + + longcurr_secs, + total_secs; + int curr_usecs, + total_usecs; + float curr_time, + total_time; + + TimestampTz startTimestamp, lastTimestamp; /* Make sure we can handle the pin inside SyncOneBuffer */ ResourceOwnerEnlargeBuffers(CurrentResourceOwner); *** BufferSync(int flags) *** 1238,1243 --- 1250,1260 buf_id = StrategySyncStart(NULL, NULL); num_to_scan = NBuffers; num_written = 0; + num_since_update = 0; + + startTimestamp = GetCurrentTimestamp(); + lastTimestamp = startTimestamp; + while (num_to_scan-- > 0) { volatile BufferDesc *bufHdr = &BufferDescriptors[buf_id]; *** BufferSync(int flags) *** 1261,1266 --- 1278,1327 TRACE_POSTGRESQL_BUFFER_SYNC_WRITTEN(buf_id); BgWriterStats.m_buf_written_checkpoints++; num_written++; + num_since_update++; + + /* +* Every time we write enough buffers (checkpoint_update_limit), +* we log a checkpoint status message and update the bgwriter +* stats (so that the pg_stat_bgwriter table may be updated). +* +* The log message contains info about total number of buffers to +* write, how many buffers are already written, average and current +* write speed and an estimate remaining time. +*/ + if ((checkpoint_update_limit > 0) && (num_since_update >= checkpoint_update_limit)) + { + + TimestampDifference(lastTimestamp, + GetCurrentTimestamp(), + &curr_secs, &curr_usecs); + +
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Thu, Aug 25, 2011 at 04:43:02PM -0400, Bruce Momjian wrote: > Please check the old cluster. Sure: =# SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; reltoastrelid --- 82510395 71637071 (2 rows) =# SELECT oid::regclass, reltoastrelid FROM pg_class WHERE relname = 'actions'; oid | reltoastrelid ---+--- x.actions | 82510395 y.actions | 71637071 (2 rows) =# select oid, relfilenode from pg_class where oid in (SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'); oid| relfilenode --+- 82510395 |82510395 71637071 |71637071 (2 rows) =# select oid from pg_database where datname = current_database(); oid -- 71635381 (1 row) $ ls -l /base/71635381/{71637071,82510395} -rw--- 1 postgres postgres 0 2009-10-12 06:49 /base/71635381/71637071 -rw--- 1 postgres postgres 0 2010-08-19 14:02 /base/71635381/82510395 > > > > One more thing - one of the earlier tests actually worked through > > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > > > an error about missing transaction/clog - don't remember exactly what it > > > > was, though. > > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > > > have been that? > > It was done definitely using 9.0.4. > Good. Not sure if it's good, since it was after the clog error was fixed, and I still got it :/ but anyway - the problem with 71637071 is more important now. Best regards, depesz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SSI 2PC coverage
Excerpts from Tom Lane's message of Wed Aug 24 22:11:58 -0300 2011: > Alvaro Herrera writes: > > After having to play with this, I didn't like it very much, because > > regression.diffs gets spammed with the (rather massive and completely > > useless) diff in that test. For the xml tests, rather than ignoring the > > failure on an installation without libxml, we use an alternative output. > > > Unless there are objections, I will commit the alternative file proposed > > by Dan. > > +1 ... "ignore" is a pretty ugly hack here. Done. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
hubert depesz lubaczewski wrote: > On Thu, Aug 25, 2011 at 04:33:07PM -0400, Bruce Momjian wrote: > > The problem appears to be that the Postgres catalogs think there is a > > toast table for 'actions', while the file system doesn't seem to have > > such a file. Can you look in pg_class and verify that? > > > > SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > > $ SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; > reltoastrelid > --- > (0 rows) > > This was done not on the pg from backup, but on normal production, as the test > pg instance doesn't work anymore. > > I can reset the test instance, but extracting from backup, and making it apply > all xlogs usually takes 2-3 days. If you remove the .old extension on pg_control, you can start the old cluster and check it. This is explained by the pg_upgrade output: | If pg_upgrade fails after this point, you must | re-initdb the new cluster before continuing. | You will also need to remove the ".old" suffix | from /var/postgresql//global/pg_control.old. Please check the old cluster. > > > One more thing - one of the earlier tests actually worked through > > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > > an error about missing transaction/clog - don't remember exactly what it > > > was, though. > > > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > > have been that? > > It was done definitely using 9.0.4. Good. -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
On Thu, Aug 25, 2011 at 04:33:07PM -0400, Bruce Momjian wrote: > The problem appears to be that the Postgres catalogs think there is a > toast table for 'actions', while the file system doesn't seem to have > such a file. Can you look in pg_class and verify that? > > SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; $ SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; reltoastrelid --- (0 rows) This was done not on the pg from backup, but on normal production, as the test pg instance doesn't work anymore. I can reset the test instance, but extracting from backup, and making it apply all xlogs usually takes 2-3 days. > > One more thing - one of the earlier tests actually worked through > > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > > an error about missing transaction/clog - don't remember exactly what it > > was, though. > > There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it > have been that? It was done definitely using 9.0.4. Best regards, depesz -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] pg_upgrade problem
hubert depesz lubaczewski wrote: > hi > > I have an 8.3.11 database, ~600GB in size. > > I want to upgrade it to 9.0. > > First, I tried with 9.0.4, and when I hit the problem (the same one) I tried > git, head of the 9.0 branch. Good. > pg_upgrade_dump_db.sql- > pg_upgrade_dump_db.sql--- For binary upgrade, must preserve relfilenodes > pg_upgrade_dump_db.sql-SELECT > binary_upgrade.set_next_heap_relfilenode('71637068'::pg_catalog.oid); > pg_upgrade_dump_db.sql:SELECT > binary_upgrade.set_next_toast_relfilenode('71637071'::pg_catalog.oid); > pg_upgrade_dump_db.sql-SELECT > binary_upgrade.set_next_index_relfilenode('71637073'::pg_catalog.oid); > pg_upgrade_dump_db.sql- > pg_upgrade_dump_db.sql-CREATE TABLE actions ( > -- > pg_upgrade.log-linking /var/postgresql//base/113953649/2613 to > /var/postgresql/-9.0/base/11826/11790 > pg_upgrade.log- /var/postgresql//base/113953649/2683 > pg_upgrade.log-linking /var/postgresql//base/113953649/2683 to > /var/postgresql/-9.0/base/11826/11792 > pg_upgrade.log:Could not find 71637071 in old cluster The problem appears to be that the Postgres catalogs think there is a toast table for 'actions', while the file system doesn't seem to have such a file. Can you look in pg_class and verify that? SELECT reltoastrelid FROM pg_class WHERE relname = 'actions'; Then look in the file system to see if there is a matching file. > One more thing - one of the earlier tests actually worked through > pg_upgrade, but when running vacuumdb -az on newly started 9.0.4, I got > an error about missing transaction/clog - don't remember exactly what it > was, though. There was a bug in how pg_upgrade worked in pre-9.0.4 --- could it have been that? -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache
On Fri, Aug 5, 2011 at 2:36 PM, Kohei KaiGai wrote: > BTW, what is the current status of this patch? > The status of the contrib/sepgsql part is unclear to me, although we agreed that > syscache is a suitable mechanism for security labels. Sorry it's taken me a while to get around to looking at this. Reviewing away... For me, the line you removed from dml.out causes the regression tests to fail. I don't understand what this is going for: + /* +* To boost up trusted procedure checks on db_procedure object +* class, we also confirm the decision when user calls a procedure +* labeled as 'tcontext'. +*/ Can you explain? sepgsql_avc_check_perms_label has a formatting error on the line that says "result = false". It's not indented correctly. Several functions do this: sepgsql_avc_check_valid(); do { ... } while (!sepgsql_avc_check_valid()); I don't understand why we need a loop there. The comment for sepgsql_avc_check_perms_label uses the word "elsewhere" when it really means "otherwise". Changing the calling sequence of sepgsql_get_label() would perhaps be better separated out into its own patch. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] tsvector concatenation - backend crash
Hi Attached SQL files gives (at least in my hands) a reliable backend crash with this stacktrace .. reproduced on both 9.0.4 and HEAD. I'm sorry I cannot provide a more trimmed down set of vectors the reproduces the bug, thus the "obsfucated" dataset. But even deleting single terms in the vectors make the bug go away. *** glibc detected *** postgres: jk jk [local] SELECT: corrupted double-linked list: 0x02279f80 *** === Backtrace: = /lib/libc.so.6(+0x775b6)[0x7fe4db4b25b6] /lib/libc.so.6(+0x7aa25)[0x7fe4db4b5a25] /lib/libc.so.6(cfree+0x73)[0x7fe4db4b8e83] postgres: jk jk [local] SELECT[0x710de5] postgres: jk jk [local] SELECT(MemoryContextReset+0x2a)[0x71119a] postgres: jk jk [local] SELECT(ExecScan+0x4a)[0x57887a] postgres: jk jk [local] SELECT(ExecProcNode+0x238)[0x571708] postgres: jk jk [local] SELECT(standard_ExecutorRun+0xd2)[0x5705e2] postgres: jk jk [local] SELECT[0x63c627] postgres: jk jk [local] SELECT(PortalRun+0x248)[0x63d948] postgres: jk jk [local] SELECT[0x639fdb] postgres: jk jk [local] SELECT(PostgresMain+0x547)[0x63af97] postgres: jk jk [local] SELECT[0x5fb959] postgres: jk jk [local] SELECT(PostmasterMain+0xa97)[0x5fe137] postgres: jk jk [local] SELECT(main+0x490)[0x59f4d0] /lib/libc.so.6(__libc_start_main+0xfd)[0x7fe4db459c4d] postgres: jk jk [local] SELECT[0x45d569] === Memory map: 0040-008d6000 r-xp 08:01 4071141 /tmp/pgsql/bin/postgres 00ad5000-00ad6000 r--p 004d5000 08:01 4071141 /tmp/pgsql/bin/postgres 00ad6000-00ae2000 rw-p 004d6000 08:01 4071141 /tmp/pgsql/bin/postgres 00ae2000-00b43000 rw-p 00:00 0 0215d000-0227e000 rw-p 00:00 0 [heap] 7fe4d400-7fe4d4021000 rw-p 00:00 0 7fe4d4021000-7fe4d800 ---p 00:00 0 7fe4d908f000-7fe4d90a5000 r-xp 08:01 4194383 /lib/libgcc_s.so.1 7fe4d90a5000-7fe4d92a4000 ---p 00016000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92a4000-7fe4d92a5000 r--p 00015000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92a5000-7fe4d92a6000 rw-p 00016000 08:01 4194383 /lib/libgcc_s.so.1 7fe4d92c1000-7fe4d9342000 rw-p 00:00 0 7fe4d9342000-7fe4db22e000 rw-s 00:04 8716337 /SYSV0052ea91 (deleted) 7fe4db22e000-7fe4db23a000 r-xp 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db23a000-7fe4db439000 ---p c000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db439000-7fe4db43a000 r--p b000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db43a000-7fe4db43b000 rw-p c000 08:01 4194415 /lib/libnss_files-2.11.1.so 7fe4db43b000-7fe4db5b5000 r-xp 08:01 4194349 /lib/libc-2.11.1.so 7fe4db5b5000-7fe4db7b4000 ---p 0017a000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b4000-7fe4db7b8000 r--p 00179000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b8000-7fe4db7b9000 rw-p 0017d000 08:01 4194349 /lib/libc-2.11.1.so 7fe4db7b9000-7fe4db7be000 rw-p 00:00 0 7fe4db7be000-7fe4db84 r-xp 08:01 4194398 /lib/libm-2.11.1.so 7fe4db84-7fe4dba3f000 ---p 00082000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba3f000-7fe4dba4 r--p 00081000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba4-7fe4dba41000 rw-p 00082000 08:01 4194398 /lib/libm-2.11.1.so 7fe4dba41000-7fe4dba43000 r-xp 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dba43000-7fe4dbc43000 ---p 2000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc43000-7fe4dbc44000 r--p 2000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc44000-7fe4dbc45000 rw-p 3000 08:01 4194363 /lib/libdl-2.11.1.so 7fe4dbc45000-7fe4dbc65000 r-xp 08:01 4194325 /lib/ld-2.11.1.so 7fe4dbc85000-7fe4dbce7000 rw-p 00:00 0 7fe4dbce7000-7fe4dbd26000 r--p 08:01 5512971 /usr/lib/locale/en_DK.utf8/LC_CTYPE 7fe4dbd26000-7fe4dbe44000 r--p 08:01 5512650 /usr/lib/locale/en_DK.utf8/LC_COLLATE 7fe4dbe44000-7fe4dbe47000 rw-p 00:00 0 
7fe4dbe58000-7fe4dbe59000 r--p 08:01 5515083 /usr/lib/locale/en_DK.utf8/LC_TIME 7fe4dbe59000-7fe4dbe5a000 r--p 08:01 5515084 /usr/lib/locale/en_DK.utf8/LC_MONETARY 7fe4dbe5a000-7fe4dbe5b000 r--p 08:01 5640299 /usr/lib/locale/en_DK.utf8/LC_MESSAGES/SYS_LC_MESSAGES 7fe4dbe5b000-7fe4dbe62000 r--s 08:01 5511621 /usr/lib/gconv/gconv-modules.cache 7fe4dbe62000-7fe4dbe64000 rw-p 00:00 0 7fe4dbe64000-7fe4dbe65000 r--p 0001f000 08:01 4194325 /lib/ld-2.11.1.so 7
Re: [HACKERS] Backup's from standby
On Fri, Aug 19, 2011 at 9:38 AM, Valentine Gogichashvili wrote: >> > What issue we may face if you take a backups(includes data dir + wal >> > files) >> > at standby without LVM snapshot? >> >> The backup might be corrupted in arbitrary ways. > > And what will happen, if one issues a pg_start_backup() on the master, then > takes a file-backup on slave, and issues pg_stop_backup() on master again? > As far as I remember this approach was working for me, considering, that all > needed WAL files are transferred to the newly created DB copy as well. Well, I think you would need to verify a few more things: - The pg_start_backup() will need to have been replayed on the standby before you start the file copy. - You will need all the WAL segments beginning at the position where pg_start_backup() was executed on the master and ending after pg_stop_backup() was executed on the master. Assuming you do that, it seems like it ought to work, although I have not tested it and am not at all sure I'm not missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: Fast GiST index build
On 24.08.2011 16:57, Alexander Korotkov wrote: I've added some testing results to the wiki page: http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011 Not all of the results I planned for this first chunk are there yet, because it is taking more time than I expected. Some notes about it. I now see two causes which accelerate regular build of GiST indexes: 1) As noted before, regular index build of a pretty well ordered dataset is fast. 2) I found that a worse index is faster to build; by worse I mean an index with higher overlap. The gistchoose function selects the first index tuple with zero penalty, if any. Thus, with higher overlap in the root page, only a few of its index tuples will be chosen for inserts. And, recursively, only a small part of the tree will be used for actual inserts, and that part of the tree can fit into the cache more easily. Thus, high overlap makes inserts cheaper to the same extent that it makes searches more expensive. As an extreme case, a trivial penalty function that just always returns 0 will make index build fast - but the index will be useless for querying. In the tests of the first version of the patch I found the index quality of the regular build much better than that of the buffering build (without neighbor relocation). Now they are similar, though that is because the index quality of the regular build has become worse; that is also why the regular index build is faster in the current tests than in the previous ones. I see the following possible causes: 1) I didn't save the source random data, so this run uses new random data. 2) Some environment parameters of my test setup may have changed, though I doubt it. Despite these possible explanations it seems quite strange to me. That's pretty surprising. Assuming the data is truly random, I wouldn't expect a big difference in the index quality of one random data set over another. If the index quality depends so much on, say, the distribution of the first few tuples inserted into it, that's a quite interesting find on its own, and merits some further research. In order to compare index build methods on higher-quality indexes, I've tried to build indexes with my double sorting split method (see: http://syrcose.ispras.ru/2011/files/SYRCoSE2011_Proceedings.pdf#page=36). On a uniform dataset, search is about 10 times faster! And, as expected, regular index build becomes much slower: it has been running for more than 60 hours and only about 50% of the index is complete (estimated by file sizes). Also, automatic switching to buffering build shows better index quality results in all the tests, though it's hard for me to explain why. Hmm, it makes me a bit uneasy that we're testing with a modified page splitting algorithm. But if the new algorithm is that good, could you post it as a separate patch, please? That said, I don't see any new evidence that the buffering build algorithm would be significantly worse. There's the case of ordered data that we already knew about, and will have to just accept for now. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
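To make the extreme case above concrete - this is purely illustrative C, not code from the patch under discussion - a GiST penalty support function that always reports zero funnels every insert down the first subtree gistchoose() examines:

#include "postgres.h"
#include "fmgr.h"
#include "access/gist.h"

PG_FUNCTION_INFO_V1(degenerate_penalty);

/*
 * Illustrative only: with a constant zero penalty, gistchoose() always
 * descends at the first tuple it inspects, so a small, cache-resident
 * slice of the tree absorbs all inserts. The build is fast, but the
 * overlap it creates makes the index useless for searching.
 */
Datum
degenerate_penalty(PG_FUNCTION_ARGS)
{
	/* args 0 and 1, the existing and new GISTENTRY, are ignored */
	float	   *penalty = (float *) PG_GETARG_POINTER(2);

	*penalty = 0.0f;
	PG_RETURN_POINTER(penalty);
}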
Re: [HACKERS] A couple of issues with psql variable substitution
I wrote: > Andrew Dunstan writes: >> We could just add -b unconditionally to the flex flags and then count >> the number of lines in lex.backup. If it's greater than 1, whine loudly >> or even fail; otherwise remove lex.backup. Would that avoid locale >> dependencies? > Hm, yeah, seems like that ought to work. Done in HEAD, but only for the Makefile-based build mechanism. Anybody want to add the comparable logic to the MSVC scripts? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Andrew Dunstan writes: > We could just add -b unconditionally to the flex flags and then count > the number of lines in lex.backup. If it's greater than 1, whine loudly > or even fail; otherwise remove lex.backup. Would that avoid locale > dependencies? Hm, yeah, seems like that ought to work. I'm tempted to add "-p -p" also, even though that only results in some whinging on stderr. It would still probably get noticed by anyone who was changing the lexer. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
On 08/25/2011 01:16 PM, Alvaro Herrera wrote: Excerpts from Tom Lane's message of Thu Aug 25 14:00:57 -0300 2011: Robert Haas writes: On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: 1. Somebody broke the no-backtracking property back in 9.0 while adding quoted variable substitution. According to the flex manual, use of backtracking creates a performance penalty. We once measured the backend's lexer as being about a third faster with backtrack avoidance, and presumably it's about the same for psql's. This is not hard to fix, but should I consider it a bug fix and back-patch? We've not had complaints about psql getting slower as of 9.0. That may well have been me. [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. How would I have known that I broke it? Per the header comments in the backend lexer, you should run flex with "-b" switch and verify that the resulting lex.backup file says "no backing up". I've occasionally thought about automating that, but I'm not sure if the output is entirely locale- and flex-version-independent. It is locale dependent, though of course for the automated check you could just run flex under LC_ALL=C. $ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup Sin retroceso. $ LC_ALL=C /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup No backing up. We could just add -b unconditionally to the flex flags and then count the number of lines in lex.backup. If it's greater than 1, whine loudly or even fail; otherwise remove lex.backup. Would that avoid locale dependencies? cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
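In shell terms, the check Andrew proposes might look something like this - a hedged sketch, not the rule that was actually committed:

# Run flex with -b under the C locale so lex.backup's wording is
# predictable, then fail if it says anything beyond "No backing up."
LC_ALL=C flex -Cfe -b -o psqlscan.c psqlscan.l
if [ "$(wc -l < lex.backup)" -gt 1 ]; then
    echo "backtracking introduced in psqlscan.l; see lex.backup" >&2
    exit 1
fi
rm -f lex.backup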
Re: [HACKERS] A couple of issues with psql variable substitution
Excerpts from Tom Lane's message of Thu Aug 25 14:00:57 -0300 2011: > Robert Haas writes: > > On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: > >> 1. Somebody broke the no-backtracking property back in 9.0 while adding > >> quoted variable substitution. According to the flex manual, use of > >> backtracking creates a performance penalty. We once measured the > >> backend's lexer as being about a third faster with backtrack avoidance, > >> and presumably it's about the same for psql's. This is not hard to fix, > >> but should I consider it a bug fix and back-patch? We've not had > >> complaints about psql getting slower as of 9.0. > > > That may well have been me. > > [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. > > > How would I have known that I broke it? > > Per the header comments in the backend lexer, you should run flex with > "-b" switch and verify that the resulting lex.backup file says "no > backing up". I've occasionally thought about automating that, but I'm > not sure if the output is entirely locale- and flex-version-independent. It is locale dependent, though of course for the automated check you could just run flex under LC_ALL=C. $ /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup Sin retroceso. $ LC_ALL=C /usr/bin/flex -Cfe -b /pgsql/source/REL8_4_STABLE/src/bin/psql/psqlscan.l $ cat lex.backup No backing up. -- Álvaro Herrera The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] A couple of issues with psql variable substitution
Robert Haas writes: > On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: >> 1. Somebody broke the no-backtracking property back in 9.0 while adding >> quoted variable substitution. According to the flex manual, use of >> backtracking creates a performance penalty. We once measured the >> backend's lexer as being about a third faster with backtrack avoidance, >> and presumably it's about the same for psql's. This is not hard to fix, >> but should I consider it a bug fix and back-patch? We've not had >> complaints about psql getting slower as of 9.0. > That may well have been me. [ checks "git blame" ] Well, you committed the patch anyway: d0cfc018. > How would I have known that I broke it? Per the header comments in the backend lexer, you should run flex with "-b" switch and verify that the resulting lex.backup file says "no backing up". I've occasionally thought about automating that, but I'm not sure if the output is entirely locale- and flex-version-independent. > Also, how invasive is the fix? We need to add a couple more rules that will match an unterminated quoted variable and do something reasonable (probably just throw back everything but the colon with yyless). I've not coded it but I think it can't be more than a dozen lines or so. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
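For what it's worth, such a rule might look roughly like the following flex fragment - a hedged sketch only; the actual fix and the patterns in psqlscan.l may well differ:

/* Unterminated quoted variable, e.g. :'foo with no closing quote.
 * Match it outright so flex needs no backtracking state, then throw
 * back everything but the colon with yyless() and emit the colon as
 * ordinary text. (A longer rule for the properly terminated form
 * still wins by longest-match when the closing quote is present.) */
:['"][^'"]*	{
			yyless(1);
			ECHO;
		}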
Re: [HACKERS] A couple of issues with psql variable substitution
On Thu, Aug 25, 2011 at 12:47 PM, Tom Lane wrote: > On my way to do something else entirely, I came across a couple of > things that are not very nice about psql's lexer: > > 1. Somebody broke the no-backtracking property back in 9.0 while adding > quoted variable substitution. According to the flex manual, use of > backtracking creates a performance penalty. We once measured the > backend's lexer as being about a third faster with backtrack avoidance, > and presumably it's about the same for psql's. This is not hard to fix, > but should I consider it a bug fix and back-patch? We've not had > complaints about psql getting slower as of 9.0. That may well have been me. How would I have known that I broke it? Also, how invasive is the fix? > 2. The lexer rules associated with variable substitution think that > variable names can consist only of ASCII letters and digits (and > underscores). The psql manual is noncommittal about whether non-ASCII > characters are allowed, but a reasonable person would think that the > rules ought to be the same as the backend's idea of what an identifier > is. Does anybody have a problem with improving that? Nope. Or at least, I don't. > (I'm not > proposing this part as a bug fix, because it does look a little bit > more invasive to fix, because of the way psql deals with unsafe > multibyte encodings.) +1 for not back-patching this part. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change format of FDW options used in \d* commands
2011/8/19 Shigeru Hanada : > Oops, good catch. I've fixed psql to use quote_ident() for option_name, > and modified the regression tests to use special characters in option names. > Please try the revised version of the patch. This part looks good to me, too. Committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] A couple of issues with psql variable substitution
On my way to do something else entirely, I came across a couple of things that are not very nice about psql's lexer: 1. Somebody broke the no-backtracking property back in 9.0 while adding quoted variable substitution. According to the flex manual, use of backtracking creates a performance penalty. We once measured the backend's lexer as being about a third faster with backtrack avoidance, and presumably it's about the same for psql's. This is not hard to fix, but should I consider it a bug fix and back-patch? We've not had complaints about psql getting slower as of 9.0. 2. The lexer rules associated with variable substitution think that variable names can consist only of ASCII letters and digits (and underscores). The psql manual is noncommittal about whether non-ASCII characters are allowed, but a reasonable person would think that the rules ought to be the same as the backend's idea of what an identifier is. Does anybody have a problem with improving that? (I'm not proposing this part as a bug fix, because it does look a little bit more invasive to fix, because of the way psql deals with unsafe multibyte encodings.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Change format of FDW options used in \d* commands
2011/8/19 Shigeru Hanada : > BTW, I noticed that pg_dump has had the same issue since 8.4, the initial release > of the SQL/MED infrastructure. If an FDW option is defined on any FDW object > with a name that contains special characters such as spaces or > parentheses, pg_dump generates an invalid OPTIONS clause such as "OPTIONS > (separated name 'value')". > Perhaps this issue has been overlooked because dblink was practically the > only use case for FDW options before 9.1. Since 9.1, users might get > various FDWs, and some of those might use special characters in option > names. ISTM that this fix should be back-patched, at least to 9.1. > Please find attached patches for each STABLE branch. Good catch, committed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
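To make the failure mode concrete - a hedged illustration using an invented wrapper with no validator, so that arbitrary option names are accepted:

-- An FDW with no validator accepts any option name, including ones
-- that need quoting when dumped.
CREATE FOREIGN DATA WRAPPER dummy_fdw;
CREATE SERVER dummy_server FOREIGN DATA WRAPPER dummy_fdw
    OPTIONS ("separated name" 'value');

-- Before the fix, pg_dump would emit OPTIONS (separated name 'value'),
-- which is not valid syntax; quoting the name via quote_ident() makes
-- the dump round-trip.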
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 11:15 AM, Markus Wanner wrote: > On 08/25/2011 04:59 PM, Tom Lane wrote: >> That's a good point. If the ring buffer size creates a constraint on >> the maximum number of sub-XIDs per transaction, you're going to need a >> fallback path of some sort. > > I think Robert envisions the same fallback path we already have: > subxids.overflowed. I have a slightly more nuanced idea, but basically yes. The trouble is that if you're keeping the snapshot around and updating it (rather than scanning the ProcArray each time) you need some sort of mechanism for the snapshot to eventually un-overflow. Otherwise, the first overflow leaves you in the soup for the entire lifetime of the cluster. What I have in mind is to store the highest subxid that has been removed from the snapshot, or InvalidTransactionId if we know the snapshot is complete. Whenever the highest removed subxid falls behind xmin, we can reset it to InvalidTransactionId. It would be sensible for clients to store the exact value of highest_removed_subxid in their snapshots as well, instead of just a Boolean flag. A pg_subtrans lookup is needed only for XIDs which are greater than xmin and less than or equal to highest_removed_subxid. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
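A minimal C sketch of that rule; highest_removed_subxid is the invented field described above, not something that exists in the PostgreSQL tree:

#include "postgres.h"
#include "access/transam.h"

/*
 * Per the scheme above, a pg_subtrans lookup is needed only for XIDs in
 * the window (xmin, highest_removed_subxid]; everything else can be
 * answered from the snapshot itself.
 */
static bool
need_subtrans_lookup(TransactionId xid, TransactionId xmin,
					 TransactionId highest_removed_subxid)
{
	/* once the removal horizon falls behind xmin, the window is empty */
	if (!TransactionIdIsValid(highest_removed_subxid) ||
		TransactionIdPrecedes(highest_removed_subxid, xmin))
		return false;

	return TransactionIdFollows(xid, xmin) &&
		   !TransactionIdFollows(xid, highest_removed_subxid);
}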
Re: [HACKERS] cheaper snapshots redux
Tom, On 08/25/2011 04:59 PM, Tom Lane wrote: > That's a good point. If the ring buffer size creates a constraint on > the maximum number of sub-XIDs per transaction, you're going to need a > fallback path of some sort. I think Robert envisions the same fallback path we already have: subxids.overflowed. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert, On 08/25/2011 04:48 PM, Robert Haas wrote: > What's a typical message size for imessages? Most message types in Postgres-R are just a couple of bytes in size. Others, especially change sets, can be up to 8k. However, I think you'll have an easier job guaranteeing that backends "consume" their portions of the ring-buffer in time. Plus wrap-around isn't that much of a problem in your case. (I couldn't drop imessages, but had to let senders wait.) > Well, one long-running transaction that only has a single XID is not > really a problem: the snapshot is still small. But one very old > transaction that also happens to have a large number of > subtransactions all of which have XIDs assigned might be a good way to > stress the system. Ah, right, that's why it's a list of transactions in progress and not a list of completed transactions in SnapshotData... good. > Each reader decides which data he needs to copy from the buffer, and > then copies it, and then checks whether any of it got overwritten > before the copy was completed. So there's a lively possibility that > the snapshot that was current when the reader began copying it will no > longer be current by the time he finishes copying it, because a commit > has intervened. That's OK: it just means that, effectively, the > snapshot is taken at the moment the start and stop pointers are read, > and won't take into account any commits that happen later, which is > exactly what a snapshot is supposed to do anyway. Agreed, that makes sense. Thanks for explaining. > There is a hopefully quite small possibility that by the time the > reader finishes copying it so much new data will have been written to > the buffer that it will have wrapped around and clobbered the portion > the reader was interested in. That needs to be rare. Yeah. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert Haas writes: > Well, one long-running transaction that only has a single XID is not > really a problem: the snapshot is still small. But one very old > transaction that also happens to have a large number of > subtransactions all of which have XIDs assigned might be a good way to > stress the system. That's a good point. If the ring buffer size creates a constraint on the maximum number of sub-XIDs per transaction, you're going to need a fallback path of some sort. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 10:19 AM, Markus Wanner wrote: > Note, however, that for imessages, I've also had the policy in place > that a backend *must* consume its message before sending any. And that > I took great care for all receivers to consume their messages as early > as possible. None the less, I kept incrementing the buffer size (to > multiple megabytes) to make this work. Maybe I'm overcautious because > of that experience. What's a typical message size for imessages? >> - a handful of XIDs at most - because, on the average, transactions >> are going to commit in *approximately* increasing XID order > > This assumption quickly turns false if you happen to have just one > long-running transaction, I think. Or in general, if transaction > duration varies a lot. Well, one long-running transaction that only has a single XID is not really a problem: the snapshot is still small. But one very old transaction that also happens to have a large number of subtransactions all of which have XIDs assigned might be a good way to stress the system. >> So the backend taking a snapshot only needs >> to be able to copy < ~64 bytes of information from the ring buffer >> before other backends write ~27k of data into that buffer, likely >> requiring hundreds of other commits. > > You said earlier that "only the latest snapshot" is required. It takes > only a single commit for such a snapshot to not be the latest anymore. > > Instead, if you keep around older snapshots for some time - as your > description here implies - readers are free to copy from those older > snapshots while other backends are able to make progress concurrently > (writers or readers of other snapshots). > > However, that either requires keeping track of readers of a certain > snapshot (reference counting) or - as I understand your description - > you simply invalidate all concurrent readers upon wrap-around, or something. Each reader decides which data he needs to copy from the buffer, and then copies it, and then checks whether any of it got overwritten before the copy was completed. So there's a lively possibility that the snapshot that was current when the reader began copying it will no longer be current by the time he finishes copying it, because a commit has intervened. That's OK: it just means that, effectively, the snapshot is taken at the moment the start and stop pointers are read, and won't take into account any commits that happen later, which is exactly what a snapshot is supposed to do anyway. There is a hopefully quite small possibility that by the time the reader finishes copying it so much new data will have been written to the buffer that it will have wrapped around and clobbered the portion the reader was interested in. That needs to be rare. >> Now, as the size of the snapshot gets bigger, things will eventually >> become less good. > > Also keep configurations with increased max_connections in mind. With > that, not only do the snapshots get bigger, but more processes have to > share CPU time, on average making memcpy slower for a single process. Right. I'm imagining making the default buffer size proportional to max_connections. >> Of course even if wraparound turns out not to be a problem there are >> other things that could scuttle this whole approach, but I think the >> idea has enough potential to be worth testing. If the whole thing >> crashes and burns I hope I'll at least learn enough along the way to >> design something better... > > That's always a good motivation. In that sense: happy hacking! 
Thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] cheaper snapshots redux
Robert, On 08/25/2011 03:24 PM, Robert Haas wrote: > My hope (and it might turn out that I'm an optimist) is that even with > a reasonably small buffer it will be very rare for a backend to > experience a wraparound condition. It certainly seems less likely than with the ring-buffer for imessages, yes. Note, however, that for imessages, I've also had the policy in place that a backend *must* consume its message before sending any. And that I took great care for all receivers to consume their messages as early as possible. None the less, I kept incrementing the buffer size (to multiple megabytes) to make this work. Maybe I'm overcautious because of that experience. > - a handful of XIDs at most - because, on the average, transactions > are going to commit in *approximately* increasing XID order This assumption quickly turns false if you happen to have just one long-running transaction, I think. Or in general, if transaction duration varies a lot. > So the backend taking a snapshot only needs > to be able to copy < ~64 bytes of information from the ring buffer > before other backends write ~27k of data into that buffer, likely > requiring hundreds of other commits. You said earlier that "only the latest snapshot" is required. It takes only a single commit for such a snapshot to not be the latest anymore. Instead, if you keep around older snapshots for some time - as your description here implies - readers are free to copy from those older snapshots while other backends are able to make progress concurrently (writers or readers of other snapshots). However, that either requires keeping track of readers of a certain snapshot (reference counting) or - as I understand your description - you simply invalidate all concurrent readers upon wrap-around, or something. > That seems vanishingly unlikely; Agreed. > Now, as the size of the snapshot gets bigger, things will eventually > become less good. Also keep configurations with increased max_connections in mind. With that, not only do the snapshots get bigger, but more processes have to share CPU time, on average making memcpy slower for a single process. > Of course even if wraparound turns out not to be a problem there are > other things that could scuttle this whole approach, but I think the > idea has enough potential to be worth testing. If the whole thing > crashes and burns I hope I'll at least learn enough along the way to > design something better... That's always a good motivation. In that sense: happy hacking! Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
On Thu, Aug 25, 2011 at 9:30 AM, Dougal Sutherland wrote: > Sure, I like that more. OK, done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Short document fix
On Tue, Aug 23, 2011 at 4:53 PM, Hitoshi Harada wrote: > In the CREATE DOMAIN reference page of the current HEAD, it says > > --- > CREATE DOMAIN us_postal_code AS TEXT > CHECK( > VALUE ~ '^\\d{5}$' > OR VALUE ~ '^\\d{5}-\\d{4}$' > ); > --- > > but I believe it should conform to the standard string style now that the > default is standard_conforming_strings = on. I agree. Done. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
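With standard_conforming_strings = on, the backslashes are no longer doubled, so the corrected example reads:

CREATE DOMAIN us_postal_code AS TEXT
CHECK(
   VALUE ~ '^\d{5}$'
   OR VALUE ~ '^\d{5}-\d{4}$'
);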
Re: [HACKERS] patch to slightly improve clarity of a comment in postgresql.conf.sample
On Wed, Aug 24, 2011 at 7:33 AM, Dougal Sutherland wrote: > The attached change to postgresql.conf.sample makes it more clear at a > glance that the default value of listen_addresses is 'localhost', not > 'localhost, *'. This would have saved a friend an hour or two of fiddling > tonight. How about: # defaults to 'localhost'; use '*' for all -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
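The resulting sample-file entry would then read roughly as follows (a sketch; the exact committed wording may differ):

#listen_addresses = 'localhost'		# what IP address(es) to listen on;
					# comma-separated list of addresses;
					# defaults to 'localhost'; use '*' for all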
Re: [HACKERS] cheaper snapshots redux
On Thu, Aug 25, 2011 at 1:55 AM, Markus Wanner wrote: >> One difference with snapshots is that only the latest snapshot is of >> any interest. > > Theoretically, yes. But as far as I understood, you proposed the > backends copy that snapshot to local memory. And copying takes some > amount of time, possibly being interrupted by other backends which add > newer snapshots... Or do you envision the copying to restart whenever a > new snapshot arrives? My hope (and it might turn out that I'm an optimist) is that even with a reasonably small buffer it will be very rare for a backend to experience a wraparound condition. For example, consider a buffer with ~6500 entries, approximately 64 * MaxBackends, the approximate size of the current subxip arrays taken in aggregate. I hypothesize that a typical snapshot on a running system is going to be very small - a handful of XIDs at most - because, on the average, transactions are going to commit in *approximately* increasing XID order and, if you take the regression tests as representative of a real workload, only a small fraction of transactions will have more than one XID. So it seems believable to think that the typical snapshot on a machine with max_connections=100 might only be ~10 XIDs, even if none of the backends are read-only. So the backend taking a snapshot only needs to be able to copy < ~64 bytes of information from the ring buffer before other backends write ~27k of data into that buffer, likely requiring hundreds of other commits. That seems vanishingly unlikely; memcpy() is very fast. If it does happen, you can recover by retrying, but it should be a once-in-a-blue-moon kind of thing. I hope. Now, as the size of the snapshot gets bigger, things will eventually become less good. For example if you had a snapshot with 6000 XIDs in it then every commit would need to write over the previous snapshot and things would quickly deteriorate. But you can cope with that situation using the same mechanism we already use to handle big snapshots: toss out all the subtransaction IDs, mark the snapshot as overflowed, and just keep the toplevel XIDs. Now you've got at most ~100 XIDs to worry about, so you're back in the safety zone. That's not ideal in the sense that you will cause more pg_subtrans lookups, but that's the price you pay for having a gazillion subtransactions floating around, and any system is going to have to fall back on some sort of mitigation strategy at some point. There's no useful limit on the number of subxids a transaction can have, so unless you're prepared to throw an unbounded amount of memory at the problem you're going to eventually have to punt. It seems to me that the problem case is when you are just on the edge. Say you have 1400 XIDs in the snapshot. If you compact the snapshot down to toplevel XIDs, most of those will go away and you won't have to worry about wraparound - but you will pay a performance penalty in pg_subtrans lookups. On the other hand, if you don't compact the snapshot, it's not that hard to imagine a wraparound occurring - four snapshot rewrites could wrap the buffer. You would still hope that memcpy() could finish in time, but if you're rewriting 1400 XIDs with any regularity, it might not take that many commits to throw a spanner into the works. If the system is badly overloaded and the backend trying to take a snapshot gets descheduled for a long time at just the wrong moment, it doesn't seem hard to imagine a wraparound happening. Now, it's not hard to recover from a wraparound. 
In fact, we can pretty easily guarantee that any given attempt to take a snapshot will suffer a wraparound at most once. The writers (who are committing) have to be serialized anyway, so anyone who suffers a wraparound can just grab the same lock in shared mode and retry its snapshot. Now concurrency decreases significantly, because no one else is allowed to commit until that guy has got his snapshot, but right now that's true *every time* someone wants to take a snapshot, so falling back to that strategy occasionally doesn't seem prohibitively bad. However, you don't want it to happen very often, because even leaving aside the concurrency hit, it's double work: you have to try to take a snapshot, realize you've had a wraparound, and then retry. It seems pretty clear that with a big enough ring buffer the wraparound problem will become so infrequent as to be not worth worrying about. I'm theorizing that even with a quite small ring buffer the problem will still be infrequent enough not to worry about, but that might be optimistic. I think I'm going to need some kind of test case that generates very large, frequently changing snapshots. Of course even if wraparound turns out not to be a problem there are other things that could scuttle this whole approach, but I think the idea has enough potential to be worth testing. If the whole thing crashes and burns I hope I'll at least learn enough along the way to design something better...
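A minimal C sketch of the lock-free reader side described in this mail; every name here is invented, memory barriers and the writer are omitted, and the validation scheme is just one way to detect the wraparound case:

#include <stdint.h>
#include <stdbool.h>

#define RINGBUF_SIZE 65536		/* illustrative; e.g. ~64 * MaxBackends XIDs */

/* Writers append under their existing commit lock and advance the
 * monotonically increasing counters; readers take no lock at all. */
typedef struct
{
	volatile uint64_t written;		/* total bytes ever appended */
	volatile uint64_t snap_start;	/* offset of the latest snapshot */
	volatile uint64_t snap_len;		/* its length in bytes */
	char		data[RINGBUF_SIZE];
} SnapshotRing;

/*
 * Copy the latest snapshot out of the ring. Returns false if writers
 * lapped us mid-copy, in which case the caller retries - eventually
 * under the writers' lock, as described above.
 */
static bool
copy_latest_snapshot(SnapshotRing *ring, char *dest, uint64_t *len)
{
	uint64_t	start = ring->snap_start;
	uint64_t	n = ring->snap_len;
	uint64_t	i;

	for (i = 0; i < n; i++)
		dest[i] = ring->data[(start + i) % RINGBUF_SIZE];

	*len = n;

	/* If commits have advanced more than a full buffer past our start,
	 * part of what we copied was overwritten mid-copy: report failure. */
	return ring->written - start <= RINGBUF_SIZE;
}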
Re: [HACKERS] Inputting relative datetimes
On 25 August 2011 10:43, Vik Reykja wrote: > On Thu, Aug 25, 2011 at 11:39, Dean Rasheed > wrote: >> >> My first thought was to have some general way of adding or subtracting >> an interval at the end of an input timestamp, eg. by adding another >> couple of special values - "plus <interval>" and "minus <interval>". >> This would allow things like: >> >> TIMESTAMPTZ 'today minus 5 days' >> TIMESTAMPTZ 'now plus 2 hours' > > Funny you should mention intervals... > > timestamptz 'today' - interval '5 days' > timestamptz 'now' + interval '2 hours' > Yes, but what I am trying to achieve is a way of entering such relative timestamps using a single input value, so that absolute and relative timestamps can both be bound to a SQL query using just one variable. Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
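Pending any native support, the single-value requirement can be met application-side. Here is a hedged PL/pgSQL sketch; the function name and the "plus"/"minus" grammar are invented for illustration:

CREATE FUNCTION relative_timestamptz(input text) RETURNS timestamptz
LANGUAGE plpgsql STABLE AS $$
DECLARE
    m text[];
BEGIN
    -- Split a trailing "plus <interval>" or "minus <interval>" off the
    -- timestamp part; fall back to a plain cast when neither appears.
    m := regexp_matches(input, '^(.*?) (plus|minus) (.*)$');
    IF m IS NULL THEN
        RETURN input::timestamptz;
    ELSIF m[2] = 'plus' THEN
        RETURN m[1]::timestamptz + m[3]::interval;
    ELSE
        RETURN m[1]::timestamptz - m[3]::interval;
    END IF;
END;
$$;

-- e.g. SELECT relative_timestamptz('today minus 5 days');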
Re: [HACKERS] Inputting relative datetimes
On Thu, Aug 25, 2011 at 11:39, Dean Rasheed wrote: > My first thought was to have some general way of adding or subtracting > an interval at the end of an input timestamp, eg. by adding another > couple of special values - "plus <interval>" and "minus <interval>". > This would allow things like: > > TIMESTAMPTZ 'today minus 5 days' > TIMESTAMPTZ 'now plus 2 hours' > Funny you should mention intervals... timestamptz 'today' - interval '5 days' timestamptz 'now' + interval '2 hours'
[HACKERS] Inputting relative datetimes
As background, I have an app that accepts user text input and casts it to a timestamp in order to produce reports. I use PostgreSQL's timestamp input conversion for this, since it gives a lot of flexibility, and can parse pretty much anything the users throw at it. It is also handy that it recognizes special case values like "now", "today", "tomorrow" and "yesterday". However, I can't see any way of entering more general relative timestamps like "5 days ago" or "2 hours from now". Obviously I can enhance my app by writing my own input function to support relative timestamps, but I wonder if this is something that would be more generally useful if PostgreSQL supported it natively. If so, what should the syntax be? My first thought was to have some general way of adding or subtracting an interval at the end of an input timestamp, eg. by adding another couple of special values - "plus <interval>" and "minus <interval>". This would allow things like: TIMESTAMPTZ 'today minus 5 days' TIMESTAMPTZ 'now plus 2 hours' It seems a bit clunky to have to spell out "plus" and "minus", but I think that using the symbols + and - would be impossible to parse because of the ambiguity with timezones. Thoughts? Better ideas? Regards, Dean -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers