Re: [HACKERS] [GENERAL] Multiple Slave Failover with PITR
On Sun, Sep 2, 2012 at 5:12 AM, Bruce Momjian wrote:
> Do we ever want to document a way to connect slaves to a new master,
> rather than recreating the slave?

Please, please please do so. And hopefully it'll be less tricky sooner than later.

-- 
fdr
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On Mon, Sep 03, 2012 at 12:11:20AM -0400, Tom Lane wrote:
> Noah Misch writes:
> > I don't think it's libpq's job to enforce security policy this way. In any
> > event, it has no reason to presume an environment variable is a safer source.
>
> One easy thing we could do that would at least narrow the risks is to
> only allow the executable's *directory* to be specified, hardwiring the
> executable file name to "postgres" (or "postgres.exe" I guess).

If the user has any interactive access to the machine, he could make a /tmp/X/postgres symbolic link to the program of his choosing.

> I'm inclined to think though that environment variables *are* more
> secure in this context. In the contrib/dblink example, such a
> restriction would certainly help a lot.

dblink_connect() should only let superusers specify standalone_datadir at all; normal users have no business manipulating other data directories visible to the backend running dblink. For superusers, setting both standalone_datadir and standalone_backend is fair game.

Trusting the environment over connection strings is the wrong policy for, say, a setuid command-line program. Suppose such a program passes libpq a fixed connection string to access its embedded database. The program will find itself explicitly clearing this environment variable to regain safety.

> In any case, we should at
> least think of a way that an app using libpq can prevent this type
> of attack short of parsing the conn string for itself.

My recommendation to affected application authors is to pass the connection string through PQconninfoParse(). Walk the array, validating or rejecting arguments like "host" and "standalone_datadir". For maximum safety, the application would need to reject any unanticipated parameters. We might soften that by adding a "bool critical" field to PQconninfoOption that documents whether the option must be understood by a program validating a connection. Compare the idea of the PNG chunk header's "ancillary" bit. Parameters like "host" and "standalone_datadir" would be critical, while "application_name" and "connect_timeout" would be ancillary. But this is a tough line to draw rigorously.

I was pondering the flexibility we would gain from letting the application fork/exec and supply the client-side descriptor number(s) to libpq. Suppose it wants to redirect the backend's stderr to a file. A single-threaded application would temporarily redirect its own stderr while calling PQconnectdb(). In a multithreaded application, that introduces a race condition when other threads concurrently write to the normal stderr. By handling redirection in its own fork/exec code, the application can achieve the outcome safely. Perhaps offering this can wait until the feature constitutes a more general embedded-database offering, though.

Thanks,
nm
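A minimal sketch of the PQconninfoParse() approach Noah recommends, using only documented libpq entry points (PQconninfoParse, PQconninfoFree, PQfreemem); the whitelist of accepted keywords here is purely illustrative:

    #include <stdio.h>
    #include <string.h>
    #include <libpq-fe.h>

    /* Return 1 only if every set parameter in conninfo is whitelisted. */
    static int
    conninfo_is_safe(const char *conninfo)
    {
        static const char *allowed[] = {"host", "port", "dbname", "user",
                                        "password", "application_name", NULL};
        char       *errmsg = NULL;
        PQconninfoOption *opts = PQconninfoParse(conninfo, &errmsg);
        int         ok = 1;

        if (opts == NULL)
        {
            fprintf(stderr, "bad conninfo: %s\n", errmsg ? errmsg : "?");
            if (errmsg)
                PQfreemem(errmsg);
            return 0;
        }

        for (PQconninfoOption *o = opts; o->keyword != NULL; o++)
        {
            int         found = 0;

            if (o->val == NULL)         /* parameter not set; ignore */
                continue;
            for (const char **a = allowed; *a != NULL; a++)
                if (strcmp(o->keyword, *a) == 0)
                    found = 1;
            if (!found)                 /* reject e.g. standalone_datadir */
                ok = 0;
        }
        PQconninfoFree(opts);
        return ok;
    }

    int main(void)
    {
        const char *cs = "host=localhost dbname=test standalone_datadir=/tmp/x";

        printf("%s\n", conninfo_is_safe(cs) ? "accepted" : "rejected");
        return 0;
    }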
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
Noah Misch writes:
> Windows does not have socketpair(), nor a strict pipe() equivalent. I expect
> switching to socketpair() makes the Windows side trickier in some ways and
> simpler in others. +1 for exploring that direction first.

A bit of googling suggests that emulating socketpair() on Windows is not that hard: basically you create an accepting socket and connect to it. Ugly I guess, but likely to be nicer than emulating the two-pipes trick exactly.

			regards, tom lane
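For reference, the emulation Tom describes usually looks about like the following rough Winsock sketch; it assumes WSAStartup() has already been called, and it omits the peer-verification and error handling a production version would need:

    #include <winsock2.h>
    #include <string.h>

    /* Emulate socketpair() on Windows: listen on an ephemeral loopback
     * port, connect to it, and accept.  The connected pair then behaves
     * like a bidirectional AF_UNIX socketpair. */
    static int
    win32_socketpair(SOCKET sv[2])
    {
        SOCKET      listener;
        struct sockaddr_in addr;
        int         len = sizeof(addr);

        listener = socket(AF_INET, SOCK_STREAM, 0);
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        addr.sin_port = 0;              /* let the OS pick a free port */

        if (bind(listener, (struct sockaddr *) &addr, sizeof(addr)) != 0 ||
            listen(listener, 1) != 0 ||
            getsockname(listener, (struct sockaddr *) &addr, &len) != 0)
            return -1;

        sv[0] = socket(AF_INET, SOCK_STREAM, 0);
        if (connect(sv[0], (struct sockaddr *) &addr, sizeof(addr)) != 0)
            return -1;
        sv[1] = accept(listener, NULL, NULL);   /* should verify peer is us */
        closesocket(listener);
        return (sv[1] == INVALID_SOCKET) ? -1 : 0;
    }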
Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas
On Sun, Sep 2, 2012 at 5:39 PM, Jeff Janes wrote:
> On Thu, Aug 30, 2012 at 8:17 PM, Tom Lane wrote:
>> Robert Haas writes:
>>> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane wrote:
>>>> Bruce Momjian writes:
>>>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>>>> Ok, I modified the part of pg_dump where tremendous number of LOCK
>>>>>> TABLE are issued. I replace them with single LOCK TABLE with multiple
>>>>>> tables. With 100k tables LOCK statements took 13 minutes in total, now
>>>>>> it only takes 3 seconds. Comments?
>>>>> Was this applied?
>>>> No, we fixed the server side instead.
>>> But only for 9.2, right? So people running back branches are still screwed.
>> Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
>> behaviors involved here, not all of which are masked by what Tatsuo-san
>> suggested.
>
> All of the other ones that I know of were associated with pg_dump
> itself, and since it is recommended to run the newer version of
> pg_dump against the older version of the server, no back patching
> would be necessary to get the benefits of those particular fixes.
>
>> Six months or a year from now, we might have enough confidence in that
>> batch of 9.2 fixes to back-port them en masse. Don't want to do it
>> today though.
>
> What would be the recommendation for people trying to upgrade, but who
> can't get their data out in a reasonable window?
>
> Putting Tatsuo-san's change into a future pg_dump might be more
> conservative than back-porting the server's Lock Table change to the
> server version they are trying to get rid of.

What he said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] pg_upgrade test mods for Windows/Mingw
Gurjeet Singh writes:
> [ this needs a comment: ]
> `uname -a | sed 's/.* //'` = Msys

Also, how about doing that just once and setting a variable to test as needed later? Multiple copied-and-pasted instances of line noise like that are risky.

			regards, tom lane
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
Noah Misch writes:
> On Sun, Sep 02, 2012 at 11:34:42PM -0400, Tom Lane wrote:
>> Heikki Linnakangas writes:
>>> Are there security issues with this? If a user can specify libpq
>>> connection options, he can now execute any file he wants by passing it
>>> as standalone_backend.
>> Hmm, that's a good point. Maybe we should only allow the executable
>> name to come from an environment variable? Seems kinda klugy though.
> I don't think it's libpq's job to enforce security policy this way. In any
> event, it has no reason to presume an environment variable is a safer source.

One easy thing we could do that would at least narrow the risks is to only allow the executable's *directory* to be specified, hardwiring the executable file name to "postgres" (or "postgres.exe" I guess).

I'm inclined to think though that environment variables *are* more secure in this context. In the contrib/dblink example, such a restriction would certainly help a lot.

In any case, we should at least think of a way that an app using libpq can prevent this type of attack short of parsing the conn string for itself.

			regards, tom lane
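To visualize the directory-only restriction being floated: the connection knob shrinks to a directory, and libpq hardwires the file name, roughly as below. This is an illustrative sketch only, not actual libpq code, and note Noah Misch's objection elsewhere in the thread that a symlink such as /tmp/X/postgres defeats it:

    #include <stdio.h>

    /* Build the backend path from a caller-supplied directory,
     * hardwiring the executable name (hypothetical helper). */
    static void
    standalone_backend_path(char *buf, size_t buflen, const char *dir)
    {
    #ifdef WIN32
        snprintf(buf, buflen, "%s\\postgres.exe", dir);
    #else
        snprintf(buf, buflen, "%s/postgres", dir);
    #endif
    }

    int main(void)
    {
        char        path[1024];

        standalone_backend_path(path, sizeof(path), "/usr/local/pgsql/bin");
        puts(path);             /* /usr/local/pgsql/bin/postgres */
        return 0;
    }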
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On Sun, Sep 02, 2012 at 11:34:42PM -0400, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 03.09.2012 03:23, Tom Lane wrote:
>>> 1. As you can see above, the feature is triggered by specifying the new
>>> connection option "standalone_datadir", whose value must be the location
>>> of the data directory. I also invented an option "standalone_backend",
>>> which can be set to specify which postgres executable to launch.
>> Are there security issues with this? If a user can specify libpq
>> connection options, he can now execute any file he wants by passing it
>> as standalone_backend. Granted, you shouldn't allow an untrusted user to
>> specify libpq connection options, because allowing to open a TCP
>> connection to an arbitrary address can be a problem by itself, but it
>> seems like this might make the situation much worse. contrib/dblink
>> springs to mind..
> Hmm, that's a good point. Maybe we should only allow the executable
> name to come from an environment variable? Seems kinda klugy though.

I don't think it's libpq's job to enforce security policy this way. In any event, it has no reason to presume an environment variable is a safer source.

>>> 3. The bulk of the changes have to do with the fact that we need to keep
>>> track of two file descriptors not one.
>> Would socketpair(2) be simpler?
> Hm, yes, but is it portable enough? It seems to be required by SUS v2,
> so we're likely okay on the Unix side, but does Windows have this?

Windows does not have socketpair(), nor a strict pipe() equivalent. I expect switching to socketpair() makes the Windows side trickier in some ways and simpler in others. +1 for exploring that direction first.
Re: [HACKERS] pg_upgrade bugs
Alvaro Herrera writes:
> Maybe, to reduce future backpatching pain, we could backpatch the change
> to exec_prog() API now that you have fixed the implementation?

I'm inclined to think this is a good idea, but keep in mind we're less than four days from wrapping 9.2.0. There's not a lot of margin for error here. At the same time, getting pg_upgrade to pass regression on Windows would be a good thing, and ought to go a long way towards ameliorating worries about this.

			regards, tom lane
Re: [HACKERS] pg_upgrade test mods for Windows/Mingw
On Sun, Sep 2, 2012 at 11:29 PM, Andrew Dunstan wrote:
> The attached patch is what I had to do to get pg_upgrade's "make check" to
> run on Windows under Mingw. Mostly the changes have to do with getting
> paths right between Windows and MSys, or calling generated .bat files
> instead of shell scripts.

When reading shell script code like this

    `uname -a | sed 's/.* //'` = Msys

and

    sed -i -e 's,/,\\,g' -e 's,\\s\\q ,/s/q ,' delete_old_cluster.bat 2>/dev/null

I find it easier to understand and maintain if the comments also describe the original string format that this pattern-matching expects, like:

    # We expect `uname -a` output like:
    # Windows_NT4.0 Msys

and

    # We expect lines of the format:
    # abc/xyz/def/
    # and we convert them to
    # abc\xyz\def

BTW, would `uname -o` eliminate the need for pattern matching in the first snippet? The Wikipedia article [1] suggests so.

[1] http://en.wikipedia.org/wiki/Uname

Best regards,
-- 
Gurjeet Singh
Re: [HACKERS] Yet another failure mode in pg_upgrade
Bruce Momjian writes:
> Updated patch attached.

[ looks at that for a bit... ] Now I see why you were on about that: the method you used here requires both clusters to have the same socket directory. Which is silly and unnecessary. Revised patch attached.

			regards, tom lane

diff --git a/contrib/pg_upgrade/check.c b/contrib/pg_upgrade/check.c
index 0fec73e..efb080b 100644
*** a/contrib/pg_upgrade/check.c
--- b/contrib/pg_upgrade/check.c
*** issue_warnings(char *sequence_script_fil
*** 184,191 ****
  {
  	prep_status("Adjusting sequences");
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/psql\" " EXEC_PSQL_ARGS " --port %d --username \"%s\" -f \"%s\"",
! 			  new_cluster.bindir, new_cluster.port, os_info.user,
  			  sequence_script_file_name);
  	unlink(sequence_script_file_name);
  	check_ok();
--- 184,191 ----
  {
  	prep_status("Adjusting sequences");
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/psql\" " EXEC_PSQL_ARGS " %s -f \"%s\"",
! 			  new_cluster.bindir, cluster_conn_opts(&new_cluster),
  			  sequence_script_file_name);
  	unlink(sequence_script_file_name);
  	check_ok();
diff --git a/contrib/pg_upgrade/dump.c b/contrib/pg_upgrade/dump.c
index cfc4017..b905ab0 100644
*** a/contrib/pg_upgrade/dump.c
--- b/contrib/pg_upgrade/dump.c
*** generate_old_dump(void)
*** 24,31 ****
  	 * restores the frozenid's for databases and relations.
  	 */
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/pg_dumpall\" --port %d --username \"%s\" --schema-only --binary-upgrade %s -f %s",
! 			  new_cluster.bindir, old_cluster.port, os_info.user,
  			  log_opts.verbose ? "--verbose" : "",
  			  ALL_DUMP_FILE);
  	check_ok();
--- 24,31 ----
  	 * restores the frozenid's for databases and relations.
  	 */
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/pg_dumpall\" %s --schema-only --binary-upgrade %s -f %s",
! 			  new_cluster.bindir, cluster_conn_opts(&old_cluster),
  			  log_opts.verbose ? "--verbose" : "",
  			  ALL_DUMP_FILE);
  	check_ok();
diff --git a/contrib/pg_upgrade/option.c b/contrib/pg_upgrade/option.c
index 94bce50..80f7d34 100644
*** a/contrib/pg_upgrade/option.c
--- b/contrib/pg_upgrade/option.c
*** adjust_data_dir(ClusterInfo *cluster)
*** 376,378 ****
--- 376,439 ----
  	check_ok();
  }
+ 
+ 
+ /*
+  * get_sock_dir
+  *
+  * Identify the socket directory to use for this cluster.  If we're doing
+  * a live check (old cluster only), we need to find out where the postmaster
+  * is listening.  Otherwise, we're going to put the socket into the current
+  * directory.
+  */
+ void
+ get_sock_dir(ClusterInfo *cluster, bool live_check)
+ {
+ #if HAVE_UNIX_SOCKETS
+ 	if (!live_check)
+ 	{
+ 		/* Use the current directory for the socket */
+ 		cluster->sockdir = pg_malloc(MAXPGPATH);
+ 		if (!getcwd(cluster->sockdir, MAXPGPATH))
+ 			pg_log(PG_FATAL, "cannot find current directory\n");
+ 	}
+ 	else
+ 	{
+ 		/*
+ 		 * If we are doing a live check, we will use the old cluster's Unix
+ 		 * domain socket directory so we can connect to the live server.
+ 		 */
+ 
+ 		/* sockdir added to the 5th line of postmaster.pid in PG 9.1 */
+ 		if (GET_MAJOR_VERSION(cluster->major_version) >= 901)
+ 		{
+ 			char		filename[MAXPGPATH];
+ 			FILE	   *fp;
+ 			int			i;
+ 
+ 			snprintf(filename, sizeof(filename), "%s/postmaster.pid",
+ 					 cluster->pgdata);
+ 			if ((fp = fopen(filename, "r")) == NULL)
+ 				pg_log(PG_FATAL, "Could not get socket directory of the old server\n");
+ 
+ 			cluster->sockdir = pg_malloc(MAXPGPATH);
+ 			for (i = 0; i < 5; i++)
+ 				if (fgets(cluster->sockdir, MAXPGPATH, fp) == NULL)
+ 					pg_log(PG_FATAL, "Could not get socket directory of the old server\n");
+ 
+ 			fclose(fp);
+ 
+ 			/* Remove trailing newline */
+ 			if (strchr(cluster->sockdir, '\n') != NULL)
+ 				*strchr(cluster->sockdir, '\n') = '\0';
+ 		}
+ 		else
+ 		{
+ 			/* Can't get live sockdir, so assume the default is OK. */
+ 			cluster->sockdir = NULL;
+ 		}
+ 	}
+ #else							/* !HAVE_UNIX_SOCKETS */
+ 	cluster->sockdir = NULL;
+ #endif
+ }
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
index c47c8bb..ee3a151 100644
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*** main(int argc, char **argv)
*** 88,93 ****
--- 88,96 ----
  	check_cluster_versions();
  	check_cluster_compatibility(live_check);
  
+ 	get_sock_dir(&old_cluster, live_check);
+ 	get_sock_dir(&new_cluster, false);
+ 
  	check_old_cluster(live_check, &sequence_script_file_name);
*** prepare_new_cluster(void)
*** 211,218 ****
  	 */
  	prep_status("Analyzing all rows in the new cluster");
  	exec_prog(UTILITY_LOG_FILE, NULL, true,
! 			  "\"%s/vacuumdb\" --port %d --username \"%s\" --all --analyze %s",
! 			  new_cluster.bindir, new_cluster.port, os_info.user,
  			  log_opts.verbose ? "--verbose" : "");
  	check_ok();
--- 214,221 ----

[patch truncated in the archive]
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
Heikki Linnakangas writes:
> On 03.09.2012 03:23, Tom Lane wrote:
>> 1. As you can see above, the feature is triggered by specifying the new
>> connection option "standalone_datadir", whose value must be the location
>> of the data directory. I also invented an option "standalone_backend",
>> which can be set to specify which postgres executable to launch.
> Are there security issues with this? If a user can specify libpq
> connection options, he can now execute any file he wants by passing it
> as standalone_backend. Granted, you shouldn't allow an untrusted user to
> specify libpq connection options, because allowing to open a TCP
> connection to an arbitrary address can be a problem by itself, but it
> seems like this might make the situation much worse. contrib/dblink
> springs to mind..

Hmm, that's a good point. Maybe we should only allow the executable name to come from an environment variable? Seems kinda klugy though.

>> 3. The bulk of the changes have to do with the fact that we need to keep
>> track of two file descriptors not one.
> Would socketpair(2) be simpler?

Hm, yes, but is it portable enough? It seems to be required by SUS v2, so we're likely okay on the Unix side, but does Windows have this?

			regards, tom lane
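A minimal sketch of what the socketpair() variant could look like on the Unix side — each process tracks one bidirectional descriptor instead of two pipe ends (illustrative only, not the patch's actual code; note that, unlike pipes, send() and recv() also work on these descriptors):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/wait.h>

    int main(void)
    {
        int         sv[2];
        pid_t       pid;
        char        buf[64];

        /* One call replaces two pipe() calls: sv[0] and sv[1] are each
         * readable *and* writable, so both sides keep a single fd. */
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        {
            perror("socketpair");
            return 1;
        }

        pid = fork();
        if (pid == 0)                   /* "backend": uses sv[1] only */
        {
            close(sv[0]);
            if (read(sv[1], buf, sizeof(buf)) > 0)
                write(sv[1], "pong", 4);
            _exit(0);
        }

        close(sv[1]);                   /* "libpq": uses sv[0] only */
        write(sv[0], "ping", 4);
        ssize_t n = read(sv[0], buf, sizeof(buf));
        printf("got %.*s\n", (int) n, buf);
        wait(NULL);
        return 0;
    }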
[HACKERS] pg_upgrade test mods for Windows/Mingw
The attached patch is what I had to do to get pg_upgrade's "make check" to run on Windows under Mingw. Mostly the changes have to do with getting paths right between Windows and MSys, or calling generated .bat files instead of shell scripts.

cheers

andrew

diff --git a/contrib/pg_upgrade/test.sh b/contrib/pg_upgrade/test.sh
index 31e30af..fc2304a 100644
--- a/contrib/pg_upgrade/test.sh
+++ b/contrib/pg_upgrade/test.sh
@@ -43,13 +43,23 @@ if [ "$1" = '--install' ]; then
 	export EXTRA_REGRESS_OPTS
 fi
 
-: ${oldbindir=$bindir}
-
 : ${oldsrc=../..}
-oldsrc=`cd "$oldsrc" && pwd`
-newsrc=`cd ../.. && pwd`
 
-PATH=$bindir:$PATH
+if [ `uname -a | sed 's/.* //'` = Msys ] ; then
+	cp $libdir/libpq.dll $bindir
+	osbindir=`cd $bindir && pwd`
+	bindir=`cd $bindir && pwd -W`
+	oldsrc=`cd "$oldsrc" && pwd -W`
+	newsrc=`cd ../.. && pwd -W`
+else
+	osbindir=$bindir
+	oldsrc=`cd "$oldsrc" && pwd`
+	newsrc=`cd ../.. && pwd`
+fi
+
+: ${oldbindir=$bindir}
+
+PATH=$osbindir:$PATH
 export PATH
 
 PGDATA=$temp_root/data
@@ -104,10 +114,19 @@ mv "${PGDATA}" "${PGDATA}.old"
 
 initdb
 
+if [ `uname -a | sed 's/.* //'` = Msys ] ; then
+	PGDATA=`cd $PGDATA && pwd -W`
+fi
+
 pg_upgrade -d "${PGDATA}.old" -D "${PGDATA}" -b "$oldbindir" -B "$bindir"
 
 pg_ctl start -l "$logdir/postmaster2.log" -w
-sh ./analyze_new_cluster.sh
+
+if [ `uname -a | sed 's/.* //'` = Msys ] ; then
+	cmd /c analyze_new_cluster.bat
+else
+	sh ./analyze_new_cluster.sh
+fi
 pg_dumpall >"$temp_root"/dump2.sql || pg_dumpall2_status=$?
 pg_ctl -m fast stop
 if [ -n "$pg_dumpall2_status" ]; then
@@ -115,7 +134,15 @@ if [ -n "$pg_dumpall2_status" ]; then
 	exit 1
 fi
 
-sh ./delete_old_cluster.sh
+if [ `uname -a | sed 's/.* //'` = Msys ] ; then
+	sed -i -e 's,/,\\,g' -e 's,\\s\\q ,/s/q ,' delete_old_cluster.bat 2>/dev/null
+	chmod +w delete_old_cluster.bat
+	cmd /c delete_old_cluster.bat
+	dos2unix "$temp_root"/dump1.sql >/dev/null
+	dos2unix "$temp_root"/dump2.sql >/dev/null
+else
+	sh ./delete_old_cluster.sh
+fi
 
 if diff -q "$temp_root"/dump1.sql "$temp_root"/dump2.sql; then
 	echo PASSED
Re: [HACKERS] pg_upgrade bugs
Excerpts from Andrew Dunstan's message of dom sep 02 22:11:42 -0300 2012:
> I have been wrestling for a couple of days trying to get pg_upgrade
> testing working on Windows, with a view to having it tested on the
> buildfarm. The test script has its own issues, which I'll deal with
> separately, but there are two issues in pg_upgrade's exec.c that make me
> suspect that if pg_upgrade has ever worked at all on Windows it is a
> matter of sheer luck. The attached patch fixes these. The first issue is
> a plain miscall to strlcpy(), where the length argument is wrong. The
> second is where exec_prog tries to open a log file after the system call
> returns. This will fail if the command was a 'pg_ctl start', as the
> running postmaster will have the log file open, so I have simply
> #ifdef'd it out for the Windows case, as the code does nothing except
> add a couple of line feeds to the log, missing which won't affect
> anything much.

The first issue was clearly introduced by me in 088c065ce8e405fafbfa966937184ece9defcf20. The other one is probably also my fault. Bruce has tested this on Windows previously and fixed some previous bugs also introduced by me in the same area, so we know it works.

Maybe, to reduce future backpatching pain, we could backpatch the change to the exec_prog() API now that you have fixed the implementation?

> Barring objection I will commit this before long.

Thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Proof of concept: standalone backend with full FE/BE protocol
On 03.09.2012 03:23, Tom Lane wrote:
> 1. As you can see above, the feature is triggered by specifying the new
> connection option "standalone_datadir", whose value must be the location
> of the data directory. I also invented an option "standalone_backend",
> which can be set to specify which postgres executable to launch. If the
> latter isn't specified, libpq defaults to trying the installation PGBINDIR
> that was selected by configure. (I don't think it can use the relative
> path tricks we use in pg_ctl and elsewhere, since there's no good reason
> to assume that it's running in a Postgres-supplied program.) I'm not
> particularly wedded to these names or the syntax, but I think this is
> the basic functionality we'd need.

Are there security issues with this? If a user can specify libpq connection options, he can now execute any file he wants by passing it as standalone_backend. Granted, you shouldn't allow an untrusted user to specify libpq connection options, because allowing to open a TCP connection to an arbitrary address can be a problem by itself, but it seems like this might make the situation much worse. contrib/dblink springs to mind..

> 3. The bulk of the changes have to do with the fact that we need to keep
> track of two file descriptors not one. This was a bit tedious, but fairly
> straightforward --- though I was surprised to find that send() and recv()
> don't work on pipes, at least not on Linux. You have to use read() and
> write() instead.

Would socketpair(2) be simpler?

> 7. I haven't tried to make pg_upgrade use this yet.

Hmm, pg_upgrade uses pg_dumpall rather than pg_dump, so it needs to connect once per database. That means launching the standalone backend multiple times. I guess that works, as long as pg_dumpall doesn't try to hold multiple connections open at any one time.

- Heikki
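For concreteness, client code using the proposed feature would presumably look about like this; the standalone_datadir and standalone_backend option names come from the proposal above and are not a released libpq API:

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        /* Hypothetical: launch a standalone backend on the given data
         * directory instead of connecting to a running postmaster. */
        PGconn     *conn = PQconnectdb(
            "standalone_datadir=/var/lib/pgsql/data "
            "standalone_backend=/usr/local/pgsql/bin/postgres "
            "dbname=postgres");

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }
        /* ... issue queries exactly as over a normal connection ... */
        PQfinish(conn);
        return 0;
    }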
Re: [HACKERS] pg_upgrade bugs
Andrew Dunstan writes:
> I have been wrestling for a couple of days trying to get pg_upgrade
> testing working on Windows, with a view to having it tested on the
> buildfarm. The test script has its own issues, which I'll deal with
> separately, but there are two issues in pg_upgrade's exec.c that make me
> suspect that if pg_upgrade has ever worked at all on Windows it is a
> matter of sheer luck. The attached patch fixes these. The first issue is
> a plain miscall to strlcpy(), where the length argument is wrong. The
> second is where exec_prog tries to open a log file after the system call
> returns. This will fail if the command was a 'pg_ctl start', as the
> running postmaster will have the log file open, so I have simply
> #ifdef'd it out for the Windows case, as the code does nothing except
> add a couple of line feeds to the log, missing which won't affect
> anything much.

The strlcpy bug seems to be recently introduced --- I don't see it in 9.2. I think the other bit has not been there very long either, though it *is* in the 9.2 branch, so you'd better back-patch that part.

			regards, tom lane
[HACKERS] pg_upgrade bugs
I have been wrestling for a couple of days trying to get pg_upgrade testing working on Windows, with a view to having it tested on the buildfarm. The test script has its own issues, which I'll deal with separately, but there are two issues in pg_upgrade's exec.c that make me suspect that if pg_upgrade has ever worked at all on Windows it is a matter of sheer luck. The attached patch fixes these.

The first issue is a plain miscall to strlcpy(), where the length argument is wrong. The second is where exec_prog tries to open a log file after the system call returns. This will fail if the command was a 'pg_ctl start', as the running postmaster will have the log file open, so I have simply #ifdef'd it out for the Windows case, as the code does nothing except add a couple of line feeds to the log, missing which won't affect anything much.

Barring objection I will commit this before long.

cheers

andrew

diff --git a/contrib/pg_upgrade/exec.c b/contrib/pg_upgrade/exec.c
index c75d9db..c780b63 100644
--- a/contrib/pg_upgrade/exec.c
+++ b/contrib/pg_upgrade/exec.c
@@ -52,7 +52,7 @@ exec_prog(const char *log_file, const char *opt_log_file,
 
 	old_umask = umask(S_IRWXG | S_IRWXO);
 
-	written = strlcpy(cmd, SYSTEMQUOTE, strlen(SYSTEMQUOTE));
+	written = strlcpy(cmd, SYSTEMQUOTE, sizeof(cmd));
 	va_start(ap, fmt);
 	written += vsnprintf(cmd + written, MAXCMDLEN - written, fmt, ap);
 	va_end(ap);
@@ -95,10 +95,12 @@ exec_prog(const char *log_file, const char *opt_log_file,
 			   log_file);
 	}
 
+#ifndef WIN32
 	if ((log = fopen_priv(log_file, "a+")) == NULL)
 		pg_log(PG_FATAL, "cannot write to log file %s\n", log_file);
 	fprintf(log, "\n\n");
 	fclose(log);
+#endif
 
 	return result == 0;
 }
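To see why the first hunk matters: strlcpy()'s third argument is the size of the destination buffer, not of the source, so bounding it by strlen(SYSTEMQUOTE) copies one byte too few — and with SYSTEMQUOTE being a one-character string, nothing gets copied at all. A self-contained illustration, with a local BSD-style strlcpy definition since glibc lacks one (the string used is illustrative, not pg_upgrade's actual SYSTEMQUOTE value):

    #include <stdio.h>
    #include <string.h>

    /* BSD-style strlcpy: copy at most size-1 bytes, always
     * NUL-terminate, return strlen(src). */
    static size_t
    my_strlcpy(char *dst, const char *src, size_t size)
    {
        size_t      srclen = strlen(src);

        if (size > 0)
        {
            size_t      n = (srclen >= size) ? size - 1 : srclen;

            memcpy(dst, src, n);
            dst[n] = '\0';
        }
        return srclen;
    }

    int main(void)
    {
        char        cmd[64];

        /* The bug: bounding by strlen(src) silently drops the last
         * character of the source. */
        my_strlcpy(cmd, "cmd /c \"", strlen("cmd /c \""));
        printf("buggy: [%s]\n", cmd);   /* buggy: [cmd /c ] */

        /* The fix: bound by the destination buffer. */
        my_strlcpy(cmd, "cmd /c \"", sizeof(cmd));
        printf("fixed: [%s]\n", cmd);   /* fixed: [cmd /c "] */
        return 0;
    }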
Re: [HACKERS] PATCH: pgbench - aggregation of info written into log
On Sun, Sep 2, 2012 at 3:46 PM, Tomas Vondra wrote:
> Fixed. I've kept use_log_agg only and I've removed the default. Also
> I've added one more check (that the total duration is a multiple of
> the aggregation interval).

Hi Tomas,

In the docs, -l is an option, not an application: "-l"

Also, the docs still refer to -A, rather than to --aggregate-interval, in a few places.

I think a section in the docs that points out that transactions run by specifying multiple -f will be mingled together into an interval would be a good idea, as that could easily be overlooked (as I did earlier).

The docs do not mention anywhere what the units for the latencies are.

I think the format of the logfile should somehow make it unmistakably different from the regular -l logfile, for example by prefixing every line with "Aggregate: ". Or maybe there is some other solution, but I think that having two log formats, both of which consist of nothing but 6 integers yet mean very different things, is a recipe for confusion.

Is it unavoidable that the interval be an integral number of seconds? I found the units in the original code confusing, so maybe there is some reason it needs to be like that that I don't understand yet. I'll look into it some more.

Thanks,
Jeff
Re: [HACKERS] PATCH: pgbench - aggregation of info written into log
On 30.08.2012 18:02, Robert Haas wrote:
> On Fri, Aug 24, 2012 at 5:25 PM, Tomas Vondra wrote:
>> This patch is a bit less polished (and more complex) than the other
>> pgbench patch I've sent a while back, and I'm not sure how to handle
>> the Windows branch. That needs to be fixed during the commit fest.
> What's the problem with the Windows branch?
>
> Could you un-cuddle your braces to follow the PostgreSQL style, omit
> braces around single-statement blocks, and remove the spurious
> whitespace changes?

Done, or at least I don't see other formatting issues. Let me know if I missed something.

> Instead of having both use_log_agg and naggseconds, I think you can get
> by with just having a single variable that is zero if aggregation is not
> in use and is otherwise the aggregation period. On a related note, you
> can't specify -A without an associated value, so there is no point in
> documenting a "default". As with your other patch, I suggest using a
> long option name like --latency-aggregate-interval and then making the
> name of the variable in the code match the option name, with s/-/_/g,
> for clarity.

Fixed. I've kept use_log_agg only and I've removed the default. Also I've added one more check (that the total duration is a multiple of the aggregation interval).

And just as with the sampling patch, I believe "-l" should not be enabled by default, but required. But if more people ask to enable it whenever aggregation or sampling is used, I'm fine with it.

Tomas

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 00cab73..e044564 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -145,6 +145,7 @@ char	   *index_tablespace = NULL;
 #define naccounts	10
 
 bool		use_log;			/* log transaction latencies to a file */
+int			use_log_agg;		/* log aggregates instead of individual transactions */
 bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
@@ -240,6 +241,18 @@ typedef struct
 	char	   *argv[MAX_ARGS];	/* command word list */
 } Command;
 
+typedef struct
+{
+
+	long		start_time;		/* when does the interval start */
+	int			cnt;			/* number of transactions */
+	double		min_duration;	/* min/max durations */
+	double		max_duration;
+	double		sum;			/* sum(duration), sum(duration^2) - for estimates */
+	double		sum2;
+
+} AggVals;
+
 static Command **sql_files[MAX_FILES];	/* SQL script files */
 static int	num_files;			/* number of script files */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -364,6 +377,8 @@ usage(void)
 		   "  -f FILENAME      read transaction script from FILENAME\n"
 		   "  -j NUM           number of threads (default: 1)\n"
 		   "  -l               write transaction times to log file\n"
+		   "  --aggregate-interval NUM\n"
+		   "                   aggregate data over NUM seconds\n"
 		   "  -M simple|extended|prepared\n"
 		   "                   protocol for submitting queries to server (default: simple)\n"
 		   "  -n               do not run VACUUM before tests\n"
@@ -817,9 +832,22 @@ clientDone(CState *st, bool ok)
 	return false;				/* always false */
 }
 
+static
+void agg_vals_init(AggVals * aggs, instr_time start)
+{
+	aggs->cnt = 0;
+	aggs->sum = 0;
+	aggs->sum2 = 0;
+
+	aggs->min_duration = 3600 * 100.0;	/* one hour */
+	aggs->max_duration = 0;
+
+	aggs->start_time = INSTR_TIME_GET_DOUBLE(start);
+}
+
 /* return false iff client should be disconnected */
 static bool
-doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile)
+doCustom(TState *thread, CState *st, instr_time *conn_time, FILE *logfile, AggVals * agg)
 {
 	PGresult   *res;
 	Command   **commands;
@@ -881,17 +909,70 @@ top:
 			diff = now;
 			INSTR_TIME_SUBTRACT(diff, st->txn_begin);
 			usec = (double) INSTR_TIME_GET_MICROSEC(diff);
-
+
+			/* should we aggregate the results or not? */
+			if (use_log_agg)
+			{
+
+				/* are we still in the same interval? if yes, accumulate the
+				 * values (print them otherwise) */
+				if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
+				{

[patch truncated in the archive]
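Since the aggregated log keeps a count, a sum, and a sum of squares per interval (per the AggVals struct above), a consumer can recover the mean and a population standard deviation for each interval. A small post-processing sketch, not part of the patch itself:

    #include <math.h>
    #include <stdio.h>

    /* Recover mean and (population) standard deviation of transaction
     * latencies from one aggregated log line's cnt, sum, and sum2
     * fields.  Field meanings follow the AggVals struct in the patch. */
    static void
    print_interval_stats(long cnt, double sum, double sum2)
    {
        double      mean = sum / cnt;
        double      var = sum2 / cnt - mean * mean;     /* E[x^2] - E[x]^2 */

        printf("n=%ld mean=%.1f stddev=%.1f\n",
               cnt, mean, var > 0 ? sqrt(var) : 0.0);
    }

    int main(void)
    {
        /* example values, in the patch's latency units */
        print_interval_stats(100, 250000.0, 700000000.0);
        return 0;
    }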
Re: [HACKERS] PATCH: pgbench - random sampling of transaction written into log
On 31.08.2012 00:01, Tomas Vondra wrote:
> On 30 August 2012, 23:44, Robert Haas wrote:
>> On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra wrote:
>>> That sounds like a pretty trivial patch. I've been thinking about yet
>>> another option - histograms (regular or with exponential bins).
>> I thought about that, too, but I think high-outliers is a lot more
>> useful. At least for the kinds of things I worry about.
> OK, let's fix the current patches first. We can add more options later.

So, here is a fixed version of the patch. I've made these changes:

1) The option is now '--sampling-rate' instead of '-R' and accepts float arguments. I've decided not to use 0.01 = 1% but 1 = 1%, so it accepts values between 0 and 100. I find this more natural.

2) I've removed one of the two new variables; the remaining one is used both to check whether sampling is enabled and to keep the value.

3) I've decided not to enable "-l" whenever "--sampling-rate" is used. Again, I find this a bit less magic behavior.

4) I've fixed some formatting issues - if you notice another one that I don't see, let me know.

Tomas

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 00cab73..578aeb3 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -130,6 +130,11 @@ int			foreign_keys = 0;
 int			unlogged_tables = 0;
 
 /*
+ * use log sampling (rate => 1 = 100%, 0 = don't use)
+ */
+double		use_sample_rate = 0.0;
+
+/*
  * tablespace selection
  */
 char	   *tablespace = NULL;
@@ -364,6 +369,8 @@ usage(void)
 		   "  -f FILENAME      read transaction script from FILENAME\n"
 		   "  -j NUM           number of threads (default: 1)\n"
 		   "  -l               write transaction times to log file\n"
+		   "  --sampling-rate NUM\n"
+		   "                   sampling rate of the log (e.g. 0.01 for 1%% sample)\n"
 		   "  -M simple|extended|prepared\n"
 		   "                   protocol for submitting queries to server (default: simple)\n"
 		   "  -n               do not run VACUUM before tests\n"
@@ -877,21 +884,26 @@ top:
 			instr_time	diff;
 			double		usec;
 
-			INSTR_TIME_SET_CURRENT(now);
-			diff = now;
-			INSTR_TIME_SUBTRACT(diff, st->txn_begin);
-			usec = (double) INSTR_TIME_GET_MICROSEC(diff);
+			/* either no sampling or is within the sample */
+			if ((use_sample_rate == 0.0) || (pg_erand48(thread->random_state) <= use_sample_rate))
+			{
+
+				INSTR_TIME_SET_CURRENT(now);
+				diff = now;
+				INSTR_TIME_SUBTRACT(diff, st->txn_begin);
+				usec = (double) INSTR_TIME_GET_MICROSEC(diff);
 
 #ifndef WIN32
-			/* This is more than we really ought to know about instr_time */
-			fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
-					st->id, st->cnt, usec, st->use_file,
-					(long) now.tv_sec, (long) now.tv_usec);
+				/* This is more than we really ought to know about instr_time */
+				fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+						st->id, st->cnt, usec, st->use_file,
+						(long) now.tv_sec, (long) now.tv_usec);
 #else
-			/* On Windows, instr_time doesn't provide a timestamp anyway */
-			fprintf(logfile, "%d %d %.0f %d 0 0\n",
-					st->id, st->cnt, usec, st->use_file);
+				/* On Windows, instr_time doesn't provide a timestamp anyway */
+				fprintf(logfile, "%d %d %.0f %d 0 0\n",
+						st->id, st->cnt, usec, st->use_file);
 #endif
+			}
 		}
 
 		if (commands[st->state]->type == SQL_COMMAND)
@@ -1918,6 +1930,7 @@ main(int argc, char **argv)
 		{"index-tablespace", required_argument, NULL, 3},
 		{"tablespace", required_argument, NULL, 2},
 		{"unlogged-tables", no_argument, &unlogged_tables, 1},
+		{"sampling-rate", required_argument, NULL, 4},
 		{NULL, 0, NULL, 0}
 	};
@@ -2123,6 +2136,14 @@ main(int argc, char **argv)
 			case 3:				/* index-tablespace */
 				index_tablespace = optarg;
 				break;
+			case 4:
+				use_sample_rate = atof(optarg)/100;

[patch truncated in the archive]
[HACKERS] Fwd: [PERFORM] Loose Index Scans by Planner?
I'd like to create a ToDo item for "loose index scans" or "skip scans", for when the lead column has low cardinality and is not used in the where clause. This case can potentially be optimized by using the index as if it were a collection of N "partitioned" indexes, where N is the cardinality of the lead column. Any objections?

I don't really have a detailed plan on how to do it. I expect the planner part would be harder than the execution part.

See the "[PERFORM] Loose Index Scans by Planner" thread.

Thanks,
Jeff
Re: [HACKERS] [PERFORM] pg_dump and thousands of schemas
On Thu, Aug 30, 2012 at 8:17 PM, Tom Lane wrote:
> Robert Haas writes:
>> On Thu, Aug 30, 2012 at 4:51 PM, Tom Lane wrote:
>>> Bruce Momjian writes:
>>>> On Thu, May 31, 2012 at 09:20:43AM +0900, Tatsuo Ishii wrote:
>>>>> Ok, I modified the part of pg_dump where tremendous number of LOCK
>>>>> TABLE are issued. I replace them with single LOCK TABLE with multiple
>>>>> tables. With 100k tables LOCK statements took 13 minutes in total, now
>>>>> it only takes 3 seconds. Comments?
>>>> Was this applied?
>>> No, we fixed the server side instead.
>> But only for 9.2, right? So people running back branches are still screwed.
> Yeah, but they're screwed anyway, because there are a bunch of O(N^2)
> behaviors involved here, not all of which are masked by what Tatsuo-san
> suggested.

All of the other ones that I know of were associated with pg_dump itself, and since it is recommended to run the newer version of pg_dump against the older version of the server, no back patching would be necessary to get the benefits of those particular fixes.

> Six months or a year from now, we might have enough confidence in that
> batch of 9.2 fixes to back-port them en masse. Don't want to do it
> today though.

What would be the recommendation for people trying to upgrade, but who can't get their data out in a reasonable window?

Putting Tatsuo-san's change into a future pg_dump might be more conservative than back-porting the server's Lock Table change to the server version they are trying to get rid of.

Cheers,
Jeff
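For context, the change Tatsuo-san describes amounts to collapsing N LOCK TABLE statements into a single command, so the server sees one round trip instead of 100k. A rough sketch of the idea (illustrative only — pg_dump's real code builds the list from its catalog queries, and the lock mode shown is an assumption):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* hypothetical table names */
        const char *tables[] = {"s1.t1", "s1.t2", "s2.t1"};
        char        buf[4096] = "LOCK TABLE ";

        /* Append every table to one statement instead of issuing one
         * LOCK TABLE per table. */
        for (size_t i = 0; i < sizeof(tables) / sizeof(tables[0]); i++)
        {
            if (i > 0)
                strcat(buf, ", ");
            strcat(buf, tables[i]);
        }
        strcat(buf, " IN ACCESS SHARE MODE");
        puts(buf);              /* send as a single command */
        return 0;
    }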
Re: [HACKERS] [v9.3] Row-Level Security
On 17 July 2012 05:02, Kohei KaiGai wrote:
> 2012/7/17 Robert Haas:
>> On Sun, Jul 15, 2012 at 5:52 AM, Kohei KaiGai wrote:
>>> The attached patch is a revised version of row-level security feature.
>>> ...
>>> According to the Robert's comment, I revised the place to inject
>>> applyRowLevelSecurity(). The reason why it needed to patch on
>>> adjust_appendrel_attrs_mutator() was, we handled expansion from
>>> regular relation to sub-query after expand_inherited_tables().
>>> In this revision, it was moved to the head of sub-query planner.

Hi,

I had a quick look at this and spotted a problem - certain types of query are able to bypass the RLS quals. For example:

    SELECT * FROM (SELECT * FROM foo) foo;

since the RLS policy doesn't descend into subqueries, and is applied before they are pulled up into the main query. Similarly for views on top of tables with RLS, and SRF functions that query a table with RLS that get inlined. Also, queries using UNION ALL are vulnerable if they end up being flattened, for example:

    SELECT * FROM foo UNION ALL SELECT * FROM foo;

FWIW I recently developed some similar code as part of a patch to implement automatically updatable views (http://archives.postgresql.org/pgsql-hackers/2012-08/msg00303.php). Some parts of that code may be useful, possibly for adding UPDATE/DELETE support.

Regards,
Dean
Re: [HACKERS] Proof of concept: auto updatable views
On 31 August 2012 07:59, Dean Rasheed wrote:
> On 30 August 2012 20:05, Robert Haas wrote:
>> On Sun, Aug 12, 2012 at 5:14 PM, Dean Rasheed wrote:
>>> None of this new code kicks in for non-security barrier views, so the
>>> kinds of plans I posted upthread remain unchanged in that case. But
>>> now a significant fraction of the patch is code added to handle
>>> security barrier views. Of course we could simply say that such views
>>> aren't updatable, but that seems like an annoying limitation if there
>>> is a feasible way round it.
>> Maybe it'd be a good idea to split this into two patches: the first
>> could implement the feature but exclude security_barrier views, and
>> the second could lift that restriction.
> Yes, I think that makes sense.
> I should hopefully get some time to look at it over the weekend.

Here's an updated patch for the base feature (without support for security barrier views) with updated docs and regression tests.

Regards,
Dean

[Attachment: auto-update-views.patch.gz, GNU Zip compressed data]
Re: [HACKERS] Yet another failure mode in pg_upgrade
On Sun, Sep 2, 2012 at 01:13:52PM -0400, Tom Lane wrote:
> Bruce Momjian writes:
>> On Sun, Sep 2, 2012 at 01:06:28AM -0400, Robert Haas wrote:
>>> I don't think this is reducing the number of failure modes; it's just
>>> changing it from one set of obscure cases to a slightly different set
>>> of obscure cases.
>> Tom reported problems with having old/new with different default socket
>> locations. This fixes that, and reduces the possibility of accidental
>> connections. What problems does this add?
>
> I'm going to be needing some fix in this area in any case, though
> whether it's exactly Bruce's current patch is not clear to me. I found
> out last night while making a test build of 9.2rc1 as a Fedora package
> that pg_upgrade's regression test fails in the Fedora build environment,
> if the postmaster has been patched so that its default socket directory
> is /var/run/postgresql. That happens because /var/run/postgresql
> doesn't exist in the build environment (it is only going to exist once
> the postgresql-server package is installed), so the postmaster fails
> to start because it can't create a socket where it expects to.
>
> I have a patch to pg_regress that instructs the temporary postmaster
> to use /tmp as unix_socket_directory regardless of its built-in default,
> so that "make check" works for the regular core and contrib regression
> tests. However, that doesn't affect pg_upgrade's regression test case.
>
> It looks rather messy to persuade pg_upgrade to do things differently
> for regression testing and normal use, not to mention that it would make
> the test even less representative of normal use. So I'm thinking that
> we do need the pg_upgrade feature Bruce is suggesting of forcing the
> socket directory to be the current directory. What's more, if that's
> not back-patched to 9.2, I'm going to have to carry it as a Fedora patch
> anyway.
>
> Alternatively, I can prevent "make check" from testing pg_upgrade
> (which is what I did so I could carry on with package testing).
> I'd just as soon not ship it like that, though.

Well, I don't know of any known problems with the patch. On the other hand, I don't know our policy on pushing patches into RC releases at the request of packagers. If you want to stand behind the patch, it might be OK. I think that's how we handle such requests --- someone has to put their neck out for it. Fortunately the patch is not very large, so it is easy to review.

-- 
Bruce Momjian                         http://momjian.us
EnterpriseDB                          http://enterprisedb.com

+ It's impossible for everything to be true. +
[HACKERS] GPU and Database
Maybe some of you are interested in this ADBIS workshop on GPUs in Databases:

http://gid2012.cs.put.poznan.pl/

Regards,
Gaetano Mendola
[HACKERS] [bugfix] sepgsql didn't follow the latest core API changes
This patch fixes a few places where sepgsql didn't follow the latest core API changes.

1) Even though the prototype of ProcessUtility_hook was recently changed, the sepgsql side didn't follow this update, so the build failed.

2) sepgsql internally uses the GETSTRUCT() and HeapTupleGetOid() macros; these were moved to htup_details.h, so it needs an additional #include of "access/htup_details.h".

3) sepgsql internally used a bool variable named "abort". I noticed it conflicts with the ereport macro, because ereport is internally expanded to ereport_domain, which contains an invocation of "abort()". So I renamed this variable to abort_on_violation.

    #define ereport_domain(elevel, domain, rest)	\
        (errstart(elevel, __FILE__, __LINE__, PG_FUNCNAME_MACRO, domain) ? \
         (errfinish rest) : (void) 0), \
        ((elevel) >= ERROR ? abort() : (void) 0)

This does not affect v9.2, so please apply it on the master branch only.

Thanks,
-- 
KaiGai Kohei

[Attachment: sepgsql-fixbug-follow-core-apis.patch]
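The conflict in item 3 can be reproduced with any macro whose expansion calls abort(); a reduced, compilable illustration (FATAL_IF is a made-up stand-in for ereport):

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for ereport_domain's tail: expands to a call of abort(3). */
    #define FATAL_IF(cond)  ((cond) ? abort() : (void) 0)

    int main(void)
    {
        FATAL_IF(0);            /* fine: "abort" names the libc function */

        /*
         * But inside a block that declares a local variable
         *     bool abort = ...;
         * the expansion "abort()" would try to call the variable, and
         * the file no longer compiles -- hence the rename to
         * abort_on_violation in the patch.
         */
        puts("ok");
        return 0;
    }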
[HACKERS] [bugfix] sepgsql missed a case of CREATE TABLE AS
The attached patch fixes a bug in the sepgsql module. Could you apply it to both the v9.2 and master branches?

When the post_create hook is invoked, sepgsql's handler checks whether the current invocation context needs permission checks, according to the command tag saved by the ProcessUtility_hook. But it overlooked the T_CreateTableAs tag, so neither security labeling nor permission checks were applied, as if the routine were operating on a toast relation. This patch adds this command tag as a context in which to check permissions.

Thanks,
-- 
KaiGai Kohei

[Attachment: sepgsql-fixbug-create-table-as.patch]
Re: [HACKERS] Yet another failure mode in pg_upgrade
Bruce Momjian writes:
> On Sun, Sep 2, 2012 at 01:06:28AM -0400, Robert Haas wrote:
>> I don't think this is reducing the number of failure modes; it's just
>> changing it from one set of obscure cases to a slightly different set
>> of obscure cases.
> Tom reported problems with having old/new with different default socket
> locations. This fixes that, and reduces the possibility of accidental
> connections. What problems does this add?

I'm going to be needing some fix in this area in any case, though whether it's exactly Bruce's current patch is not clear to me. I found out last night while making a test build of 9.2rc1 as a Fedora package that pg_upgrade's regression test fails in the Fedora build environment, if the postmaster has been patched so that its default socket directory is /var/run/postgresql. That happens because /var/run/postgresql doesn't exist in the build environment (it is only going to exist once the postgresql-server package is installed), so the postmaster fails to start because it can't create a socket where it expects to.

I have a patch to pg_regress that instructs the temporary postmaster to use /tmp as unix_socket_directory regardless of its built-in default, so that "make check" works for the regular core and contrib regression tests. However, that doesn't affect pg_upgrade's regression test case.

It looks rather messy to persuade pg_upgrade to do things differently for regression testing and normal use, not to mention that it would make the test even less representative of normal use. So I'm thinking that we do need the pg_upgrade feature Bruce is suggesting of forcing the socket directory to be the current directory. What's more, if that's not back-patched to 9.2, I'm going to have to carry it as a Fedora patch anyway.

Alternatively, I can prevent "make check" from testing pg_upgrade (which is what I did so I could carry on with package testing). I'd just as soon not ship it like that, though.

			regards, tom lane
Re: [HACKERS] Fwd: PATCH: psql boolean display
2012/9/2 Tom Lane:
> I wrote:
>> Phil Sorber writes:
>>> What my patch was intended to do was let the end user set boolean
>>> output to any arbitrary values. While foo/bar is pretty useless, it
>>> was meant to reinforce that it was capable of any arbitrary value. I
>>> can think of a decent list of other output an end user might want,
>>> such as:
>>> true/false
>>> yes/no
>>> y/n
>>> on/off
>>> 1/0
>>> enabled/disabled
>>> Plus the different capitalized forms.
>
>> I can readily see that people might want boolean columns displayed in
>> such ways in custom applications. I'm less convinced that there is much
>> use for it in psql, though.
>
> BTW, another point that your list brings to mind is that somebody who
> wants something like this would very possibly want different displays
> for different columns. The proposed feature, being one-size-fits-all
> for "boolean", couldn't handle that.

I proposed just a cleaner and more conventional boolean output in psql - nothing more. We can write formatting functions, use CASE, or use enums.

> What would make a lot more sense in my mind would be to label columns
> *in the database* to show how they ought to be displayed.
>
> One conceivable method is to make a collection of domains over bool,
> and drive the display off the particular domain used. However we lack
> some infrastructure that would be needed for this (in particular, I'm
> pretty sure the PQftype data delivered to the client identifies the
> underlying type and not the domain).
>
> Another approach is to make a collection of enum types, in which case
> you don't need any client-side support at all. I experimented with
> this method a bit, and it seems workable:
>
> regression=# create type mybool as enum ('no', 'yes');
> CREATE TYPE
> regression=# create function bool(mybool) returns bool as
> $$ select $1 = 'yes'::mybool $$ language sql immutable;
> CREATE FUNCTION
> regression=# create cast (mybool as bool) with function bool(mybool) as assignment;
> CREATE CAST
> regression=# create table mb(f1 mybool);
> CREATE TABLE
> regression=# insert into mb values('no'),('yes');
> INSERT 0 2
> regression=# select * from mb where f1;
>  f1
> -----
>  yes
> (1 row)
>
> regression=# select * from mb where f1 = 'yes';
>  f1
> -----
>  yes
> (1 row)
>
> I tried making the cast be implicit, but that caused problems with
> ambiguous operators, so assignment seems to be the best you can do here.
>
> A variant of this is to build casts in the other direction
> (bool::mybool), declare columns in the database as regular bool,
> and apply the casts in queries when you want columns displayed in a
> particular way. This might be the best solution if the desired
> display is at all context-dependent.

When I worked on PSM, I needed the ability to simply specify the expected datatype outside of the SQL statement - something like enhancing parametrised queries with a fourth parameter for the expected types. Then somebody could set the expected type for some column simply, outside the query, and the transformation could be fast. It could be used for unsupported date formats and similar tasks.

Regards
Pavel

> 			regards, tom lane
Re: [HACKERS] Fwd: PATCH: psql boolean display
2012/9/2 Phil Sorber:
> On Sun, Sep 2, 2012 at 1:13 AM, Pavel Stehule wrote:
>> ---------- Forwarded message ----------
>> From: Pavel Stehule
>> Date: 2012/9/1
>> Subject: PATCH: psql boolean display
>> To: Phil Sorber
>>
>> Hello
>>
>> I am looking at your patch:
>>
>> I have only one note. I don't think using any text for the values
>> "true" and "false" is a good idea. I don't see the sense of special
>> texts like "foo", "bar" from your example.
>>
>> A stricter design is better - I wrote a simple modification - it is
>> based on psql's psets and adds a new configuration setting "boolstyle
>> [char|word]". An advantage of this design is consistency and possible
>> autocomplete support.
>>
>> Regards
>>
>> Pavel
>>
>> postgres=> select true, false;
>>  bool │ bool
>> ──────┼──────
>>  t    │ f
>> (1 row)
>>
>> postgres=> \pset boolstyle word
>> Bool output style is word.
>> postgres=> select true, false;
>>  bool │ bool
>> ──────┼───────
>>  true │ false
>> (1 row)
>
> What you are proposing sounds like it would be better suited to a
> server-side GUC. Based on the discussion in the thread that said
> true/false was the SQL standard and we were doing it incorrectly as
> t/f but could not change for legacy reasons. We could even change the
> default but give users a way to revert to the old behavior, similar to
> the bytea_output GUC. Maybe add the GUC for 9.3 but not change the
> default behavior until 10.0.
>
> What my patch was intended to do was let the end user set boolean
> output to any arbitrary values. While foo/bar is pretty useless, it
> was meant to reinforce that it was capable of any arbitrary value. I
> can think of a decent list of other output an end user might want,
> such as:
>
> true/false
> yes/no
> y/n
> on/off
> 1/0
> enabled/disabled
>
> Plus the different capitalized forms.

If you have these different requests, then you can use enums - or you can use your own formatting function. There is a relatively strong recommendation not to use implicit formatting based on database configuration from an application, and to use explicit formatting everywhere inside the application. I don't think using a GUC for the boolean datatype is a good idea.

Using just the chars 't' and 'f' was an unlucky design that must be respected for compatibility reasons. You usually don't need to solve it, because the transformation from chars to words can be done by the application or the database driver - so I understand this as a client issue - a psql issue in this case. And I really don't see any sense in unlimited bool output in a simple tool like psql. It would be nice to fix the issue with chars, because the chars are not very pronounced, but we don't need to supply enums.

Regards
Pavel
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
On Sun, 2 Sep 2012, Peter Geoghegan wrote:
> On 2 September 2012 16:26, Sergey Koposov wrote:
> That's why we support altering that value with an ALTER TABLE...ALTER
> COLUMN DDL statement. You might at least consider increasing the
> statistics target for the column first though, to see if that will make
> the n_distinct value better accord with reality.

Thanks, I'd forgotten about that possibility. After fixing n_distinct to the correct value, the index scan became the preferred option, and the row-count estimate for the index scan became more realistic.

Sorry for the noise.

Sergey

*******************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551
Web: http://www.ast.cam.ac.uk/~koposov/
Re: [HACKERS] Fwd: PATCH: psql boolean display
I wrote:
> Phil Sorber writes:
>> What my patch was intended to do was let the end user set boolean
>> output to any arbitrary values. While foo/bar is pretty useless, it
>> was meant to reinforce that it was capable of any arbitrary value. I
>> can think of a decent list of other output an end user might want,
>> such as:
>> true/false
>> yes/no
>> y/n
>> on/off
>> 1/0
>> enabled/disabled
>> Plus the different capitalized forms.
> I can readily see that people might want boolean columns displayed in
> such ways in custom applications. I'm less convinced that there is much
> use for it in psql, though.

BTW, another point that your list brings to mind is that somebody who wants something like this would very possibly want different displays for different columns. The proposed feature, being one-size-fits-all for "boolean", couldn't handle that.

What would make a lot more sense in my mind would be to label columns *in the database* to show how they ought to be displayed.

One conceivable method is to make a collection of domains over bool, and drive the display off the particular domain used. However, we lack some infrastructure that would be needed for this (in particular, I'm pretty sure the PQftype data delivered to the client identifies the underlying type and not the domain).

Another approach is to make a collection of enum types, in which case you don't need any client-side support at all. I experimented with this method a bit, and it seems workable:

regression=# create type mybool as enum ('no', 'yes');
CREATE TYPE
regression=# create function bool(mybool) returns bool as
$$ select $1 = 'yes'::mybool $$ language sql immutable;
CREATE FUNCTION
regression=# create cast (mybool as bool) with function bool(mybool) as assignment;
CREATE CAST
regression=# create table mb(f1 mybool);
CREATE TABLE
regression=# insert into mb values('no'),('yes');
INSERT 0 2
regression=# select * from mb where f1;
 f1
-----
 yes
(1 row)

regression=# select * from mb where f1 = 'yes';
 f1
-----
 yes
(1 row)

I tried making the cast be implicit, but that caused problems with ambiguous operators, so assignment seems to be the best you can do here.

A variant of this is to build casts in the other direction (bool::mybool), declare columns in the database as regular bool, and apply the casts in queries when you want columns displayed in a particular way. This might be the best solution if the desired display is at all context-dependent.

			regards, tom lane
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
On 2 September 2012 16:26, Sergey Koposov wrote:
> After looking at them, I think I understand the reason -- the
> number of n_distinct for crts.data is terribly wrong. In reality it should
> be ~130 million. I already faced this problem at a certain point when doing
> "group by objid" and PG was exhausting all the memory, because of that wrong
> estimate. But I know that it's a known hard problem to estimate n_distinct.
>
> So probably that's the main reason of the problem...

That's why we support altering that value with an ALTER TABLE...ALTER COLUMN DDL statement. You might at least consider increasing the statistics target for the column first though, to see if that will make the n_distinct value better accord with reality.

-- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
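Peter's first suggestion, raising the per-column statistics target before resorting to a hard override, would look roughly like this (a sketch; 1000 is an arbitrary illustrative target, the default being 100):

wsdb=> ALTER TABLE crts.data ALTER COLUMN objid SET STATISTICS 1000;
ALTER TABLE
wsdb=> ANALYZE crts.data;
ANALYZE

A larger target makes ANALYZE sample more rows, which often, though not always, improves the n_distinct estimate without pinning it by hand.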
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
On Sun, 2 Sep 2012, Tom Lane wrote:
> Sergey Koposov writes:
>
> The problem is definitely the misestimation here:
>
> -> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
>    Index Cond: (objid = t.objid)
>
> The planner thinks that indexscan will be 2000 times more expensive than it really is (assuming that the cost per retrieved row is linear, which it isn't entirely, but close enough for the moment). Of course, it's also thinking the bitmap component scan on the same index will be 2000 times more expensive than reality, but that has only perhaps a 4X impact on the total cost of the bitmap scan, since the use of the other index is what dominates there. With a more accurate idea of this join condition's selectivity, I'm pretty certain it would have gone for a plain indexscan, or else a bitmap scan using only this index.
>
> So if there's a bug here, it's in eqjoinsel(). Can you pinpoint anything unusual about the stats of the objid columns?

Here are the pg_stats rows for the two tables:

 schemaname | tablename | attname | inherited | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds
Re: [HACKERS] Fwd: PATCH: psql boolean display
Phil Sorber writes: > What my patch was intended to do was let the end user set boolean > output to any arbitrary values. While foo/bar is pretty useless, it > was meant to reinforce that it was capable of any arbitrary value. I > can think of a decent list of other output an end user might want, > such as: > true/false > yes/no > y/n > on/off > 1/0 > enabled/disabled > Plus the different capitalized forms. I can readily see that people might want boolean columns displayed in such ways in custom applications. I'm less convinced that there is much use for it in psql, though. In the big scheme of things, psql is a rather low-level tool, designed for DBAs and SQL programmers. I'd get quite upset if psql failed to tell me the truth about what was in a table I was looking at --- and a feature like this comes pretty close to not telling the truth, especially if it kicks in on a column I wasn't expecting it to. On the whole I think this sort of substitution belongs in a user-written-application layer of software, not in any of the tools we supply. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Fwd: PATCH: psql boolean display
On Sun, Sep 2, 2012 at 1:13 AM, Pavel Stehule wrote:
> -- Forwarded message --
> From: Pavel Stehule
> Date: 2012/9/1
> Subject: PATCH: psql boolean display
> To: Phil Sorber
>
> Hello
>
> I have looked at your patch.
>
> I have only one note. I don't think using arbitrary text for the values
> "true" and "false" is a good idea. I don't see the sense of special texts
> like "foo" and "bar" from your example.
>
> A stricter design is better - I wrote a simple modification - it is
> based on psql's \pset settings - and it adds a new configuration setting
> "boolstyle [char|word]". An advantage of this design is consistency and
> possible autocomplete support.
>
> Regards
>
> Pavel
>
>
> postgres=> select true, false;
>  bool │ bool
> ──────┼──────
>  t    │ f
> (1 row)
>
> postgres=> \pset boolstyle word
> Bool output style is word.
> postgres=> select true, false;
>  bool │ bool
> ──────┼───────
>  true │ false
> (1 row)
>
>
> -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

What you are proposing sounds like it would be better suited to a server-side GUC, given the discussion in the thread which said that true/false is the SQL standard, that our t/f output is technically incorrect, and that we could not change it for legacy reasons. We could even change the default but give users a way to revert to the old behavior, similar to the bytea_output GUC: perhaps add the GUC in 9.3 but not change the default behavior until 10.0.

What my patch was intended to do was let the end user set boolean output to any arbitrary values. While foo/bar is pretty useless, it was meant to reinforce that it was capable of any arbitrary value. I can think of a decent list of other output an end user might want, such as:

true/false
yes/no
y/n
on/off
1/0
enabled/disabled

Plus the different capitalized forms.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
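Phil's analogy to bytea_output is worth making concrete. This is how that GUC behaves on a stock installation (a sketch of existing behavior; a server-side boolean GUC, whatever its eventual name, would presumably follow the same opt-out pattern):

postgres=> SET bytea_output = 'hex';
SET
postgres=> SELECT '\xdeadbeef'::bytea;
   bytea
------------
 \xdeadbeef
(1 row)

postgres=> SET bytea_output = 'escape';
SET
postgres=> SELECT '\xdeadbeef'::bytea;
      bytea
------------------
 \336\255\276\357
(1 row)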
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
Sergey Koposov writes:
> On Sun, 2 Sep 2012, Peter Geoghegan wrote:
>> One obvious red-flag from your query plans is that there is a
>> misestimation of the row return count of a few orders of magnitude in
>> the Bitmap Index Scan node. Did you try performing an ANALYZE to
>> see if that helped? It may also be helpful to show pg_stats entries
>> for both the data.mjd and test.mjd columns. You may find, prior to
>> doing an ANALYZE, that there are no entries for one of those tables.

> The main large table is static and was analyzed. The test table was as
> well. But as mentioned in another recent email, the query is a join, so
> column correlation is a problem.

The problem is definitely the misestimation here:

-> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
   Index Cond: (objid = t.objid)

The planner thinks that indexscan will be 2000 times more expensive than it really is (assuming that the cost per retrieved row is linear, which it isn't entirely, but close enough for the moment). Of course, it's also thinking the bitmap component scan on the same index will be 2000 times more expensive than reality, but that has only perhaps a 4X impact on the total cost of the bitmap scan, since the use of the other index is what dominates there. With a more accurate idea of this join condition's selectivity, I'm pretty certain it would have gone for a plain indexscan, or else a bitmap scan using only this index.

So if there's a bug here, it's in eqjoinsel(). Can you pinpoint anything unusual about the stats of the objid columns?

regards, tom lane

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
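A sketch of the query that would pull up the stats Tom is asking about (table and column names follow the schemas quoted elsewhere in the thread):

wsdb=> SELECT schemaname, tablename, attname, null_frac, n_distinct, most_common_freqs
wsdb->   FROM pg_stats
wsdb->  WHERE attname = 'objid' AND tablename IN ('data', 'test');

If n_distinct here comes out wildly below the true distinct count, eqjoinsel() will overestimate the number of rows matching each outer key, which is consistent with the rows=415080 vs. rows=248 discrepancy above.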
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
Thanks for your comments.

On Sun, 2 Sep 2012, Peter Geoghegan wrote:
> On 2 September 2012 06:21, Sergey Koposov wrote:
>
> I think that this kind of question is better suited to the pgsql-performance list. Granted, it was presented as a bug report (though they're generally sent to -bugs rather than -hackers), but I don't think that this is a valid bug.

The reason I was inclined to think it is a bug is that I encountered a similar bug before with bitmap scans and very big tables: http://archives.postgresql.org/pgsql-hackers/2011-08/msg00958.php Furthermore, two orders of magnitude more CPU time for bitmap scans compared to the index scan didn't sound right to me (although obviously I'm not in a position to claim that it's 100% a bug).

> One obvious red-flag from your query plans is that there is a misestimation of the row return count of a few orders of magnitude in the Bitmap Index Scan node. Did you try performing an ANALYZE to see if that helped? It may also be helpful to show pg_stats entries for both the data.mjd and test.mjd columns. You may find, prior to doing an ANALYZE, that there are no entries for one of those tables.

The main large table is static and was analyzed. The test table was as well. But as mentioned in another recent email, the query is a join, so column correlation is a problem.

> Turning off the enable_* planner options in production is generally discouraged. Certainly, you'd be crazy to do that on a server-wide basis.

I'm using PG for data mining and data analysis purposes, with very few clients connected and very large tables, so enable_* is used quite often to fix incorrect plans due to incorrect selectivities.

Regards,
Sergey

* Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [GENERAL] Multiple Slave Failover with PITR
Do we ever want to document a way to connect slaves to a new master, rather than recreating the slave? --- On Tue, Mar 27, 2012 at 10:47:48AM -0700, Ken Brush wrote: > Hello everyone, > > I notice that the documentation at: > http://wiki.postgresql.org/wiki/Binary_Replication_Tutorial > > Doesn't contain steps in a Multiple Slave setup for re-establishing > them after a slave has become the new master. > > Based on the documentation, here are the most fail-proof steps I came up with: > > 1. Master dies :( > 2. Touch the trigger file on the most caught up slave. > 3. Slave is now the new master :) > 4. use pg_basebackup or other binary replication trick (rsync, tar > over ssh, etc...) to bring the other slaves up to speed with the new > master. > 5. start the other slaves pointing to the new master. > > But, that can take time (about 1-2 hours) with my medium sized DB > (580GB currently). > > After testing a few different ideas that I gleaned from posts on the > mail list, I came up with this alternative method: > > 1. Master dies :( > 2. Touch the trigger file on the most caught up slave > 3. Slave is now the new master. > 4. On the other slaves do the following: > 5. Shutdown postgres on the slave > 6. Delete every file in /data/pgsql/data/pg_xlog > 7. Modify the recovery.conf file to point to the new master and > include the line "recovery_target_timeline='latest'" > 8. Copy the history file from the new master to the slave (it's the > most recent #.history file in the xlog directory) > 9. Startup postgres on the slave and watch it sync up to the new > master (about 1-5 minutes usually) > > My question is this. Is the alternative method adequate? I tested it a > bit and couldn't find any problems with data loss or inconsistency. > > I still use the fail-proof method above to re-incorporate the old > master as a new slave. > > Sincerely, > -Ken > > -- > Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
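The recovery.conf change in Ken's step 7 might look like this (a sketch; the hostname, port, and user are placeholders, not values from the thread):

standby_mode = 'on'
primary_conninfo = 'host=new-master port=5432 user=replicator'
recovery_target_timeline = 'latest'

The recovery_target_timeline='latest' line is what lets the old slaves follow the timeline switch performed by the promoted master, provided they also have the new master's history file (step 8).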
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
On Sun, 2 Sep 2012, Pavel Stehule wrote:
> statistics on data_objid_idx are absolutely off - so the planner cannot find an optimal plan

That doesn't have anything to do with the problem, AFAIU. First, the data table is static and was analysed. Second, the query in question is a join, and AFAIK the estimation of the number of rows is known to be incorrect in the case of column correlation. Third, according at least to my understanding, in the fully cached regime a bitmap scan should not take two orders of magnitude more CPU time than an index scan.

Sergey

* Sergey E. Koposov, PhD, Research Associate Institute of Astronomy, University of Cambridge Madingley road, CB3 0HA, Cambridge, UK Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Yet another failure mode in pg_upgrade
On Sat, Sep 1, 2012 at 02:35:06PM -0400, Bruce Momjian wrote:
> On Sat, Sep 1, 2012 at 02:23:22PM -0400, Tom Lane wrote:
> > Bruce Momjian writes:
> > > +	/*
> > > +	 * Report the Unix domain socket directory location to the postmaster.
> > > +	 */
> >
> > "Report" seems entirely the wrong verb there.

Fixed.

> > > + #define LISTEN_STR	" -c listen_addresses=''"
> > > +
> > > +	/* Have a sockdir to use? */
> > > +	if (strlen(os_info.sockdir) != 0)
> > > +		snprintf(socket_string, sizeof(socket_string) - strlen(LISTEN_STR),
> > > +				 " -c %s='%s'",
> > > +				 (GET_MAJOR_VERSION(cluster->major_version) < 903) ?
> > > +				 "unix_socket_directory" : "unix_socket_directories",
> > > +				 os_info.sockdir);
> > > +
> > > +	/* prevent TCP/IP connections */
> > > +	strcat(socket_string, LISTEN_STR);
> >
> > IMO this would be simpler and more readable if you got rid of the LISTEN_STR
> > #define and just included -c listen_addresses='' in the snprintf format
> > string. The comment for the whole thing should be something like
> > "If we have a socket directory to use, command the postmaster to use it,
> > and disable TCP/IP connections altogether".
>
> Well, you only want the unix_socket* if sockdir is defined, but you want
> LISTEN_STR unconditionally, even if there is no sockdir. Not sure how
> that could cleanly be in a single snprintf.

I restructured the code to add the listen_addresses string first, allowing the removal of the #define, as Tom suggested. I also added unix_socket_permissions=0700 to further restrict socket access.

Updated patch attached.

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

diff --git a/contrib/pg_upgrade/option.c b/contrib/pg_upgrade/option.c
new file mode 100644
index 94bce50..8b35d8f
*** a/contrib/pg_upgrade/option.c
--- b/contrib/pg_upgrade/option.c
*** adjust_data_dir(ClusterInfo *cluster)
*** 376,378 ****
--- 376,431 ----
  	check_ok();
  }
+
+
+ #if HAVE_UNIX_SOCKETS
+ /*
+  *	set_sockdir_and_pghost
+  *
+  *	Set the socket directory to either the configured location (live check)
+  *	or the current directory.
+  */
+ void
+ set_sockdir_and_pghost(bool live_check)
+ {
+ 	if (!live_check)
+ 	{
+ 		/* Use the current directory for the socket */
+ 		if (!getcwd(os_info.sockdir, MAXPGPATH))
+ 			pg_log(PG_FATAL, "cannot find current directory\n");
+ 	}
+ 	else
+ 	{
+ 		/*
+ 		 * If we are doing a live check, we will use the old cluster's Unix
+ 		 * domain socket directory so we can connect to the live server.
+ 		 */
+
+ 		/* sockdir added to the 5th line of postmaster.pid in PG 9.1 */
+ 		if (GET_MAJOR_VERSION(old_cluster.major_version) >= 901)
+ 		{
+ 			char		filename[MAXPGPATH];
+ 			FILE	   *fp;
+ 			int			i;
+
+ 			snprintf(filename, sizeof(filename), "%s/postmaster.pid",
+ 					 old_cluster.pgdata);
+ 			if ((fp = fopen(filename, "r")) == NULL)
+ 				pg_log(PG_FATAL, "Could not get socket directory of the old server\n");
+
+ 			for (i = 0; i < 5; i++)
+ 				if (fgets(os_info.sockdir, sizeof(os_info.sockdir), fp) == NULL)
+ 					pg_log(PG_FATAL, "Could not get socket directory of the old server\n");
+ 			fclose(fp);
+ 		}
+ 		else
+ 			return;				/* Can't get live sockdir, so use the default. */
+
+ 		/* Remove trailing newline */
+ 		if (strchr(os_info.sockdir, '\n') != NULL)
+ 			*strchr(os_info.sockdir, '\n') = '\0';
+ 	}
+
+ 	/* For client communication */
+ 	pg_putenv("PGHOST", os_info.sockdir);
+ }
+ #endif

diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
new file mode 100644
index c47c8bb..8cad5c3
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*** main(int argc, char **argv)
*** 88,93 ****
--- 88,97 ----
  	check_cluster_versions();
  	check_cluster_compatibility(live_check);
+
+ #if HAVE_UNIX_SOCKETS
+ 	set_sockdir_and_pghost(live_check);
+ #endif
+
  	check_old_cluster(live_check, &sequence_script_file_name);

diff --git a/contrib/pg_upgrade/pg_upgrade.h b/contrib/pg_upgrade/pg_upgrade.h
new file mode 100644
index fa4c6c0..a712e21
*** a/contrib/pg_upgrade/pg_upgrade.h
--- b/contrib/pg_upgrade/pg_upgrade.h
*** typedef struct
*** 267,272 ****
--- 267,275 ----
  	const char *progname;		/* complete pathname for this program */
  	char	   *exec_path;		/* full path to my executable */
  	char	   *user;			/* username for clusters */
+ #if HAVE_UNIX_SOCKETS
+ 	char		sockdir[NAMEDATALEN];	/* directory for Unix Domain socket */
+ #endif
  	char	  **tablespaces;	/* tablespaces */
  	int			num_tablespaces;
  	char	  **libraries;		/* loadable libraries */
*** void print_maps(FileNameMap *maps, int n
*** 387,392 ****
--- 390,398
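To make the effect concrete: with the patch in place and a socket directory chosen, pg_upgrade would end up starting each postmaster with options along these lines (a sketch assembled from the snippets quoted above; the data and socket directory paths are placeholders, and the exact invocation is pg_upgrade's internal business):

"postgres" -D /path/to/cluster/data \
    -c unix_socket_directory='/home/user/upgrade-dir' \
    -c unix_socket_permissions=0700 \
    -c listen_addresses=''

with unix_socket_directory spelled unix_socket_directories for 9.3 and later, per the GET_MAJOR_VERSION test in the quoted code. PGHOST is pointed at the same directory so pg_dump and friends connect over that socket rather than over TCP.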
Re: [HACKERS] Yet another failure mode in pg_upgrade
On Sun, Sep 2, 2012 at 01:06:28AM -0400, Robert Haas wrote:
> > For "live check" operation, you are checking a running server, so
> > assuming the socket is in the current directory is not going to work.
> > What the code does is to read the 5th line from the running server's
> > postmaster.pid file, which has the socket directory in PG >= 9.1. For
> > pre-9.1, pg_upgrade uses the compiled-in defaults for socket directory.
> > If the defaults are different between the two servers, the new binaries,
> > e.g. pg_dump, will not work. The fix is for the user to set pg_upgrade
> > -O to match the old socket directory, and set PGHOST before running
> > pg_upgrade. I could not find a good way to generate a proper error
> > message because we are blind to the socket directory in pre-9.1.
> > Frankly, this is a problem if the old pre-9.1 server is running in a
> > user-configured socket directory too, so a documentation addition seems
> > right here.
> >
> > So, in summary, this patch moves the socket directory to the current
> > directory in all but live check operation, and handles different socket
> > directories for old clusters >= 9.1. I have added a documentation
> > mention of how to make this work for pre-9.1 old servers.
>
> I don't think this is reducing the number of failure modes; it's just
> changing it from one set of obscure cases to a slightly different set
> of obscure cases.

Tom reported problems with having old and new clusters with different default socket locations. This fixes that, and it reduces the possibility of accidental connections. What problems does this add?

-- Bruce Momjian http://momjian.us EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
On 2 September 2012 06:21, Sergey Koposov wrote:
> Hi,
>
> I'm experiencing the case when bitmap scan is ~ 70 times slower than index
> scan which seems to be caused by 1) very big table 2) some hash search logic
> (hash_search_with_hash_value )
>
> Here is the explain analyze of the query with bitmap scans allowed:
>
> wsdb=> explain analyze select * from test as t, crts.data as d1
> where d1.objid=t.objid and d1.mjd=t.mjd limit 1;
>                                     QUERY PLAN
> ---------------------------------------------------------------------------
> Limit (cost=11514.04..115493165.44 rows=1 width=68) (actual time=27.512..66620.231 rows=1 loops=1)
>   -> Nested Loop (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=1 loops=1)
>      -> Seq Scan on test t (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
>      -> Bitmap Heap Scan on data d1 (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
>         Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
>         -> BitmapAnd (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
>            -> Bitmap Index Scan on data_mjd_idx (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
>               Index Cond: (mjd = t.mjd)
>            -> Bitmap Index Scan on data_objid_idx (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)
>               Index Cond: (objid = t.objid)
> Total runtime: 66622.026 ms
> (11 rows)
>
> Here is the output when bitmap scans are disabled:
>                                     QUERY PLAN
> ---------------------------------------------------------------------------
> Limit (cost=0.00..329631941.65 rows=1 width=68) (actual time=0.082..906.876 rows=1 loops=1)
>   -> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual time=0.081..905.683 rows=1 loops=1)
>      Join Filter: (t.mjd = d1.mjd)
>      -> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28) (actual time=0.009..1.679 rows=11456 loops=1)
>      -> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
>         Index Cond: (objid = t.objid)
> Total runtime: 907.462 ms
>
> When the bitmap scans are enabled the "prof" of postgres shows
>
> 47.10% postmaster postgres [.] hash_search_with_hash_value
>        |
>        --- hash_search_with_hash_value
>
> 11.06% postmaster postgres [.] hash_seq_search
>        |
>        --- hash_seq_search
>
>  6.95% postmaster postgres [.] hash_any
>        |
>        --- hash_any
>
>  5.17% postmaster postgres [.] _bt_checkkeys
>        |
>        --- _bt_checkkeys
>
>  4.07% postmaster postgres [.] tbm_add_tuples
>        |
>        --- tbm_add_tuples
>
>  3.41% postmaster postgres [.] hash_search
>        |
>        --- hash_search
>
> And the last note is that the crts.data table which is being bitmap scanned
> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
> code is somehow unprepared to combine two bitmaps for such a big table, and
> this leads to the terrible performance.

I think that this kind of question is better suited to the pgsql-performance list. Granted, it was presented as a bug report (though they're generally sent to -bugs rather than -hackers), but I don't think that this is a valid bug.

One obvious red-flag from your query plans is that there is a misestimation of the row return count of a few orders of magnitude in the Bitmap Index Scan node. Did you try performing an ANALYZE to see if that helped? It may also be helpful to show pg_stats entries for both the data.mjd and test.mjd columns. You may find, prior to doing an ANALYZE, that there are no entries for one of those tables.
Turning off the enable_* planner options in production is generally discouraged. Certainly, you'd be crazy to do that on a server-wide basis. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
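Peter's point is about scope: rather than flipping these settings server-wide in postgresql.conf, they can be confined to a single session or transaction (a sketch of standard usage, not taken from the thread):

wsdb=> BEGIN;
BEGIN
wsdb=> SET LOCAL enable_bitmapscan = off;
SET
wsdb=> -- run the problematic join here; the setting reverts at COMMIT/ROLLBACK
wsdb=> COMMIT;
COMMIT

A plain SET (without LOCAL) would instead persist for the session, which is still far safer than a cluster-wide change.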
Re: [HACKERS] bitmap scan much slower than index scan, hash_search_with_hash_value
Hello

2012/9/2 Sergey Koposov:
> Hi,
>
> I'm experiencing the case when bitmap scan is ~ 70 times slower than index
> scan which seems to be caused by 1) very big table 2) some hash search logic
> (hash_search_with_hash_value )
>
> Here is the explain analyze of the query with bitmap scans allowed:
>
> wsdb=> explain analyze select * from test as t, crts.data as d1
> where d1.objid=t.objid and d1.mjd=t.mjd limit 1;
>                                     QUERY PLAN
> ---------------------------------------------------------------------------
> Limit (cost=11514.04..115493165.44 rows=1 width=68) (actual time=27.512..66620.231 rows=1 loops=1)
>   -> Nested Loop (cost=11514.04..1799585184.18 rows=155832 width=68) (actual time=27.511..66616.807 rows=1 loops=1)
>      -> Seq Scan on test t (cost=0.00..2678.40 rows=156240 width=28) (actual time=0.010..4.685 rows=11456 loops=1)
>      -> Bitmap Heap Scan on data d1 (cost=11514.04..11518.05 rows=1 width=40) (actual time=5.807..5.807 rows=1 loops=11456)
>         Recheck Cond: ((mjd = t.mjd) AND (objid = t.objid))
>         -> BitmapAnd (cost=11514.04..11514.04 rows=1 width=0) (actual time=5.777..5.777 rows=0 loops=11456)
>            -> Bitmap Index Scan on data_mjd_idx (cost=0.00..2501.40 rows=42872 width=0) (actual time=3.920..3.920 rows=22241 loops=11456)
>               Index Cond: (mjd = t.mjd)
>            -> Bitmap Index Scan on data_objid_idx (cost=0.00..8897.90 rows=415080 width=0) (actual time=0.025..0.025 rows=248 loops=11456)

statistics on data_objid_idx are absolutely off - so the planner cannot find an optimal plan

Regards
Pavel Stehule

>               Index Cond: (objid = t.objid)
> Total runtime: 66622.026 ms
> (11 rows)
>
> Here is the output when bitmap scans are disabled:
>                                     QUERY PLAN
> ---------------------------------------------------------------------------
> Limit (cost=0.00..329631941.65 rows=1 width=68) (actual time=0.082..906.876 rows=1 loops=1)
>   -> Nested Loop (cost=0.00..4979486036.95 rows=151062 width=68) (actual time=0.081..905.683 rows=1 loops=1)
>      Join Filter: (t.mjd = d1.mjd)
>      -> Seq Scan on test t (cost=0.00..2632.77 rows=151677 width=28) (actual time=0.009..1.679 rows=11456 loops=1)
>      -> Index Scan using data_objid_idx on data d1 (cost=0.00..26603.32 rows=415080 width=40) (actual time=0.010..0.050 rows=248 loops=11456)
>         Index Cond: (objid = t.objid)
> Total runtime: 907.462 ms
>
> When the bitmap scans are enabled the "prof" of postgres shows
>
> 47.10% postmaster postgres [.] hash_search_with_hash_value
>        |
>        --- hash_search_with_hash_value
>
> 11.06% postmaster postgres [.] hash_seq_search
>        |
>        --- hash_seq_search
>
>  6.95% postmaster postgres [.] hash_any
>        |
>        --- hash_any
>
>  5.17% postmaster postgres [.] _bt_checkkeys
>        |
>        --- _bt_checkkeys
>
>  4.07% postmaster postgres [.] tbm_add_tuples
>        |
>        --- tbm_add_tuples
>
>  3.41% postmaster postgres [.] hash_search
>        |
>        --- hash_search
>
> And the last note is that the crts.data table which is being bitmap scanned
> is a 1.1Tb table with ~ 20e9 rows. My feeling is that the bitmap index scan
> code is somehow unprepared to combine two bitmaps for such a big table, and
> this leads to the terrible performance.
> Regards,
> Sergey
>
> PS Here are the schemas of the tables, just in case:
>
> wsdb=> \d test
>       Table "koposov.test"
>  Column  |       Type       | Modifiers
> ---------+------------------+-----------
>  mjd     | double precision |
>  fieldid | bigint           |
>  intmag  | integer          |
>  objid   | bigint           |
>
> wsdb=> \d crts.data
>        Table "crts.data"
>  Column |       Type       | Modifiers
> --------+------------------+-----------
>  objid  | bigint           |
>  mjd    | double precision |
>  mag    | real             |
>  emag   | real             |
>  ra     | double precision |
>  dec    | double precision |
> Indexes:
>     "data_mjd_idx" btree (mjd) WITH (fillfactor=100)
>     "data_objid_idx" btree (objid) WITH (fillfactor=100)
>     "data_q3c_ang2ipix_idx" btree (q3c_ang2ipix(ra, "dec")) WITH (fillfactor=100)
>
> PPS shared_buffers=10GB, work_mem=1GB
> All the tests shown here were done in the fully cached regime.
>
> PPS I can believe that what I'm seeing is a feature, not a bug of bitmap
> scans, and I can live with disabling them, but I still thought it's worth
> reporting.