Re: [HACKERS] Why the asprintf patch is still breaking the buildfarm
On 10/23/2013 03:05 AM, Noah Misch wrote:
> I would vote for choosing the standard we want vsnprintf() to follow
> (probably C99) and substituting a conforming implementation wherever
> configure detects that libc does not conform. We'll be shipping some
> replacement vsnprintf() in any case; we may as well use it to insulate
> the rest of our code from less-preferred variants.

Do you care about the snprintf behavior on very large buffers (larger than INT_MAX)? Then there's further complication, and it's an area where glibc behavior is likely to change in the future (because it is claimed that C99 and POSIX conflict, and glibc implements neither behavior).

--
Florian Weimer / Red Hat Product Security Team
Re: [HACKERS] Reasons not to like asprintf
On 10/22/2013 11:06 PM, Tom Lane wrote:
> Attached is a draft, which compiles though I've not yet actually tested it.
>
>     nprinted = vsnprintf(buf, len, fmt, args);
>     Assert(buf[len - 1] == '\0');

The assert may fire if len > INT_MAX and the system returns with errno == EOVERFLOW, as required by POSIX. It's probably better to move it after the error logging.

--
Florian Weimer / Red Hat Product Security Team
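A minimal C sketch of the ordering being suggested here; the wrapper name and error handling are illustrative only, not taken from the draft patch. The point is that the failure check must precede any assertion about the buffer contents:

    #include <assert.h>
    #include <errno.h>
    #include <stdarg.h>
    #include <stdio.h>

    /* Illustrative wrapper (not the draft patch): report vsnprintf
     * failures, e.g. -1 with errno == EOVERFLOW for oversized buffers
     * per POSIX, before asserting NUL-termination of the result. */
    static int
    format_checked(char *buf, size_t len, const char *fmt, ...)
    {
        va_list args;
        int     nprinted;

        va_start(args, fmt);
        errno = 0;
        nprinted = vsnprintf(buf, len, fmt, args);
        va_end(args);

        if (nprinted < 0)
        {
            /* The error path comes first ... */
            fprintf(stderr, "vsnprintf failed: errno=%d\n", errno);
            return nprinted;
        }

        /* ... and only then is an assertion safe: C99 guarantees a
         * terminator at buf[nprinted] whenever the output fit. */
        assert((size_t) nprinted >= len || buf[nprinted] == '\0');
        return nprinted;
    }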
[HACKERS] Review of pg_rewind
While testing pg_rewind I encountered the following problems. I used the following process to do the testing; please correct me if I am doing it the wrong way.

Problem 1: pg_rewind gives the error "target master must be shut down cleanly." when the master crashed unexpectedly.

1. Set up streaming replication (standalone machine: master server port 5432, standby server port 5433)
2. Do some operation on the master server:
   postgres=# create table test(id int);
3. Crash the postgres process of the master:
   kill -9 [pid of postgres process of master server]
4. Promote the standby server
5. Run pg_rewind:
   $ /samrat/postgresql/contrib/pg_rewind/pg_rewind -D /samrat/master-data/ --source-server='host=localhost port=5433 dbname=postgres' -v
   connected to remote server
   fetched file global/pg_control, length 8192
   target master must be shut down cleanly.
6. Check the master's control information:
   $ /samrat/postgresql/install/bin/pg_controldata /samrat/master-data/ | grep "Database cluster state"
   Database cluster state: in production

IIUC this is because pg_rewind does some checks before resynchronizing the PostgreSQL data directories. But in real-world scenarios, for example if the master crashed due to hardware failure and its controldata shows the state "in production", pg_rewind will fail to pass this check.

Problem 2: For a zero-length WAL record, pg_rewind gives an error.

1. Set up streaming replication (standalone machine: master server port 5432, standby server port 5433)
2. Cleanly shut down the master (do not add any data on the master)
3. Promote the standby server
4. Create a table on the new master (promoted standby):
   postgres=# create table test(id int);
5. Run pg_rewind:
   $ /samrat/postgresql/contrib/pg_rewind/pg_rewind -D /samrat/master-data/ --source-server='host=localhost port=5433
   connected to remote server
   fetched file global/pg_control, length 8192
   fetched file pg_xlog/0002.history, length 41
   Last common WAL position: 0/490 on timeline 1
   could not find previous WAL record at 0/490: record with zero length at 0/490

Also, as you already listed in the README of pg_rewind, it has a problem with tablespace support.

I will continue testing it further to help improve it :)
Re: [HACKERS] Review of pg_rewind
Hi,

Thanks for the feedback. Btw, pg_rewind is not a project included in Postgres core as a contrib module or anything, so could you send your feedback and the issues you find directly on github instead? The URL of the project is https://github.com/vmware/pg_rewind. Either way, here are some comments below...

On Wed, Oct 23, 2013 at 6:07 PM, Samrat Revagade revagade.sam...@gmail.com wrote:
> Problem 1: pg_rewind gives the error "target master must be shut down cleanly." when the master crashed unexpectedly.
> [test procedure trimmed; see the previous message]
> IIUC this is because pg_rewind does some checks before resynchronizing the PostgreSQL data directories. But in real-world scenarios, for example if the master crashed due to hardware failure and its controldata shows the state "in production", pg_rewind will fail to pass this check.

Yeah, you could call that a limitation of this module. When I looked at its code some time ago, I had on top of my mind the addition of an option of the type --force that could attempt resynchronization of a master even if it did not shut down correctly.

> Problem 2: For a zero-length WAL record, pg_rewind gives an error.
> [test procedure trimmed]
>    could not find previous WAL record at 0/490: record with zero length at 0/490

This is rather interesting. When I tested it I did not find this error.

> Also, as you already listed in the README of pg_rewind, it has a problem with tablespace support. I will continue testing it further to help improve it :)

Thanks!
--
Michael
Re: [HACKERS] Review of pg_rewind
On Wed, Oct 23, 2013 at 2:54 PM, Michael Paquier michael.paqu...@gmail.com wrote:
> Thanks for the feedback. Btw, pg_rewind is not a project included in Postgres core as a contrib module or anything, so could you send your feedback and the issues you find directly on github instead? The URL of the project is https://github.com/vmware/pg_rewind.

Sure, I will add those issues on github.

> Yeah, you could call that a limitation of this module. When I looked at its code some time ago, I had on top of my mind the addition of an option of the type --force that could attempt resynchronization of a master even if it did not shut down correctly.

This sounds good :)

Greetings,
Samrat Revagade
Re: [HACKERS] Why the asprintf patch is still breaking the buildfarm
On Wed, Oct 23, 2013 at 4:00 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Yeah. As a separate matter, it might be useful to revise stringinfo.c and the asprintf code so that *if* the returned value is larger than the given buffer size, we use it as a guide to resizing, avoiding the possible need to loop multiple times to make the buffer large enough. And we could also improve our own implementation of snprintf to follow the C99 spec.

Attached is a draft patch which implements this. I didn't bother making the translation macros make use of the extra data as I didn't think we would have many translations long enough to take advantage of it.

I think it's a good idea to take advantage of the buffer size if vsnprintf() has gone to the trouble of working out what is needed for us. It seems quite wasteful to throw this information away.

Comments are welcome.

Regards,
David

[Attachment: appendStringInfoVA.patch]
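For readers following along, the C99 contract relied on here is that vsnprintf returns the number of bytes that would have been written, so a failed first attempt reveals the exact allocation needed. A rough sketch of that pattern (my own illustration, not the attached patch):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Illustration of the C99 resize-by-return-value pattern; not the
     * attached patch.  Returns a malloc'd formatted string, or NULL. */
    static char *
    format_alloc(const char *fmt, ...)
    {
        va_list args, args2;
        char    firstbuf[128];
        int     needed;
        char   *result;

        va_start(args, fmt);
        va_copy(args2, args);          /* keep a copy for the second pass */
        needed = vsnprintf(firstbuf, sizeof(firstbuf), fmt, args);
        va_end(args);

        if (needed < 0)
            result = NULL;             /* formatting failure */
        else if ((size_t) needed < sizeof(firstbuf))
            result = strdup(firstbuf); /* fit on the first try */
        else
        {
            /* C99 told us the exact length: one resize, no loop. */
            result = malloc((size_t) needed + 1);
            if (result != NULL)
                vsnprintf(result, (size_t) needed + 1, fmt, args2);
        }
        va_end(args2);
        return result;
    }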
Re: [HACKERS] all_visible replay aborting due to uninitialized pages
On 22.10.2013 14:14, Andres Freund wrote:
> Hi Robert, Heikki,
>
> On 2013-09-24 13:25:41 +0200, Andres Freund wrote:
>> I'm afraid this patch was a few bricks shy of a load. The log_newpage_buffer() function asserts that:
>>
>>     /* We should be in a critical section. */
>>     Assert(CritSectionCount > 0);
>>
>> But the call in vacuumlazy.c is not inside a critical section. Also, the comments in log_newpage_buffer() say that the caller should mark the buffer dirty *before* calling log_newpage_buffer(), but in vacuumlazy.c it's marked dirty afterwards. I'm not sure what consequences that might have, but at least it contradicts the comment.
>>
>> (spotted this while working on a patch, and ran into the assertion on crash recovery)
>
> What about the attached patches (one for 9.3 and master, the other for 9.2)? I've tested that I can trigger the assert before and not after by inserting faults...
>
> Would either of you commit those patches to the corresponding branches?

Committed, thanks.

- Heikki
Re: [HACKERS] Commitfest II CLosed
On Tue, Oct 22, 2013 at 12:27:13PM -0700, Josh Berkus wrote:
> When I came up with the idea of CommitFests they were supposed to be an incremental improvement for us to build on. Instead it's remained frozen in amber, and steadily becoming less and less effective. I've suggested a number of improvements and changes over the years, and largely been rewarded with denial, attacks, ridicule, and general sandbaggery.
>
> I'm done. If the community doesn't think there's a problem, then clearly I'm in error for proposing fixes. Not sure who you're going to get to do CF3, though. I'm not going to be CFM again, and I'm pretty sure nobody else wants the job either.

For what it's worth, I liked how you ran CF 2013-06. It proceeded better than any CF of the 9.3 development cycle. I can appreciate that it drained you, though; you tried new things, and your reward was lots of flak. Your innovations were 85% good; sadly, debate raged over the negative aspects only.

Perhaps that arises from how we deal with code. An 85%-good patch can still wreak havoc in the field; closing that gap is essential. We say little about the correct aspects of a patch; it's usually a given that things not mentioned are satisfactory and have self-evident value. That's not such an effective discussion pattern when the topic is management strategies.

Thanks,
nm

--
Noah Misch, EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Why the asprintf patch is still breaking the buildfarm
Florian Weimer fwei...@redhat.com writes:
> Do you care about the snprintf behavior on very large buffers (larger than INT_MAX)? Then there's further complication, and it's an area where glibc behavior is likely to change in the future (because it is claimed that C99 and POSIX conflict, and glibc implements neither behavior).

We do not. Note that the buffer enlargement behavior is designed not to let len exceed INT_MAX; it'll say "out of memory" instead. Given that vsnprintf is defined to return int, buffers larger than INT_MAX would be a real can of worms, one that we'd best not open.

regards, tom lane
Re: [HACKERS] Why the asprintf patch is still breaking the buildfarm
David Rowley dgrowle...@gmail.com writes:
> On Wed, Oct 23, 2013 at 4:00 AM, Tom Lane t...@sss.pgh.pa.us wrote:
>> Yeah. As a separate matter, it might be useful to revise stringinfo.c and the asprintf code so that *if* the returned value is larger than the given buffer size, we use it as a guide to resizing, avoiding the possible need to loop multiple times to make the buffer large enough. And we could also improve our own implementation of snprintf to follow the C99 spec.
>
> Attached is a draft patch which implements this.

I started working on a very similar patch last night, but then began to wonder if it wouldn't be better to try to share code between stringinfo.c and psprintf.c --- that is, expose the latter's pvsnprintf and use that in stringinfo.

regards, tom lane
[HACKERS] Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Hi,

I was dealing with windowing functions recently. I feel they are pretty useful and quite handy in a lot of operations. I am not sure why, but my PostgreSQL does not seem to be using indexes for the ORDER BY clause or PARTITION BY clause which I use with a windowing function. I have tried ORDER BY and GROUP BY clauses in a normal SQL statement and they seem to use indexes nicely.

Is this already being considered for development?

Best Regards,
Sameer Kumar | Database Consultant
ASHNIK PTE. LTD.
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
On Tue, Oct 22, 2013 at 10:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> initdb.c quoth:
>      * ... but the fact that a platform has shm_open
>      * doesn't guarantee that that call will succeed when attempted.
> Indeed:
>
> $ initdb
> The files belonging to this database system will be owned by user postgres.
> This user must also own the server process.
> The database cluster will be initialized with locale C.
> The default database encoding has accordingly been set to SQL_ASCII.
> The default text search configuration will be set to english.
> Data page checksums are disabled.
> creating directory /home/postgres/testversion/data ... ok
> creating subdirectories ... ok
> selecting default max_connections ... 100
> selecting default shared_buffers ... 128MB
> selecting dynamic shared memory implementation ... Bad system call(coredump)
>
> gdb shows:
> Core was generated by `initdb'.
> Program terminated with signal 12, Bad system call.
> (gdb) bt
> #0  0xc0143fb0 in ?? () from /usr/lib/libc.1
> #1  0xa890 in choose_dsm_implementation () at initdb.c:1098
> #2  0xab10 in test_config_settings () at initdb.c:1217
> #3  0xe310 in initialize_data_directory () at initdb.c:3412
> #4  0xed0c in main (argc=1, argv=0x7b03ac68) at initdb.c:3691
> #5  0xc0065784 in ?? () from /usr/lib/libc.1
>
> I'm not entirely sure what to do about this. Conceivably we could have initdb catch SIGSYS, but that seems rather ugly. Maybe configure needs a run-time test to see if shm_open will work, rather than just probing to see if such a function exists? I'm not thrilled with run-time tests in configure though. Another possibility is for initdb to execute the probe in a forked subprocess instead of risking doing it itself.

Well, geez. That's obnoxious. I understand that an unimplemented system call might return ENOSYS, but SIGSYS seems pretty unfriendly. Why put the wrapper in your system libraries at all if it's just going to kill the process?

I don't think a configure-time test is a good idea because there's no guarantee that the configure-time machine and the run-time machine have the same behavior. But having initdb fork a child process to run the test seems like a reasonable way forward, even though I feel like it shouldn't really be needed. One possibly unfortunate thing is that SIGSYS, at least on my box, normally produces a core dump, so the initdb child might leave behind a core file somewhere as a side effect. Not sure if we can or want to work around that somehow.

--
Robert Haas, EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
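A rough sketch of the forked-probe idea (my own illustration, not initdb's eventual code), assuming POSIX fork/waitpid and a shm_open that may die with SIGSYS; a signal death kills only the child, and the parent reads the verdict from the exit status:

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Probe shm_open() in a child process so that a SIGSYS (or any
     * other fatal signal) cannot take down the caller.  May need -lrt
     * on some platforms.  Returns 1 if POSIX shm looks usable. */
    static int
    posix_shm_seems_usable(void)
    {
        pid_t   pid = fork();
        int     status;

        if (pid < 0)
            return 0;               /* can't tell; assume not usable */

        if (pid == 0)
        {
            /* Child: attempt the call that may trap with SIGSYS. */
            int fd = shm_open("/pg_dsm_probe", O_CREAT | O_RDWR, 0600);

            if (fd < 0)
                _exit(1);           /* call exists but failed */
            shm_unlink("/pg_dsm_probe");
            _exit(0);               /* call succeeded */
        }

        if (waitpid(pid, &status, 0) != pid)
            return 0;

        /* Death by signal (WIFEXITED false) means the call is unusable;
         * note the child may still leave a core file behind. */
        return WIFEXITED(status) && WEXITSTATUS(status) == 0;
    }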
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
Robert Haas robertmh...@gmail.com writes:
> On Tue, Oct 22, 2013 at 10:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
>> selecting dynamic shared memory implementation ... Bad system call(coredump)
>
> Well, geez. That's obnoxious.

I quite agree :-(. If it were just this old HPUX version, maybe we could write it off as something we don't care to support anymore. I'm worried though that there might be other platforms that act this way.

regards, tom lane
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
* Robert Haas (robertmh...@gmail.com) wrote:
> I don't think a configure-time test is a good idea because there's no guarantee that the configure-time machine and the run-time machine have the same behavior. But having initdb fork a child process to run the test seems like a reasonable way forward, even though I feel like it shouldn't really be needed. One possibly unfortunate thing is that SIGSYS, at least on my box, normally produces a core dump, so the initdb child might leave behind a core file somewhere as a side effect. Not sure if we can or want to work around that somehow.

I'm going to guess this idea is a non-starter, but any hope there's some other system call which would tell us we're on a platform where shm_open() will hit us with SIGSYS? What happens when shm_unlink() is called on this platform? Or mmap()?

Thanks,
Stephen
Re: [HACKERS] Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Sameer Kumar sameer.ku...@ashnik.com writes:
> I am not sure why, but my PostgreSQL does not seem to be using indexes for the ORDER BY clause or PARTITION BY clause which I use with a windowing function.

When the entire contents of the table have to be read, a seqscan-and-sort will frequently be estimated as cheaper than an indexscan. If you think this is not true on your hardware, you might need to adjust random_page_cost.

regards, tom lane
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
Stephen Frost sfr...@snowman.net writes:
> I'm going to guess this idea is a non-starter, but any hope there's some other system call which would tell us we're on a platform where shm_open() will hit us with SIGSYS? What happens when shm_unlink() is called on this platform? Or mmap()?

For context's sake, the machine does have mmap(). shm_open and shm_unlink exist in libc and have declarations in sys/mman.h, but at least the former traps with a signal, suggesting the kernel hasn't got support for it.

I agree with Robert that it's odd and obnoxious that the call doesn't just return with errno = ENOSYS. However, looking in the archives turns up this interesting historical info:
http://www.postgresql.org/message-id/25564.962066...@sss.pgh.pa.us

I wonder whether, if we went back to blocking SIGSYS, we could expect that affected calls would return ENOSYS (clearly preferable), or if that would just lead to some very strange behavior. Other archive entries mention that you get SIGSYS on Cygwin if the Cygwin support daemon isn't running, so that's at least one place where we'd want to check the behavior.

regards, tom lane
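If ignoring the signal does turn an attempted shm_open() into an ENOSYS return, which is exactly the open question above, an in-process probe could look roughly like the following sketch (untested on HPUX or Cygwin; names are mine):

    #include <errno.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/mman.h>

    /* Sketch only: ignore SIGSYS around the probe, hoping the platform
     * then reports an unimplemented syscall as -1/ENOSYS instead of
     * killing the process.  Restores the old disposition afterwards. */
    static int
    probe_shm_open_ignoring_sigsys(void)
    {
        struct sigaction sa, oldsa;
        int     fd;

        sa.sa_handler = SIG_IGN;
        sigemptyset(&sa.sa_mask);
        sa.sa_flags = 0;
        sigaction(SIGSYS, &sa, &oldsa);

        errno = 0;
        fd = shm_open("/pg_dsm_probe", O_CREAT | O_RDWR, 0600);

        sigaction(SIGSYS, &oldsa, NULL);

        if (fd >= 0)
        {
            shm_unlink("/pg_dsm_probe");
            return 1;                   /* usable */
        }
        if (errno == ENOSYS)
            fprintf(stderr, "shm_open not implemented by kernel\n");
        return 0;
    }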
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I agree with Robert that it's odd and obnoxious that the call doesn't just return with errno = ENOSYS. However, looking in the archives turns up this interesting historical info:
> http://www.postgresql.org/message-id/25564.962066...@sss.pgh.pa.us

Wow, well, good on HPUX for trying to run the code you told it to..

> I wonder whether, if we went back to blocking SIGSYS, we could expect that affected calls would return ENOSYS (clearly preferable), or if that would just lead to some very strange behavior. Other archive entries mention that you get SIGSYS on Cygwin if the Cygwin support daemon isn't running, so that's at least one place where we'd want to check the behavior.

Would this make sense as a configure-time check, rather than in initdb, to try blocking SIGSYS and checking for an ENOSYS from shm_open()? Seems preferable to do that in a configure check rather than initdb.

Thanks,
Stephen
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
On Wed, Oct 23, 2013 at 11:35 AM, Stephen Frost sfr...@snowman.net wrote:
> Would this make sense as a configure-time check, rather than in initdb, to try blocking SIGSYS and checking for an ENOSYS from shm_open()? Seems preferable to do that in a configure check rather than initdb.

I don't see why. It's a run-time behavior; the build system may not be where the binaries will ultimately run.

--
Robert Haas, EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Commitfest II CLosed
On Wed, Oct 23, 2013 at 8:38 AM, Noah Misch n...@leadboat.com wrote:
> For what it's worth, I liked how you ran CF 2013-06. It proceeded better than any CF of the 9.3 development cycle. I can appreciate that it drained you, though; you tried new things, and your reward was lots of flak. Your innovations were 85% good; sadly, debate raged over the negative aspects only.
>
> Perhaps that arises from how we deal with code. An 85%-good patch can still wreak havoc in the field; closing that gap is essential. We say little about the correct aspects of a patch; it's usually a given that things not mentioned are satisfactory and have self-evident value. That's not such an effective discussion pattern when the topic is management strategies.

I couldn't have said it better myself.

--
Robert Haas, EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Commitfest II CLosed
On Mon, Oct 21, 2013 at 11:10:09AM -0400, Tom Lane wrote:
> Heikki Linnakangas hlinnakan...@vmware.com writes:
>> On 21.10.2013 16:15, Peter Eisentraut wrote:
>>> What is the alternative?
>>
>> If no-one really cares enough about a patch to review it, mark it as rejected, because no-one but the patch author cares. Harsh, but that's effectively what pushing to the next commitfest means anyway.
>
> Well, that could be the problem, but it's also possible that no one could get to it in the allotted CF timeframe. Maybe the best-qualified reviewers were on vacation, or maybe there were just too many patches. I could see bouncing a patch on this basis if it doesn't get touched for, say, two consecutive CFs.

Maybe it would help if patches which got punted from the last commitfest without review were marked up in some way (red, bold) in the commitfest app, so reviewers are nudged to maybe consider picking them up first.

Michael
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
All,

Bringing this down to Earth: yes, it would be useful to have min and max for pg_stat_statements, and even if we add more stats to pg_stat_statements, it would be useful to have those two. So can we approve this patch on that basis?

For my part, I generally use the 9-part percentiles for query analysis (0, 5, 10, 25, 50, 75, 90, 95, 100). However, that's fairly expensive to calculate, and would require a histogram or other approaches mentioned earlier.

On 10/22/2013 11:16 AM, Jeff Janes wrote:
> It is easy to misinterpret the standard deviation if the distribution is not gaussian, but that is also true of the average. The standard deviation (or the variance) is commonly used with non-gaussian distributions, either because it is the most efficient estimator for those particular distributions, or just because it is so commonly available.

On the other hand, it's still true that a high STDDEV indicates a high variance in the response times of a particular query, whereas a low one indicates that most are close to the average. While precision math might not work if we don't have the correct distribution, for gross DBA checks it's still useful. That is, in many cases I can answer the question "Does this query have a high average because of outliers, or because it's consistently slow?" by looking at the STDDEV.

And FWIW, for sites where we monitor pg_stat_statements, we reset daily or weekly. Otherwise, the stats have no meaning.

--
Josh Berkus, PostgreSQL Experts Inc. http://pgexperts.com
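For what it's worth, a stddev column does not require keeping per-call samples: Welford's online method needs only a count, a running mean, and a sum of squared deviations per entry. A sketch of that accumulator (illustrative only, not the submitted patch; struct and field names are mine):

    #include <math.h>

    /* One accumulator per tracked statement: O(1) space, updated once
     * per execution with the elapsed time in milliseconds. */
    typedef struct QueryTimingStats
    {
        long    count;
        double  mean;       /* running mean of execution time */
        double  m2;         /* sum of squared deviations from the mean */
        double  min_time;
        double  max_time;
    } QueryTimingStats;

    static void
    timing_stats_update(QueryTimingStats *s, double elapsed_ms)
    {
        double delta;

        if (s->count == 0 || elapsed_ms < s->min_time)
            s->min_time = elapsed_ms;
        if (s->count == 0 || elapsed_ms > s->max_time)
            s->max_time = elapsed_ms;

        /* Welford's update: numerically stable running mean/variance. */
        s->count++;
        delta = elapsed_ms - s->mean;
        s->mean += delta / s->count;
        s->m2 += delta * (elapsed_ms - s->mean);
    }

    static double
    timing_stats_stddev(const QueryTimingStats *s)
    {
        /* Sample standard deviation; zero until we have two samples. */
        return (s->count > 1) ? sqrt(s->m2 / (s->count - 1)) : 0.0;
    }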
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Josh,

* Josh Berkus (j...@agliodbs.com) wrote:
> On the other hand, it's still true that a high STDDEV indicates a high variance in the response times of a particular query, whereas a low one indicates that most are close to the average. While precision math might not work if we don't have the correct distribution, for gross DBA checks it's still useful. That is, in many cases I can answer the question "Does this query have a high average because of outliers, or because it's consistently slow?" by looking at the STDDEV.

The concern is actually the reverse issue: often the question is "is this query ever really slow?", or "when is this query really slow?", and those questions are not answered by stddev, min, max, nor avg.

> And FWIW, for sites where we monitor pg_stat_statements, we reset daily or weekly. Otherwise, the stats have no meaning.

I have wondered if we (PG) should do that by default.. I agree that often they are much more useful when reset periodically. Of course, having actual historical information *would* be valuable, if you could identify the time range covered..

Thanks,
Stephen
Re: [HACKERS] Location for external scripts for Extensions?
Dimitri,

> Josh Berkus josh(at)agliodbs(dot)com writes:
>> pg_partman has several external (python) scripts which help the extension, located in /extras/ in its source. The problem currently is that if you install pg_partman via pgxn or package, you don't get those scripts, because there's no install location for them.
>
> See also my proposal to solve that; I'd welcome some design-level discussion about it:
> http://www.postgresql.org/message-id/m28uyzgof3@2ndquadrant.fr

Well, that's a bit more complicated than what I had in mind. I was thinking of just a location like $PGSHARE/extensions/lib/

Note that, in the case of pg_partman, we're talking about external scripts designed to be run by hand, not libraries to be called by the PL functions.

--
Josh Berkus, PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Sigh, my old HPUX box is totally broken by DSM patch
* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Oct 23, 2013 at 11:35 AM, Stephen Frost sfr...@snowman.net wrote:
>> Would this make sense as a configure-time check, rather than in initdb, to try blocking SIGSYS and checking for an ENOSYS from shm_open()?
>
> I don't see why. It's a run-time behavior; the build system may not be where the binaries will ultimately run.

I suppose; we just need to be more cautious when blocking signals in initdb than in a configure-time check, of course.

Thanks,
Stephen
Re: [HACKERS] Using indexes for ORDER BY and PARTITION BY clause in windowing functions
On Wed, Oct 23, 2013 at 10:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Sameer Kumar sameer.ku...@ashnik.com writes:
>> I am not sure why, but my PostgreSQL does not seem to be using indexes for the ORDER BY clause or PARTITION BY clause which I use with windowing function.
>
> When the entire contents of the table have to be read, a seqscan-and-sort will frequently be estimated as cheaper than an indexscan. If you think this is not true on your hardware, you might need to adjust random_page_cost.

My mistake. I had understood the issue wrongly. Actually, when I use functions like max() to find the maximum value grouped by another column, I get better performance than when I do the same operation using max() over(). Take a look at the plans below:

edb=# \d student_score
         Table "enterprisedb.student_score"
    Column    |          Type           | Modifiers
--------------+-------------------------+-----------
 id           | integer                 | not null
 student_name | character varying(1000) |
 score        | integer                 |
 course       | character varying(100)  |
Indexes:
    "student_score_pkey" PRIMARY KEY, btree (id)
    "idx_course" btree (course)
    "idx_score" btree (score)

edb=# select count(*) from student_score;
 count
--------
 122880

edb=# explain analyze select max(score) from student_score group by course;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 HashAggregate (cost=3198.20..3198.26 rows=6 width=9) (actual time=110.792..110.793 rows=6 loops=1)
   -> Seq Scan on student_score (cost=0.00..2583.80 rows=122880 width=9) (actual time=0.011..23.055 rows=122880 loops=1)
 Total runtime: 110.862 ms

edb=# explain analyze select max(score) over(partition by course) from student_score;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 WindowAgg (cost=0.00..10324.65 rows=122880 width=9) (actual time=36.145..224.504 rows=122880 loops=1)
   -> Index Scan using idx_course on student_score (cost=0.00..8481.45 rows=122880 width=9) (actual time=0.037..85.283 rows=122880 loops=1)
 Total runtime: 242.949 ms

As you can see, there is a difference of about 2x.

On similar lines, when I have to find the students who topped (had the highest score) per course, I fire something like:

edb=# explain analyze select student_name from student_score where (course, score) in (select course, max(score) from student_score group by course);
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Hash Semi Join (cost=3198.41..6516.76 rows=7300 width=43) (actual time=113.727..181.045 rows=555 loops=1)
   Hash Cond: (((enterprisedb.student_score.course)::text = (enterprisedb.student_score.course)::text) AND (enterprisedb.student_score.score = (max(enterprisedb.student_score.score))))
   -> Seq Scan on student_score (cost=0.00..2583.80 rows=122880 width=52) (actual time=0.009..22.702 rows=122880 loops=1)
   -> Hash (cost=3198.32..3198.32 rows=6 width=9) (actual time=111.521..111.521 rows=6 loops=1)
         Buckets: 1024 Batches: 1 Memory Usage: 1kB
[remainder of the plan truncated]
[HACKERS] RULE regression test fragility?
While reviewing the network stats traffic patch I discovered that the current regression test for rules depends on the system view definitions not changing:

--
-- Check that ruleutils are working
--
SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schema' ORDER BY viewname;

In this particular case new fields have been added to a view, breaking this apparently unrelated test. Is checking the definition of all views necessary for this test? Would it possibly be better to create a temporary view for this check, or is something else going on here?

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com | http://www.rrdonnelley.com
Re: [HACKERS] Commitfest II CLosed
On 10/23/2013 05:38 AM, Noah Misch wrote:
> We say little about the correct aspects of a patch; it's usually a given that things not mentioned are satisfactory and have self-evident value. That's not such an effective discussion pattern when the topic is management strategies.

It's not an effective discussion pattern when dealing with new code contributors either, or even some old ones.

--
Josh Berkus, PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Location for external scripts for Extensions?
On Wed, Oct 23, 2013 at 12:24 PM, Josh Berkus j...@agliodbs.com wrote:
> Well, that's a bit more complicated than what I had in mind. I was thinking of just a location like $PGSHARE/extensions/lib/
>
> Note that, in the case of pg_partman, we're talking about external scripts designed to be run by hand, not libraries to be called by the PL functions.

I suppose that Slony would be a possible other user of this facility... It has something of a mixture of things that get added:

a) There are the functions and tables that it adds. Ideally, these could be added in via CREATE EXTENSION, so that it would be somewhat logical for them to live in $PGSHARE/extensions. (I think that's somewhat tough to do using the CREATE EXTENSION facility as currently constituted, as the slony schema installation process, at present, does a number of remappings, putting FQ names inside functions and such.)

b) Then there are the crucial programs that it uses, slon (replication daemon) and slonik (configuration tool). Those seem to fit well with your intent; they are additional programs (not scripts per se, but I don't think that contradicts) that need to live somewhere reasonably near to the database installation.

By default, we stow slon/slonik in the same directory as is used by all the other PostgreSQL binaries (e.g. psql, pg_ctl, ...), that is, $BINDIR. Perhaps that is an OK place for pg_partman. It would seem pretty acceptable to me to consider declaring something like $PGSHARE/extensions/bin for such things.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [HACKERS] stats for network traffic WIP
I added this to the current CF, and am starting to review it as I have time.

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley

On Mon, Oct 21, 2013 at 11:32 AM, Stephen Frost sfr...@snowman.net wrote:
> Nigel,
>
> * Nigel Heron (nhe...@querymetrics.com) wrote:
>> Hi, I've been using postgres for many years but never took the time to play with the code until now. As a learning experience I came up with this WIP patch to keep track of the # of bytes sent and received by the server over its communication sockets. Counters are kept per database, per connection and globally/shared.
>
> Very neat idea. Please add it to the current commitfest (http://commitfest.postgresql.org) and, ideally, someone will get in and review it during the next CF.
>
> Thanks!
> Stephen
Re: [HACKERS] psql tab completion for updatable foreign tables
On Sat, Oct 19, 2013 at 5:44 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
> On 18 October 2013 16:41, Robert Haas robertmh...@gmail.com wrote:
>> On Fri, Oct 18, 2013 at 1:34 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
>>> Personally, I think this is too fancy anyway. I'd just complete all views and foreign tables and be done with it. We don't inspect permissions either, for example. This might be too confusing for users. Yeah, I think you're probably right.
>>
>> I tend to agree. When the rules were simple (i.e. pretty much nothing was updateable) it might have made sense to make tab completion hew to them, but they're complex enough now that I think it no longer does. There are now three different ways that a view can be updateable (auto, trigger, rule) and the rules are complex. Based on that it sounds like we need a new version of this patch. If that's not going to happen RSN, we should mark this returned with feedback and it can be resubmitted if and when someone finds the time to update it.
>
> OK, here's a new version that just completes with all tables, views and foreign tables. Doing this makes the insert, update and delete queries all the same, which means there's not much point in keeping all three, so I've just kept Query_for_list_of_updatables for use with INSERT, UPDATE and DELETE.

Committed.

--
Robert Haas, EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] stats for network traffic WIP
Hi, thanks, I'm still actively working on this patch. I've gotten the traffic counters working when using SSL-enabled clients (the counts include the SSL overhead now), but I still have the walsender transfers under SSL to work on. I'll post an updated patch when I have it figured out.

Since the patch changes some views in pg_catalog, a regression test fails .. I'm not sure what to do next. Change the regression test in the patch, or wait until the review phase?

I was also thinking of adding global counters for the stats collector (pg_stat* file read/write bytes + packets lost) and also log file io (bytes written for txt and csv formats) .. any interest?

-nigel.

On Wed, Oct 23, 2013 at 12:50 PM, Mike Blackwell mike.blackw...@rrd.com wrote:
> I added this to the current CF, and am starting to review it as I have time.
Re: [HACKERS] stats for network traffic WIP
Sounds good. I personally don't have any interest in log file I/O counters, but that's just me. I wonder if stats collector counters might be useful... I seem to recall an effort to improve that area. Maybe not enough use to take the performance hit on a regular basis, though.

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley

On Wed, Oct 23, 2013 at 1:44 PM, Nigel Heron nhe...@querymetrics.com wrote:
> I was also thinking of adding global counters for the stats collector (pg_stat* file read/write bytes + packets lost) and also log file io (bytes written for txt and csv formats) .. any interest?
Re: [HACKERS] stats for network traffic WIP
On Thu, Oct 24, 2013 at 12:23 AM, Mike Blackwell mike.blackw...@rrd.com wrote:
> Sounds good. I personally don't have any interest in log file I/O counters, but that's just me. I wonder if stats collector counters might be useful... I seem to recall an effort to improve that area. Maybe not enough use to take the performance hit on a regular basis, though.

+1. I tend to be a bit touchy about any changes to code that runs frequently. We need to seriously test whether the overhead added by this patch is worth it.

IMO the idea is pretty good; it's just that we need to do some wide-spectrum performance testing. That's only my thought, though.

Regards,
Atri l'apprenant
Re: [HACKERS] stats for network traffic WIP
On Wed, Oct 23, 2013 at 1:58 PM, Atri Sharma atri.j...@gmail.com wrote:
> IMO the idea is pretty good; it's just that we need to do some wide-spectrum performance testing. That's only my thought, though.

I'm looking at trying to do some performance testing on this. Any suggestions on test scenarios, etc?

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
Re: [HACKERS] stats for network traffic WIP
On Wed, Oct 23, 2013 at 2:58 PM, Atri Sharma atri.j...@gmail.com wrote:
> +1. I tend to be a bit touchy about any changes to code that runs frequently. We need to seriously test whether the overhead added by this patch is worth it.

I didn't implement the code yet, but my impression is that since it will be the stats collector gathering counters about itself there will be very little overhead (no message passing, etc.) .. just a few int calculations and storing a few more bytes in the global stats file. The log file io tracking would generate some overhead though, similar to network stats tracking.

I think the stats collector concerns voiced previously on the list were more about per-relation stats, which create a lot of io on servers with many tables. Adding global stats doesn't seem as bad to me.

-nigel.
Re: [HACKERS] stats for network traffic WIP
On Thu, Oct 24, 2013 at 12:30 AM, Mike Blackwell mike.blackw...@rrd.com wrote:
> I'm looking at trying to do some performance testing on this. Any suggestions on test scenarios, etc?

Umm... lots of clients together would be the first obvious test that comes to my mind. One thing to look at would be erratic clients: if some clients connect and disconnect within a short span of time, we should check that the collector works fine there.

Also, we should verify the accuracy of the statistics collected. A small deviation is fine, but we should do a formal test, just to be sure.

Does anyone think that the new untracked ports introduced by the patch could pose a problem? I am not sure there.

I haven't taken a deep look at the patch yet, but I will try to do so. However, since I will be in Dublin next week, my inputs may be delayed a bit. The plus side is that I will discuss this with lots of people there.

Adding myself as the co-reviewer, specifically for testing purposes, if that's OK with you.

Regards,
Atri l'apprenant
Re: [HACKERS] stats for network traffic WIP
On Wed, Oct 23, 2013 at 2:10 PM, Atri Sharma atri.j...@gmail.com wrote:
> Adding myself as the co-reviewer, specifically for testing purposes, if that's OK with you.

It's perfectly fine with me. Please do!

Mike Blackwell | Technical Analyst, Distribution Services/Rollout Management | RR Donnelley
Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK
On 23 October 2013 02:18, Andres Freund and...@2ndquadrant.com wrote:
> Hi,
>
> Using the same debugging hack^Wpatch (0001) as in the matview patch (0002) an hour or so ago, I noticed that INSERT INTO view WITH CHECK doesn't lock the underlying relations properly. I've attached a sort-of-working (0003) hack, but I really doubt it's the correct approach; I don't really know enough about that area of the code.

This looks like something that needs to be fixed.

Hmm, my first thought is that rewriteTargetView() should be calling AcquireRewriteLocks() on viewquery, before doing too much with it. There may be sub-queries in viewquery's quals (and also now in its targetlist), and I don't think the relations referred to by those sub-queries are getting locked.

I think that any code that is doing anything significant with a rule action's query needs to think about locking the query's relations. I did a quick search, and the only suspicious code I found was the matview and auto-updatable view code.

Regards,
Dean

> Also attached is 0004 which just adds a heap_lock() around a newly created temporary table in the matview code, which shouldn't be required for correctness but gives warm and fuzzy feelings as well as less debugging noise.
>
> Wouldn't it be a good idea to tack such WARNINGs (in a proper and clean form) onto index_open (checking the underlying relation is locked), relation_open(..., NoLock) (checking the relation has previously been locked) and maybe RelationIdGetRelation() when cassert is enabled? ISTM we frequently had bugs around this.
>
> Greetings,
> Andres Freund
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 9:20 AM, Stephen Frost sfr...@snowman.net wrote:
> The concern is actually the reverse issue: often the question is "is this query ever really slow?", or "when is this query really slow?", and those questions are not answered by stddev, min, max, nor avg.

How does max not answer "is this query ever really slow?"

But good point: if we have a max, then I think a timestamp for when that max was obtained would also be very useful.

Cheers,
Jeff
Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK
On 2013-10-23 20:51:27 +0100, Dean Rasheed wrote:
> Hmm, my first thought is that rewriteTargetView() should be calling AcquireRewriteLocks() on viewquery, before doing too much with it. There may be sub-queries in viewquery's quals (and also now in its targetlist), and I don't think the relations referred to by those sub-queries are getting locked.

Well, that wouldn't follow the currently documented rule on top of QueryRewrite:

 * NOTE: the parsetree must either have come straight from the parser,
 * or have been scanned by AcquireRewriteLocks to acquire suitable locks.

It might still be the right thing to do, but it seems suspicious that the rules need to be tweaked like that.

> I think that any code that is doing anything significant with a rule action's query needs to think about locking the query's relations. I did a quick search, and the only suspicious code I found was the matview and auto-updatable view code.

Yea, those were the locations the debugging patch cried on...

Greetings,
Andres Freund

--
Andres Freund, http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
* Jeff Janes (jeff.ja...@gmail.com) wrote:
> How does max not answer "is this query ever really slow?"

meh. max can end up being high for about a bazillion reasons, and it'd be difficult to really get any understanding of how or why it happened from just that information.

> But good point: if we have a max, then I think a timestamp for when that max was obtained would also be very useful.

And now we're getting into exactly what I was trying to suggest up-thread: outlier detection and increased logging when a rare event occurs..

Thanks,
Stephen
Re: [HACKERS] missing locking in at least INSERT INTO view WITH CHECK
On 23 October 2013 21:08, Andres Freund and...@2ndquadrant.com wrote:
> Well, that wouldn't follow the currently documented rule on top of QueryRewrite:
>
>  * NOTE: the parsetree must either have come straight from the parser,
>  * or have been scanned by AcquireRewriteLocks to acquire suitable locks.
>
> It might still be the right thing to do, but it seems suspicious that the rules need to be tweaked like that.

Well, it matches what already happens in other places in the rewriter --- see rewriteRuleAction() and ApplyRetrieveRule(). It's precisely because the rule action's query hasn't come from the parser that it needs to be processed in this way.

Regards,
Dean
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 1:07 PM, Jeff Janes jeff.ja...@gmail.com wrote: How does max not answer "is this query ever really slow"? But good point, if we have a max, then I think a time-stamp for when that max was obtained would also be very useful. I'm concerned about the cost of all of this. And like Stephen, I'm not too impressed by the idea of a permanent max - it's going to be some value from before the cache was warmed the large majority of the time. I think that there are some big savings to be made now that the query text is only useful to humans, and isn't compared directly for the purposes of matching and so on. Generally speaking, a human will inquire about query execution costs far less frequently than the system spends aggregating them. So fixing that problem would go a long way towards resolving these concerns. It would also probably have the benefit of making it possible for query texts to be arbitrarily long - we'd be storing them in files (with a shared memory buffer). I get a lot of complaints about the truncation of query texts in pg_stat_statements, so I think that'd be really valuable. It would make far higher pg_stat_statements.max values practical to boot, by radically reducing the amount of shared memory required. All of this might be a bit tricky, but I suspect it's well worth it. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
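To make the externally-stored-text idea concrete, here is a sketch of the layout change being floated (the first struct is roughly the 9.3-era pgssEntry; the second is hypothetical, including its field names):

/* Today, roughly: the text lives in shared memory with the entry. */
typedef struct pgssEntry
{
    pgssHashKey key;        /* hash key of entry */
    Counters    counters;   /* the statistics for this query */
    int         query_len;  /* # of valid bytes in query string */
    slock_t     mutex;      /* protects the counters only */
    char        query[1];   /* query text, truncated, variable length */
} pgssEntry;

/* Sketch: keep only a reference into an external query-text file. */
typedef struct pgssEntrySketch
{
    pgssHashKey key;
    Counters    counters;
    slock_t     mutex;
    Size        query_offset;   /* byte offset into the text file */
    int         query_len;      /* length of the text at that offset */
} pgssEntrySketch;

Since the text is never consulted for matching (the hash key is), the file would only need to be read when a human actually selects from the view, which is rare.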
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Tue, Oct 22, 2013 at 11:16:19AM -0700, Jeff Janes wrote: On Mon, Oct 21, 2013 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Hm. It's been a long time since college statistics, but doesn't the entire concept of standard deviation depend on the assumption that the underlying distribution is more-or-less normal (Gaussian)? It is easy to misinterpret the standard deviation if the distribution is not gaussian, but that is also true of the average. The standard deviation (or the variance) is commonly used with non-gaussian distributions, either because it is the most efficient estimator for those particular distributions, or just because it is so commonly available. Well, the standard deviation is the square root of the variance, which is the second moment of the distribution. The first moment being the mean. No matter what distribution it is, these are useful numbers. If I had to guess a distribution for query runtimes I'd go for Poisson, which would mean you'd expect the mean to equal the variance. Don't have enough experience with such measurements to say whether that is reasonable. Have a nice day, -- Martijn van Oosterhout klep...@svana.org http://svana.org/kleptog/ He who writes carelessly confesses thereby at the very outset that he does not attach much importance to his own thoughts. -- Arthur Schopenhauer signature.asc Description: Digital signature
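For what it's worth, a running standard deviation needs no stored samples and no distributional assumption to compute: Welford's online algorithm accumulates it incrementally. A minimal sketch (illustration only, not the submitted patch, whose fields may be named differently):

#include <stdint.h>
#include <math.h>

typedef struct
{
    uint64_t n;     /* number of observations */
    double   mean;  /* running mean */
    double   m2;    /* running sum of squared deviations from the mean */
} RunningStats;

static void
stats_accum(RunningStats *s, double x)
{
    double delta = x - s->mean;

    s->n += 1;
    s->mean += delta / s->n;
    s->m2 += delta * (x - s->mean);   /* note: uses the updated mean */
}

/* population variance is m2 / n; stddev is its square root */
static double
stats_stddev(const RunningStats *s)
{
    return (s->n > 0) ? sqrt(s->m2 / s->n) : 0.0;
}

Whether the resulting number is meaningful for a given distribution is the interpretation question above; computing it is cheap either way.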
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 2:46 PM, Martijn van Oosterhout klep...@svana.org wrote: Well, the standard deviation is the square root of the variance, which is the second moment of the distribution. The first moment being the mean. No matter what distribution it is, these are useful numbers. I'm inclined to agree. Perhaps more importantly, like the mean, the stddev is the truth, even if it doesn't tell the full story. This data will always need to be interpreted by a reasonably well informed human. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Autotuning and RAM survey results
Folks, Of course, this survey is completely unscientific. However, it does give us more information than we had before: http://www.postgresql.org/community/survey/89-what-is-the-smallest-amount-of-ram-you-have-on-a-production-postgresql-server/ So going by this, only 20% of our users have any production servers with RAM less than 1GB. And, I suspect, if we could ask how likely those instances are to be upgraded, the numbers would erode further. That supports the argument that it makes sense to release with a default config which favors users with over 1GB of RAM. We'll want to add an example doc of recalibrating for small RAM, but I don't think that's that hard. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 10:34, Marc Mamin wrote: Oscillating plan changes may fit multimodal but I don't feel that's typical. My experience has been it's either an extremely rare plan difference or it's a shift from one plan to another over time. After all, avg, min, max and stdev are only numerical values for a predictive model. They lack the robustness and strictness of something like Write Ahead Logging; it resembles a weather forecast. They are still better than nothing. Human judgment is needed to finally infer a cause from the numerical values. By the way, we can guess the probability of a value from the stdev, so we can easily guess whether an extreme value in min/max is normal or not. What I've been gathering from my quick chat this morning is that either you know how to characterize the distribution, and then the min, max and average are useful on their own, or you need to keep track of a histogram where all the bins are of the same size to be able to learn what the distribution actually is. Hello, We have an in-house reporting application doing a lot of response-time graphing. Our experience has shown that in many cases of interest (the ones you want to dig into) a logarithmic scale for histogram bins results in a better visualization. attached an example from a problematic postgres query... my 2 pences, Marc Mamin Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. Especially if the size of the first bin can be set - as some people would like it to be 100us and others might prefer 1ms or something else. Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 11:26, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? To simplify things, I'm using 5 buckets, but 32 would be better. Assume the first bucket width is 1ms.

bucket   range
  0             x <= 1ms
  1      1ms <  x <= 2ms
  2      2ms <  x <= 4ms
  3      4ms <  x <= 8ms
  4      8ms <  x

If the size of the first bucket changed, then implicitly the histogram would be restarted, as there is no meaningful way of using any data from the existing histogram - even if the size of the first bucket was a power of 2 greater than the old one (here things are fine, until you try and apportion the data in the last bucket!). Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 3:26 PM, Peter Geoghegan p...@heroku.com wrote: On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? If you are doing a log scale in the bucket widths, 32 buckets covers a huge range, so I think you could get away without having the ability to rescale at all. For example, just have the bottom bucket mean <= 1ms, and the top bucket mean > 12.42 days (rather than between 12.42 and 24.85 days). But it should be possible to rescale if you really want to. If a bucket > 32 is needed, just add bucket[2] into bucket[1], slide buckets 3..32 down one each, and initialize a new bucket 32, and bump the factor that says how many times this shift has been done before. Cheers, Jeff
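A sketch of the bucket selection this scheme implies (bottom bucket meaning x <= 1ms, each later bucket doubling; here the top bucket simply absorbs anything larger, for the no-rescaling variant):

#define NUM_BUCKETS 32

/* Map an elapsed time in milliseconds to a log-scale bucket index. */
static int
histogram_bucket(double elapsed_ms)
{
    int     bucket = 0;
    double  upper = 1.0;    /* bucket 0 covers x <= 1ms */

    while (elapsed_ms > upper && bucket < NUM_BUCKETS - 1)
    {
        upper *= 2.0;       /* each bucket doubles the upper bound */
        bucket++;
    }
    return bucket;          /* bucket 31 is effectively unbounded */
}

With a 1ms bottom bucket, bucket 31 starts at 2^30 ms, which is where the 12.42 days figure comes from.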
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 12:00, Gavin Flower wrote: On 24/10/13 11:26, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? To simplify things, I'm using 5 buckets, but 32 would be better. Assume the first bucket width is 1ms.

bucket   range
  0             x <= 1ms
  1      1ms <  x <= 2ms
  2      2ms <  x <= 4ms
  3      4ms <  x <= 8ms
  4      8ms <  x

If the size of the first bucket changed, then implicitly the histogram would be restarted, as there is no meaningful way of using any data from the existing histogram - even if the size of the first bucket was a power of 2 greater than the old one (here things are fine, until you try and apportion the data in the last bucket!). Cheers, Gavin Argh! Just realized that even if the size of the first bucket was a power of 2 greater than the old one, then you can't meaningfully use any of the old data in any of the old buckets (this is 'obvious' but somewhat messy to explain!) Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:00 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 24/10/13 11:26, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? To simplify things, I'm using 5 buckets, but 32 would be better. Assume the first bucket width is 1ms.

bucket   range
  0             x <= 1ms
  1      1ms <  x <= 2ms
  2      2ms <  x <= 4ms
  3      4ms <  x <= 8ms
  4      8ms <  x

The last bucket would be limited to 8ms < x <= 16ms. If you find something > 16ms, then you have to rescale *before* you increment any of the buckets. Once you do, there is now room to hold it.

bucket   range
  0              x <= 2ms   (sum of previous bucket 0 and previous bucket 1)
  1       2ms <  x <= 4ms
  2       4ms <  x <= 8ms
  3       8ms <  x <= 16ms
  4      16ms <  x <= 32ms  (starts empty)

Cheers, Jeff
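Jeff's rescale step, sketched in C (assuming 1-based bucket[1..32] and a shift counter, per his description; an untested illustration):

/* Called when an observation falls beyond the range of bucket[32]. */
static void
histogram_rescale(int64 bucket[33], int *shift_count)
{
    int i;

    bucket[1] += bucket[2];         /* merge the two smallest buckets */
    for (i = 2; i < 32; i++)
        bucket[i] = bucket[i + 1];  /* slide buckets 3..32 down one each */
    bucket[32] = 0;                 /* fresh, empty top bucket */
    (*shift_count)++;               /* the factor recording how many times
                                     * the scale has been doubled */
}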
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:14 PM, Jeff Janes jeff.ja...@gmail.com wrote: The last bucket would be limited to 8ms < x <= 16ms. If you find something > 16ms, then you have to rescale *before* you increment any of the buckets. Once you do, there is now room to hold it. How is that laid out in shared memory? If the answer is an array of 32 int64s, one per bucket, -1 from me to this proposal. A huge advantage of pg_stat_statements today is that the overhead is actually fairly modest. I really want to preserve that property. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
* Martijn van Oosterhout (klep...@svana.org) wrote: If I had to guess a distribution for query runtimes I'd go for Poisson, which would mean you'd expect the mean to equal the variance. Don't have enough experience with such measurements to say whether that is reasonable. I was thinking the same, but I'm not sure how that really helps us. The histogram is a good idea, imv, and I'll add my vote for implementing that. If it's too expensive to do currently then we need to work out a way to make it cheaper. The log-based histogram as an array w/ NULLs for any empty buckets might not be that terrible. Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] CLUSTER FREEZE
Hi I noticed that CLUSTER doesn't have a FREEZE option. Here is a patch to add that, for consistency with VACUUM. Is it useful? Thanks Thomas Munro cluster-freeze.patch Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 12:14, Jeff Janes wrote: On Wed, Oct 23, 2013 at 4:00 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: On 24/10/13 11:26, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 2:57 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Looks definitely bimodal in the log version, very clear! Yes, I feel that having a 32 log binary binned histogram (as Alvaro Herrera suggested) would be very useful. I'm having a hard time imagining how you'd actually implement that. For example, this: https://wiki.postgresql.org/wiki/Aggregate_Histogram requires that a limit be specified ahead of time. Is there a principled way to increase or decrease this kind of limit over time, and have the new buckets' contents spill into each other? To simplify things, I'm using 5 buckets, but 32 would be better. Assume the first bucket width is 1ms.

bucket   range
  0             x <= 1ms
  1      1ms <  x <= 2ms
  2      2ms <  x <= 4ms
  3      4ms <  x <= 8ms
  4      8ms <  x

The last bucket would be limited to 8ms < x <= 16ms. If you find something > 16ms, then you have to rescale *before* you increment any of the buckets. Once you do, there is now room to hold it.

bucket   range
  0              x <= 2ms   (sum of previous bucket 0 and previous bucket 1)
  1       2ms <  x <= 4ms
  2       4ms <  x <= 8ms
  3       8ms <  x <= 16ms
  4      16ms <  x <= 32ms  (starts empty)

Cheers, Jeff It is very important that the last bucket be unbounded, or you may lose potentially important data. Especially if one assumes that all durations will fit into the first n - 1 buckets, in which case being alerted to the assumption being significantly wrong is crucial! The logic to check on the values for the last bucket is trivial, so there is no need to have an upper limit for it. Cheers, Gavin
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
* Peter Geoghegan (p...@heroku.com) wrote: On Wed, Oct 23, 2013 at 4:14 PM, Jeff Janes jeff.ja...@gmail.com wrote: The last bucket would be limited to 8ms < x <= 16ms. If you find something > 16ms, then you have to rescale *before* you increment any of the buckets. Once you do, there is now room to hold it. How is that laid out in shared memory? If the answer is an array of 32 int64s, one per bucket, -1 from me to this proposal. A huge advantage of pg_stat_statements today is that the overhead is actually fairly modest. I really want to preserve that property. Any chance we could accumulate into the histogram in local memory and only push to the shared memory on an infrequent or at least less frequent basis? Apologies, I've not looked into the pg_stat_statements bits all that closely, but I can certainly see how having to hold it all in shared memory with locking to update would be painful.. Thanks, Stephen signature.asc Description: Digital signature
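Nothing like this exists in pg_stat_statements today, but as a sketch of the idea (every name below is hypothetical), local accumulation with a periodic flush might look like:

#define FLUSH_EVERY 100             /* hypothetical flush interval */

/* Backend-local deltas for one entry, merged into shared memory rarely. */
typedef struct LocalDelta
{
    int64   histogram[32];          /* log-scale buckets, local only */
    int     pending;                /* statements since the last flush */
} LocalDelta;

static void
delta_flush_if_due(LocalDelta *d, volatile pgssEntry *shared)
{
    int i;

    if (++d->pending < FLUSH_EVERY)
        return;

    /* one short critical section per FLUSH_EVERY statements */
    SpinLockAcquire(&shared->mutex);
    for (i = 0; i < 32; i++)
        shared->counters.histogram[i] += d->histogram[i];  /* hypothetical field */
    SpinLockRelease(&shared->mutex);

    memset(d->histogram, 0, sizeof(d->histogram));
    d->pending = 0;
}

As Peter notes below, the hard parts are where the deltas live and how readers get consistent totals, which this sketch ignores.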
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 10/23/2013 01:26 PM, Peter Geoghegan wrote: So fixing that problem would go a long way towards resolving these concerns. It would also probably have the benefit of making it possible for query texts to be arbitrarily long - we'd be storing them in files (with a shared memory buffer). I get a lot of complaints about the truncation of query texts in pg_stat_statements, so I think that'd be really valuable. It would make far higher pg_stat_statements.max values practical to boot, by radically reducing the amount of shared memory required. All of this might be a bit tricky, but I suspect it's well worth it. So you're suggesting that instead of storing the aggregates as we currently do, we store a buffer of the last N queries (in normal form) and their stats? And then aggregate when the user asks for it? That would be nice, and IMHO the only way we can really resolve all of these stats concerns. Any approach we take using histograms etc. is going to leave out some stat someone needs. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:34 PM, Stephen Frost sfr...@snowman.net wrote: Any chance we could accumulate into the histogram in local memory and only push to the shared memory on an infrequent or at least less frequent basis? Apologies, I've not looked into the pg_stat_statements bits all that closely, but I can certainly see how having to hold it all in shared memory with locking to update would be painful.. I doubt it. That trick I proposed around storing the query text in an external file is only feasible because we never update the query text, and we hardly ever care what it actually is in practice. Contrast that with something that is in a structure protected by a spinlock. You'd have to keep deltas stashed in TopMemoryContext, and for all kinds of common cases that just wouldn't work. Plus you have to have some whole new mechanism for aggregating the stats across backends when someone expresses an interest in seeing totals. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 12:24, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 4:14 PM, Jeff Janes jeff.ja...@gmail.com wrote: The last bucket would be limited to 8ms < x <= 16ms. If you find something > 16ms, then you have to rescale *before* you increment any of the buckets. Once you do, there is now room to hold it. How is that laid out in shared memory? If the answer is an array of 32 int64s, one per bucket, -1 from me to this proposal. A huge advantage of pg_stat_statements today is that the overhead is actually fairly modest. I really want to preserve that property. 32 int64 buckets is only 256 bytes, so a thousand histograms would be less than a quarter of a MB. Any machine that busy would likely have many GB's of RAM. I have 32 GB on my development machine. Though, I suppose that the option to have such histograms could be off by default, which would seem reasonable. How about a convention not to have histograms when the parameter specifying the width of the first bucket was either missing or set to zero (assuming a negative value would be a syntax error!). Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:46 PM, Josh Berkus j...@agliodbs.com wrote: So you're suggesting that instead of storing the aggregates as we currently do, we store a buffer of the last N queries (in normal form) and their stats? And then aggregate when the user asks for it? No, I'm not. I'm suggesting storing the query texts externally, in a file. They usually use 1024 bytes of shared memory per entry, regardless of how long the query text is. This would allow pg_stat_statements to store arbitrarily large query texts, while also giving us breathing room if we have ambitions around expanding what pg_stat_statements can (optionally) track. Having said that, I am still pretty sensitive to bloating pg_stat_statements. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 12:46, Josh Berkus wrote: On 10/23/2013 01:26 PM, Peter Geoghegan wrote: So fixing that problem would go a long way towards resolving these concerns. It would also probably have the benefit of making it possible for query texts to be arbitrarily long - we'd be storing them in files (with a shared memory buffer). I get a lot of complaints about the truncation of query texts in pg_stat_statements, so I think that'd be really valuable. It would make far higher pg_stat_statements.max values practical to boot, by radically reducing the amount of shared memory required. All of this might be a bit tricky, but I suspect it's well worth it. So you're suggesting that instead of storing the aggregates as we currently do, we store a buffer of the last N queries (in normal form) and their stats? And then aggregate when the user asks for it? That would be nice, and IMHO the only way we can really resolve all of these stats concerns. Any approach we take using histograms etc. is going to leave out some stat someone needs. I don't see it as either/or. It might be useful to optionally allow the last n queries to be stored, as you suggested. People could decide for themselves how much storage they are willing to allocate for the purpose. Storing the last n queries could be helpful in seeing why some users are suddenly experiencing very slow response times. I think the histogram would be more useful over a day or a week. Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:48 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: 32 int64 buckets is only 256 bytes, so a thousand histograms would be less than a quarter of a MB. Any machine that busy would likely have many GB's of RAM. I have 32 GB on my development machine. Who wants to just run with a thousand entries? I have many small instances running on AWS where that actually is an appreciable amount of memory. Individually, any addition to pg_stat_statements shared memory use looks small, but that doesn't mean we want every possible thing. Furthermore, you're assuming that this is entirely a matter of how much memory we use out of how much is available, and I don't understand it that way. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 4:51 PM, Peter Geoghegan p...@heroku.com wrote: No, I'm not. I'm suggesting storing the query texts externally, in a file. They usually use 1024 bytes of shared memory per entry, regardless of how long the query text is. I should add that I think that that's about the useful limit of such schemes. Maybe we could buy a bit more breathing room by storing some of the stats externally, but I doubt it'd be worth it. I'm not interested in optimizing pg_stat_statements in the direction of supporting aggregating a number of distinct entries past much more than 10,000. I am interested in making it store richer statistics, provided we're very careful about the costs. Every time those counters are incremented, a spinlock is held. I don't want everyone to have to pay any non-trivial additional cost for that, given that the added instrumentation may not actually be that useful to most users who just want a rough picture. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
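The critical section in question, roughly as it stands in the 9.3-era pgss_store() (trimmed from memory; the point is that every new Counters field extends this spinlocked stretch, paid on every statement):

{
    /* volatile pointer makes the compiler respect the spinlock */
    volatile pgssEntry *e = (volatile pgssEntry *) entry;

    SpinLockAcquire(&e->mutex);
    e->counters.calls += 1;
    e->counters.total_time += total_time;
    e->counters.rows += rows;
    /* ... buffer-usage counters, etc.; each new field adds here ... */
    SpinLockRelease(&e->mutex);
}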
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On 24/10/13 12:58, Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 4:48 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: 32 int64 buckets is only 256 bytes, so a thousand histograms would be less than a quarter of a MB. Any machine that busy would likely have many GB's of RAM. I have 32 GB on my development machine. Who wants to just run with a thousand entries? I have many small instances running on AWS where that actually is an appreciable amount of memory. Individually, any addition to pg_stat_statements shared memory use looks small, but that doesn't mean we want every possible thing. Furthermore, you're assuming that this is entirely a matter of how much memory we use out of how much is available, and I don't understand it that way. Anyhow, I was suggesting the facility be off by default - I see no point in enabling it where people don't need it, even if the resources (RAM, processor, whatever) were minimal. Cheers, Gavin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
On Wed, Oct 23, 2013 at 5:15 PM, Gavin Flower gavinflo...@archidevsys.co.nz wrote: Anyhow, I was suggesting the faclity be off by default - I see no point in enabling where people don't need it, even if the resource RAM, processor, whatever, were minimal. As long as any new field in the Counters struct needs to be protected by a spinlock, I will be suspicious of the cost. The track_io_timing stuff is still protected, even when it's turned off. So I'm afraid that it isn't that simple. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] RULE regression test fragility?
Mike Blackwell mike.blackw...@rrd.com writes: While reviewing the Network Stats Traffic patch I discovered the current regression test for rules depends on the system view definitions not changing: Yes, this is standard. We just update the expected output anytime somebody changes a system view. (Now, if the submitter failed to notice that his patch broke the regression tests, that's grounds to wonder how much he tested it. But it's not unexpected for that test's output to change.) [ Is it really a good idea for the regression tests to do that? ] I tend to think so, as it seems like a good stress test for the rule-dumping machinery. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] high-dimensional knn-GIST tests (was Re: Cube extension kNN support)
On 9/22/13 4:38 PM, Stas Kelvich wrote: Hello, hackers. Here is the patch that introduces kNN search for cubes with euclidean, taxicab and chebyshev distances. Thanks for this! I decided to give the patch a try at the bleeding edge with some high-dimensional vectors, specifically the 1.4 million 1000-dimensional Freebase entity vectors from the Google 'word2vec' project: https://code.google.com/p/word2vec/#Pre-trained_entity_vectors_with_Freebase_naming

Unfortunately, here's what I found:

* with 1000-dimensional vectors, the index size on disk grows to many times (1000x or more) the size of the indexed data, making other tests of the index's helpfulness impractical. (Perhaps other parameter-tuning can help?)

* with 500-dimensional or 100-dimensional vectors, the index size is more manageable -- 2x to 6x the data size -- but using the index significantly hurts performance on top-10 nearest-neighbor searches, making it much slower than a full table scan. (The planner still tries to use the index, even though it's hurting.)

Much more detail for the curious is below...

Regarding the dataset & hoped-for results: The word2vec project's example scripts do their synonym/analogy demonstrations by loading the entire 5GB+ dataset into main memory (~3min), then doing a full scan of all vectors (~40sec) to find those nearest a target vector. My motivating question was: could the data be loaded into PostgreSQL using the cube type, and kNN-GIST indexed using this patch, then do speedy index-assisted ranked-nearest-neighbor queries from the database? (While the patch's distance_euclid is not the same cosine-distance the word2vec examples use, the freebase entity vectors are all unit vectors, and even additions of vectors can be scaled to unit length. My intuition is that euclidean distances on the unit sphere will be in the same rank order as cosine distance - for unit vectors, ||a - b||^2 = 2 - 2*cos(theta), so the two are monotonically related - and thus the cube distance_euclid/<-> should enable the same sort of synonym/analogy demos.)

Regarding the 1000-dimensional case: It was necessary to change three compiled-in implementation limits, in four steps, in the order that the need for change became evident:

(1) in contrib/cube/cubedata.h, increase CUBE_MAX_DIM (to accept vectors with more than 100 dimensions)

(2) in src/include/pg_config.h, increase BLCKSZ to 16384 (otherwise 1000 64-bit floats in a single column gave an ERROR: row is too big, mentioning the 8160 limit - large cube values not TOASTable?)

(3) in src/include/access/itup.h, increase INDEX_SIZE_MASK to 0x3fff (otherwise encountering an ERROR: index row requires 16016 bytes, maximum size is 8191 when attempting to create the index)

(4) in src/include/pg_config.h, again increase BLCKSZ, now to 32768 (otherwise encountering an ERROR: index row size 16416 exceeds maximum 5440 for index pg_class_relname_nsp_index when attempting to create the index)

With the cube-kNN patch applied and these other changes, I was able to import the 1.4M freebase vectors and do a full-scan nearest-neighbors query. (My starting postgresql codebase was the github mirror of 9.4dev as of about a week ago.) The psql transcript:

word2vec=# \timing
Timing is on.
word2vec=# CREATE EXTENSION cube;
CREATE EXTENSION
Time: 42.141 ms
word2vec=# CREATE TABLE word2vec ( word varchar(200), vec cube );
CREATE TABLE
Time: 8.532 ms
word2vec=# COPY word2vec FROM PROGRAM 'zcat /tmp/pgvectors/freebase-vectors-skipgram1000.pgtxt.gz';
COPY 1422903
Time: 12399065.498 ms
word2vec=# SELECT word, dist FROM (SELECT word, distance_euclid(vec,(SELECT vec FROM word2vec WHERE word='geoffrey hinton')) AS dist FROM word2vec) AS subquery ORDER BY dist LIMIT 11;
          word           |       dist
-------------------------+------------------
 geoffrey hinton         |                0
 marvin minsky           | 1.03892498287268
 paul corkum             | 1.05221701690288
 william richard peltier | 1.06244397334495
 brenda milner           | 1.06387762685894
 john charles polanyi    | 1.0746452295
 leslie valiant          | 1.07735786596934
 hava siegelmann         | 1.08148623006629
 hans moravec            | 1.0862034591185
 david rumelhart         | 1.08748431130477
 godel prize             | 1.08774264379264
(11 rows)

Time: 310753.976 ms

That's 3.5 hours to do the import and 5 minutes to do the query; this is on a 2009 MacBook Pro with 8GB RAM and SSD. Confirming the intuition above, these 10-nearest are the same entities in the same order as on the word2vec project page's example output, though the euclidean distances are of course different than the cosine distances. The DATA directory is 23GB after the import of 1.4 million rows. In the word2vec uncompressed binary format, this dataset is about 5.4GB, so this word2vec-table cube-column representation involves about 4X expansion.

So, to the main question: can that query be sped up by building a kNN-GIST index? Here the problems start.

word2vec=# CREATE INDEX word2vec_index ON word2vec
[HACKERS] Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Sameer Kumar wrote: edb=# explain analyze select max(score) from student_score group by course; This query returns 6 records. The window one returns 123,000. Why do you expect these to have anywhere near the same performance or plan? You can enable/disable indexes/scans to see what alternative plans may provide, but nothing here stands out as being obviously incorrect. I'm not really clear on what your question is. Generally it sounds as if you are wondering if there are any plans to improve the algorithms behind window function processing. Are you just looking at symptoms, and thus possibly have unreasonable expectations, or do you actually see an avenue for improvement in the engine? QUERY PLAN | Sort Method: external merge Disk: 7576kB Work memory: I/O is killing your performance on this query. It is more flexible but you pay a price for that. Another thing, (I may be stupid and naive here) does PostgreSQL re-use the hash which has been already created for the sort? In this case the inner query must have created a hash for the windowing aggregate. Can't we use that same one while applying the filter rn=1? Probably, but others more knowledgeable will need to answer authoritatively. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Using-indexes-for-ORDER-BY-and-PARTITION-BY-clause-in-windowing-functions-tp5775605p5775708.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Testing RLS with SECURITY DEFINER functions returning refcursors
Hi all

I'm going to be contributing a fair bit of time to RLS for 2ndQuadrant, courtesy of the EU AXLE project (http://axleproject.eu/). I've been catching up on Kohei KaiGai's work and I've been really impressed by what's already done and working. I'm currently reading the patches, mailing list discussion, and the tests.

Prompted by mailing list discussion on the topic, I added some tests to the 9.4 v4 RLS patch to check behaviour around SECURITY DEFINER functions and found a bit of an issue. If one non-superuser user2 with limited row access creates a SECURITY DEFINER function that returns a refcursor, and the other user user1 fetches from the cursor, the returned result set is what user1 sees when selecting the table directly, not what user2 sees when it selects the table. That is inconsistent with how SECURITY DEFINER behaves for other rights. It's also inconsistent with a superuser-owned SECURITY DEFINER function, where RLS doesn't add the predicate at all so all rows are returned.

Another issue occurs when the superuser invokes the SECURITY DEFINER function created by user2. There are no rows returned from a fetch of the refcursor. This makes sense given that in the test the RLS condition is (dauthor = current_user()) and there are no rows with dauthor set to the superuser's username.

This asymmetry is a bug. Either RLS should be applied consistently for the definer, or consistently as the caller. Currently it's the caller unless the definer is superuser, in which case no checks are applied because the RLS predicate never gets applied.

I'm doing these tests on top of the tables defined by the rowsecurity test suite in the patch.

On a side note, I also noticed that while psql's \dt+ supports RLS, \d or \d+ doesn't provide any indication that there's an RLS policy or what the conditions are.

Anyway - the additional tests are attached, and can also be found in https://github.com/ringerc/postgres/tree/rls-9.4 along with a patched expected file showing what I think _should_ be happening. Comments would be appreciated. I'm also interested in more details on the mention of functions that change the current user ID during a query that came up in prior RLS discussion.

-- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services

diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
new file mode 100644
index 55d4aad..d9a827f
*** a/src/test/regress/sql/rowsecurity.sql
--- b/src/test/regress/sql/rowsecurity.sql
*** SET SESSION AUTHORIZATION rls_regress_us
*** 277,282 ****
--- 277,352 ----
  EXPLAIN (costs off) DELETE FROM only t1 WHERE f_leak(b);
  EXPLAIN (costs off) DELETE FROM t1 WHERE f_leak(b);
+
+
+ -- Check refcursors returned from PL/PgSQL SECURITY DEFINER functions
+
+ RESET SESSION AUTHORIZATION;
+
+ CREATE OR REPLACE FUNCTION return_refcursor_assuper() RETURNS refcursor AS $$
+ DECLARE
+     curs1 refcursor;
+ BEGIN
+     curs1 = 'super_cursor';
+     OPEN curs1 FOR SELECT * FROM document;
+     RETURN curs1;
+ END;
+ $$
+ LANGUAGE plpgsql
+ SECURITY DEFINER;
+
+ -- Run the function entirely as rls_regress_user1
+ SET SESSION AUTHORIZATION rls_regress_user1;
+ BEGIN;
+ SELECT return_refcursor_assuper();
+ -- This fetch should return the full results, even though we are now
+ -- running as a user with much lower access according to the current
+ -- RLS policy.
+ FETCH ALL FROM super_cursor;
+ -- But this should still return the usual result set
+ SELECT * FROM document;
+ ROLLBACK;
+
+ -- Do the same check where we return a refcursor from one RLS-affected
+ -- user to another RLS-affected user.
+
+ SET SESSION AUTHORIZATION rls_regress_user2;
+
+ CREATE OR REPLACE FUNCTION return_refcursor_asuser2() RETURNS refcursor AS $$
+ DECLARE
+     curs1 refcursor;
+ BEGIN
+     curs1 = 'user2_cursor';
+     OPEN curs1 FOR SELECT * FROM document;
+     RETURN curs1;
+ END;
+ $$
+ LANGUAGE plpgsql
+ SECURITY DEFINER;
+
+ BEGIN;
+ SET SESSION AUTHORIZATION rls_regress_user1;
+ SELECT return_refcursor_asuser2();
+ -- Even though we're user1, we should see only user2's results from this.
+ -- This FAILS, returning user1's results.
+ FETCH ALL FROM user2_cursor;
+ -- but user1's results for this
+ SELECT * FROM document;
+ ROLLBACK;
+
+ -- Now as the superuser, see if the SECURITY DEFINER on an RLS-affected
+ -- user filters the rows the superuser sees. It should, for consistency.
+
+ BEGIN;
+ RESET SESSION AUTHORIZATION;
+ SELECT return_refcursor_asuser2();
+ -- Should see user2's results, but FAILS, instead returning an empty result set (!)
+ FETCH ALL FROM user2_cursor;
+ -- Should see superuser's results
+ SELECT * FROM document;
+ ROLLBACK;
+
+
  DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
  DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] stats for network traffic WIP
On Wed, Oct 23, 2013 at 2:44 PM, Nigel Heron nhe...@querymetrics.com wrote: Hi, thanks, I'm still actively working on this patch. I've gotten the traffic counters working when using SSL-enabled clients (includes the SSL overhead now) but I still have the walsender transfers under SSL to work on. I'll post an updated patch when I have it figured out. Since the patch changes some views in pg_catalog, a regression test fails .. I'm not sure what to do next. Change the regression test in the patch, or wait until the review phase? Here's v2 of the patch including the regression test update. I omitted socket counters for walreceivers; I couldn't get them working under SSL. Since they are using the front-end libpq libs I would have to duplicate a lot of the code in the backend to be able to instrument them under SSL (add an OpenSSL BIO custom send/recv like the backend has), and I'm not sure it's worth it. We can get the data from the master's pg_stat_replication view anyway. I'm open to suggestions. So, for now, the counters only track sockets created from an inbound (client to server) connection. -nigel. netstats-v2.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] RLS syntax
[bikeshed] I'm not totally comfortable with the details of ALTER TABLE ... SET ROW SECURITY FOR ALL . If you read: ALTER TABLE document SET ROW SECURITY FOR ALL TO (...query...) is it obvious to you that FOR ALL here refers to all _command types_? I'd like to make that more explicit and allow an OR-list of commands, so it'd look like FOR ALL COMMANDS TO (...) or for specific commands, something like FOR INSERT OR DELETE TO (...) [/bikeshed] -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Add min and max execute statement time in pg_stat_statement
Peter Geoghegan escribió: I am interested in making it store richer statistics, provided we're very careful about the costs. Every time those counters are incremented, a spinlock is held. Hmm, now if we had portable atomic addition, so that we could spare the spinlock ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
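Sketched with a GCC-style builtin (PostgreSQL had no portable atomics abstraction in 2013, so this is illustration only; C11's atomic_fetch_add() is the standard spelling):

#include <stdint.h>

/*
 * Lock-free counter bump: on platforms with a native 64-bit atomic add,
 * no spinlock is needed around a simple increment.
 */
static inline void
counter_add(volatile int64_t *counter, int64_t delta)
{
    __sync_fetch_and_add(counter, delta);
}

The catch, then as now, is the word "portable": platforms without native 64-bit atomics would still need a lock-based fallback.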
Re: [HACKERS] Re: Using indexes for ORDER BY and PARTITION BY clause in windowing functions
Agree that the windowing function will return all the rows, compared to max and group by returning only the max rows per group. But even while arriving at the aggregate/sort, the windowing function seems to spend more effort than group by/order by. I am just trying to see if we could somehow optimize the way windowing operations are performed. (Maybe in query rewrite.) Data warehouses could use that improvement. It's not my production box, so I can live with the disk sort. I have tried with a huge sort memory but I still see a similar difference in the cost of sorting for grouping/ordering vs the windowing function. Another thing regarding work_mem: I have generally seen that windowing functions expect more memory for sorting compared to grouping/ordering clauses. On 24 Oct 2013 10:54, David Johnston pol...@yahoo.com wrote: Sameer Kumar wrote: edb=# explain analyze select max(score) from student_score group by course; This query returns 6 records. The window one returns 123,000. Why do you expect these to have anywhere near the same performance or plan? You can enable/disable indexes/scans to see what alternative plans may provide, but nothing here stands out as being obviously incorrect. I'm not really clear on what your question is. Generally it sounds as if you are wondering if there are any plans to improve the algorithms behind window function processing. Are you just looking at symptoms, and thus possibly have unreasonable expectations, or do you actually see an avenue for improvement in the engine? QUERY PLAN | Sort Method: external merge Disk: 7576kB Work memory: I/O is killing your performance on this query. It is more flexible but you pay a price for that. Another thing, (I may be stupid and naive here) does PostgreSQL re-use the hash which has been already created for the sort? In this case the inner query must have created a hash for the windowing aggregate. Can't we use that same one while applying the filter rn=1? Probably, but others more knowledgeable will need to answer authoritatively. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Using-indexes-for-ORDER-BY-and-PARTITION-BY-clause-in-windowing-functions-tp5775605p5775708.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] high-dimensional knn-GIST tests (was Re: Cube extension kNN support)
Gordon Mohr wrote: Thanks for this! I decided to give the patch a try at the bleeding edge with some high-dimensional vectors, specifically the 1.4 million 1000-dimensional Freebase entity vectors from the Google 'word2vec' project: https://code.google.com/p/word2vec/#Pre-trained_entity_vectors_with_Freebase_naming Unfortunately, here's what I found: I wonder if these results would improve with this patch: http://www.postgresql.org/message-id/efedc2bf-ab35-4e2c-911f-fc88da647...@gmail.com -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] How should RLS treat ON DELETE CASCADE, ON UPDATE CASCADE, and ON DELETE SET NULL?
Hi all I'm interested in an opinion: How should RLS treat cascading RI constraints? The current tests check ON DELETE NO ACTION constraints, causing insertion to fail if a PK row exists but is not visible, and deletion to fail if there's still an invisible FK reference to a visible PK. That seems sane, with the caveat that inconsistent RLS rows can introduce apparent referential integrity violations where a valid constraint says a row should exist but it doesn't, or shouldn't exist but it does. I don't see a good answer to that. What I'm wondering about is how cascade RI affects things. Right now, a user can update a row they cannot see if there is an ON DELETE SET NULL or ON UPDATE CASCADE constraint on it. They can also delete a row they can't see with ON DELETE CASCADE. This seems consistent with table-level permissions, where a user can delete from a parent table and affect rows in a child table they don't have the rights to see or update directly. So I'm happy with the current behaviour. If everyone else is, I'll add appropriate tests to the test suite. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CLUSTER FREEZE
On Thu, Oct 24, 2013 at 4:58 AM, Thomas Munro mu...@ip9.org wrote: Hi I noticed that CLUSTER doesn't have a FREEZE option. Here is a patch to add that, for consistency with VACUUM. Is it useful? I wonder why anyone would want to freeze during the CLUSTER command when they already have a separate way (VACUUM FREEZE) to achieve it; do you know, or can you think of, any case where a user wants to do it along with the CLUSTER command? Anyway, on the code side, I think you need to set both freeze_min_age as well as freeze_table_age; see the VACUUM command in gram.y.

CLUSTER opt_freeze opt_verbose qualified_name cluster_index_specification
    {
        ClusterStmt *n = makeNode(ClusterStmt);
-       n->relation = $3;
-       n->indexname = $4;
-       n->verbose = $2;
+       n->relation = $4;
+       n->freeze_min_age = $2 ? 0 : -1;
+       n->indexname = $5;
+       n->verbose = $3;
..

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers