Re: [GENERAL] Waiting on ExclusiveLock on extension
To put the top question first: How can table extension locks explain a massive spike in CPU usage? I can imagine 400 connections waiting on disk I/O, but then, wouldn't they all be sleeping?

> Ok, that's a MAJOR hint, because relation 1249 is a system catalog; namely pg_attribute. So I think what's happening here is that your catalog has become horrifically bloated. I'm 99% certain that VACUUM ALL will not vacuum the catalog tables. Do you by chance have autovacuum turned off? A manual VACUUM VERBOSE pg_attribute might provide some immediate relief.

Autovacuum is turned on. In addition, we do a manual VACUUM ALL at night. VACUUM VERBOSE pg_attribute ran in 0 seconds and processed a few hundred rows.

> Are you using a connection pool? Establishing 50 new database connections per second won't do anything to help performance...

As I understand it, a pool reduces network and CPU load. We have never seen any issues with those, so the extra monitoring and maintenance cost of a pool seems hard to justify.

> I think what that means is that there was suddenly a big spike in memory demand at the OS level, so now the OS is frantically dumping cached pages. That in itself won't explain this, but it may be a clue.

We monitor memory usage with Cacti. It's a dedicated server and nearly all memory is used as cache. If a script runs and demands memory, that becomes visible as the cache is cleared out. There is no change in the amount of memory used as cache around the outage.

> In order to extend a relation we need to ask the filesystem to actually extend the file (which presumably means at least writing some metadata to disk), and then I think we create a WAL record. Creating the WAL record won't by itself write to disk... *unless* the wal_buffers are all already full.

I have a question here: we have synchronous_commit = off. So when Postgres extends a page, would it do that just in memory, or does part of the extend operation require synchronous I/O?

> So if you also see an I/O spike when this happens you could well just be starved from the I/O system (though obviously it'd be better if we handled that situation more elegantly than this).

The SAR data shows no increase in pgpgin/s and pgpgout/s, which, if I understand it correctly, means that there is no I/O spike. There is however an enormous increase in CPU usage.

> I do suspect your pgfree/s is very high though; putting 200k pages/s on the free list seems like something's broken.

The system has a constant and considerable load of small writes. The pg_activity tool shows 300 IOPS sustained (it claims a max IOPS above 11,000). Postgres 9.3 had a comparable pgfree/s. Would you know a good resource to get more knowledgeable about pgfree, pgpgin, pgsteal?

Kind regards,
Andomar
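As background for the lock in this thread's subject: relation extension waits show up in pg_locks with locktype = 'extend', so a join against pg_stat_activity shows who is blocked and who holds the lock. A minimal sketch against the 9.3/9.4 system views:

    SELECT a.pid, l.relation::regclass AS rel, l.granted, a.query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.pid = l.pid
    WHERE l.locktype = 'extend';

Rows with granted = false are the waiters; the granted = true row is the current holder.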
Re: [GENERAL] Waiting on ExclusiveLock on extension
On 4/19/15 4:24 AM, Andomar wrote:
> To put the top question first: How can table extension locks explain a massive spike in CPU usage? I can imagine 400 connections waiting on disk I/O, but then, wouldn't they all be sleeping?

Not necessarily. Spinlocks don't put the process to sleep, but they're also supposed to be very short lived.

>> Ok, that's a MAJOR hint, because relation 1249 is a system catalog; namely pg_attribute. [...] A manual VACUUM VERBOSE pg_attribute might provide some immediate relief.
>
> Autovacuum is turned on. In addition, we do a manual VACUUM ALL at night. VACUUM VERBOSE pg_attribute ran in 0 seconds and processed a few hundred rows.
>
>> Are you using a connection pool? Establishing 50 new database connections per second won't do anything to help performance...
>
> As I understand it, a pool reduces network and CPU load. We have never seen any issues with those, so the extra monitoring and maintenance cost of a pool seems hard to justify.

Well, it sounds like you are CPU bound here... :P

I don't know if this is related or not, but it wouldn't hurt. If you install pg_bouncer on the database server itself (which it's designed for) it shouldn't add much maintenance cost.

>> I think what that means is that there was suddenly a big spike in memory demand at the OS level, so now the OS is frantically dumping cached pages. [...]
>
> We monitor memory usage with Cacti. [...] There is no change in the amount of memory used as cache around the outage.
>
>> In order to extend a relation we need to ask the filesystem to actually extend the file [...] and then I think we create a WAL record. [...]
>
> I have a question here: we have synchronous_commit = off. So when Postgres extends a page, would it do that just in memory, or does part of the extend operation require synchronous I/O?

Turning that off doesn't mean there will never be an fsync, it just means that we don't wait for one before returning from COMMIT. I don't think relation extension itself requires an fsync, but see below.

>> So if you also see an I/O spike when this happens you could well just be starved from the I/O system [...]
>
> The SAR data shows no increase in pgpgin/s and pgpgout/s, which, if I understand it correctly, means that there is no I/O spike. There is however an enormous increase in CPU usage.

I'm not familiar enough with SAR to know if that's correct or not; iostat would be a good way to confirm it.

>> I do suspect your pgfree/s is very high though; putting 200k pages/s on the free list seems like something's broken.
>
> The system has a constant and considerable load of small writes. The pg_activity tool shows 300 IOPS sustained (it claims a max IOPS above 11,000). Postgres 9.3 had a comparable pgfree/s.

That leads me to a new theory... you may be running into problems finding free buffers in the buffer pool. We need to have a buffer before we can extend a relation, and if you have a lot of pressure on shared buffers it can take quite a bit of CPU to find one. To make matters worse, that search for a buffer takes place while holding the extension lock.

Would you be able to get a stack trace of a backend that's holding an extension lock? Or maybe perf would provide some insight.

> Would you know a good resource to get more knowledgeable about pgfree, pgpgin, pgsteal?

Unfortunately I'm not strong on the system tools.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
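The stack-trace and perf suggestions above, plus the sar counters Andomar asks about, can all be had from stock tools. A sketch, assuming gdb, perf, and sysstat are installed and 12345 stands in for a hypothetical backend pid:

    # one-shot backtrace of the backend holding the extension lock
    gdb -p 12345 -batch -ex bt

    # live sampled CPU profile of that backend
    perf top -p 12345

    # paging counters, 5 samples at 1-second intervals:
    # pgpgin/s, pgpgout/s, fault/s, pgfree/s, pgscank/s, pgscand/s, pgsteal/s
    sar -B 1 5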
Re: [GENERAL] Waiting on ExclusiveLock on extension
On 4/19/15 3:08 PM, Jim Nasby wrote:
> That leads me to a new theory... you may be running into problems finding free buffers in the buffer pool. We need to have a buffer before we can extend a relation, and if you have a lot of pressure on shared buffers it can take quite a bit of CPU to find one. To make matters worse, that search for a buffer takes place while holding the extension lock.
>
> Would you be able to get a stack trace of a backend that's holding an extension lock? Or maybe perf would provide some insight.

BTW, http://postgresql.org/message-id/flat/20150329185619.ga29...@alap3.anarazel.de has some useful info about this.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [GENERAL] Help with slow table update
On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov <pawel.vese...@gmail.com> wrote:
> I found some dangling prepared transactions

How do you find and remove these?
Re: [GENERAL] [SQL] function to send email with query results
Using Apache Camel:

- Listen/notify: PGEvent Component (http://camel.apache.org/pgevent.html)
- Mail Component (http://camel.apache.org/mail.html)

You can also then handle all the exceptions of the email server easily.

On Sun, Apr 19, 2015 at 6:42 AM, Jim Nasby <jim.na...@bluetreble.com> wrote:
> On 4/18/15 12:52 AM, David G. Johnston wrote:
>> On Friday, April 17, 2015, Suresh Raja <suresh.raja...@gmail.com> wrote:
>>> Hi all: I'm looking to write a function to send email with the result of a query. Is it possible to send email within a function? Any help is appreciated.
>>
>> Yes... though neither the sql nor the plpgsql languages have the necessary language features to do so - you will need to use something like plperlu.
>>
>> A (not mutually exclusive) alternative is to create an email queue and write an email-sending client application to process that queue. The main advantages being loose coupling and the ability to send the emails asynchronously.
>
> If you go that route you might find LISTEN/NOTIFY useful:
> http://www.postgresql.org/docs/9.4/static/sql-listen.html
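A minimal sketch of the queue approach, with made-up table and trigger names; an external client LISTENs on the same channel and does the actual SMTP delivery:

    CREATE TABLE email_queue (
        id         bigserial PRIMARY KEY,
        recipient  text NOT NULL,
        subject    text NOT NULL,
        body       text NOT NULL,
        queued_at  timestamptz NOT NULL DEFAULT now()
    );

    CREATE FUNCTION email_queue_notify() RETURNS trigger AS $$
    BEGIN
        NOTIFY email_queue;  -- wake the listening mailer client
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER email_queue_notify
        AFTER INSERT ON email_queue
        FOR EACH ROW EXECUTE PROCEDURE email_queue_notify();

Queueing a mail is then just an INSERT into email_queue, and the client marks or deletes rows as it processes them, which keeps the coupling as loose as David describes.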
Re: [GENERAL] ORDER BY for jsonb
Thanks for the insight.

Pai-Hung
Re: [GENERAL] Help with slow table update
On 4/19/15 9:53 PM, Tim Uckun wrote:
> On Sat, Apr 18, 2015 at 10:24 AM, Pawel Veselov <pawel.vese...@gmail.com> wrote:
>> I found some dangling prepared transactions
>
> How do you find and remove these?

SELECT * FROM pg_prepared_xacts;
ROLLBACK PREPARED xid;

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
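Note that the argument to ROLLBACK PREPARED is the gid string shown by pg_prepared_xacts (not a numeric xid), and the command has to be run while connected to the database the transaction was prepared in. A sketch for spotting stale entries; the one-day cutoff is arbitrary:

    SELECT gid, prepared, owner, database
    FROM pg_prepared_xacts
    WHERE prepared < now() - interval '1 day';

    -- then, per row, connected to the matching database:
    ROLLBACK PREPARED 'some_gid';  -- hypothetical gid value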
Re: [GENERAL] Running pg_upgrade under Debian
On Sun, Apr 19, 2015 at 9:34 AM, rob stone <floripa...@gmail.com> wrote:
> On Fri, 2015-04-17 at 16:16 -0700, Adrian Klaver wrote:
>> On 04/17/2015 03:09 PM, rob stone wrote:
>>> Hello, I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this error:-
>>> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade ...
>
> So, on the command line I'm putting listen_addresses=localhost, which pg_upgrade picks up, but the following option sets this to ''. It is starting pg_ctl in the 9.3/bin directory but makes the socket directory in 9.4/bin??

Note that 9.4/bin seems to be the current working directory from which you are running the command. That is probably not such a good idea.

> If pg_upgrade is actually reading the conf files it should not be necessary to supply any of these options, as port numbers, PID file locations, etc. are specified there.

If pg_upgrade used the values from the config files, then anyone who is accustomed to connecting to the database with those settings could unthinkingly connect to it during the upgrade process and screw up the upgrade. It intentionally chooses methods to prevent that from happening, by putting the socket somewhere private, or at least unpredictable.

> waiting for server to start
> 2596 5533cb8c.a24 2015-04-20 01:36:44 AEST LOG:  redirecting log output to logging collector process
> 2596 5533cb8c.a24 2015-04-20 01:36:44 AEST HINT:  Future log output will appear in directory pg_log

However, it appears that pg_upgrade does honor the logging-collector-related settings from the config file (should it? I can see arguments both ways), and so you have to look in the pg_log directory to figure out what the actual problem is.

Cheers,
Jeff
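A quick way to find that log on a stock Debian install, assuming the 9.3/main cluster layout used later in this thread and the default log_directory of pg_log under the data directory:

    ls -lt /var/lib/postgresql/9.3/main/pg_log | head
    grep -E 'logging_collector|log_directory' /etc/postgresql/9.3/main/postgresql.conf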
Re: [GENERAL] Running pg_upgrade under Debian
On Fri, 2015-04-17 at 16:16 -0700, Adrian Klaver wrote:
> On 04/17/2015 03:09 PM, rob stone wrote:
>> Hello, I'm trying to upgrade from 9.3 to 9.4 on my laptop and encountering this error:-
>>
>> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin -d /home/postgres/data93/userqueries -D /home/postgres/data94/userqueries -U pguserqueries
>> check for /home/postgres/data93/userqueries/base failed: No such file or directory
>> Failure, exiting
>>
>> postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade --version
>> pg_upgrade (PostgreSQL) 9.4.1
>>
>> I have two tablespaces defined for this data -- one to hold the tables and another for the indices. There is no base file or directory.
>
> Guessing /home/postgres/data*/userqueries/ is the location of your tablespace, not the cluster directory.

Correct.

> The -d and -D need to point at the cluster directory, which will be something like: /var/lib/postgresql/9.3/main
>
>> Do I just create a dummy directory named base?
>
> No, you need to point to the correct directory. See here for more information:
> http://www.postgresql.org/docs/9.4/interactive/pgupgrade.html

Actually, the wiki pages on upgrading gave me some more information, particularly about specifying where to find postgresql.conf. I adopted the symlink suggestion. However, running the following still gives errors:

postgres@roblaptop:/usr/lib/postgresql/9.4/bin$ ./pg_upgrade -b /usr/lib/postgresql/9.3/bin -B /usr/lib/postgresql/9.4/bin -d /var/lib/postgresql/9.3/main -D /var/lib/postgresql/9.4/main -p 5432 -P 5433 -o ' -c listen_addresses=localhost ' -O ' -c listen_addresses=localhost ' --check --verbose

Running in verbose mode
Performing Consistency Checks
-----------------------------
Checking cluster versions                                   ok

pg_control values:

First log segment after reset:        00010002
pg_control version number:            937
Catalog version number:               201306121
Database system identifier:           6026352641161422830
Latest checkpoint's TimeLineID:       1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/684
Latest checkpoint's NextOID:          12036
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        674
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

Current pg_control values:

pg_control version number:            942
Catalog version number:               201409291
Database system identifier:           6108663141646003887
Latest checkpoint's TimeLineID:       1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0/719
Latest checkpoint's NextOID:          12142
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        675
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

Values to be changed:

First log segment after reset:        00010002

/usr/lib/postgresql/9.3/bin/pg_ctl -w -l pg_upgrade_server.log -D /var/lib/postgresql/9.3/main -o -p 5432 -b -c listen_addresses=localhost -c listen_addresses='' -c unix_socket_permissions=0700 -c unix_socket_directories='/usr/lib/postgresql/9.4/bin' start >> pg_upgrade_server.log 2>&1
*failure*

There were problems executing /usr/lib/postgresql/9.3/bin/pg_ctl -w -l pg_upgrade_server.log -D /var/lib/postgresql/9.3/main -o -p 5432 -b -c listen_addresses=localhost -c listen_addresses='' -c unix_socket_permissions=0700 -c
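On Debian there is also the pg_upgradecluster wrapper from postgresql-common, which works out the binary, data, and config paths from the cluster definitions instead of taking them on the command line (note that by default it migrates the data via dump and reload rather than pg_upgrade). A sketch, assuming the 9.3 cluster is the main one used above:

    sudo pg_upgradecluster 9.3 main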