Re: [GENERAL] Debugging a backend stuck consuming CPU
On Thu, May 19, 2016 at 05:52:26PM -0400, Tom Lane wrote: > "k...@rice.edu" <k...@rice.edu> writes: > > The stack trace just appeared to be what I would expect while a 'DISCARD > > ALL' > > command was being run: > > > #0 0x0073bc7c in MemoryContextSetParent () > > #1 0x0073bde3 in MemoryContextDelete () > > #2 0x0054e3a9 in DropAllPreparedStatements () > > #3 0x005365f3 in DiscardCommand () > > Hmm, what it seems from these traces is that you've got a whole heck of > a lot of prepared statements. > > > The backend does have a very large memory footprint (12GB). > > Um. > > The most likely explanation is that you are hitting O(N^2) behavior as > a consequence of MemoryContextSetParent being O(N) in the number of > sibling contexts of the context to be deleted. We fixed that for 9.6 > (commit 25c539233044c235e97fd7c9dc600fb5f08fe065) but there's no easy > solution in older branches, short of not using so many prepared > statements. I'm a bit surprised that you could have gotten up to 12GB > worth of prepared statements in an application that sends DISCARD ALL > periodically. > > regards, tom lane > Hi, The DISCARD ALL is only sent by pgbouncer at the end of the processing. The actual process builds up a cache to be used later whose size is proportional to the number of items. The initial run is large, but the regular runs are much smaller and cleanup quickly. I was more concerned with incorrect behavior leading to DB corruption. Thank you for your suggestions and assistance. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Debugging a backend stuck consuming CPU
On Thu, May 19, 2016 at 09:58:45AM -0400, Tom Lane wrote: > "k...@rice.edu" <k...@rice.edu> writes: > > I am investigating a problem with a backend that appears to be stuck > > and spinning while performing a "DISCARD ALL" command. The system is > > running an older release 9.2.2. > > You do realize that the current release in that series is 9.2.17. > > > Are there any bugs that could be causing this behavior? > > Known bugs are summarized here: > http://www.postgresql.org/docs/9.2/static/release.html > > > How can I tell what the process is actually doing? > > Getting a stack trace with gdb might be informative: > https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend > > regards, tom lane > Hi, The stack trace just appeared to be what I would expect while a 'DISCARD ALL' command was being run: Continuing. Program received signal SIGINT, Interrupt. 0x0073bc7c in MemoryContextSetParent () #0 0x0073bc7c in MemoryContextSetParent () #1 0x0073bde3 in MemoryContextDelete () #2 0x0054e3a9 in DropAllPreparedStatements () #3 0x005365f3 in DiscardCommand () #4 0x006582c7 in ?? () #5 0x006592bd in ?? () #6 0x00659a42 in PortalRun () #7 0x0065603d in ?? () #8 0x00656ed0 in PostgresMain () #9 0x00613b91 in ?? () #10 0x006167fc in PostmasterMain () #11 0x005b5290 in main () Continuing. Program received signal SIGINT, Interrupt. 0x0073bc7c in MemoryContextSetParent () #0 0x0073bc7c in MemoryContextSetParent () #1 0x0073bde3 in MemoryContextDelete () #2 0x0054e3a9 in DropAllPreparedStatements () #3 0x005365f3 in DiscardCommand () #4 0x006582c7 in ?? () #5 0x006592bd in ?? () #6 0x00659a42 in PortalRun () #7 0x0065603d in ?? () #8 0x00656ed0 in PostgresMain () #9 0x00613b91 in ?? () #10 0x006167fc in PostmasterMain () #11 0x005b5290 in main () Continuing. Program received signal SIGINT, Interrupt. 0x0073bc7c in MemoryContextSetParent () #0 0x0073bc7c in MemoryContextSetParent () #1 0x0073bde3 in MemoryContextDelete () #2 0x0070e7df in DropCachedPlan () #3 0x0054e3a9 in DropAllPreparedStatements () #4 0x005365f3 in DiscardCommand () #5 0x006582c7 in ?? () #6 0x006592bd in ?? () #7 0x00659a42 in PortalRun () #8 0x0065603d in ?? () #9 0x00656ed0 in PostgresMain () #10 0x00613b91 in ?? () #11 0x006167fc in PostmasterMain () #12 0x005b5290 in main () Continuing. Program received signal SIGINT, Interrupt. 0x0073bc7c in MemoryContextSetParent () #0 0x0073bc7c in MemoryContextSetParent () #1 0x0073bde3 in MemoryContextDelete () #2 0x0054e3a9 in DropAllPreparedStatements () #3 0x005365f3 in DiscardCommand () #4 0x006582c7 in ?? () #5 0x006592bd in ?? () #6 0x00659a42 in PortalRun () #7 0x0065603d in ?? () #8 0x00656ed0 in PostgresMain () #9 0x00613b91 in ?? () #10 0x006167fc in PostmasterMain () #11 0x005b5290 in main () Continuing. Program received signal SIGINT, Interrupt. 0x0070e7ff in DropCachedPlan () #0 0x0070e7ff in DropCachedPlan () #1 0x0054e3a9 in DropAllPreparedStatements () #2 0x005365f3 in DiscardCommand () #3 0x006582c7 in ?? () #4 0x006592bd in ?? () #5 0x00659a42 in PortalRun () #6 0x0065603d in ?? () #7 0x00656ed0 in PostgresMain () #8 0x00613b91 in ?? () #9 0x006167fc in PostmasterMain () #10 0x005b5290 in main () Detaching from program: /usr/pgsql-9.2/bin/postgres, process 38604 Undefined command: "exit". Try "help". Continuing. Program received signal SIGINT, Interrupt. 0x0070e7ff in DropCachedPlan () #0 0x0070e7ff in DropCachedPlan () #1 0x0054e3a9 in DropAllPreparedStatements () #2 0x005365f3 in DiscardCommand () #3 0x006582c7 in ?? () #4 0x006592bd in ?? () #5 0x00659a42 in PortalRun () #6 0x0065603d in ?? () #7 0x00656ed0 in PostgresMain () #8 0x00613b91 in ?? () #9 0x006167fc in PostmasterMain () #10 0x005b5290 in main () Continuing. Program received signal SIGINT, Interrupt. 0x0070e7ff in DropCachedPlan () #0 0x0070e7ff in DropCachedPlan () #1 0x0054e3a9 in DropAllPreparedStatements () #2 0x005365f3 in DiscardCommand () #3 0x006582c7 in ?? () #4 0x006592bd in ?? () #5 0x00659a42 in PortalRun () #6 0x0065603d in ?? () #7 0x00656ed0 in PostgresMain () #8 0x00613b91 in ??
Re: [GENERAL] Debugging a backend stuck consuming CPU
On Thu, May 19, 2016 at 09:58:45AM -0400, Tom Lane wrote: > "k...@rice.edu" <k...@rice.edu> writes: > > I am investigating a problem with a backend that appears to be stuck > > and spinning while performing a "DISCARD ALL" command. The system is > > running an older release 9.2.2. > > You do realize that the current release in that series is 9.2.17. > > > Are there any bugs that could be causing this behavior? > > Known bugs are summarized here: > http://www.postgresql.org/docs/9.2/static/release.html > > > How can I tell what the process is actually doing? > > Getting a stack trace with gdb might be informative: > https://wiki.postgresql.org/wiki/Generating_a_stack_trace_of_a_PostgreSQL_backend > > regards, tom lane > Hi, We do have an upgrade scheduled. I will get a stack trace and see what information that provides. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Debugging a backend stuck consuming CPU
Hi PostgreSQL community, I am investigating a problem with a backend that appears to be stuck and spinning while performing a "DISCARD ALL" command. The system is running an older release 9.2.2. Are there any bugs that could be causing this behavior? How can I tell what the process is actually doing? It does not respond to either pg_cancel_backend() or to pg_terminate_backend() so it is acting like it is not receiving the signals. Any ideas on how to debug this would be appriciated. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [PERFORM] PGBOUNCER ISSUE PLEASE HELP(Slowing down the site)
On Thu, Jun 18, 2015 at 05:09:10PM +, Sheena, Prabhjot wrote: Guys I have an issue going on with PGBOUNCER which is slowing down the site PGBOUNCER VERSION: pgbouncer-1.5.4-2.el6 (Hosted on separate machine) (16 cpu) 98GB RAM DATABASE VERION: postgresql 9.3 When the total client connections to pgbouncer are close to 1000, site application works fine but when the total client connections crosses 1150 site application starts showing slowness. Here is an example of output postgres@symds-pg:~ $ netstat -atnp | grep 5432 | wc (Not all processes could be identified, non-owned process info will not be shown, you would have to be root to see it all.) 9606720 104640 As you can see total connections are like 960 right now my site application is working fine. When connections crosses 1150 and even though I see lot of available connections coz my default_pool_size is set high to 250 but still the application gets slow. Database performance on the other end is great with no slow running queries or anything. So the only place I can think the issue is at PGBOUNCER end. Hi Prabhjot, This is classic behavior when you have a 1024 file limit. When you are below that number, it work fine. Above that number, you must wait for a connection to close and exit before you can connect which will cause a delay. See what ulimit has to say? Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [PERFORM] PGBOUNCER ISSUE PLEASE HELP(Slowing down the site)
On Thu, Jun 18, 2015 at 05:41:01PM +, Sheena, Prabhjot wrote: Here is the output of OS limits postgres@symds-pg:~ $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 790527 max locked memory (kbytes, -l) 32 max memory size (kbytes, -m) unlimited open files (-n) 4096 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 16384 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Thanks Prabhjot I would bump your open files as was suggested in your pgbouncer start script. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [PERFORM] PGBOUNCER ISSUE PLEASE HELP(Slowing down the site)
On Thu, Jun 18, 2015 at 07:19:13PM +, Sheena, Prabhjot wrote: Hi Ken/ Will I have checked the ulimit value and we are nowhere hitting the max 4096 that we have currently set. Is there any other explanation why we should be thinking of bumping it to like ulimit -n 5 ( Add ulimit -n 5 to the start of whatever you use to start pgbouncer (init script, etc..)) even though we are not reaching 4096 max value Regards Prabhjot Singh Hi, Try attaching to the pgbouncer with strace and see if you are getting any particular errors. Do you have a /etc/security/limits.d directory? And if so, what is in it? We found a nice default ulimit of 1024 for all non-root users. :( Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [PERFORM] pg bouncer issue what does sv_used column means
On Fri, Jun 12, 2015 at 09:37:36PM +, Sheena, Prabhjot wrote: Here is some more information pool_mode | transaction We have transactional pooling and our application is set up in such a way that we have one query per transaction. We have set default pool size to 100. This is output . As you guys can see active connection are 100 and 224 are waiting. We are planning to move default pool size to 250. Please suggest if you guys think otherwise pgbouncer=# show pools; database | user| cl_active | cl_waiting | sv_active | sv_idle | sv_used | sv_tested | sv_login | maxwait ---+---+---++---+-+-+---+--+- pgbouncer | pgbouncer | 2 | 0 | 0 | 0 | 0 | 0 |0 | 0 site | feature | 418 | 0 |20 | 17 | 0 | 0 |0 | 0 site | service | 621 |224 | 100 | 0 | 0 | 0 |0 | 0 site | zabbix| 0 | 0 | 0 | 0 | 0 | 0 |0 | 0 Prabhjot Singh Database Administrator CLASSMATES 1501 4th Ave., Suite 400 Seattle, WA 98101 206.301.4937 o 206.301.5701 f From: Sheena, Prabhjot Sent: Friday, June 12, 2015 10:57 AM To: 'pgsql-general@postgresql.org'; 'pgsql-performa...@postgresql.org' Subject: pg bouncer issue what does sv_used column means Guys we see spike in pg bouncer during the peak hours and that was slowing down the application. We did bump up the connection limit and it is helpful but now we again notice little spike in connection. And one thing that I notice that is different is jump in sv_used value when I run command show pools during problem times Can anyone please explain what value of sv_used means when i run show pools; Regards Prabhjot Hi Parbhjot, The spike in pgbouncer during peak hours just indicates that you are busier then. How many sv_active do you have in non-peak hours? What kind of system is this on? I suspect that your hardware cannot actually handle 100 simultaneous processes at once and if you increase that to 250 processes there is a good likelyhood that your system response will get even worse. Number of CPU to 2x number of CPU is typical for peak performance throughput. Are you using a 50-core system? What do the I/O stats look like? You may be I/O limited. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [SQL] commit inside a function failing
On Fri, Jan 23, 2015 at 01:56:53PM -0600, Suresh Raja wrote: Hi All: We are running a function with a loop in it. We tried commit; We are getting error with above command. How can i easily commit withing a function. Thanks, -SR PostgreSQL functions run within a transaction. You would need to use something like db_link. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [ADMIN] [GENERAL] Use, Set Catalog and JDBC questions
On Mon, Feb 25, 2013 at 01:57:11PM -0500, Frank Cavaliero wrote: Hi Adrian, Thanks for the response. The situation is more like the following: Using the JDBC driver, I connect to database TEST1 and immediately, without having to pass username credentials again, I want to use database TEST2. In MySQL, you can simply run: use TEST2. Wondering if PostgreSQL has something similar. Thanks, Frank Hi Frank, The USE xxx;' is a non-standard MySQL extension to the SQL language. This functionality is not available at the SQL layer and must be supported by the connection application itself, I think. For example, you can use \c xxx in psql to perform that function. If JDBC does not support it natively, you would need to open a new connection to the new database. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] [ADMIN] number of rows in analyze
On Mon, Aug 01, 2011 at 04:42:23PM +0200, Willy-Bas Loos wrote: Hi, I'ver been wondering how to set the number of rows that are scanned by analyze. (I want to increase it) I couln't find it at first but when i read closer i found that is does: The largest statistics target among the columns being analyzed determines the number of table rows sampled to prepare the statistics. (http://www.postgresql.org/docs/8.4/interactive/sql-analyze.html) My question is *HOW* does it detirmine the number of table rows sampled with the statistics target? What is the formula behind it? I am not familiar enough with the source code to figure that out myself. sry about that. Would be good to include in the docs if we find out. Cheers, WBL Peruse the above reference a bit more carefully. It is actually explained pretty well. Regards, Ken -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general