Re: [GENERAL] A guide about some topics of Associate Certification
Oscar Calderon wrote:
> Everybody have a nice day. Well, finally the place where I currently work paid for me to take the Associate Certification exam, and I'm reviewing the topics it covers. Some of them are easy for me, like psql, which I use almost every day, but I have doubts about two of the topics and don't know whether I already have the knowledge they require. The topics are:
>
> * Postgres Data Dictionary: Is this about the information schema?

I am not sure, because PostgreSQL does not use the term "data dictionary". It *might* refer to the information_schema, but my bet is that what is meant are the system catalogs: http://www.postgresql.org/docs/current/static/catalogs.html
Oracle uses the term "data dictionary" like that.

> * Moving Data: Is this about migration? Or about moving tablespaces in the file system?

I'm also not sure. Maybe it is referring to ALTER TABLE table_name SET TABLESPACE new_tablespace, which will move the table to a different tablespace (as opposed to ALTER DATABASE db_name SET TABLESPACE new_tablespace, which will set the default for future tables). Other things that move data around are commands like CLUSTER, VACUUM FULL, or the forms of ALTER TABLE that rewrite the table, but I don't know that I would headline them "moving data".

Yours,
Laurenz Albe

-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
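For what it's worth, the two candidate meanings of "data dictionary" can be put side by side with a pair of queries (the schema name 'public' here is just an example):

```sql
-- The SQL-standard view layer, which the information_schema guess would mean:
SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';

-- The system catalogs underneath, which is what Oracle's "data dictionary" maps to:
SELECT c.relname, c.relkind
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public';
```

Both run on any PostgreSQL server; the catalog query exposes far more detail (relkind distinguishes tables, indexes, views, and so on) than the portable information_schema view.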
[GENERAL] Solved: could not receive data from server, background writer process exited with exit code 0
Hi All,

This one is just for the record/search: it is solved, but it is quite rare, I think, and may save others time.

PostgreSQL for Windows setup. While trying to connect, the client errors with:

Could not connect to the server. Reason: could not receive data from server: Software caused connection abort

The server logs show lots of lines:

background writer process (PID xyz) exited with exit code 0
terminating any other active server processes
all server processes terminated; reinitializing

I found one similar issue from some years ago where this was caused by NOD32 AntiVirus' imon internet monitor. My issue was caused by Sophos AV's web component. Making an exception for the /bin directory worked immediately.
Re: [GENERAL] Failure upgrading PG 9.2 to 9.3
On 03/25/2014 05:23 PM, Sam Saffron wrote:
> Sorry, it's part of a rather elaborate Docker-based upgrade; that install is just done to get the binaries. The data is all in a completely different location, which is untouched.

So there are two instances of 9.2 in play at one time? The upgrade process is not inadvertently cross-referencing the two? I realize Tom's suggestion got you past the error; I am just trying to figure out what corrupted the system catalogs in the first place. I am assuming the 9.2 instance that became corrupted was running properly until the upgrade?

On Wed, Mar 26, 2014 at 11:20 AM, Adrian Klaver adrian.kla...@aklaver.com wrote: On 03/25/2014 05:14 PM, Sam Saffron wrote:

-- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] PG choosing nested loop for set membership?
On Tue, Mar 25, 2014 at 5:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Yeah. The weird thing about that is that the nestloop rowcount estimate isn't the product of the two input rowcounts --- you'd sort of expect an estimate of 158 given the input-relation sizes. While that's not ipso facto evidence of a bug (because the estimates are arrived at in different ways), I'm having a hard time replicating it here. Are you using an up-to-date PG release?

All right, I think I'm onto something. But first I'll answer your questions. Version is 9.3.3 from the Postgres Debian archives.

> One thing that might help is to increase the statistics target for pl2._visible_accounts_by_rule_set. The other two tables are small enough that you don't need to do that for them. (Although come to think of it, they are also small enough that maybe auto-analyze isn't triggering for them ... does a manual ANALYZE improve matters?)

You were right that auto-analyze didn't go after them. Weird. But a few manual analyzes later, and no change. Here's what I did, though. I collapsed the pl2.current_user view into pl2.visible_accounts:

===
select acc.account, acc.manager, acc.is_fund
from pl2._visible_accounts_by_rule_set acc
inner join (pl2._users u
    left join pl2._users iu on u.impersonating = iu.user_id)
  on acc.rule_set_id = coalesce(iu.permission_rule_set_id, u.permission_rule_set_id)
where u.user_principal_name = session_user
===

I noticed that join-on-coalesce pattern that gave us trouble in SQL Server. The query planner can't do a thing with that.
So I rewrote the query so the last join would be solid:

===
select acc.account, acc.manager, acc.is_fund
from pl2._users lu
inner join pl2._users u on u.user_id = coalesce(lu.impersonating, lu.user_id)
inner join pl2._visible_accounts_by_rule_set acc on acc.rule_set_id = u.permission_rule_set_id
where lu.user_principal_name = session_user
===

The join order is the same, and the indexes used are the same, but the estimate is much better:

Nested Loop (cost=0.68..13.70 rows=133 width=10) (actual time=0.073..0.211 rows=241 loops=1)
  Output: acc.account, acc.manager, acc.is_fund
  Buffers: shared hit=10
  -> Nested Loop (cost=0.54..8.58 rows=1 width=4) (actual time=0.056..0.059 rows=1 loops=1)
       Output: u.permission_rule_set_id
       Buffers: shared hit=7
       -> Index Scan using _pl2_users_user_principal_name_idx on pl2._users lu (cost=0.27..4.29 rows=1 width=8) (actual time=0.045..0.047 rows=1 loops=1)
            Output: lu.user_id, lu.user_principal_name, lu.name, lu.permission_rule_set_id, lu.impersonating, lu.is_admin
            Index Cond: (lu.user_principal_name = (session_user())::text)
            Buffers: shared hit=4
       -> Index Scan using _users_pkey on pl2._users u (cost=0.27..4.29 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=1)
            Output: u.user_id, u.user_principal_name, u.name, u.permission_rule_set_id, u.impersonating, u.is_admin
            Index Cond: (u.user_id = COALESCE(lu.impersonating, lu.user_id))
            Buffers: shared hit=3
  -> Index Scan using _visible_accounts_by_rule_set_idx on pl2._visible_accounts_by_rule_set acc (cost=0.15..3.54 rows=158 width=14) (actual time=0.015..0.089 rows=241 loops=1)
       Output: acc.rule_set_id, acc.account, acc.manager, acc.is_fund
       Index Cond: (acc.rule_set_id = u.permission_rule_set_id)
       Buffers: shared hit=3
Total runtime: 0.297 ms

I'll see if I can write an isolated test case for the coalesce misestimate. Or do you think the query planner will ever be able to do anything with that form?
--Brian
Re: [GENERAL] PG choosing nested loop for set membership?
Brian Crowell br...@fluggo.com writes:
> Here's what I did, though. I collapsed the pl2.current_user view into pl2.visible_accounts:
> ===
> select acc.account, acc.manager, acc.is_fund
> from pl2._visible_accounts_by_rule_set acc
> inner join (pl2._users u left join pl2._users iu on u.impersonating = iu.user_id)
>   on acc.rule_set_id = coalesce(iu.permission_rule_set_id, u.permission_rule_set_id)
> where u.user_principal_name = session_user
> ===
> I noticed that join-on-coalesce pattern that gave us trouble in SQL Server. The query planner can't do a thing with that. So I rewrote the query so the last join would be solid:
> ===
> select acc.account, acc.manager, acc.is_fund
> from pl2._users lu
> inner join pl2._users u on u.user_id = coalesce(lu.impersonating, lu.user_id)
> inner join pl2._visible_accounts_by_rule_set acc on acc.rule_set_id = u.permission_rule_set_id
> where lu.user_principal_name = session_user
> ===

Hm. It's not obvious from here that those give the same results --- but you probably understand your schema better than the rest of us.

> I'll see if I can write an isolated test case for the coalesce misestimate. Or do you think the query planner will ever be able to do anything with that form?

Probably not much. I'd guess that the real benefit of this approach is that it avoids the join-condition-using-three-input-relations, which is a bear from any angle.

regards, tom lane
Re: [GENERAL] PG choosing nested loop for set membership?
On Wed, Mar 26, 2014 at 10:23 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> Hm. It's not obvious from here that those give the same results --- but you probably understand your schema better than the rest of us.

The _users table has a user_id and a nullable column impersonating, which refers to a user_id you want to impersonate. If impersonating isn't null, you want the rule_set_id for that user; otherwise, you want the rule_set_id of your own user. Hence the first query's left join to the second, impersonated user. The final join grabs the first rule_set_id it can find with a coalesce.

The second query does the same thing with an inner join; the second _users reference will have the impersonated user if there is one, or the original user if there isn't. Either way, there's a solid user to join to, which I guess is enough for the query planner. They're really equivalent, since there is still just one rule_set_id at the end.

> Probably not much. I'd guess that the real benefit of this approach is that it avoids the join-condition-using-three-input-relations, which is a bear from any angle.
Well, look what happens when I remove impersonation and stick a coalesce in the wrong place:

===
select acc.account, acc.manager, acc.is_fund
from pl2._users lu
inner join pl2._visible_accounts_by_rule_set acc on acc.rule_set_id = coalesce(lu.permission_rule_set_id, 0)
where lu.user_principal_name = session_user
===

Hash Join (cost=2.62..9.07 rows=9 width=10) (actual time=0.066..0.239 rows=241 loops=1)
  Output: acc.account, acc.manager, acc.is_fund
  Hash Cond: (acc.rule_set_id = COALESCE(lu.permission_rule_set_id, 0))
  Buffers: shared hit=4

Just removing the coalesce (acc.rule_set_id = lu.permission_rule_set_id) does this:

Hash Join (cost=2.62..10.31 rows=133 width=10) (actual time=0.063..0.257 rows=241 loops=1)
  Output: acc.account, acc.manager, acc.is_fund
  Hash Cond: (acc.rule_set_id = lu.permission_rule_set_id)
  Buffers: shared hit=4

Which says to me coalesce has a selectivity.

--Brian
Re: [GENERAL] Trimming transaction logs after extended WAL archive failures
On Tue, Mar 25, 2014 at 6:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
> On Tuesday, March 25, 2014, Steven Schlansker ste...@likeness.com wrote:
>> Hi everyone, I have a Postgres 9.3.3 database machine. Due to some intelligent work on the part of someone who shall remain nameless, the WAL archive command included a '> /dev/null 2>&1' which masked archive failures until the disk entirely filled with 400GB of pg_xlog entries.
>
> PostgreSQL itself should be logging failures to the server log, regardless of whether the archive command swallows its own output.
>
>> I have fixed the archive command and can see WAL segments being shipped off of the server; however, the xlog remains at a stable size and is not shrinking. In fact, it's still growing at a (much slower) rate.
>
> The leading edge of the log files should be archived as soon as they fill up, and recycled/deleted two checkpoints later. The trailing edge should be archived upon checkpoints and then recycled or deleted. I think there is a throttle on how many off the trailing edge are archived each checkpoint. So issue a bunch of CHECKPOINT; commands for a while and see if that clears it up.

Actually my description is rather garbled, mixing up what I saw when wal_keep_segments was lowered, not when recovering from a long-lasting archive failure. Nevertheless, checkpoints are what provoke the removal of excessive WAL files. Are you logging checkpoints? What do they say? Also, what is in pg_xlog/archive_status ?

Cheers, Jeff
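Jeff's suggestion, spelled out as a rough sketch (this assumes psql can connect as a superuser and that $PGDATA points at the data directory; the iteration count and sleep are made up):

```shell
# Force checkpoints for a while so trailing-edge WAL segments get archived/recycled.
for i in $(seq 1 20); do
    psql -c 'CHECKPOINT;'
    sleep 30
done

# Segments still waiting to be archived show up as .ready status files:
ls "$PGDATA"/pg_xlog/archive_status/ | grep -c '\.ready$'
```

If the .ready count keeps falling between checkpoints, the archiver is catching up.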
Re: [GENERAL] PG choosing nested loop for set membership?
Brian Crowell br...@fluggo.com writes:
> Which says to me coalesce has a selectivity.

Well, the point is you're just getting a default selectivity estimate for the acc.rule_set_id = coalesce(...anything...) condition. The planner is smarter about plain x = y join conditions: it looks up the column stats for x and y and determines the probability of equality. In principle I guess we could somehow merge the stats of y and z when looking at a coalesce(y, z) expression, but I'm not sure how that would work exactly.

regards, tom lane
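Tom's point can be sketched with a toy pair of tables (all names here are made up); the second EXPLAIN should show the flat default estimate rather than one derived from column statistics:

```sql
CREATE TABLE rules (rule_set_id int);
CREATE TABLE accts (rule_set_id int, account text);
INSERT INTO rules SELECT g % 10 FROM generate_series(1, 100) g;
INSERT INTO accts SELECT g % 10, 'acct' FROM generate_series(1, 1000) g;
ANALYZE rules;
ANALYZE accts;

-- Plain x = y: the planner consults column statistics on both sides.
EXPLAIN SELECT * FROM rules r JOIN accts a ON a.rule_set_id = r.rule_set_id;

-- x = coalesce(y, z): the planner falls back to a default selectivity.
EXPLAIN SELECT * FROM rules r JOIN accts a
    ON a.rule_set_id = COALESCE(r.rule_set_id, 0);
```

Comparing the rows= figures in the two plans makes the difference between a stats-based estimate and the default visible directly.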
Re: [GENERAL] Trimming transaction logs after extended WAL archive failures
On Mar 25, 2014, at 7:58 PM, Adrian Klaver adrian.kla...@aklaver.com wrote:
> On 03/25/2014 04:52 PM, Steven Schlansker wrote: Some more questions, what happens when things begin to dawn on me :) You said the disk filled up entirely with log files, yet currently the number (size) of logs is growing.

It's holding stable now. I tried to vacuum to clean up some space, which turned out to generate more pg_xlog activity than space it saved, and (I assume) the archiver fell behind and that was the source of the growing log. There haven't been any new segments since I stopped doing that.

> Yea, vacuum just marks space as available for reuse; it does not actually free space.

I even knew that. Funny what you'll forget when the system is down and you're in a panic. This is actually something that has bitten me on more than one occasion: if you accidentally (temporarily) use too much space, it is *very* hard to back out of the situation. It seems that the only ways to actually release space to the system are VACUUM FULL, CLUSTER, or DROPping objects. None of these can be executed without severe disruption to a running database. A CLUSTER operation on any of our tables that are large enough to matter can easily run through the night. I can only keep wishing for a CLUSTER CONCURRENTLY or VACUUM FULL CONCURRENTLY that can run without a temporary copy of the entire table...
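The sizes behind that trade-off can be checked with standard functions before committing to a rewrite (the table name here is hypothetical); note that VACUUM FULL and CLUSTER both need room for a complete temporary copy of the table while they run:

```sql
SELECT pg_size_pretty(pg_relation_size('big_table'))       AS heap_only,
       pg_size_pretty(pg_total_relation_size('big_table')) AS with_indexes_and_toast;

-- Rewrites the table into fresh files and returns freed space to the OS,
-- but takes an exclusive lock and needs roughly the table's size in spare disk:
VACUUM FULL big_table;
```

Running the size query first tells you whether you have enough headroom to attempt the rewrite at all.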
Re: [GENERAL] Trimming transaction logs after extended WAL archive failures
On Mar 26, 2014, at 9:04 AM, Jeff Janes jeff.ja...@gmail.com wrote:
> On Tue, Mar 25, 2014 at 6:33 PM, Jeff Janes jeff.ja...@gmail.com wrote:
>> On Tuesday, March 25, 2014, Steven Schlansker ste...@likeness.com wrote:
>>> Hi everyone, I have a Postgres 9.3.3 database machine. Due to some intelligent work on the part of someone who shall remain nameless, the WAL archive command included a '> /dev/null 2>&1' which masked archive failures until the disk entirely filled with 400GB of pg_xlog entries.
>>
>> PostgreSQL itself should be logging failures to the server log, regardless of whether the archive command swallows its own output.
>>
>>> I have fixed the archive command and can see WAL segments being shipped off of the server; however, the xlog remains at a stable size and is not shrinking. In fact, it's still growing at a (much slower) rate.
>>
>> The leading edge of the log files should be archived as soon as they fill up, and recycled/deleted two checkpoints later. The trailing edge should be archived upon checkpoints and then recycled or deleted. I think there is a throttle on how many off the trailing edge are archived each checkpoint. So issue a bunch of CHECKPOINT; commands for a while and see if that clears it up.

Indeed, forcing a bunch of CHECKPOINTs started to get things moving again.

> Actually my description is rather garbled, mixing up what I saw when wal_keep_segments was lowered, not when recovering from a long-lasting archive failure. Nevertheless, checkpoints are what provoke the removal of excessive WAL files. Are you logging checkpoints? What do they say? Also, what is in pg_xlog/archive_status ?
I do log checkpoints, but most of them recycle and don't remove:

Mar 26 16:09:36 prd-db1a postgres[29161]: [221-1] db=,user= LOG: checkpoint complete: wrote 177293 buffers (4.2%); 0 transaction log file(s) added, 0 removed, 56 recycled; write=539.838 s, sync=0.049 s, total=539.909 s; sync files=342, longest=0.015 s, average=0.000 s

That said, after letting the db run / checkpoint / archive overnight, the xlog did indeed start to slowly shrink. The pace at which it is shrinking is somewhat unsatisfying, but at least we are making progress now! I guess if I had just been patient I could have saved some mailing list traffic. But patience is hard when your production database system is running at 0% free disk :)

Thanks everyone for the help; if the log continues to shrink, I should be out of the woods now.

Best, Steven
Re: [GENERAL] PG choosing nested loop for set membership?
On Wed, Mar 26, 2014 at 11:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
> In principle I guess we could somehow merge the stats of y and z when looking at a coalesce(y, z) expression, but I'm not sure how that would work exactly.

Yeah, I'm not sure there's anything to fix here, either. Just a reminder that coalesces in joins are bad. The only thing I could think of was making an exception for the case when all inputs to coalesce() have one row. I think that's what's happening here: coalesce selectivity is estimated at less than one, even though you can't get a cardinality any less than the inputs (they're already one), so the nested loop sees an estimate that's less than the product of its inputs. Or that's my guess, anyhow.

Thanks for having a look!

--Brian
[GENERAL] To monitor the number of PostgreSQL database connections?
Hi,

Could you please provide any method (a query or a log file) to check the maximum number of connections reached during a time interval in a PostgreSQL database?
Re: [GENERAL] Disk Encryption in Production
On Mar 25, 2014, at 3:30 PM, Carlos Espejo carlosesp...@gmail.com wrote:
> Anybody running their PostgreSQL server from an ecryptfs container? What are the common production setups out there? What are the drawbacks that people have experienced with their solution?

We run Postgres on XFS on LVM volumes put on top of cloud block devices encrypted with LUKS. It feels like a lot of layers, but it lets us add more encrypted disk space on the fly very easily (especially since I've got all this config set up in a Chef cookbook). It seems to work just fine. I haven't done any testing, but I am pretty sure that it adds latency. But hey, if you need crypto, you need it. :-)

We currently store the LUKS keys encrypted with the host's private Chef key as a host attribute in the chef-server, so that the key data at rest is safe, and we have an init script that the cookbook installs early in the boot sequence that gets/decrypts the keys from Chef, starts crypto up, and mounts the filesystems before Postgres starts up. We've got some plans to improve this, but it's a heck of a lot better than storing them locally, and a heck of a lot cheaper than a real HSM.

Another option that we liked and tested out, but discarded because of cost, was Gazzang. They have a really slick setup: pretty much plug-n-play, and it works really well in the cloud, which is where we are.

The one thing that I have run into that was a problem was doing this on a loopback device mapped to a file on a host rather than directly on a real block device. We did this on some Cassandra servers, and pretty quickly began seeing corruption. We never figured out where the problem was, but it was a real pain to deal with. I'd avoid doing that.

Hope that helps. Have fun!

-tspencer
[GENERAL] PgAdmin errors
When I try to run SQL from PgAdmin:

CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL WITH OIDS

I get this error:

ERROR: syntax error at or near LIKE
LINE 2: CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL ...

Testing, I find that using the word LIKE always causes errors.

PostgreSQL 9.3.0 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit
Running PgAdmin on Windows 7 64-bit

What I need to do is copy a table completely: all data, indexes, keys.
Re: [GENERAL] To monitor the number of PostgreSQL database connections?
I know this isn't exactly what you're looking for (a query or log), but we use this tool to monitor our connections and alert when they hit a particular threshold: http://bucardo.org/check_postgres/check_postgres.pl.html#backends

On Wed, Mar 26, 2014 at 12:31 AM, Nithya Soman nit...@quintetsolutions.com wrote:
> Hi Could you please provide any method (query or any logfile) to check max connections happened during a time interval in psql DB ?
[GENERAL] Auditing Code - Fortify
Is Fortify supported for PostgreSQL? Is there any auditing tool you would suggest to check the schema design, roles, functions, and other aspects? I have used Fortify for Oracle and Sybase, but I'm just not sure about PostgreSQL. Can anyone provide some pointers here, and if not Fortify, then any such tool?

Regards...
Re: [GENERAL] PgAdmin errors
On 3/26/2014 12:32 PM, Hall, Samuel L (Sam) wrote:
> When I try to run SQL from PgAdmin : CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL WITH OIDS I get this error ERROR: syntax error at or near LIKE LINE 2: CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL ...

From the manual, it appears LIKE belongs in parentheses:

CREATE TABLE tse_history_old (LIKE tse_history INCLUDING ALL) WITH OIDS

-- john r pierce 37N 122W somewhere on the middle of the left coast
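Since the stated goal is a complete copy, one point worth noting: LIKE copies only the table definition (columns, and with INCLUDING ALL also indexes, defaults, and constraints), not the rows, so copying the data takes a second step. A sketch using the table names from the thread:

```sql
-- Structure first (indexes, keys, defaults come along via INCLUDING ALL):
CREATE TABLE tse_history_old (LIKE tse_history INCLUDING ALL) WITH OIDS;

-- Then the data:
INSERT INTO tse_history_old SELECT * FROM tse_history;
```

Foreign keys referencing other tables are one thing LIKE does not carry over, so those would need to be re-created by hand if required.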
Re: [GENERAL] To monitor the number of PostgreSQL database connections?
On Wed, Mar 26, 2014 at 11:01:28AM +0530, Nithya Soman wrote:
> Hi Could you please provide any method (query or any logfile) to check max connections happened during a time interval in psql DB ?

I think there will be a message in the logs when you exceed max_connections. I think the error string will be:

sorry, too many clients already

That is kind of an odd message.

-- Bruce Momjian br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +
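Since the logs only speak up once the limit is actually hit, a common approach is to sample pg_stat_activity periodically (from cron or a monitoring agent) and keep the history yourself. A minimal sketch of one sample:

```sql
-- One data point; store the result somewhere and take the max over your interval.
SELECT now() AS sampled_at,
       count(*) AS current_connections,
       (SELECT setting::int FROM pg_settings
         WHERE name = 'max_connections') AS max_allowed
FROM pg_stat_activity;
```

Sampling obviously misses spikes between runs, which is why threshold-based tools like the check_postgres backends check mentioned above are a useful complement.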
Re: [GENERAL] PgAdmin errors
That doesn't help. Even this CREATE TABLE tse_history_old (LIKE tse_history) gives an error From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of John R Pierce Sent: Wednesday, March 26, 2014 2:43 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] PgAdmin errors On 3/26/2014 12:32 PM, Hall, Samuel L (Sam) wrote: When I try to run SQL from PgAdmin : CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL WITH OIDS I get this error ERROR: syntax error at or near LIKE LINE 2: CREATE TABLE tse_history_old LIKE tse_history INCLUDING ALL ... from the manual, it appears LIKE belongs in parenthesis. CREATE TABLE tse_history_old (LIKE tse_history INCLUDING ALL) WITH OIDS -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Auditing Code - Fortify
On 3/26/2014 12:42 PM, Dev Kumkar wrote:
> Is Fortify supported for PostgreSQL?

why don't you ask the Fortify vendor ?

-- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Auditing Code - Fortify
On Thu, Mar 27, 2014 at 1:31 AM, John R Pierce pie...@hogranch.com wrote:
> why don't you ask the Fortify vendor ?

Yup, following up with them in parallel. Searching didn't give me any good links, so I wanted to check with the community here too. If not Fortify, is there any other such tool?

Regards...
Re: [GENERAL] PgAdmin errors
ERROR: syntax error at or near LIKE LINE 2: CREATE TABLE tse_history_old (LIKE tse_history) ^ From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of John R Pierce Sent: Wednesday, March 26, 2014 3:02 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] PgAdmin errors On 3/26/2014 12:58 PM, Hall, Samuel L (Sam) wrote: That doesn't help. Even this CREATE TABLE tse_history_old (LIKE tse_history) gives an error the exact same error ? -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] PgAdmin errors
On 3/26/2014 12:58 PM, Hall, Samuel L (Sam) wrote: That doesn't help. Even this CREATE TABLE tse_history_old (LIKE tse_history) gives an error the exact same error ? -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] Increase in max_connections
Hi all,

We finally made some headway on this - we noticed messages like the below in /var/log/messages whenever the issue happened:

Mar 26 07:39:58 site-db01b kernel: postmaster: page allocation failure. order:1, mode:0x20
Mar 26 07:39:58 site-db01b kernel: Pid: 39066, comm: postmaster Not tainted 2.6.32-279.el6.x86_64 #1
Mar 26 07:39:58 site-db01b kernel: Call Trace:
Mar 26 07:39:58 site-db01b kernel: IRQ [8112759f] ? __alloc_pages_nodemask+0x77f/0x940
Mar 26 07:39:58 site-db01b kernel: [8116297a] ? fallback_alloc+0x1ba/0x270
Mar 26 07:39:58 site-db01b kernel: [81161d62] ? kmem_getpages+0x62/0x170
Mar 26 07:39:58 site-db01b kernel: [811623cf] ? cache_grow+0x2cf/0x320
Mar 26 07:39:58 site-db01b kernel: [811626f9] ? cache_alloc_node+0x99/0x160
Mar 26 07:39:58 site-db01b kernel: [811634db] ? kmem_cache_alloc+0x11b/0x190
Mar 26 07:39:58 site-db01b kernel: [8142df32] ? sk_clone+0x22/0x2e0
Mar 26 07:39:58 site-db01b kernel: [8142dc68] ? sk_prot_alloc+0x48/0x1c0
Mar 26 07:39:58 site-db01b kernel: [81494ae3] ? tcp_create_openreq_child+0x23/0x450
Mar 26 07:39:58 site-db01b kernel: [8147bb86] ? inet_csk_clone+0x16/0xd0
Mar 26 07:39:58 site-db01b kernel: [814935be] ? tcp_v4_rcv+0x4fe/0x8d0
Mar 26 07:39:58 site-db01b kernel: [81471200] ? ip_local_deliver_finish+0x0/0x2d0
Mar 26 07:39:58 site-db01b kernel: [814712dd] ? ip_local_deliver_finish+0xdd/0x2d0
Mar 26 07:39:58 site-db01b kernel: [8149239d] ? tcp_v4_syn_recv_sock+0x4d/0x310
Mar 26 07:39:58 site-db01b kernel: [81494886] ? tcp_check_req+0x226/0x460
Mar 26 07:39:58 site-db01b kernel: [81491dbb] ? tcp_v4_do_rcv+0x35b/0x430
Mar 26 07:39:58 site-db01b kernel: [81489cfd] ? tcp_rcv_established+0x38d/0x800
Mar 26 07:39:58 site-db01b kernel: [81470fb5] ? ip_rcv+0x275/0x350
Mar 26 07:39:58 site-db01b kernel: [81470a2d] ? ip_rcv_finish+0x12d/0x440
Mar 26 07:39:58 site-db01b kernel: [81471568] ? ip_local_deliver+0x98/0xa0
Mar 26 07:39:58 site-db01b kernel: [8143a7bb] ? __netif_receive_skb+0x49b/0x6f0
Mar 26 07:39:58 site-db01b kernel: [a02fe1b4] ? tg3_poll_work+0x654/0xe30 [tg3]
Mar 26 07:39:58 site-db01b kernel: [a02fe9dc] ? tg3_poll_msix+0x4c/0x150 [tg3]
Mar 26 07:39:58 site-db01b kernel: [8143ca38] ? netif_receive_skb+0x58/0x60
Mar 26 07:39:58 site-db01b kernel: [81073ec1] ? __do_softirq+0xc1/0x1e0
Mar 26 07:39:58 site-db01b kernel: [8143cb40] ? napi_skb_finish+0x50/0x70
Mar 26 07:39:58 site-db01b kernel: [8143f193] ? net_rx_action+0x103/0x2f0
Mar 26 07:39:58 site-db01b kernel: [8143f079] ? napi_gro_receive+0x39/0x50
Mar 26 07:39:58 site-db01b kernel: [810db800] ? handle_IRQ_event+0x60/0x170
Mar 26 07:39:58 site-db01b kernel: [81073ca5] ? irq_exit+0x85/0x90
Mar 26 07:39:58 site-db01b kernel: [8100de85] ? do_softirq+0x65/0xa0
Mar 26 07:39:58 site-db01b kernel: [81073f1f] ? __do_softirq+0x11f/0x1e0
Mar 26 07:39:58 site-db01b kernel: [8100c24c] ? call_softirq+0x1c/0x30

Doing some digging on that, we disabled TSO/TRO, GSO/GRO at the TCP layer - and that seems to have helped.

$ sudo ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: off
tx-checksumming: on
scatter-gather: on
tcp-segmentation-offload: off
udp-fragmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: off
large-receive-offload: off

However, I'm looking for more information on what's happening: the stack trace above seems to indicate that the kernel was unable to allocate 2*4k pages (8k) for the network stack. It's likely that was needed for GSO/GRO.
However, I'm wondering why the kernel is unable to allocate just 8k - we have a 768G RAM server, with over 54G in buffers/cache:

root@site-db01b:/proc # free -m
             total       used       free     shared    buffers     cached
Mem:        775382     773354       2028          0       1403     738735
-/+ buffers/cache:      33215     742166
Swap:            0          0          0

Looking a little deeper, I saw signs of memory being heavily fragmented:

root@site-db01b:/var/log # cat /proc/buddyinfo
Node 0, zone      DMA      1      1      2      2      2      1      0      0      1      1      3
Node 0, zone    DMA32      8      7      8      7     10      8      7     11      9      5     92
Node 0, zone   Normal  13069      0      0      0      0      0      0      0      0      0      1
Node 1, zone   Normal 652315  36885   1168      0      1      1      0      1      1      1      0

Node 0 has 13069 4k blocks, and zero 8k blocks available to use, which is likely what caused the problem, I'd think. A little while later, though, buddyinfo changed and suddenly there was a lot more memory in 8k blocks:

root@site-db01b:/proc # cat /proc/buddyinfo
Node 0, zone      DMA      1      1      2      2      2      1      0      0      1      1      3
Node 0, zone    DMA32      8      7      8      7     10      8      7     11      9      5     92
Node 0, zone   Normal   9645   5495   1115      0      0      0      0      0      0      0      1
Node 1, zone   Normal 409734  10953      1      0      1      1      0      1      1      1      0

(Note the change in the Node 0 line: 8k blocks went up from 0 to 5495.)

Anyone have any idea why memory was so fragmented, and what causes memory to be defragged? Is it something Postgres does? Are there any kernel-specific settings that control it?
Re: [GENERAL] Increase in max_connections
On Wed, Mar 26, 2014 at 08:22:01PM +0000, Anand Kumar, Karthik wrote:
> Looking a little deeper, I saw signs of memory being heavily fragmented:
>
> root@site-db01b:/var/log # cat /proc/buddyinfo
> Node 0, zone      DMA      1      1      2      2      2      1      0      0      1      1      3
> Node 0, zone    DMA32      8      7      8      7     10      8      7     11      9      5     92
> Node 0, zone   Normal  13069      0      0      0      0      0      0      0      0      0      1
> Node 1, zone   Normal 652315  36885   1168      0      1      1      0      1      1      1      0
>
> Node 0 has 13069 4k blocks, and zero 8k blocks available to use, which is likely what caused the problem, I'd think. A little while later, though, buddyinfo changed and suddenly there was a lot more memory in 8k blocks:
>
> root@site-db01b:/proc # cat /proc/buddyinfo
> Node 0, zone      DMA      1      1      2      2      2      1      0      0      1      1      3
> Node 0, zone    DMA32      8      7      8      7     10      8      7     11      9      5     92
> Node 0, zone   Normal   9645   5495   1115      0      0      0      0      0      0      0      1
> Node 1, zone   Normal 409734  10953      1      0      1      1      0      1      1      1      0
>
> (Note the change in the Node 0 line, 8k blocks went up from 0 to 5495)
>
> Anyone have any idea why memory was so fragmented, and what causes memory to be defragged? Is it something postgres does? Are there any kernel specific settings that control it?

If I had to take a guess, it is zone_reclaim; see: http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html

The fix is this sysctl:

vm.zone_reclaim_mode = 0

-- Bruce Momjian br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +
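For reference, a sketch of checking and applying that setting (assumes root on Linux; the file paths are the usual defaults):

```shell
cat /proc/sys/vm/zone_reclaim_mode                     # show the current value
sysctl -w vm.zone_reclaim_mode=0                       # change it immediately
echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.conf    # persist across reboots
```

On NUMA boxes like this two-node server, zone_reclaim_mode=1 makes the kernel reclaim pages within a node before borrowing from the other node, which interacts badly with PostgreSQL's large shared buffer cache; the linked post covers the details.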
Re: [GENERAL] PgAdmin errors
Hall, Samuel L (Sam) sam.h...@alcatel-lucent.com writes:
> ERROR: syntax error at or near LIKE
> LINE 2: CREATE TABLE tse_history_old (LIKE tse_history)

You're certain the server you're talking to is 9.3?
(SELECT version() is a good way to be sure.)

			regards, tom lane
Re: [GENERAL] PgAdmin errors
Yes

PostgreSQL 9.3.0 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]
Sent: Wednesday, March 26, 2014 4:03 PM
To: Hall, Samuel L (Sam)
Cc: John R Pierce; pgsql-general@postgresql.org
Subject: Re: [GENERAL] PgAdmin errors

Hall, Samuel L (Sam) sam.h...@alcatel-lucent.com writes:
> ERROR: syntax error at or near LIKE
> LINE 2: CREATE TABLE tse_history_old (LIKE tse_history)

You're certain the server you're talking to is 9.3?
(SELECT version() is a good way to be sure.)

			regards, tom lane
Re: [GENERAL] PgAdmin errors
On 03/26/2014 02:13 PM, Hall, Samuel L (Sam) wrote:
> Yes
>
> PostgreSQL 9.3.0 on x86_64-unknown-linux-gnu, compiled by gcc
> (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

So what happens when you run the command from psql?

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] PgAdmin errors
Hall, Samuel L (Sam) wrote:
> ERROR: syntax error at or near LIKE
> LINE 2: CREATE TABLE tse_history_old (LIKE tse_history)
>                                       ^

Note it says this is on line 2. What have you got in the previous line?
(psql has \p to show existing query buffer contents, but I don't know if
pgadmin has such a facility.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [GENERAL] PgAdmin errors
On 3/26/2014 1:07 PM, Hall, Samuel L (Sam) wrote:
> ERROR: syntax error at or near LIKE
> LINE 2: CREATE TABLE tse_history_old (LIKE tse_history)

that's the syntax from http://www.postgresql.org/docs/current/static/sql-createtable.html

try it in psql instead of pgadmin?

earlier, you mentioned...

> PostgreSQL 9.3.0 on x86_64-unknown-linux-gnu, compiled by gcc
> (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, 64-bit

BTW, you really should update that database server to 9.3.4

> What I need to do is copy a table completely, all data, indexes, keys

CREATE TABLE ... (LIKE ...) will copy fields, constraints, and indexes (given that you used INCLUDING ALL), but it doesn't copy data.
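Putting those two pieces together, a complete copy (structure and rows) takes two statements. A sketch using the table names from the thread (the INCLUDING ALL and the INSERT are the generic pattern, not something the poster ran):

```sql
-- Copies columns, defaults, constraints, and indexes, but no rows
CREATE TABLE tse_history_old (LIKE tse_history INCLUDING ALL);

-- LIKE never copies data, so move the rows over separately
INSERT INTO tse_history_old SELECT * FROM tse_history;
```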
Re: [GENERAL] Increase in max_connections
Thanks Bruce. Really interesting, but I show zone reclaim is already turned off on our system.

root@site-db01b:~ # numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 393181 MB
node 0 free: 467 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 393215 MB
node 1 free: 319 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

root@site-db01b:~ # cat /proc/sys/vm/zone_reclaim_mode
0

Thanks,
Karthik

On 3/26/14 1:54 PM, Bruce Momjian br...@momjian.us wrote:
> On Wed, Mar 26, 2014 at 08:22:01PM +, Anand Kumar, Karthik wrote:
>> [...]
>> Anyone have any idea why memory was so fragmented, and what causes memory
>> to be defragged? Is it something postgres does? Are there any kernel
>> specific settings that control it?
>
> If I had to take a guess, it is zone_reclaim; see:
>
> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
>
> The fix is this sysctl:
>
> 	vm.zone_reclaim_mode = 0
>
> --
> Bruce Momjian  br...@momjian.us  http://momjian.us
> EnterpriseDB   http://enterprisedb.com
>
> + Everyone has their own god. +
Re: [GENERAL] To monitor the number of PostgreSQL database connections?
Nithya Soman wrote:
> Hi
>
> Could you please provide any method (query or any logfile) to check max
> connections happened during a time interval in psql DB?

Only if the time interval desired is basically zero-width (i.e., instantaneous). The pg_stat_activity view is your friend in this. You have numerous options, including self-coding, for capturing and historically reviewing these snapshots and/or setting up monitoring on them.

This presumes you are actually wondering "over any given time period, how many open connections were there?" If your question is actually "in the given time period, did any clients get rejected because {max connections} were already in use?", you can check the PostgreSQL logs for the relevant error. Bruce basically addressed this second question while Brian answered the first.

David J.
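A minimal version of the snapshot approach David describes might look like this; the idea is to run it from cron or a monitoring agent on a schedule and store the rows to build history (a sketch, not from the thread):

```sql
-- One instantaneous sample of connection usage vs. the configured limit
SELECT now()                                    AS sampled_at,
       count(*)                                 AS open_connections,
       current_setting('max_connections')::int  AS max_connections
FROM pg_stat_activity;
```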