Re: [GENERAL] pgpgout/s without swapping--what does it mean?
On Monday, March 17, 2014, Kevin Goess wrote:
> We had a big increase in load, iowait, and disk i/o on a dedicated
> database host the other night.
>
> Looking at the sar logs, the problem shows itself in a big increase in
> pgpgout/s, which I believe is postgres paging out parts of itself to disk?
>
> 02:15:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
> ...
>
> However, there isn't a corresponding increase in pages *in*, so if
> postgres is writing portions of itself out to disk, they can't be very
> important.

As far as I can tell, pgpgout/s includes all data written to disk, not just process memory being paged. So it includes WAL and data files being written, for example due to bulk loads. It seems like an odd name for that parameter, and I don't know how it differs from bwrtn/s, other than the different units.

If it is a bulk load, that would explain why it is not being read back in. Also, it could be that the data is needed, but when it is needed it is still in cache and so doesn't lead to disk reads. But it still needs to be written for durability reasons.

Cheers,

Jeff
Re: [GENERAL] Upgrade: 9.0.5->9.3.3
On Mon, 17 Mar 2014, Tom Lane wrote:
> I'm guessing from those path names that you are using self-compiled
> executables, not somebody's packaging? If the latter, whose? I'm confused
> as to how you got pg_upgrade installed without knowing where it came from.

Tom,

Both were built with scripts from SlackBuilds.org. The 9.0.5 pg_upgrade in /opt/postgresql-9.0.5/contrib/ is datestamped Sep 22, 2011; I don't see that I actually built and installed it or pg_upgrade_support (that's also there as source, but neither built nor installed). Three-and-a-half years later I've no recollection of where I downloaded them.

> contrib is part of the main PG distribution. The sources for pg_upgrade
> would be under contrib/pg_upgrade/ in an unpacked tarball, and
> pg_upgrade_support lives next door in contrib/pg_upgrade_support/. If you
> did a "make install" in contrib/ that would've installed both.

A-ha! Running 'pg_upgrade --help' tells me that it is, indeed, installed. And it is the version for -9.3.3, too. So this question's been answered.

> That's pretty much the base case. If none of the other options appeal to
> you, don't use 'em.

Nah, nothing looks like I need them.

Much appreciated,

Rich

-- 
Richard B. Shepard, Ph.D.       | Have knowledge, will travel.
Applied Ecosystem Services, Inc. | www.appl-ecosys.com
Voice: 503-667-4517  Fax: 503-667-8863

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Upgrade: 9.0.5->9.3.3
Rich Shepard writes: >Now, -9.0.5 is installed in /usr/local/pgsql/ and -9.3.3 is installed in > /opt/pgsql-9.3.3. I want to use pg_upgrade and have read the Web page with > the instructions. I'm guessing from those path names that you are using self-compiled executables, not somebody's packaging? If the latter, whose? I'm confused as to how you got pg_upgrade installed without knowing where it came from. >Instruction #4 tells me to install pg_upgrade and pg_upgrade_support. I > have the 9.0.5 version of pg_upgrade in /opt/postgresql-9.0.5/contrib/ > (without pg_upgrade_support). But, I do not see the 'contrib' page on > postgresql.org. I find the 'community' and 'foundry' pages, but not the > 'contrib' page. A pointer to pg_upgrade and pg_upgrade_support for 9.3.3 is > needed. contrib is part of the main PG distribution. The sources for pg_upgrade would be under contrib/pg_upgrade/ in an unpacked tarball, and pg_upgrade_support lives next door in contrib/pg_upgrade_support/. If you did a "make install" in contrib/ that would've installed both. Possibly you're expecting pg_upgrade_support to produce an executable under $installdir/bin? It doesn't --- it should get installed as lib/pg_upgrade_support.so or equivalent. In any case, pg_upgrade will certainly complain if it can't find it. >If my reading of the instructions is correct, the following command line > should migrate my few, small databased from 9.0.5 to 9.3.3: > pg_upgrade -b /usr/local/pgsql/bin -B /opt/pgsql-9.3.3/bin -d \ > /usr/local/pgsql/data -D /opt/pgsql-9.3.3/data That's pretty much the base case. If none of the other options appeal to you, don't use 'em. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Upgrade: 9.0.5->9.3.3
Now, -9.0.5 is installed in /usr/local/pgsql/ and -9.3.3 is installed in /opt/pgsql-9.3.3. I want to use pg_upgrade and have read the Web page with the instructions.

Instruction #4 tells me to install pg_upgrade and pg_upgrade_support. I have the 9.0.5 version of pg_upgrade in /opt/postgresql-9.0.5/contrib/ (without pg_upgrade_support). But I do not see the 'contrib' page on postgresql.org. I find the 'community' and 'foundry' pages, but not the 'contrib' page. A pointer to pg_upgrade and pg_upgrade_support for 9.3.3 is needed.

If my reading of the instructions is correct, the following command line should migrate my few, small databases from 9.0.5 to 9.3.3:

pg_upgrade -b /usr/local/pgsql/bin -B /opt/pgsql-9.3.3/bin -d \
/usr/local/pgsql/data -D /opt/pgsql-9.3.3/data

I don't see that I need any additional options. Do I?

TIA,

Rich

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SQL advice needed
On 17/03/14 21:42, Merlin Moncure wrote: >> I can do it in plpgsql. But that would mean to accumulate the complete >> > result in memory first, right? I need to avoid that. > I would test that assumption. This is better handled in loop IMO. > > LOOP > RETURN QUERY SELECT * FROM xx(); > IF NOT found > THEN > RETURN; > END IF; > END LOOP; At least according to the manual it is stored in memory: Note: The current implementation of RETURN NEXT and RETURN QUERY stores the entire result set before returning from the function, as discussed above. That means that if a PL/pgSQL function produces a very large result set, performance might be poor: data will be written to disk to avoid memory exhaustion, but the function itself will not return until the entire result set has been generated. A future version of PL/pgSQL might allow users to define set-returning functions that do not have this limitation. Currently, the point at which data begins being written to disk is controlled by the work_mem configuration variable. Administrators who have sufficient memory to store larger result sets in memory should consider increasing this parameter. I didn't test that, though. Torsten -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SQL advice needed
On Mon, Mar 17, 2014 at 4:20 PM, Torsten Förtsch wrote:
> On 17/03/14 21:42, Merlin Moncure wrote:
>>> I can do it in plpgsql. But that would mean to accumulate the complete
>>> result in memory first, right? I need to avoid that.
>> I would test that assumption. This is better handled in a loop IMO.
>>
>> LOOP
>>   RETURN QUERY SELECT * FROM xx();
>>   IF NOT found THEN
>>     RETURN;
>>   END IF;
>> END LOOP;
>
> At least according to the manual it is stored in memory:
>
> Note: The current implementation of RETURN NEXT and RETURN QUERY stores
> the entire result set before returning from the function, as discussed
> above. That means that if a PL/pgSQL function produces a very large
> result set, performance might be poor: data will be written to disk to
> avoid memory exhaustion, but the function itself will not return until
> the entire result set has been generated. A future version of PL/pgSQL
> might allow users to define set-returning functions that do not have
> this limitation. Currently, the point at which data begins being written
> to disk is controlled by the work_mem configuration variable.
> Administrators who have sufficient memory to store larger result sets in
> memory should consider increasing this parameter.
>
> I didn't test that, though.

The manual describes the exact opposite of what you thought it does -- large result sets are paged out to disk, not stored in memory (this is a feature). CTEs use a similar tactic, so it's a wash.

The performance overhead of a tuplestore is probably not as bad as you think -- just test it out on some simulated results and monitor performance. Either way, work_mem controls it. It's generally dangerous to crank work_mem to huge values, but it's OK to set it temporarily via SET to a large value (say, 1GB) for a query, particularly if you know that it's only getting issued by one caller at a time.
merlin -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
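The temporary SET that Merlin mentions is safest done with SET LOCAL, which scopes the change to a single transaction. A minimal sketch (the function name is a placeholder, not something from this thread):

```sql
-- Raise work_mem for one transaction only, so a large RETURN QUERY
-- tuplestore spills to disk later. Reverts automatically at COMMIT/ROLLBACK.
BEGIN;
SET LOCAL work_mem = '1GB';
SELECT * FROM my_large_srf();   -- placeholder for the expensive call
COMMIT;
```

Unlike a plain SET, SET LOCAL cannot leak the inflated setting into other work done on the same connection, e.g. by a connection pooler.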
Re: [GENERAL] Dump Database
What does the shell command: pwd show when run immediately after you issue a su postgres command? You may want to try su - postgres to ensure you are in the postgres users home directory. From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of José Pedro Santos Sent: Monday, March 17, 2014 7:50 AM To: Postgres Ajuda Subject: [GENERAL] Dump Database Dear all, I'm trying to dump my database using the following command after I enter as su postgres: pg_dump - U postgres mydb -f mydb.sql ... but I'm always having the following message: pg_dump [archiver] Could not open output file "mydb.sql" : Permission denied I try to use the same statement using su.. but I didn't manage. Can someone give me a help? I'm using Ubuntu 12.04 LTS and Postgres 9.1. Kind Regards, José Santos
Re: [GENERAL] SQL advice needed
Torsten Förtsch wrote:
> Hi,
>
> I have a volatile function that returns multiple rows. It may also
> return nothing. Now, I want to write an SQL statement that calls this
> function until it returns an empty result set and returns all the rows.
>
> What's the best (or at least a working) way to achieve what I want?
>
> I can do it in plpgsql. But that would mean to accumulate the complete
> result in memory first, right? I need to avoid that.

You are describing procedural logic. If you need intermediate steps before "returns all the rows", then either those intermediate steps stay in memory OR you stick them in a table somewhere and, when your procedure is done, send back a cursor over that, possibly temporary, table. I don't think abusing WITH/RECURSIVE is going to be viable.

You should also consider whether you can do what you need using set logic (i.e., pure SQL). At worst it will be a learning exercise and a performance comparator.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/SQL-advice-needed-tp5796431p5796436.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] SQL advice needed
On Mon, Mar 17, 2014 at 3:21 PM, Torsten Förtsch wrote:
> Hi,
>
> I have a volatile function that returns multiple rows. It may also
> return nothing. Now, I want to write an SQL statement that calls this
> function until it returns an empty result set and returns all the rows.
>
> So, in principle I want to:
>
> WITH RECURSIVE
>   t AS (
>     SELECT * FROM xx()
>     UNION ALL
>     SELECT * FROM xx()
>   )
> SELECT * FROM t;
>
> But that's not recursive because the union all part lacks a reference to t.
>
> Next I tried this:
>
> WITH RECURSIVE
>   t AS (
>     SELECT * FROM xx()
>     UNION ALL
>     SELECT * FROM xx() WHERE EXISTS (SELECT 1 FROM t)
>   )
> SELECT * FROM t;
>
> But the reference to t is not allowed in a subquery.
>
> What's the best (or at least a working) way to achieve what I want?
>
> I can do it in plpgsql. But that would mean to accumulate the complete
> result in memory first, right? I need to avoid that.

I would test that assumption. This is better handled in a loop IMO.

LOOP
  RETURN QUERY SELECT * FROM xx();
  IF NOT found THEN
    RETURN;
  END IF;
END LOOP;

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
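Merlin's loop fragment needs to live inside a set-returning PL/pgSQL function to be runnable. A sketch of what that could look like -- the wrapper name and the assumption that xx() is declared RETURNS SETOF sometype are mine, not from the thread. RETURN QUERY sets FOUND according to whether it produced any rows, which is what terminates the loop:

```sql
-- Hypothetical wrapper; assumes xx() is declared RETURNS SETOF sometype.
CREATE OR REPLACE FUNCTION xx_until_empty()
RETURNS SETOF sometype
LANGUAGE plpgsql AS $$
BEGIN
  LOOP
    RETURN QUERY SELECT * FROM xx();  -- appends rows to the result, sets FOUND
    IF NOT FOUND THEN
      RETURN;                         -- xx() came back empty: stop
    END IF;
  END LOOP;
END;
$$;

-- usage: SELECT * FROM xx_until_empty();
```

Per the manual text quoted elsewhere in this thread, the accumulated result is a tuplestore that spills to disk past work_mem, so this does not require holding everything in RAM.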
[GENERAL] A user's interpretation (and thoughts) of the WAL replay bug in 9.3
I'm trying to follow the discussion on -hackers and decided I'd try putting everything I'm reading into my own words. It is probable some or even all of the following is simply wrong, so please do not go acting on it without other people providing supporting evidence or comments. I am a database user, not a database programmer, but feel I do have a solid ability to understand and learn, and enough experience to adequately represent a moderately skilled DBA who might see the release notes and scratch their head.

During the application/restoration (replay) of WAL, the modification of indexes may not be performed correctly, resulting in physical rows/keys not having matching index entries. Any subsequent attempt to add a duplicate of the existing physical key will therefore succeed, thus resulting in a duplicate physical record being present and causing any future attempt to REINDEX the unique index column(s) to fail.

A typical replay scenario would first have a row on the PK side of a relationship updated (though not any of the key columns - or any other indexed columns - since this is hot-related...?). If this update takes sufficiently long (or concurrency is otherwise extremely high) that another transaction attempted to take a lock on the PK (e.g., so that it could validate an FK relationship), then a "tuple-lock" operation is performed and added to the WAL. The replay of this WAL entry caused the locked index row to be effectively invisible.

The core alteration in 9.3 that exposed this bug is that, in order to improve concurrency, updates to rows that did not hit indexes (i.e., hot-update capable) allowed other non-updating transactions to simultaneously acquire a lock sufficient to ensure that the core row elements (those that are indexed) remained unaltered, while not caring whether specific non-indexed attributes were altered. Prior to 9.3, update locking was sufficient (i.e., exclusive) to cause the other sessions to wait for a lock and thus never hold one simultaneously.
In that situation the WAL replay was effectively serialized with respect to the transactions, and the index entry modifications were unnecessary but not incorrect.

The most obvious test-and-correct mechanism would be to try and create new indexes and, if the creation fails, manually remove the duplicate rows from the table. A table re-write and/or re-index could only work if no duplicate entries were made (I am not sure of this line of thought...) so, for instance, if the PK column is tied to a sequence then no duplicates would ever have been entered. The unknown, for me, is whether MVCC duplication is impacted, in which case any update could introduce a duplicate.

Regardless of the duplicate record issue, as soon as the replay happens the system cannot find the corresponding entry in the index, and so any select queries are immediately at risk of silently returning incorrect results. I suppose any related FK constraints would also be broken, and those too would remain invisible until the next forced validation of the constraint.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/A-user-s-interpretation-and-thoughts-of-the-WAL-replay-bug-in-9-3-tp5796432.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
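The "test-and-correct" idea above can be expressed as a query. This is only a sketch of the approach, not an official recovery procedure, and the table/column names are placeholders:

```sql
-- Find physical duplicates that a damaged unique index failed to prevent
-- (pk_col / some_table are hypothetical names):
SELECT pk_col, count(*)
FROM   some_table
GROUP  BY pk_col
HAVING count(*) > 1;

-- After manually removing the duplicate rows (e.g. by ctid), rebuild the
-- suspect index; this will fail again if any duplicates remain:
REINDEX INDEX some_table_pkey;
```

Note the GROUP BY forces a scan that does not trust the broken index, which is exactly why it can see rows the index cannot.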
[GENERAL] SQL advice needed
Hi,

I have a volatile function that returns multiple rows. It may also return nothing. Now, I want to write an SQL statement that calls this function until it returns an empty result set and returns all the rows.

So, in principle I want to:

WITH RECURSIVE
  t AS (
    SELECT * FROM xx()
    UNION ALL
    SELECT * FROM xx()
  )
SELECT * FROM t;

But that's not recursive because the union all part lacks a reference to t.

Next I tried this:

WITH RECURSIVE
  t AS (
    SELECT * FROM xx()
    UNION ALL
    SELECT * FROM xx() WHERE EXISTS (SELECT 1 FROM t)
  )
SELECT * FROM t;

But the reference to t is not allowed in a subquery.

What's the best (or at least a working) way to achieve what I want?

I can do it in plpgsql. But that would mean to accumulate the complete result in memory first, right? I need to avoid that.

Thanks,
Torsten

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] pgpgout/s without swapping--what does it mean?
Kevin Goess wrote > Can anybody help me understand what these statistics are suggesting, > what's > actually going on on this box/in postgresql? What is it writing to disk, > and why? Is it just writing out new/changed rows, or what? Not a clue on the statistics but most likely you are seeing checkpoint activity. At a simplified level all changes to the database are first persisted to disk using WAL (write-ahead-logs). The changes are written to disk, into WAL files, during commit via fsync. The original data files are not affected immediately thus improving performance at the time of commit by instead risking a prolonged delay in situations where an unclean shutdown occurs. However, at some point the WAL files need to be removed and the physical table files updated. This occurs during a checkpoint. A checkpoint basically causes the on-disk files to become baselined to the current reality so that only subsequent WAL files need be applied. There is considerably more to this whole concept than I can go into off the top of my head but in addition to looking at just I/O it would help to look, simultaneously, at what processes are active. HTH David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/pgpgout-s-without-swapping-what-does-it-mean-tp5796346p5796355.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
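The checkpoint theory above is checkable from inside the database: the cumulative counters in pg_stat_bgwriter (present in the 9.x series) show how much checkpoint writing has occurred since the statistics were last reset. A sketch:

```sql
-- How many checkpoints have run, and how many buffers they wrote:
SELECT checkpoints_timed,    -- triggered by checkpoint_timeout expiring
       checkpoints_req,      -- requested early, e.g. by WAL volume
       buffers_checkpoint,   -- buffers written out during checkpoints
       buffers_backend       -- buffers ordinary backends had to write
FROM   pg_stat_bgwriter;
```

Sampling this before and after a busy window, together with log_checkpoints = on so each checkpoint is timestamped in the server log, would let one correlate checkpoints against the sar spikes directly.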
[GENERAL] pgpgout/s without swapping--what does it mean?
We had a big increase in load, iowait, and disk i/o on a dedicated database host the other night.

Looking at the sar logs, the problem shows itself in a big increase in pgpgout/s, which I believe is postgres paging out parts of itself to disk?

02:15:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
02:25:01 AM     49.63    163.83    836.80      0.00    467.69      4.80      0.00      4.80    100.00
02:35:01 AM     44.85    230.29   1248.68      0.00    677.18     16.21      0.00     16.21    100.00
02:45:01 AM     47.67    190.82   1059.58      0.00    517.72      5.33      0.00      5.33    100.00
02:55:01 AM     59.00    175.58    986.87      0.00    514.08     18.13      0.00     18.13    100.00
03:05:01 AM     24.67    849.77   1382.60      0.00   1267.94     37.35      0.00     37.35    100.00
03:15:01 AM     28.67   1701.67    717.88      0.00   1231.48      0.00      0.00      0.00      0.00
03:25:02 AM     42.64  21342.02   4086.19      0.04   9701.70    415.92      0.00    414.44     99.64
03:35:01 AM     35.60  28290.69   4305.38      0.10  12906.73    623.89      0.00    615.85     98.71
03:45:01 AM     36.94  31119.30   3675.34      0.01  12456.54    527.55      0.00    521.61     98.87
03:55:01 AM     42.77  29020.72   3458.96      0.01  12165.57    557.57      0.00    553.10     99.20
Average:        41.25  11306.39   2175.59      0.02   5189.70    220.62      0.00    218.63     99.10

However, there isn't a corresponding increase in pages *in*, so if postgres is writing portions of itself out to disk, they can't be very important. And there's no swapping going on:

02:15:01 AM  pswpin/s pswpout/s
02:25:01 AM      0.00      0.00
02:35:01 AM      0.00      0.00
02:45:01 AM      0.00      0.00
02:55:01 AM      0.00      0.00
03:05:01 AM      0.00      0.00
03:15:01 AM      0.00      0.00
03:25:02 AM      0.00      0.00
03:35:01 AM      0.00      0.00
03:45:01 AM      0.00      0.00
03:55:01 AM      0.00      0.00
Average:         0.00      0.00

Can anybody help me understand what these statistics are suggesting, what's actually going on on this box/in postgresql? What is it writing to disk, and why? Is it just writing out new/changed rows, or what?

-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com

bepress: sustainable scholarly publishing
Re: [GENERAL] Linux OOM-Killer
On 03/17/2014 04:21 AM, basti wrote:
> uname -a
> Linux h2085616 3.2.0-3-amd64 #1 SMP Mon Jul 23 02:45:17 UTC 2012 x86_64 GNU/Linux
>
> At any time there are not more than 20-30 Connections at once.
>
> Swap is disabled.
> free -m
>              total       used       free     shared    buffers     cached
> Mem:         32215      16163      16051          0         40      14842
> -/+ buffers/cache:       1281      30934
> Swap:            0          0          0
>
> With the updates it is a little bit tricky:
> https://wiki.postgresql.org/wiki/Nov2013ReplicationIssue
> "If you are currently using 9.2.4, 9.1.9 or 9.0.13 and use any form of
> builtin replication do not install the most recent update. Instead, wait
> for the next update (9.2.6, 9.1.11 and 9.0.15) to come out. Options for
> users who have already updated, or are running 9.3, include: if you are
> using 9.2.5, 9.1.10 or 9.0.14, downgrade your replica servers to the
> prior update release (9.2.4, 9.1.9 or 9.0.13)."

Well, basically the above is saying (in your case) avoid 9.1.10, by either staying below 9.1.10 or skipping over it to a higher version. FYI, currently the 9.1.x series is at 9.1.12.

-- 
Adrian Klaver
adrian.kla...@aklaver.com

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Dump Database
pgsql-general-ow...@postgresql.org wrote on 17/03/2014 12:50:20:

> From: José Pedro Santos
> To: Postgres Ajuda
> Date: 17/03/2014 12:56
> Subject: [GENERAL] Dump Database
> Sent by: pgsql-general-ow...@postgresql.org
>
> Dear all,
>
> I'm trying to dump my database using the following command after I
> enter as su postgres:
>
> pg_dump - U postgres mydb -f mydb.sql
>
> ... but I'm always having the following message:
>
> pg_dump [archiver] Could not open output file "mydb.sql" : Permission denied
>
> I try to use the same statement using su.. but I didn't manage. Can
> someone give me a help?
>
> I'm using Ubuntu 12.04 LTS and Postgres 9.1.
>
> Kind Regards,
> José Santos

The assumption here is that you're in a directory where you do not have write permission. Try:

# cd
# pg_dump -f mydb.sql -U postgres mydb

It also appears that you have a space between "-" and "U"; make sure it appears as "-U".

Cheers

=
Romax Technology Limited
A limited company registered in England and Wales.
Registered office: Rutherford House, Nottingham Science and Technology Park, Nottingham NG7 2PZ, England
Registration Number: 2345696
VAT Number: 526 246 746
Telephone numbers: +44 (0)115 951 88 00 (main)
For other office locations see: http://www.romaxtech.com/Contact
E-mail: i...@romaxtech.com
Website: www.romaxtech.com
=
Confidentiality Statement
This transmission is for the addressee only and contains information that is confidential and privileged. Unless you are the named addressee, or authorised to receive it on behalf of the addressee, you may not copy or use it, or disclose it to anyone else. If you have received this transmission in error please delete it from your system and contact the sender. Thank you for your cooperation.
=
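The "Permission denied" above comes from the shell's current directory, not from PostgreSQL: pg_dump simply cannot create mydb.sql where the postgres OS user has no write permission. A stand-alone sketch of the same failure, with no database needed (directory name is generated; run as a non-root user to see the first attempt fail):

```shell
# Reproduce the failure mode: try to create a file in a directory without
# write permission, then again after fixing the permission.
dir=$(mktemp -d)
chmod a-w "$dir"
( cd "$dir" && touch mydb.sql ) 2>/dev/null \
  || echo "cannot write in $dir -- the same condition pg_dump hits"
chmod u+w "$dir"
( cd "$dir" && touch mydb.sql ) && echo "writable after fixing permissions"
```

So the fix in this thread amounts to either `cd` into a directory the postgres user can write (its home directory via `su - postgres`, or /tmp) before running pg_dump, or passing an absolute path such as `-f /tmp/mydb.sql`.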
[GENERAL] Dump Database
Dear all, I'm trying to dump my database using the following command after I enter as su postgres: pg_dump - U postgres mydb -f mydb.sql ... but I'm always having the following message: pg_dump [archiver] Could not open output file "mydb.sql" : Permission denied I try to use the same statement using su.. but I didn't manage. Can someone give me a help? I'm using Ubuntu 12.04 LTS and Postgres 9.1. Kind Regards, José Santos
Re: [GENERAL] Linux OOM-Killer
basti wrote:
>>> Since a few days we had problems with the Linux OOM-Killer.
>>> Some simple query that normally take around 6-7 minutes now takes 5 hours.
>>> We did not change any configuration values the last days.
>>>
>>> First of all I have set
>>>
>>> vm.overcommit_memory=2
>>> vm.overcommit_ratio=80

> Swap is disabled.
> free -m
>              total       used       free     shared    buffers     cached
> Mem:         32215      16163      16051          0         40      14842
> -/+ buffers/cache:       1281      30934
> Swap:            0          0          0

That together means that you cannot use more than 80% of your RAM. Are you hitting that limit? See the description of overcommit_ratio in
https://www.kernel.org/doc/Documentation/sysctl/vm.txt

I would definitely add some swap.

Yours,
Laurenz Albe

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
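The 80% figure follows from the formula in the kernel's vm.txt for overcommit_memory=2: CommitLimit = swap + RAM * overcommit_ratio / 100. Plugging in the numbers posted in this thread (32215 MB RAM, ratio 80, no swap) shows how much address space the kernel will hand out before allocations start failing:

```shell
# Back-of-envelope CommitLimit for the posted settings (integer MB):
ram_mb=32215; ratio=80; swap_mb=0
commit_limit_mb=$(( swap_mb + ram_mb * ratio / 100 ))
echo "CommitLimit ~= ${commit_limit_mb} MB"   # 25772 MB, ~6.3 GB below RAM
```

On the live box the kernel's own figures can be compared with `grep -E 'CommitLimit|Committed_AS' /proc/meminfo`; if Committed_AS approaches CommitLimit, that is the wall being hit.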
Re: [GENERAL] Linux OOM-Killer
On Mon, Mar 17, 2014 at 12:21:30PM +0100, basti wrote:
> uname -a
> Linux h2085616 3.2.0-3-amd64 #1 SMP Mon Jul 23 02:45:17 UTC 2012 x86_64 GNU/Linux
>
> At any time there are not more than 20-30 Connections at once.
>
> Swap is disabled.
> free -m
>              total       used       free     shared    buffers     cached
> Mem:         32215      16163      16051          0         40      14842
> -/+ buffers/cache:       1281      30934
> Swap:            0          0          0

One really should add at least a bit of swap (and monitor it closely) such that an out-of-physical-RAM situation does not amount to a hard limit. It doesn't matter if it is slow.

Karsten
--
GPG key ID E4071346 @ gpg-keyserver.de
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Linux OOM-Killer
uname -a
Linux h2085616 3.2.0-3-amd64 #1 SMP Mon Jul 23 02:45:17 UTC 2012 x86_64 GNU/Linux

At any time there are not more than 20-30 Connections at once.

Swap is disabled.
free -m
             total       used       free     shared    buffers     cached
Mem:         32215      16163      16051          0         40      14842
-/+ buffers/cache:       1281      30934
Swap:            0          0          0

With the updates it is a little bit tricky:
https://wiki.postgresql.org/wiki/Nov2013ReplicationIssue
"If you are currently using 9.2.4, 9.1.9 or 9.0.13 and use any form of builtin replication do not install the most recent update. Instead, wait for the next update (9.2.6, 9.1.11 and 9.0.15) to come out. Options for users who have already updated, or are running 9.3, include: if you are using 9.2.5, 9.1.10 or 9.0.14, downgrade your replica servers to the prior update release (9.2.4, 9.1.9 or 9.0.13)."

On 17.03.2014 12:12, Tomas Vondra wrote:
> Hi,
>
> On 17 Březen 2014, 11:45, basti wrote:
>> Hello,
>>
>> we have a database master Version:
>> PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
>> 4.7.2-2) 4.7.2, 64-bit
>> and a WAL-Replication Slave with hot-standby version:
>> PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian
>> 4.7.2-5) 4.7.2, 64-bit.
>
> You're missing >18 months of fixes on the master (slightly less on the
> slave).
>
>> Since a few days we had problems with the Linux OOM-Killer.
>> Some simple query that normally take around 6-7 minutes now takes 5 hours.
>> We did not change any configuration values the last days.
>>
>> First of all I have set
>>
>> vm.overcommit_memory=2
>> vm.overcommit_ratio=80
>>
>> Here are some values of my DB-Master config, the Server has 32GB RAM and
>> is only for database, no other service.
>> Did anybody see some mistakes?
>
> How much swap do you have?
>
>> I'am not shure if work_mem, maintenance_work_mem and
>> effective_cache_size is set correct.
>
> That's hard to say. I don't see any immediate issue there, but it really
> depends on your application.
> For example 200 connections with work_mem=192MB may be dangerous if many
> connections are active at the same time.
>
>> /etc/postgresql/9.1/main/postgresql.conf
>>
>> max_connections = 200
>> ssl = true
>> shared_buffers = 6GB            # min 128kB
>> work_mem = 192MB                # min 64kB
>> maintenance_work_mem = 1GB      # min 1MB
>> wal_level = hot_standby
>> synchronous_commit = off
>> wal_buffers = 16MB
>> checkpoint_segments = 16
>> checkpoint_completion_target = 0.9
>> archive_mode = on
>> archive_command = 'rsync -a %p -e "ssh -i /var/lib/postgresql/.ssh/id_rsa" postgres@my_postgres_slave:/var/lib/postgresql/9.1/wals/dolly_main/%f'
>> max_wal_senders = 1
>> wal_keep_segments = 32
>> random_page_cost = 2.0
>> effective_cache_size = 22GB
>> default_statistics_target = 100
>> constraint_exclusion = off
>> join_collapse_limit = 1
>> logging_collector = on
>> log_directory = 'pg_log'
>> log_filename = 'postgresql-%Y-%m-%d.log'
>> log_min_duration_statement = 4
>> log_lock_waits = on
>> track_counts = on
>> autovacuum = on
>> log_autovacuum_min_duration = 5000
>> autovacuum_max_workers = 4
>> datestyle = 'iso, dmy'
>> deadlock_timeout = 1s
>
> So what does the query do? Show us the explain plan (explain analyze would
> be nice, but maybe not if it's running so slow).
>
> Which kernel is this? When the OOM killer strikes, it should print detailed
> info into the log - what does it say?
>
> When you look at "top" output, which processes consume most memory? Are
> there multiple backends consuming a lot of memory? What queries are they
> running?
>
> Assuming you have a monitoring system in place, collecting memory stats
> (you should have that), what does it say about history? Is there a sudden
> increase in consumed memory or something suspicious?
>
> regards
> Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Linux OOM-Killer
Hi, On 17 Březen 2014, 11:45, basti wrote: > Hello, > > we have a database master Version: > PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (Debian > 4.7.2-2) 4.7.2, 64-bit > and a WAL-Replication Slave with hot-standby version: > PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian > 4.7.2-5) 4.7.2, 64-bit. You're missing >18 months of fixes on the master (slightly less on the slave). > Since a few days we had problems with the Linux OOM-Killer. > Some simple query that normally take around 6-7 minutes now takes 5 hours. > We did not change any configuration values the last days. > > First of all I have set > > vm.overcommit_memory=2 > vm.overcommit_ratio=80 > > Here are some values of my DB-Master config, the Server has 32GB RAM and > is only for database, no other service. > Did anybody see some mistakes? How much swap do you have? > I'am not shure if work_mem, maintenance_work_mem and > effective_cache_size is set correct. That's hard to say. I don't see any immediate issue there, but it really depends on your application. For example 200 connections with work_mem=192MB may be dangerous if many connections are active at the same time. 
> /etc/postgresql/9.1/main/postgresql.conf
>
> max_connections = 200
> ssl = true
> shared_buffers = 6GB            # min 128kB
> work_mem = 192MB                # min 64kB
> maintenance_work_mem = 1GB      # min 1MB
> wal_level = hot_standby
> synchronous_commit = off
> wal_buffers = 16MB
> checkpoint_segments = 16
> checkpoint_completion_target = 0.9
> archive_mode = on
> archive_command = 'rsync -a %p -e "ssh -i /var/lib/postgresql/.ssh/id_rsa" postgres@my_postgres_slave:/var/lib/postgresql/9.1/wals/dolly_main/%f'
> max_wal_senders = 1
> wal_keep_segments = 32
> random_page_cost = 2.0
> effective_cache_size = 22GB
> default_statistics_target = 100
> constraint_exclusion = off
> join_collapse_limit = 1
> logging_collector = on
> log_directory = 'pg_log'
> log_filename = 'postgresql-%Y-%m-%d.log'
> log_min_duration_statement = 4
> log_lock_waits = on
> track_counts = on
> autovacuum = on
> log_autovacuum_min_duration = 5000
> autovacuum_max_workers = 4
> datestyle = 'iso, dmy'
> deadlock_timeout = 1s

So what does the query do? Show us the explain plan (explain analyze would be nice, but maybe not if it's running so slow).

Which kernel is this? When the OOM killer strikes, it should print detailed info into the log - what does it say?

When you look at "top" output, which processes consume most memory? Are there multiple backends consuming a lot of memory? What queries are they running?

Assuming you have a monitoring system in place, collecting memory stats (you should have that), what does it say about history? Is there a sudden increase in consumed memory or something suspicious?

regards
Tomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
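The warning about 200 connections with work_mem=192MB can be made concrete. work_mem bounds each sort or hash operation, and one query can run several, so even the conservative assumption of a single work_mem-sized allocation per busy backend gives a floor on the worst case:

```shell
# Back-of-envelope worst case for the posted config, assuming just one
# work_mem allocation per backend (a query can use more than one):
connections=200; work_mem_mb=192; shared_buffers_mb=6144
work_total_mb=$(( connections * work_mem_mb ))
echo "potential work_mem demand: ${work_total_mb} MB"              # 38400 MB
echo "plus shared_buffers: $(( work_total_mb + shared_buffers_mb )) MB"
```

38.4 GB of potential work_mem alone exceeds the box's 32 GB of RAM before shared_buffers is even counted, which is consistent with the OOM killer firing under concurrency with no swap configured.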
[GENERAL] Linux OOM-Killer
Hello,

we have a database master, version:
PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-2) 4.7.2, 64-bit
and a WAL-Replication Slave with hot-standby, version:
PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.7.2-5) 4.7.2, 64-bit.

Since a few days we have had problems with the Linux OOM-Killer. Some simple query that normally takes around 6-7 minutes now takes 5 hours. We did not change any configuration values in the last days.

First of all I have set

vm.overcommit_memory=2
vm.overcommit_ratio=80

Here are some values of my DB-Master config; the server has 32GB RAM and is only for the database, no other service. Does anybody see some mistakes? I'm not sure if work_mem, maintenance_work_mem and effective_cache_size are set correctly.

/etc/postgresql/9.1/main/postgresql.conf

max_connections = 200
ssl = true
shared_buffers = 6GB            # min 128kB
work_mem = 192MB                # min 64kB
maintenance_work_mem = 1GB      # min 1MB
wal_level = hot_standby
synchronous_commit = off
wal_buffers = 16MB
checkpoint_segments = 16
checkpoint_completion_target = 0.9
archive_mode = on
archive_command = 'rsync -a %p -e "ssh -i /var/lib/postgresql/.ssh/id_rsa" postgres@my_postgres_slave:/var/lib/postgresql/9.1/wals/dolly_main/%f'
max_wal_senders = 1
wal_keep_segments = 32
random_page_cost = 2.0
effective_cache_size = 22GB
default_statistics_target = 100
constraint_exclusion = off
join_collapse_limit = 1
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d.log'
log_min_duration_statement = 4
log_lock_waits = on
track_counts = on
autovacuum = on
log_autovacuum_min_duration = 5000
autovacuum_max_workers = 4
datestyle = 'iso, dmy'
deadlock_timeout = 1s

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general