[PERFORM] Dramatic change in memory usage with version 9.1
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello I am sending this email to ask if anyone has noticed a change in how a server running postgreSQL 9.1 uses and allocates memory compared to older versions. We upgraded all our systems from 8.3 to 9.1 a couple of weeks ago, and we have experienced a radical change in how our servers make use of memory. How memory is allocated has become more unstable and the swap usage has increased dramatically. The pattern that we have started seeing is: * Sudden decrease of swap when running backup/vacuum+analyze jobs * Full use of cached memory when running backup/vacuum+analyze jobs * Sudden increase of swap and unused memory when backup/vacuum+analyze jobs are finnished. * Progressive decrease of swap during the day. Here is a list of things about this upgrade to version 9.1 that can be interesting when analyzing this change of behavior: * The servers are running the samme OS version and linux kernel as with 8.3. * We are running the same values for parameters related to memory allocation as we used in 8.3. * We are running the same backups and maintenance jobs as with version 8.3. These jobs are running at the exactly same time as with 8.3. * Backups (PITR, pg_dumps) and maintenances (vacuum, analyze) jobs are executed between midnight and early morning. * We run several postgreSQL clusters per server, running in different IPs and disks. * We have not seen any significant change in how databases are used/accessed after the upgrade to 9.1. * We upgraded in the first time from 8.3.12 to 9.1.2, but because this bug: http://archives.postgresql.org/pgsql-bugs/2011-12/msg00068.php we had to downgrade to 9.1.1. We thought in the begynning that our memory problems were related to this bug, but everything is the same with 9.1.1. * A couple of days ago we decreased the values of maintenance_work_mem and work_mem over a 50% in relation to values used with 8.3. The only change we have seen is even more unused memory after backup/vacuum +analyze jobs are finnished. Here you have some graphs that can help to get a picture about what we are talking about: * Overview of how memory use changed in one of our servers after the upgrade in the begynning og week 49: http://folk.uio.no/rafael/upgrade_to_9.1/server-1/memory-month.png http://folk.uio.no/rafael/upgrade_to_9.1/server-1/memory-year.png * We could think that all this happens because we are running to much in one server. Here are some graphs from a server with 30GB+ running only one postgres cluster (shared_memory = 6GB, maintenance_work_memory = 512MB, work_mem = 32MB) for a couple of days: http://folk.uio.no/rafael/upgrade_to_9.1/server-2/memory-week.png The memory pattern is the same even when running only one postgres cluster in a server with enough memory. Any ideas about why this dramatic change in memory usage when the only thing apparently changed from our side is the postgres version? Thanks in advance for any help. regards, - -- Rafael Martinez Guerrero Center for Information Technology University of Oslo, Norway PGP Public Key: http://folk.uio.no/rafael/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk7vUpYACgkQBhuKQurGihTvjACff5J08pNJuRDgkegYdtQ5zp52 GeoAnRaaU+F/C/udQ7lMl/TkvRKX2WnP =VcDk -END PGP SIGNATURE- -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Dramatic change in memory usage with version 9.1
Wow, upgrading 3 major releases at a go. :) It would probably be useful to use the helpful: http://wiki.postgresql.org/wiki/Guide_to_reporting_problems to get the information that is needed to the right people. Regards, Ken On Mon, Dec 19, 2011 at 04:04:54PM +0100, Rafael Martinez wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello I am sending this email to ask if anyone has noticed a change in how a server running postgreSQL 9.1 uses and allocates memory compared to older versions. We upgraded all our systems from 8.3 to 9.1 a couple of weeks ago, and we have experienced a radical change in how our servers make use of memory. How memory is allocated has become more unstable and the swap usage has increased dramatically. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Dramatic change in memory usage with version 9.1
On Mon, Dec 19, 2011 at 17:04, Rafael Martinez r.m.guerr...@usit.uio.no wrote: * Sudden decrease of swap when running backup/vacuum+analyze jobs Do you know for certain that this memory use is attributed to vacuum/analyze/backup, or are you just guessing? You should isolate whether it's the vacuum or a backup process/backend that takes this memory. Do you launch vacuum/analyze manually or are you just relying on autovacuum? How many parallel vacuum jobs are there? What's your autovacuum_max_workers set to? How large is your database? How did you perform the upgrade -- via pg_upgrade or pg_dump? Any ideas about why this dramatic change in memory usage when the only thing apparently changed from our side is the postgres version? Well, for one, there have been many planner changes that make it use memory more aggressively, these probably being the most significant: * Materialize for nested loop queries in 9.0: http://rhaas.blogspot.com/2010/04/materialization-in-postgresql-90.html * Hash join usage for RIGHT and FULL OUTER JOINs in 9.0 However, none of these would apply to vacuum, analyze or backups. Regards, Marti -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] OOM-killer issue with a specific query
Hi - I'm running into an OOM-killer issue when running a specific query (no virtual machine running) and, based on researching the issue, I can probably fix by making the following sysctl adjustments: vm.overcommit_memory = 2 vm.overcommit_ratio = 0 However, I am perplexed as to why I am running into the issue in the first place. The machine (running Linux 2.6.34.7-61.fc13.x86_64) is dedicated to Postgres (v9.0.0 [RPM package: postgresql90-9.0.0-1PGDG.fc13.1.x86_64]) and the following memory usage is pretty typical for the system (via top): Mem: 8121992k total, 2901960k used, 5220032k free, 237408k buffers Swap: 1048572k total, 235940k used, 812632k free, 2053768k cached Under steady-state conditions, the following shows the virtual memory size for postgres backend processes: PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 8506 postgres 20 0 2327m 3084 1792 S 0.0 0.0 0:00.33 postgres 8504 postgres 20 0 2326m 14m 13m S 0.0 0.2 0:01.32 postgres 8505 postgres 20 0 2326m 728 452 S 0.0 0.0 0:00.91 postgres 3582 postgres 20 0 2325m 54m 53m S 0.0 0.7 0:02.03 postgres My current relevant postgresql.conf settings are the following: shared_buffers = 2100MB temp_buffers = 8MB work_mem = 32MB maintenance_work_mem = 16MB max_stack_depth = 2MB constraint_exclusion = partition When executing the query, I've been watching the top activity, sorted by resident memory. Upon execution, no other processes appear to take additional resident memory, except a postgres backend process servicing the query, which goes to +6Gb (triggering the OOM-killer). Given the settings in postgresql.conf, and my anecdotal understanding of Postgres memory management functions, I am uncertain why Postgres exhausts physical memory instead of swapping to temporary files. Do I need to lower my work_mem setting since the subquery involves a partitioned table, causing a multiplier effect to the memory used (I have tried per-connection settings of 6MB)? Would tweaking query planning settings help? Thanks in advance! If it helps, I have included the query (with column names aliased to their data type), a brief description of the applicable table's contents, and an abridged copy of the EXPLAIN ANALYZE output SELECT bigint, date, text FROM tableA AS A WHERE A.boolean = 'true' AND (A.text = 'abc' OR A.text = 'xyz') AND A.bigint NOT IN (SELECT bigint FROM tableB) ORDER BY A.date DESC; tableA: - total table contains ~11 million records (total width: 109 bytes) - partitioned by month (180 partitions) - each table partition contains ~100k records tableB: - total table contains ~400k records (total width: 279 bytes) - partitioned by month (96 partitions) - each table partition contains ~30k records EXPLAIN ANALYZE output: Note: could not produce output for exact query due to OOM-killer, but ran query by limiting the subquery to the first 50 results. The planner iterates over all partitions, but only the first two partitions are noted for brevity. QUERY PLAN --- Limit (cost=451279.67..451279.70 rows=10 width=55) (actual time=18343.085..18343.090 rows=10 loops=1) - Sort (cost=451279.67..456398.37 rows=2047480 width=55) (actual time=18343.083..18343.087 rows=10 loops=1) Sort Key: A.Date Sort Method: top-N heapsort Memory: 26kB - Result (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.793..17014.726 rows=4160606 loops=1) - Append (cost=1.21..407034.37 rows=2047480 width=55) (actual time=0.792..16119.298 rows=4160606 loops=1) - Seq Scan on tableA A (cost=1.21..19.08 rows=1 width=44) (actual time=0.002..0.002 rows=0 loops=1) Filter: (boolean AND (NOT (hashed SubPlan 1)) AND ((text = 'abc'::text) OR (text = 'xyz'::text))) SubPlan 1 - Limit (cost=0.00..1.08 rows=50 width=8) (actual time=0.010..0.054 rows=50 loops=210) - Result (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.009..0.044 rows=50 loops=210) - Append (cost=0.00..9249.46 rows=427846 width=8) (actual time=0.008..0.031 rows=50 loops=210) - Seq Scan on tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.001..0.001 rows=0 loops=210) - Seq Scan on tableB_201201 tableB (cost=0.00..15.30 rows=530 width=8) (actual time=0.000..0.000 rows=0 loops=210) - Seq Scan on tableB_201112 tableB (cost=0.00..251.25 rows=12125 width=8) (actual time=0.006..0.019 rows=50 loops=210)