Re: [GENERAL] free RAM not being used for page cache
This is a super-interesting topic, thanks for all the info.

On Thu, Sep 4, 2014 at 7:44 AM, Shaun Thomas wrote:
>
> Check /proc/meminfo for a better breakdown of how the memory is being
> used. This should work:
>
> grep -A1 Active /proc/meminfo
>
> I suspect your inactive file cache is larger than the active set,
> suggesting an overly aggressive memory manager.

$ grep -A1 Active /proc/meminfo
Active:         34393512 kB
Inactive:       20765832 kB
Active(anon):   13761028 kB
Inactive(anon):   890688 kB
Active(file):   20632484 kB
Inactive(file): 19875144 kB

The inactive set isn't larger than the active set, they're about even,
but I'm still reading that as the memory manager being aggressive in
marking pages as inactive. Is that what it says to you too?

Interestingly, I just looked at the memory graph for our standby backup
database, and while it *normally* uses all the available RAM as the page
cache, which is what I'd expect to see, when it was the active database
for a time in April and May, the page cache size was reduced by about the
same margin. So it's the act of running an active postgres instance that
causes the phenomenon.

http://s76.photobucket.com/user/kgoesspb/media/db2-mem-historic.png.html

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
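For anyone who wants to put a number on "about even", here is a minimal Python sketch that parses the /proc/meminfo lines quoted above and computes what share of the file-backed pages sit on the inactive list. The function names `parse_meminfo` and `inactive_file_share` are made up for illustration, and the sample text is simply the output pasted in this message; nothing here says what ratio counts as healthy.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:  value kB' lines into a dict of kB values."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Key: value' line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def inactive_file_share(fields):
    """Fraction of file-backed (page cache) LRU pages that are on the inactive list."""
    active = fields["Active(file)"]
    inactive = fields["Inactive(file)"]
    return inactive / (active + inactive)


# Sample copied from the message above (values in kB).
SAMPLE = """\
Active:         34393512 kB
Inactive:       20765832 kB
Active(anon):   13761028 kB
Inactive(anon):   890688 kB
Active(file):   20632484 kB
Inactive(file): 19875144 kB
"""

if __name__ == "__main__":
    share = inactive_file_share(parse_meminfo(SAMPLE))
    print(f"inactive share of file pages: {share:.1%}")
```

On a live box the same functions could be fed the contents of /proc/meminfo instead of the pasted sample.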
Re: [GENERAL] free RAM not being used for page cache
On 09/03/2014 07:17 PM, Kevin Goess wrote:

> Debian squeeze, still on 2.6.32.

Interesting. Unfortunately that kernel suffers from the newer task
scheduler they added to 3.2, and I doubt many of the fixes have been
back-ported. I don't know if that affects the memory handling, but it
might.

> Darn, really? I just learned about the "mysql swap insanity" problem and
> noticed that all the free memory is concentrated on one of the two nodes.
>
> $ numactl --hardware
> available: 2 nodes (0-1)
> node 0 cpus: 0 2 4 6
> node 0 size: 32768 MB
> node 0 free: 9105 MB
> node 1 cpus: 1 3 5 7
> node 1 size: 32755 MB
> node 1 free: 259 MB

And that's the kind of behavior we were seeing until we upgraded to 3.8.
An 8GB gap between your nodes is definitely bad, but it's not the same
thing they described in the MySQL swap insanity posts. MySQL has a much
bigger internal cache than we do, so it expects a good proportion of
system memory. It's not uncommon for dedicated MySQL systems to have more
than 75% of system memory dedicated to database use. Without NUMA
interleaving, that's a recipe for a broken system.

> $ free
>              total       used       free     shared    buffers     cached
> Mem:      66099280   56565804    9533476          0      11548   51788624

And again, this is what we started seeing with 3.2 when we upgraded
initially. Unfortunately it looks like at least one of the bad memory
aging patches got backported to the kernel you're using. If everything
were working properly, that excess 9GB would be in your cache.

Check /proc/meminfo for a better breakdown of how the memory is being
used. This should work:

grep -A1 Active /proc/meminfo

I suspect your inactive file cache is larger than the active set,
suggesting an overly aggressive memory manager.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
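To watch the per-node imbalance discussed above over time, a small Python sketch like the following could pull the "node N free" figures out of `numactl --hardware` output and report the gap. It assumes the output format shown in the quoted message; `node_free_mb` and `free_gap_mb` are illustrative names, not part of any tool, and the sample is the text pasted in this thread.

```python
import re


def node_free_mb(text):
    """Map NUMA node number -> free MB, from `numactl --hardware` output."""
    free = {}
    for match in re.finditer(r"node (\d+) free: (\d+) MB", text):
        free[int(match.group(1))] = int(match.group(2))
    return free


def free_gap_mb(free):
    """Difference in free MB between the emptiest and the fullest node."""
    return max(free.values()) - min(free.values())


# Sample copied from the quoted message in this thread.
SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6
node 0 size: 32768 MB
node 0 free: 9105 MB
node 1 cpus: 1 3 5 7
node 1 size: 32755 MB
node 1 free: 259 MB
"""

if __name__ == "__main__":
    free = node_free_mb(SAMPLE)
    for node, mb in sorted(free.items()):
        print(f"node {node}: {mb} MB free")
    print(f"gap: {free_gap_mb(free)} MB")
```

For the numbers quoted here the gap comes out at 8846 MB, which is the "8GB gap" mentioned in the reply.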
Re: [GENERAL] free RAM not being used for page cache
On Tue, Aug 5, 2014 at 8:27 AM, Shaun Thomas wrote:

> On 07/30/2014 12:51 PM, Kevin Goess wrote:
>
>> A couple months ago we upgraded the RAM on our database servers from
>> 48GB to 64GB. Immediately afterwards the new RAM was being used for
>> page cache, which is what we want, but that seems to have dropped off
>> over time, and there's currently actually like 12GB of totally unused RAM.
>
> What version of the Linux kernel are you using? We had exactly this
> problem when we were on 3.2. We've since moved to 3.8 and that solved this
> issue, along with a few others.

Debian squeeze, still on 2.6.32.

> If you're having the same problem, this is not a NUMA issue or in any way
> related to zone_reclaim_mode. The memory page aging algorithm in pre 3.7 is
> simply broken, judging by the traffic on the Linux Kernel Mailing List
> (LKML).

Darn, really? I just learned about the "mysql swap insanity" problem and
noticed that all the free memory is concentrated on one of the two nodes.

$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6
node 0 size: 32768 MB
node 0 free: 9105 MB
node 1 cpus: 1 3 5 7
node 1 size: 32755 MB
node 1 free: 259 MB

$ free
             total       used       free     shared    buffers     cached
Mem:      66099280   56565804    9533476          0      11548   51788624

I haven't been able to get any traction on what that means yet though.

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
Re: [GENERAL] free RAM not being used for page cache
On 07/30/2014 12:51 PM, Kevin Goess wrote:

> A couple months ago we upgraded the RAM on our database servers from
> 48GB to 64GB. Immediately afterwards the new RAM was being used for
> page cache, which is what we want, but that seems to have dropped off
> over time, and there's currently actually like 12GB of totally unused RAM.

What version of the Linux kernel are you using? We had exactly this
problem when we were on 3.2. We've since moved to 3.8 and that solved
this issue, along with a few others.

If you're having the same problem, this is not a NUMA issue or in any way
related to zone_reclaim_mode. The memory page aging algorithm in pre 3.7
is simply broken, judging by the traffic on the Linux Kernel Mailing List
(LKML).

I hate to keep beating this drum, but anyone using 3.2 (default for a few
Linux distributions) needs to stop using 3.2; it's hideously broken.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com
Re: [GENERAL] free RAM not being used for page cache
On Wed, Jul 30, 2014 at 1:05 PM, Kevin Grittner wrote:
> Merlin Moncure wrote:
>> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess wrote:
>>
>>> A couple months ago we upgraded the RAM on our database servers from 48GB to
>>> 64GB. Immediately afterwards the new RAM was being used for page cache,
>>> which is what we want, but that seems to have dropped off over time, and
>>> there's currently actually like 12GB of totally unused RAM.
>
>> could be a numa issue.
>
> I was thinking the same thing.
>
> The other thought was that it could be a usage pattern and/or
> monitoring issue. When there are transient requests for large
> amounts of memory, it will discard cache to satisfy those (e.g.,
> work_mem or maintenance_work_mem allocations). If the *active*
> portion of the database is not as big as RAM, it might not refill
> right away. This could be compounded on your monitoring graphs if
> they summarize by taking the *average* RAM usage for an interval
> rather than the *maximum* usage for that interval. Intermittent
> spikes in usage could make it look like the RAM is unused if you
> are averaging; personally, I would prefer to use maximum for a
> metric like this. Many monitoring systems allow you to choose.

In fact, looking at the png he attached, I'd bet they cranked up work_mem
and/or connections sometime around the end of January and that's what
we're seeing here. More memory used for sorts etc., less left for caching.

--
To understand recursion, one must first understand recursion.
Re: [GENERAL] free RAM not being used for page cache
Merlin Moncure wrote:
> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess wrote:
>
>> A couple months ago we upgraded the RAM on our database servers from 48GB to
>> 64GB. Immediately afterwards the new RAM was being used for page cache,
>> which is what we want, but that seems to have dropped off over time, and
>> there's currently actually like 12GB of totally unused RAM.

> could be a numa issue.

I was thinking the same thing.

The other thought was that it could be a usage pattern and/or
monitoring issue. When there are transient requests for large
amounts of memory, it will discard cache to satisfy those (e.g.,
work_mem or maintenance_work_mem allocations). If the *active*
portion of the database is not as big as RAM, it might not refill
right away. This could be compounded on your monitoring graphs if
they summarize by taking the *average* RAM usage for an interval
rather than the *maximum* usage for that interval. Intermittent
spikes in usage could make it look like the RAM is unused if you
are averaging; personally, I would prefer to use maximum for a
metric like this. Many monitoring systems allow you to choose.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
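To illustrate the average-versus-maximum point above, here is a toy Python sketch. The sample readings are invented purely for illustration (they are not taken from the graphs in this thread), and `summarize` is a hypothetical helper, not the API of any monitoring system.

```python
def summarize(samples, bucket):
    """Return one (average, maximum) pair per run of `bucket` consecutive samples."""
    summary = []
    for start in range(0, len(samples), bucket):
        chunk = samples[start:start + bucket]
        summary.append((sum(chunk) / len(chunk), max(chunk)))
    return summary


# Invented readings of "RAM in use" in GB: steady at 50 with one short spike to 62.
USED_GB = [50, 50, 62, 50, 50, 50]

if __name__ == "__main__":
    for average, maximum in summarize(USED_GB, bucket=6):
        print(f"average {average:.1f} GB, maximum {maximum} GB")
```

With these made-up numbers the interval average is 52 GB while the maximum is 62 GB, so a graph built from averages would show roughly 10 GB more apparently idle RAM than was really available at the moment of the spike.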
Re: [GENERAL] free RAM not being used for page cache
On Wed, Jul 30, 2014 at 12:57 PM, Kevin Goess wrote:
> On Wed, Jul 30, 2014 at 11:49 AM, Merlin Moncure wrote:
>> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess wrote:
>> > A couple months ago we upgraded the RAM on our database servers from
>> > 48GB to 64GB. Immediately afterwards the new RAM was being used for
>> > page cache, which is what we want, but that seems to have dropped off
>> > over time, and there's currently actually like 12GB of totally unused
>> > RAM.
>> >
>> > http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
>> >
>> > Is that expected? Is there a setting we need to tune for that? We have
>> > 400GB of databases on this box, so I know it's not all fitting in that
>> > 49.89GB.
>>
>> could be a numa issue. Take a look at:
>>
>> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
>>
>> merlin
>
> Good suggestion, but nope, that ain't it:
>
> $ cat /proc/sys/vm/zone_reclaim_mode
> 0

Could it just be your dataset isn't any bigger than what's being used?

--
To understand recursion, one must first understand recursion.
Re: [GENERAL] free RAM not being used for page cache
Good suggestion, but nope, that ain't it:

$ cat /proc/sys/vm/zone_reclaim_mode
0

On Wed, Jul 30, 2014 at 11:49 AM, Merlin Moncure wrote:

> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess wrote:
> > A couple months ago we upgraded the RAM on our database servers from
> > 48GB to 64GB. Immediately afterwards the new RAM was being used for
> > page cache, which is what we want, but that seems to have dropped off
> > over time, and there's currently actually like 12GB of totally unused
> > RAM.
> >
> > http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
> >
> > Is that expected? Is there a setting we need to tune for that? We have
> > 400GB of databases on this box, so I know it's not all fitting in that
> > 49.89GB.
>
> could be a numa issue. Take a look at:
>
> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
>
> merlin

--
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com
510-665-1200 x179
www.bepress.com
bepress: sustainable scholarly publishing
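For reference, the value printed by the `cat` above is a bitmask, and 0 means zone reclaim is off. The Python sketch below decodes it using the bit meanings given in the kernel's vm sysctl documentation (1, 2 and 4); the helper name `describe_zone_reclaim_mode` is made up here, and the fallback to "0" on machines without that file is an assumption added only so the snippet runs anywhere.

```python
# Bit meanings as described in the Linux kernel's vm sysctl documentation.
ZONE_RECLAIM_FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim_mode(raw):
    """Return the list of enabled behaviours for a zone_reclaim_mode value."""
    value = int(raw.strip())
    return [label for bit, label in ZONE_RECLAIM_FLAGS.items() if value & bit]


if __name__ == "__main__":
    try:
        with open("/proc/sys/vm/zone_reclaim_mode") as handle:
            raw = handle.read()
    except OSError:
        raw = "0"  # assumption: treat a missing file as "off" for this demo
    enabled = describe_zone_reclaim_mode(raw)
    print(enabled if enabled else "zone reclaim is off")
```

An empty list corresponds to the 0 reported in this message.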
Re: [GENERAL] free RAM not being used for page cache
On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess wrote:
> A couple months ago we upgraded the RAM on our database servers from 48GB to
> 64GB. Immediately afterwards the new RAM was being used for page cache,
> which is what we want, but that seems to have dropped off over time, and
> there's currently actually like 12GB of totally unused RAM.
>
> http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
>
> Is that expected? Is there a setting we need to tune for that? We have
> 400GB of databases on this box, so I know it's not all fitting in that
> 49.89GB.

could be a numa issue. Take a look at:

http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html

merlin