Re: [GENERAL] free RAM not being used for page cache

2014-09-04 Thread Kevin Goess
This is a super-interesting topic, thanks for all the info.

On Thu, Sep 4, 2014 at 7:44 AM, Shaun Thomas wrote:
>
> Check /proc/meminfo for a better breakdown of how the memory is being
> used. This should work:
>
> grep -A1 Active /proc/meminfo
>
> I suspect your inactive file cache is larger than the active set,
> suggesting an overly aggressive memory manager.


$ grep -A1 Active /proc/meminfo
Active: 34393512 kB
Inactive:   20765832 kB
Active(anon):   13761028 kB
Inactive(anon):   890688 kB
Active(file):   20632484 kB
Inactive(file): 19875144 kB

The inactive set isn't larger than the active set; they're about even. I'm
still reading that as the memory manager being aggressive in marking pages
as inactive, though. Is that what it says to you too?
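
For anyone who wants to script this comparison rather than eyeball it, here is a minimal Python sketch. It parses /proc/meminfo-style text and reports what share of the file-backed cache sits on the inactive list. The sample text is the output quoted above; the function names are made up for illustration, and the sketch says nothing about what ratio is "healthy".

```python
# Sketch: compare active vs. inactive file-backed cache from /proc/meminfo text.
# MEMINFO_SAMPLE is the output quoted in this thread; helper names are hypothetical.

MEMINFO_SAMPLE = """\
Active:         34393512 kB
Inactive:       20765832 kB
Active(anon):   13761028 kB
Inactive(anon):   890688 kB
Active(file):   20632484 kB
Inactive(file): 19875144 kB
"""


def parse_meminfo(text):
    """Return {field name: size in kB} for each 'Name:  value kB' line."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip lines without a colon or without a leading numeric value.
        if sep and fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def inactive_file_share(info):
    """Fraction of file-backed cache pages currently on the inactive list."""
    active = info["Active(file)"]
    inactive = info["Inactive(file)"]
    return inactive / (active + inactive)


meminfo = parse_meminfo(MEMINFO_SAMPLE)
print(f"inactive share of file cache: {inactive_file_share(meminfo):.1%}")
```

On a live system the same functions can be fed the contents of /proc/meminfo instead of the sample string. With the numbers above the inactive share comes out at roughly 49%, which matches the "about even" reading.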

Interestingly, I just looked at the memory graph for our standby backup
database, and while it *normally* uses all the available RAM as the page
cache, which is what I'd expect to see, when it was the active database for
a time in April and May, the page cache size was reduced by about the same
margin. So it's the act of running an active postgres instance that causes
the phenomenon.

http://s76.photobucket.com/user/kgoesspb/media/db2-mem-historic.png.html



-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com

510-665-1200 x179
www.bepress.com

bepress: sustainable scholarly publishing


Re: [GENERAL] free RAM not being used for page cache

2014-09-04 Thread Shaun Thomas

On 09/03/2014 07:17 PM, Kevin Goess wrote:


> Debian squeeze, still on 2.6.32.


Interesting. Unfortunately that kernel suffers from the newer task
scheduler they added to 3.2, and I doubt many of the fixes have been
back-ported. I don't know if that affects the memory handling, but it might.



> Darn, really? I just learned about the "mysql swap insanity" problem and
> noticed that all the free memory is concentrated on one of the two nodes.
>
> $ numactl --hardware
> available: 2 nodes (0-1)
> node 0 cpus: 0 2 4 6
> node 0 size: 32768 MB
> node 0 free: 9105 MB
> node 1 cpus: 1 3 5 7
> node 1 size: 32755 MB
> node 1 free: 259 MB


And that's the kind of behavior we were seeing until we upgraded to 3.8.
An 8GB gap between your nodes is definitely bad, but it's not the same
thing they described in the MySQL swap insanity posts. MySQL has a much
bigger internal cache than we do, so it expects a good proportion of system
memory. It's not uncommon for dedicated MySQL systems to have more than
75% of system memory dedicated to database use. Without NUMA
interleaving, that's a recipe for a broken system.
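
To put a number on the gap between the nodes, a small Python sketch like the following can parse the `numactl --hardware` output quoted in this thread. The function names are invented for illustration, and it only reads the `size` and `free` lines; it makes no attempt to diagnose why the imbalance exists.

```python
import re

# Sketch: per-node free memory from `numactl --hardware` text.
# NUMACTL_SAMPLE is the output quoted in this thread; helper names are hypothetical.

NUMACTL_SAMPLE = """\
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6
node 0 size: 32768 MB
node 0 free: 9105 MB
node 1 cpus: 1 3 5 7
node 1 size: 32755 MB
node 1 free: 259 MB
"""


def parse_numactl_hardware(text):
    """Return {node id: {'size': MB, 'free': MB}} from the size/free lines."""
    nodes = {}
    pattern = r"^node (\d+) (size|free): (\d+) MB$"
    for match in re.finditer(pattern, text, re.MULTILINE):
        node, key, megabytes = match.groups()
        nodes.setdefault(int(node), {})[key] = int(megabytes)
    return nodes


def free_fraction(nodes):
    """Return {node id: free / size} for every parsed node."""
    return {node: v["free"] / v["size"] for node, v in nodes.items()}


numa_nodes = parse_numactl_hardware(NUMACTL_SAMPLE)
free_values = [v["free"] for v in numa_nodes.values()]
numa_gap_mb = max(free_values) - min(free_values)

for node, fraction in sorted(free_fraction(numa_nodes).items()):
    print(f"node {node}: {fraction:.1%} free")
print(f"gap between nodes: {numa_gap_mb} MB")
```

With the quoted figures, node 0 has about 28% of its memory free while node 1 has under 1%, a difference of 8846 MB.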



> $ free
>              total       used       free     shared    buffers     cached
> Mem:      66099280   56565804    9533476          0      11548   51788624


And again, this is what we started seeing with 3.2 when we upgraded 
initially. Unfortunately it looks like at least one of the bad memory 
aging patches got backported to the kernel you're using. If everything 
were working properly, that excess 9GB would be in your cache.
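
As a rough cross-check of that 9GB figure, here is a minimal Python sketch that parses the `Mem:` line of the `free` output. One assumption to flag: in the archived copy the `used` and `free` columns run together as `565658049533476`, and the sketch splits them as 56565804 and 9533476 because those are the values that add up to the reported total. The helper name is hypothetical.

```python
# Sketch: sanity-check the `free` output quoted in this thread.
# ASSUMPTION: the fused "565658049533476" is split as used=56565804 and
# free=9533476, the only split checked here that sums to the reported total.

FREE_SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:      66099280   56565804    9533476          0      11548   51788624
"""


def parse_free(text):
    """Return {column name: kB} for the 'Mem:' row of old-style `free` output."""
    lines = text.splitlines()
    header = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(header, values))
    raise ValueError("no 'Mem:' line found")


mem = parse_free(FREE_SAMPLE)
columns_add_up = mem["used"] + mem["free"] == mem["total"]
unused_gib = mem["free"] / 1024**2  # kB -> GiB

print(f"used + free == total: {columns_add_up}")
print(f"completely unused RAM: {unused_gib:.2f} GiB")
```

The `free` column works out to a little over 9 GiB, in line with the per-node free figures from `numactl --hardware`.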


Check /proc/meminfo for a better breakdown of how the memory is being 
used. This should work:


grep -A1 Active /proc/meminfo

I suspect your inactive file cache is larger than the active set, 
suggesting an overly aggressive memory manager.


--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com

__

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to 
this email


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] free RAM not being used for page cache

2014-09-03 Thread Kevin Goess
On Tue, Aug 5, 2014 at 8:27 AM, Shaun Thomas wrote:

> On 07/30/2014 12:51 PM, Kevin Goess wrote:
>
>  A couple months ago we upgraded the RAM on our database servers from
>> 48GB to 64GB.  Immediately afterwards the new RAM was being used for
>> page cache, which is what we want, but that seems to have dropped off
>> over time, and there's currently actually like 12GB of totally unused RAM.
>>
>
> What version of the Linux kernel are you using? We had exactly this
> problem when we were on 3.2. We've since moved to 3.8 and that solved this
> issue, along with a few others.
>

Debian squeeze, still on 2.6.32.

>
> If you're having the same problem, this is not a NUMA issue or in any way
> related to zone_reclaim_mode. The memory page aging algorithm in pre 3.7 is
> simply broken, judging by the traffic on the Linux Kernel Mailing List
> (LKML).
>

Darn, really? I just learned about the "mysql swap insanity" problem and
noticed that all the free memory is concentrated on one of the two nodes.

$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6
node 0 size: 32768 MB
node 0 free: 9105 MB
node 1 cpus: 1 3 5 7
node 1 size: 32755 MB
node 1 free: 259 MB

$ free
             total       used       free     shared    buffers     cached
Mem:      66099280   56565804    9533476          0      11548   51788624

I haven't been able to get any traction on what that means yet though.




Re: [GENERAL] free RAM not being used for page cache

2014-08-05 Thread Shaun Thomas

On 07/30/2014 12:51 PM, Kevin Goess wrote:


> A couple months ago we upgraded the RAM on our database servers from
> 48GB to 64GB.  Immediately afterwards the new RAM was being used for
> page cache, which is what we want, but that seems to have dropped off
> over time, and there's currently actually like 12GB of totally unused RAM.


What version of the Linux kernel are you using? We had exactly this 
problem when we were on 3.2. We've since moved to 3.8 and that solved 
this issue, along with a few others.


If you're having the same problem, this is not a NUMA issue or in any
way related to zone_reclaim_mode. The memory page aging algorithm in
pre-3.7 kernels is simply broken, judging by the traffic on the Linux
Kernel Mailing List (LKML).


I hate to keep beating this drum, but anyone using 3.2 (default for a 
few Linux distributions) needs to stop using 3.2; it's hideously broken.




Re: [GENERAL] free RAM not being used for page cache

2014-07-30 Thread Scott Marlowe
On Wed, Jul 30, 2014 at 1:05 PM, Kevin Grittner  wrote:
> Merlin Moncure  wrote:
>> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess  wrote:
>>
>>> A couple months ago we upgraded the RAM on our database servers from 48GB to
>>> 64GB.  Immediately afterwards the new RAM was being used for page cache,
>>> which is what we want, but that seems to have dropped off over time, and
>>> there's currently actually like 12GB of totally unused RAM.
>
>> could be a numa issue.
>
> I was thinking the same thing.
>
> The other thought was that it could be a usage pattern and/or
> monitoring issue.  When there are transient requests for large
> amounts of memory, it will discard cache to satisfy those (e.g.,
> work_mem or maintenance_work_mem allocations).  If the *active*
> portion of the database is not as big as RAM, it might not refill
> right away.  This could be compounded on your monitoring graphs if
> they summarize by taking the *average* RAM usage for an interval
> rather than the *maximum* usage for that interval.  Intermittent
> spikes in usage could make it look like the RAM is unused if you
> are averaging; personally, I would prefer to use maximum for a
> metric like this.  Many monitoring systems allow you to choose.

In fact, looking at the png he attached, I'd bet they cranked up
work_mem and / or connections sometime around the end of January and
that's what we're seeing here. More memory used for sorts etc, less
left for caching.

-- 
To understand recursion, one must first understand recursion.




Re: [GENERAL] free RAM not being used for page cache

2014-07-30 Thread Kevin Grittner
Merlin Moncure  wrote:
> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess  wrote:
>
>> A couple months ago we upgraded the RAM on our database servers from 48GB to
>> 64GB.  Immediately afterwards the new RAM was being used for page cache,
>> which is what we want, but that seems to have dropped off over time, and
>> there's currently actually like 12GB of totally unused RAM.

> could be a numa issue.

I was thinking the same thing.

The other thought was that it could be a usage pattern and/or
monitoring issue.  When there are transient requests for large
amounts of memory, it will discard cache to satisfy those (e.g.,
work_mem or maintenance_work_mem allocations).  If the *active*
portion of the database is not as big as RAM, it might not refill
right away.  This could be compounded on your monitoring graphs if
they summarize by taking the *average* RAM usage for an interval
rather than the *maximum* usage for that interval.  Intermittent
spikes in usage could make it look like the RAM is unused if you
are averaging; personally, I would prefer to use maximum for a
metric like this.  Many monitoring systems allow you to choose.
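
The averaging effect described above can be illustrated with a short Python sketch. The sample values are hypothetical (they are not taken from the graphs in this thread) and the function name is made up; the point is only that a per-interval average hides a short spike that a per-interval maximum preserves.

```python
from statistics import mean

# Sketch: how averaging over an interval hides short spikes in memory usage.
# The sample values below are HYPOTHETICAL, not taken from the thread's graphs.


def summarize(samples, bucket_size):
    """Return a list of (average, maximum) tuples, one per bucket of samples."""
    summary = []
    for start in range(0, len(samples), bucket_size):
        bucket = samples[start:start + bucket_size]
        summary.append((mean(bucket), max(bucket)))
    return summary


# RAM in use (GB), sampled once a minute: steady at 50 with one spike to 64.
usage_gb = [50, 50, 50, 64, 50, 50]

for average, maximum in summarize(usage_gb, bucket_size=6):
    print(f"average {average:.1f} GB, maximum {maximum} GB")
```

With these made-up samples the six-minute average is about 52 GB, so a graph built from averages would suggest roughly 12 GB of a 64 GB box is idle, while the maximum shows that all of it was in use at one point.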

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] free RAM not being used for page cache

2014-07-30 Thread Scott Marlowe
On Wed, Jul 30, 2014 at 12:57 PM, Kevin Goess  wrote:
> On Wed, Jul 30, 2014 at 11:49 AM, Merlin Moncure  wrote:
>> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess  wrote:
>> > A couple months ago we upgraded the RAM on our database servers from
>> > 48GB to
>> > 64GB.  Immediately afterwards the new RAM was being used for page cache,
>> > which is what we want, but that seems to have dropped off over time, and
>> > there's currently actually like 12GB of totally unused RAM.
>> >
>> >
>> > http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
>> >
>> > Is that expected?  Is there a setting we need to tune for that?  We have
>> > 400GB of databases on this box, so I know it's not all fitting in that
>> > 49.89GB.
>>
>> could be a numa issue.  Take a look at:
>>
>> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
>>
>> merlin
> Good suggestion, but nope, that ain't it:
>
> $ cat /proc/sys/vm/zone_reclaim_mode
> 0

Could it just be your dataset isn't any bigger than what's being used?


Re: [GENERAL] free RAM not being used for page cache

2014-07-30 Thread Kevin Goess
Good suggestion, but nope, that ain't it:

$ cat /proc/sys/vm/zone_reclaim_mode
0
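
For reference, zone_reclaim_mode is a bitmask, so values other than 0 and 1 are possible. The Python sketch below decodes it using the bit meanings given in the kernel's vm sysctl documentation (1 = zone reclaim on, 2 = write dirty pages out, 4 = swap pages); the function name is hypothetical.

```python
# Sketch: decode the bitmask in /proc/sys/vm/zone_reclaim_mode.
# Bit meanings follow the kernel's vm sysctl documentation; the helper name
# is hypothetical.

ZONE_RECLAIM_FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def decode_zone_reclaim_mode(raw):
    """Return the list of enabled behaviours for a raw sysctl value string."""
    value = int(raw.strip())
    return [label for bit, label in ZONE_RECLAIM_FLAGS.items() if value & bit]


# The value reported above is "0": no zone reclaim behaviour is enabled.
print(decode_zone_reclaim_mode("0\n") or "zone reclaim disabled")
```

On a live system the string would come from reading /proc/sys/vm/zone_reclaim_mode, as the `cat` command above does.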



On Wed, Jul 30, 2014 at 11:49 AM, Merlin Moncure  wrote:

> On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess  wrote:
> > A couple months ago we upgraded the RAM on our database servers from
> 48GB to
> > 64GB.  Immediately afterwards the new RAM was being used for page cache,
> > which is what we want, but that seems to have dropped off over time, and
> > there's currently actually like 12GB of totally unused RAM.
> >
> >
> http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
> >
> > Is that expected?  Is there a setting we need to tune for that?  We have
> > 400GB of databases on this box, so I know it's not all fitting in that
> > 49.89GB.
>
> could be a numa issue.  Take a look at:
>
> http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
>
> merlin
>





Re: [GENERAL] free RAM not being used for page cache

2014-07-30 Thread Merlin Moncure
On Wed, Jul 30, 2014 at 12:51 PM, Kevin Goess  wrote:
> A couple months ago we upgraded the RAM on our database servers from 48GB to
> 64GB.  Immediately afterwards the new RAM was being used for page cache,
> which is what we want, but that seems to have dropped off over time, and
> there's currently actually like 12GB of totally unused RAM.
>
> http://s76.photobucket.com/user/kgoesspb/media/db1-mem-historical.png.html
>
> Is that expected?  Is there a setting we need to tune for that?  We have
> 400GB of databases on this box, so I know it's not all fitting in that
> 49.89GB.

could be a numa issue.  Take a look at:
http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html

merlin

