I did some more searching this morning, as I mentioned I would last night. I have not found anything specific to your situation. The only suggestions I have would be:
1) Try getting a core during the memory consumption, as I mentioned, and do an RCA on the vmcore.
2) Write a stap script to trace d_alloc in the kernel (or one of the d_cache functions) to see who is allocating dentries, and correlate that to a process (a perf script would help you do that pretty easily).
3) Use lsof to see who has tons of open files. Presumably, if you're swapping with 100% of RAM holding dentries, someone is using those dentries, which means lots of open files.

Good luck. Sorry I couldn't find anything.

If anyone has a valid RHEL subscription, I would encourage you to try with the latest RHEL6 kernel to see if the leak is still there, and if it is, allow GSS to help you find root cause.

~rp

On Mon, Aug 29, 2011 at 7:52 PM, Abdussamad Abdurrazzaq <[email protected]> wrote:
> On 08/30/2011 04:39 AM, [email protected] wrote:
>>
>> On Mon, Aug 29, 2011 at 6:18 PM, Abdussamad Abdurrazzaq
>> <[email protected]> wrote:
>>>
>>> Hello
>>>
>>> Ok please ignore my previous email (if you've seen it). It's quite
>>> confused because I posted using gmane.org.
>>>
>>> I know about how Linux reports memory usage. My problem is very much
>>> real. Memory usage keeps increasing because of a memory leak in the
>>> kernel dentry cache. This is the same problem as outlined by others here:
>>>
>>> https://www.redhat.com/archives/rhelv6-list/2011-February/msg00001.html
>>>
>>> So I was wondering whether this problem was fixed? I am using centos 6
>>> with the following kernel:
>>>
>>> Linux serve3.websitetheme.com. 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun
>>> 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> At one point dentry was using 3GB plus on my 8GB system!
>>>
>>> I am currently using a cron job to clear the cache every so often:
>>>
>>> sync && echo 2 > /proc/sys/vm/drop_caches
>>>
>>> The above works but I am looking for a more permanent solution.
>>> To that end I tried increasing:
>>>
>>> echo 10000 > /proc/sys/vm/vfs_cache_pressure
>>>
>>> And in /etc/sysctl.conf, but to no effect.
>>>
>>> So any idea how to fix this?
>>>
>>> Regards,
>>> Abdussamad
>>>
>>> _______________________________________________
>>> rhelv6-list mailing list
>>> [email protected]
>>> https://www.redhat.com/mailman/listinfo/rhelv6-list
>>>
>>
>> There are some things you can try to do. You can collect a vmcore from
>> a time period during which the system has exhausted nearly all of its
>> memory due to this leak. Then you could try to analyze kmem to
>> indicate if there is a problem with the kernel.
>>
>> You could even go as simple as just looking at top, or the contents of
>> /proc/<pid>/status periodically for the set of apps you suspect. If
>> you do have a single app that is leaking memory, you should be able to
>> record and graph a consistent increasing trend in the amount of memory
>> the faulty app is leaking. That would at least give you a starting
>> point for where to use an app like valgrind.
>>
>> Also, if you don't know which proc is at fault, I'd start with
>> /etc/crontab:
>> */5 * * * * root ps axo comm,vsize,rss | tail -n +2 >> /tmp/rawdata
>>
>> This would collect /proc/pid/statm data for a while.
>>
>> This should work on an selinux-enforcing machine based on the output of:
>> # sesearch -As crond_t | grep tmp
>>
>> Then use awk to find the low- and high-water marks for each comm.
>>
>> You could get fancy and add timestamps to the records; maybe track
>> current, low-water, and high-water marks, then gnuplot with error
>> bars.
>>
>> As for your drop_caches workaround, know that drop_caches may
>> cause performance to degrade because some cached data are flushed and
>> the system has to load them from disk if they are needed again.
>>
>> Use the results of the ps cron job above to locate which
>> program has a growing RSS.
>> And sysstat (/var/log/sa/sar*) may provide some historical memory
>> information that you may be interested in.
>>
>> Hope this gives you somewhere to look.
>>
>> I will follow up on the thread you mentioned.
>>
>> ~rp
>>
> I don't understand. Isn't dentry cache managed by the kernel? So why would I
> look at applications for possible leaks when it's obviously the kernel that's
> at fault here? Please read my post again including the thread I linked to.
> It seems to me you've misunderstood my problem.
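
For the "use awk to find the low- and high-water marks for each comm" step in the quoted reply, something along these lines might do as a starting point. It is only a rough sketch: it assumes each record is "comm vsize rss" exactly as the cron entry above writes it to /tmp/rawdata, and the function name rss_water_marks is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarize per-command RSS low- and high-water marks.
# Expects records on stdin in the form written by:
#   ps axo comm,vsize,rss | tail -n +2
# Output columns (tab-separated): comm, low RSS, high RSS, growth (high - low),
# sorted so the commands whose RSS grew the most come first.
rss_water_marks() {
    awk '
    NF >= 3 {
        rss = $NF + 0    # last field is RSS (KiB, as reported by ps)
        # comm can contain spaces, so rebuild it from all but the last two fields
        comm = $1
        for (i = 2; i <= NF - 2; i++) comm = comm " " $i
        if (!(comm in lo) || rss < lo[comm]) lo[comm] = rss
        if (!(comm in hi) || rss > hi[comm]) hi[comm] = rss
    }
    END {
        for (c in hi)
            printf "%s\t%d\t%d\t%d\n", c, lo[c], hi[c], hi[c] - lo[c]
    }' | sort -t "$(printf '\t')" -k4,4nr
}
```

It could then be run as "rss_water_marks < /tmp/rawdata | head" to list the commands with the largest RSS growth. Note that this only tracks process memory. Dentry cache usage is kernel slab memory and will not show up in any process's RSS, so this helps rule applications in or out rather than measure the dentry cache itself.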
