On 12/5/07, Grigory Batalov wrote:

> The problem is that some qrunners quickly eat memory. Most of them
> use 20-37Mb after 13 hours of running. But today several qrunners
> 6 times took above 200Mb! Fortunately now I have Monit that checks
> memory usage, and kills such runners.

I'm reasonably sure there aren't any memory leaks in the Mailman or Python code, but unless an expert in locating memory leaks can step forward, give the code a complete stem-to-stern audit, and give us hard confirmation one way or the other, we're not likely to get any further down this road.

If you can look at the code and tell us that you're definitely finding memory leaks, then I'm sure the core developers will look very closely at that. Otherwise, I know that this is one of the things they're always on the lookout for, and they eliminate leaks as soon as they find them.

If you are, or you can get, a Linux performance tuning expert to look closely at your system and tell you exactly what is going on, we'd love to find out what they have to say. But we're not Linux performance tuning experts ourselves, and it's hard for us to guess why you're seeing such strange behaviour when I certainly don't recall hearing any such reports from anyone else in a very long time.

The last time we had such reports, it was because someone didn't understand how Unix-like OSes work and how aggressively they try to cache everything in memory, which is why I wrote the FAQ entry that you do not find to be of any use.

I am not a Linux performance tuning expert, but I have a fair amount of experience in general-purpose Unix performance tuning, and I have a certain amount of lower-level kernel knowledge of how the various components within most Unix-like OSes interact with each other. My problem is that I don't fully understand how this knowledge carries over to a Linux environment.

> I wrote previous letter after server failure when 2 greedy qrunners
> took 249 and 235 Mb. In that moment even crond couldn't fork and
> mail delivery was aborted.

I don't know what's going on. I didn't see it happen.

From what I have seen of what your tools are reporting, there's definitely something very strange going on, but I can't tell whether the tool is broken and therefore not reporting useful information, or whether something else is happening.

Certainly, your tools should not be saying that literally zero memory is active and literally zero memory is inactive, with over a gigabyte of RAM marked as free. That is about as far as possible from what we would expect to see, given how much memory you report the queue runners using.

> After that I have increased memory limit to 2Gb and started Monit
> daemon to prevent such failure.

That may help, but until you figure out why your tools are reporting such completely bogus numbers, I really don't think you're going to get anywhere useful.

I suspect, but I have no evidence to back up this claim, that the problem may be related to the fact that you're running under a virtualization system. I would suggest trying to run Mailman and the MTA directly under the primary OS on the machine (frequently called "domain zero" or "dom0" in virtualization parlance), and seeing whether that at least helps the tools produce information that makes more sense.

Running under dom0 may not solve the underlying problem of the Mailman queue runners sucking up so much RAM, but at the very least it would reduce the complexity of the system we're trying to help you debug.

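One way to cross-check what the tools say is to read the kernel's own counters from /proc/meminfo. The following is only a minimal sketch, assuming a Linux /proc filesystem is available inside your environment. The function names (parse_meminfo, looks_bogus, report) are hypothetical helpers, not part of Mailman or any other tool, and the "bogus" test is just my own rule of thumb for the situation described above.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: kilobytes}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def looks_bogus(info):
    """Rule of thumb (an assumption, not an established test): flag the case
    where Active and Inactive are both zero (or missing) while memory is
    reported as free."""
    return (info.get("Active", 0) == 0
            and info.get("Inactive", 0) == 0
            and info.get("MemFree", 0) > 0)


def report(path="/proc/meminfo"):
    """Print a few fields from the given meminfo file and flag odd values."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    for key in ("MemTotal", "MemFree", "Active", "Inactive"):
        print(key, info.get(key, "missing"), "kB")
    if looks_bogus(info):
        print("Active and Inactive are both zero; these numbers look suspect")
```

Calling report() inside the virtual environment and again on the host would show whether the two views of memory agree.
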
-- 
Brad Knowles <[EMAIL PROTECTED]>
LinkedIn Profile: <http://tinyurl.com/y8kpxu>
