Re: Python garbage collection: not releasing memory to OS!
On 04/15/2016 05:25 AM, cshin...@gmail.com wrote:

> I have written an application with flask that uses celery for a long
> running task. While load testing I noticed that the celery tasks are not
> releasing memory even after completing the task. So I googled and found
> this group discussion..
>
> https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw
>
> In that discussion it says, that's how python works.
>
> Also the article at
> https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/
> says "But from the OS's perspective, your program's size is the total
> (maximum) memory allocated to Python. Since Python returns memory to the
> OS on the heap (that allocates other objects than small objects) only on
> Windows, if you run on Linux, you can only see the total memory used by
> your program increase."
>
> And I use Linux. So I wrote the below script to verify it:
>
> import gc
>
> def memory_usage_psutil():
>     # return the memory usage in MB
>     import resource
>     print 'Memory usage: %s (MB)' % (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0)
>
> def fileopen(fname):
>     memory_usage_psutil()  # 10 MB
>     f = open(fname)
>     memory_usage_psutil()  # 10 MB
>     content = f.read()
>     memory_usage_psutil()  # 14 MB
>
> def fun(fname):
>     memory_usage_psutil()  # 10 MB
>     fileopen(fname)
>     gc.collect()
>     memory_usage_psutil()  # 14 MB
>
> import sys
> from time import sleep
>
> if __name__ == '__main__':
>     fun(sys.argv[1])
>     for _ in range(60):
>         gc.collect()
>         memory_usage_psutil()  # 14 MB ...
>         sleep(1)
>
> The input was a 4MB file. Even after returning from the 'fileopen'
> function the 4MB memory was not released. I checked htop output while the
> loop was running; the resident memory stays at 14MB. So unless the process
> is stopped the memory stays with it.
>
> So if the celery worker is not killed after its task is finished it is
> going to keep the memory for itself. I know I can use
> **max_tasks_per_child** config value to kill the process and spawn a new
> one. **Is there any other way to return the memory to OS from a python
> process?**

With situations like this, I normally just fork and do the memory-intensive
work in the child and then kill it off when done. Might be able to use a
thread instead of a fork, but I'm not sure how well all that would work with
celery.

--Sam

--
https://mail.python.org/mailman/listinfo/python-list
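[A minimal sketch of the fork-a-child approach Sam describes, using the
standard multiprocessing module. The names process_file and run_isolated are
made up for illustration, and the "work" here is just reading the file:

    from multiprocessing import Process

    def process_file(fname):
        # Hypothetical memory-hungry task: read the whole file into memory.
        content = open(fname).read()
        return len(content)

    def run_isolated(fname):
        # All memory allocated by process_file lives in the child; when the
        # child exits at join(), the OS reclaims it in full.
        p = Process(target=process_file, args=(fname,))
        p.start()
        p.join()

    if __name__ == '__main__':
        import sys
        run_isolated(sys.argv[1])

The parent's memory footprint stays flat because the large allocation never
happens in its address space at all.]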
Re: Python garbage collection: not releasing memory to OS!
On 04/15/2016 04:25 AM, cshin...@gmail.com wrote:

> The input was a 4MB file. Even after returning from the 'fileopen'
> function the 4MB memory was not released. I checked htop output while
> the loop was running, the resident memory stays at 14MB. So unless
> the process is stopped the memory stays with it.

I guess the question is, why is this a problem? If there are no leaks, then I
confess I don't understand what your concern is. And indeed you say it's not
leaking, as it never rises above 14 MB.

Also, there are ways of reading a file without allocating huge amounts of
memory. Why not read it in chunks, or line by line? Take advantage of
Python's generator facilities to process your data.

> So if the celery worker is not killed after its task is finished it
> is going to keep the memory for itself. I know I can use
> **max_tasks_per_child** config value to kill the process and spawn a
> new one. **Is there any other way to return the memory to OS from a
> python process?**

Have you tried using the subprocess module of Python? If I understand it
correctly, this would allow you to run Python code as a subprocess (a
completely separate process), which would be completely reaped by the OS when
it's finished.

--
https://mail.python.org/mailman/listinfo/python-list
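[A sketch of the chunked-reading suggestion above. The names read_in_chunks
and checksum, the 64 KB chunk size, and the md5 "processing" step are
illustrative choices, not anything from the original script:

    import hashlib

    def read_in_chunks(fname, chunk_size=64 * 1024):
        # Generator that yields the file piece by piece instead of all at once.
        with open(fname, 'rb') as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                yield chunk

    def checksum(fname):
        # Example consumer: hash the file while never holding more than one
        # chunk in memory at a time.
        h = hashlib.md5()
        for chunk in read_in_chunks(fname):
            h.update(chunk)
        return h.hexdigest()

With this pattern the process never holds the full 4 MB of content, so the
question of releasing it back to the OS largely goes away.]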
Re: Python garbage collection: not releasing memory to OS!
On 15 April 2016 at 11:25, cshin...@gmail.com wrote:

> The input was a 4MB file. Even after returning from the 'fileopen' function
> the 4MB memory was not released. I checked htop output while the loop was
> running, the resident memory stays at 14MB. So unless the process is stopped
> the memory stays with it.

Exactly when memory gets freed back to the OS is unclear, but it's possible
for your process to reuse the same bits of memory. The real question is
whether continuously allocating and deallocating leads to steadily growing
memory usage. If you change your code so that it calls fun inside the loop,
you will see that repeatedly calling fun does not lead to growing memory
usage.

> So if the celery worker is not killed after its task is finished it is going
> to keep the memory for itself. I know I can use **max_tasks_per_child**
> config value to kill the process and spawn a new one. **Is there any other
> way to return the memory to OS from a python process?**

I don't really understand what you're asking here. You're running celery in a
subprocess, right? Is the problem about the memory used by subprocesses that
aren't killed, or is it the memory usage of the Python process?

--
Oscar

--
https://mail.python.org/mailman/listinfo/python-list
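[Roughly what Oscar is suggesting, as a sketch against the original script
(fun, gc and memory_usage_psutil as defined there): move the call to fun
inside the loop and watch whether the reported memory keeps climbing or
levels off.

    if __name__ == '__main__':
        import sys
        from time import sleep
        for _ in range(60):
            fun(sys.argv[1])        # re-read the file on every iteration
            gc.collect()
            memory_usage_psutil()   # should level off rather than keep growing
            sleep(1)

If the number stays around 14 MB instead of growing by roughly 4 MB per
iteration, the allocator is reusing the freed blocks and nothing is leaking.]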
Python garbage collection: not releasing memory to OS!
I have written an application with flask that uses celery for a long running
task. While load testing I noticed that the celery tasks are not releasing
memory even after completing the task. So I googled and found this group
discussion..

https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw

In that discussion it says, that's how python works.

Also the article at
https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/ says
"But from the OS's perspective, your program's size is the total (maximum)
memory allocated to Python. Since Python returns memory to the OS on the heap
(that allocates other objects than small objects) only on Windows, if you run
on Linux, you can only see the total memory used by your program increase."

And I use Linux. So I wrote the below script to verify it:

import gc

def memory_usage_psutil():
    # return the memory usage in MB
    import resource
    print 'Memory usage: %s (MB)' % (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0)

def fileopen(fname):
    memory_usage_psutil()  # 10 MB
    f = open(fname)
    memory_usage_psutil()  # 10 MB
    content = f.read()
    memory_usage_psutil()  # 14 MB

def fun(fname):
    memory_usage_psutil()  # 10 MB
    fileopen(fname)
    gc.collect()
    memory_usage_psutil()  # 14 MB

import sys
from time import sleep

if __name__ == '__main__':
    fun(sys.argv[1])
    for _ in range(60):
        gc.collect()
        memory_usage_psutil()  # 14 MB ...
        sleep(1)

The input was a 4MB file. Even after returning from the 'fileopen' function
the 4MB memory was not released. I checked htop output while the loop was
running; the resident memory stays at 14MB. So unless the process is stopped
the memory stays with it.

So if the celery worker is not killed after its task is finished it is going
to keep the memory for itself. I know I can use **max_tasks_per_child**
config value to kill the process and spawn a new one. **Is there any other
way to return the memory to OS from a python process?**

--
https://mail.python.org/mailman/listinfo/python-list
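[One measurement caveat about the script above: resource.getrusage(...).ru_maxrss
reports the *peak* resident set size, so that figure can never go down even if
memory were handed back to the OS. To watch the current RSS (what htop's RES
column shows), the third-party psutil package can be used instead; a small
sketch, assuming psutil is installed:

    import os
    import psutil  # third-party: pip install psutil

    def memory_usage_current():
        # Current (not peak) resident set size of this process, in MB.
        rss = psutil.Process(os.getpid()).memory_info().rss
        print('Current RSS: %.1f MB' % (rss / 1024.0 / 1024.0))

Calling this in place of memory_usage_psutil() separates "the peak ever
reached" from "what the process is holding right now".]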