Re: Python garbage collection: not releasing memory to OS!

2016-04-15 Thread Sam

On 04/15/2016 05:25 AM, cshin...@gmail.com wrote:

I have written an application with Flask that uses Celery for a long-running
task. While load testing I noticed that the Celery tasks are not releasing
memory even after completing the task. So I googled and found this group
discussion:

https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw

In that discussion it says that's how Python works.

Also the article at 
https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/ says

"But from the OS's perspective, your program's size is the total (maximum) memory 
allocated to Python. Since Python returns memory to the OS on the heap (that allocates 
other objects than small objects) only on Windows, if you run on Linux, you can only see 
the total memory used by your program increase."

And I use Linux. So I wrote the script below to verify it.

import gc

def memory_usage_psutil():
    # Print the memory usage in MB.  On Linux, ru_maxrss is the peak
    # resident set size, reported in kilobytes.
    import resource
    print 'Memory usage: %s (MB)' % (
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0)

def fileopen(fname):
    memory_usage_psutil()    # 10 MB
    f = open(fname)
    memory_usage_psutil()    # 10 MB
    content = f.read()
    memory_usage_psutil()    # 14 MB

def fun(fname):
    memory_usage_psutil()    # 10 MB
    fileopen(fname)
    gc.collect()
    memory_usage_psutil()    # 14 MB

import sys
from time import sleep

if __name__ == '__main__':
    fun(sys.argv[1])
    for _ in range(60):
        gc.collect()
        memory_usage_psutil()    # 14 MB ...
        sleep(1)

The input was a 4 MB file. Even after returning from the 'fileopen' function the
4 MB of memory was not released. I checked the htop output while the loop was
running; the resident memory stays at 14 MB. So unless the process is stopped,
the memory stays with it.

So if the Celery worker is not killed after its task is finished, it is going to
keep the memory for itself. I know I can use the **max_tasks_per_child** config
value to kill the process and spawn a new one. **Is there any other way to
return memory to the OS from a Python process?**




In situations like this, I normally just fork and do the memory-intensive
work in the child, then kill it off when done. You might be able to use
a thread instead of a fork, but I'm not sure how well all of that would
work with Celery.
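
Something along these lines, for example (untested, and the function names
are made up) - run the memory-hungry step in a short-lived child process
with the multiprocessing module, so the OS reclaims everything when the
child exits:

import multiprocessing

def process_big_file(fname, result_queue):
    # All the large allocations happen in the child process only.
    with open(fname) as f:
        content = f.read()
    result_queue.put(len(content))   # send back only a small summary

def run_in_child(fname):
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=process_big_file, args=(fname, q))
    p.start()
    result = q.get()   # small result crosses the process boundary
    p.join()           # child exits; its memory goes back to the OS
    return result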


--Sam
--
https://mail.python.org/mailman/listinfo/python-list


Re: Python garbage collection: not releasing memory to OS!

2016-04-15 Thread Michael Torrie
On 04/15/2016 04:25 AM, cshin...@gmail.com wrote:
> The input was a 4 MB file. Even after returning from the 'fileopen'
> function the 4 MB of memory was not released. I checked the htop output
> while the loop was running; the resident memory stays at 14 MB. So unless
> the process is stopped, the memory stays with it.

I guess the question is, why is this a problem?  If there are no leaks,
then I confess I don't understand what your concern is.  And indeed you
say it's not leaking as it never rises above 14 MB.

Also, there are ways of reading a file without allocating huge amounts of
memory.  Why not read it in chunks, or line by line?  Take advantage of
Python's generator facilities to process your data.
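
For example, something like this (untested; the chunk size and file name
are just placeholders) only ever keeps one chunk in memory at a time:

def read_in_chunks(fname, chunk_size=64 * 1024):
    # Generator: yield the file one chunk at a time instead of
    # loading the whole thing with f.read().
    with open(fname) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

total = 0
for chunk in read_in_chunks('big_input.txt'):
    total += len(chunk)   # process each chunk, then let it be freed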

> So if the Celery worker is not killed after its task is finished, it
> is going to keep the memory for itself. I know I can use the
> **max_tasks_per_child** config value to kill the process and spawn a
> new one. **Is there any other way to return memory to the OS from a
> Python process?**

Have you tried using the subprocess module of Python? If I understand it
correctly, this would allow you to run Python code as a subprocess (a
completely separate process), which would be completely reaped by the
OS when it's finished.
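
Roughly like this (untested; process_file.py stands in for a script that
holds the memory-heavy code):

import subprocess
import sys

# Run the heavy work in a completely separate interpreter; when the
# script exits, the OS reclaims everything it allocated.
output = subprocess.check_output(
    [sys.executable, 'process_file.py', '/path/to/big_input'])
print(output)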

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python garbage collection: not releasing memory to OS!

2016-04-15 Thread Oscar Benjamin
On 15 April 2016 at 11:25,   wrote:
> The input was a 4 MB file. Even after returning from the 'fileopen' function
> the 4 MB of memory was not released. I checked the htop output while the loop
> was running; the resident memory stays at 14 MB. So unless the process is
> stopped, the memory stays with it.

Exactly when memory gets freed back to the OS is unclear, but it's possible
for your process to reuse the same bits of memory. The real question
is whether continuously allocating and deallocating leads to steadily
growing memory usage. If you change your code so that it calls fun
inside the loop, you will see that repeatedly calling fun does not lead
to growing memory usage.
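
For concreteness, I mean changing the __main__ block of your script to
something like this (untested):

if __name__ == '__main__':
    for _ in range(60):
        fun(sys.argv[1])         # re-reads the ~4 MB file every iteration
        gc.collect()
        memory_usage_psutil()    # should hover around 14 MB rather than grow
        sleep(1)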

> So if the Celery worker is not killed after its task is finished, it is going
> to keep the memory for itself. I know I can use the **max_tasks_per_child**
> config value to kill the process and spawn a new one. **Is there any other
> way to return memory to the OS from a Python process?**

I don't really understand what you're asking here. You're running
Celery in a subprocess, right? Is the problem the memory used by
subprocesses that aren't killed, or is it the memory usage of the
Python process itself?

--
Oscar
-- 
https://mail.python.org/mailman/listinfo/python-list


Python garbage collection: not releasing memory to OS!

2016-04-15 Thread cshintov
I have written an application with Flask that uses Celery for a long-running
task. While load testing I noticed that the Celery tasks are not releasing
memory even after completing the task. So I googled and found this group
discussion:

https://groups.google.com/forum/#!topic/celery-users/jVc3I3kPtlw

In that discussion it says that's how Python works.

Also the article at 
https://hbfs.wordpress.com/2013/01/08/python-memory-management-part-ii/ says  

"But from the OS's perspective, your program's size is the total (maximum) 
memory allocated to Python. Since Python returns memory to the OS on the heap 
(that allocates other objects than small objects) only on Windows, if you run 
on Linux, you can only see the total memory used by your program increase."

And I use Linux. So I wrote the script below to verify it.

import gc

def memory_usage_psutil():
    # Print the memory usage in MB.  On Linux, ru_maxrss is the peak
    # resident set size, reported in kilobytes.
    import resource
    print 'Memory usage: %s (MB)' % (
        resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000.0)

def fileopen(fname):
    memory_usage_psutil()    # 10 MB
    f = open(fname)
    memory_usage_psutil()    # 10 MB
    content = f.read()
    memory_usage_psutil()    # 14 MB

def fun(fname):
    memory_usage_psutil()    # 10 MB
    fileopen(fname)
    gc.collect()
    memory_usage_psutil()    # 14 MB

import sys
from time import sleep

if __name__ == '__main__':
    fun(sys.argv[1])
    for _ in range(60):
        gc.collect()
        memory_usage_psutil()    # 14 MB ...
        sleep(1)

The input was a 4 MB file. Even after returning from the 'fileopen' function the
4 MB of memory was not released. I checked the htop output while the loop was
running; the resident memory stays at 14 MB. So unless the process is stopped,
the memory stays with it.

So if the Celery worker is not killed after its task is finished, it is going to
keep the memory for itself. I know I can use the **max_tasks_per_child** config
value to kill the process and spawn a new one. **Is there any other way to
return memory to the OS from a Python process?**
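
(For reference, I believe max_tasks_per_child is set roughly like this in
the old-style Celery config, though I haven't double-checked the exact
setting name for my Celery version:)

# celeryconfig.py: recycle each worker process after it has run 50 tasks
CELERYD_MAX_TASKS_PER_CHILD = 50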
-- 
https://mail.python.org/mailman/listinfo/python-list