Re: Python memory usage

2008-10-29 Thread bieffe62
On 21 Ott, 17:19, Rolf Wester <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have the problem that with long running Python scripts (many loops)
> memory consumption increases until the script crashes. I used the
> following small script to understand what might happen:
>
> import gc
>
> print len(gc.get_objects())
>
> a = []
> for i in range( 4000000 ):
>     a.append( None )
> for i in range( 4000000 ):
>     a[i] = {}
>
> print len(gc.get_objects())
>
> ret = raw_input("Return:")
>
> del a
> gc.collect()
>
> print len(gc.get_objects())
>
> ret = raw_input("Return:")
>
> The output is:
> 4002706
> Return:
> 2705
> Return:
>
> When I do ps aux | grep python before the first "Return" I get:
> wester    5255 51.2 16.3 1306696 1286828 pts/4 S+   17:59   0:30 python
> memory_prob2.py
>
> and before the second one:
> wester    5255 34.6 15.9 1271784 1255580 pts/4 S+   17:59   0:31 python
> memory_prob2.py
>
> This indicates that although the garbage collector freed 4000001 objects,
> memory consumption does not change accordingly.
>
> I tried the C++ code:
>
> #include <iostream>
> using namespace std;
>
> int main()
> {
>         int i;
>         cout << ":";
> //ps 1
>         cin >> i;
>
>         double * v = new double[40000000];
>         cout << ":";
> //ps 2
>         cin >> i;
>
>         for(int i=0; i < 40000000; i++)
>                 v[i] = i;
>
>         cout << v[40000000-1] << ":";
> //ps 3
>         cin >> i;
>
>         delete [] v;
>
>         cout << ":";
> //ps 4
>         cin >> i;
>
> }
>
> and got from ps:
>
> ps 1: 11184
> ps 2: 323688
> ps 3: 323688
> ps 4: 11184
>
> which means that the memory which is deallocated is no longer used by
> the C++ program.
>
> Do I miss something or is this a problem with Python? Is there any means
> to force Python to release the memory that is not used any more?
>
> I would be very appreciative for any help.
>
> With kind regards
>
> Rolf



To be sure that the deallocated memory is not cached at some level to
be reused, you could try something like this:

while 1:
    l = [dict() for i in range(4000000)]
    l = None # no need of gc and del

For what it's worth, on my PC (Windows XP and Python 2.5.2) the memory
usage of the process, monitored with the Task Manager, grows up to
600 MB before the memory is actually released.

Note that in your example, as in mine, you do not need to call
gc.collect(), because the huge list object is already deleted when you
do "del a" (or, in my case, when I reassign "l" and the huge list
drops to 0 references). The basic memory management in CPython is
based on reference counting; gc is only used to find and break
circular reference chains, which your example does not create. As
proof of that, if you print the return value of gc.collect() (which is
the number of collected objects) you should get 0.
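
If you want to check, something along these lines should do (a quick
sketch, Python 2.x):

import gc

a = [dict() for i in range(100000)]
del a
print gc.collect()   # expect 0: plain lists and dicts form no cycles

b = []
b.append(b)          # build a reference cycle on purpose
del b
print gc.collect()   # expect >= 1: only the cycle needs the collector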

Ciao
--
FB
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2008-10-29 Thread David Cournapeau
On Wed, Oct 29, 2008 at 6:56 PM, [EMAIL PROTECTED]
<[EMAIL PROTECTED]> wrote:
> On Oct 21, 5:19 pm, Rolf Wester <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I have the problem that with long running Python scripts (many loops)
>> memory consumption increases until the script crashes. I used the
>> following small script to understand what might happen:
>>
> 
>
> AFAIK, python uses malloc behind the scenes to allocate memory. From
> the malloc man page...
>
> "The  malloc() and free() functions provide a simple, general-purpose
> memory allocation package. The malloc() function returns a pointer to
> a block of at least size bytes suitably aligned for any use. If the
> space assigned by malloc() is overrun, the results are undefined.
>
> The argument to free() is a pointer to a block previously allocated by
> malloc(), calloc(), or realloc(). After free() is executed, this space
> is made available for further  allocation by the application, though
> not returned to the system. Memory is returned to the system only
> upon  termination of  the  application.  If ptr is a null pointer, no
> action occurs. If a random number is passed to free(), the results are
> undefined."

Depending on your malloc implementation, that may not be true. In
particular, with glibc, big allocations are done with mmap, and those
areas are unmapped when free is called; any such area is immediately
returned to the system:

http://www.gnu.org/software/libtool/manual/libc/Malloc-Tunable-Parameters.html#Malloc-Tunable-Parameters
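
A quick way to see this from Python on a Linux/glibc box (just a
sketch; it assumes the default mmap threshold, so one big block goes
through mmap while lots of small objects do not):

import os

def vmsize_kb():
    # read the VmSize line from /proc/<pid>/status, in kB
    for line in open('/proc/%d/status' % os.getpid()):
        if line.startswith('VmSize'):
            return int(line.split()[1])

print vmsize_kb()
big = 'x' * (200 * 1024 * 1024)   # a single ~200 MB block -> malloc -> mmap
print vmsize_kb()
del big
print vmsize_kb()                 # usually drops back: free() munmaps the block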

David
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2008-10-29 Thread [EMAIL PROTECTED]
On Oct 21, 5:19 pm, Rolf Wester <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have the problem that with long running Python scripts (many loops)
> memory consumption increases until the script crashes. I used the
> following small script to understand what might happen:
>


AFAIK, python uses malloc behind the scenes to allocate memory. From
the malloc man page...

"The  malloc() and free() functions provide a simple, general-purpose
memory allocation package. The malloc() function returns a pointer to
a block of at least size bytes suitably aligned for any use. If the
space assigned by malloc() is overrun, the results are undefined.

The argument to free() is a pointer to a block previously allocated by
malloc(), calloc(), or realloc(). After free() is executed, this space
is made available for further  allocation by the application, though
not returned to the system. Memory is returned to the system only
upon  termination of  the  application.  If ptr is a null pointer, no
action occurs. If a random number is passed to free(), the results are
undefined."

HTH,

Pete
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python Memory Usage

2007-06-30 Thread malkarouri
On Jun 20, 4:48 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> I am using Python to process particle data from a physics simulation.
> There are about 15 MB of data associated with each simulation, but
> there are many simulations.  I read the data from each simulation into
> Numpy arrays and do a simple calculation on them that involves a few
> eigenvalues of small matrices and quite a number of temporary
> arrays.  I had assumed that generating lots of temporary arrays
> would make my program run slowly, but I didn't think that it would
> cause the program to consume all of the computer's memory, because I'm
> only dealing with 10-20 MB at a time.
>
> So, I have a function that reliably increases the virtual memory usage
> by ~40 MB each time it's run.  I'm measuring memory usage by looking
> at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
> Ubuntu (edgy) system.  This seems strange because I only have 15 MB of
> data.
>
> I started looking at the difference between what gc.get_objects()
> returns before and after my function.  I expected to see zillions of
> temporary Numpy arrays that I was somehow unintentionally maintaining
> references to.  However, I found that only 27 additional objects  were
> in the list that comes from get_objects(), and all of them look
> small.  A few strings, a few small tuples, a few small dicts, and a
> Frame object.
>
> I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
> which seems to be able to give useful information about memory usage
> in Python.  This seemed to confirm what I found from manual
> inspection: only a few new objects are allocated by my function, and
> they're small.
>
> I found Evan Jones article about the Python 2.4 memory allocator never
> freeing memory in certain circumstances:  
> http://evanjones.ca/python-memory.html.
> This sounds a lot like what's happening to me.  However, his patch was
> applied in Python 2.5 and I'm using Python 2.5.  Nevertheless, it
> looks an awful lot like Python doesn't think it's holding on to the
> memory, but doesn't give it back to the operating system, either.  Nor
> does Python reuse the memory, since each successive call to my
> function consumes an additional 40 MB.  This continues until finally
> the VM usage is gigabytes and I get a MemoryException.
>
> I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3.  I'm also
> using a few routines from scipy 0.5.2, but for this part of the code
> it's just the eigenvalue routines.
>
> It seems that the standard advice when someone has a bit of Python
> code that progressively consumes all memory is to fork a process.  I
> guess that's not the worst thing in the world, but it certainly is
> annoying.  Given that others seem to have had this problem, is there a
> slick package to do this?  I envision:
> value = call_in_separate_process(my_func, my_args)
>
> Suggestions about how to proceed are welcome.  Ideally I'd like to
> know why this is going on and fix it.  Short of that workarounds that
> are more clever than the "separate process" one are also welcome.
>
> Thanks,
> Greg

I had almost the same problem. Will this do?

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474
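
The idea, roughly, is to fork a child, run the function there, and
ship the result back, so that whatever memory the function bloats dies
with the child. A stripped-down sketch of that approach (illustrative
only, Unix + Python 2, not the recipe's actual code):

import os, cPickle

def call_in_separate_process(func, *args, **kwds):
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # child: run the function and pickle the result into the pipe
        os.close(r)
        result = func(*args, **kwds)
        os.write(w, cPickle.dumps(result, -1))
        os.close(w)
        os._exit(0)
    # parent: read everything back, reap the child, unpickle
    os.close(w)
    chunks = []
    while True:
        data = os.read(r, 65536)
        if not data:
            break
        chunks.append(data)
    os.close(r)
    os.waitpid(pid, 0)
    return cPickle.loads(''.join(chunks))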

Any comments are welcome (I wrote the recipe with Pythonistas' help).

Regards,
Muhammad Alkarouri

-- 
http://mail.python.org/mailman/listinfo/python-list


Python Memory Usage

2007-06-19 Thread [EMAIL PROTECTED]
I am using Python to process particle data from a physics simulation.
There are about 15 MB of data associated with each simulation, but
there are many simulations.  I read the data from each simulation into
Numpy arrays and do a simple calculation on them that involves a few
eigenvalues of small matrices and quite a number of temporary
arrays.  I had assumed that generating lots of temporary arrays
would make my program run slowly, but I didn't think that it would
cause the program to consume all of the computer's memory, because I'm
only dealing with 10-20 MB at a time.

So, I have a function that reliably increases the virtual memory usage
by ~40 MB each time it's run.  I'm measuring memory usage by looking
at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
Ubuntu (edgy) system.  This seems strange because I only have 15 MB of
data.

I started looking at the difference between what gc.get_objects()
returns before and after my function.  I expected to see zillions of
temporary Numpy arrays that I was somehow unintentionally maintaining
references to.  However, I found that only 27 additional objects  were
in the list that comes from get_objects(), and all of them look
small.  A few strings, a few small tuples, a few small dicts, and a
Frame object.

I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
which seems to be able to give useful information about memory usage
in Python.  This seemed to confirm what I found from manual
inspection: only a few new objects are allocated by my function, and
they're small.

I found Evan Jones article about the Python 2.4 memory allocator never
freeing memory in certain circumstances:  
http://evanjones.ca/python-memory.html.
This sounds a lot like what's happening to me.  However, his patch was
applied in Python 2.5 and I'm using Python 2.5.  Nevertheless, it
looks an awful lot like Python doesn't think it's holding on to the
memory, but doesn't give it back to the operating system, either.  Nor
does Python reuse the memory, since each successive call to my
function consumes an additional 40 MB.  This continues until finally
the VM usage is gigabytes and I get a MemoryException.

I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3.  I'm also
using a few routines from scipy 0.5.2, but for this part of the code
it's just the eigenvalue routines.

It seems that the standard advice when someone has a bit of Python
code that progressively consumes all memory is to fork a process.  I
guess that's not the worst thing in the world, but it certainly is
annoying.  Given that others seem to have had this problem, is there a
slick package to do this?  I envision:
value = call_in_separate_process(my_func, my_args)

Suggestions about how to proceed are welcome.  Ideally I'd like to
know why this is going on and fix it.  Short of that workarounds that
are more clever than the "separate process" one are also welcome.

Thanks,
Greg

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Jonathan Ballet
On Mon, 13 Nov 2006 21:30:35 +0100,
Fredrik Lundh <[EMAIL PROTECTED]> wrote:

> Jonathan Ballet wrote:
> 
> >> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object
> > 
> > Is it still true with Python 2.5 ?
> > 
> > I mean, [http://evanjones.ca/python-memory.html] should fix this
> > behaviour, doesn't it ?
> 
> not really -- that change just means that Python's object allocator
> will return memory chunks to the C allocator if the chunks become
> empty, but as the FAQ entry says, there are no guarantees that
> anything will be returned at all.  it all depends on your
> application's memory allocation patterns.

Ah OK, I thought the memory was freed more "directly" to the system
(but I misread the FAQ).

Are there any documents on good application memory allocation
patterns?


> (did you read Evan's presentation material, btw?)

I re-read it (thanks for mentioning it), but the diagrams on pages 7,
8 and 9 are not very clear to me.

(snip some questions, since I found the answers while writing them).

Where are the pools which are not completely free stored? (Not in the
'usedpools' nor in the 'freepools' objects.) (Ah, this is the
partially_allocated_arenas list, I guess :) )
Are 'usedpools' and 'freepools' arenas?

Small objects are stored in free blocks, so a block has a length of
256 bytes, I guess. Can its size change (smaller, maybe larger)?


So, if I am correct:
- I create a new 'small' object -> Python allocates a new arena, and
stores my object in a free block in one of the pools of this arena
- this arena is stored in the partially_allocated_arenas list
(Python 2.5)
- allocating more objects fills all the blocks of every pool of the
arena -> all pools go one by one into the 'usedpools' object
- del-eting every created object in my Python program frees every
block of every pool of the arena (with a lot of luck :) ), and the
pools go into the 'freepools' object
- if all pools of an arena are freed, the arena is freed
- else, the arena stays allocated, in order to be re-used


Thanks,
Jonathan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Fredrik Lundh
Klaas wrote:

> I think floats use obmalloc so I'm slightly surprised you don't see
> differences.

as noted in the FAQ I just posted a link to, floats also use a free list 
(using pretty much identical code to that used for integers).

see comments in Objects/intobject.c (quoted below) and 
Objects/floatobject.c for details.



/* Integers are quite normal objects, to make object handling uniform.
(Using odd pointers to represent integers would save much space
but require extra checks for this special case throughout the code.)
Since a typical Python program spends much of its time allocating
and deallocating integers, these operations should be very fast.
Therefore we use a dedicated allocation scheme with a much lower
overhead (in space and time) than straight malloc(): a simple
dedicated free list, filled when necessary with memory from malloc().

block_list is a singly-linked list of all PyIntBlocks ever allocated,
linked via their next members.  PyIntBlocks are never returned to the
system before shutdown (PyInt_Fini).

free_list is a singly-linked list of available PyIntObjects, linked
via abuse of their ob_type members.
*/

#define BLOCK_SIZE      1000    /* 1K less typical malloc overhead */
#define BHEAD_SIZE      8       /* Enough for a 64-bit pointer */

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Klaas
velotron wrote:
> On Nov 9, 8:38 pm, "Klaas" <[EMAIL PROTECTED]> wrote:
>
> > I was referring specifically to abominations like range(100)
>
> However, there are plenty of valid reasons to allocate huge lists of
> integers.
I'm sure there are some; I doubt there are plenty.  Care to name a few?

> This issue has been worked on:
> http://evanjones.ca/python-memory.html
> http://evanjones.ca/python-memory-part3.html
>
> My understanding is that the patch allows most objects to be released
> back to the OS, but can't help the problem for integers.  I could be

Integers use their own allocator and as such aren't affected by Evan's
patch.

> mistaken.  But on a clean Python 2.5:
>
> x=range(1000)
> x=None
>
> The problem exists for floats too, so for a less contrived example:
>
> x=[random.weibullvariate(7.0,2.0) for i in xrange(1000)]
> x=None
>
> Both leave the Python process bloated in my environment.   Is this
> problem a good candidate for the FAQ?

I think floats use obmalloc so I'm slightly surprised you don't see
differences.  I know that Evan's patch imposes conditions on freeing
obmalloc arenas, so you could be seeing effects of that.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Fredrik Lundh
Jonathan Ballet wrote:

>> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object
> 
> Is it still true with Python 2.5 ?
> 
> I mean, [http://evanjones.ca/python-memory.html] should fix this
> behaviour, doesn't it ?

not really -- that change just means that Python's object allocator will 
return memory chunks to the C allocator if the chunks become empty, but 
as the FAQ entry says, there are no guarantees that anything will be 
returned at all.  it all depends on your application's memory allocation 
patterns.
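
For example, a pattern like this one (just a sketch; the exact numbers
depend on object sizes and the allocator) can keep most arenas pinned
even though nearly all of the objects are gone, because any arena that
still contains a single live block cannot be handed back:

many = [dict() for i in xrange(2000000)]
survivors = many[::4096]   # keep one dict out of every few thousand alive
del many                   # ~99.9% of the dicts die, but the scattered
                           # survivors keep most arenas partially in use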

(did you read Evan's presentation material, btw?)



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Jonathan Ballet
On Mon, 13 Nov 2006 20:46:58 +0100,
Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> 
> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object
> 
> 
> 

Is it still true with Python 2.5 ?

I mean, [http://evanjones.ca/python-memory.html] should fix this
behaviour, doesn't it ?

Jonathan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread Fredrik Lundh
velotron wrote:

> x=range(1000)
> x=None
> 
> The problem exists for floats too, so for a less contrived example:
> 
> x=[random.weibullvariate(7.0,2.0) for i in xrange(1000)]
> x=None
> 
> Both leave the Python process bloated in my environment.   Is this
> problem a good candidate for the FAQ?

http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-13 Thread velotron
(hello group)

On Nov 9, 8:38 pm, "Klaas" <[EMAIL PROTECTED]> wrote:

> I was referring specifically to abominations like range(100)

However, there are plenty of valid reasons to allocate huge lists of
integers.   This issue has been worked on:
http://evanjones.ca/python-memory.html
http://evanjones.ca/python-memory-part3.html

My understanding is that the patch allows most objects to be released
back to the OS, but can't help the problem for integers.  I could be
mistaken.  But on a clean Python 2.5:

x=range(1000)
x=None

The problem exists for floats too, so for a less contrived example:

x=[random.weibullvariate(7.0,2.0) for i in xrange(1000)]
x=None

Both leave the Python process bloated in my environment.   Is this
problem a good candidate for the FAQ?

 --Joseph

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-09 Thread Klaas

placid wrote:

> Actually i am executing that code snippet and creating BeautifulSoup
> objects in the range()  (now xrange() ) code block.

Right; I was referring specifically to abominations like
range(100), not looping over an incrementing integer.

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread placid

Klaas wrote:
> placid wrote:
> > Hi All,
> >
> > Just wondering when i run the following code;
> >
> > for i in range(100):
> >  print i
> >
> > the memory usage of Python spikes and when the range(..) block finishes
> > execution the memory usage does not drop down. Is there a way of
> > freeing this memory that range(..) allocated?
>
> Python maintains a freelist for integers which is never freed (I don't
> believe this has changed in 2.5).  Normally this isn't an issue since
> the number of distinct integers in simultaneous use is small (assuming
> you aren't executing the above snippet).

Actually i am executing that code snippet and creating BeautifulSoup
objects in the range()  (now xrange() ) code block. 

Cheers

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread Klaas
placid wrote:
> Hi All,
>
> Just wondering when i run the following code;
>
> for i in range(100):
>  print i
>
> the memory usage of Python spikes and when the range(..) block finishes
> execution the memory usage does not drop down. Is there a way of
> freeing this memory that range(..) allocated?

Python maintains a freelist for integers which is never freed (I don't
believe this has changed in 2.5).  Normally this isn't an issue since
the number of distinct integers in simultaneous use is small (assuming
you aren't executing the above snippet).
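
One way to watch it on Linux (a rough sketch; reads VmRSS from /proc):

import os

def rss_kb():
    for line in open('/proc/%d/status' % os.getpid()):
        if line.startswith('VmRSS'):
            return int(line.split()[1])

print rss_kb()
x = range(5000000)   # five million distinct ints go onto the int free list
print rss_kb()
del x
print rss_kb()       # typically stays high: the PyIntBlocks are kept for reuse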

-Mike

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread placid

William Heymann wrote:
> On Tuesday 07 November 2006 22:42, placid wrote:
> > Hi All,
> >
> > Just wondering when i run the following code;
> >
> > for i in range(100):
> >  print i
> >
> > the memory usage of Python spikes and when the range(..) block finishes
> > execution the memory usage does not drop down. Is there a way of
> > freeing this memory that range(..) allocated?
> >
> > I found this document but the fix seems too complicated.
> >
> > http://www.python.org/pycon/2005/papers/79/python-memory.pdf
> >
> > Cheers
>
> Change range to xrange. It will run faster and use up almost no memory by
> comparison. I know the point you are getting at for releasing memory however
> in this case there is no reason to allocate the memory to begin with.

Thanks for that, it has fixed some of the memory problems.  Just
wondering: if I continuously create different BeautifulSoup objects
within an xrange() block, when does the memory get released for these
objects: after the xrange() block, or at the next iteration of the
xrange() block?

Cheers

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread William Heymann
On Tuesday 07 November 2006 22:42, placid wrote:
> Hi All,
>
> Just wondering when i run the following code;
>
> for i in range(100):
>  print i
>
> the memory usage of Python spikes and when the range(..) block finishes
> execution the memory usage does not drop down. Is there a way of
> freeing this memory that range(..) allocated?
>
> I found this document but the fix seems too complicated.
>
> http://www.python.org/pycon/2005/papers/79/python-memory.pdf
>
> Cheers

Change range to xrange. It will run faster and use up almost no memory by
comparison. I know the point you are getting at about releasing memory; however,
in this case there is no reason to allocate the memory to begin with.

Strangely enough, on my python2.4 install (kubuntu edgy) about half the memory
gets released as soon as the call finishes; however, if I run the same call
again it only goes up to the memory usage it was at before the memory was
released. So some of the memory is returned and some is reused by python
later.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Python memory usage

2006-11-07 Thread Jorge Vargas
On 7 Nov 2006 21:42:31 -0800, placid <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> Just wondering when i run the following code;
>
> for i in range(100):
>  print i
>
The problem there is that all the memory is used by the list
returned by range(), which won't be freed until the for loop exits.

try this

>>> import itertools
>>> for i in itertools.count(100):
...     print i

that uses an iterator which I believe will bring down the memory usage
but will kill your CPU :)
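
For the loop to ever finish, the counter needs a bound; a small sketch
(my addition, not part of the suggestion above) using itertools.islice:

import itertools

for i in itertools.islice(itertools.count(), 10000000):
    print i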

> the memory usage of Python spikes and when the range(..) block finishes
> execution the memory usage does not drop down. Is there a way of
> freeing this memory that range(..) allocated?
>
> I found this document but the fix seems too complicated.
>
> http://www.python.org/pycon/2005/papers/79/python-memory.pdf
>
> Cheers
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-- 
http://mail.python.org/mailman/listinfo/python-list


Python memory usage

2006-11-07 Thread placid
Hi All,

Just wondering when i run the following code;

for i in range(100):
 print i

the memory usage of Python spikes and when the range(..) block finishes
execution the memory usage does not drop down. Is there a way of
freeing this memory that range(..) allocated?

I found this document but the fix seems too complicated.

http://www.python.org/pycon/2005/papers/79/python-memory.pdf

Cheers

-- 
http://mail.python.org/mailman/listinfo/python-list