[Numpy-discussion] Strange memory consumption in numpy?

2013-05-16 Thread Martin Raspaud
Hi all,

In the context of memory profiling an application (with the memory_profiler
module), we came across some strange behaviour in numpy. See for yourselves:

Line #    Mem usage    Increment   Line Contents
================================================
    29                             @profile
    30    23.832 MB     0.000 MB   def main():
    31    46.730 MB    22.898 MB       arr1 = np.random.rand(1000000, 3)
    32    58.180 MB    11.449 MB       arr1s = arr1.astype(np.float32)
    33    35.289 MB   -22.891 MB       del arr1
    34    35.289 MB     0.000 MB       gc.collect()
    35    58.059 MB    22.770 MB       arr2 = np.random.rand(1000000, 3)
    36    69.500 MB    11.441 MB       arr2s = arr2.astype(np.float32)
    37    69.500 MB     0.000 MB       del arr2
    38    69.500 MB     0.000 MB       gc.collect()
    39    69.500 MB     0.000 MB       arr3 = np.random.rand(1000000, 3)
    40    80.945 MB    11.445 MB       arr3s = arr3.astype(np.float32)
    41    80.945 MB     0.000 MB       del arr3
    42    80.945 MB     0.000 MB       gc.collect()
    43    80.945 MB     0.000 MB       return arr1s, arr2s, arr3s
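
The increments are consistent with the raw array sizes (assuming shape
(1000000, 3), which the reported numbers imply):

# Quick arithmetic check: raw array data sizes in MiB.
print 1000000 * 3 * 8 / 1024.**2   # float64 -> ~22.888 MiB
print 1000000 * 3 * 4 / 1024.**2   # float32 -> ~11.444 MiB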


Lines 31-34 behave as expected, but we don't understand lines 35-38
(why is arr2 not garbage collected?) or lines 39-42 (why doesn't the
third random call allocate any memory?).

Can anyone give a reasonable explanation?

I attach the full script for reference.

Best regards,
Martin
#!/usr/bin/env python
# -*- coding: utf-8 -*-

# Copyright (c) 2013 Martin Raspaud

# Author(s):

#   Martin Raspaud martin.rasp...@smhi.se

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see http://www.gnu.org/licenses/.




import numpy as np
import gc
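
# Note: the @profile decorator below is injected into builtins by
# memory_profiler when the script is run with:
#   python -m memory_profiler <script.py>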

@profile
def main():
    arr1 = np.random.rand(1000000, 3)
    arr1s = arr1.astype(np.float32)
    del arr1
    gc.collect()
    arr2 = np.random.rand(1000000, 3)
    arr2s = arr2.astype(np.float32)
    del arr2
    gc.collect()
    arr3 = np.random.rand(1000000, 3)
    arr3s = arr3.astype(np.float32)
    del arr3
    gc.collect()
    return arr1s, arr2s, arr3s

@profile
def main2():
    a1, a2, a3 = main()

main2()


Re: [Numpy-discussion] Strange memory consumption in numpy?

2013-05-16 Thread Robert Kern
On Thu, May 16, 2013 at 8:35 AM, Martin Raspaud martin.rasp...@smhi.se wrote:
> [profile output snipped]
>
> Lines 31-34 behave as expected, but we don't understand lines 35-38
> (why is arr2 not garbage collected?) or lines 39-42 (why doesn't the
> third random call allocate any memory?).
>
> Can anyone give a reasonable explanation?

memory_profiler only looks at the amount of memory that the OS has
allocated to the Python process. It cannot measure the amount of
memory actually given to living objects. Python does not always return
memory back to the OS immediately when it frees the memory for an
object. Your two observations are linked. Python freed the memory of
arr2 immediately, but it did not return the memory to the OS, so
memory_profiler could not notice it. When arr3 is allocated, it
happened to fit into the block of memory that arr2 once owned, so
Python's memory allocator just used that block again. Since Python did
not have to go out to the OS to get more memory, memory_profiler could
not notice that, either.
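
To see this gap directly, one can compare the arrays' own sizes
(ndarray.nbytes, exact) with the process-level RSS that memory_profiler
samples. A minimal sketch, assuming a reasonably recent psutil (which
memory_profiler itself builds on):

# Sketch (not from this thread): object-level vs process-level memory.
import numpy as np
import psutil

proc = psutil.Process()                # the current Python process

arr = np.random.rand(1000000, 3)       # ~24 MB of float64 data
arrs = arr.astype(np.float32)          # ~12 MB of float32 data
print "nbytes: arr=%d arrs=%d" % (arr.nbytes, arrs.nbytes)
print "rss before del: %d" % proc.memory_info().rss

del arr                                # the object is freed here...
print "rss after del:  %d" % proc.memory_info().rss
# ...but the RSS may not drop: the allocator can keep the freed block
# and hand it to the next allocation of a similar size.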

--
Robert Kern


Re: [Numpy-discussion] Strange memory consumption in numpy?

2013-05-16 Thread Martin Raspaud
On 16/05/13 10:26, Robert Kern wrote:
> [snip]
> Python freed the memory of arr2 immediately, but it did not return the
> memory to the OS, so memory_profiler could not notice it. When arr3 is
> allocated, it happened to fit into the block of memory that arr2 once
> owned, so Python's memory allocator just used that block again.

Robert,

Thanks a lot for the clear explanation, it makes perfect sense now.

You're talking about living objects, but as I understand it, the few
memory profilers for Python that I found around the web can't track
numpy arrays. Any pointers to something that would work with numpy?

Best regards,
Martin


Re: [Numpy-discussion] Strange memory consumption in numpy?

2013-05-16 Thread Robert Kern
On Thu, May 16, 2013 at 1:32 PM, Martin Raspaud martin.rasp...@smhi.se wrote:
> You're talking about living objects, but as I understand it, the few
> memory profilers for Python that I found around the web can't track
> numpy arrays. Any pointers to something that would work with numpy?

meliae has special support for numpy.ndarray objects. It's a little
broken, in that it will double-count views, but you can provide a
better specialization if you wish (look for the add_special_size()
function).

https://launchpad.net/meliae
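
For reference, the usual meliae workflow looks roughly like this (a
sketch, not from the thread; the filename is arbitrary):

# Dump a snapshot of every live object to JSON, then summarize by type.
from meliae import scanner
scanner.dump_all_objects('objects.json')

# Typically done afterwards, often in a separate session:
from meliae import loader
om = loader.load('objects.json')
om.summarize()   # per-type counts and total sizes, ndarrays included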

--
Robert Kern


[Numpy-discussion] __array_priority__ ignored if __array__ is present

2013-05-16 Thread Thomas Robitaille
Hi everyone,

(this was posted as part of another topic, but since it was unrelated,
I'm reposting as a separate thread)

I've also been having issues with __array_priority__ - the following
code behaves differently for __mul__ and __rmul__:


import numpy as np


class TestClass(object):

    def __init__(self, input_array):
        self.array = input_array

    def __mul__(self, other):
        print "Called __mul__"

    def __rmul__(self, other):
        print "Called __rmul__"

    def __array_wrap__(self, out_arr, context=None):
        print "Called __array_wrap__"
        return TestClass(out_arr)

    def __array__(self):
        print "Called __array__"
        return np.array(self.array)


with output:


In [7]: a = TestClass([1,2,3])

In [8]: print type(np.array([1,2,3]) * a)
Called __array__
Called __array_wrap__
<class '__main__.TestClass'>

In [9]: print type(a * np.array([1,2,3]))
Called __mul__
<type 'NoneType'>
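
For context, the expectation being reported is that a sufficiently high
__array_priority__ makes ndarray's operators defer to the other
operand's reflected method. A minimal sketch of that expectation (not
from the original post; the priority value and class name are
hypothetical):

class PriorityClass(object):
    # Hypothetical variant: with a high __array_priority__ and no
    # __array__ method, ndarray.__mul__ is expected to return
    # NotImplemented so Python falls back to PriorityClass.__rmul__.
    __array_priority__ = 15

    def __rmul__(self, other):
        print "Called __rmul__"
        return self

a = PriorityClass()
print type(np.array([1, 2, 3]) * a)   # expected: PriorityClass

The report is that once __array__ is also defined (as in TestClass
above), numpy converts the operand and performs the product itself, and
the priority no longer gets a say.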


Is this also an oversight? I opened a ticket for it a little while ago:

https://github.com/numpy/numpy/issues/3164

Any ideas?

Thanks!
Tom


Re: [Numpy-discussion] numpy.nanmin, numpy.nanmax, and scipy.stats.nanmean

2013-05-16 Thread Benjamin Root
On Thu, May 16, 2013 at 6:09 PM, Phillip Feldman 
phillip.m.feld...@gmail.com wrote:

> It seems odd that `nanmin` and `nanmax` are in NumPy, while `nanmean`
> is in SciPy.stats.  I'd like to propose that a `nanmean` function be
> added to NumPy.

Have no fear. There are already plans for its inclusion in the next release:

https://github.com/numpy/numpy/pull/3297/files
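
In the meantime, a stop-gap nanmean can be pieced together from existing
NumPy functions; a minimal sketch (not from the thread):

# Stop-gap nanmean: sum the non-NaN values, divide by the non-NaN count.
import numpy as np

def nanmean(a, axis=None):
    a = np.asarray(a, dtype=float)
    valid = ~np.isnan(a)              # mask of entries that count
    return np.nansum(a, axis=axis) / valid.sum(axis=axis)

print nanmean(np.array([1.0, np.nan, 3.0]))   # -> 2.0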

Cheers!
Ben Root