[Numpy-discussion] Strange memory consumption in numpy?
Hi all,

In the context of memory profiling an application (with the memory_profiler module) we came across some strange behaviour in numpy, see for yourselves:

Line #    Mem usage    Increment   Line Contents
================================================
    29                             @profile
    30    23.832 MB     0.000 MB   def main():
    31    46.730 MB    22.898 MB       arr1 = np.random.rand(100, 3)
    32    58.180 MB    11.449 MB       arr1s = arr1.astype(np.float32)
    33    35.289 MB   -22.891 MB       del arr1
    34    35.289 MB     0.000 MB       gc.collect()
    35    58.059 MB    22.770 MB       arr2 = np.random.rand(100, 3)
    36    69.500 MB    11.441 MB       arr2s = arr2.astype(np.float32)
    37    69.500 MB     0.000 MB       del arr2
    38    69.500 MB     0.000 MB       gc.collect()
    39    69.500 MB     0.000 MB       arr3 = np.random.rand(100, 3)
    40    80.945 MB    11.445 MB       arr3s = arr3.astype(np.float32)
    41    80.945 MB     0.000 MB       del arr3
    42    80.945 MB     0.000 MB       gc.collect()
    43    80.945 MB     0.000 MB       return arr1s, arr2s, arr3s

Lines 31-34 behave as expected, but then we don't understand lines 35-38 (why is arr2 not garbage collected?) and lines 39-42 (why doesn't the random call allocate any memory?).

Can anyone give a reasonable explanation?

I attach the full script for reference.

Best regards,
Martin

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2013 Martin Raspaud
#
# Author(s):
#   Martin Raspaud martin.rasp...@smhi.se
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see http://www.gnu.org/licenses/.
import numpy as np
import gc

@profile
def main():
    arr1 = np.random.rand(100, 3)
    arr1s = arr1.astype(np.float32)
    del arr1
    gc.collect()
    arr2 = np.random.rand(100, 3)
    arr2s = arr2.astype(np.float32)
    del arr2
    gc.collect()
    arr3 = np.random.rand(100, 3)
    arr3s = arr3.astype(np.float32)
    del arr3
    gc.collect()
    return arr1s, arr2s, arr3s

@profile
def main2():
    a1, a2, a3 = main()

main2()

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Strange memory consumption in numpy?
On Thu, May 16, 2013 at 8:35 AM, Martin Raspaud martin.rasp...@smhi.se wrote:

> Hi all,
>
> In the context of memory profiling an application (with the
> memory_profiler module) we came across some strange behaviour in numpy,
> see for yourselves:
>
> [...]
>
> Lines 31-34 behave as expected, but then we don't understand lines 35-38
> (why is arr2 not garbage collected?) and lines 39-42 (why doesn't the
> random call allocate any memory?).
>
> Can anyone give a reasonable explanation?

memory_profiler only looks at the amount of memory that the OS has allocated to the Python process. It cannot measure the amount of memory actually given to living objects. Python does not always return memory to the OS immediately when it frees the memory for an object.

Your two observations are linked. Python freed the memory of arr2 immediately, but it did not return that memory to the OS, so memory_profiler could not notice it. When arr3 was allocated, it happened to fit into the block of memory that arr2 once owned, so Python's memory allocator simply reused that block. Since Python did not have to go out to the OS for more memory, memory_profiler could not notice that, either.

--
Robert Kern
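The block reuse Robert describes can sometimes be observed from inside the process by comparing data-buffer addresses before and after the reallocation. This is an illustrative sketch (not part of the original thread); the array size and helper name are mine, and whether the address is actually reused depends on the platform's allocator, so the final print is not guaranteed to be True:

```python
import numpy as np

def buf_addr(a):
    # Base address of the array's data buffer, as an integer.
    return a.__array_interface__['data'][0]

arr2 = np.random.rand(1000000, 3)   # ~24 MB buffer
addr2 = buf_addr(arr2)
del arr2                            # refcount hits zero; buffer freed immediately

arr3 = np.random.rand(1000000, 3)   # same-sized request
addr3 = buf_addr(arr3)

# Often True: the allocator hands the freed block straight back to arr3,
# so the process footprint (what memory_profiler sees) does not change.
print(addr3 == addr2)
```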
Re: [Numpy-discussion] Strange memory consumption in numpy?
On 16/05/13 10:26, Robert Kern wrote:

>> Can anyone give a reasonable explanation?
>
> memory_profiler only looks at the amount of memory that the OS has
> allocated to the Python process. It cannot measure the amount of memory
> actually given to living objects.
> [...]
> Since Python did not have to go out to the OS for more memory,
> memory_profiler could not notice that, either.

Robert,

Thanks a lot for the clear explanation, it makes perfect sense now.

You're talking about living objects, but as I understand it, the few memory profilers I found around the web for Python can't track numpy arrays. Any pointers to something that would work with numpy?

Best regards,
Martin
Re: [Numpy-discussion] Strange memory consumption in numpy?
On Thu, May 16, 2013 at 1:32 PM, Martin Raspaud martin.rasp...@smhi.se wrote:

> You're talking about living objects, but as I understand it, the few
> memory profilers I found around the web for Python can't track numpy
> arrays. Any pointers to something that would work with numpy?

meliae has special support for numpy.ndarray objects. It's a little broken, in that it will double-count views, but you can provide a better specialization if you wish (look for the add_special_size() function).

https://launchpad.net/meliae

--
Robert Kern
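The double-counting of views that Robert mentions is easy to see with plain ndarray attributes: a view reports its own nbytes even though it owns no buffer, so naively summing nbytes over all arrays counts shared memory twice. A small sketch (not from the thread):

```python
import sys
import numpy as np

arr = np.random.rand(1000000, 3)   # owns a 24,000,000-byte buffer
view = arr[::2]                    # a view: no new data buffer allocated

print(arr.nbytes)        # 24000000 -- size of the owned data buffer
print(view.nbytes)       # 12000000 -- but these bytes belong to arr's buffer
print(view.base is arr)  # True: summing nbytes per object double-counts

# ndarray.__sizeof__ reflects ownership: an owning array includes its
# buffer, while a view is just the small ndarray header.
print(sys.getsizeof(arr) > sys.getsizeof(view))   # True
```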
[Numpy-discussion] __array_priority__ ignored if __array__ is present
Hi everyone,

(This was posted as part of another topic, but since it was unrelated, I'm reposting as a separate thread.)

I've also been having issues with __array_priority__ - the following code behaves differently for __mul__ and __rmul__:

import numpy as np

class TestClass(object):

    def __init__(self, input_array):
        self.array = input_array

    def __mul__(self, other):
        print "Called __mul__"

    def __rmul__(self, other):
        print "Called __rmul__"

    def __array_wrap__(self, out_arr, context=None):
        print "Called __array_wrap__"
        return TestClass(out_arr)

    def __array__(self):
        print "Called __array__"
        return np.array(self.array)

with output:

In [7]: a = TestClass([1,2,3])

In [8]: print type(np.array([1,2,3]) * a)
Called __array__
Called __array_wrap__
<class '__main__.TestClass'>

In [9]: print type(a * np.array([1,2,3]))
Called __mul__
<type 'NoneType'>

Is this also an oversight? I opened a ticket for it a little while ago: https://github.com/numpy/numpy/issues/3164

Any ideas?

Thanks!
Tom
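For contrast with the report above, this is how the documented deferral behaves when __array__ is not in the picture: a class that sets only a high __array_priority__ and an __rmul__ does get its __rmul__ called, because ndarray's forward operation returns NotImplemented. A sketch against current NumPy, in Python 3 syntax; the class name is made up:

```python
import numpy as np

class Deferred(object):
    # A priority above ndarray's default (0.0) asks ndarray.__mul__ to
    # return NotImplemented, so Python falls back to our reflected method.
    __array_priority__ = 15.0

    def __rmul__(self, other):
        return "Called __rmul__"

a = Deferred()
result = np.array([1, 2, 3]) * a
print(result)
```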
Re: [Numpy-discussion] numpy.nanmin, numpy.nanmax, and scipy.stats.nanmean
On Thu, May 16, 2013 at 6:09 PM, Phillip Feldman phillip.m.feld...@gmail.com wrote:

> It seems odd that `nanmin` and `nanmax` are in NumPy, while `nanmean` is
> in scipy.stats. I'd like to propose that a `nanmean` function be added
> to NumPy.

Have no fear. There are already plans for its inclusion in the next release:

https://github.com/numpy/numpy/pull/3297/files

Cheers!
Ben Root
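For reference, the nan-aware mean is simply the mean over the non-NaN entries, which is easy to write by hand until the function is available. A quick sketch of the equivalence (np.nanmean shipped in NumPy 1.8, the release the PR above targeted):

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])

# Mask out NaNs and average what remains -- this is what nanmean computes.
mask = ~np.isnan(x)
by_hand = x[mask].mean()

print(by_hand)          # 2.0
print(np.nanmean(x))    # 2.0 -- same result, built in since NumPy 1.8
```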