On 04/07/13 04:17, Andre' Walker-Loud wrote:
Hi All,

I wrote some code that is running out of memory.


How do you know? What are the symptoms? Do you get an exception? Computer 
crashes? Something else?



It involves a set of three nested loops, manipulating a data file (array) of 
dimension ~ 300 x 256 x 1 x 2.

Is it a data file, or an array? They're different things.


It uses some third party software, but my guess is I am just not aware of how 
to use proper memory management and it is not the 3rd party software that is 
the culprit.

As a general rule, you shouldn't need to worry about such things, at least 99% 
of the time.


Memory management is new to me, and so I am looking for some general guidance.  
I had assumed that reusing a variable name in a loop would automatically flush 
the memory by just overwriting it.  But this is probably wrong.  Below is a 
very generic version of what I am doing.  I hope there is something obvious I 
am doing wrong or not doing which I can do to dump the memory in each cycle of the 
innermost loop.  Hopefully, what I have below is meaningful enough, but again, 
I am new to this, so we shall see.

Completely non-meaningful.



################################################
# generic code skeleton
# import a class I wrote to utilize the 3rd party software
import my_class

Looking at the context here, "my_class" is a misleading name, since it's 
actually a module, not a class.


# instantiate the function do_stuff
my_func = my_class.do_stuff()

This is getting confusing. Either you've oversimplified your pseudo-code, or 
you're using words in ways that do not agree with standard terminology. Or 
both. You don't instantiate functions, you instantiate a class, which gives you 
an instance (an object), not a function.

So I'm lost here -- I have no idea what my_class is (possibly a module?), or 
do_stuff (possibly a class?) or my_func (possibly an instance?).
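
To make the terminology concrete, here is a minimal sketch of the difference between a class, an instance and a function. Every name in it (DoStuff, plain_function, worker) is hypothetical and only stands in for whatever the real code does. It assumes, as one possible reading of the pseudo-code, that do_stuff is a class whose instances can be called.

```python
# Hypothetical names throughout; none of this is the original poster's code.

class DoStuff:
    """A class: calling DoStuff() creates an instance."""

    def __init__(self):
        self.last_input = None

    def __call__(self, data):
        # Defining __call__ lets an instance be used like a function.
        self.last_input = data

    def results(self):
        # Stand-in for whatever number the real code would return.
        return float(len(self.last_input))


def plain_function(data):
    """A function: it is called, never instantiated."""
    return float(len(data))


worker = DoStuff()     # instantiating a class gives an instance
worker([1, 2, 3])      # calling the instance runs DoStuff.__call__
print(type(worker).__name__, worker.results())
print(type(plain_function).__name__, plain_function([1, 2, 3]))
```

If the real do_stuff looks like this, then my_func is an instance, and "instantiate the class do_stuff" would be the accurate description.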


# I am manipulating a data array of size ~ 300 x 256 x 1 x 2
data = my_data  # my_data is imported just once and has the size above

Where, and how, is my_data imported from? What is it? You say it is "a data 
array" (what sort of data array?) of size 300x256x1x2 -- that's a four-dimensional 
array, with 153600 entries. What sort of entries? Is that 153600 bytes (about 150K) or 
153600 x 64-bit floats (about 1.2 MB)? Or 153600 data structures, each one holding 1MB of 
data (about 153 GB)?
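
The size question can be checked with a few lines of arithmetic. This is only a sketch: it assumes the entries are either single bytes or 64-bit floats, which the original post does not say. For an actual numpy array, the nbytes attribute reports the same kind of figure directly.

```python
import math
import struct

shape = (300, 256, 1, 2)

# Total number of entries in an array of this shape (math.prod needs Python 3.8+).
n_entries = math.prod(shape)

# Size of one 64-bit float (a C double) in bytes.
bytes_per_float = struct.calcsize("d")

total_if_bytes = n_entries * 1
total_if_floats = n_entries * bytes_per_float

print("entries:          ", n_entries)
print("as single bytes:  ", total_if_bytes, "bytes")
print("as 64-bit floats: ", total_if_floats, "bytes")
```

Either way the array is small by modern standards, so the array itself is unlikely to be what exhausts memory.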


# instantiate a 3d array of size 20 x 10 x 10 and fill it with all zeros
my_array = numpy.zeros([20,10,10])

At last, we finally see something concrete! A numpy array. Is this the same 
sort of array used above?


# loop over parameters and fill array with desired output
for i in range(loop_1):
     for j in range(loop_2):
         for k in range(loop_3):

How big are loop_1, loop_2, loop_3?

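You should consider using xrange() rather than range(). In Python 2, range() 
builds the complete list in memory before the loop starts, while xrange() 
produces the numbers one at a time, so if the number is very large, xrange 
will be more memory efficient.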
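
A rough illustration of the difference, written for Python 3, where range() already behaves lazily the way Python 2's xrange() does. The eager Python 2 behaviour is imitated here with list(range(n)); the value of n is arbitrary.

```python
import sys

n = 1_000_000

lazy = range(n)          # values are computed on demand, as with Python 2's xrange()
eager = list(range(n))   # every value stored up front, as with Python 2's range()

lazy_size = sys.getsizeof(lazy)
# getsizeof reports the list's own storage only, not the integer objects it holds,
# so the real cost of the eager version is larger still.
eager_size = sys.getsizeof(eager)

print("lazy range object:", lazy_size, "bytes")
print("materialised list:", eager_size, "bytes")
```

The lazy object stays the same size no matter how large n is, while the list grows with n.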


             # create tmp_data that has a shape which is the same as data,
             # except the first dimension can range from 1 - 1024 instead of
             # being fixed at 300
             '''  Is the next line where I am causing memory problems? '''
             tmp_data = my_class.chop_data(data,i,j,k)

How can we possibly tell if chop_data is causing memory problems when you don't 
show us what chop_data does?
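
One way to gather evidence rather than guess is to measure memory around the suspect call. The sketch below uses the standard library's tracemalloc module (Python 3.4 and later). The chop_data here is a made-up stand-in, since the real one was never shown, and tracemalloc only sees allocations made through Python's allocator, so memory grabbed directly by a third-party C library may not show up.

```python
import tracemalloc


def chop_data(data, i, j, k):
    # Hypothetical stand-in: copies a slice whose length depends on the indices.
    return list(data[: (i + 1) * (j + 1) * (k + 1)])


data = list(range(100_000))

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

for i in range(3):
    for j in range(3):
        for k in range(3):
            tmp_data = chop_data(data, i, j, k)

# get_traced_memory() returns (current, peak) in bytes since tracing started.
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("growth after loops:", current - baseline, "bytes")
print("peak during loops: ", peak - baseline, "bytes")
```

If the growth figure keeps climbing as the loop counts increase, something is holding on to the intermediate results. If it stays flat, the problem lies elsewhere.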


             my_func(tmp_data)
             my_func.third_party_function()

Again, no idea what they do.


             my_array([i,j,k]) = my_func.results() # this is just a floating point number
             ''' should I do something to flush tmp_data? '''

No. Python will automatically garbage collect it as needed.

Well, that's not quite true. It depends on what tmp_data actually is. So, 
*probably* no. But without seeing the code that creates tmp_data, I cannot be sure.
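
Here is a small sketch of both cases. Chunk is a hypothetical stand-in for whatever tmp_data really is, and the cache list imitates a library that quietly keeps its own reference to each object it is given. Whether the third-party software does anything like that is an assumption, not something the original post shows.

```python
import gc
import weakref


class Chunk:
    """Hypothetical stand-in for whatever tmp_data really is."""

    def __init__(self, label):
        self.label = label


# Case 1: the name is simply rebound on each pass through the loop.
tmp_data = Chunk("first")
watcher = weakref.ref(tmp_data)   # observes the object without keeping it alive
tmp_data = Chunk("second")        # the first Chunk now has no references left
gc.collect()                      # not needed on CPython, but keeps the check portable
freed_when_rebound = watcher() is None

# Case 2: something else keeps a reference, for example a cache inside a library.
cache = []
tmp_data = Chunk("third")
cache.append(tmp_data)
watcher = weakref.ref(tmp_data)
tmp_data = Chunk("fourth")        # rebinding the name is not enough this time
gc.collect()
freed_when_cached = watcher() is None

print("freed after plain rebinding:", freed_when_rebound)
print("freed while still cached:   ", freed_when_cached)
```

Reusing a name releases the old object only when that name held the last reference to it. If memory keeps growing, the thing to look for is whatever else is still holding on.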



--
Steven