Re: [Numpy-discussion] design patterns for computing
It sounds to me like you have something closer to the following.

class Problem
- initialized with an Input instance and an Analysis instance
- has a ``get_results`` method that asks the Analysis instance to
  - call ``get_input`` on the Input instance
  - run the analysis on the provided input
  - create and return an instance of a Results class

Instances of the Input class can then produce input any way you like---e.g., as data containers or as data fetchers---as long as they provide it in a standard format in response to ``get_input``.

fwiw,
Alan Isaac

___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion
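The pattern described above can be sketched in a few lines of Python. The class and method names (Problem, Input, Analysis, Results, ``get_results``, ``get_input``) come from the description in the post; the method bodies are purely illustrative placeholders:

```python
class Results:
    """Container for the output of an analysis."""
    def __init__(self, data):
        self.data = data

class Input:
    """Produces input in a standard format via get_input().
    A subclass could just as well fetch from disk or a database."""
    def __init__(self, data):
        self._data = data
    def get_input(self):
        return self._data

class Analysis:
    """Opaque number-cruncher: standard input in, Results out."""
    def run(self, data):
        # placeholder crunching; a real analysis would go here
        return Results([x * 2 for x in data])

class Problem:
    def __init__(self, input_, analysis):
        self.input = input_
        self.analysis = analysis
    def get_results(self):
        # ask the Analysis to pull input from the Input instance
        return self.analysis.run(self.input.get_input())

results = Problem(Input([1, 2, 3]), Analysis()).get_results()
print(results.data)  # -> [2, 4, 6]
```

Because the main program only ever touches ``get_results`` and ``get_input``, swapping in a different Input or Analysis implementation requires no changes elsewhere.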
Re: [Numpy-discussion] Error in deallocation ?
> > The problem is I don't have time for this at the moment, I have to
> > develop my algorithm for my PhD, and if one does not work, I'll try
> > another one, but this is strange because this algorithm worked in
> > the past...
>
> By incentive, I meant incentive for me, not for you :) I think this
> could be really helpful for other problems of this kind as well.

But as you have the same problems as me ;)

> > BTW, I can't use massif. Although the normal algorithm does not use
> > more than 150MB, with massif, it goes higher than 2.7GB, and this is
> > before the really hungry process.
>
> Using massif with code using a large amount of Python code is clearly
> a no-no, since it uses so much malloc. Sometimes, you can just stop
> the thing when it is growing and watch the resulting graphs. But this
> is no fun, and if you have no time...

It's mainly that it crashes when the second file is loaded into memory; the model that will be used for this data isn't even loaded, and thus the value of the result is zero :( Strange though (IMHO) that massif needs to allocate so much memory for this little data.

Matthieu
Re: [Numpy-discussion] Error in deallocation ?
On 10/17/07, Matthieu Brucher <[EMAIL PROTECTED]> wrote:
> > > I'll try it this night (because it is very very long, so with the
> > > simulator...)
> >
> > Yes, that's the worst case, of course. For those cases, I wish we
> > could use numpy with COUNT_ALLOCS. Unfortunately, using numpy in this
> > case is impossible (it crashes at import, and I've never tracked down
> > the problem; maybe your problem may be enough of an incentive to
> > start again):
> >
> > http://mail.python.org/pipermail/python-list/2003-May/206256.html
>
> The problem is I don't have time for this at the moment, I have to
> develop my algorithm for my PhD, and if one does not work, I'll try
> another one, but this is strange because this algorithm worked in the
> past...

By incentive, I meant incentive for me, not for you :) I think this could be really helpful for other problems of this kind as well.

> BTW, I can't use massif. Although the normal algorithm does not use
> more than 150MB, with massif, it goes higher than 2.7GB, and this is
> before the really hungry process.

Using massif with code using a large amount of Python code is clearly a no-no, since it uses so much malloc. Sometimes, you can just stop the thing when it is growing and watch the resulting graphs. But this is no fun, and if you have no time...

David
[Numpy-discussion] design patterns for computing
Hello there,

some helpful pointers provided in the list have put me on track to ask further questions. Now that I'm acquainted with simulating interfaces and duck typing (which is still a bit fuzzy to me), I think the information I've been looking for all along is the so-called 'design patterns'. Looking at the Optimizer and Stats examples, I could see how new objects were created that built upon a base class. Stats even had a class to contain calculation results, which brought me even closer to the point. But now I'm stuck again.

I imagined three interfaces to be designed, through which the main program controls the number-crunching without constraining its implementation. These would be:

- getInputs: responsible for passing/preparing data and parameters for the analysis/computation
- Analysis: which implements all the crunching, from FFTs to esoteric stuff like genetic algorithms (and which should be considered opaque from the main program's point of view)
- AnalysisResults: to somehow get the results back into the main program.

An additional class would cater for plotting properties, which are usually associated with a particular analysis. Finally, anticipating that I may want to cascade analyses, the inputs and results must be made uniform - they must be of the same form, likely numpy arrays.

Well, with this pattern, to implement an analysis, I'd have to define derived getInputs and AnalysisResults classes, and code Analysis to accept and return arrays. A flag to the analysis object could be used to generate plot information or not. (Hey, writing this email is already helping, I'll put the gist in my documentation!)

So, after this brainstorming I'll do some more on the framework. I'd be most grateful for comments on the design pattern - is it sensible, could it be better? Are there any good literature sources on patterns for this type of work?
Thanks, Renato
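The cascading idea in the post (analyses that take arrays in and give arrays out, so they compose) can be sketched as follows. The Detrend and Power classes and the cascade helper are hypothetical names invented for illustration, not part of any actual framework:

```python
import numpy as np

class Detrend:
    """One analysis stage: remove the mean."""
    def run(self, x):
        return x - x.mean()

class Power:
    """Another stage: power spectrum of a real signal."""
    def run(self, x):
        return np.abs(np.fft.rfft(x)) ** 2

def cascade(analyses, x):
    """Feed the output array of each analysis into the next.
    This only works because every stage uses the same uniform
    interface: numpy arrays in, numpy arrays out."""
    for a in analyses:
        x = a.run(x)
    return x

# a 4-cycle sine wave, 256 samples
x = np.sin(np.linspace(0, 8 * np.pi, 256))
spectrum = cascade([Detrend(), Power()], x)
```

Since every stage is opaque to the caller, reordering or inserting stages is just a matter of editing the list.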
Re: [Numpy-discussion] Error in deallocation ?
> > I'll try it this night (because it is very very long, so with the
> > simulator...)
>
> Yes, that's the worst case, of course. For those cases, I wish we
> could use numpy with COUNT_ALLOCS. Unfortunately, using numpy in this
> case is impossible (it crashes at import, and I've never tracked down
> the problem; maybe your problem may be enough of an incentive to start
> again):
>
> http://mail.python.org/pipermail/python-list/2003-May/206256.html

The problem is I don't have time for this at the moment, I have to develop my algorithm for my PhD, and if one does not work, I'll try another one, but this is strange because this algorithm worked in the past...

BTW, I can't use massif. Although the normal algorithm does not use more than 150MB, with massif, it goes higher than 2.7GB, and this is before the really hungry process.

Matthieu
Re: [Numpy-discussion] Error in deallocation ?
Matthieu Brucher wrote:
> > I agree with you, the problem is that I do not use C functions
> > directly and that I do not know how I can reproduce the result with
> > a minimal example.
>
> Yes, that's why I suggested looking at the memory usage (using top or
> something else). Because maybe the problem can be spotted long before
> seeing it happen, hence having smaller code (which you could e.g.
> post, etc...)
>
> For the moment nothing unusual, the memory use seems to be stable.
>
> > But if you can reproduce it with 128*128 images, then maybe it is
> > quick enough to run the whole thing under massif?
>
> I'll try it this night (because it is very very long, so with the
> simulator...)

Yes, that's the worst case, of course. For those cases, I wish we could use numpy with COUNT_ALLOCS. Unfortunately, using numpy in this case is impossible (it crashes at import, and I've never tracked down the problem; maybe your problem may be enough of an incentive to start again):

http://mail.python.org/pipermail/python-list/2003-May/206256.html

cheers,
David
Re: [Numpy-discussion] appending extra items to arrays
Hello,

I work with numarray.zeros(n, numarray.Float64) as sound mixers; there are huge amounts of data: 44,000..192,000 samples/second. Operations: add, append, add & append (add if the mixing begins on the existing part of the array, plus append if the array has to be prolonged).

I never use list-appending; instead I concatenate a zero-array of fixed length (which can be given as, e.g., sr/10). This way is much faster than appending with lists when there are huge amounts of data (for small arrays, lists are faster, but I never have a sound of 10 samples :-)

--
René Bastian
http://www.musiques-rb.org
http://pythoneon.musiques-rb.org
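The chunked-growth strategy described above (grow by a fixed-size zero block, then mix in place) might look like this in modern numpy terms. The mix_into helper and the chunk size are illustrative, not from the poster's actual code:

```python
import numpy as np

def mix_into(buf, start, signal, chunk=4410):
    """Mix `signal` into `buf` beginning at sample `start`.
    If the buffer is too short, extend it by whole zero chunks
    (here one tenth of a second at 44.1 kHz), so reallocations
    happen per chunk rather than per sample."""
    end = start + len(signal)
    while len(buf) < end:
        buf = np.concatenate([buf, np.zeros(chunk)])
    buf[start:end] += signal       # in-place mix on the existing part
    return buf

buf = np.zeros(0)
buf = mix_into(buf, 0, np.ones(1000))     # triggers one chunk allocation
buf = mix_into(buf, 500, np.ones(1000))   # overlaps, no reallocation needed
```

With per-sample appends the cost of growth would be quadratic; with fixed chunks it is amortized over thousands of samples, which is why this wins for long sounds.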
Re: [Numpy-discussion] Error in deallocation ?
> > I agree with you, the problem is that I do not use directly C
> > functions and that I do not know how I can reproduce the result with
> > a minimal example.
>
> Yes, that's why I suggested looking at the memory usage (using top or
> something else). Because maybe the problem can be spotted long before
> seeing it happen, hence having smaller code (which you could e.g.
> post, etc...)

For the moment nothing unusual, the memory use seems to be stable.

> But if you can reproduce it with 128*128 images, then maybe it is
> quick enough to run the whole thing under massif?

I'll try it this night (because it is very very long, so with the simulator...)

Thanks for the help ;)

Matthieu
Re: [Numpy-discussion] Error in deallocation ?
Matthieu Brucher wrote:
> > As said, my approach to debugging this kind of thing is to get out
> > of Python ASAP. And once you manage to reproduce the result when
> > calling only a couple of Python functions, then you use massif or
> > memcheck.
>
> I agree with you, the problem is that I do not use C functions
> directly and that I do not know how I can reproduce the result with a
> minimal example.

Yes, that's why I suggested looking at the memory usage (using top or something else). Because maybe the problem can be spotted long before seeing it happen, hence having smaller code (which you could e.g. post, etc...)

But if you can reproduce it with 128*128 images, then maybe it is quick enough to run the whole thing under massif?

cheers,
David
Re: [Numpy-discussion] Error in deallocation ?
> As said, my approach to debugging this kind of thing is to get out of
> Python ASAP. And once you manage to reproduce the result when calling
> only a couple of Python functions, then you use massif or memcheck.

I agree with you, the problem is that I do not use C functions directly and that I do not know how I can reproduce the result with a minimal example. I know I once had a Numpy problem that was solved by recompiling Numpy completely, but here, I do not possess a first clue to help me track this bug.

> The reason to get out of Python is that tools like valgrind can only
> give you meaningful information at the C level, and it is difficult to
> make the link between C and Python calls, if not impossible in all but
> trivial cases.
>
> But when you can get this kind of graph:
>
> http://valgrind.org/docs/manual/ms-manual.html
>
> Then the problem is "solved", at least in my limited experience.

Completely agreed.

Matthieu
Re: [Numpy-discussion] Error in deallocation ?
Matthieu Brucher wrote:
> > > I wish I could, but this behaviour only shows up on this peculiar
> > > data set :(
> >
> > It always happens at the same time?
>
> Yes. Perhaps debugging the process will sort things out?

As said, my approach to debugging this kind of thing is to get out of Python ASAP. And once you manage to reproduce the result when calling only a couple of Python functions, then you use massif or memcheck.

The reason to get out of Python is that tools like valgrind can only give you meaningful information at the C level, and it is difficult to make the link between C and Python calls, if not impossible in all but trivial cases.

But when you can get this kind of graph:

http://valgrind.org/docs/manual/ms-manual.html

Then the problem is "solved", at least in my limited experience.

David
Re: [Numpy-discussion] Error in deallocation ?
> > I wish I could, but this behaviour only shows up on this peculiar
> > data set :(
>
> It always happens at the same time?

Yes. Perhaps debugging the process will sort things out?

> > Unfortunately, the process is very long, there are several
> > optimizations in the process, the whole thing in an EM-like
> > algorithm, and the crash does not occur with the first image, it is
> > later.
>
> Can you observe some 'funky' memory behaviour while the process is
> running? For example, does the memory keep increasing after each
> dataset?

Good question, in fact it should, because I'm saving the result in a list before I pickle it (but in fact I have to change this process as pickle cannot handle the images I'm using, but that's another problem). So it should increase, but not too much. I'll check this.

Matthieu
Re: [Numpy-discussion] Error in deallocation ?
Matthieu Brucher wrote:
> > Can't you emulate this behaviour with signals different from images?
> > (say a random signal of 64*64*64*3 samples).
>
> I wish I could, but this behaviour only shows up on this peculiar data
> set :(

It always happens at the same time?

> Unfortunately, the process is very long, there are several
> optimizations in the process, the whole thing in an EM-like algorithm,
> and the crash does not occur with the first image, it is later.

Can you observe some 'funky' memory behaviour while the process is running? For example, does the memory keep increasing after each dataset?
Re: [Numpy-discussion] Error in deallocation ?
Oops, I'm wrong, the problem happened this time with the 128² images and not the 3D images. I'll try to check a little bit further.

Matthieu

2007/10/17, Matthieu Brucher <[EMAIL PROTECTED]>:
> > Can't you emulate this behaviour with signals different from images?
> > (say a random signal of 64*64*64*3 samples).
>
> I wish I could, but this behaviour only shows up on this peculiar data
> set :(
>
> > If the process does not require a long processing time (say a couple
> > of minutes), then you may be able to use the massif tool from
> > valgrind, which may be helpful to detect too many INCREFs. For
> > DECREFs, the default memory checker from valgrind should be useful
> > as well.
> >
> > If it does take only a couple of minutes, it is then relatively easy
> > to "bisect" the code to spot the error (once you get through the C
> > level, it is easy to see the problem, if you can reproduce it, in my
> > experience).
>
> Unfortunately, the process is very long, there are several
> optimizations in the process, the whole thing in an EM-like algorithm,
> and the crash does not occur with the first image, it is later.
>
> Matthieu
Re: [Numpy-discussion] Error in deallocation ?
> Can't you emulate this behaviour with signals different from images?
> (say a random signal of 64*64*64*3 samples).

I wish I could, but this behaviour only shows up on this peculiar data set :(

> If the process does not require a long processing time (say a couple
> of minutes), then you may be able to use the massif tool from
> valgrind, which may be helpful to detect too many INCREFs. For
> DECREFs, the default memory checker from valgrind should be useful as
> well.
>
> If it does take only a couple of minutes, it is then relatively easy
> to "bisect" the code to spot the error (once you get through the C
> level, it is easy to see the problem, if you can reproduce it, in my
> experience).

Unfortunately, the process is very long, there are several optimizations in the process, the whole thing in an EM-like algorithm, and the crash does not occur with the first image, it is later.

Matthieu
Re: [Numpy-discussion] appending extra items to arrays
Growing an array by appending to it is the slow way in Matlab too. The suggested way to do things there is to preallocate the array by saying x=zeros() and then referencing the elements in the array and inserting the correct values.

--Chad Kidder

On Oct 17, 2007, at 7:16 AM, mark wrote:
> So it seems like lists are the way to grow an array. Interestingly
> enough, it is very easy to grow an array in Matlab. Any idea how they
> do that (or are they slow as well?).
>
> Mark
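The preallocation advice above carries over to numpy directly; a minimal sketch (sizes and values chosen only for illustration):

```python
import numpy as np

n = 1_000_000

# Preallocate once, then write into the existing buffer.
# No reallocation or copying happens inside the loop.
x = np.zeros(n)
for i in range(0, n, 100_000):
    x[i] = 1.0

# By contrast, np.append(x, v) copies the whole array on every call,
# making repeated appends O(n^2) overall -- the slow Matlab-style
# growth described in the thread.
```

When the final size is unknown, the usual compromise is to preallocate a generous buffer and slice it down afterwards, or to collect values in a Python list and convert once at the end.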
Re: [Numpy-discussion] Error in deallocation ?
Matthieu Brucher wrote:
> > There are two types of errors that can occur with reference counting
> > on data-types.
> >
> > 1) There are too many DECREFs --- this gets us to the error quickly
> > and is usually easy to reproduce
> > 2) There are too many INCREFs (the reference count keeps going up
> > until the internal counter wraps around to 0 and deallocation is
> > attempted) --- this error is harder to reproduce and usually takes a
> > while before it happens in the code.
>
> The error showed up again with a simple operation:
>
> File
> "/home/brucher/local/src/include/toolbox/stats/kernels/gaussian.py",
> line 60, in __call__
>     xp = (x-self.loc) * (1/self.scale)
>
> where x and self.loc are arrays and self.scale is an integer.
>
> The error as it was given:
>
> *** Reference count error detected:
> an attempt was made to deallocate 7 (l) ***
> *** Reference count error detected:
> an attempt was made to deallocate 7 (l) ***
> *** Reference count error detected:
> an attempt was made to deallocate 7 (l) ***
>
> I don't know if it is too many INCREFs, but I doubt it. It happens in
> a loop where I treat a bunch of 3D images each time (64*64*64*3), and
> it crashes after some dozens of them. I have no problem when I treat
> small 2D images (128*128), but far more samples.

Can't you emulate this behaviour with signals different from images? (say a random signal of 64*64*64*3 samples). If the process does not require a long processing time (say a couple of minutes), then you may be able to use the massif tool from valgrind, which may be helpful to detect too many INCREFs. For DECREFs, the default memory checker from valgrind should be useful as well.

If it does take only a couple of minutes, it is then relatively easy to "bisect" the code to spot the error (once you get through the C level, it is easy to see the problem, if you can reproduce it, in my experience).
cheers,
David
Re: [Numpy-discussion] Error in deallocation ?
> There are two types of errors that can occur with reference counting
> on data-types.
>
> 1) There are too many DECREFs --- this gets us to the error quickly
> and is usually easy to reproduce
> 2) There are too many INCREFs (the reference count keeps going up
> until the internal counter wraps around to 0 and deallocation is
> attempted) --- this error is harder to reproduce and usually takes a
> while before it happens in the code.

The error showed up again with a simple operation:

File "/home/brucher/local/src/include/toolbox/stats/kernels/gaussian.py", line 60, in __call__
    xp = (x-self.loc) * (1/self.scale)

where x and self.loc are arrays and self.scale is an integer.

The error as it was given:

*** Reference count error detected:
an attempt was made to deallocate 7 (l) ***
*** Reference count error detected:
an attempt was made to deallocate 7 (l) ***
*** Reference count error detected:
an attempt was made to deallocate 7 (l) ***

I don't know if it is too many INCREFs, but I doubt it. It happens in a loop where I treat a bunch of 3D images each time (64*64*64*3), and it crashes after some dozens of them. I have no problem when I treat small 2D images (128*128), but far more samples.

Matthieu
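Before reaching for valgrind, one crude way to look for a refcount imbalance on a dtype from pure Python is sys.getrefcount. This is only a monitoring sketch (the helper name and the idea of logging the count per iteration are suggestions, not something from the thread); a count that climbs steadily across iterations would hint at a missing DECREF, while one heading toward zero would hint at too many:

```python
import sys
import numpy as np

def dtype_refcount(dt):
    # sys.getrefcount counts the temporary reference created by the
    # call itself, so subtract 1 for a more honest number.
    return sys.getrefcount(np.dtype(dt)) - 1

before = dtype_refcount(np.float64)

# the kind of workload discussed: a 64*64*64*3 volume and a simple
# arithmetic expression like xp = (x - loc) * (1 / scale)
x = np.zeros((64, 64, 64, 3))
y = (x - 0.5) * (1 / 3)

after = dtype_refcount(np.float64)
# print(before, after) in the real loop and watch the trend
```

Logging this pair once per image in the outer loop costs almost nothing and might localize which iteration the imbalance starts in.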
Re: [Numpy-discussion] appending extra items to arrays
So it seems like lists are the way to grow an array. Interestingly enough, it is very easy to grow an array in Matlab. Any idea how they do that (or are they slow as well?).

Mark

On Oct 11, 8:53 pm, "Adam Mercer" <[EMAIL PROTECTED]> wrote:
> On 11/10/2007, Mark Janikas <[EMAIL PROTECTED]> wrote:
> > If you do not know the size of your array before you finalize it,
> > then you should use lists whenever you can. I just cooked up a short
> > example:
> >
> > # Result #
> > Total Time with array: 2.12951189331
> > Total Time with list: 0.0469707035741
> >
> > Hope this helps,
>
> That is helpful, I thought that using arrays would be much faster but
> it's clearly not in this case.
>
> Thanks
>
> Adam
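The two approaches being timed in the quoted example look roughly like this (a reconstruction under assumed function names, not Mark Janikas's actual benchmark). Python lists over-allocate so that append is amortized O(1), while np.append copies the whole array every call:

```python
import numpy as np

def grow_with_list(n):
    out = []
    for i in range(n):
        out.append(i * 0.5)    # amortized O(1) per append
    return np.array(out)       # one conversion at the end

def grow_with_append(n):
    out = np.empty(0)
    for i in range(n):
        # allocates a new (i+1)-element array and copies every time,
        # so the whole loop is O(n^2)
        out = np.append(out, i * 0.5)
    return out

a = grow_with_list(1000)
b = grow_with_append(1000)
assert np.array_equal(a, b)    # same result, very different cost
```

This is also, roughly, what Matlab does under the hood when you grow an array by assignment past its end, which is why its documentation recommends preallocation as well.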
Re: [Numpy-discussion] abstraction and interfaces
What is more, with the concept of duck typing, you just have to provide the interface without inheriting from a parent class (which is what I do for the optimizer framework; the other sub-modules do not derive from a common ancestor).

Matthieu

2007/10/17, Renato Serodio <[EMAIL PROTECTED]>:
> Hello there,
>
> thanks to your pointer, I've progressed further on the OO concept, and
> am currently building analysis, inputData and outputResults interfaces
> that should add some flexibility to my program.
>
> On the other hand, pulling the OO and interfaces string opened a box
> full of little critters, and now I'm learning about UML, some advanced
> class usage and so on. My brain hurts..
>
> Cheers,
>
> Renato
>
> On 12/10/2007, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> > On Fri, 12 Oct 2007, Renato Serodio apparently wrote:
> > > The scripts that produce these metrics use Scipy/Numpy
> > > functions that operate on data conveniently converted to
> > > numpy arrays. They're quite specific, and I tend to
> > > produce/tweak a lot of them. So, to fit in this
> > > application someone suggested I programmed 'interfaces'
> > > (in java jargon) to them - that way I could develop the
> > > whole wrapper application without giving much thought to
> > > the actual number-crunching bits.
> >
> > That sounds quite right. Check out
> > https://projects.scipy.org/scipy/scikits/browser/trunk/openopt/scikits/openopt/solvers/optimizers/optimizer
> > http://svn.scipy.org/svn/scipy/trunk/scipy/stats/models/
> > for examples that may be relevant to your project.
> >
> > Python does not have interfaces per se, but that does
> > not stop you from designing interface-like classes and
> > inheriting from them.
> >
> > fwiw,
> > Alan Isaac
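The duck-typing point above, that any object providing the right method works without a common ancestor, can be sketched like this. The class and function names are invented for illustration and are not from the actual optimizer framework:

```python
class GradientStep:
    """One update rule: a gradient step for f(x) = x**2."""
    def step(self, x):
        return x - 0.1 * 2 * x

class HalvingStep:
    """A completely unrelated rule that happens to provide
    the same step() interface. No shared base class."""
    def step(self, x):
        return x / 2

def optimize(stepper, x0, n=10):
    x = x0
    for _ in range(n):
        # duck typing: anything with a step() method is accepted,
        # with no isinstance checks and no inheritance requirement
        x = stepper.step(x)
    return x

print(optimize(GradientStep(), 4.0))
print(optimize(HalvingStep(), 4.0))
```

The caller only depends on the informal protocol "has a step() method", which is exactly the interface-without-inheritance idea Matthieu describes.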
Re: [Numpy-discussion] A basic question on the dot function
> What I'm searching for is:
>
> In [18]: dotprod2(a,b)
> Out[18]: array([ 0.28354876, 0.54474092, 0.22986942, 0.42822669,
> 0.98179793])
>
> where I defined a "classical" (in the way I understand it; I may not
> understand it properly?) dot product between these 2 vectors.
>
> def dotprod2(a,b):
>     return sum(a*b,axis=0)

I think this is your best choice ;)

Matthieu
Re: [Numpy-discussion] A basic question on the dot function
2007/10/16, Timothy Hochberg <[EMAIL PROTECTED]>:
> You might try tensordot. Without thinking it through too much:
>     numpy.tensordot(a0, a1, axes=[-1,-1])
> seems to do what you want.

Thank you. However, it works only for this simple example, where a0 and a1 are similar. The tensor product increases the rank of the output, doesn't it? Whereas the dot product decreases the rank. Is there a "proper" solution if a and b are general (3,N) arrays? For example:

In [16]: a = random.random_sample((3,5))
In [17]: b = random.random_sample((3,5))

What I'm searching for is:

In [18]: dotprod2(a,b)
Out[18]: array([ 0.28354876, 0.54474092, 0.22986942, 0.42822669, 0.98179793])

where I defined a "classical" (in the way I understand it; I may not understand it properly?) dot product between these 2 vectors:

def dotprod2(a,b):
    return sum(a*b,axis=0)

or in maths notation: c_j = \sum_i a_{ij} b_{ij}

Thanks in advance,

JH
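For reference, the columnwise product c_j = \sum_i a_{ij} b_{ij} asked for above can be written several equivalent ways in modern numpy (einsum postdates this thread, so this is a present-day addendum rather than what was available to the posters):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 5))
b = rng.random((3, 5))

# c_j = sum_i a_ij * b_ij, three equivalent forms:
c1 = (a * b).sum(axis=0)          # the dotprod2 from the post
c2 = np.einsum('ij,ij->j', a, b)  # explicit index notation
c3 = np.diag(a.T @ b)             # wasteful: builds the full 5x5 product

assert np.allclose(c1, c2) and np.allclose(c1, c3)
```

The elementwise-multiply-and-sum form is the cheapest here; the matrix-product form computes 25 dot products only to keep 5, which is why dotprod2 really is the best choice for (3,N) inputs.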
Re: [Numpy-discussion] abstraction and interfaces
Hello there,

thanks to your pointer, I've progressed further on the OO concept, and am currently building analysis, inputData and outputResults interfaces that should add some flexibility to my program.

On the other hand, pulling the OO and interfaces string opened a box full of little critters, and now I'm learning about UML, some advanced class usage and so on. My brain hurts..

Cheers,

Renato

On 12/10/2007, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> On Fri, 12 Oct 2007, Renato Serodio apparently wrote:
> > The scripts that produce these metrics use Scipy/Numpy
> > functions that operate on data conveniently converted to
> > numpy arrays. They're quite specific, and I tend to
> > produce/tweak a lot of them. So, to fit in this
> > application someone suggested I programmed 'interfaces'
> > (in java jargon) to them - that way I could develop the
> > whole wrapper application without giving much thought to
> > the actual number-crunching bits.
>
> That sounds quite right. Check out
> https://projects.scipy.org/scipy/scikits/browser/trunk/openopt/scikits/openopt/solvers/optimizers/optimizer
> http://svn.scipy.org/svn/scipy/trunk/scipy/stats/models/
> for examples that may be relevant to your project.
>
> Python does not have interfaces per se, but that does
> not stop you from designing interface-like classes and
> inheriting from them.
>
> fwiw,
> Alan Isaac