Re: [Numpy-discussion] opening pickled numarray data with numpy
Try creating an empty module/class with the given name. I.e., create a 'numarray' dir in a directory on your PYTHONPATH, put an empty __init__.py file in it, then create a 'generic.py' file in that dir and populate it with whatever class Python complains about, like so:

    #!/usr/bin/env python
    class MissingClass(object):
        pass

Cheers,

Jason

On Mon, Oct 19, 2009 at 1:00 PM, dagmar wismeijer wrote:
> Hi,
>
> I've been trying to open (using numpy) old pickled data files that I once
> created using numarray, but I keep getting the message that there is no
> module numarray.generic.
> Is there any way I could open these datafiles without installing numarray
> again?
>
> Thanks in advance,
>
> Dagmar

-- 
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
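The stub-package suggestion above can be sketched in Python as follows. This is a minimal illustration, not a tested recipe for real numarray files: the temporary directory, the hand-written pickle bytes, and the class name MissingClass (the placeholder used in the reply) are all assumptions made for the demo. A stub only gets the unpickler past the missing import; whether the array data itself can be rebuilt depends on what the original pickle stored.

```python
import importlib
import os
import pickle
import sys
import tempfile

# Build a throwaway 'numarray' package containing a 'generic' module.
# In practice this directory would live somewhere on PYTHONPATH.
stub_root = tempfile.mkdtemp()
pkg_dir = os.path.join(stub_root, "numarray")
os.makedirs(pkg_dir)

# Empty __init__.py marks the directory as a package.
open(os.path.join(pkg_dir, "__init__.py"), "w").close()

# generic.py defines whatever class the unpickler complains about.
# 'MissingClass' is a placeholder name, not a real numarray class.
with open(os.path.join(pkg_dir, "generic.py"), "w") as f:
    f.write("class MissingClass(object):\n    pass\n")

sys.path.insert(0, stub_root)
importlib.invalidate_caches()

# Hand-written protocol-0 pickle that refers to numarray.generic.MissingClass.
# It stands in for an old data file; without the stub this load would fail
# because the module cannot be imported.
old_pickle = b"cnumarray.generic\nMissingClass\n."
loaded = pickle.loads(old_pickle)
print(loaded.__module__, loaded.__name__)
```

If the load then fails on a different class name, the same file can be extended with another empty class of that name, one error message at a time.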
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 11:02 AM, David Cournapeau <da...@ar.media.kyoto-u.ac.jp> wrote:

> Isn't it true for any general framework who enjoys some popularity :)

Yup :)

> I think there are cases where gradient methods are not applicable
> (latent models where the complete data Y cannot be split into
> observations-hidden (O, H) variables), although I am not sure that's a
> very common case in machine learning,

I won't argue with that. My bias has certainly been strongly influenced by the type of problems I've been exposed to. It'd be interesting to hear of a problem where one can't separate observed/hidden variables :)

Cheers,

Jason
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 8:55 AM, David Cournapeau <da...@ar.media.kyoto-u.ac.jp> wrote:

> I think it depends on what you are doing - EM is used for 'real' work
> too, after all :)

Certainly, but EM is really just a mediocre gradient descent/hill climbing algorithm that is relatively easy to implement.

> Thanks for the link, I was not aware of this work. What is the
> difference between the ECG method and the method proposed by Lange in
> [1] ? To avoid 'local trapping' of the parameter in EM methods,
> recursive EM [2] may also be a promising method, also it seems to me
> that it has not been used so much, but I may well be wrong (I have seen
> several people using a simplified version of it without much theoretical
> consideration in speech processing).

I hung out in the machine learning community from approx. 1999 to 2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :) I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reasons to prefer it over proper gradient descent/hill climbing algorithms (besides its presentability and ease of implementation).

Cheers,

Jason
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Note that EM can be very slow to converge:

http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf

EM is great for churning out papers, not so great for getting real work done. Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience. Btw, have you considered how much the Gaussianity assumption is hurting you?

Jason

On Mon, Jun 8, 2009 at 1:17 AM, David Cournapeau <da...@ar.media.kyoto-u.ac.jp> wrote:
> Gael Varoquaux wrote:
> > I am using the heuristic exposed in
> > http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996
> >
> > We have very noisy and long time series. My experience is that most
> > model-based heuristics for choosing the number of PCs retained give us
> > way too much on this problem (they simply keep diverging if I add noise
> > at the end of the time series). The algorithm we use gives us ~50
> > interesting PCs (each composed of 50 000 dimensions). That happens to be
> > quite right based on our experience with the signal. However, being
> > fairly new to statistics, I am not aware of the EM algorithm that you
> > mention. I'd be interested in a reference, to see if I can use that
> > algorithm.
>
> I would not be surprised if David had this paper in mind :)
>
> http://www.cs.toronto.edu/~roweis/papers/empca.pdf
>
> cheers,
>
> David
Re: [Numpy-discussion] matrix default to column vector?
As someone who is very used to thinking in terms of matrices and who just went through the adjustment of translating matlab-like code to numpy, I found the current matrix module to be confusing. Its poor integration with the rest of numpy/scipy (in particular, scipy.optimize.fmin_cg) made it more difficult to use than it was worth. I'd rather have "matrix" and/or "matrix multiplication" sections of the documentation explain how to do typical, basic matrix operations with ndarray, dot, T, and arr[None,:]. I think a matrix class would still be worthwhile for findability, but it should simply serve as documentation for how to do matrix stuff with ndarray.

Cheers,

Jason
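The basic matrix operations mentioned above (dot, T, and arr[None,:]) might look like this with plain arrays. This is only a small sketch; the values of A and v are made up for illustration.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, 1.0])

# Matrix-vector product: the result is again a 1-D array.
Av = np.dot(A, v)

# Transpose of a 2-D array.
At = A.T

# A 1-D array has no row/column orientation; indexing with None adds an axis.
row = v[None, :]   # shape (1, 2), a row vector
col = v[:, None]   # shape (2, 1), a column vector

# Outer and inner products via dot.
outer = np.dot(col, row)   # shape (2, 2)
inner = np.dot(v, v)       # scalar

print(Av, row.shape, col.shape, outer.shape, inner)
```

Because everything stays an ndarray, the results can be passed straight to routines that expect 1-D arrays without converting back and forth.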
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Hi David,

Let me suggest that you try the latest version of Ubuntu (9.04/Jaunty), which was released two months ago. It sounds like you are effectively using release 5 of RedHat Linux which was originally released May 2007. There have been updates (5.1, 5.2, 5.3), but, if my memory serves me correctly, RedHat updates are more focused on fixing bugs and security issues rather than improving functionality. Ubuntu does a full, new release every 6 months so you don't have to wait as long to see improvements. Ubuntu also has a tremendously better package management system. You generally shouldn't be installing packages by hand as it sounds like you are doing.

This post suggests that the latest version of Ubuntu is up-to-date wrt ATLAS:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg13102.html

Jason

On Fri, Jun 5, 2009 at 5:44 AM, David Paul Reichert <d.p.reich...@sms.ed.ac.uk> wrote:
> Thanks for the replies so far.
>
> I had already tested using an already transposed matrix in the loop,
> it didn't make any difference. Oh and btw, I'm on (Scientific) Linux.
>
> I used the Enthought distribution, but I guess I'll have to get
> my hands dirty and try to get that Atlas thing working (I'm not
> a Linux expert though). My simulations pretty much consist of
> matrix multiplications, so if I don't get rid of that factor 5,
> I pretty much have to get back to Matlab.
>
> When you said Atlas is going to be optimized for my system, does
> that mean I should compile everything on each machine separately?
> I.e. I have a not-so-great desktop machine and one of those bigger
> multicore things available...
>
> Cheers
>
> David
Re: [Numpy-discussion] matrix default to column vector?
Thanks for the responses. I did not realize that dot() would do matrix multiplication which was the main reason I was looking for a matrix-like class. Like you and Tom suggested, I think it's best to stick to arrays.

Cheers,

Jason

On Sun, May 24, 2009 at 6:45 PM, David Warde-Farley wrote:
> On 24-May-09, at 8:32 AM, Tom K. wrote:
>
> > Maybe my reluctance to work with matrices stems from this kind of
> > inconsistency. It seems like your code has to be all matrix, or all
> > array -
> > and if you mix them, you need to be very careful about which is which.
>
> Also, functions called on things of type matrix may not return a
> matrix as expected, but rather an array.
>
> Anecdotally, it seems to me that lots of people (myself included) seem
> to go through a phase early in their use of NumPy where they try to
> use matrix(), but most seem to end up switching to using 2D arrays for
> all the aforementioned reasons.
>
> David
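For readers following the thread, the difference between * and dot() on plain arrays can be seen in a few lines. This is an illustrative sketch with made-up values, not code from any of the posters.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# On ndarrays, * multiplies element by element.
elementwise = A * B

# dot() performs the matrix product on the same arrays.
matmul = np.dot(A, B)

# With the matrix class, * is the matrix product instead, which is the
# kind of array/matrix inconsistency the quoted replies warn about.
as_matrix = np.matrix(A) * np.matrix(B)

print(elementwise)
print(matmul)
print(np.array_equal(np.asarray(as_matrix), matmul))
```

Sticking to arrays plus dot() means * always has one meaning throughout the code.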
[Numpy-discussion] matrix default to column vector?
By default, it looks like a 1-dim ndarray gets converted to a row vector by the matrix constructor. This seems to lead to some odd behavior: a[1] yields the 2nd element when a is an ndarray, but throws an IndexError once a has been converted to a matrix. Is it possible to set a flag to make the default be a column vector?

Thanks,

Jason
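The behavior described in the question can be reproduced with a short script. This is a sketch with illustrative values; it shows the row-vector default and two explicit ways to get a column, and does not answer whether a global flag exists.

```python
import numpy as np

a = np.array([10, 20, 30])
m = np.matrix(a)          # 1-D input becomes a 1x3 row vector

print(a[1])               # second element of the ndarray
print(m.shape)            # shape of the matrix built from it

# m[1] asks for the second *row* of a one-row matrix, hence the error.
try:
    m[1]
    raised = False
except IndexError:
    raised = True
print(raised)

# Indexing with (row, column) reaches the same element.
second = m[0, 1]

# Explicit column vectors: transpose the matrix, or add an axis to the array.
col_matrix = np.matrix(a).T   # shape (3, 1), still a matrix
col_array = a[:, None]        # shape (3, 1), plain ndarray
print(col_matrix.shape, col_array.shape)
```

Note that col_matrix[1] returns a 1x1 matrix rather than a scalar, so col_matrix[1, 0] is needed to get the element itself.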