On Mon, Jun 08, 2009 at 12:29:08AM -0400, David Warde-Farley wrote:
> On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote:
> > Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes
> > a big difference, especially since I have 8 cores.

> Just curious Gael: how many PCs are you retaining? Have you tried
> iterative methods (i.e. the EM algorithm for PCA)?

I am using the heuristic described in
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996

We have very noisy and long time series. In my experience, most model-based
heuristics for choosing the number of PCs to retain give us far too many on
this problem (they simply keep diverging if I add noise at the end of the
time series). The algorithm we use gives us ~50 interesting PCs (each with
50,000 dimensions). That is about right, based on our experience with the
signal.

However, being fairly new to statistics, I am not aware of the EM algorithm
that you mention. I'd be interested in a reference, to see whether I can use
it. The PCA bootstrap is time-consuming.

Thanks,

Gaël
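
P.S. For anyone following the thread, here is a minimal sketch of what "bootstrapping a PCA, that is an SVD" can look like in NumPy. It is an illustration only, not the code we actually run: the function names `pca_svd` and `bootstrap_pca`, the array sizes, and the choice to resample rows independently with replacement are all assumptions made for the example. In particular, naive row resampling ignores the temporal correlation of a real time series, and the sketch does not implement the component-selection heuristic from the paper linked above.

```python
import numpy as np


def pca_svd(X, n_components):
    """PCA of X (n_samples x n_features) via a thin SVD of the centered data.

    Returns the leading principal axes (rows) and their explained variances.
    """
    Xc = X - X.mean(axis=0)
    # full_matrices=False keeps the decomposition thin, which matters when
    # one dimension is much larger than the other.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    return Vt[:n_components], explained_variance


def bootstrap_pca(X, n_components, n_boot=100, seed=0):
    """Explained variances of the leading PCs over bootstrap resamples of the rows.

    Hypothetical helper: rows are drawn independently with replacement,
    which is a simplification for correlated time series.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    variances = np.empty((n_boot, n_components))
    for b in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)
        _, variances[b] = pca_svd(X[idx], n_components)
    return variances


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    X = rng.standard_normal((200, 30))  # toy data, far smaller than the real problem
    variances = bootstrap_pca(X, n_components=5, n_boot=20)
    # Spread of each component's variance across resamples
    print(variances.mean(axis=0), variances.std(axis=0))
```

Each resample costs one full SVD, which is why the bootstrap is slow and why a multithreaded LAPACK makes such a difference here. The resamples are also independent of one another, so the loop could be spread over several cores.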