Re: Beginner questions on clustering & M/R

2010-07-17 Thread Ted Dunning
Just speaking heuristically, time series data is very high-dimensional. For the equities market, you have (at least) daily samples on nearly 10,000 publicly traded stocks. With only 3 years of data, that gives you on the order of 10 million dimensions. With 30 years of data, things are obviously 10x worse. If
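
The order-of-magnitude estimate above can be checked with a few lines. This is only a sketch: the figure of about 252 trading days per year is an assumption that does not appear in the thread, and the function name is made up for illustration.

```python
# Rough count of "dimensions" when every (stock, trading day) pair is one coordinate.
# ASSUMPTION (not from the thread): about 252 trading days per year.
TRADING_DAYS_PER_YEAR = 252


def total_dimensions(num_stocks: int, years: int) -> int:
    """Number of daily samples across all stocks over the given number of years."""
    return num_stocks * TRADING_DAYS_PER_YEAR * years


three_years = total_dimensions(10_000, 3)
thirty_years = total_dimensions(10_000, 30)

print(three_years)   # millions of coordinates, i.e. on the order of 10 million
print(thirty_years)  # ten times the three-year figure
```

With these assumptions the three-year count comes out somewhat below 10 million, which is consistent with the heuristic spirit of the estimate.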

Re: Beginner questions on clustering & M/R

2010-07-17 Thread Florent Empis
Hi, On the SVD part... why would that help? Thanks for your input :) Florent 2010/7/15 Ted Dunning > Clustering of time series data is usually better done in an abstract, > relatively low-dimensional coordinate space based on some transform like a > locality-sensitive frequency transform. Gab

Re: Beginner questions on clustering & M/R

2010-07-16 Thread Florent Empis
Thanks :-) 2010/7/16 Ted Dunning > Gabor transform retains some time domain information. Since economic > processes change somewhat, I think that would be important. It also helps > avoid questions about how to window the signal (it effectively *is* a > windowed Fourier transform). > > On Fri,

Re: Beginner questions on clustering & M/R

2010-07-16 Thread Ted Dunning
Gabor transform retains some time domain information. Since economic processes change somewhat, I think that would be important. It also helps avoid questions about how to window the signal (it effectively *is* a windowed Fourier transform). On Fri, Jul 16, 2010 at 5:01 AM, Florent Empis wrote: >
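
To make the "windowed Fourier transform" remark concrete, here is a minimal sketch of a discrete Gabor transform: a Gaussian window slides over the signal and a plain DFT is taken at each position. The function name, the parameter names (`window_size`, `hop`, `sigma`), and the toy signal are all assumptions chosen for illustration; nothing here comes from the thread or from Mahout.

```python
import cmath
import math


def gabor_transform(signal, window_size, hop, sigma):
    """Return DFT magnitudes of Gaussian-windowed segments, one row per window position."""
    center = (window_size - 1) / 2.0
    # Gaussian window centred on the segment; sigma is measured in samples.
    window = [math.exp(-0.5 * ((n - center) / sigma) ** 2) for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        # Only non-negative frequencies are needed for a real-valued signal.
        for k in range(window_size // 2 + 1):
            coeff = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                for n in range(window_size)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


# Toy signal (hypothetical): a slow oscillation followed by a faster one.
slow = [math.sin(2 * math.pi * 2 * t / 32) for t in range(64)]
fast = [math.sin(2 * math.pi * 8 * t / 32) for t in range(64)]
frames = gabor_transform(slow + fast, window_size=32, hop=16, sigma=6.0)

# Dominant frequency bin in the first and last window positions.
first_peak = max(range(len(frames[0])), key=frames[0].__getitem__)
last_peak = max(range(len(frames[-1])), key=frames[-1].__getitem__)
print(first_peak, last_peak)
```

The dominant bin differs between the first and last window positions, which is the time-domain information a single Fourier transform over the whole series would lose.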

Re: Beginner questions on clustering & M/R

2010-07-16 Thread Florent Empis
Hi, First of all, let me stress I'm not actually trying to do quant analysis... it's just for fun, no practical use is expected, other than learning some new stuff. I also thought of using a transform from time to frequency (Fourier...) but it was only a wild guess based on my limited knowledge

Re: Beginner questions on clustering & M/R

2010-07-15 Thread Ted Dunning
Clustering of time series data is usually better done in an abstract, relatively low-dimensional coordinate space based on some transform like a locality-sensitive frequency transform. Gabor transforms might be appropriate. You might be able to get away with something like an SVD of your daily cha
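
As a rough illustration of the SVD idea, the sketch below approximates the leading singular triplet of a small matrix by power iteration and uses it as a one-dimensional coordinate per row. Everything specific here is an assumption: the matrix layout (one row per stock, one column per day), the made-up numbers, and the use of power iteration as a stand-in for a full SVD routine.

```python
import math


def top_singular_triplet(matrix, iterations=200):
    """Approximate the leading singular value and vectors via power iteration on A^T A."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # starting guess for the right singular vector
    for _ in range(iterations):
        # u = A v
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # w = A^T u, then renormalise to get the next estimate of v
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v


# Hypothetical data: 4 series observed over 5 days.
# Rows 0 and 1 move together; rows 2 and 3 move the opposite way.
changes = [
    [1.0, -2.0, 1.5, -0.5, 2.0],
    [0.9, -2.1, 1.4, -0.6, 2.1],
    [-1.0, 2.0, -1.5, 0.5, -2.0],
    [-1.1, 1.9, -1.6, 0.4, -1.9],
]

sigma, u, v = top_singular_triplet(changes)
# One low-dimensional coordinate per row: its weight on the leading component.
coords = [sigma * x for x in u]
print(coords)
```

Rows that behave alike land close together on this single coordinate, so a clustering step can work in one dimension (or a handful, with more components) instead of one dimension per day.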

Re: Beginner questions on clustering & M/R

2010-07-15 Thread Joe Spears
I once thought about being a quant long before I started my current company :). I don't want to ruin the surprise for you, but because of volatility in the market and the fact that you are looking at daily data (unless you spend a lot of time writing a custom clustering implementation) you are mo

Beginner questions on clustering & M/R

2010-07-15 Thread Florent Empis
Hi, I want to learn more about clustering techniques. I have skimmed through Programming Collective Intelligence and Mahout in Action in the past but I don't have them on hand at the moment... :( I've seen Isabel Drost's mail about test data on http://mldata.org/about/ I've had an idea of using http://