Re: Algorithms for categorical data

2013-06-02 Thread Ted Dunning
Also, I was just reading the paper you referred to. It makes what seem to me to be a series of somewhat strawman arguments against 1 of n encoding. First, actual practice often involves Euclidean distances between points on a sphere S^n rather than unrestricted points in R^n. This helps qui

Re: Algorithms for categorical data

2013-06-02 Thread Ted Dunning
So Florents, can you say how this works better than 1 of n coding and then using a simple scaled Euclidean metric? Beyond that, how would this scale? On Sun, Jun 2, 2013 at 2:39 PM, Florents Tselai wrote: > I've noticed (correct me if I'm wrong) that Mahout lacks algorithms > specialized in
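For readers following the thread, here is a minimal Python sketch of what "1 of n coding and then using a simple scaled Euclidean metric" could look like. It is an illustration only, not Mahout code: the helper names (`build_vocab`, `one_hot`, `scale_to_sphere`, `euclidean`) and the toy records are hypothetical, and the scaling chosen (dividing by the square root of the number of attributes, which puts every encoded record on the unit sphere) is one assumed reading of "scaled".

```python
import math

def build_vocab(records):
    """Collect the sorted set of category values seen for each attribute."""
    n_attributes = len(records[0])
    return [sorted({r[i] for r in records}) for i in range(n_attributes)]

def one_hot(record, vocab):
    """1-of-n encode one record: one indicator slot per (attribute, value)."""
    vec = []
    for value, categories in zip(record, vocab):
        vec.extend(1.0 if value == c else 0.0 for c in categories)
    return vec

def scale_to_sphere(vec, n_attributes):
    """Each encoded record has exactly n_attributes ones, so its norm is
    sqrt(n_attributes); dividing by that puts it on the unit sphere.
    (Assumed choice of scaling, not taken from the thread.)"""
    norm = math.sqrt(n_attributes)
    return [x / norm for x in vec]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy data: three categorical attributes per record.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
vocab = build_vocab(records)
encoded = [scale_to_sphere(one_hot(r, vocab), len(vocab)) for r in records]

d_one_mismatch = euclidean(encoded[0], encoded[1])
d_all_mismatch = euclidean(encoded[0], encoded[2])
```

Under this scaling the squared distance between two records is 2m/k, where m is the number of attributes on which they disagree and k is the number of attributes, so the metric is a monotone function of the simple mismatch count.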

Re: Algorithms for categorical data

2013-06-02 Thread Florents Tselai
Yes On Sun, Jun 2, 2013 at 9:56 PM, Yexi Jiang wrote: > You mean you are testing on the single machine version? > > > 2013/6/2 Florents Tselai > > > Not yet. > > > > I'm currently experimenting with various implementations in Python. > > > > > > On Sun, Jun 2, 2013 at 9:43 PM, Yexi Jiang wrote

Re: Algorithms for categorical data

2013-06-02 Thread Yexi Jiang
You mean you are testing on the single machine version? 2013/6/2 Florents Tselai > Not yet. > > I'm currently experimenting with various implementations in Python. > > > On Sun, Jun 2, 2013 at 9:43 PM, Yexi Jiang wrote: > > > Do you already have one implemented? > > > > > > 2013/6/2 Florents Ts

Re: Algorithms for categorical data

2013-06-02 Thread Florents Tselai
Not yet. I'm currently experimenting with various implementations in Python. On Sun, Jun 2, 2013 at 9:43 PM, Yexi Jiang wrote: > Do you already have one implemented? > > > 2013/6/2 Florents Tselai > > > I've noticed (correct me if I'm wrong) that Mahout lacks algorithms > > specialized in clus

Re: Algorithms for categorical data

2013-06-02 Thread Yexi Jiang
Do you already have one implemented? 2013/6/2 Florents Tselai > I've noticed (correct me if I'm wrong) that Mahout lacks algorithms > specialized in clustering data with categorical attributes. > > Would the community be interested in the implementation of algorithms like > ROCK

Algorithms for categorical data

2013-06-02 Thread Florents Tselai
I've noticed (correct me if I'm wrong) that Mahout lacks algorithms specialized in clustering data with categorical attributes. Would the community be interested in the implementation of algorithms like ROCK? I'm currently working on t
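As background on what a ROCK-style approach computes, the sketch below shows its core "link" idea in single-machine Python: two records are neighbors when their Jaccard similarity reaches a threshold, and the link count of a pair is the number of neighbors they share. This is not the implementation mentioned in the message. The function names, the threshold value, and the toy records are hypothetical, records are assumed to be sets of attribute=value items, and a record is not counted as its own neighbor here, which is a simplifying assumption.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of categorical items."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def neighbor_sets(records, theta):
    """For each record index, the indices of other records whose
    similarity to it is at least theta (self excluded by assumption)."""
    n = len(records)
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(records[i], records[j]) >= theta:
                nbrs[i].add(j)
                nbrs[j].add(i)
    return nbrs

def link_counts(nbrs):
    """Number of common neighbors for every pair (i, j) with i < j."""
    n = len(nbrs)
    return {
        (i, j): len(nbrs[i] & nbrs[j])
        for i in range(n)
        for j in range(i + 1, n)
    }

# Hypothetical toy records, each a set of attribute=value items.
records = [
    {"color=red", "size=small", "shape=round"},
    {"color=red", "size=small", "shape=square"},
    {"color=red", "size=large", "shape=round"},
    {"color=blue", "size=huge", "shape=flat"},
]
nbrs = neighbor_sets(records, theta=0.5)
links = link_counts(nbrs)
```

The pairwise loop is quadratic in the number of records, which is why the scaling question raised in the replies matters for any distributed version.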