Re: How to use kmeans clustering algorithm of Mahout

2012-09-13 Thread Paritosh Ranjan
Please describe the problem you are facing in detail here; I hope you will get an answer. On 13-09-2012 08:29, Don.Tan wrote: I have tried it by following the way of the sample code, and I noticed that I should not use seq2sparse directory. That leads to the sparse

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Don Tan
I don't think I explained this clearly enough, and I'm sorry for that. The example shown earlier is part of my data. Each line is a user profile; for example, the first row holds the features of one user. I want to apply k-means to this data. I need to create a file that saves all user profiles as sparse vector
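A minimal sketch of the conversion described above, in plain Python rather than Mahout's own API. It assumes (this is not confirmed in the thread) that each comma-separated number is a feature ID and that repeated IDs should be counted. The function names `parse_profile` and `to_sparse_text`, the `index:value` text layout, and the sample line are all hypothetical and only illustrate the idea of one sparse vector per user line.

```python
from collections import Counter


def parse_profile(line):
    """Turn one comma-separated profile line into a sparse {feature_id: count} dict.

    Assumption: every token is an integer feature ID; empty tokens are skipped.
    """
    tokens = [t.strip() for t in line.split(",")]
    ids = [int(t) for t in tokens if t]
    return dict(Counter(ids))


def to_sparse_text(vector):
    """Render a sparse vector as 'index:value' pairs sorted by index (hypothetical layout)."""
    return " ".join(f"{idx}:{val}" for idx, val in sorted(vector.items()))


if __name__ == "__main__":
    sample = "3,7,7,42"  # made-up line, not taken from the real data
    print(to_sparse_text(parse_profile(sample)))
```

Only the features that actually occur in a line are stored, which is what makes the representation sparse. Writing these vectors in a form Mahout can read is a separate step that this sketch does not cover.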

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Don.Tan
The original data is here:
[hadoop@datamining ~]$ hadoop fs -ls /home/test/test
Found 1 items
-rw-r--r-- 1 hadoop supergroup 129213799 2012-09-12 15:45 /home/test/test/result
After I ran mahout seqdirectory -i /home/test/test/ -o /home/test/result/ -c UTF-8, I get this:

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Paritosh Ranjan
Can you explain the error and provide the stack trace? On 12-09-2012 14:22, Don.Tan wrote: The original data is here: [hadoop@datamining ~]$ hadoop fs -ls /home/test/test Found 1 items -rw-r--r-- 1 hadoop supergroup 129213799 2012-09-12 15:45 /home/test/test/result After

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Paritosh Ranjan
Also try following the steps in the cluster-reuters.sh file. This might help. On 12-09-2012 15:59, Paritosh Ranjan wrote: Can you explain the error and provide the stack trace? On 12-09-2012 14:22, Don.Tan wrote: The original data is here: [hadoop@datamining ~]$ hadoop fs -ls

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Don.Tan
Thank you for your prompt reply. Can I ask a question before I go on? My original data is in a format like this: 176329,116300,175216,167307,46710,138740,100681,2089,1842, 1206,101702,99210,50460,89605,177424,142901,176464,160625,

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Paritosh Ranjan
I think it shouldn't be sparse to begin with; seq2sparse should take care of that. Someone will correct me if I am wrong, so wait a while and then go ahead. On 12-09-2012 16:07, Don.Tan wrote: Thank you for your prompt reply. Can I ask a question before I go on? My

Re: How to use kmeans clustering algorithm of Mahout

2012-09-12 Thread Don.Tan
I tried it by following the sample code, and I noticed that I should not use seq2sparse directory. That leads to an empty sparse result. Does anyone know how I could deal with that? On 09/12/2012 07:09 PM, Paritosh Ranjan wrote: I think it shouldn't be sparse in the