Re: Need Help in Clustering
On Mon, Jun 24, 2013 at 12:14 PM, Rajan Gupta wrote:

> Do I need to create custom code for this? If so, please help me.

Yes. You definitely need custom code for this.

You also need to think about your data and why you want clusters.

What does age mean to a cluster? Are people with the same age supposed to be the same in some sense? What does a 5-year difference mean? Is the distance from 20 to 25 the same as the difference between 55 and 60?

What about city? How many cities are there? Do you have any sense of which cities are more alike than others?

What about income? Should you perhaps use log(income) for computing distances?

What is "perwt"?

Why is there just one product per line? Which products are more similar than others?
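To make the questions about distance concrete, here is a minimal standalone sketch in plain Python (not Mahout code). It uses the age and income values from the sample rows in this thread. The helper names `euclidean` and `standardize` are illustrative, and the choice of log(income) followed by standardization is only one possible answer to the questions above, not a recommendation from the thread.

```python
import math

# (id, age, income) taken from the sample rows posted in this thread.
rows = [(1, 23, 2200), (2, 26, 6600), (3, 24, 4400), (4, 29, 9900)]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def standardize(column):
    """Rescale numbers to zero mean and unit (population) variance."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


# Raw features: income is in the thousands, age in the tens.
raw = [(age, income) for _, age, income in rows]

# Assumed transformation: log(income), then standardize both columns.
ages = standardize([age for _, age, _ in rows])
log_incomes = standardize([math.log(income) for _, _, income in rows])
scaled = list(zip(ages, log_incomes))

# On raw values the distance between rows 1 and 2 is almost entirely
# the income gap of 4400; the 3-year age gap barely registers.
print(euclidean(raw[0], raw[1]))

# After the transformation both fields contribute on a comparable scale.
print(euclidean(scaled[0], scaled[1]))
```

Whether age and income should carry equal weight is exactly the kind of decision the questions above are asking about; the sketch only shows that leaving the raw values alone is also a decision, one that lets income dominate.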
Re: Need Help in Clustering
Thanks for your response.

Yes, I get clustered points after running k-means. I have done clustering successfully with the 20newsdata and Reuters data. Clusterdump also works properly with those examples.

Now, I have text data in the format:

Id,age,income,perwt,sex,city,product
1,23,2200,40,2,Boston,product #1
--

I want to have output as:

"Id","age","income","perwt","sex","city","product","cluster"
1,23,2200,25,2,"Boston","product #1",1
2,26,6600,30,1,"New york","product #5",3
3,24,4400,48,2,"Portland","product #24",2
4,29,9900,60,1,"San Jose","product #70",4

Can anyone help? Do I need to create custom code for this? If so, please help me.

Thanks in advance,
Regards,
Rajan Gupta

On Mon, Jun 24, 2013 at 12:46 PM, Suneel Marthi wrote:

> How are you converting your data to a sequence file?
> If you are not sure, check this link:
> http://stackoverflow.com/questions/13663567/mahout-csv-to-vector-and-running-the-program
>
> Are you getting any clusteredpoints after running k-means?
>
> For troubleshooting, it would help if you could list the commands you
> executed.
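The output requested in this message (the input rows plus a "cluster" column) can be sketched without Mahout as follows. This is a small in-memory illustration in plain Python, not the Mahout k-means job. The function names, the min-max scaling, the choice of the first k rows as initial centroids, and k=2 are all assumptions made for the example. Rows 2 to 4 of the sample are taken from the desired output above, and the cluster numbers it produces are not meant to match the ones shown there.

```python
import csv
import io

# Row 1 is the posted input row; rows 2-4 come from the desired output.
SAMPLE = """Id,age,income,perwt,sex,city,product
1,23,2200,40,2,Boston,product #1
2,26,6600,30,1,New york,product #5
3,24,4400,48,2,Portland,product #24
4,29,9900,60,1,San Jose,product #70
"""


def min_max(column):
    """Rescale numbers to [0, 1] so no single field dominates the distance."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]


def sq_dist(a, b):
    """Squared Euclidean distance (enough for nearest-centroid comparisons)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k points (an arbitrary choice)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_csv(text, k):
    """Cluster on age and income only, and return the CSV with a cluster column."""
    rows = list(csv.DictReader(io.StringIO(text)))
    ages = min_max([float(r["age"]) for r in rows])
    incomes = min_max([float(r["income"]) for r in rows])
    labels = kmeans(list(zip(ages, incomes)), k)

    out = io.StringIO()
    fieldnames = list(rows[0].keys()) + ["cluster"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row, label in zip(rows, labels):
        row["cluster"] = label + 1  # 1-based cluster numbers, as in the request
        writer.writerow(row)
    return out.getvalue()


print(cluster_csv(SAMPLE, k=2))
```

The point of the sketch is the join at the end: the untouched columns travel with each row, and only the two chosen fields are turned into the vector that is clustered.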
Re: Need Help in Clustering
How are you converting your data to a sequence file? If you are not sure, check this link:
http://stackoverflow.com/questions/13663567/mahout-csv-to-vector-and-running-the-program

Are you getting any clusteredpoints after running k-means?

For troubleshooting, it would help if you could list the commands you executed.

From: Rajan Gupta
To: dev@mahout.apache.org
Sent: Monday, June 24, 2013 3:09 AM
Subject: Need Help in Clustering

> Hi,
> I am new to Mahout.
>
> I have text data in the format:
>
> Id,age,income,perwt,sex,city,product
> 1,23,2200,40,2,Boston,product #1
>
> I want to perform k-means clustering based on two fields, age and
> income, and I also want to specify the number of clusters.
>
> I have already performed clustering by converting the file into
> sequence > vector files, but I get an empty file when running
> clusterdump. I guess there is something wrong in the way the classes
> are written and the way my input file is formatted.
>
> Can anyone help me with how to do this?
>
> Thanks in advance,
> Rajan Gupta
Need Help in Clustering
Hi,
I am new to Mahout.

I have text data in the format:

Id,age,income,perwt,sex,city,product
1,23,2200,40,2,Boston,product #1

I want to perform k-means clustering based on two fields, age and income, and I also want to specify the number of clusters.

I have already performed clustering by converting the file into sequence > vector files, but I get an empty file when running clusterdump. I guess there is something wrong in the way the classes are written and the way my input file is formatted.

Can anyone help me with how to do this?

Thanks in advance,
Rajan Gupta
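The first step described in this message, reducing each record to the two fields of interest, might look like the following in plain Python. This is only a sketch of the parsing step. It does not produce Mahout sequence files, and the function name `read_points` and its `fields` parameter are made up for the illustration.

```python
import csv
import io

# The header and sample row exactly as posted.
SAMPLE = """Id,age,income,perwt,sex,city,product
1,23,2200,40,2,Boston,product #1
"""


def read_points(text, fields=("age", "income")):
    """Return (id, [feature values]) pairs, keeping only the named numeric fields."""
    points = []
    for row in csv.DictReader(io.StringIO(text)):
        points.append((row["Id"], [float(row[name]) for name in fields]))
    return points


points = read_points(SAMPLE)
print(points)
```

Keeping the Id next to each vector is what later allows a cluster assignment to be matched back to its original row.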