Re: Question Regarding Entropy calculation in Mahout

2014-05-23 Thread Ted Dunning
I am sorry, but I don't understand your questions or needs sufficiently to answer. On Wed, Apr 23, 2014 at 12:21 PM, Darshan Sonagara darshan.sonag...@gmail.com wrote: Sir, please reply to me as soon as possible. Thanks in advance. On Tue, Apr 22, 2014 at 11:50 PM, Darshan Sonagara

Re: Question Regarding Entropy calculation in Mahout

2014-05-23 Thread Yash Sharma
Hi Darshan, What I understand from your problem is that:
- You have clustered a few documents
- You want to verify the accuracy of your clustering, and you want to use entropy for that
- You are not sure what the input for the entropy calculation should be.
Possible solution: The entropy would expect a

Re: Question Regarding Entropy calculation in Mahout

2014-05-23 Thread Ted Dunning
Yash, I am not sure how your suggestion will work. The problem is that clustering algorithms tend to make hard assignments. Thus, if you try to compute entropy relative to some reference probability distribution (aka perplexity [1]), then a reference clustering will provide 1 or 0 as the probability.
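
To make the point about hard assignments concrete, here is a minimal Python sketch. It is not Mahout code and is not taken from the thread: the function names and the sample probabilities are made up for illustration, and it assumes entropy is measured as the average negative log2 probability that a reference assigns to each document's observed cluster. With 0/1 probabilities the result can only be zero (full agreement) or infinite (any disagreement); a soft reference gives a finite, graded value.

```python
import math


def avg_neg_log2_prob(probs):
    """Average of -log2(p) over documents, in bits.

    probs[i] is the probability the reference gives to the cluster
    that document i was actually assigned to (hypothetical input).
    """
    total = 0.0
    for p in probs:
        if p == 0.0:
            # log(0) is undefined; one "impossible" document makes
            # the whole average infinite.
            return math.inf
        total += -math.log2(p)
    return total / len(probs)


def perplexity(probs):
    """Perplexity is 2 raised to the average negative log2 probability."""
    h = avg_neg_log2_prob(probs)
    return math.inf if math.isinf(h) else 2.0 ** h


# Hard reference clustering: probabilities are only ever 1 or 0.
hard_all_agree = [1.0, 1.0, 1.0]
hard_one_disagrees = [1.0, 0.0, 1.0]

# Soft (probabilistic) reference: illustrative values only.
soft = [0.9, 0.6, 0.8]

print(avg_neg_log2_prob(hard_all_agree))      # zero bits
print(avg_neg_log2_prob(hard_one_disagrees))  # infinite
print(avg_neg_log2_prob(soft))                # finite, between the two
```

Under these assumptions, a hard reference clustering cannot distinguish a clustering that is slightly off from one that is badly off, since both score as infinite.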

Re: Question Regarding Entropy calculation in Mahout

2014-05-23 Thread Yash Sharma
Well, I was not aware of the perplexity calculation. Your point makes perfect sense. Entropies calculated independently for each cluster would not serve any purpose. So the question moves back to the questioner and I'd move back to textbooks :) Peace, Yash On Sat, May 24, 2014 at 12:01 AM, Ted

Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara
I am a final-year BE student from Gujarat, India, currently studying in the Information Technology branch. My final-year project is document clustering using Hadoop. At this stage I am able to get the final result from the cluster dump command, in which I can see the number of documents in a particular cluster and

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Ted Dunning
On Tue, Apr 22, 2014 at 12:11 AM, Darshan Sonagara darshan.sonag...@gmail.com wrote: But the problem is that I want to check whether my clustering is good or bad, so for that I need to calculate an entropy value. I do not have any idea how to calculate entropy in Mahout or by other

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara
Thanks for the reply, sir. Actually, I am doing clustering to gather similar kinds of documents in the same cluster as much as possible. I can see this in the cluster dump output file by observing the top terms. I also figured out that the result differs when I vary the distance measure. But I want some

Re: Question Regarding Entropy calculation in Mahout

2014-04-22 Thread Darshan Sonagara
Waiting for the reply, sir. On Tue, Apr 22, 2014 at 7:13 PM, Darshan Sonagara darshan.sonag...@gmail.com wrote: Thanks for the reply, sir. Actually, I am doing clustering to gather similar kinds of documents in the same cluster as much as possible. I can see this in the cluster dump output file