Re: randomseedgenerator

Grant Ingersoll Wed, 01 Jul 2009 13:25:05 -0700


On Jul 1, 2009, at 2:07 PM, Adil Aijaz wrote:

I was looking at the RandomSeedGenerator and, correct me if I am wrong, but it is not really random; rather, it does a bunch of Bernoulli trials where the points at the beginning of your data are always going to have a higher chance of being selected than those near the end.

I was just going off of Ted's suggestion that for k-Means it wasn't really all that important to be truly random for the initial seeds. We discussed PRNGs and an M/R way of doing it, but I didn't think it was necessary for this. Fine if someone else wants to take it up.

Maybe that's not a problem since, given sufficient iterations, k-means should converge toward a solution. But I thought I'd point it out in case there is an issue here.


Understood.
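
For anyone who does want to take it up: one common way to avoid the position bias described above is reservoir sampling, which picks k seeds in a single pass so that every point has the same chance of ending up in the result. The sketch below is only an illustration of that technique, not the RandomSeedGenerator code; the class name ReservoirSeeds and the method sample are made up for this example, and it assumes the points can be read sequentially on one machine (it is not an M/R version).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper for illustration only; not part of Mahout.
final class ReservoirSeeds {

    // Returns up to k items chosen uniformly at random from points,
    // reading the input exactly once (reservoir sampling, Algorithm R).
    // Assumes the input holds fewer than Integer.MAX_VALUE items.
    static <T> List<T> sample(Iterable<T> points, int k, Random rng) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0;
        for (T point : points) {
            seen++;
            if (reservoir.size() < k) {
                // The first k points fill the reservoir directly.
                reservoir.add(point);
            } else {
                // Draw j uniformly from [0, seen). The new point replaces
                // slot j when j < k, which happens with probability k / seen.
                int j = rng.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, point);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> points = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            points.add(i);
        }
        // Pick 5 initial seeds; the fixed RNG seed just makes runs repeatable.
        System.out.println(sample(points, 5, new Random(42L)));
    }
}
```

Since the replacement probability shrinks as k / seen, late points are not penalized relative to early ones, which is the property the Bernoulli-trial approach lacks. Whether that matters for k-means seeding is a separate question, as noted above.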


Adil


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:

http://www.lucidimagination.com/search
