Andrey Davydov created MAHOUT-1130:
--------------------------------------

             Summary: Wrong logic in org.apache.mahout.clustering.kmeans.RandomSeedGenerator
                 Key: MAHOUT-1130
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1130
             Project: Mahout
          Issue Type: Bug
         Environment: mahout 0.7 from maven central
            Reporter: Andrey Davydov

Line 101 contains the following code:

    } else if (random.nextInt(currentSize + 1) != 0) { // with chance 1/(currentSize+1) pick new element

The condition does not match the comment. random.nextInt(currentSize + 1) != 0 is true with probability currentSize/(currentSize+1), not 1/(currentSize+1), so a new element is picked almost every time and the generator ends up taking the initial centers from the end of the source data file.

It seems that the chance of replacing a vector in the output set should decrease with the number of processed input vectors.
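
One way to make the replacement chance decrease with the number of processed vectors is classic reservoir sampling (Algorithm R). The sketch below is only an illustration of that idea, not Mahout's code and not a proposed patch. The class name ReservoirSketch and the method sample are hypothetical, and plain Integer items stand in for the input vectors. After the first k items, the n-th item replaces a random chosen item with probability k/n, so every input item ends up in the output with the same chance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of Mahout.
public class ReservoirSketch {

  // Picks up to k items uniformly at random from an input of unknown length.
  static <T> List<T> sample(Iterable<T> input, int k, Random random) {
    List<T> chosen = new ArrayList<>();
    int seen = 0; // number of input items processed so far
    for (T item : input) {
      seen++;
      if (chosen.size() < k) {
        // Fill the output set with the first k items.
        chosen.add(item);
      } else {
        // j is uniform in [0, seen); j < k holds with probability k/seen,
        // which shrinks as more input is processed.
        int j = random.nextInt(seen);
        if (j < k) {
          chosen.set(j, item); // evict the item at the random slot j
        }
      }
    }
    return chosen;
  }

  public static void main(String[] args) {
    List<Integer> data = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      data.add(i);
    }
    // Output varies with the seed; indices should be spread over the whole
    // input instead of clustering near the end.
    System.out.println(sample(data, 5, new Random()));
  }
}
```

By contrast, a test of the form random.nextInt(currentSize + 1) != 0 uses a bound that stays fixed once the output set is full, so late items keep overwriting earlier ones at a constant high rate.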