[ https://issues.apache.org/jira/browse/SPARK-6002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-6002.
------------------------------
    Resolution: Won't Fix

> MLLIB should support the RandomIndexing transform
> -------------------------------------------------
>
>                 Key: SPARK-6002
>                 URL: https://issues.apache.org/jira/browse/SPARK-6002
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.2.1
>            Reporter: Derrick Burns
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> MLLIB offers the HashingTF. However, this simple transform offers no
> guarantees on the relationship between the input and the output.
> Instead of the HashingTF, MLLIB should offer Random Indexing
> (http://en.wikipedia.org/wiki/Random_indexing) which does offer such
> guarantees.
> The K-means clusterer at
> https://github.com/derrickburns/generalized-kmeans-clustering includes an
> implementation of the Random Indexing transform.
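
For readers unfamiliar with the two transforms contrasted in the issue description, here is a minimal sketch in plain Python. It is not the MLlib HashingTF code, nor the implementation in the linked repository. The function names (`index_vector`, `random_indexing`, `hashing_tf`), the defaults `dim=64` and `nonzeros=4`, and the use of SHA-256 as the hash are all assumptions made for illustration. The "guarantees" the reporter mentions are presumably the approximate distance preservation usually associated with random projections; the sketch does not verify that property.

```python
import hashlib
import random
from collections import Counter


def _stable_hash(term):
    # Stable 64-bit hash of a term (Python's built-in hash() varies per process).
    return int.from_bytes(hashlib.sha256(term.encode("utf-8")).digest()[:8], "big")


def index_vector(term, dim=64, nonzeros=4):
    # Sparse ternary index vector: `nonzeros` distinct positions, each +1 or -1.
    # The RNG is seeded from the term, so a term always gets the same vector.
    rng = random.Random(_stable_hash(term))
    positions = rng.sample(range(dim), nonzeros)
    return {pos: rng.choice((-1, 1)) for pos in positions}


def random_indexing(tokens, dim=64, nonzeros=4):
    # Document vector = sum of the index vectors of its terms, weighted by count.
    vec = [0.0] * dim
    for term, count in Counter(tokens).items():
        for pos, sign in index_vector(term, dim, nonzeros).items():
            vec[pos] += sign * count
    return vec


def hashing_tf(tokens, dim=64):
    # Hashing-trick term frequencies: each term adds its count to one bucket.
    vec = [0.0] * dim
    for term, count in Counter(tokens).items():
        vec[_stable_hash(term) % dim] += count
    return vec


if __name__ == "__main__":
    doc = "spark mllib should support random indexing".split()
    print(random_indexing(doc))
    print(hashing_tf(doc))
```

In this sketch the hashing transform sends each term to a single bucket, so two colliding terms become indistinguishable, whereas random indexing spreads each term over several signed positions so that collisions tend to cancel rather than accumulate.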