I have just written a replacement. I will post a patch as soon as I get some solid testing done.

On Sat, Aug 29, 2009 at 2:29 PM, Grant Ingersoll <gsing...@apache.org> wrote:

> Right, Colt likely could be used depending on the package it comes from and
> as long as it doesn't have deps on the other packages.
>
> -Grant
>
>
> On Aug 29, 2009, at 2:22 PM, Ted Dunning wrote:
>
>> Trove is LGPL so we can't lift code. Even linking can be tricky.
>>
>> On Fri, Aug 28, 2009 at 10:06 AM, Shashikant Kore (JIRA) <j...@apache.org> wrote:
>>
>>> [ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748904#action_12748904 ]
>>>
>>> Shashikant Kore commented on MAHOUT-165:
>>> ----------------------------------------
>>>
>>> I'm fine with copying relevant classes from Colt or Trove.
>>>
>>> Please let me know your library of choice. I will create the patch and
>>> upload.
>>>
>>>> Using better primitives hash for sparse vector for performance gains
>>>> --------------------------------------------------------------------
>>>>
>>>>                 Key: MAHOUT-165
>>>>                 URL: https://issues.apache.org/jira/browse/MAHOUT-165
>>>>             Project: Mahout
>>>>          Issue Type: Improvement
>>>>          Components: Matrix
>>>>    Affects Versions: 0.2
>>>>            Reporter: Shashikant Kore
>>>>             Fix For: 0.2
>>>>
>>>>         Attachments: mahout-165.patch
>>>>
>>>>
>>>> In SparseVector, we need a primitive hash map for indices and values.
>>>> The present implementation of this hash map is not as efficient as
>>>> some of the other implementations in non-Apache projects.
>>>>
>>>> In an experiment, I found that, for get/set operations, the primitive
>>>> hash of Colt performs an order of magnitude better than
>>>> OrderedIntDoubleMapping. For iteration it is 2x slower, though.
>>>>
>>>> Using Colt in SparseVector improved the performance of canopy
>>>> generation. For an experimental dataset, the current implementation
>>>> takes 50 minutes. Using Colt reduces this duration to 19-20 minutes.
>>>> That's a 60% reduction in the delay.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
>>
>> --
>> Ted Dunning, CTO
>> DeepDyve
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search

--
Ted Dunning, CTO
DeepDyve
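
For readers who have not seen one, the kind of primitive int-to-double hash map that MAHOUT-165 asks for can be sketched roughly as below. This is only an illustration of open addressing with linear probing; it is not the code from Colt, Trove, or any Mahout patch, and the class name IntDoubleHashSketch, the -1 "free slot" sentinel, and the 50% load factor are all assumptions made for the example. It also assumes vector indices are non-negative and omits removal.

```java
import java.util.Arrays;

/** Hypothetical sketch only; not taken from Colt, Trove, or Mahout. */
public class IntDoubleHashSketch {
    // Assumption: vector indices are never negative, so -1 can mark a free slot.
    private static final int FREE = -1;

    private int[] keys;       // vector indices, stored unboxed
    private double[] values;  // values, parallel to keys
    private int size;

    public IntDoubleHashSketch(int expectedSize) {
        int capacity = 16;
        while (capacity < expectedSize * 2) {
            capacity <<= 1;   // keep capacity a power of two so we can mask
        }
        keys = new int[capacity];
        values = new double[capacity];
        Arrays.fill(keys, FREE);
    }

    /** Returns the slot holding key, or the free slot where it would go. */
    private int slot(int key) {
        int mask = keys.length - 1;
        int h = key * 0x9E3779B9;          // cheap multiplicative scramble
        int i = (h ^ (h >>> 16)) & mask;
        while (keys[i] != FREE && keys[i] != key) {
            i = (i + 1) & mask;            // linear probing
        }
        return i;
    }

    /** Returns the stored value, or 0.0 for an index that was never set. */
    public double get(int index) {
        int i = slot(index);
        return keys[i] == index ? values[i] : 0.0;
    }

    public void set(int index, double value) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        int i = slot(index);
        if (keys[i] == FREE) {
            if ((size + 1) * 2 > keys.length) {  // keep the table at most half full
                grow();
                i = slot(index);
            }
            keys[i] = index;
            size++;
        }
        values[i] = value;
    }

    /** Doubles the table and reinserts every occupied slot. */
    private void grow() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new double[oldKeys.length * 2];
        Arrays.fill(keys, FREE);
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldKeys[j] != FREE) {
                int i = slot(oldKeys[j]);
                keys[i] = oldKeys[j];
                values[i] = oldValues[j];
            }
        }
    }

    public int size() {
        return size;
    }
}
```

Because keys land in scattered slots, get and set cost a short probe on average instead of a search over a sorted array. Walking the entries, however, means scanning the whole table including its free slots, which is one plausible reason iteration could come out slower than with an ordered mapping.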