Simple answer: the Harmony team didn't code as well as the Sun people did. This is not my metier, so if someone else can suggest algorithmic improvements ...
On Tue, Dec 29, 2009 at 8:06 AM, Robin Anil (JIRA) <j...@apache.org> wrote: > > [ > https://issues.apache.org/jira/browse/MAHOUT-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795047#action_12795047 > ] > > Robin Anil commented on MAHOUT-230: > ----------------------------------- > > I ran the SortingTest instead of 3.2 sec it now takes 5.2 seconds. I repeated > the patch and rechecked. Any idea why the perf dip? Notice the perf drop in > the second block i.e. after the line break in each block > > {code:xml|title=Original Output} > > <testcase time="0.016" classname="org.apache.mahout.math.SortingTest" > name="testBinarySearch"/> > <testcase time="0.015" classname="org.apache.mahout.math.SortingTest" > name="testBinarySearchObjects"/> > <testcase time="0.032" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortBytes"/> > <testcase time="0.234" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortChars"/> > <testcase time="0.188" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortInts"/> > <testcase time="0.171" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortLongs"/> > <testcase time="0.204" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortShorts"/> > <testcase time="0.218" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortFloats"/> > <testcase time="0.235" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortDoubles"/> > <testcase time="0.062" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortBytes"/> > > <testcase time="0.313" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortChars"/> > <testcase time="0.281" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortInts"/> > <testcase time="0.391" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortLongs"/> > <testcase time="0.25" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortShorts"/> > <testcase time="0.359" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortFloats"/> > <testcase time="0.328" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortDoubles"/> > {code} > > {code:xml|title=After Patching} > > <testcase time="0.031" classname="org.apache.mahout.math.SortingTest" > name="testBinarySearch"/> > <testcase time="0" classname="org.apache.mahout.math.SortingTest" > name="testBinarySearchObjects"/> > <testcase time="0" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortBytes"/> > <testcase time="0.234" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortChars"/> > <testcase time="0.234" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortInts"/> > <testcase time="0.172" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortLongs"/> > <testcase time="0.266" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortShorts"/> > <testcase time="0.235" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortFloats"/> > <testcase time="0.25" classname="org.apache.mahout.math.SortingTest" > name="testQuickSortDoubles"/> > <testcase time="0.015" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortBytes"/> > > <testcase time="0.594" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortChars"/> > <testcase time="0.61" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortInts"/> > <testcase time="0.781" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortLongs"/> > <testcase time="0.562" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortShorts"/> > <testcase time="0.578" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortFloats"/> > <testcase time="0.61" classname="org.apache.mahout.math.SortingTest" > name="testMergeSortDoubles"/> > {code} > >> Replace org.apache.mahout.math.Sorting with code of clear provenance >> -------------------------------------------------------------------- >> >> Key: MAHOUT-230 >> URL: https://issues.apache.org/jira/browse/MAHOUT-230 >> Project: Mahout >> Issue Type: Bug >> Components: Math >> Affects Versions: 0.3 >> Reporter: Benson Margulies >> Assignee: Benson Margulies >> Fix For: 0.3 >> >> Attachments: replace-sorting.diff >> >> Original Estimate: 72h >> Remaining Estimate: 72h >> >> org.apache.mahout.math.Sorting looks as if the original author borrowed from >> the Sun JRE, based on the private internal function names and contents. That >> code has a restrictive license. We need to take the equivalent file >> (java.util.Arrays) from Apache Harmony and use it as the basis for a clean >> replacement. >> The problematic code are the quickSort and mergeSort functions, which extend >> 'Arrays' by supporting slices of arrays and custom sorting predicate >> functions. >> One might also wistfully note that the more recent JDKs from Sun have >> deployed different (and one hopes) better sort algorithms that 1.5 and/or >> Harmony, so a really energetic person might build implementations in here to >> match. However, expediency calls for just bashing on the Harmony >> implementation to solve the problem at hand. > > -- > This message is automatically generated by JIRA. > - > You can reply to this email to add a comment to the issue online. > >