Hi, I'm working on an abstraction to sort random-access data-structures for Lucene[1] and would like to know if you think this is interesting enough for inclusion in Apache Commons.
The reason for such an abstraction is that we often need to sort parallel arrays and the JDK provides no way to do it easily. This is why we first borrowed SorterTemplate[2] from CGlib: provided the definition of some primitive methods (compare and swap), it can sort any sub-sequence of any random-access data-structure. We recently faced the need to implement TimSort (LUCENE-4839[3]) for some use-cases, but unfortunately it requires a different set of primitive operations to be properly implemented. This is why I started working on LUCENE-4946[1] to be able to have different primitive operations depending on the sorting algorithm. A few more notes about this sorting abstraction: - since it only assumes the data supports random-access, it can be used to sort off-heap or commons-primitives collections, - it provides some useful sorting algorithms, in particular introsort[4] and a modified Timsort[5], - performance is comparable to the JDK's Arrays.sort in spite of the abstraction, - it is faster and more memory-efficient than Collections.sort for random-access lists, since the latter dumps the content of the list into an array to sort it with Arrays.sort. What do you think? [1] https://issues.apache.org/jira/browse/LUCENE-4946 [2] http://grepcode.com/file/repo1.maven.org/maven2/cglib/cglib/2.2.2/net/sf/cglib/util/SorterTemplate.java#SorterTemplate [3] https://issues.apache.org/jira/browse/LUCENE-4839 [4] http://fr.wikipedia.org/wiki/Introsort [5] http://en.wikipedia.org/wiki/Timsort -- Adrien --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org