[ https://issues.apache.org/jira/browse/LUCENE-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12550215 ]
Karl Wettin commented on LUCENE-550:
------------------------------------

{quote}
Grant Ingersoll - 10/Dec/07 02:11 PM
> courtesy of Olivier Chafik
What does this mean? He contributed the code personally or you got it from him? In other words, do you have the authority to assign the ASF copyright for said code?
{quote}

Yes,

http://ochafik.free.fr/blog/?p=106

Karl Wettin said:
20 October 2007 at 7:54 pm

Hi Olivier,

I was just going nuts over the lack of offset and length in Collections.binarySearch. I was thinking that perhaps a subList would be OK, but it turns out that the overhead of AbstractList.subList (in my case an ArrayList) is huge. It takes 1/3 the time to search the complete subList owner of 5000 instances compared to instantiating and binary searching a subList(2500, 5000).

Google suggested your blog post. I have based some unreleased optimizations in http://issues.apache.org/jira/browse/LUCENE-550 on your code. Would you mind donating it to the Apache Software Foundation? Lucene does not state author credits in source code, only in CHANGES.TXT.

LUCENE-550 is an alternative RAM index store that is up to 100x faster than the standard RAMDirectory, and it is built to support my machine learning projects such as http://issues.apache.org/jira/browse/LUCENE-626 and http://issues.apache.org/jira/browse/LUCENE-1025

zOlive said:
21 October 2007 at 9:02 am

Hi Karl,

Thanks for your message, I'm happy to hear that someone actually made some use of this code!

Apart from the offset feature, the only distinguishing feature of my code is its relative speed for lookups in sorted integer lists, and I'm unsure whether that is exactly your use case.

However, I will be more than pleased to contribute this tiny piece of code to Apache, and I must say I'm a bit surprised that there isn't such a method in any of their projects yet (say, in Jakarta Commons - http://commons.apache.org/collections/).

Where shall I post it?

Karl Wettin said:
21 October 2007 at 4:32 pm

Thanks!
You don't need to post it anywhere; I have simply pasted it into this class of mine and adapted it to fit my needs. It is indeed an int[] (actually MyClass[].getInt()) I'm searching in, so the variable pivot is most welcome.

> InstantiatedIndex - faster but memory consuming index
> -----------------------------------------------------
>
> Key: LUCENE-550
> URL: https://issues.apache.org/jira/browse/LUCENE-550
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Store
> Affects Versions: 2.0.0
> Reporter: Karl Wettin
> Assignee: Grant Ingersoll
> Attachments: HitCollectionBench.jpg, LUCENE-550_20071021_no_core_changes.txt, test-reports.zip
>
>
> Represented as a coupled graph of class instances, this all-in-memory index
> store implementation delivers search results up to 100 times faster than
> the file-centric RAMDirectory at the cost of greater RAM consumption.
> Performance seems to be a little bit better than log2n (binary search). No
> real data on that, just my eyes.
> Populated with a single document InstantiatedIndex is almost, but not quite,
> as fast as MemoryIndex.
> At 20,000 documents of 10-50 characters InstantiatedIndex outperforms
> RAMDirectory some 30x,
> 15x at 100 documents of 2000 characters length,
> and is linear to RAMDirectory at 10,000 documents of 2000 characters length.
> Mileage may vary depending on term saturation.
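
For readers following the exchange above, here is a minimal sketch of a binary search that takes an offset and a length, so that no subList has to be instantiated. It is not Olivier Chafik's code and not the code in the LUCENE-550 patch: it is a plain textbook binary search over an int[], and the class name RangeBinarySearch is hypothetical. It does not attempt the "variable pivot" mentioned in the comment. The return convention (index if found, otherwise -(insertion point) - 1) is borrowed from java.util.Arrays.binarySearch.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Lucene or of the LUCENE-550 patch.
public class RangeBinarySearch {

    /**
     * Searches a[offset .. offset + length) for key.
     * The searched range must be sorted in ascending order.
     * Returns the index of key if found, otherwise -(insertionPoint) - 1.
     */
    public static int binarySearch(int[] a, int offset, int length, int key) {
        int low = offset;
        int high = offset + length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1; // unsigned shift avoids int overflow
            int value = a[mid];
            if (value < key) {
                low = mid + 1;
            } else if (value > key) {
                high = mid - 1;
            } else {
                return mid; // key found
            }
        }
        return -(low + 1); // key not found; low is the insertion point
    }

    public static void main(String[] args) {
        // 5000 sorted values; search only the upper half, as in the comment above.
        int[] sorted = new int[5000];
        for (int i = 0; i < sorted.length; i++) {
            sorted[i] = i * 2;
        }
        int hit = binarySearch(sorted, 2500, 2500, 6000);
        int miss = binarySearch(sorted, 2500, 2500, 6001);
        System.out.println(hit);
        System.out.println(miss);
        // Cross-check against the JDK's own range overload (Java 6 and later).
        System.out.println(hit == Arrays.binarySearch(sorted, 2500, 5000, 6000));
    }
}
```

On Java 6 and later, Arrays.binarySearch(int[] a, int fromIndex, int toIndex, int key) covers the same need for primitive arrays. A hand-written version like the one above is mainly useful when the values come from an accessor on an object array, as in the MyClass[].getInt() case described in the comment.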