On 10/10/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
Hi,

Maybe I missed it, but I was surprised that nobody here wondered about the 
algorithm and data structure changes that Dave Balmain made in Ferret, to make 
it go faster (than Java Lucene).  I know I've been wondering whether/when Dave 
will bring those up, and what the chances of those changes being applied to 
Java Lucene are.

Here is an interesting and recent interview with Dave that mentions some of 
this stuff.

  http://on-ruby.blogspot.com/2006/10/ruby-hacker-interview-dave-balmain.html

Otis


Hi Otis,

I did bring this up here:

http://mail-archives.apache.org/mod_mbox/lucene-java-dev/200607.mbox/[EMAIL PROTECTED]

The reason I didn't press the issue was that the changes are pretty
substantial and would break backwards compatibility in Lucene. Also, I
didn't think the major performance benefits would map back to Java
since I'm taking advantage of the fact that I have so much control
over memory allocation in C.

Given these factors and the fact that benchmarks can be a very touchy
subject, particularly in the Java community, I thought it better to
leave any performance comparison off this list. It looks like the cat
is out of the bag now, so I'll put some benchmarks up on my Wiki and
everyone can check that I haven't cheated or made any mistakes. I'll
use the Reuters collection:

   http://www.daviddlewis.com/resources/testcollections/reuters21578/

If anyone thinks I should use a different corpus, please let me know.
I also have the entire Gutenberg collection here. I'll post a link
when I'm done.

Cheers,
Dave

