Re: [Ferret-talk] Indexing Speed?

David Balmain Fri, 05 May 2006 07:42:22 -0700

Hi Steven,

Once you made those changes, were the indexes approximately the same
size? You'll get the most accurate results if the indexes are
identical. Also, which version of Ferret are you using? I just tried
200 MB here (~600 files). In my case all of it is text and everything
gets indexed. Lucene took ~120 seconds and Ferret took ~55 seconds.
Both indexes are identical. I'm using the Sun JVM.

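If it helps, here is a rough way to check whether the two indexes are
about the same size. It is only a sketch: the class and method names
(IndexSizeCheck, directorySize, relativeDifference) are made up for
illustration and are not part of Lucene or Ferret, and it assumes each
index lives in its own directory on disk. It only compares total bytes,
so a similar size suggests, but does not prove, that the same content
was indexed.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Lucene or Ferret.
class IndexSizeCheck {

    // Sums the sizes of all regular files under dir, recursing into
    // subdirectories. Returns 0 if dir is not a readable directory.
    static long directorySize(File dir) {
        long total = 0;
        File[] entries = dir.listFiles();
        if (entries == null) {
            return 0;
        }
        for (File entry : entries) {
            total += entry.isDirectory() ? directorySize(entry) : entry.length();
        }
        return total;
    }

    // Difference between two sizes as a fraction of the larger one.
    static double relativeDifference(long a, long b) {
        long larger = Math.max(a, b);
        return larger == 0 ? 0.0 : Math.abs(a - b) / (double) larger;
    }

    // Usage: java IndexSizeCheck <lucene-index-dir> <ferret-index-dir>
    public static void main(String[] args) {
        long lucene = directorySize(new File(args[0]));
        long ferret = directorySize(new File(args[1]));
        System.out.printf("lucene=%d bytes, ferret=%d bytes, diff=%.1f%%%n",
                lucene, ferret, 100 * relativeDifference(lucene, ferret));
    }
}
```

If the difference is large, the two programs are probably not indexing
the same fields or files, and the timings are not comparable.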

I look forward to your reply.

Cheers,
Dave


On 5/5/06, steven <[EMAIL PROTECTED]> wrote:
> Hi Dave,
>
> Thanks very much for getting back to me.
>
> You were right about the indexes being different...
>
> Your snippet has helped, but it is still nowhere near as fast as the
> Java version:
>
> doc.add(new Field("path", f.getPath(),
>     Field.Store.YES, Field.Index.UN_TOKENIZED));
> doc.add(new Field("modified",
>     DateTools.timeToString(f.lastModified(), DateTools.Resolution.MINUTE),
>     Field.Store.YES, Field.Index.UN_TOKENIZED));
> doc.add(new Field("contents", new FileReader(f)));
>
> Could it be that Ruby's file.readlines is slower than Java's FileReader?
>
> Another possible snafu is that the directory contains loads of PDFs and
> other binary files which neither Lucene nor Ferret can index - could it
> be that Ferret is slower at dealing with things like that? (Just a
> thought)
>
> Would love to hear any thoughts.
>
> Many Thanks,
> Steven.
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk
>

