Matthew Talbert wrote:

> OK, here are results. All tests are done with my previous changes; the
> only difference is the first index has stop words, the second doesn't.

>           with stop words   without stop words
> KJV       7.3MB             6.3MB
> Finney    654KB             518KB
> ESV       5.9MB             5.0MB

So roughly 20% extra (about 16% for KJV, 18% for ESV and 26% for
Finney).  I see no reason not to go for it -- but then, I'm a desktop
user with a monstrous 640GB hard drive :)  Are there situations and
systems where this would be a significant issue?
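
For anyone who wants to check the arithmetic, here is a quick Python
sketch that works out the overhead from the three figures quoted above.
The function name and the choice to measure the overhead relative to
the smaller (no stop words) index are my own assumptions, not anything
from SWORD itself.

```python
# Index sizes from the quoted results: (with stop words, without stop words).
# KJV and ESV are in MB, Finney is in KB; only the ratio matters here.
sizes = {
    "KJV": (7.3, 6.3),
    "Finney": (654, 518),
    "ESV": (5.9, 5.0),
}


def overhead_percent(with_stop, without_stop):
    """Extra space used by keeping stop words, relative to the smaller index."""
    return (with_stop - without_stop) / without_stop * 100


overheads = {name: overhead_percent(w, wo) for name, (w, wo) in sizes.items()}

for name, pct in overheads.items():
    print(f"{name}: {pct:.1f}% extra")
```

The per-module numbers spread out a fair bit, so "roughly 20%" is a
ballpark rather than a constant; the small Finney module pays
proportionally the most.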

> For those wondering why a search for "the lord" doesn't segfault, it's
> only when you search for a stop word alone that there is a segfault.
> If you want to talk about confusing users, the current system would
> seem illogical (I searched for "god is" and got nothing??).

Agreed.  Unless the 20% extra space requirement is really an issue in
some circumstances, it looks like the right approach would be to just
index everything, and so get more correct search results.
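
To illustrate why indexing everything tends to give more correct
results, here is a toy inverted index in Python.  This is only a sketch
under my own assumptions: the stop-word list, the two sample verses and
the all-words-must-match search are made up for the example, and it
does not reproduce the segfault or the actual SWORD/CLucene code.

```python
# Hypothetical stop-word list and sample texts, for illustration only.
STOP_WORDS = {"the", "is", "a", "of"}

VERSES = {
    "1 John 4:8": "god is love",
    "Psalm 23:1": "the lord is my shepherd",
}


def build_index(verses, drop_stop_words):
    """Map each word to the set of references whose text contains it."""
    index = {}
    for ref, text in verses.items():
        for word in text.split():
            if drop_stop_words and word in STOP_WORDS:
                continue  # stop words never make it into the index
            index.setdefault(word, set()).add(ref)
    return index


def search_all(index, query):
    """Return the references that contain every word of the query."""
    hits = [index.get(word, set()) for word in query.split()]
    return set.intersection(*hits) if hits else set()


filtered = build_index(VERSES, drop_stop_words=True)
full = build_index(VERSES, drop_stop_words=False)

print(search_all(filtered, "god is"))  # "is" was never indexed
print(search_all(full, "god is"))
```

In this toy model the filtered index finds nothing for "god is" because
"is" has no entry at all, while the full index finds the verse.  A real
engine could instead strip stop words from the query as well, at the
cost of no longer being able to search for them.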

Jonathan

_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
