On Feb 28, 2009, at 8:30 PM, Matthew Talbert wrote:


Bible Desktop/JSword (and thus AlKitab and FireBible) support proximity searches across verse boundaries. The syntax is reminiscent of Lucene's, but a bit different.
For example, Aaron ~3 Moses will find all verses with Aaron within 3 verses of Moses.

If I may ask, how do you implement this with lucene? Is it totally
separate from the indexed search?


We build it on top of Lucene.

In JSword, search results (and partial search results) are expressed as bitmaps, with Gen 1:1 being bit 1, Gen 1:2 being bit 2, and so forth. (Actually, we have several representations, but this is conceptually the easiest to understand.)

If the term ~n (where n is a number) is present, the request is split into two parts (before and after the ~n). These are searched separately, each producing its own bitmap.

We "blur" the second bitmap (the one for Moses) by 3, so that every bit within 3 positions of a set bit is also set. Then we take the intersection of the two bitmaps.
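
To make the blur-and-intersect idea concrete, here is a minimal sketch using java.util.BitSet. It is not the code in BlurQuery: the class and method names (ProximitySketch, blur, near) are made up for illustration, verses are assumed to be numbered from 0 rather than 1, and the blur ignores book and chapter boundaries, which the real code may treat differently.

```java
import java.util.BitSet;

// Hypothetical illustration of "left ~n right"; not JSword's BlurQuery.
final class ProximitySketch {

    // Returns a new bitmap in which every set bit of `hits` is spread to the
    // `distance` positions on either side, clipped to [0, size).
    // Assumes every set bit in `hits` is below `size`.
    static BitSet blur(BitSet hits, int distance, int size) {
        BitSet blurred = new BitSet(size);
        for (int i = hits.nextSetBit(0); i >= 0; i = hits.nextSetBit(i + 1)) {
            int from = Math.max(0, i - distance);
            int to = Math.min(size, i + distance + 1); // exclusive upper bound
            blurred.set(from, to);
        }
        return blurred;
    }

    // Verses matching `left` that lie within `distance` verses of a `right` match.
    static BitSet near(BitSet left, BitSet right, int distance, int size) {
        BitSet result = blur(right, distance, size); // blur the second bitmap
        result.and(left);                            // intersect with the first
        return result;
    }

    public static void main(String[] args) {
        int size = 20; // pretend the whole text has 20 verses
        BitSet aaron = new BitSet(size);
        aaron.set(2);
        aaron.set(10);
        BitSet moses = new BitSet(size);
        moses.set(4);
        moses.set(17);
        // Moses blurred by 3 covers 1..7 and 14..19; only Aaron's hit at 2 survives.
        System.out.println(near(aaron, moses, 3, size)); // prints {2}
    }
}
```

Because only the second bitmap is blurred, the result contains the verses that match the first term, which fits the "Aaron within 3 verses of Moses" reading above.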

I've left out a couple of details, but you can see all the details in org.crosswire.jsword.index.query.BlurQuery and the routines it calls.

Another feature JSword layers on Lucene is range restriction.
A search preceded by [passage range list] will return only verses in that range. If it is preceded by -[range], it will return only verses that are not in that range. The passage range list can be anything that JSword understands as a list of passage ranges, such as Gen-Ex, Matt 5, Rev 2:3.

This simply builds a bitmap for the range and intersects it with the result of the search following the range.
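
A rough sketch of the same idea, again with java.util.BitSet. Everything here is hypothetical: ranges are given as already-resolved {first, last} verse ordinals (parsing something like "Gen-Ex" is JSword's job and is not shown), and handling the -[range] form with andNot is my assumption about one natural way to do it, not a description of the actual code.

```java
import java.util.BitSet;

// Hypothetical illustration of range restriction; not JSword's implementation.
final class RangeRestrictionSketch {

    // Builds a bitmap with every verse ordinal in each inclusive
    // {first, last} range set.
    static BitSet rangeBitmap(int[][] ranges, int size) {
        BitSet bits = new BitSet(size);
        for (int[] r : ranges) {
            bits.set(r[0], r[1] + 1); // BitSet.set takes an exclusive upper bound
        }
        return bits;
    }

    // [range] query: keep only hits inside the range.
    static BitSet restrict(BitSet hits, BitSet range) {
        BitSet result = (BitSet) hits.clone();
        result.and(range);
        return result;
    }

    // -[range] query: keep only hits outside the range (assumed to be andNot).
    static BitSet exclude(BitSet hits, BitSet range) {
        BitSet result = (BitSet) hits.clone();
        result.andNot(range);
        return result;
    }

    public static void main(String[] args) {
        int size = 50;
        BitSet hits = new BitSet(size);
        hits.set(1);
        hits.set(5);
        hits.set(12);
        hits.set(30);
        BitSet range = rangeBitmap(new int[][] {{0, 9}, {25, 40}}, size);
        System.out.println(restrict(hits, range)); // prints {1, 5, 30}
        System.out.println(exclude(hits, range));  // prints {12}
    }
}
```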


In Christ,
        DM








_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
