DM Smith wrote:
> We already have a solution, and it is external to Lucene. We look for
> hits on things that are to be adjacent, get their "canonical"
> reference and then compare the distances between these. While this
> works well, I was hoping for a solution within Lucene.

> This does not give us the ability to look for phrases across verse boundaries.

Yes and no. Let's look at this document structure:

field1: current
field2: prev+current+next

Then, expand each query into "field1:query OR field2:query". Terms and phrases that fit entirely within the current verse will score higher than those that span a verse boundary, because the former also match field1 and so get an additional boost.
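
To make the expansion concrete, here is a minimal sketch in plain Java that only builds the query string and does not call Lucene. The class and method names are made up for illustration, the field names are the placeholders used above, and it assumes the user's query carries no field prefixes of its own. The parentheses rely on the field grouping of the standard query parser syntax.

```java
public class VerseQueryExpander {
    // Placeholder field names taken from the structure above (assumption:
    // a real index would use more descriptive names).
    static final String CURRENT_FIELD = "field1"; // current verse only
    static final String CONTEXT_FIELD = "field2"; // prev + current + next

    /** Targets the same user query at both fields, joined with OR. */
    static String expand(String userQuery) {
        String q = userQuery.trim();
        return CURRENT_FIELD + ":(" + q + ") OR " + CONTEXT_FIELD + ":(" + q + ")";
    }

    public static void main(String[] args) {
        // A phrase query; the quotes are passed through unchanged.
        System.out.println(expand("\"in the beginning\""));
    }
}
```

The resulting string would then be handed to the query parser as usual. A hit that matches both clauses scores higher than one that matches only field2, which is the effect described above.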

> As to storing book or chapter in the index, we don't do that, just the
> whole reference.
> This is worth looking into as it would help in doing range restricted
> searches. Today, we do the restriction after the search.

This would need some testing, but I would suggest splitting the reference into two fields: one holding the book name, the other holding the chapter and verse combined into a single integer.
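
One possible encoding for the combined chapter/verse value, sketched in plain Java without any Lucene calls. The class name, the multiplier of 1000, and the six-digit padding are assumptions for illustration; the only requirement is that no chapter has 1000 or more verses.

```java
public class VerseKey {
    // Assumption: verse numbers stay below 1000, so two different
    // chapter/verse pairs can never produce the same key.
    static final int VERSE_SLOTS = 1000;

    /** Combines chapter and verse into one integer that sorts in reading order. */
    static int encode(int chapter, int verse) {
        if (chapter < 0 || verse < 0 || verse >= VERSE_SLOTS) {
            throw new IllegalArgumentException("chapter/verse out of range");
        }
        return chapter * VERSE_SLOTS + verse;
    }

    static int chapterOf(int key) { return key / VERSE_SLOTS; }

    static int verseOf(int key) { return key % VERSE_SLOTS; }

    /** Zero-padded form, so that string order matches numeric order. */
    static String toTerm(int key) { return String.format("%06d", key); }

    /** Inclusive range check, the in-memory equivalent of a range restriction. */
    static boolean inRange(int key, int fromKey, int toKey) {
        return fromKey <= key && key <= toKey;
    }

    public static void main(String[] args) {
        int key = encode(3, 16);
        System.out.println(toTerm(key));
        System.out.println(inRange(key, encode(3, 1), encode(4, 5)));
    }
}
```

With this layout, a restriction such as "chapter 3 verse 1 through chapter 4 verse 5 of one book" becomes an exact match on the book field plus a single range on the combined field. The zero padding matters if the value is indexed as a text term, since terms compare as strings.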


--
Best regards,
Andrzej Bialecki
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

