On Mar 1, 2009, at 3:33 AM, Ben Morgan wrote:
On 01/03/2009, Peter von Kaehne <ref...@gmx.net> wrote:
It appears that all clucene/lucene capable frontends can do
proximity searches. BpBible exposes this via its GUI, others rely on
the clucene/lucene syntax.
Q: Is there anything particular about bpbible's proximity searches
or do I simply use the wrong syntax?
In xiphos, the search "god love"~15 on the ABU returns 32 results,
but in bpbible a proximity search limited to a distance of 15 words
returns 56 hits.
BPBible's proximity is an approximation based on some average length
of word (~5 letters, I think... - though it may be calculated from
the module). So results may not be directly comparable.
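To illustrate why the two counts can differ, here is a minimal sketch
in Python. It is not BPBible's actual code. The function names, the
"+ 1 for a space" rule and the fixed average of 5 letters are all
assumptions made for the example. It compares a proximity test that
counts words with one that converts the word limit into a character
budget.

```python
import re

def within_word_distance(text, a, b, max_words):
    """True if words a and b occur at most max_words token positions apart."""
    tokens = text.lower().split()
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= max_words for i in pos_a for j in pos_b)

def within_char_distance(text, a, b, max_words, avg_word_len=5):
    """Approximate the same test with a character budget.

    Assumption: each word costs avg_word_len letters plus one space.
    """
    budget = max_words * (avg_word_len + 1)
    lowered = text.lower()
    starts_a = [m.start() for m in re.finditer(r"\b%s\b" % re.escape(a), lowered)]
    starts_b = [m.start() for m in re.finditer(r"\b%s\b" % re.escape(b), lowered)]
    return any(abs(i - j) <= budget for i in starts_a for j in starts_b)
```

In a run of short words the character budget covers more words than
the nominal limit, so the approximate test can report a hit where
exact word counting does not.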
Looking at the list, I find only single-verse references in xiphos,
but my understanding is that cross-boundary searches should be
possible.
What am I doing wrong? Or is cross-boundary search in fact not
possible in other frontends?
I don't believe proper cross-boundary search is possible (as lucene
indexes each verse as a separate document). BPBible also allows (for
example) phrases to cross verse boundaries. I don't think any of the
others do.
You are mostly right about cross-boundary search, where the boundary
is a verse or chapter. JSword does have limited support for
cross-boundary search, but the user has to specify such a choice. It
is not automatic.
I have talked with the Lucene folk about searching adjacent documents
and they don't see it being added to core. Their suggestion was to
have multiple documents per verse. The first set of documents would be
as it is today. The second set of documents would span a particular
number of documents, say 2 adjacent verses or the chapter. The size is
dependent upon the assumption that a user would not search for
phrases or other things beyond that size.
A phrase search in the second set of documents would find phrases that
did not exceed 2 verses or the chapter (using the given example).
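A minimal sketch of that suggestion, in plain Python rather than
Lucene itself. The function names, the whitespace tokenizer and the
window of 2 verses are assumptions made for illustration. Each verse
becomes one document, and each run of adjacent verses becomes an
additional document.

```python
def build_documents(verses, window=2):
    """verses: list of (ref, text) in canonical order.

    Returns a list of (refs, tokens): first one document per verse,
    then one document per run of `window` adjacent verses.
    """
    docs = [((ref,), text.lower().split()) for ref, text in verses]
    for i in range(len(verses) - window + 1):
        chunk = verses[i:i + window]
        refs = tuple(ref for ref, _ in chunk)
        tokens = [tok for _, text in chunk for tok in text.lower().split()]
        docs.append((refs, tokens))
    return docs

def phrase_search(docs, phrase):
    """Return the refs of every document containing the exact phrase."""
    words = phrase.lower().split()
    n = len(words)
    hits = []
    for refs, tokens in docs:
        if any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1)):
            hits.append(refs)
    return hits
```

A phrase that lies inside one verse matches both the single-verse
document and the windows that contain it, so a frontend would need to
merge those hits. A phrase that straddles two verses matches only the
window document.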
Another approach would be to use offsets for the words. Typically
Lucene starts the offset for the first term of the document at 0, the
second at 1, and so forth. This is a function of the Analyzer and of
the Field. But it is possible to change the offsets to be the term
position in the source. (Note: this can also be used to store
alternate terms, n-grams, word forms, etc. at the same position.)
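One way to picture this, as a sketch in plain Python and not Lucene's
API: number every term by its position in the whole source instead of
restarting at 0 for each verse. A phrase match is then a run of
consecutive positions, wherever the verse boundaries fall. The names
below and the whitespace tokenizer are assumptions made for the
example.

```python
from bisect import bisect_right
from collections import defaultdict

def build_positional_index(verses):
    """verses: list of (ref, text). Positions run on across verses."""
    postings = defaultdict(list)  # term -> global positions, ascending
    starts, refs = [], []         # first global position of each verse
    pos = 0
    for ref, text in verses:
        starts.append(pos)
        refs.append(ref)
        for tok in text.lower().split():
            postings[tok].append(pos)
            pos += 1
    return postings, starts, refs

def verse_at(position, starts, refs):
    """Map a global term position back to the verse containing it."""
    return refs[bisect_right(starts, position) - 1]

def find_phrase(phrase, postings, starts, refs):
    """Return (first_verse, last_verse) for each occurrence of the phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    results = []
    for p in postings.get(words[0], []):
        # Every following word must sit at the next consecutive position.
        if all((p + k) in postings.get(w, ()) for k, w in enumerate(words)):
            last = p + len(words) - 1
            results.append((verse_at(p, starts, refs),
                            verse_at(last, starts, refs)))
    return results
```

Because each hit carries both its first and last verse, a frontend
could still restrict results to single verses by keeping only hits
where the two are the same.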
While true cross-boundary search is useful in some situations, I don't
think it is very useful in general. I think most people will use
search to find a verse or a short passage.
For the most part, searching verses in isolation is more than
sufficient.
I'd like to hear other thoughts on the usefulness of proper passage
searching.
In Him,
DM
_______________________________________________
sword-devel mailing list: sword-devel@crosswire.org
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page