[Lucene-dev] Problem with the new PrefixQuery

joanne . sproston Thu, 05 Jul 2001 11:34:19 -0700
Hi Doug

I have taken your advice and reworked my HitCollector along the
lines you suggested in your reply, quoted below.  This has worked
well, although I made one extra modification: I made the method
MultiSearcher.searcherIndex(int n) public
(I used this to help me identify my indexer class).

Thanks for giving me the feedback.  I would be grateful if your
suggestions were added to the Lucene core code, so that my code
stays in line with future releases of Lucene.

On to a different subject ...

Thanks very much for submitting the PrefixQuery solution - it was
 easy to get going when searching just one index, a real 'plug n
 play'.  This will help no end with our search facility.

As you may recall, I am using the MultiSearcher class to search
several indexes, using the method you suggested below:
public final void search(Query, Filter, HitCollector).

To search multiple indexes I had to make a very small change to
Query.java and PrefixQuery.java in order to get the correct hit
results.  I think this may be a bug in the solution, but I would
like you to confirm this.  (Perhaps I have not configured
something correctly?)

I shall try to explain where the problem lies ...

The MultiSearcher.search(Query , Filter , HitCollector) method
 calls IndexSearcher.search(Q, F, H), which calls
Query.scorer(Query, Searcher, IndexReader).
It is in this Query.scorer method that my problem begins.

This method first prepares my PrefixQuery by setting
PrefixQuery.reader.
During the normalization process, the method PrefixQuery.getQuery()
is called to assign data to PrefixQuery.query.

But PrefixQuery.query is only ever calculated ONCE, for the very
first reader in my MultiSearcher loop, since the method
PrefixQuery.getQuery() begins with:  if (query == null) {

However, I need to re-calculate PrefixQuery.query for each
reader within my loop, since each index will yield a
different set of terms to add to my resulting Boolean query.

My attempt to resolve this problem was to edit the method
PrefixQuery.prepare(IndexReader reader) to reset the query back
to null, when starting with a new reader :

  final void prepare(IndexReader reader) {
    //JSproston - Bug Fix test
    this.query = null;
    //JSproston addition end
    this.reader = reader;
  }
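
To make the caching problem concrete, here is a minimal self-contained
sketch.  It does not use Lucene at all: ToyReader, ToyPrefixQuery and
ToyPrefixDemo are hypothetical stand-ins invented for illustration, and
the resetOnPrepare flag simply switches the reset shown above on or off.
It only assumes that the real class expands the prefix lazily and caches
the result, as described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an index reader: just a list of indexed terms.
class ToyReader {
    final List<String> terms;

    ToyReader(String... terms) {
        this.terms = Arrays.asList(terms);
    }
}

// Hypothetical stand-in for a prefix query that lazily expands its prefix
// into the matching terms of the current reader and caches the result.
class ToyPrefixQuery {
    private final String prefix;
    private final boolean resetOnPrepare; // true = behaviour with the fix applied
    private ToyReader reader;
    private List<String> expanded;        // plays the role of PrefixQuery.query

    ToyPrefixQuery(String prefix, boolean resetOnPrepare) {
        this.prefix = prefix;
        this.resetOnPrepare = resetOnPrepare;
    }

    void prepare(ToyReader reader) {
        if (resetOnPrepare) {
            expanded = null; // discard the expansion built for the previous reader
        }
        this.reader = reader;
    }

    List<String> getExpanded() {
        if (expanded == null) { // without the reset this is true only once
            expanded = new ArrayList<>();
            for (String term : reader.terms) {
                if (term.startsWith(prefix)) {
                    expanded.add(term);
                }
            }
        }
        return expanded;
    }
}

class ToyPrefixDemo {
    // Expands the prefix against each reader in turn, as a multi-index search would.
    static List<List<String>> expandAll(ToyPrefixQuery query, ToyReader... readers) {
        List<List<String>> perReader = new ArrayList<>();
        for (ToyReader reader : readers) {
            query.prepare(reader);
            perReader.add(new ArrayList<>(query.getExpanded()));
        }
        return perReader;
    }

    public static void main(String[] args) {
        ToyReader first = new ToyReader("apple", "apply", "banana");
        ToyReader second = new ToyReader("apricot", "approve", "cherry");
        System.out.println(expandAll(new ToyPrefixQuery("ap", false), first, second));
        System.out.println(expandAll(new ToyPrefixQuery("ap", true), first, second));
    }
}
```

Without the reset, the second reader is searched with the terms found in
the first one; with the reset, each reader gets its own expansion.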

This fix did not resolve all my problems.  I was finding the
correct search results, but the returned scores were very
strange, often zero!

I think this happens because normalization is only calculated
once, for the first reader.  However, since the scoring
algorithm is too complex for me to understand, this is more of
a guess than a logical conclusion!

This problem is resolved by commenting out the test
  if (!query.normalized) {
within Query.scorer(Query, Searcher, IndexReader),
so that the query is normalized for each reader.
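
One plausible way a once-only normalization guard could produce zero
scores is sketched below.  This is a toy model under stated assumptions,
not the actual Lucene scoring code: ToyWeightedQuery and ToyScorerDemo
are hypothetical, the "equal share" weighting rule is made up, and the
only idea borrowed from the description above is that a rebuilt query
starts with unset weights and a flag prevents a second normalization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: each expanded term carries a weight that stays 0
// until a normalization pass assigns it.
class ToyWeightedQuery {
    final Map<String, Float> weights = new LinkedHashMap<>();
    boolean normalized = false; // plays the role of Query.normalized

    // Replace the expansion, as happens when the query is rebuilt for a new reader.
    void setTerms(String... terms) {
        weights.clear();
        for (String term : terms) {
            weights.put(term, 0.0f);
        }
    }

    // Give each term an equal share of the weight (a made-up rule for illustration).
    void normalize() {
        float share = weights.isEmpty() ? 0.0f : 1.0f / weights.size();
        for (Map.Entry<String, Float> entry : weights.entrySet()) {
            entry.setValue(share);
        }
        normalized = true;
    }
}

class ToyScorerDemo {
    // Returns the total term weight in effect while the second index is searched.
    static float weightForSecondIndex(boolean normalizeOnce) {
        ToyWeightedQuery query = new ToyWeightedQuery();
        String[][] indexes = { {"apple", "apply"}, {"apricot", "approve"} };
        float total = 0.0f;
        for (String[] terms : indexes) {
            query.setTerms(terms);
            // With normalizeOnce, the guard skips every index after the first.
            if (!normalizeOnce || !query.normalized) {
                query.normalize();
            }
            total = 0.0f;
            for (float weight : query.weights.values()) {
                total += weight;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(weightForSecondIndex(true));
        System.out.println(weightForSecondIndex(false));
    }
}
```

In this model the guarded version leaves the second index's terms with
zero weight, while normalizing for every reader gives them real weights.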

I appreciate that these solutions may not be the most efficient
 way of solving the problems I have - but they seem to work !!


Can you confirm whether Lucene needs to be changed to overcome
this problem, or have I failed to configure something that I
should have?

Please get back to me if I have not given you sufficient
information to understand my problem.

I look forward to hearing from you.

Many Thanks

Joanne Sproston



Doug Cutting  (05/06/2001  17:24):
>Overall this looks good.
>
>A few comments:
>
>I think this would be cleaner if you could use the HitCollector interface
>directly, without the class MultiIndexHitCollector.  To do this you need to
>hide the document renumbering from the API, which also cleans things up.
>
>Thus I suggest that you:
>  - add the following to Searcher.java
>     public abstract void search(Query,HitCollector) throws IOException;
>     public abstract void search(Query,Filter,HitCollector)
>         throws IOException;
>  - make Searcher.doc() a public method:
>     public abstract Document doc(int i) throws IOException;
>
>Then, in MultiSearcher.search(Query,HitCollector), when you call
>IndexSearcher.search(Query,HitCollector), pass in a HitCollector which
>performs the required renumbering.
>
>This can be done with something like:
>
> public final void search(Query query, HitCollector hitResults)
>     throws IOException {
>   search(query, null, hitResults);
> }
> public final void search(Query query, Filter filter,
>     final HitCollector hitResults) throws IOException {
>   for (int i = 0; i < searchers.length; i++) {
>     final int start = starts[i];  // first doc number of searcher i
>     searchers[i].search(query, filter, new HitCollector() {
>       public void collect(int doc, float score) {
>         // renumber the local doc into the combined numbering
>         hitResults.collect(doc + start, score);
>       }
>     });
>   }
> }
>
>
>Doug
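
The renumbering idea in the quoted reply can be tried in isolation with
the sketch below.  It is not Lucene code: ToyHitCollector,
ToySubSearcher and ToyMultiSearcher are hypothetical classes that only
mimic the shape of collect(int, float) and the starts[] offsets, and the
document numbers used are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal collector, modelled on the collect(int, float) call above.
interface ToyHitCollector {
    void collect(int doc, float score);
}

// Hypothetical sub-searcher: reports hits using its own local document numbers.
class ToySubSearcher {
    private final int[] localDocs;

    ToySubSearcher(int... localDocs) {
        this.localDocs = localDocs;
    }

    void search(ToyHitCollector collector) {
        for (int doc : localDocs) {
            collector.collect(doc, 1.0f);
        }
    }
}

// Hypothetical multi-searcher: wraps the caller's collector so that each
// sub-searcher's local numbers are shifted by that searcher's start offset.
class ToyMultiSearcher {
    private final ToySubSearcher[] searchers;
    private final int[] starts;

    ToyMultiSearcher(ToySubSearcher[] searchers, int[] starts) {
        this.searchers = searchers;
        this.starts = starts;
    }

    void search(final ToyHitCollector results) {
        for (int i = 0; i < searchers.length; i++) {
            final int start = starts[i];
            searchers[i].search(new ToyHitCollector() {
                public void collect(int doc, float score) {
                    results.collect(doc + start, score);
                }
            });
        }
    }

    // Convenience helper: gathers the renumbered document ids into a list.
    static List<Integer> collectGlobalDocs(ToyMultiSearcher multi) {
        final List<Integer> docs = new ArrayList<>();
        multi.search(new ToyHitCollector() {
            public void collect(int doc, float score) {
                docs.add(doc);
            }
        });
        return docs;
    }

    public static void main(String[] args) {
        ToySubSearcher[] subs = { new ToySubSearcher(0, 2), new ToySubSearcher(1) };
        System.out.println(collectGlobalDocs(new ToyMultiSearcher(subs, new int[] {0, 10})));
    }
}
```

The caller only ever sees combined document numbers, so no special
multi-index collector class is needed.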


