I think there is another problem here.  It is currently the Weight
implementations that do rewrite(), which requires access to the index,
not just to the idfs.  E.g., RangeQuery.rewrite() must find the terms
in the index within the range.  So, the Weight cannot be computed in the
MultiSearcher, as it does not have direct access to the remote index.
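
To make the rewrite dependency concrete, here is a rough sketch in plain
Java.  It is not Lucene code: LocalTermIndex and expandRange are made-up
names standing in for a node-local term dictionary and for the expansion
step a range rewrite has to perform.  The point is only that the expansion
depends on which terms that particular index contains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for a node-local term dictionary; not a Lucene class.
final class LocalTermIndex {
    private final SortedSet<String> terms = new TreeSet<>();

    void add(String term) {
        terms.add(term);
    }

    // Expands the inclusive range [lower, upper] into the concrete terms
    // present in this index, which is what a range rewrite needs to know.
    List<String> expandRange(String lower, String upper) {
        List<String> expanded = new ArrayList<>();
        for (String term : terms.tailSet(lower)) {
            if (term.compareTo(upper) > 0) {
                break; // past the upper bound; terms are sorted
            }
            expanded.add(term);
        }
        return expanded;
    }
}
```

Two nodes holding different terms expand the same range differently, so
a dispatcher that sees neither index cannot do the expansion itself.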

This seems to put the viability of the whole approach into question.
The better approach may be to distribute an aggregate docFreq table to
each remote node.  A simple interim step could be to support a callback
to the dispatcher node from docFreq on the remote node, although this
would be gross (remote node calls dispatcher node to get docFreq which
in turn calls all remote nodes to get all their docFreqs and sum them).

We need an aggregate docFreq table, and it needs to be on the remote
nodes since the Weights cannot be computed until after the Query is
rewritten, which requires access to the index on the remote node.
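
As a minimal sketch of what such a table could look like (plain Java, with
hypothetical names; AggregateDocFreqTable, addNode, and the use of plain
String keys for terms are my assumptions, not existing Lucene API): the
dispatcher sums the per-node docFreqs once, then ships the result to every
remote node, which consults it instead of its local docFreq.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical aggregate table; class and method names are not from Lucene.
final class AggregateDocFreqTable {
    private final Map<String, Integer> docFreqs = new HashMap<>();
    private int maxDoc = 0;

    // Folds one node's local statistics into the aggregate.
    void addNode(Map<String, Integer> localDocFreqs, int localMaxDoc) {
        for (Map.Entry<String, Integer> e : localDocFreqs.entrySet()) {
            docFreqs.merge(e.getKey(), e.getValue(), Integer::sum);
        }
        maxDoc += localMaxDoc;
    }

    // Global document frequency of a term; 0 if no node contains it.
    int docFreq(String term) {
        return docFreqs.getOrDefault(term, 0);
    }

    // Total number of documents across all nodes folded in so far.
    int maxDoc() {
        return maxDoc;
    }
}
```

This avoids the per-term callback chain described above, at the cost of
having to rebuild and redistribute the table whenever an index changes.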

Chuck

  > -----Original Message-----
  > From: Wolf Siberski [mailto:[EMAIL PROTECTED]
  > Sent: Wednesday, January 12, 2005 4:08 PM
  > To: Lucene Developers List
  > Subject: Re: How to proceed with Bug 31841 - MultiSearcher problems with
  > Similarity.docFreq() ?
  > 
  > Doug Cutting wrote:
  > > Wolf Siberski wrote:
  > >
  > >> Chuck Williams wrote:
  > >>
  > >>> This is a nice solution!  By having MultiSearcher create the Weight, it
  > >>> can pass itself in as the searcher, thereby allowing the correct
  > >>> docFreq() method to be called.  This is similar to what I tried to do
  > >>> with topmostSearcher, but a much better way to do it.
  > >>
  > >> This still wouldn't work for RemoteSearchables, except if you allow
  > >> call-backs from each RemoteSearchable to the MultiSearcher.
  > >
  > > I don't see what callbacks are required.  When the Weight is constructed
  > > it invokes docFreq for each term, which, if RemoteSearchables are
  > > involved, will result in IPC calls to those RemoteSearchables.  Then,
  > > the Weight object is serialized to each RemoteSearchable and a TopDocs
  > > is returned.  Where are the callbacks?  These are only required for
  > > HitCollector-based methods, which are not advised with RemoteSearchable.
  > 
  > Yes, I agree. I just wanted to point out that the current Weight
  > implementations need to be modified heavily to introduce the
  > behaviour you describe above. For example, take a look at
  > TermQuery.TermWeight.scorer():
  >     [...]
  >     return new TermScorer(this, termDocs, getSimilarity(searcher),
  >                           reader.norms(term.field()));
  > 
  > This typically results in a call to searcher.getSimilarity().
  > In the new context, the searcher would be a MultiSearcher,
  > and to resolve that call at one of the RemoteSearchables, the
  > method getSimilarity() would have to be called remotely on it.
  > In this case, we can change it so that the Weight is provided
  > with the Similarity object before it is serialized and sent
  > to the RemoteSearchables. But I'm not sure if all these cases
  > can be resolved that easily. As you already have pointed out,
  > it won't be possible for HitCollector-related Weights.
  > 
  > But, as I said, I still agree fully with the approach.
  > 
  >
  > ---------------------------------------------------------------------
  > To unsubscribe, e-mail: [EMAIL PROTECTED]
  > For additional commands, e-mail: [EMAIL PROTECTED]

