Doug Cutting wrote:
  > Searchers are based on
  > IndexReaders, and hence docFreqs don't change until a new Searcher is
  > created.  So long as this is true, and the central dispatch node uses a
  > searcher, then a simple cache, perhaps that is pre-fetched, is all
  > that's feasible.  It shouldn't take that long to pre-fetch the cache
  > when indexes are re-opened.
and:
  > I don't see what callbacks are required.  When the Weight is constructed
  > it invokes docFreq for each term, which, if RemoteSearchables are
  > involved, will result in IPC calls to those RemoteSearchables.  Then,

I don't understand the first statement, and I don't see how the two
statements are consistent (which is probably due to my misunderstanding
of how all this works).  The purpose of the aggregate docFreq table is
to avoid issuing an IPC call to each RemoteSearchable for each Term in
each Query, since a Query can contain a very large number of Terms
(e.g., with RangeQuery).  Also, surely the central dispatch node will
not have the indices of every remote node open itself?  The dispatch
node should have just the MultiSearcher, which accesses the remote
nodes via their RemoteSearchables.  So how does a remote node updating
its index and reopening its Searcher cause an aggregate docFreq table on
the dispatcher to get updated?  It is for this purpose that I suggested
a callback from the remote node to the dispatch node that passes a
delta-docFreq table, so that the central aggregate table can be updated
easily and efficiently.
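
To make the delta idea concrete, here is a rough sketch of what I have
in mind on the dispatcher side.  None of these names exist in Lucene:
AggregateDocFreqTable and applyDelta are hypothetical, terms are
represented as plain Strings for brevity, and how the delta actually
travels from the remote node to the dispatcher is left open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher-side table; not part of Lucene.
class AggregateDocFreqTable {
    // term text -> docFreq summed over all remote nodes
    private final Map<String, Integer> totals = new HashMap<>();

    // Assumed to be invoked (e.g. through a remote callback) when a node
    // reopens its Searcher. Each entry in delta is
    // (new docFreq - old docFreq) for one term on that node.
    synchronized void applyDelta(Map<String, Integer> delta) {
        for (Map.Entry<String, Integer> e : delta.entrySet()) {
            int updated = totals.getOrDefault(e.getKey(), 0) + e.getValue();
            if (updated <= 0) {
                totals.remove(e.getKey()); // term no longer occurs anywhere
            } else {
                totals.put(e.getKey(), updated);
            }
        }
    }

    // Local lookup on the dispatcher; no IPC involved.
    synchronized int docFreq(String term) {
        return totals.getOrDefault(term, 0);
    }
}
```

With something like this, a node that adds or deletes documents only
sends the terms whose docFreq changed, and the dispatcher never has to
re-fetch the whole table.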

The first-order fix would seem to be your original proposal, which
requires an IPC call for each Term in each Query.  This seems
straightforward to implement, although a bit tedious, as everything
needs to be changed to work with Weights instead of Queries (and the
Query methods need to be maintained for backward compatibility).
If this proves too slow because of a barrage of IPC calls for non-trivial
queries, then the performance optimization is to introduce the central
aggregate docFreq table, with a mechanism to keep it correct under remote
node index updates.
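
For a sense of the cost I am worried about, here is a toy model of the
per-term approach.  RemoteDocFreqSource and PerTermDocFreq are made-up
names standing in for the remote searchables and the dispatcher; each
docFreq call on a source is assumed to be one IPC round trip.

```java
import java.util.List;

// Hypothetical stand-in for a remote searchable; in a real deployment
// each docFreq call here would be one IPC round trip.
interface RemoteDocFreqSource {
    int docFreq(String term);
}

class PerTermDocFreq {
    // Sums docFreq for one term over every remote node: one call per node.
    static int docFreq(String term, List<RemoteDocFreqSource> nodes) {
        int sum = 0;
        for (RemoteDocFreqSource node : nodes) {
            sum += node.docFreq(term);
        }
        return sum;
    }

    // Remote calls needed to weight one query: terms x nodes.
    static long remoteCalls(int termCount, int nodeCount) {
        return (long) termCount * nodeCount;
    }
}
```

So a query that expands to 1000 terms against 8 remote nodes would
issue 8000 calls in this model before any searching starts.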

If I've got something fundamentally wrong in my presuppositions here,
please help me understand.

Thanks,

Chuck

