recording a universal ID from DocID in a CustomScoreQuery

Paul Allan Hill Fri, 03 Feb 2012 16:10:29 -0800

My index does NOT have a simple UID; it uses the PATH to the file as the
unique key.
I was implementing a CustomScoreQuery that not only tweaks the score but also
records which documents passed through this part of the overall rebuilt
query, so that I could do further work on those particular documents later.
I was hoping to do it without loading all the PATHs from my index into a
field cache, but maybe that is a false way to try to save memory.


I thought I could write down the doc number provided in the call to customScore:

public float customScore(int doc, float subQueryScore, float valSrcScore)
        throws IOException {
    docIds.add(doc);  // record the doc number passed to this call
    return ...;
}

// doc numbers seen by customScore so far
private Set<Integer> docIds = new HashSet<Integer>();

While I thought I had this working, apparently I had not taken the subreader
and segment problem into consideration.
The int called doc is not the docId for the entire index, just the doc number
local to the current reader.  Is that right?
So is there a standard way to convert back to the index-wide DocID?
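
To make the collision concrete, here is a minimal sketch in plain Java. It
uses no Lucene classes; SegmentDocIdRecorder, segmentMaxDocs and the other
names are hypothetical, and it assumes the caller already knows which segment
is being scored, which is exactly the open question above. It only shows the
arithmetic: index-wide id = start offset of the segment + local doc number.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: illustrates offset arithmetic only.
final class SegmentDocIdRecorder {
    // docStarts[i] is the index-wide id of the first document in segment i.
    private final int[] docStarts;
    private final Set<Integer> globalDocIds = new HashSet<Integer>();

    // segmentMaxDocs[i] is the number of doc slots in segment i (assumed known).
    SegmentDocIdRecorder(int[] segmentMaxDocs) {
        docStarts = new int[segmentMaxDocs.length];
        int base = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            docStarts[i] = base;
            base += segmentMaxDocs[i];
        }
    }

    // Convert a segment-local doc number to an index-wide one.
    int toGlobal(int segmentIndex, int localDoc) {
        return docStarts[segmentIndex] + localDoc;
    }

    // Record the index-wide id instead of the raw local number.
    void record(int segmentIndex, int localDoc) {
        globalDocIds.add(toGlobal(segmentIndex, localDoc));
    }

    Set<Integer> recorded() {
        return globalDocIds;
    }
}
```

With segments of 5, 3 and 4 documents, local doc 2 in the first segment and
local doc 2 in the second segment would collapse into a single entry if the
raw numbers were stored, but they stay distinct once the segment offset is
added.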

If there is no standard way, I _might_ create a small subclass of IndexSearcher
and provide a method to:

(1)    Find the right reader by looping through all IndexSearcher.subReaders[]
to find which reader called the CustomScoreQuery

(2)    Add the proper offset from IndexSearcher.docStarts[iReader]

But I am thinking this is prone to the problem that a subreader can itself be
made of more subreaders, etc., so I really don't have a clue where to find the
current reader and then how to map it back to docStarts.
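
The nesting concern can be sketched without Lucene as a small tree walk.
ReaderNode below is a hypothetical stand-in for a reader, not a Lucene class:
a leaf holds a document count and a composite holds children. The walk sums
the sizes of everything to the left of the target leaf, so the offset comes
out the same however deeply the readers are nested. Whether the real reader
objects can be compared by identity like this is an assumption.

```java
// Hypothetical stand-in for a (possibly nested) reader; not a Lucene class.
final class ReaderNode {
    private final int leafMaxDoc;        // only used when this node is a leaf
    private final ReaderNode[] children; // empty for a leaf

    // Leaf reader with a fixed number of doc slots.
    ReaderNode(int maxDoc) {
        this.leafMaxDoc = maxDoc;
        this.children = new ReaderNode[0];
    }

    // Composite reader built from sub-readers, which may be composites too.
    ReaderNode(ReaderNode... children) {
        this.leafMaxDoc = 0;
        this.children = children;
    }

    boolean isLeaf() {
        return children.length == 0;
    }

    // Total number of doc slots under this node.
    int maxDoc() {
        if (isLeaf()) {
            return leafMaxDoc;
        }
        int total = 0;
        for (ReaderNode child : children) {
            total += child.maxDoc();
        }
        return total;
    }

    // Offset of target's first doc within this subtree, or -1 if not found.
    int docBaseOf(ReaderNode target) {
        if (this == target) {
            return 0;
        }
        int base = 0;
        for (ReaderNode child : children) {
            int inner = child.docBaseOf(target);
            if (inner >= 0) {
                return base + inner;
            }
            base += child.maxDoc(); // skip past this whole child
        }
        return -1;
    }
}
```

For a top-level reader made of a 5-document leaf followed by a composite of a
3-document leaf and a 4-document leaf, the three leaves start at offsets 0, 5
and 8.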

I also suspect I'm doing this wrong, because ReaderUtil seems to have nothing
like this.

Is there some way to note for later that a particular document came through
this function query, or should I just accept using the field cache?

-Paul
