To complete this thread: I read the document itself with a one-field 
FieldSelector, so that I loaded exactly what I needed at this point in the 
code and nothing else (in particular, not the text body).

Then I saved the primary key (the path) of each document that passed through 
this CustomScoreQuery (function query) in a Set<String> seenDocs:

    seenDocs.add(reader.document(docId, fieldSelector)
            .getFieldable(KEY_FIELD).stringValue());

If we do introduce a short, globally unique ID field, the code needs little 
change to switch to that field.

Once the entire query has gathered all the results, the code checks which 
ones came through that function query by consulting the seenDocs set.
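
For anyone finding this thread later, here is a minimal sketch of that 
record-then-consult pattern in plain Java, without any Lucene dependency. The 
names SeenDocsTracker, KeyLookup, record and specialAmong are made up for 
illustration; KeyLookup is only a stand-in for the stored-field read 
(reader.document(docId, fieldSelector)...stringValue()), and the arrays stand 
in for two index segments. It is not the actual application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SeenDocsTracker {

    // Hypothetical stand-in for reading the key field of a document
    // from the reader that is currently scoring.
    interface KeyLookup {
        String keyFor(int segmentLocalDoc);
    }

    // Stable keys (paths) of documents that passed through the function query.
    private final Set<String> seenDocs = new HashSet<String>();

    // Called from the scoring callback. The doc number is only meaningful
    // for the current reader, so it is turned into a key right away.
    void record(KeyLookup currentReader, int doc) {
        seenDocs.add(currentReader.keyFor(doc));
    }

    // Called after the whole query has produced its results: returns the
    // result keys that were recorded, preserving the result order.
    List<String> specialAmong(List<String> resultKeys) {
        List<String> special = new ArrayList<String>();
        for (String key : resultKeys) {
            if (seenDocs.contains(key)) {
                special.add(key);
            }
        }
        return special;
    }

    public static void main(String[] args) {
        // Two made-up segments; both have a local doc number 0.
        final String[] segmentA = {"/docs/a.txt", "/docs/b.txt"};
        final String[] segmentB = {"/docs/c.txt", "/docs/d.txt"};
        KeyLookup readerA = doc -> segmentA[doc];
        KeyLookup readerB = doc -> segmentB[doc];

        SeenDocsTracker tracker = new SeenDocsTracker();
        tracker.record(readerA, 0); // local doc 0 of segment A
        tracker.record(readerB, 0); // local doc 0 of segment B, a different file

        List<String> results =
                Arrays.asList("/docs/a.txt", "/docs/b.txt", "/docs/c.txt");
        System.out.println(tracker.specialAmong(results));
    }
}
```

Because the set holds keys rather than doc numbers, the two calls with local 
doc number 0 stay distinct, and the demo prints [/docs/a.txt, /docs/c.txt]. 
Switching to a different key field would only change what the lookup returns.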

I decided NOT to use the FieldCache for this particular application, because 
the number of documents matched by this part of the query is very small 
compared to all documents.
Knowing about them matters precisely because they are rare: I mark those 
results as 'special' for other parts of the application.  Such special 
documents get different treatment in the UI, but that's not my concern; 
identifying which ones they are was the useful part for the index layer.

As usual, thanks for the feedback.

-Paul

> -----Original Message-----
> From: Ian Lea [mailto:ian....@gmail.com]
> Sent: Monday, February 06, 2012 3:54 AM
> To: java-user@lucene.apache.org
> Subject: Re: recording a universal ID from DocID in a CustomScoreQuery
> 
> int doc will be for the subreader, not for the entire index.
> oal.search.Collector has setNextReader(IndexReader reader, int docBase)
> which you might somehow be able to use.  Failing that I'd go for FieldCache,
> or store the docids in a Set in a Map keyed by current Reader, if that would
> give you what you needed for the subsequent messing around.
> 
> 
> --
> Ian.
> 
> 
> On Sat, Feb 4, 2012 at 12:09 AM, Paul Allan Hill <p...@metajure.com> wrote:
> > My index does NOT have a simple UID; it uses the file PATH as the unique
> > key.
> > I was implementing a CustomScoreQuery which not only tweaked the score but
> > also wanted to write down which documents had passed through this part of
> > the overall rebuilt query, so that I could further mess with those
> > particular documents later.
> > I was hoping to do it without loading all PATHs from my index into a field
> > cache, but maybe that is a false way to try to save memory.
> >
> > I thought I could write down the docId provided in the call to
> > customScore
> >
> > public float customScore(int doc, float subQueryScore,
> >         float valSrcScore) throws IOException {
> >     docIds.add(doc);  // doc is the reader-local doc number
> >     return ...;
> > }
> >
> > private Set<Integer> docIds = new HashSet<Integer>();
> >
> > While I thought I had this working, apparently I had not taken into
> > consideration the subreader and segment problem.
> > The int called doc is not the docId for the entire index, just the local
> > reader doc number.  Is that right?
> > So is there a standard way to convert back to the index-wide docId?
> >
> > If there is no standard way, I _might_ create a small subclass of
> > IndexSearcher and provide a method to:
> >
> >
> > (1)    Find the right reader by looping through all
> > IndexSearcher.subReaders[] to find what reader called the
> > CustomScoreQuery
> >
> > (2)    Add an offset of the proper value from
> > IndexSearcher.docStarts[iReader]
> >
> > But I am thinking this is prone to the problem that a subreader can be
> > made of more subreaders etc., so I really don't have a clue where to find
> > the current reader and then map back to docStarts.
> >
> > I also think I'm doing this wrong, because ReaderUtil has nothing like
> > this?
> >
> > Is there some way to note for later that a particular document came
> > through this function query, or should I just accept using the field
> > cache?
> >
> > -Paul
> >
> >
> >
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org

