I have a few things I'd like to check with the Luke handler; if you all
could check some of the assumptions, that would be great.
* I want to print out the document frequency for a term in a given
document. Since that term shows up in the given document, I would think
the document frequency must be >= 1. I am using: reader.docFreq( t )
[line 236]. The results seem reasonable, but *sometimes* it returns
zero... is that possible?
* I want to return the lucene field flags for each field. I run through
all the field names with:
reader.getFieldNames(IndexReader.FieldOption.ALL). Is there a way to
get any Fieldable for a given name? IIUC, all fields with the same name
will have the same flags. I tried searching for a document with that
field; it works, but only for stored fields.
* I just realized that I am only returning stored fields for
getDocumentFieldsInfo() (it uses Document.getFields()). How can I
find *all* Fieldables for a given document? I have tried following the
luke source, but get a bit lost ;)
* Each field gets a boolean attribute "cacheableFaceting" -- this is true
if the number of distinct terms is smaller than the filterCacheSize. I
get the filterCacheSize from: solrconfig.xml:"query/filterCache/@size"
and get the distinctTerm count by counting up the termEnum. Is this
logic solid? I know the cacheability changes if you are faceting on
multiple fields at once, but it's still nice to have a ballpark estimate
without needing to know the internals.
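
To make the "cacheableFaceting" check concrete, here is a minimal sketch of
the logic as I described it. It is not the handler code: the class and
method names are made up for illustration, a plain Iterator<String> stands
in for the Lucene termEnum of one field, and the cache size of 512 is just
an assumed value for query/filterCache/@size.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for the handler logic; no Lucene or Solr classes used.
class CacheableFacetingSketch {

    // Counts distinct terms by walking an iterator, the way the handler
    // walks the termEnum for a single field. The Set only guards against
    // duplicates in this stand-in input.
    static int countDistinctTerms(Iterator<String> terms) {
        Set<String> seen = new HashSet<>();
        while (terms.hasNext()) {
            seen.add(terms.next());
        }
        return seen.size();
    }

    // A field is flagged cacheable when its distinct term count is
    // strictly smaller than the configured filterCache size.
    static boolean isCacheableFaceting(int distinctTerms, int filterCacheSize) {
        return distinctTerms < filterCacheSize;
    }

    public static void main(String[] args) {
        int filterCacheSize = 512; // assumed value of query/filterCache/@size
        int distinct = countDistinctTerms(
                Arrays.asList("red", "green", "blue", "red").iterator());
        System.out.println(distinct + " distinct terms, cacheableFaceting="
                + isCacheableFaceting(distinct, filterCacheSize));
    }
}
```

This only looks at one field at a time, so it does not account for several
faceted fields sharing the same filterCache.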
thanks for any pointers
ryan