Hi Simon,
Thank you for your reply. The document length is just an example of what I need
to store. Another stat that I need is a *normalised* sum of the TFs. I can
compute this using my own cache during retrieval by extending the
SimilarityBase and storing the values in a cache that is used w
Hey,
On Wed, Jan 4, 2012 at 1:15 PM, Hany Azzam wrote:
> Hi,
>
> I am experimenting with the Lucene trunk (aka 4.0), especially with the new
> IndexDocValues feature. I am trying to store some query-independent
> statistics such as PageRank, etc. One stat that I am trying to store is the
> sum
Hi,
I am experimenting with the Lucene trunk (aka 4.0), especially with the new
IndexDocValues feature. I am trying to store some query-independent statistics
such as PageRank, etc. One stat that I am trying to store is the sum of all the
term frequencies in a document. This can be seen as the
Hi folks,
I was advised to use PrecedenceQueryParser if I want boolean precedence in
my queries. While examining this class, I noticed that it and its
superclass do not extend the QueryParser but have a separate
implementation/hierarchy. All other parsers in that package do extend the
Hi Ryan,
Why not preprocess your documents with tools like Apache UIMA, GATE or
OpenNLP before indexing them in Lucene? GATE, for instance, has FST-based
gazetteers which would be perfect for your place names. AFAIK there is also
a Dictionary component for UIMA which would be a good match.
Julie