I have been looking over the MoreLikeThis code. It looks like the
MoreLikeThis query builds its query from only the first of the fields
and fails to consider the rest. Thus, if I have title and body indexed
for some document, it will do the MoreLikeThis
based only

Thanks for the response; I really appreciate your help. I have read the
documentation but did not understand it on the first read, as I am new to Lucene.
I have changed it to AtomicReader and it seems to be working fine.
One last clarification: do we also need to use AtomicReader for the
following b

Have you already checked Solr's MoreLikeThis?
http://wiki.apache.org/solr/MoreLikeThisHandler and
http://wiki.apache.org/solr/MoreLikeThis You describe a problem similar to
the use case of that component, so if there is something to hack, it is
Solr's MoreLikeThis.
Lucene's similarity is a low le

I understand, and it sounds OK. The "store" index would be like an ordinary
database where you search by value.
Another approach you could consider is to compress the field before
indexing. That is, you compress with
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/zip/GZIPInputStream.html
and
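
To make the compression idea concrete, here is a minimal sketch of a gzip round trip on a field value using only java.util.zip. The class name GzipFieldSketch and its two helper methods are hypothetical, and pairing GZIPInputStream with GZIPOutputStream for the compressing side is my assumption. How the resulting bytes are stored in the index is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Lucene: gzip a field value before
// storing it and gunzip it again after reading it back.
class GzipFieldSketch {

    // Compress the UTF-8 bytes of the field value.
    static byte[] compress(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the
        // buffer only after the try block has finished.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    // Reverse of compress(): gunzip and decode as UTF-8.
    static String decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String body = "some long field value that repeats, repeats, repeats";
        byte[] packed = compress(body);
        // The round trip should give back the original text.
        System.out.println(decompress(packed).equals(body));
    }
}
```

Note that gzip adds a fixed header and trailer, so very short field values can come out larger than the input; the approach is more likely to pay off on long text fields.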

Is there an API in Lucene for finding the similarity score for two
documents that have been randomly pulled from an index? What about for a
query and a randomly selected document?
I realize this isn't the standard purpose of Lucene, but I was given a task
to compare similarity scores for the Simil