Hi,

This is probably a question for the user list. However, since it relates to
performance and to the Lucene index format, I think it is better to ask the
gurus on this list ;-)

In my application, I have implemented a quality score for each document. For
each search performed, the relevancy score is first computed using Lucene's
scoring; the relevancy score is then combined with the quality score to
produce the final score for the document.
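
To make the combination step concrete, here is a minimal sketch in plain Java.
The class name, the mapping of a stored byte to [0, 1], and the weighted
linear blend are all assumptions for illustration; the message does not say
how the two scores are actually combined, and none of this is Lucene API.

```java
// Hypothetical sketch: the names and the blending formula are assumptions,
// not part of Lucene and not necessarily what the application does.
final class QualityScorer {
    // One quality byte per document, indexed by docId.
    private final byte[] qualityByDoc;

    QualityScorer(byte[] qualityByDoc) {
        this.qualityByDoc = qualityByDoc;
    }

    // Map the stored byte (treated as unsigned, 0..255) to a value in [0, 1].
    float quality(int docId) {
        return (qualityByDoc[docId] & 0xFF) / 255.0f;
    }

    // Assumed combination: weighted linear blend of relevancy and quality.
    // qualityWeight = 0 ignores quality; qualityWeight = 1 ignores relevancy.
    float combine(int docId, float relevancy, float qualityWeight) {
        return (1.0f - qualityWeight) * relevancy + qualityWeight * quality(docId);
    }

    public static void main(String[] args) {
        QualityScorer scorer = new QualityScorer(new byte[] {0, (byte) 255});
        System.out.println(scorer.combine(0, 0.8f, 0.5f));
        System.out.println(scorer.combine(1, 0.8f, 0.5f));
    }
}
```

A multiplicative combination (relevancy times quality) would fit the same
shape; only the body of combine() would change.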

For storing the quality score, I could use the FieldCache feature and load
the quality scores into memory as a byte array when warming up the index.
However, I then pay the price of the warm-up. Alternatively, I could store
the quality score in the term index, as in:

term, <docId, qualityscore>+

This way, there is no need to warm up the index. But I guess the index would
be significantly bigger, since the quality score of a document is stored
again for each term that the document contains.
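
The size difference can be estimated with a rough back-of-the-envelope
sketch. It assumes one uncompressed byte per stored score, which is an
assumption and not a description of how Lucene would encode it; the class and
method names are hypothetical.

```java
// Hypothetical sketch: assumes exactly one byte per stored quality score and
// ignores any compression or per-entry overhead.
final class QualityStorageEstimate {
    // In-memory byte array: one score per document.
    static long perDocumentBytes(long numDocs) {
        return numDocs;
    }

    // Score stored in the postings: one score per (term, document) pair,
    // that is, once for every distinct term each document contains.
    static long perPostingBytes(long numDocs, long avgUniqueTermsPerDoc) {
        return numDocs * avgUniqueTermsPerDoc;
    }

    public static void main(String[] args) {
        // Made-up example sizes, for illustration only.
        long numDocs = 1_000_000L;
        long avgUniqueTermsPerDoc = 200L;
        System.out.println(perDocumentBytes(numDocs));
        System.out.println(perPostingBytes(numDocs, avgUniqueTermsPerDoc));
    }
}
```

With these made-up numbers the array costs about 1 MB of memory, while the
per-posting layout adds about 200 MB to the index, so the overhead grows with
the average number of distinct terms per document.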

I haven't done any testing yet to see which way is better.

But in general, could anyone give me some advice on which way is better? I
think it could be a classic time vs. space trade-off in computer science, but
I would still like to get the opinions of you gurus.

Thanks in advance.

Jian
