There are a few possible approaches here. We had a similar use case and
went with the second one below. I primarily work with Solr, so I don't know
of any Lucene-only examples, but hopefully you can dig some up.

(1) You can attach payloads to each occurrence of the tag, and modify the
scoring to use

The second solution sounds great and a lot more natural than payloads.
I know how to override the Similarity class, but it would only be
called at search time, and by then it would already use the existing term frequency.
Looking up the probabilities every time a search is performed is
probably also

I want to index documents together with a list of tags (usually
10 to 30) that represent meta information about the document. Normally, I
would create an extra field named "tag" and store every tag, by its name,
in that field, creating 10 to 30 such fields and adding them to the document
before