Hi list

I’m trying to figure out how customizable scoring and weighting are in the
Lucene API. I have read the API documentation but still can’t figure out
whether the following is possible.

I would like to do normal document text indexing, but I would like to control
the weight assigned to each token myself. I would also like to control the
weighting of query tokens and how the weights are combined.

When indexing a word, I would like to attach my own weight to it and use
these weights when querying for documents. For example:

Doc 1
Lucene(0.7) is(0) a(0) powerful(0.9) indexing(0.62) and(0) search(0.99) API(0.3)

Doc 2
Lucene(0.5) is(0) used by(0) a(0) lot of(0) smart(0) people(0.1)

The floats in parentheses are values I would like to supply during indexing,
not something computed by Lucene (TF-IDF, for example).

When querying, I would like to do the same: assign the weight for each query
term myself and control how the final document score is calculated.
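
To make the idea concrete, here is a minimal sketch in plain Java (no Lucene
involved) of the kind of scoring I have in mind. The class name, the lowercased
terms, the query weights, and the choice of a simple sum of products are all
assumptions made for illustration; only the document weights come from the
example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CustomWeightScoring {

    // Score = sum over query terms of (query weight * document weight).
    // This combination rule is an assumption; any other formula could be
    // plugged in here. Terms missing from the document contribute 0.
    static double score(Map<String, Double> queryWeights,
                        Map<String, Double> docWeights) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : queryWeights.entrySet()) {
            total += e.getValue() * docWeights.getOrDefault(e.getKey(), 0.0);
        }
        return total;
    }

    // Weights for Doc 1, taken from the example (terms lowercased).
    static Map<String, Double> doc1() {
        return Map.of("lucene", 0.7, "is", 0.0, "a", 0.0, "powerful", 0.9,
                      "indexing", 0.62, "and", 0.0, "search", 0.99, "api", 0.3);
    }

    // Weights for Doc 2, taken from the example (terms lowercased).
    static Map<String, Double> doc2() {
        return Map.of("lucene", 0.5, "is", 0.0, "used by", 0.0, "a", 0.0,
                      "lot of", 0.0, "smart", 0.0, "people", 0.1);
    }

    public static void main(String[] args) {
        // Hypothetical query weights, chosen only for this illustration.
        Map<String, Double> query = new LinkedHashMap<>();
        query.put("lucene", 1.0);
        query.put("search", 0.5);

        System.out.println("Doc 1: " + score(query, doc1()));
        System.out.println("Doc 2: " + score(query, doc2()));
    }
}
```

With these weights Doc 1 should rank above Doc 2, because Doc 2 has no weight
for "search". What I want to know is whether Lucene lets me store and combine
weights like this.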

I have read that it’s possible to attach your own custom attributes to tokens.
Is this the way to go? I.e., should I add my custom weights as attributes to
tokens and then access these attributes when calculating the document score
during search (described here
https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/analysis/package-summary.html
under “adding a custom attribute”)?

The reason I’m asking is that I can’t find any examples of this being done
anywhere. I did, however, find someone stating: “With Lucene, it is impossible
to increase or decrease the weight of individual terms in a document”.

With regards
Rune 
