You could stuff your custom weights into a payload and index that,
but a payload is stored per term, per document, per position, while it
sounds like you just want one float for each term, regardless of the
documents/positions where that term occurred?
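
For what it's worth, a payload is just a small byte[] attached to each
position, so a float weight costs 4 bytes per occurrence.  Here is a
rough sketch of the packing in plain Java; WeightPayload is a made-up
helper for illustration, not a Lucene class (Lucene's own PayloadHelper
in org.apache.lucene.analysis.payloads should offer equivalent
encodeFloat/decodeFloat methods, if I recall correctly):

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Lucene: packs one float weight into the
// 4 bytes a per-position payload would carry, and unpacks it again.
final class WeightPayload {

    // Encode the weight as 4 big-endian bytes (ByteBuffer's default order).
    static byte[] encode(float weight) {
        return ByteBuffer.allocate(4).putFloat(weight).array();
    }

    // Decode a 4-byte payload back into the float weight.
    static float decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    public static void main(String[] args) {
        // Weight taken from the "Lucene(0.7)" example in the question.
        byte[] bytes = encode(0.7f);
        System.out.println(bytes.length + " bytes, weight " + decode(bytes));
    }
}
```

Every occurrence of every term would carry its own copy of those bytes,
which is the overhead I mean above.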

Doing your own custom attribute would be a challenge: not only must
you create & set this attribute during indexing, but you then must
change the indexing process (custom chain, custom codec) to get the
new attribute into the index, and then make a custom query that can
pull this attribute at search time.

What are these term weights?  Are you sure you can't compute these
weights at search time with a custom similarity using the stats that
are already stored (docFreq, totalTermFreq, maxDoc, etc.)?
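
To make that concrete, here is a minimal sketch of deriving a per-term
weight purely from stored stats.  StatsWeight and its methods are
hypothetical names, not Lucene API; the formula mirrors the idf that
DefaultSimilarity uses in 4.x as far as I remember, and in a real index
you would put the equivalent logic into your own Similarity subclass:

```java
// Hypothetical sketch, not Lucene's API: derive a per-term weight at search
// time from index statistics instead of storing a weight per term.
final class StatsWeight {

    // idf-style weight: the fewer documents contain the term, the larger
    // the weight. docFreq = documents containing the term, maxDoc = total.
    static float idf(long docFreq, long maxDoc) {
        return (float) (Math.log(maxDoc / (double) (docFreq + 1)) + 1.0);
    }

    // Toy document score: sum the weights of the query terms that matched,
    // ignoring term frequency and length normalization.
    static float score(long[] matchedDocFreqs, long maxDoc) {
        float sum = 0f;
        for (long docFreq : matchedDocFreqs) {
            sum += idf(docFreq, maxDoc);
        }
        return sum;
    }

    public static void main(String[] args) {
        long maxDoc = 1000;
        System.out.println("rare term:   " + idf(1, maxDoc));
        System.out.println("common term: " + idf(900, maxDoc));
    }
}
```

If your weights can be expressed this way, nothing extra needs to go
into the index at all.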

Mike McCandless

http://blog.mikemccandless.com


On Thu, Feb 13, 2014 at 2:40 AM, Rune Stilling <s...@rdfined.dk> wrote:
> Hi list
>
> I'm trying to figure out how customizable scoring and weighting are in the
> Lucene API. I have read about the APIs but still can't figure out whether the
> following is possible.
>
> I would like to do normal document text indexing, but I would like to control
> the weight added to tokens myself. I would also like to control the
> weighting of query tokens and how things are added together.
>
> When indexing a word I would like to attach my own weights to the word, and use
> these weights when querying for documents. For example:
>
> Doc 1
> Lucene(0.7) is(0) a(0) powerful(0.9) indexing(0.62) and(0) search(0.99) 
> API(0.3)
>
> Doc 2
> Lucene(0.5) is(0) used by(0) a(0) lot of(0) smart(0) people(0.1)
>
> The floats in parentheses are values I would like to add in the indexing
> process, not something coming from Lucene's tf/idf, for example.
>
> When querying I would like to repeat this and also create the weights for each
> term myself and control how the final doc score is calculated.
>
> I have read that it's possible to attach your own custom attributes to
> tokens. Is this the way to go? I.e., should I add my custom weights as
> attributes to tokens, and then access these attributes when calculating
> the document score in the search process (described here
> https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/analysis/package-summary.html
>  under "adding a custom attribute")?
>
> The reason why I'm asking is that I can't find any examples of this being 
> done anywhere. But I found someone stating "With Lucene, it is impossible to 
> increase or decrease the weight of individual terms in a document".
>
> With regards
> Rune
