I’m not sure how I would do that, given that Lucene is supposed to use my custom weights when it calculates document weights for a search query.

Doc 1
Lucene(0.7) is(0) a(0) powerful(0.9) indexing(0.62) and(0) search(0.99) API(0.3)

Doc 2
Lucene(0.5) is(0) used by(0) a(0) lot of(0) smart(0) people(0.1)

Query
Lucene

0.7 and 0.5 are my custom weights and should be used to return Doc 1 with weight 0.7 and Doc 2 with weight 0.5 as an answer to my query.

/Rune

On 13/02/2014 at 13:27, Shai Erera <ser...@gmail.com> wrote:

> I often prefer to manage such weights outside the index. Usually managing
> them inside the index leads to problems in the future when e.g. the weights
> change. If they are encoded in the index, it means re-indexing. Also, if
> the weight changes then in some segments the weight will be different than
> in others. I think that if you manage the weights e.g. in a simple FST (which
> is very compact), it will give you the best flexibility and it's very easy
> to use.
>
> Shai
>
> On Thu, Feb 13, 2014 at 1:36 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
>
>> You could stuff your custom weights into a payload, and index that,
>> but this is per term per document per position, while it sounds like
>> you just want one float for each term regardless of the
>> documents/positions where that term occurred?
>>
>> Doing your own custom attribute would be a challenge: not only must
>> you create & set this attribute during indexing, but you then must
>> change the indexing process (custom chain, custom codec) to get the
>> new attribute into the index, and then make a custom query that can
>> pull this attribute at search time.
>>
>> What are these term weights? Are you sure you can't compute these
>> weights at search time with a custom similarity using the stats that
>> are already stored (docFreq, totalTermFreq, maxDoc, etc.)?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Thu, Feb 13, 2014 at 2:40 AM, Rune Stilling <s...@rdfined.dk> wrote:
>>> Hi list
>>>
>>> I'm trying to figure out how customizable scoring and weighting is in
>>> the Lucene API. I read about the APIs but still can't figure out if the
>>> following is possible.
>>>
>>> I would like to do normal document text indexing, but I would like to
>>> control the weight added to tokens myself. I would also like to control
>>> the weighting of query tokens and how things are added together.
>>>
>>> When indexing a word I would like to attach my own weights to the word,
>>> and use these weights when querying for documents. For example:
>>>
>>> Doc 1
>>> Lucene(0.7) is(0) a(0) powerful(0.9) indexing(0.62) and(0) search(0.99) API(0.3)
>>>
>>> Doc 2
>>> Lucene(0.5) is(0) used by(0) a(0) lot of(0) smart(0) people(0.1)
>>>
>>> The floats in parentheses are values I would like to add in the indexing
>>> process, not something coming from Lucene's tf-idf, for example.
>>>
>>> When querying I would like to repeat this, create the weights for
>>> each term "myself", and control how the final doc score is calculated.
>>>
>>> I have read that it's possible to attach your own custom attributes to
>>> tokens. Is this the way to go? I.e., should I add my custom weights as
>>> attributes to tokens, and then access these attributes when calculating
>>> the document score in the search process (described here
>>> https://lucene.apache.org/core/4_4_0/core/org/apache/lucene/analysis/package-summary.html
>>> under "adding a custom attribute")?
>>>
>>> The reason why I'm asking is that I can't find any examples of this
>>> being done anywhere. But I found someone stating "With Lucene, it is
>>> impossible to increase or decrease the weight of individual terms in a
>>> document".
>>>
>>> With regards
>>> Rune
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
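
One way to read Shai's suggestion of managing the weights outside the index is sketched below in plain Java. It is only an illustration under stated assumptions: the class ExternalTermWeights and its methods are hypothetical and not part of Lucene, a HashMap stands in for the FST Shai mentions, and the scoring rule (sum of the stored weights of the query terms) is an assumption about what "control how the final doc score is calculated" could mean.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: keeps custom per-document term
// weights in memory (outside the index) and scores documents from them.
class ExternalTermWeights {
    // docId -> (lower-cased term -> custom weight)
    private final Map<String, Map<String, Float>> weights = new HashMap<>();

    void put(String docId, String term, float weight) {
        weights.computeIfAbsent(docId, k -> new HashMap<>())
               .put(term.toLowerCase(Locale.ROOT), weight);
    }

    // Assumed scoring rule: sum the stored weights of the query terms.
    float score(String docId, List<String> queryTerms) {
        Map<String, Float> doc =
            weights.getOrDefault(docId, Collections.<String, Float>emptyMap());
        float sum = 0f;
        for (String term : queryTerms) {
            sum += doc.getOrDefault(term.toLowerCase(Locale.ROOT), 0f);
        }
        return sum;
    }

    // Documents with a positive score, highest score first.
    Map<String, Float> search(List<String> queryTerms) {
        List<Map.Entry<String, Float>> hits = new ArrayList<>();
        for (String docId : weights.keySet()) {
            float s = score(docId, queryTerms);
            if (s > 0f) {
                hits.add(new AbstractMap.SimpleEntry<String, Float>(docId, s));
            }
        }
        hits.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
        Map<String, Float> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Float> hit : hits) {
            ranked.put(hit.getKey(), hit.getValue());
        }
        return ranked;
    }

    public static void main(String[] args) {
        ExternalTermWeights w = new ExternalTermWeights();
        // Non-zero weights from Rune's example documents.
        w.put("Doc 1", "Lucene", 0.7f);
        w.put("Doc 1", "powerful", 0.9f);
        w.put("Doc 1", "search", 0.99f);
        w.put("Doc 2", "Lucene", 0.5f);
        w.put("Doc 2", "people", 0.1f);
        System.out.println(w.search(Collections.singletonList("Lucene")));
    }
}
```

For the query "Lucene" this ranks Doc 1 (weight 0.7) above Doc 2 (weight 0.5), which is the result Rune asks for. Because the weights live outside the index, changing one is a single put call and no re-indexing is needed, which is the flexibility Shai describes.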