Re: Rich positions (was boosting fields)

2006-04-29 Thread Marvin Humphrey
score *= normDecoder[norms[doc] & 0xFF]; // normalize for field If we're talking NORMS_IN_FREQ, then you'd replace that line with one call to getBoost() against the TermDocs. (or maybe getNorm? getMultiplier?) I'll start there. Considering I don't have to worry about any index
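
For readers following along: the norm lookup in that line indexes a 256-entry table with a byte, and in Java the byte has to be masked with `& 0xFF` because bytes are signed. Below is a minimal sketch of that lookup. The class name, `NORM_DECODER`, `applyNorm` and the linear decode table are assumptions made up for illustration. Lucene's real byte-to-float norm encoding is different.

```java
class NormLookupSketch {
    // Hypothetical decode table: maps byte value i to i / 255.
    // This is NOT Lucene's actual norm encoding, only a stand-in.
    static final float[] NORM_DECODER = new float[256];
    static {
        for (int i = 0; i < 256; i++) {
            NORM_DECODER[i] = i / 255.0f;
        }
    }

    // Multiplies a raw score by the decoded norm of the given doc.
    // The mask turns a signed byte (-128..127) into a table index (0..255).
    static float applyNorm(float score, byte[] norms, int doc) {
        return score * NORM_DECODER[norms[doc] & 0xFF];
    }

    public static void main(String[] args) {
        byte[] norms = { (byte) 0xFF, 0 };
        System.out.println(applyNorm(2.0f, norms, 0)); // prints 2.0
        System.out.println(applyNorm(2.0f, norms, 1)); // prints 0.0
    }
}
```

Without the mask, `(byte) 0xFF` would be read as -1 and the array access would throw.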

Re: boosting fields

2006-04-27 Thread Doug Cutting
karl wettin wrote: My own immediate thought is to compromise by allowing boost per term in document. Simply remove the norms-methods from the IndexReader and add a new one to the TermEnum and fall back on the field boost. How would the value be picked up by the scorer? Boost per position,

Rich positions (was boosting fields)

2006-04-27 Thread Marvin Humphrey
On Apr 27, 2006, at 9:41 AM, Doug Cutting wrote: karl wettin wrote: My own immediate thought is to compromise by allowing boost per term in document. Simply remove the norms-methods from the IndexReader and add a new one to the TermEnum and fall back on the field boost. How would the

Re: Rich positions (was boosting fields)

2006-04-27 Thread Doug Cutting
Marvin Humphrey wrote: Moving away from cached norms was the second of three major changes to the file format on my agenda, and the one I was all but certain I wouldn't be able to sell to the Lucene community. The first was using bytecounts at the head of Strings. The third was storing

Re: Rich positions (was boosting fields)

2006-04-27 Thread Marvin Humphrey
Now that I think about it, putting the score-multiplier into the FreqFile does offer a benefit I hadn't considered before. It makes it possible to tie the score multiplier to a term within a doc, rather than a field within a doc. Say you have a doc with a body field that's 1000 terms

Re: Rich positions (was boosting fields)

2006-04-27 Thread karl wettin
On 27 Apr 2006, at 18:41, Doug Cutting wrote: karl wettin wrote: Boost per position, etc. sounds very expensive. Indeed. It will probably nearly double the size of indexes and also increase search time. But it is also very powerful. Consider the posting representation Google describes

Re: Rich positions (was boosting fields)

2006-04-27 Thread Doug Cutting
Marvin Humphrey wrote: Incidentally, how about calling it BOOST_PER_POSITION instead? +1, that is more consistent with other naming. Doug

Re: Rich positions (was boosting fields)

2006-04-27 Thread Marvin Humphrey
On Apr 27, 2006, at 2:35 PM, karl wettin wrote: What will be required in the IndexReader? Is it enough to add getBoost() in the TermEnum? How would the value be sent to the scorer? It wouldn't be the TermEnum, it would be a TermDocs subclass. If we're talking BOOST_PER_POSITION, it would

Re: Rich positions (was boosting fields)

2006-04-27 Thread karl wettin
On 28 Apr 2006, at 00:30, Marvin Humphrey wrote: On Apr 27, 2006, at 2:35 PM, karl wettin wrote: What will be required in the IndexReader? Is it enough to add getBoost() in the TermEnum? How would the value be sent to the scorer? It wouldn't be the TermEnum, it would be a TermDocs

Re: boosting fields

2006-04-26 Thread Doug Cutting
karl wettin wrote: karl wettin wrote: This could lead me to believe I can use different boost for fields with the same name within one document. You can. The values are multiplied to produce the final boost value for the field. It's not really the same thing as I tried to describe

boosting fields

2006-04-25 Thread karl wettin
I don't like how fields are configured. Document doc = new Document(); Field f; f = new Field("foo", "bar tzar", Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES); f.setBoost(1.5f); doc.add(f); f = new Field("foo", "blah yada", Field.Store.NO,

Re: boosting fields

2006-04-25 Thread karl wettin
On 25 Apr 2006, at 18:56, karl wettin wrote: How about refactoring fields to something like: [Document](fieldName)# {0..1} -[Field +boost]# {0..*} - [FieldValue +store +index +termVector] instead of as now: [Document](fieldName)# {0..1} -[Field +boost +store +index +termVector]

Re: boosting fields

2006-04-25 Thread Doug Cutting
karl wettin wrote: This could lead me to believe I can use different boost for fields with the same name within one document. You can. The values are multiplied to produce the final boost value for the field. This is described in:
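
To make the multiplication concrete, here is a minimal sketch of how the boosts of several same-named field instances in one document would combine under the rule stated above. `FieldBoostSketch` and `combinedBoost` are hypothetical names for illustration and are not part of the Lucene API.

```java
class FieldBoostSketch {
    // Hypothetical helper: multiplies the boosts set on each instance of a
    // same-named field, per the rule that the values are multiplied together.
    static float combinedBoost(float... boosts) {
        float product = 1.0f; // neutral boost when no instance sets one
        for (float b : boosts) {
            product *= b;
        }
        return product;
    }

    public static void main(String[] args) {
        // Two instances of one field, boosted 1.5 and 2.0.
        System.out.println(combinedBoost(1.5f, 2.0f)); // prints 3.0
    }
}
```

Under this rule a per-instance boost cannot be recovered at search time, since only the product is kept for the field.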

Re: boosting fields

2006-04-25 Thread karl wettin
On 25 Apr 2006, at 19:34, Doug Cutting wrote: karl wettin wrote: This could lead me to believe I can use different boost for fields with the same name within one document. You can. The values are multiplied to produce the final boost value for the field. This is described in: