score *= normDecoder[norms[doc] & 0xFF]; // normalize for field
If we're talking NORMS_IN_FREQ, then you'd replace that line with
one call to getBoost() against the TermDocs. (or maybe getNorm?
getMultiplier?)
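
To make the difference concrete, here is a rough sketch of the two lookups side by side. It is not Lucene code: BoostedTermDocs, ArrayTermDocs and both score methods are made-up names for illustration, and getBoost() is only the method name floated above, not an existing API.

```java
// Hypothetical sketch; none of these types are Lucene's real classes.
public class BoostSketch {
    /** Stand-in for a TermDocs subclass that carries a multiplier with each posting. */
    public interface BoostedTermDocs {
        boolean next();
        int doc();
        int freq();
        float getBoost(); // hypothetical; could equally be getNorm or getMultiplier
    }

    /** Tiny in-memory implementation so the sketch can be exercised. */
    public static final class ArrayTermDocs implements BoostedTermDocs {
        private final int[] docs;
        private final int[] freqs;
        private final float[] boosts;
        private int i = -1;

        public ArrayTermDocs(int[] docs, int[] freqs, float[] boosts) {
            this.docs = docs;
            this.freqs = freqs;
            this.boosts = boosts;
        }

        public boolean next() { return ++i < docs.length; }
        public int doc() { return docs[i]; }
        public int freq() { return freqs[i]; }
        public float getBoost() { return boosts[i]; }
    }

    /** Cached-norms style: one byte per document, decoded through a 256-entry table. */
    public static float scoreWithNorms(float raw, byte[] norms, float[] normDecoder, int doc) {
        // & 0xFF turns the signed byte into an index in 0..255
        return raw * normDecoder[norms[doc] & 0xFF];
    }

    /** NORMS_IN_FREQ style: the multiplier comes from the current posting. */
    public static float scoreWithBoost(float raw, BoostedTermDocs termDocs) {
        return raw * termDocs.getBoost();
    }
}
```

In the second form the scorer no longer needs the norms array at all; whatever sits in the posting is what gets multiplied in.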
I'll start there.
Considering I don't have to worry about any index
karl wettin wrote:
My own immediate thought is to compromise by allowing boost per term in
document. Simply remove the norms-methods from the IndexReader and add
a new one to the TermEnum and fall back on the field boost. How would
the value be picked up by the scorer?
Boost per position,
On Apr 27, 2006, at 9:41 AM, Doug Cutting wrote:
karl wettin wrote:
My own immediate thought is to compromise by allowing boost per
term in document. Simply remove the norms-methods from the
IndexReader and add a new one to the TermEnum and fall back on
the field boost. How would the
Marvin Humphrey wrote:
Moving away from cached norms was the second of three major changes to
the file format on my agenda, and the one I was all but certain I
wouldn't be able to sell to the Lucene community. The first was using
bytecounts at the head of Strings.
The third was storing
Now that I think about it, putting the score-multiplier into the
FreqFile does offer a benefit I hadn't considered before. It makes
it possible to tie the score multiplier to a term within a doc,
rather than a field within a doc.
Say you have a doc with a body field that's 1000 terms
27 apr 2006 kl. 18.41 skrev Doug Cutting:
karl wettin wrote:
Boost per position, etc. sounds very expensive.
Indeed. It will probably nearly double the size of indexes and
also increase search time. But it is also very powerful. Consider
the posting representation Google describes
Marvin Humphrey wrote:
Incidentally, how about calling it BOOST_PER_POSITION instead?
+1, that is more consistent with other naming.
Doug
On Apr 27, 2006, at 2:35 PM, karl wettin wrote:
What will be required in the IndexReader? Is it enough to add
getBoost() in the TermEnum? How would the value be sent to the scorer?
It wouldn't be the TermEnum, it would be a TermDocs subclass. If
we're talking BOOST_PER_POSITION, it would
28 apr 2006 kl. 00.30 skrev Marvin Humphrey:
On Apr 27, 2006, at 2:35 PM, karl wettin wrote:
What will be required in the IndexReader? Is it enough to add
getBoost() in the TermEnum? How would the value be sent to the
scorer?
It wouldn't be the TermEnum, it would be a TermDocs
karl wettin wrote:
karl wettin wrote:
This could lead me to believe I can use different boost for fields
with the same name within one document.
You can. The values are multiplied to produce the final boost value
for the field.
It's not really the same thing as I tried to describe
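
For reference, the behaviour Doug describes (boosts of same-named fields in one document being multiplied into a single field boost) can be sketched roughly as below. This is an illustration only: FieldInstance and combinedBoosts are invented names, not Lucene classes, and the sketch says nothing about how the result is later encoded into a norm.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not Lucene's Field or Document classes.
public class FieldBoostSketch {
    /** Minimal stand-in for one field instance: just a name and a boost. */
    public static final class FieldInstance {
        public final String name;
        public final float boost;

        public FieldInstance(String name, float boost) {
            this.name = name;
            this.boost = boost;
        }
    }

    /** Multiplies the boosts of all instances that share a field name. */
    public static Map<String, Float> combinedBoosts(List<FieldInstance> fields) {
        Map<String, Float> result = new LinkedHashMap<>();
        for (FieldInstance f : fields) {
            // First occurrence stores the boost; later ones multiply into it.
            result.merge(f.name, f.boost, (a, b) -> a * b);
        }
        return result;
    }
}
```

So two "foo" instances with boosts 1.5 and 2.0 (made-up values) end up as one boost for "foo", which is why the individual values cannot be told apart at search time.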
I don't like how fields are configured.
Document doc = new Document();
Field f;
f = new Field("foo", "bar tzar", Field.Store.NO,
Field.Index.TOKENIZED, Field.TermVector.YES);
f.setBoost(1.5f);
doc.add(f);
f = new Field("foo", "blah yada", Field.Store.NO,
25 apr 2006 kl. 18.56 skrev karl wettin:
How about refactoring fields to something like:
[Document](fieldName)# {0..1} -[Field +boost]# {0..*} -
[FieldValue +store +index +termVector]
instead of as now:
[Document](fieldName)# {0..1} -[Field +boost +store +index
+termVector]
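
One possible reading of the proposed structure, sketched in Java. All class and member names here are assumptions made for illustration; this is not an existing Lucene API and not necessarily what the proposal had in mind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposed Document -> Field -> FieldValue split.
public class FieldModelSketch {
    /** One value of a field; store/index/termVector settings live here. */
    public static final class FieldValue {
        public final String text;
        public final boolean store;
        public final boolean index;
        public final boolean termVector;

        public FieldValue(String text, boolean store, boolean index, boolean termVector) {
            this.text = text;
            this.store = store;
            this.index = index;
            this.termVector = termVector;
        }
    }

    /** At most one Field per name and document; it carries the single boost. */
    public static final class Field {
        public final String name;
        public float boost = 1.0f;
        public final List<FieldValue> values = new ArrayList<>();

        public Field(String name) {
            this.name = name;
        }
    }

    /** Document keyed by field name ({0..1} Field per name, {0..*} values per Field). */
    public static final class Document {
        private final Map<String, Field> fields = new LinkedHashMap<>();

        /** Returns the Field for this name, creating it on first use. */
        public Field field(String name) {
            return fields.computeIfAbsent(name, Field::new);
        }
    }
}
```

With this shape the boost is set once per field name, and adding a second value to "foo" cannot suggest that it has a boost of its own.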
karl wettin wrote:
This could lead me to believe I can use different boost for fields with
the same name within one document.
You can. The values are multiplied to produce the final boost value for
the field. This is described in:
25 apr 2006 kl. 19.34 skrev Doug Cutting:
karl wettin wrote:
This could lead me to believe I can use different boost for
fields with the same name within one document.
You can. The values are multiplied to produce the final boost
value for the field. This is described in: