Oooooo.... I like the BAR_significant field idea. It seems that you'd
have to have one of those for every different level of boosting in your
document. But that is significantly easier than reforming a query for
30-odd fields.
The next quersion would be should you omit the boosted field words from
the default searchg field to get the "appropriate" score for that
document. In other words, would matching the query in the
BAR_significant and the default search field return a different score
than just matching the BAR_significant field for the query
"BAR_significant:queryword queryword"? They should return the same set
of results, just curious about how the ranking order might differ. Not
overly important, more for my own personal curiosity.
Thanks Chris for the quick response to the previous question, it helped
a lot.
Les
Chris Hostetter wrote:
The full post Erick alluded too may be helpful...
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200609.mbox/[EMAIL
PROTECTED]
in general, if your goal is that words in the "metadata" of a document
should be worth more then words in the "body" then you should have two
fields: "metadata" and "body", you shoudl query for the same word in both
fields, and your query should have a query boost on the "metadata" part.
: way it seems to work, is that if you boost a field then you have to
: actually specify that field in your query to benefit from that field
correct ... you are saying "this documents field_X is better then other
document's field_X" .. but that info only comes into play if you actually
use "field_X"
: ignored. I hacked around this by just adding the field's text to the
: default search field n times where n is the boost for that field. I
: seriously doubt that this the ideal way to do it, but I couldn't figure
: out how to do it other than reforming all of my queries to search all
: the fields.
that is a feasible solution to the use case where "this doc's use of the
word FOO in field BAR is more significant then the use of the word FOO in
field BAR for any other docs" but 90% of the time those use cases can be
better solved by making a BAR_significant field that only contains the
significant words in BAR and querying both.
for the other 10% of the time, you'll have to wait for Payload based
queries to come along, so when you add FOO to BAR you can give it a
payload that says how important it is.
(or so the Query Payload people say ... i choose to believe them even
though i don't understand them)
-Hoss
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]