Re: sloppyFreq question

Chris Hostetter Wed, 11 Mar 2009 12:12:53 -0700

: For a 'SpanNearQuery', this reduces the effect of the term frequency on the
: score as the number of terms in the span increases. So, for a simple phrase
: query (using spans), the longer the phrase, the lower the TF. For a simple
: SpanTermQuery, the TF is reduced in half (1.0f / (1 + 1)).
: 
: I'm just wondering why this is the default behavior. For 'SpanTermQuery',
: I'd expect the TF to reflect the actual number of occurrences of the term.
: For a SpanNearQuery, wouldn't it still be the number of occurrences of the
: whole span, not the number of terms in the span?


I believe it's because a Span typically encompasses multiple positions -- 
there's no advantage I can think of for executing a SpanTermQuery 
directly.  Note that when you execute a SpanQuery, it doesn't pay any 
attention to the tf/idf of any nested queries; it only looks at the 
aggregated Spans.
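
To make the arithmetic in the question concrete, here is a rough standalone 
sketch (it does not use Lucene itself).  It assumes the default sloppyFreq 
is 1 / (distance + 1), as quoted above, and that the span scorer passes the 
length of each span (end - start) as the distance; the class and method 
names are made up for illustration.

```java
// Standalone sketch; it does not depend on Lucene. All names are illustrative.
final class SloppyFreqSketch {

    // Mirrors the default formula discussed in this thread: 1 / (distance + 1).
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Sums the contribution of every span matched in one document.
    // Assumption: each span contributes sloppyFreq(end - start), so nested
    // term frequencies are ignored and only the span lengths matter.
    static float spanFreq(int[][] spans) {
        float freq = 0.0f;
        for (int[] span : spans) {
            int matchLength = span[1] - span[0];
            freq += sloppyFreq(matchLength);
        }
        return freq;
    }

    public static void main(String[] args) {
        // A single-term span covers one position: {start, start + 1}.
        int[][] termSpans = { {3, 4}, {10, 11} };
        // A three-term phrase span covers three positions.
        int[][] phraseSpans = { {3, 6}, {10, 13} };

        System.out.println("two single-term spans: " + spanFreq(termSpans));
        System.out.println("two three-term spans:  " + spanFreq(phraseSpans));
    }
}
```

Under those assumptions each single-term span contributes half of what a 
plain term occurrence would, and longer spans contribute less still, which 
is the behavior described in the question.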

I suppose SpanTermQuery could override the weight/scorer methods so that 
it behaved more like a TermQuery if it was executed directly ... but 
that's really not what it's intended for.

(it's unfortunate that all of the SpanQueries use a hierarchical class 
structure instead of having a single SpanQuery that composes a 
"SpanClause" hierarchy)


-Hoss


