: For a 'SpanNearQuery', this reduces the effect of the term frequency on the
: score as the number of terms in the span increases. So, for a simple phrase
: query (using spans), the longer the phrase, the lower the TF. For a simple
: SpanTermQuery, the TF is cut in half (1.0f / (1 + 1)).
:
: I'm just wondering why this is the default behavior. For 'SpanTermQuery',
: I'd expect the TF to reflect the actual number of occurrences of the term.
: For a SpanNearQuery, wouldn't it still be the number of occurrences of the
: whole span, not the number of terms in the span?
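To make the arithmetic in the question concrete, here is a minimal sketch. It is not Lucene's code: the class and method names are hypothetical, and it assumes that the default similarity computes the sloppy frequency as 1 / (distance + 1) and that the span scorer adds that value once per matching span, using the span's length (end minus start) as the distance.

```java
// Sketch only: reproduces the arithmetic under discussion, not Lucene's classes.
// SpanFreqSketch and its methods are hypothetical names used for illustration.
final class SpanFreqSketch {

    // Assumption: the default similarity's sloppy frequency is 1 / (distance + 1).
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Each span is {start, end} with end exclusive. Assumption: the
    // per-document frequency is the sum of sloppyFreq(end - start) over
    // all matching spans, so longer spans contribute less.
    static float spanFreq(int[][] spans) {
        float freq = 0.0f;
        for (int[] span : spans) {
            int matchLength = span[1] - span[0];
            freq += sloppyFreq(matchLength);
        }
        return freq;
    }

    public static void main(String[] args) {
        // Two occurrences of a single term: each span covers one position.
        int[][] termSpans = { {4, 5}, {9, 10} };
        // Two occurrences of a three-term phrase: each span covers three positions.
        int[][] phraseSpans = { {4, 7}, {9, 12} };

        System.out.println("single-term spans, freq = " + spanFreq(termSpans));
        System.out.println("three-term spans,  freq = " + spanFreq(phraseSpans));
    }
}
```

Under these assumptions, two occurrences of a single term yield a frequency of 1.0 rather than 2.0, and two occurrences of a three-term phrase yield 0.5, which is the effect the question describes.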
I believe it's because a Span typically encompasses multiple positions.
There's no advantage I can think of for executing a SpanTermQuery directly.

Note that when you execute a SpanQuery, it doesn't pay any attention to the
tf/idf of any nested queries; it only looks at the aggregated Spans.

I suppose SpanTermQuery could override the weight/scorer methods so that it
behaved more like a TermQuery if it was executed directly ... but that's
really not what it's intended for.

(It's unfortunate that all of the SpanQueries use a hierarchical class
structure instead of having a single SpanQuery that composes a "SpanClause"
hierarchy.)

-Hoss