[ https://issues.apache.org/jira/browse/LUCENE-9204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368791#comment-17368791 ]
Jim Ferenczi commented on LUCENE-9204:
--------------------------------------

Thanks for sharing [~mgibney]!

> However, this still leaves SynonymGraphTokenFilterFactory and
> WordDelimiterGraphTokenFilterFactory (in Elasticsearch) as potentially
> triggering this kind of expansion (in a manner identical to what's reported
> in the above-referenced thread from the solr users list).

Only phrase queries are affected, though. For this type of query I expect both the number of expansions and the number of terms to be low. We generate the query lazily, so the max-clause-count check ensures that we don't build the full query if it's gigantic. We can also revive the optimization that we implemented with Spans and replace it with Intervals. However, it's not as easy as it looks, since Intervals use a different scoring mechanism.

> This is a spooky result! I did not know our IntervalQuery for the
> disjunctive case had exponential cost in the number of clauses.

This happens only in a special case where duplicated terms appear at different positions. It's not ideal, but in this situation we favored correctness, which is always an issue with positional queries.

I also wonder if we could test something that's not an edge case but a more realistic query with duplicate terms. Right now we compare the performance of queries that return different result sets, so it's difficult to conclude anything.

> Move span queries to the queries module
> ---------------------------------------
>
>                 Key: LUCENE-9204
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9204
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>             Fix For: main (9.0)
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> We have a slightly odd situation currently, with two parallel query
> structures for building complex positional queries: the long-standing span
> queries, in core; and interval queries, in the queries module. Given that
> interval queries solve at least some of the problems we've had with Spans, I
> think we should be pushing users more towards these implementations. It's
> counter-intuitive to do that when Spans are in core though. I've opened this
> issue to discuss moving the spans package as a whole to the queries module.
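
To illustrate the point in the comment about generating the query lazily under a max-clause check, here is a minimal sketch in plain Java. It is not Lucene's implementation: `LazyPhraseExpansion`, `TooManyClausesException`, `paths`, and `expand` are hypothetical names, and the token graph is simplified to a list of independent alternatives per position (a real synonym graph can have alternatives that span several positions). The idea it shows is that phrases are produced one at a time, so an oversized expansion is rejected after at most `maxClauses` phrases have been built.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyPhraseExpansion {

    /** Hypothetical stand-in for a "too many clauses" error; not a Lucene class. */
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int max) {
            super("expansion exceeds " + max + " clauses");
        }
    }

    /**
     * Lazily enumerates every phrase obtained by picking one alternative per
     * position. Nothing is materialized until next() is called.
     */
    static Iterator<List<String>> paths(final List<List<String>> alternatives) {
        return new Iterator<List<String>>() {
            // One counter per position, advanced like an odometer.
            private final int[] idx = new int[alternatives.size()];
            private boolean done = alternatives.isEmpty() || hasEmptyPosition();

            private boolean hasEmptyPosition() {
                for (List<String> alts : alternatives) {
                    if (alts.isEmpty()) return true;
                }
                return false;
            }

            @Override
            public boolean hasNext() {
                return !done;
            }

            @Override
            public List<String> next() {
                if (done) throw new NoSuchElementException();
                List<String> phrase = new ArrayList<>(idx.length);
                for (int i = 0; i < idx.length; i++) {
                    phrase.add(alternatives.get(i).get(idx[i]));
                }
                // Advance the rightmost counter, carrying to the left on overflow.
                int pos = idx.length - 1;
                while (pos >= 0 && ++idx[pos] == alternatives.get(pos).size()) {
                    idx[pos] = 0;
                    pos--;
                }
                if (pos < 0) done = true; // every counter wrapped: enumeration finished
                return phrase;
            }
        };
    }

    /**
     * Collects phrases as clauses, failing as soon as one more clause than
     * maxClauses would be needed, so the full expansion is never built.
     */
    static List<List<String>> expand(List<List<String>> alternatives, int maxClauses) {
        List<List<String>> clauses = new ArrayList<>();
        Iterator<List<String>> it = paths(alternatives);
        while (it.hasNext()) {
            if (clauses.size() >= maxClauses) {
                throw new TooManyClausesException(maxClauses);
            }
            clauses.add(it.next());
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Two positions with two alternatives each: a small expansion.
        List<List<String>> small = new ArrayList<>();
        small.add(List.of("wi", "wireless"));
        small.add(List.of("fi", "network"));
        System.out.println(expand(small, 1024));

        // Twenty positions with two alternatives each: rejected by the cap.
        List<List<String>> large = Collections.nCopies(20, List.of("a", "b"));
        try {
            expand(large, 1024);
        } catch (TooManyClausesException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the cap in place, the cost of a pathological input is bounded by `maxClauses` rather than by the size of the full expansion, which is the property the comment relies on.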