cpoerschke commented on code in PR #3418: URL: https://github.com/apache/solr/pull/3418#discussion_r2318785260
##########
solr/solr-ref-guide/modules/query-guide/pages/json-combined-query-dsl.adoc:
##########
@@ -0,0 +1,112 @@
+= JSON Combined Query DSL
+:tabs-sync-option:
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The Combined Query feature aims to execute multiple queries of multiple kinds across multiple shards of a collection and combine their result basis an algorithm (like Reciprocal Rank Fusion).
+It is extending JSON Query DSL ultimately enabling Hybrid Search.
+
+[NOTE]
+====
+This feature is currently unsupported for grouping and Cursors.
+====
+
+== Query DSL Structure
+The query structure is similar to JSON Query DSL except for how multiple queries are defined along with their parameters.
+
+* Multiple queries can be defined under the `queries` key by providing their name with the same syntax as a single query is defined with the key `query`.
+* In addition to the other supported parameters, there are several parameters which can be defined under `params` key as below:
+`combiner` | Default: `false`::
+  Enables the combined query mode when set to `true`.
+`combiner.query`::
+  The list of queries to be executed as defined in the `queries` key. Example: `["query1", "query2"]`
+`combiner.algorithm` | Default: `rrf`::
+  The algorithm to be used for combining the results. Reciprocal Rank Fusion (RRF) is the in-built fusion algorithm.
+  Any other algorithm can be configured using xref:json-combined-query-dsl.adoc#combiner-algorithm-plugin[plugin].
+`combiner.rrf.k` | Default: `60`::
+  The k parameter in the RRF algorithm.
+
+=== Example
+
+Below is a sample JSON query payload:

Review Comment:
My understanding is that
* if (say) `lexical1` is for `title:sales` and there are (say) `numFound=500` results and
* if (say) `lexical2` is for `title:report` and there are (say) `numFound=300` results and
* we combine the two queries and use (say) `rows=10` and
* there are (say) 100 documents in common between the 500 and 300
* then we don't know what is in common i.e. we can only observe the overlap of (say) 3 documents amongst the 10 that each of `lexical1` and `lexical2` returned.
* So then `500 + 300 - 3` would be the calculation whereas `500 + 300 - 100` is the precise number if one were to interpret it as _"number of documents that matched either or both queries"_ or something along those lines.

This is based on code reading/interpretation only so far i.e. haven't yet tried it out locally and/or via a test.
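The arithmetic in the bullets can be sketched with synthetic document ids. This is a plain Python illustration and not Solr code: the id ranges, the two top-10 lists, and the helper names are all made up, chosen so that 100 documents match both queries while only 3 of them appear in both returned pages.

```python
# Synthetic doc ids only; nothing here calls Solr or mirrors its internals.
lexical1_matches = set(range(0, 500))    # 500 docs matching title:sales
lexical2_matches = set(range(400, 700))  # 300 docs matching title:report (ids 400-499 are shared)

# Hypothetical top-10 pages (rows=10) per query, arranged so 3 ids overlap.
lexical1_top = [400, 401, 402, 0, 1, 2, 3, 4, 5, 6]
lexical2_top = [400, 401, 402, 600, 601, 602, 603, 604, 605, 606]


def observed_estimate(num_found_1, num_found_2, top_1, top_2):
    """Count derived only from per-query numFound and the returned rows."""
    visible_overlap = len(set(top_1) & set(top_2))
    return num_found_1 + num_found_2 - visible_overlap


def exact_union(matches_1, matches_2):
    """Number of documents that matched either or both queries."""
    return len(matches_1 | matches_2)


estimate = observed_estimate(
    len(lexical1_matches), len(lexical2_matches), lexical1_top, lexical2_top
)
exact = exact_union(lexical1_matches, lexical2_matches)

print(estimate)  # 500 + 300 - 3
print(exact)     # 500 + 300 - 100
```

Under these assumptions the row-based figure overstates the union by the 97 shared documents that never appear in either returned page.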
