vtlim commented on code in PR #18179: URL: https://github.com/apache/druid/pull/18179#discussion_r2193651540
########## docs/querying/set-query-context.md: ########## @@ -0,0 +1,240 @@ +--- +id: set-query-context +title: "Set query context" +sidebar_label: "Set query context" +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + + +The query context gives you fine-grained control over how Apache Druid executes your individual queries. While the default settings in Druid work well for most queries, you can set query context to handle specific requirements and optimize performance. + +Common use cases for the query context include: +- Override default timeouts for long-running queries or complex aggregations. +- Control resource usage to prevent expensive queries from overwhelming your cluster. +- Debug query performance by disabling caching during testing. +- Configure SQL-specific behaviors like time zones for accurate time-based analysis. +- Set priorities to ensure critical queries get computational resources first. +- Adjust memory limits for queries that process large datasets. + +Druid provides several ways to set query context, and the method you use depends on how and where you're submitting your query. +This guide lists how to set query context for each method of submitting queries. 
+ +Before you begin, identify which context parameters you need to configure in order to establish your query context as query context carriers. For available parameters and their descriptions, see [Query context reference](query-context-reference.md). + +## Druid web console + +The most straightforward method to configure query context parameters is via the Druid web console. In web console, you can set up context parameters for both Druid SQL and native queries. + +The following steps outline how to define query context parameters: + +1. Open the **Query** tab in the web console. + +1. **Click** the **Engine** selector next to the **Run** button to choose the appropriate query type: + +- Selects the **JSON (native) engine** for native queries. +- Selects the **SQL (native) engine** for Druid SQL queries. +- Selects the **SQL (task) engine** for Multi-stage queries (MSQ). +- Selects the **Auto engine** to let the console detect the query type automatically. You just paste your query into the **Query** view, and the web console chooses the right engine for you. + +2. Enter the query you want to run in the web console. + +3. Select **Edit context** from the menu. +4. In the **Edit query context** dialog, add your context parameters as JSON key-value pairs: + ```json + { + "timeout": 300000, + "useCache": false + } + ``` +5. Click **Save** to apply the context to your query. +6. Click **Run** to execute your query with the specified context parameters. + +The web console validates your JSON and highlights any syntax errors before you run the query. + +For more information about using the Druid SQL Web Console Query view, see [Query view](../operations/web-console.md#query). + +## Druid SQL + +When you use Druid SQL programmatically—such as in applications, automated scripts, or database tools, you can set query context using three different methods depending on how you execute your queries beyond [Druid Web Console](./set-query-context.md#druid-web-console). 
+ +### HTTP API + +When using the HTTP API, you include query context parameters in the `context` object of your JSON request. + +The following example sets the `sqlTimeZone` parameter: + + ```json + { + "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'", + "context" : { + "sqlTimeZone" : "America/Los_Angeles" + } + } + ``` + +Druid will execute your query with the specified context parameters and return the results. + +You can set multiple context parameters in a single request: + +```json +{ + "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar'", + "context" : { + "timeout" : 30000, + "useCache" : false, + "sqlTimeZone" : "America/Los_Angeles" + } +} +``` +For more information on how to format Druid SQL API requests and handle responses, see [Druid SQL API](../api-reference/sql-api.md). + + +### JDBC driver API + +When connecting to Druid through JDBC, you set query context parameters a JDBC connection properties object. This approach is useful when integrating Druid with BI tools or Java applications. + +Druid uses the Avatica JDBC driver (version 1.23.0 or later recommended). Note that Avatica does not support passing connection context parameters from the JDBC connection string—you must use a `Properties` object instead. + + +You can set query context parameters when creating your JDBC connection: + +```java +String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/"; + +// Set any query context parameters you need here. +Properties connectionProperties = new Properties(); +connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles"); +connectionProperties.setProperty("useCache", "false"); + +try (Connection connection = DriverManager.getConnection(url, connectionProperties)) { + // create and execute statements, process result sets, etc +} +``` + +For more details on how to use JDBC driver API, see [Druid SQL JDBC driver API](../api-reference/sql-jdbc.md). 
+ +### SET statements + +Beyond using the `context` parameter, you can use `SET` command to specify SQL query context parameters that modify the behavior of a Druid SQL query. You can include one or more SET statements before the main SQL query. You can use `SET` in the both web console and Druid SQL HTTP API. + +In the web console, you can write your `SET` statements followed by your query directly. For example, + +```sql +SET useApproximateTopN = false; +SET sqlTimeZone = 'America/Los_Angeles'; +SET timeout = 90000; +SELECT some_column, COUNT(*) +FROM druid.foo +WHERE other_column = 'foo' +GROUP BY 1 +ORDER BY 2 DESC +``` + +You can also include your SET statements as part of the query string in your HTTP API call. For example, + +```bash +curl -X POST 'http://localhost:8888/druid/v2/sql' \ + -H 'Content-Type: application/json' \ + -d '{ + "query": "SET useApproximateTopN = false; SET sqlTimeZone = '\''America/Los_Angeles'\''; SET timeout = 90000; SELECT some_column, COUNT(*) FROM druid.foo WHERE other_column = '\''foo'\'' GROUP BY 1 ORDER BY 2 DESC" + }' +``` + +You can also combine SET statements with the `context` field. If you include both, the parameter value in SET takes precedence: + +```json +{ + "query": "SET timeout = 90000; SELECT COUNT(*) FROM data_source", + "context": { + "timeout": 30000, // This will be overridden by SET + "priority": 100 // This will still apply + } +} +``` + +SET statements only apply to the query in the same request. Subsequent requests are not affected. + +SET statements work with SELECT, INSERT, and REPLACE queries. + +For more details on how to use the SET command in your SQL query, see [SET](sql.md#set). + +:::info + You cannot use SET statements when using Druid SQL JDBC connections. +::: + + +## Native queries + +For native queries, you can include query context parameters in a JSON object named `context` within your query structure or through [Druid Web Console](./set-query-context.md#druid-web-console). 
Review Comment: See earlier comment on web console

########## docs/querying/set-query-context.md: ##########
@@ -0,0 +1,240 @@ +--- +id: set-query-context +title: "Set query context" +sidebar_label: "Set query context" +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + + +The query context gives you fine-grained control over how Apache Druid executes your individual queries. While the default settings in Druid work well for most queries, you can set query context to handle specific requirements and optimize performance. + +Common use cases for the query context include: +- Override default timeouts for long-running queries or complex aggregations. +- Control resource usage to prevent expensive queries from overwhelming your cluster. +- Debug query performance by disabling caching during testing. +- Configure SQL-specific behaviors like time zones for accurate time-based analysis. +- Set priorities to ensure critical queries get computational resources first. +- Adjust memory limits for queries that process large datasets. + +Druid provides several ways to set query context, and the method you use depends on how and where you're submitting your query.
+This guide lists how to set query context for each method of submitting queries.

Review Comment: "how to set **the** query context"

########## docs/querying/querying.md: ##########
@@ -152,3 +152,6 @@ Possible Druid error codes for the `error` field include: |`Query cancelled`|500|The query was cancelled through the query cancellation API.| |`Truncated response context`|500|An intermediate response context for the query exceeded the built-in limit of 7KiB.<br/><br/>The response context is an internal data structure that Druid servers use to share out-of-band information when sending query results to each other. It is serialized in an HTTP header with a maximum length of 7KiB. This error occurs when an intermediate response context sent from a data server (like a Historical) to the Broker exceeds this limit.<br/><br/>The response context is used for a variety of purposes, but the one most likely to generate a large context is sharing details about segments that move during a query. That means this error can potentially indicate that a very large number of segments moved in between the time a Broker issued a query and the time it was processed on Historicals. This should rarely, if ever, occur during normal operation.| |`Unknown exception`|500|Some other exception occurred. Check errorMessage and errorClass for details, although keep in mind that the contents of those fields are free-form and may change from release to release.| + +## Learn more +[Set the query context](./set-query-context.md) on how the different approaches for set query context.

Review Comment: This doesn't match the actual page title

########## docs/querying/query-context-reference.md: ##########
@@ -23,20 +23,26 @@ sidebar_label: "Query context" ~ under the License. --> -The query context is used for various query configuration parameters.
Query context parameters can be specified in -the following ways: - -- For [Druid SQL](../api-reference/sql-api.md), context parameters are provided either in a JSON object named `context` to the -HTTP POST API, or as properties to the JDBC connection. -- For [native queries](querying.md), context parameters are provided in a JSON object named `context`. +The query context provides runtime configuration for individual queries in Apache Druid. Each parameter in the query context controls a specific aspect of query behavior—from execution timeouts and resource limits to caching policies and processing strategies. Note that setting query context will override both the default value and the runtime properties value in the format of `druid.query.default.context.{property_key}` (if set). +This reference contains context parameters organized by their scope: + +- **General parameters**: Applies to all query types. +- **Parameters by query type**: Applies to the specific type of query, such as TopN, Timeseries, or GroupBy. +- **Vectorization parameters**: Controls vectorized query execution for supported query types. + +To learn how to set query context, see [Set query context](./set-query-context.md).

Review Comment: "how to set **the** query context"

########## docs/querying/set-query-context.md: ##########
@@ -0,0 +1,240 @@ +--- +id: set-query-context +title: "Set query context" +sidebar_label: "Set query context" +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License.
You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + + +The query context gives you fine-grained control over how Apache Druid executes your individual queries. While the default settings in Druid work well for most queries, you can set query context to handle specific requirements and optimize performance. + +Common use cases for the query context include: +- Override default timeouts for long-running queries or complex aggregations. +- Control resource usage to prevent expensive queries from overwhelming your cluster. +- Debug query performance by disabling caching during testing. +- Configure SQL-specific behaviors like time zones for accurate time-based analysis. +- Set priorities to ensure critical queries get computational resources first. +- Adjust memory limits for queries that process large datasets. + +Druid provides several ways to set query context, and the method you use depends on how and where you're submitting your query. +This guide lists how to set query context for each method of submitting queries. + +Before you begin, identify which context parameters you need to configure in order to establish your query context as query context carriers. For available parameters and their descriptions, see [Query context reference](query-context-reference.md). + +## Druid web console + +The most straightforward method to configure query context parameters is via the Druid web console. In web console, you can set up context parameters for both Druid SQL and native queries. + +The following steps outline how to define query context parameters: + +1. 
Open the **Query** tab in the web console. + +1. **Click** the **Engine** selector next to the **Run** button to choose the appropriate query type: + +- Selects the **JSON (native) engine** for native queries. +- Selects the **SQL (native) engine** for Druid SQL queries. +- Selects the **SQL (task) engine** for Multi-stage queries (MSQ). +- Selects the **Auto engine** to let the console detect the query type automatically. You just paste your query into the **Query** view, and the web console chooses the right engine for you. + +2. Enter the query you want to run in the web console. + +3. Select **Edit context** from the menu. +4. In the **Edit query context** dialog, add your context parameters as JSON key-value pairs: + ```json + { + "timeout": 300000, + "useCache": false + } + ``` +5. Click **Save** to apply the context to your query. +6. Click **Run** to execute your query with the specified context parameters. + +The web console validates your JSON and highlights any syntax errors before you run the query. + +For more information about using the Druid SQL Web Console Query view, see [Query view](../operations/web-console.md#query). + +## Druid SQL + +When you use Druid SQL programmatically—such as in applications, automated scripts, or database tools, you can set query context using three different methods depending on how you execute your queries beyond [Druid Web Console](./set-query-context.md#druid-web-console).

Review Comment: Druid Web Console -- fix casing. Also check how this is referred to in other places. Is it called "Druid web console" everywhere or just "web console"?

########## docs/querying/sql-query-context.md: ########## @@ -59,51 +59,6 @@ For more information, see [Overriding default query context values](../configura |`inFunctionExprThreshold`|At or beyond this threshold number of values, SQL `IN` is eligible for execution using the native function `scalar_in_array` rather than an <code>||</code> of `==`, even if the number of values is below `inFunctionThreshold`. This property only affects translation of SQL `IN` to a [native expression](math-expr.md). It doesn't affect translation of SQL `IN` to a [native filter](filters.md). This property is provided for backwards compatibility purposes, and may be removed in a future release.|`2`| |`inSubQueryThreshold`|At or beyond this threshold number of values, Druid converts SQL `IN` to `JOIN` on an inline table. `inFunctionThreshold` takes priority over this setting. A threshold of 0 forces usage of an inline table in all cases where the size of a SQL `IN` is larger than `inFunctionThreshold`. A threshold of `2147483647` disables the rewrite of SQL `IN` to `JOIN`. |`2147483647`| -## Set the query context - -How query context parameters are set differs depending on whether you are using the [JSON API](../api-reference/sql-api.md) or [JDBC](../api-reference/sql-jdbc.md). - -### Set the query context when using JSON API -When using the JSON API, you can configure query context parameters in the `context` object of the request. - -For example: - -``` -{ - "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'", - "context" : { - "sqlTimeZone" : "America/Los_Angeles", - "useCache": false - } -} -``` - -Context parameters can also be set by including [SET](./sql.md#set) as part of the `query` -string in the request, separated from the query by `;`. Context parameters set by `SET` statements take priority over -values set in `context`. 
- -The following example expresses the previous example in this form: - -``` -{ - "query" : "SET sqlTimeZone = 'America/Los_Angeles'; SET useCache = false; SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP '2000-01-01 00:00:00'" -} -``` - -### Set the query context when using JDBC -If using JDBC, context parameters can be set using [connection properties object](../api-reference/sql-jdbc.md). - -For example: - -```java -String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/"; - -// Set any query context parameters you need here. -Properties connectionProperties = new Properties(); -connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles"); -connectionProperties.setProperty("useCache", "false"); - -try (Connection connection = DriverManager.getConnection(url, connectionProperties)) { - // create and execute statements, process result sets, etc -} -``` +## Learn more +- [Set query context](../querying/set-query-context.md) for how to set query context. Review Comment: "how to set **the** query context"
