This is an automated email from the ASF dual-hosted git repository.
victoria pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new 56887a76b84 [Docs] Guide on setting query context (#18179)
56887a76b84 is described below
commit 56887a76b842f40a7d5421cabe8289f48f7bb854
Author: Wanru Skuld Shao <[email protected]>
AuthorDate: Thu Jul 31 09:53:35 2025 -0700
[Docs] Guide on setting query context (#18179)
Co-authored-by: Victoria Lim <[email protected]>
Co-authored-by: Charles Smith <[email protected]>
Co-authored-by: Victoria Lim <[email protected]>
---
docs/assets/set-query-context-insert-query.png | Bin 0 -> 51618 bytes
.../set-query-context-open-context-dialog.png | Bin 0 -> 75005 bytes
docs/assets/set-query-context-query-view.png | Bin 0 -> 66028 bytes
docs/assets/set-query-context-run-the-query.png | Bin 0 -> 50538 bytes
.../set-query-context-set-context-parameters.png | Bin 0 -> 54655 bytes
docs/configuration/index.md | 18 +-
.../extensions-core/datasketches-hll.md | 2 +-
.../extensions-core/datasketches-quantiles.md | 2 +-
.../extensions-core/datasketches-theta.md | 2 +-
docs/multi-stage-query/reference.md | 2 +-
docs/operations/mixed-workloads.md | 4 +-
docs/querying/caching.md | 2 +-
docs/querying/datasourcemetadataquery.md | 2 +-
docs/querying/groupbyquery.md | 2 +-
docs/querying/math-expr.md | 2 +-
docs/querying/multi-value-dimensions.md | 2 +-
docs/querying/multitenancy.md | 2 +-
...query-context.md => query-context-reference.md} | 34 +-
docs/querying/query-context.md | 379 +++++++++++++++------
docs/querying/querying.md | 4 +
docs/querying/searchquery.md | 2 +-
docs/querying/segmentmetadataquery.md | 2 +-
docs/querying/sql-query-context.md | 63 +---
docs/querying/sql.md | 5 +-
docs/querying/timeboundaryquery.md | 2 +-
docs/querying/timeseriesquery.md | 2 +-
docs/querying/topnquery.md | 2 +-
docs/querying/using-caching.md | 5 +-
docs/release-info/migr-subquery-limit.md | 2 +-
docs/tutorials/tutorial-query.md | 2 +-
website/sidebars.json | 3 +-
31 files changed, 346 insertions(+), 203 deletions(-)
diff --git a/docs/assets/set-query-context-insert-query.png b/docs/assets/set-query-context-insert-query.png
new file mode 100644
index 00000000000..d156597d2a5
Binary files /dev/null and b/docs/assets/set-query-context-insert-query.png differ
diff --git a/docs/assets/set-query-context-open-context-dialog.png b/docs/assets/set-query-context-open-context-dialog.png
new file mode 100644
index 00000000000..765caa0d72d
Binary files /dev/null and b/docs/assets/set-query-context-open-context-dialog.png differ
diff --git a/docs/assets/set-query-context-query-view.png b/docs/assets/set-query-context-query-view.png
new file mode 100644
index 00000000000..9d25d3c6644
Binary files /dev/null and b/docs/assets/set-query-context-query-view.png differ
diff --git a/docs/assets/set-query-context-run-the-query.png b/docs/assets/set-query-context-run-the-query.png
new file mode 100644
index 00000000000..27f29f83901
Binary files /dev/null and b/docs/assets/set-query-context-run-the-query.png differ
diff --git a/docs/assets/set-query-context-set-context-parameters.png b/docs/assets/set-query-context-set-context-parameters.png
new file mode 100644
index 00000000000..17fa110501b
Binary files /dev/null and b/docs/assets/set-query-context-set-context-parameters.png differ
diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 054aea13f10..4c11e125374 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -1491,7 +1491,7 @@ Druid uses Jetty to serve HTTP requests.
|`druid.server.http.defaultQueryTimeout`|Query timeout in millis, beyond which
unfinished queries will be cancelled|300000|
|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty
waits after receiving shutdown signal. After this timeout the threads will be
forcefully shutdown. This allows any queries that are executing to
complete(Only values greater than zero are valid).|`PT30S`|
|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper
unannouncements to propagate before shutting down Jetty. This is a minimum and
`druid.server.http.gracefulShutdownTimeout` does not start counting down until
after this period elapses.|`PT0S` (do not wait)|
-|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See [query-context](../querying/query-context.md) to
know more about `timeout`. Query is rejected if the query context `timeout` is
greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See
[query-context](../querying/query-context-reference.md) to know more about
`timeout`. Query is rejected if the query context `timeout` is greater than
this value. |`Long.MAX_VALUE`|
|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in
bytes. Larger headers consume more memory and can make a server more vulnerable
to denial of service attacks.|8 * 1024|
|`druid.server.http.enableForwardedRequestCustomizer`|If enabled, adds Jetty
ForwardedRequestCustomizer which reads X-Forwarded-* request headers to
manipulate servlet request object when Druid is used behind a proxy.|false|
|`druid.server.http.allowedHttpMethods`|List of HTTP methods that should be
allowed in addition to the ones required by Druid APIs. Druid APIs require GET,
PUT, POST, and DELETE, which are always allowed. This option is not useful
unless you have installed an extension that needs these additional HTTP methods
or that adds functionality related to CORS. None of Druid's bundled extensions
require these methods.|`[]`|
@@ -1601,7 +1601,7 @@ Druid uses Jetty to serve HTTP requests.
|`druid.server.http.defaultQueryTimeout`|Query timeout in millis, beyond which
unfinished queries will be cancelled|300000|
|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty
waits after receiving shutdown signal. After this timeout the threads will be
forcefully shutdown. This allows any queries that are executing to
complete(Only values greater than zero are valid).|`PT30S`|
|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper
unannouncements to propagate before shutting down Jetty. This is a minimum and
`druid.server.http.gracefulShutdownTimeout` does not start counting down until
after this period elapses.|`PT0S` (do not wait)|
-|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See [query-context](../querying/query-context.md) to
know more about `timeout`. Query is rejected if the query context `timeout` is
greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See
[query-context](../querying/query-context-reference.md) to know more about
`timeout`. Query is rejected if the query context `timeout` is greater than
this value. |`Long.MAX_VALUE`|
|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in
bytes. Larger headers consume more memory and can make a server more vulnerable
to denial of service attacks.|8 * 1024|
|`druid.server.http.contentSecurityPolicy`|Content-Security-Policy header
value to set on each non-POST response. Setting this property to an empty
string, or omitting it, both result in the default `frame-ancestors: none`
being set.|`frame-ancestors 'none'`|
@@ -1687,7 +1687,7 @@ Laning strategies allow you to control capacity utilization for heterogeneous qu
###### Manual prioritization strategy
-With this configuration, queries are never assigned a priority automatically,
but will preserve a priority manually set on the [query
context](../querying/query-context.md) with the `priority` key. This mode can
be explicitly set by setting `druid.query.scheduler.prioritization.strategy` to
`manual`.
+With this configuration, queries are never assigned a priority automatically,
but will preserve a priority manually set on the [query
context](../querying/query-context-reference.md) with the `priority` key. This
mode can be explicitly set by setting
`druid.query.scheduler.prioritization.strategy` to `manual`.
###### Threshold prioritization strategy
@@ -1713,7 +1713,7 @@ In this mode, queries are never assigned a lane, and the concurrent query count
This laning strategy splits queries with a `priority` below zero into a `low`
query lane, automatically. Queries with priority of zero (the default) or above
are considered 'interactive'. The limit on `low` queries can be set to some
desired percentage of the total capacity (or HTTP thread pool size), reserving
capacity for interactive queries. Queries in the `low` lane are _not_
guaranteed their capacity, which may be consumed by interactive queries, but
may use up to this limit if tota [...]
-If the `low` lane is specified in the [query
context](../querying/query-context.md) `lane` parameter, this will override the
computed lane.
+If the `low` lane is specified in the [query
context](../querying/query-context-reference.md) `lane` parameter, this will
override the computed lane.
This strategy can be enabled by setting
`druid.query.scheduler.laning.strategy=hilo`.
@@ -1747,7 +1747,7 @@ There is no formula to calculate the correct value. Trial and error is the best
###### Manual laning strategy
-This laning strategy is best suited for cases where one or more external
applications which query Druid are capable of manually deciding what lane a
given query should belong to. Configured with a map of lane names to percent or
exact max capacities, queries with a matching `lane` parameter in the [query
context](../querying/query-context.md) will be subjected to those limits.
+This laning strategy is best suited for cases where one or more external
applications which query Druid are capable of manually deciding what lane a
given query should belong to. Configured with a map of lane names to percent or
exact max capacities, queries with a matching `lane` parameter in the [query
context](../querying/query-context-reference.md) will be subjected to those
limits.
|Property|Description|Default|
|--------|-----------|-------|
@@ -1770,7 +1770,7 @@ Druid uses Jetty to serve HTTP requests. Each query being processed consumes a s
|`druid.server.http.maxSubqueryBytes`|Maximum number of bytes from all
subqueries per query. Since the results are stored on the Java heap,
`druid.server.http.maxSubqueryBytes` is a guardrail like
`druid.server.http.maxSubqueryRows` to prevent the heap space from exhausting.
When a subquery exceeds the byte limit, Druid throws a resource limit exceeded
exception. A negative value for the guardrail indicates that Druid won't
guardrail by memory. This can be set to 'disabled' which disable [...]
|`druid.server.http.gracefulShutdownTimeout`|The maximum amount of time Jetty
waits after receiving shutdown signal. After this timeout the threads will be
forcefully shutdown. This allows any queries that are executing to
complete(Only values greater than zero are valid).|`PT30S`|
|`druid.server.http.unannouncePropagationDelay`|How long to wait for ZooKeeper
unannouncements to propagate before shutting down Jetty. This is a minimum and
`druid.server.http.gracefulShutdownTimeout` does not start counting down until
after this period elapses.|`PT0S` (do not wait)|
-|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See [query-context](../querying/query-context.md) to
know more about `timeout`. Query is rejected if the query context `timeout` is
greater than this value. |`Long.MAX_VALUE`|
+|`druid.server.http.maxQueryTimeout`|Maximum allowed value (in milliseconds)
for `timeout` parameter. See
[query-context](../querying/query-context-reference.md) to know more about
`timeout`. Query is rejected if the query context `timeout` is greater than
this value. |`Long.MAX_VALUE`|
|`druid.server.http.maxRequestHeaderSize`|Maximum size of a request header in
bytes. Larger headers consume more memory and can make a server more vulnerable
to denial of service attacks. |8 * 1024|
|`druid.server.http.contentSecurityPolicy`|Content-Security-Policy header
value to set on each non-POST response. Setting this property to an empty
string, or omitting it, both result in the default `frame-ancestors: none`
being set.|`frame-ancestors 'none'`|
|`druid.server.http.enableHSTS`|If set to true, druid services will add strict
transport security header `Strict-Transport-Security: max-age=63072000;
includeSubDomains` to all HTTP responses|`false`|
@@ -1787,7 +1787,7 @@ client has the following configuration options.
|`druid.broker.http.compressionCodec`|Compression codec the Broker uses to
communicate with Historical and real-time processes. May be "gzip" or
"identity".|`gzip`|
|`druid.broker.http.readTimeout`|The timeout for data reads from Historical
servers and real-time tasks.|`PT15M`|
|`druid.broker.http.unusedConnectionTimeout`|The timeout for idle connections
in connection pool. The connection in the pool will be closed after this
timeout and a new one will be established. This timeout should be less than
`druid.broker.http.readTimeout`. Set this timeout = ~90% of
`druid.broker.http.readTimeout`|`PT4M`|
-|`druid.broker.http.maxQueuedBytes`|Maximum number of bytes queued per query
before exerting
[backpressure](../operations/basic-cluster-tuning.md#broker-backpressure) on
channels to the data servers.<br /><br />Similar to
`druid.server.http.maxScatterGatherBytes`, except that `maxQueuedBytes`
triggers
[backpressure](../operations/basic-cluster-tuning.md#broker-backpressure)
instead of query failure. Set to zero to disable. You can override this setting
by using the [`maxQueuedBytes` quer [...]
+|`druid.broker.http.maxQueuedBytes`|Maximum number of bytes queued per query
before exerting
[backpressure](../operations/basic-cluster-tuning.md#broker-backpressure) on
channels to the data servers.<br /><br />Similar to
`druid.server.http.maxScatterGatherBytes`, except that `maxQueuedBytes`
triggers
[backpressure](../operations/basic-cluster-tuning.md#broker-backpressure)
instead of query failure. Set to zero to disable. You can override this setting
by using the [`maxQueuedBytes` quer [...]
|`druid.broker.http.numMaxThreads`|`Maximum number of I/O worker
threads|max(10, ((number of cores * 17) / 16 + 2) + 30)`|
|`druid.broker.http.clientConnectTimeout`|The timeout (in milliseconds) for
establishing client connections.|500|
@@ -2176,7 +2176,7 @@ This section describes configurations that control behavior of Druid's query typ
### Overriding default query context values
-You can override any [query context general
parameter](../querying/query-context.md#general-parameters) default value by
setting the runtime property in the format of
`druid.query.default.context.{query_context_key}`.
+You can override any [query context general
parameter](../querying/query-context-reference.md#general-parameters) default
value by setting the runtime property in the format of
`druid.query.default.context.{query_context_key}`.
The `druid.query.default.context.{query_context_key}` runtime property prefix
applies to all current and future query context keys, the same as how query
context parameter passed with the query works. You can override the runtime
property value if the value for the same key is specified in the query contexts.
The precedence chain for query context values is as follows:
@@ -2223,7 +2223,7 @@ context). If query does have `maxQueuedBytes` in the context, then that value is
### GroupBy query config
-This section describes the configurations for groupBy queries. You can set the
runtime properties in the `runtime.properties` file on Broker, Historical, and
Middle Manager processes. You can set the query context parameters through the
[query context](../querying/query-context.md).
+This section describes the configurations for groupBy queries. You can set the
runtime properties in the `runtime.properties` file on Broker, Historical, and
Middle Manager processes. You can set the query context parameters through the
[query context](../querying/query-context-reference.md).
Supported runtime properties:
diff --git a/docs/development/extensions-core/datasketches-hll.md b/docs/development/extensions-core/datasketches-hll.md
index 3312dcc340d..4e2b369e5e0 100644
--- a/docs/development/extensions-core/datasketches-hll.md
+++ b/docs/development/extensions-core/datasketches-hll.md
@@ -45,7 +45,7 @@ For additional sketch types supported in Druid, see [DataSketches extension](dat
|`lgK`|log2 of K that is the number of buckets in the sketch, parameter that
controls the size and the accuracy. Must be between 4 and 21 inclusively.|no,
defaults to `12`|
|`tgtHllType`|The type of the target HLL sketch. Must be `HLL_4`, `HLL_6` or
`HLL_8` |no, defaults to `HLL_4`|
|`round`|Round off values to whole numbers. Only affects query-time behavior
and is ignored at ingestion-time.|no, defaults to `false`|
-|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
+|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context-reference.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
:::info
The default `lgK` value has proven to be sufficient for most use cases;
expect only very negligible improvements in accuracy with `lgK` values over
`16` in normal circumstances.
diff --git a/docs/development/extensions-core/datasketches-quantiles.md b/docs/development/extensions-core/datasketches-quantiles.md
index 2b2b83a47a9..e6845d92db3 100644
--- a/docs/development/extensions-core/datasketches-quantiles.md
+++ b/docs/development/extensions-core/datasketches-quantiles.md
@@ -59,7 +59,7 @@ The result of the aggregation is a DoublesSketch that is the union of all sketch
|`fieldName`|A string for the name of the input field (can contain sketches or
raw numeric values).|yes|
|`k`|Parameter that determines the accuracy and size of the sketch. Higher k
means higher accuracy but more space to store sketches. Must be a power of 2
from 2 to 32768. See [accuracy
information](https://datasketches.apache.org/docs/Quantiles/ClassicQuantilesSketch.html#accuracy-and-size)
in the DataSketches documentation for details.|no, defaults to 128|
|`maxStreamLength`|This parameter defines the number of items that can be
presented to each sketch before it may need to move from off-heap to on-heap
memory. This is relevant to query types that use off-heap memory, including
[TopN](../../querying/topnquery.md) and
[GroupBy](../../querying/groupbyquery.md). Ideally, should be set high enough
such that most sketches can stay off-heap.|no, defaults to 1000000000|
-|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
+|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context-reference.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
## Post aggregators
diff --git a/docs/development/extensions-core/datasketches-theta.md b/docs/development/extensions-core/datasketches-theta.md
index 844fdf35b10..33bdbe9d3d5 100644
--- a/docs/development/extensions-core/datasketches-theta.md
+++ b/docs/development/extensions-core/datasketches-theta.md
@@ -57,7 +57,7 @@ For additional sketch types supported in Druid, see [DataSketches extension](dat
|`fieldName`|A string for the name of the aggregator used at ingestion
time.|yes|
|`isInputThetaSketch`|Only set this to true at indexing time if your input
data contains Theta sketch objects. This applies to cases when you use
DataSketches outside of Druid, for example with Pig or Hive, to produce the
data to ingest into Druid |no, defaults to false|
|`size`|Must be a power of 2. Internally, size refers to the maximum number of
entries sketch object retains. Higher size means higher accuracy but more space
to store sketches. After you index with a particular size, Druid persists the
sketch in segments. At query time you must use a size greater or equal to the
ingested size. See the [DataSketches
site](https://datasketches.apache.org/docs/Theta/ThetaSize) for details. The
default is recommended for the majority of use cases.|no, defau [...]
-|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
+|`shouldFinalize`|Return the final double type representing the estimate
rather than the intermediate sketch type itself. In addition to controlling the
finalization of this aggregator, you can control whether all aggregators are
finalized with the query context parameters
[`finalize`](../../querying/query-context-reference.md) and
[`sqlFinalizeOuterSketches`](../../querying/sql-query-context.md).|no, defaults
to `true`|
## Post aggregators
diff --git a/docs/multi-stage-query/reference.md b/docs/multi-stage-query/reference.md
index 1bd82f00efe..64e31a8bb0b 100644
--- a/docs/multi-stage-query/reference.md
+++ b/docs/multi-stage-query/reference.md
@@ -390,7 +390,7 @@ The multi-stage query task engine supports the [SQL context parameters](../query
You can specify the context parameters in SELECT, INSERT, or REPLACE
statements.
-For detailed instructions on configuring query context parameters, refer to
[Query context](../querying/query-context.md).
+For detailed instructions on configuring query context parameters, refer to
[Set query context](../querying/query-context.md).
The following table lists the context parameters for the MSQ task engine:
diff --git a/docs/operations/mixed-workloads.md b/docs/operations/mixed-workloads.md
index 2d2e794efc9..032f25b5e21 100644
--- a/docs/operations/mixed-workloads.md
+++ b/docs/operations/mixed-workloads.md
@@ -69,14 +69,14 @@ If you use the __high/low laning strategy__, set the following:
* `druid.query.scheduler.laning.maxLowPercent` – The maximum percent of query
threads to handle low priority queries. The remaining query threads are
dedicated to high priority queries.
-Consider also defining a [prioritization
strategy](../configuration/index.md#prioritization-strategies) for the Broker
to label queries as high or low priority. Otherwise, manually set the priority
for incoming queries on the [query context](../querying/query-context.md).
+Consider also defining a [prioritization
strategy](../configuration/index.md#prioritization-strategies) for the Broker
to label queries as high or low priority. Otherwise, manually set the priority
for incoming queries on the [query
context](../querying/query-context-reference.md).
If you use a __manual laning strategy__, set the following:
* `druid.query.scheduler.laning.lanes.{name}` – The limit for how many queries
can run in the `name` lane. Define as many named lanes as needed.
* `druid.query.scheduler.laning.isLimitPercent` – Whether to treat the lane
limit as an exact number or a percent of the minimum of
`druid.server.http.numThreads` or `druid.query.scheduler.numThreads`.
-With manual laning, incoming queries can be labeled with the desired lane in
the `lane` parameter of the [query context](../querying/query-context.md).
+With manual laning, incoming queries can be labeled with the desired lane in
the `lane` parameter of the [query
context](../querying/query-context-reference.md).
See [Query prioritization and
laning](../configuration/index.md#query-prioritization-and-laning) for
additional details on query laning configuration.
diff --git a/docs/querying/caching.md b/docs/querying/caching.md
index a84e3d25eee..cbd70d6581d 100644
--- a/docs/querying/caching.md
+++ b/docs/querying/caching.md
@@ -81,7 +81,7 @@ Use *whole-query caching* on the Broker to increase query efficiency when there
- On Brokers for small production clusters with less than five servers.
-Avoid using per-segment cache at the Broker for large production clusters.
When the Broker cache is enabled (`druid.broker.cache.populateCache` is `true`)
and `populateCache` _is not_ `false` in the [query
context](../querying/query-context.md), individual Historicals will _not_ merge
individual segment-level results, and instead pass these back to the lead
Broker. The Broker must then carry out a large merge from _all_ segments on its
own.
+Avoid using per-segment cache at the Broker for large production clusters.
When the Broker cache is enabled (`druid.broker.cache.populateCache` is `true`)
and `populateCache` _is not_ `false` in the [query
context](../querying/query-context-reference.md), individual Historicals will
_not_ merge individual segment-level results, and instead pass these back to
the lead Broker. The Broker must then carry out a large merge from _all_
segments on its own.
**Whole-query cache** is available exclusively on Brokers.
diff --git a/docs/querying/datasourcemetadataquery.md b/docs/querying/datasourcemetadataquery.md
index 0a77426e765..c7fc2fb35ae 100644
--- a/docs/querying/datasourcemetadataquery.md
+++ b/docs/querying/datasourcemetadataquery.md
@@ -48,7 +48,7 @@ There are 2 main parts to a Data Source Metadata query:
|--------|-----------|---------|
|queryType|This String should always be "dataSourceMetadata"; this is the
first thing Apache Druid looks at to figure out how to interpret the query|yes|
|dataSource|A String or Object defining the data source to query, very similar
to a table in a relational database. See
[DataSource](../querying/datasource.md) for more information.|yes|
-|context|See [Context](../querying/query-context.md)|no|
+|context|See [Query context
reference](../querying/query-context-reference.md)|no|
The format of the result is:
diff --git a/docs/querying/groupbyquery.md b/docs/querying/groupbyquery.md
index 7350f23b7fe..58e20fa54d0 100644
--- a/docs/querying/groupbyquery.md
+++ b/docs/querying/groupbyquery.md
@@ -337,7 +337,7 @@ dictionary that can spill to disk. The outer query is run on the Broker in a sin
### Configurations
-This section describes the configurations for groupBy queries. You can set the
runtime properties in the `runtime.properties` file on Broker, Historical, and
Middle Manager processes. You can set the query context parameters through the
[query context](query-context.md).
+This section describes the configurations for groupBy queries. You can set the
runtime properties in the `runtime.properties` file on Broker, Historical, and
Middle Manager processes. You can set the query context parameters through the
[query context](query-context-reference.md).
Supported runtime properties:
diff --git a/docs/querying/math-expr.md b/docs/querying/math-expr.md
index 5444bf853e0..926446200f1 100644
--- a/docs/querying/math-expr.md
+++ b/docs/querying/math-expr.md
@@ -300,7 +300,7 @@ For the IPv6 address function, the `address` argument accepts a semicolon separa
## Vectorization support
-A number of expressions support ['vectorized' query
engines](../querying/query-context.md#vectorization-parameters)
+A number of expressions support ['vectorized' query
engines](../querying/query-context-reference.md#vectorization-parameters)
Supported features:
* constants and identifiers are supported for any column type
diff --git a/docs/querying/multi-value-dimensions.md b/docs/querying/multi-value-dimensions.md
index a77a7766f27..f18f45994fb 100644
--- a/docs/querying/multi-value-dimensions.md
+++ b/docs/querying/multi-value-dimensions.md
@@ -498,7 +498,7 @@ Having specs are applied at the outermost level of groupBy query processing.
## Disable GroupBy on multi-value columns
You can disable the implicit unnesting behavior for groupBy by setting
`groupByEnableMultiValueUnnesting: false` in your
-[query context](query-context.md). In this mode, the groupBy engine will
return an error instead of completing the query. This is a safety
+[query context](query-context-reference.md). In this mode, the groupBy engine
will return an error instead of completing the query. This is a safety
feature for situations where you believe that all dimensions are singly-valued
and want the engine to reject any
multi-valued dimensions that were inadvertently included.
diff --git a/docs/querying/multitenancy.md b/docs/querying/multitenancy.md
index 3619298291f..bc70177d0fa 100644
--- a/docs/querying/multitenancy.md
+++ b/docs/querying/multitenancy.md
@@ -83,7 +83,7 @@ running, we don't want any query to be starved out. Druid's internal processing
This allows for a second set of segments from another query to be scanned. By
keeping segment computation time very small, we ensure
that resources are constantly being yielded, and segments pertaining to
different queries are all being processed.
-Druid queries can optionally set a `priority` flag in the [query
context](../querying/query-context.md). Queries known to be
+Druid queries can optionally set a `priority` flag in the [query context
reference](../querying/query-context-reference.md). Queries known to be
slow (download or reporting style queries) can be de-prioritized and more
interactive queries can have higher priority.
Broker processes can also be dedicated to a given tier. For example, one set
of Broker processes can be dedicated to fast interactive queries,
diff --git a/docs/querying/query-context.md b/docs/querying/query-context-reference.md
similarity index 91%
copy from docs/querying/query-context.md
copy to docs/querying/query-context-reference.md
index 35fb7fe3c4e..acc762fa9fe 100644
--- a/docs/querying/query-context.md
+++ b/docs/querying/query-context-reference.md
@@ -1,7 +1,7 @@
---
-id: query-context
-title: "Query context"
-sidebar_label: "Query context"
+id: query-context-reference
+title: "Query context reference"
+sidebar_label: "Query context reference"
---
<!--
@@ -23,20 +23,23 @@ sidebar_label: "Query context"
~ under the License.
-->
-The query context is used for various query configuration parameters. Query
context parameters can be specified in
-the following ways:
+The query context provides runtime configuration for individual queries in
Apache Druid. Each parameter in the query context controls a specific aspect of
query behavior—from execution timeouts and resource limits to caching policies
and processing strategies.
-- For [Druid SQL](../api-reference/sql-api.md), context parameters are
provided either in a JSON object named `context` to the
-HTTP POST API, or as properties to the JDBC connection.
-- For [native queries](querying.md), context parameters are provided in a JSON
object named `context`.
+This reference contains context parameters organized by their scope:
+
+- **General parameters**: Apply to all query types.
+- **Parameters by query type**: Apply to a specific type of query, such as
TopN.
+- **Vectorization parameters**: Control vectorized query execution for
supported queries.
+
+To learn how to set the query context, see [Set query
context](./query-context.md).
+
+For reference on query context parameters specific to Druid SQL, visit [SQL
query context](sql-query-context.md).
+For context parameters related to SQL-based ingestion, see the [SQL-based
ingestion reference](../multi-stage-query/reference/#context-parameters).
-Note that setting query context will override both the default value and the
runtime properties value in the format of
-`druid.query.default.context.{property_key}` (if set).
## General parameters
Unless otherwise noted, the following parameters apply to all query types, and
to both native and SQL queries.
-See [SQL query context](sql-query-context.md) for other query context
parameters that are specific to Druid SQL planning.
|Parameter |Default | Description
|
|-------------------|----------------------------------------|----------------------|
@@ -126,3 +129,12 @@ vectorization. These query types will ignore the
`vectorize` parameter even if i
|`vectorize`|`true`|Enables or disables vectorized query execution. Possible
values are `false` (disabled), `true` (enabled if possible, disabled otherwise,
on a per-segment basis), and `force` (enabled, and groupBy or timeseries
queries that cannot be vectorized will fail). The `"force"` setting is meant to
aid in testing, and is not generally useful in production (since real-time
segments can never be processed with vectorized execution, any queries on
real-time data will fail). This w [...]
|`vectorSize`|`512`|Sets the row batching size for a particular query. This
will override `druid.query.default.context.vectorSize` if it's set.|
|`vectorizeVirtualColumns`|`true`|Enables or disables vectorized query
processing of queries with virtual columns, layered on top of `vectorize`
(`vectorize` must also be set to true for a query to utilize vectorization).
Possible values are `false` (disabled), `true` (enabled if possible, disabled
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or
timeseries queries with virtual columns that cannot be vectorized will fail).
The `"force"` setting is meant to aid in [...]
+
+## Learn more
+
+For more information, see the following topics:
+
+- [Set query context](./query-context.md) to learn how to configure query
context parameters.
+- [SQL query context](sql-query-context.md) for query context parameters
specific to Druid SQL.
+- [SQL-based ingestion
reference](../multi-stage-query/reference/#context-parameters) for context
parameters used in SQL-based ingestion (MSQ).
+
diff --git a/docs/querying/query-context.md b/docs/querying/query-context.md
index 35fb7fe3c4e..2f423945beb 100644
--- a/docs/querying/query-context.md
+++ b/docs/querying/query-context.md
@@ -1,7 +1,10 @@
---
id: query-context
-title: "Query context"
-sidebar_label: "Query context"
+title: "Set query context"
+sidebar_label: "Set query context"
+description:
+ "Learn how to configure the query context
+ to customize query execution behavior and optimize performance."
---
<!--
@@ -22,107 +25,273 @@ sidebar_label: "Query context"
~ specific language governing permissions and limitations
~ under the License.
-->
+
-The query context is used for various query configuration parameters. Query
context parameters can be specified in
-the following ways:
-
-- For [Druid SQL](../api-reference/sql-api.md), context parameters are
provided either in a JSON object named `context` to the
-HTTP POST API, or as properties to the JDBC connection.
-- For [native queries](querying.md), context parameters are provided in a JSON
object named `context`.
-
-Note that setting query context will override both the default value and the
runtime properties value in the format of
-`druid.query.default.context.{property_key}` (if set).
-
-## General parameters
-
-Unless otherwise noted, the following parameters apply to all query types, and
to both native and SQL queries.
-See [SQL query context](sql-query-context.md) for other query context
parameters that are specific to Druid SQL planning.
-
-|Parameter |Default | Description
|
-|-------------------|----------------------------------------|----------------------|
-|`timeout` | `druid.server.http.defaultQueryTimeout`| Query timeout
in millis, beyond which unfinished queries will be cancelled. 0 timeout means
`no timeout` (up to the server-side maximum query timeout,
`druid.server.http.maxQueryTimeout`). To set the default timeout and maximum
timeout, see [Broker configuration](../configuration/index.md#broker) |
-|`priority` | The default priority is one of the following:
<ul><li>Value of `priority` in the query context, if set</li><li>The value of
the runtime property `druid.query.default.context.priority`, if set and not
null</li><li>`0` if the priority is not set in the query context or runtime
properties</li></ul>| Query priority. Queries with higher priority get
precedence for computational resources.|
-|`lane` | `null` | Query lane,
used to control usage limits on classes of queries. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|`queryId` | auto-generated | Unique
identifier given to this query. If a query ID is set or known, this can be used
to cancel the query |
-|`brokerService` | `null` | Broker service
to which this query should be routed. This parameter is honored only by a
broker selector strategy of type *manual*. See [Router
strategies](../design/router.md#router-strategies) for more details.|
-|`useCache` | `true` | Flag indicating
whether to leverage the query cache for this query. When set to false, it
disables reading from the query cache for this query. When set to true, Apache
Druid uses `druid.broker.cache.useCache` or `druid.historical.cache.useCache`
to determine whether or not to read from the query cache |
-|`populateCache` | `true` | Flag indicating
whether to save the results of the query to the query cache. Primarily used for
debugging. When set to false, it disables saving the results of this query to
the query cache. When set to true, Druid uses
`druid.broker.cache.populateCache` or `druid.historical.cache.populateCache` to
determine whether or not to save the results of this query to the query cache |
-|`useResultLevelCache`| `true` | Flag indicating whether
to leverage the result level cache for this query. When set to false, it
disables reading from the query cache for this query. When set to true, Druid
uses `druid.broker.cache.useResultLevelCache` to determine whether or not to
read from the result-level query cache |
-|`populateResultLevelCache` | `true` | Flag indicating
whether to save the results of the query to the result level cache. Primarily
used for debugging. When set to false, it disables saving the results of this
query to the query cache. When set to true, Druid uses
`druid.broker.cache.populateResultLevelCache` to determine whether or not to
save the results of this query to the result-level query cache |
-|`bySegment` | `false` | Native queries
only. Return "by segment" results. Primarily used for debugging, setting it to
`true` returns results associated with the data segment they came from |
-|`finalize` | `N/A` | Flag indicating
whether to "finalize" aggregation results. Primarily used for debugging. For
instance, the `hyperUnique` aggregator returns the full HyperLogLog sketch
instead of the estimated cardinality when this flag is set to `false` |
-|`maxScatterGatherBytes`| `druid.server.http.maxScatterGatherBytes` | Maximum
number of bytes gathered from data processes such as Historicals and realtime
processes to execute a query. This parameter can be used to further reduce
`maxScatterGatherBytes` limit at query time. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|`maxQueuedBytes` | `druid.broker.http.maxQueuedBytes` | Maximum
number of bytes queued per query before exerting backpressure on the channel to
the data server. Similar to `maxScatterGatherBytes`, except unlike that
configuration, this one will trigger backpressure rather than query failure.
Zero means disabled.|
-|`maxSubqueryRows`| `druid.server.http.maxSubqueryRows` | Upper limit on the
number of rows a subquery can generate. See [Broker
configuration](../configuration/index.md#broker) and [subquery
guardrails](../configuration/index.md#Guardrails for materialization of
subqueries) for more details.|
-|`maxSubqueryBytes`| `druid.server.http.maxSubqueryBytes` | Upper limit on the
number of bytes a subquery can generate. See [Broker
configuration](../configuration/index.md#broker) and [subquery
guardrails](../configuration/index.md#Guardrails for materialization of
subqueries) for more details.|
-|`serializeDateTimeAsLong`| `false` | If true, DateTime is serialized as
long in the result returned by Broker and the data transportation between
Broker and compute process|
-|`serializeDateTimeAsLongInner`| `false` | If true, DateTime is serialized as
long in the data transportation between Broker and compute process|
-|`enableParallelMerge`|`true`|Enable parallel result merging on the Broker.
Note that `druid.processing.merge.useParallelMergePool` must be enabled for
this setting to be set to `true`. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|`parallelMergeParallelism`|`druid.processing.merge.parallelism`|Maximum
number of parallel threads to use for parallel result merging on the Broker.
See [Broker configuration](../configuration/index.md#broker) for more details.|
-|`parallelMergeInitialYieldRows`|`druid.processing.merge.initialYieldNumRows`|Number
of rows to yield per ForkJoinPool merge task for parallel result merging on
the Broker, before forking off a new task to continue merging sequences. See
[Broker configuration](../configuration/index.md#broker) for more details.|
-|`parallelMergeSmallBatchRows`|`druid.processing.merge.smallBatchNumRows`|Size
of result batches to operate on in ForkJoinPool merge tasks for parallel result
merging on the Broker. See [Broker
configuration](../configuration/index.md#broker) for more details.|
-|`useFilterCNF`|`false`| If true, Druid will attempt to convert the query
filter to Conjunctive Normal Form (CNF). During query processing, columns can
be pre-filtered by intersecting the bitmap indexes of all values that match the
eligible filters, often greatly reducing the raw number of rows which need to
be scanned. But this effect only happens for the top level filter, or
individual clauses of a top level 'and' filter. As such, filters in CNF
potentially have a higher chance to util [...]
-|`secondaryPartitionPruning`|`true`|Enable secondary partition pruning on the
Broker. The Broker will always prune unnecessary segments from the input scan
based on a filter on time intervals, but if the data is further partitioned
with hash or range partitioning, this option will enable additional pruning
based on a filter on secondary partition dimensions.|
-|`debug`| `false` | Flag indicating whether to enable debugging outputs for
the query. When set to false, no additional logs will be produced (logs
produced will be entirely dependent on your logging level). When set to true,
the following addition logs will be produced:<br />- Log the stack trace of the
exception (if any) produced by the query |
-|`setProcessingThreadNames`|`true`| Whether processing thread names will be
set to `queryType_dataSource_intervals` while processing a query. This aids in
interpreting thread dumps, and is on by default. Query overhead can be reduced
slightly by setting this to `false`. This has a tiny effect in most scenarios,
but can be meaningful in high-QPS, low-per-segment-processing-time scenarios. |
-|`sqlPlannerBloat`|`1000`|Calcite parameter which controls whether to merge
two Project operators when inlining expressions causes complexity to increase.
Implemented as a workaround to exception `There are not enough rules to produce
a node with desired properties: convention=DRUID, sort=[]` thrown after
rejecting the merge of two projects.|
-|`cloneQueryMode`|`excludeClones`| Indicates whether clone Historicals should
be queried by brokers. Clone servers are created by the `cloneServers`
Coordinator dynamic configuration. Possible values are `excludeClones`,
`includeClones` and `preferClones`. `excludeClones` means that clone
Historicals are not queried by the broker. `preferClones` indicates that when
given a choice between the clone Historical and the original Historical which
is being cloned, the broker chooses the clones [...]
-
-## Parameters by query type
-
-Some query types offer context parameters specific to that query type.
-
-### TopN
-
-|Parameter |Default | Description |
-|-----------------|---------------------|----------------------|
-|`minTopNThreshold` | `1000` | The top minTopNThreshold local
results from each segment are returned for merging to determine the global
topN. |
-
-### Timeseries
-
-|Parameter |Default | Description |
-|-----------------|---------------------|----------------------|
-|`skipEmptyBuckets` | `false` | Disable timeseries zero-filling
behavior, so only buckets with results will be returned. |
-
-### Join filter
-
-|Parameter |Default | Description |
-|-----------------|---------------------|----------------------|
-|`enableJoinFilterPushDown` | `true` | Controls whether a join query will
attempt filter push down, which reduces the number of rows that have to be
compared in a join operation.|
-|`enableJoinFilterRewrite` | `true` | Controls whether filter clauses that
reference non-base table columns will be rewritten into filters on base table
columns.|
-|`enableJoinFilterRewriteValueColumnFilters` | `false` | Controls whether
Druid rewrites non-base table filters on non-key columns in the non-base table.
Requires a scan of the non-base table.|
-|`enableRewriteJoinToFilter` | `true` | Controls whether a join can be pushed
partial or fully to the base table as a filter at runtime.|
-|`joinFilterRewriteMaxSize` | `10000` | The maximum size of the correlated
value set used for filter rewrites. Set this limit to prevent excessive memory
use.|
-
-### GroupBy
-
-See the list of [GroupBy query
context](groupbyquery.md#advanced-configurations) parameters available on the
groupBy
-query page.
-
-## Vectorization parameters
-
-The GroupBy and Timeseries query types can run in _vectorized_ mode, which
speeds up query execution by processing
-batches of rows at a time. Not all queries can be vectorized. In particular,
vectorization currently has the following
-requirements:
-
-- All query-level filters must either be able to run on bitmap indexes or must
offer vectorized row-matchers. These
-include `selector`, `bound`, `in`, `like`, `regex`, `search`, `and`, `or`, and
`not`.
-- All filters in filtered aggregators must offer vectorized row-matchers.
-- All aggregators must offer vectorized implementations. These include
`count`, `doubleSum`, `floatSum`, `longSum`. `longMin`,
- `longMax`, `doubleMin`, `doubleMax`, `floatMin`, `floatMax`, `longAny`,
`doubleAny`, `floatAny`, `stringAny`,
- `hyperUnique`, `filtered`, `approxHistogram`, `approxHistogramFold`, and
`fixedBucketsHistogram` (with numerical input).
-- All virtual columns must offer vectorized implementations. Currently for
expression virtual columns, support for vectorization is decided on a per
expression basis, depending on the type of input and the functions used by the
expression. See the currently supported list in the [expression
documentation](math-expr.md#vectorization-support).
-- For GroupBy: All dimension specs must be "default" (no extraction functions
or filtered dimension specs).
-- For GroupBy: No multi-value dimensions.
-- For Timeseries: No "descending" order.
-- Only immutable segments (not real-time).
-- Only [table datasources](datasource.md#table) (not joins, subqueries,
lookups, or inline datasources).
-
-Other query types (like TopN, Scan, Select, and Search) ignore the `vectorize`
parameter, and will execute without
-vectorization. These query types will ignore the `vectorize` parameter even if
it is set to `"force"`.
-
-|Parameter|Default| Description|
-|---------|-------|------------|
-|`vectorize`|`true`|Enables or disables vectorized query execution. Possible
values are `false` (disabled), `true` (enabled if possible, disabled otherwise,
on a per-segment basis), and `force` (enabled, and groupBy or timeseries
queries that cannot be vectorized will fail). The `"force"` setting is meant to
aid in testing, and is not generally useful in production (since real-time
segments can never be processed with vectorized execution, any queries on
real-time data will fail). This w [...]
-|`vectorSize`|`512`|Sets the row batching size for a particular query. This
will override `druid.query.default.context.vectorSize` if it's set.|
-|`vectorizeVirtualColumns`|`true`|Enables or disables vectorized query
processing of queries with virtual columns, layered on top of `vectorize`
(`vectorize` must also be set to true for a query to utilize vectorization).
Possible values are `false` (disabled), `true` (enabled if possible, disabled
otherwise, on a per-segment basis), and `force` (enabled, and groupBy or
timeseries queries with virtual columns that cannot be vectorized will fail).
The `"force"` setting is meant to aid in [...]
+The query context gives you fine-grained control over how Apache Druid
executes your individual queries. While the default settings in Druid work well
for most queries, you can set the query context to handle specific requirements
and optimize performance.
+
+Common use cases for the query context include:
+- Override default timeouts for long-running queries or complex aggregations.
+- Debug query performance by disabling caching during testing.
+- Configure SQL-specific behaviors like time zones for accurate time-based
analysis.
+- Set priorities to ensure critical queries get computational resources first.
+- Adjust memory limits for queries that process large datasets.
+
+The way you set the query context depends on how you submit the query to
Druid, whether using the web console or API.
+It also depends on whether your query is Druid SQL or a JSON-based native
query.
+This guide shows you how to set the query context for each application.
+
+Before you begin, identify which context parameters you need to configure.
For available parameters and their descriptions, see [Query context
reference](query-context-reference.md).
+
+## Web console
+
+You can configure query context parameters for both Druid SQL and native
queries in the [web console](../operations/web-console.md).
+
+The following steps show you how to set the query context using the web
console:
+
+1. In the web console, select **Query** from the top-level navigation.
+
+ 
+
+2. Enter the query you want to run. If you ingested the Wikipedia dataset from
the [quickstart](../tutorials/index.md), you can use the following query:
+
+ ```sql
+ SELECT * FROM wikipedia WHERE user='BlueMoon2662'
+ ```
+
+ 
+
+3. In the menu for the engine selector, click **Edit query context**.
+
+ 
+
+4. In the **Edit query context** dialog, add your context parameters as JSON
key-value pairs.
+
+ For example, you can set the `sqlTimeZone` parameter to ensure that the
query results reflect the specified time zone. This may differ from your local
time zone when viewing the data.
+
+ ```json
+ {
+ "sqlTimeZone" : "America/Los_Angeles"
+ }
+ ```
+
+5. The web console validates the JSON object containing the query context
parameters and highlights any syntax errors.
+ Click **Save**.
+
+ 
+
+6. Click **Run** to execute your query with the specified context parameters.
+
+ 
+
+ Compare the results of the example query with and without the query context.
+ * Without the query context, the query returns the `__time` value of
`2015-09-12T00:47:53.259Z`.
+ * When you set the `sqlTimeZone` parameter, the query returns
`2015-09-11T17:47:53.259-07:00`.
+
+
+## Druid SQL
+
+When using Druid SQL programmatically—such as in applications, automated
scripts, or database tools—you can set the query context through various
methods depending on how you submit your queries.
+
+### HTTP API
+
+When using the HTTP API, you include query context parameters in the `context`
object of your JSON request. For more information on how to format Druid SQL
API requests and handle responses, see [Druid SQL
API](../api-reference/sql-api.md).
+
+The following example sets the `sqlTimeZone` parameter:
+
+```json
+{
+ "query": "SELECT * FROM wikipedia WHERE user = 'BlueMoon2662'",
+ "context": {
+ "sqlTimeZone": "America/Los_Angeles"
+ }
+}
+```
+
+You can set multiple context parameters in a single request:
+
+```json
+{
+ "query": "SELECT * FROM wikipedia WHERE user = 'BlueMoon2662'",
+ "context": {
+ "sqlTimeZone": "America/Los_Angeles",
+ "sqlQueryId": "request01"
+ }
+}
+```
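
The request body above can also be sent from application code. The following is a minimal sketch, assuming Java 11 or later and the same `http://localhost:8888/druid/v2/sql` endpoint used in the curl examples in this guide. The class name `SqlContextRequestSketch` and its members are hypothetical names chosen for illustration. They are not part of Druid.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for illustration only; not part of Druid.
final class SqlContextRequestSketch {

  // Same payload as the JSON example: a SQL query plus a context object.
  static final String EXAMPLE_BODY =
      "{\"query\": \"SELECT * FROM wikipedia WHERE user = 'BlueMoon2662'\", "
          + "\"context\": {\"sqlTimeZone\": \"America/Los_Angeles\"}}";

  // Builds (but does not send) a POST request for the Druid SQL HTTP API.
  // baseUrl is assumed to point at a Router or Broker, for example http://localhost:8888.
  static HttpRequest build(String baseUrl, String jsonBody) {
    return HttpRequest.newBuilder(URI.create(baseUrl + "/druid/v2/sql"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
        .build();
  }
}
```

To run the query, pass the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The sketch separates building from sending so that you can inspect the request without a running cluster.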
+
+
+### JDBC driver API
+
+You can connect to Druid over JDBC and issue Druid SQL queries using the
[Druid SQL JDBC driver API](../api-reference/sql-jdbc.md).
+This approach is useful when integrating Druid with BI tools or Java
applications.
+When connecting to Druid through JDBC, you set query context parameters in a
JDBC connection properties object.
+You supply the object when establishing the connection to Druid.
+
+The following code excerpt shows how you can configure the connection
properties:
+
+```java
+String url =
"jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/";
+
+// Set the time zone to America/Los_Angeles
+Properties connectionProperties = new Properties();
+connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");
+
+try (Connection connection = DriverManager.getConnection(url,
connectionProperties)) {
+ // create and execute statements, process result sets, etc
+}
+```
+
+<details>
+<summary>View full JDBC example</summary>
+
+```java
+import java.sql.*;
+import java.util.Properties;
+
+public class JdbcDruid {
+
+ public static void main(String args[]) {
+
+ // Connect to /druid/v2/sql/avatica/ on your Broker.
+ String url =
"jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnection=true";
+
+ // The query you want to run.
+ String query = "SELECT * FROM wikipedia WHERE user = 'BlueMoon2662'";
+
+ // Set any connection context parameters you need here.
+ Properties connectionProperties = new Properties();
+ connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");
+
+ try (Connection connection = DriverManager.getConnection(url,
connectionProperties)) {
+ try (
+ final Statement statement = connection.createStatement();
+ final ResultSet rs = statement.executeQuery(query)
+ ) {
+ while (rs.next()) {
+ // process result set
+ Timestamp timeStamp = rs.getTimestamp("__time");
+ System.out.println(timeStamp);
+ }
+ }
+ } catch (Exception e) {
+ System.out.println(e.toString());
+ }
+ }
+}
+```
+
+</details>
+
+### SET statements
+
+You can use the SET command to specify SQL query context parameters that
modify the behavior of a Druid SQL query. Druid accepts one or more SET
statements before the main SQL query. The SET command works in both the web
console and the Druid SQL HTTP API.
+
+In the web console, you can write your SET statements followed by your query
directly. For example:
+
+```sql
+SET sqlTimeZone = 'America/Los_Angeles';
+SELECT * FROM wikipedia WHERE user = 'BlueMoon2662';
+```
+
+You can also include your SET statements as part of the query string in your
HTTP API call. For example:
+
+```bash
+curl -X POST 'http://localhost:8888/druid/v2/sql' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "SET sqlTimeZone='\''America/Los_Angeles'\''; SELECT * FROM
wikipedia WHERE user='\''BlueMoon2662'\''"
+}'
+```
+
+You can also combine SET statements with the `context` field. If you include
both, the parameter value in SET takes precedence:
+
+```bash
+curl -X POST 'http://localhost:8888/druid/v2/sql' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "SET sqlTimeZone='\''America/Los_Angeles'\''; SELECT * FROM
wikipedia WHERE user='\''BlueMoon2662'\''",
+ "context": {
+ "sqlTimeZone": "UTC"
+ }
+}'
+```
+
+For more details on how to use the SET command in your SQL query, see
[SET](sql.md#set).
+
+:::info
+You cannot use SET statements in JDBC connections.
+:::
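
If you build query strings for the HTTP API in application code, you can generate the SET prefix from a map of parameters. The following is a rough sketch under stated assumptions: it handles string-valued parameters only, writes every value as a single-quoted SQL literal, and doubles embedded single quotes. The class and method names are hypothetical. They are not part of Druid.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Druid.
final class SetStatementSketch {

  // Prepends one SET statement per entry, in the map's iteration order,
  // to the given SQL query. Assumes every value is a string literal.
  static String withSetStatements(Map<String, String> params, String sql) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, String> entry : params.entrySet()) {
      // Double any single quotes so the value stays a valid SQL string literal.
      String value = entry.getValue().replace("'", "''");
      sb.append("SET ").append(entry.getKey())
          .append(" = '").append(value).append("'; ");
    }
    return sb.append(sql).toString();
  }
}
```

Use an ordered map such as `LinkedHashMap` if the order of the SET statements matters to you. The resulting string goes in the `query` field of the HTTP request. It is not suitable for JDBC connections.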
+
+
+## Native queries
+
+For native queries, you can include query context parameters in a JSON object
named `context` within your query or through the [web console](#web-console).
+
+The following example shows a native query that sets the `sqlTimeZone` to
`America/Los_Angeles` and `queryId` to `only_query_id_test`:
+
+```json
+{
+ "queryType": "timeseries",
+ "dataSource": "wikipedia",
+ "granularity": "day",
+ "descending": true,
+ "filter": {
+ "type": "and",
+ "fields": [
+ { "type": "selector", "dimension": "countryName", "value": "Australia" },
+ { "type": "selector", "dimension": "isAnonymous", "value": "true" }
+ ]
+ },
+ "aggregations": [
+ { "type": "count", "name": "row_count" }
+ ],
+ "intervals": ["2015-09-12T00:00:00.000/2015-09-13T00:00:00.000"],
+ "context": {
+ "sqlTimeZone": "America/Los_Angeles",
+ "queryId": "only_query_id_test"
+ }
+}
+```
+
+
+## Runtime properties
+
+You can configure query context parameters globally by adding a runtime
property to your configuration file.
+The property takes the following format:
+
+```properties
+druid.query.default.context.{PARAMETER}={VALUE}
+```
+
+Replace `PARAMETER` with the query context parameter and `VALUE` with its
value.
+For example:
+
+```properties
+druid.query.default.context.debug=true
+```
+
+For more information, see [Configuration
reference](../configuration/index.md#overriding-default-query-context-values).
+
+
+## Query context precedence
+
+For a given query context parameter, Druid determines the final value to use
based on the following order of precedence, from lowest to highest:
+
+1. **Built-in defaults**: Druid uses the documented default values if you
don’t specify anything.
+
+2. **Runtime properties**: If you configure parameters as
`druid.query.default.context.{PARAMETER}` in the configuration files, these
override the built-in defaults and act as your system-wide defaults.
+
+3. **Context object in HTTP request**: Parameters passed within the JSON
`context` object override both built-in defaults and runtime properties.
+
+4. **SET statements**: Parameters set in Druid SQL using `SET key=value;` take
the highest precedence and override all other settings.
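
The precedence order above behaves like a layered map merge in which each later layer overwrites keys from the earlier ones. The following sketch only illustrates that ordering. It is not how Druid implements the resolution, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the documented precedence order; not Druid code.
final class ContextPrecedenceSketch {

  // Pass layers from lowest to highest precedence:
  // built-in defaults, runtime properties, context object, SET statements.
  @SafeVarargs
  static Map<String, Object> resolve(Map<String, Object>... layersLowestToHighest) {
    Map<String, Object> effective = new LinkedHashMap<>();
    for (Map<String, Object> layer : layersLowestToHighest) {
      // putAll overwrites existing keys, so later layers win.
      effective.putAll(layer);
    }
    return effective;
  }
}
```

For example, if the `context` object sets `sqlTimeZone` to `UTC` and a SET statement sets it to `America/Los_Angeles`, the merged result keeps `America/Los_Angeles`. A parameter that appears only in a lower layer keeps that layer's value.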
+
+
+## Learn more
+
+For more information, see the following topics:
+
+- [Query context reference](query-context-reference.md) for available query
context parameters.
+- [SQL query context](sql-query-context.md) for SQL-specific context
parameters.
+- [Multi-stage query
context](../multi-stage-query/reference.md#context-parameters) for context
parameters specific to SQL-based ingestion.
+- [Native queries](querying.md) for details on constructing native queries
with context.
+- [SET](sql.md#set) for complete syntax and usage of SET statements.
diff --git a/docs/querying/querying.md b/docs/querying/querying.md
index 894920b72cb..ba173fdb0a6 100644
--- a/docs/querying/querying.md
+++ b/docs/querying/querying.md
@@ -152,3 +152,7 @@ Possible Druid error codes for the `error` field include:
|`Query cancelled`|500|The query was cancelled through the query cancellation
API.|
|`Truncated response context`|500|An intermediate response context for the
query exceeded the built-in limit of 7KiB.<br/><br/>The response context is an
internal data structure that Druid servers use to share out-of-band information
when sending query results to each other. It is serialized in an HTTP header
with a maximum length of 7KiB. This error occurs when an intermediate response
context sent from a data server (like a Historical) to the Broker exceeds this
limit.<br/><br/>The res [...]
|`Unknown exception`|500|Some other exception occurred. Check errorMessage and
errorClass for details, although keep in mind that the contents of those fields
are free-form and may change from release to release.|
+
+## Learn more
+
+To learn how to use the query context parameters, see [Set query
context](./query-context.md).
diff --git a/docs/querying/searchquery.md b/docs/querying/searchquery.md
index c0b5b741100..97ddf59372a 100644
--- a/docs/querying/searchquery.md
+++ b/docs/querying/searchquery.md
@@ -67,7 +67,7 @@ There are several main parts to a search query:
|virtualColumns|A JSON list of [virtual columns](./virtual-columns.md)
available to use in `searchDimensions`.| no (default none)|
|query|See [SearchQuerySpec](#searchqueryspec).|yes|
|sort|An object specifying how the results of the search should be
sorted.<br/>Possible types are "lexicographic" (the default sort),
"alphanumeric", "strlen", and "numeric".<br/>See [Sorting
Orders](./sorting-orders.md) for more details.|no|
-|context|See [Context](../querying/query-context.md)|no|
+|context|See [Context](../querying/query-context-reference.md)|no|
The format of the result is:
diff --git a/docs/querying/segmentmetadataquery.md
b/docs/querying/segmentmetadataquery.md
index de223b9fed2..58eb93b4cf0 100644
--- a/docs/querying/segmentmetadataquery.md
+++ b/docs/querying/segmentmetadataquery.md
@@ -62,7 +62,7 @@ There are several main parts to a segment metadata query:
|intervals|A JSON Object representing ISO-8601 Intervals. This defines the
time ranges to run the query over.|no|
|toInclude|A JSON Object representing what columns should be included in the
result. Defaults to "all".|no|
|merge|Merge all individual segment metadata results into a single result|no|
-|context|See [Context](../querying/query-context.md)|no|
+|context|See [Context](../querying/query-context-reference.md)|no|
|analysisTypes|A list of Strings specifying what column properties (e.g.
cardinality, size) should be calculated and returned in the result. Defaults to
["cardinality", "interval", "minmax"], but can be overridden with using the
[segment metadata query
config](../configuration/index.md#segmentmetadata-query-config). See section
[analysisTypes](#analysistypes) for more details.|no|
|aggregatorMergeStrategy| The strategy Druid uses to merge aggregators across
segments. If true and if the `aggregators` analysis type is enabled,
`aggregatorMergeStrategy` defaults to `strict`. Possible values include
`strict`, `lenient`, `earliest`, and `latest`. See
[`aggregatorMergeStrategy`](#aggregatormergestrategy) for details.|no|
|lenientAggregatorMerge|Deprecated. Use `aggregatorMergeStrategy` property
instead. If true, and if the `aggregators` analysis type is enabled, Druid
merges aggregators leniently.|no|
diff --git a/docs/querying/sql-query-context.md
b/docs/querying/sql-query-context.md
index cc0a9bdc7c1..854d3838474 100644
--- a/docs/querying/sql-query-context.md
+++ b/docs/querying/sql-query-context.md
@@ -28,18 +28,15 @@ sidebar_label: "SQL query context"
This document describes the SQL language.
:::
-Druid supports query context parameters which affect [SQL query](./sql.md)
planning.
-See [Query context](query-context.md) for general query context parameters for
all query types.
+In Apache Druid, you can control how your [Druid SQL queries](./sql.md)
run by using query context parameters. The parameters let you adjust
aspects of query processing such as using approximations, selecting particular
filters, and controlling how lookups are executed.
-## SQL query context parameters
+For additional context parameters supported for all query types, refer to
[Query context reference](query-context-reference.md). To learn how to set the
query context, see [Set query context](../querying/query-context.md).
-The following table lists query context parameters you can use to configure
Druid SQL planning.
-You can override a parameter's default value by setting a runtime property in
the format `druid.query.default.context.{query_context_key}`.
-For more information, see [Overriding default query context
values](../configuration/index.md#overriding-default-query-context-values).
+The table below lists the query context parameters you can use with Druid SQL.
|Parameter|Description|Default value|
|---------|-----------|-------------|
-|`sqlQueryId`|SQL query ID. For HTTP client, Druid returns it in the
`X-Druid-SQL-Query-Id` header.<br/><br/>To specify a SQL query ID, use
`sqlQueryId` instead of [`queryId`](query-context.md). Setting `queryId` for a
SQL request has no effect. All native queries underlying SQL use an
auto-generated `queryId`.|auto-generated|
+|`sqlQueryId`|SQL query ID. For HTTP clients, Druid returns it in the
`X-Druid-SQL-Query-Id` header.<br/><br/>To specify a SQL query ID, use
`sqlQueryId` instead of [`queryId`](query-context-reference.md). Setting
`queryId` for a SQL request has no effect. All native queries underlying SQL
use an auto-generated `queryId`.|auto-generated|
|`sqlTimeZone`|Time zone for a connection. For example, "America/Los_Angeles"
or an offset like "-08:00". This parameter affects how time functions and
timestamp literals behave. |UTC|
|`sqlStringifyArrays`|If `true`, Druid serializes result columns with array
values as JSON strings in the response instead of arrays.|`true`, except for
JDBC connections, where it's always `false`|
|`useApproximateCountDistinct`|Whether to use an approximate cardinality
algorithm for `COUNT(DISTINCT foo)`.|`true`|
@@ -59,51 +56,7 @@ For more information, see [Overriding default query context
values](../configura
|`inFunctionExprThreshold`|At or beyond this threshold number of values, SQL
`IN` is eligible for execution using the native function `scalar_in_array`
rather than an <code>||</code> of `==`, even if the number of values
is below `inFunctionThreshold`. This property only affects translation of SQL
`IN` to a [native expression](math-expr.md). It doesn't affect translation of
SQL `IN` to a [native filter](filters.md). This property is provided for
backwards compatibility purposes [...]
|`inSubQueryThreshold`|At or beyond this threshold number of values, Druid
converts SQL `IN` to `JOIN` on an inline table. `inFunctionThreshold` takes
priority over this setting. A threshold of 0 forces usage of an inline table in
all cases where the size of a SQL `IN` is larger than `inFunctionThreshold`. A
threshold of `2147483647` disables the rewrite of SQL `IN` to `JOIN`.
|`2147483647`|
-## Set the query context
-
-How query context parameters are set differs depending on whether you are
using the [JSON API](../api-reference/sql-api.md) or
[JDBC](../api-reference/sql-jdbc.md).
-
-### Set the query context when using JSON API
-When using the JSON API, you can configure query context parameters in the
`context` object of the request.
-
-For example:
-
-```
-{
- "query" : "SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time >
TIMESTAMP '2000-01-01 00:00:00'",
- "context" : {
- "sqlTimeZone" : "America/Los_Angeles",
- "useCache": false
- }
-}
-```
-
-Context parameters can also be set by including [SET](./sql.md#set) as part of
the `query`
-string in the request, separated from the query by `;`. Context parameters set
by `SET` statements take priority over
-values set in `context`.
-
-The following example expresses the previous example in this form:
-
-```
-{
- "query" : "SET sqlTimeZone = 'America/Los_Angeles'; SET useCache = false;
SELECT COUNT(*) FROM data_source WHERE foo = 'bar' AND __time > TIMESTAMP
'2000-01-01 00:00:00'"
-}
-```
-
-### Set the query context when using JDBC
-If using JDBC, context parameters can be set using [connection properties
object](../api-reference/sql-jdbc.md).
-
-For example:
-
-```java
-String url =
"jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica/";
-
-// Set any query context parameters you need here.
-Properties connectionProperties = new Properties();
-connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");
-connectionProperties.setProperty("useCache", "false");
-
-try (Connection connection = DriverManager.getConnection(url,
connectionProperties)) {
- // create and execute statements, process result sets, etc
-}
-```
+## Learn more
+- [Set query context](../querying/query-context.md) for how to set the query
context.
+- [Query context reference](query-context-reference.md) for available query
context parameters.
+- [MSQ context
parameters](../multi-stage-query/reference.md#context-parameters) for how to
set context parameters for Multi-Stage Queries.
diff --git a/docs/querying/sql.md b/docs/querying/sql.md
index 98977e422f8..c2310531694 100644
--- a/docs/querying/sql.md
+++ b/docs/querying/sql.md
@@ -383,7 +383,6 @@ Request logs show the exact native query that will be run.
Alternatively, to see
SET statements allow you to specify SQL query context parameters that modify
the behavior of a Druid SQL query. You can include one or more SET statements
before the main SQL query. Druid supports using SET in the Druid SQL [JSON
API](../api-reference/sql-api.md) and the [web
console](../operations/web-console.md).
-
The syntax of a `SET` statement is:
```sql
@@ -401,10 +400,14 @@ SELECT some_column, COUNT(*) FROM druid.foo WHERE
other_column = 'foo' GROUP BY
SET statements only apply to the query in the same request. Subsequent
requests are not affected.
+SET statements work with SELECT, INSERT, and REPLACE queries.
+
If you use the [JSON API](../api-reference/sql-api.md), you can also include
query context parameters using the `context` field. If you include both, the
parameter value in SET takes precedence over the parameter value in `context`.
Note that you can only use SET to assign literal values, such as numbers,
strings, or Booleans. To set a query context parameter to an array or JSON
object, use the `context` field rather than SET.
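
The precedence rule above can be illustrated with a minimal Python sketch. It is not part of this commit or of Druid itself: the helper names `render_literal`, `build_sql_request`, and `effective_context` are hypothetical, and the single-quote escaping assumes standard SQL doubling. The sketch only builds the JSON request body; it does not send it.

```python
import json


def render_literal(value):
    """Render a Python value as a SQL literal for a SET statement."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Assumption: single quotes are escaped by doubling, as in standard SQL.
        return "'" + value.replace("'", "''") + "'"
    # SET only accepts literals, so arrays and objects are rejected here.
    raise ValueError("SET accepts only literals; put arrays and objects in `context`")


def build_sql_request(query, context=None, set_params=None):
    """Return a request body dict with SET statements prepended to the query."""
    prefix = "".join(
        f"SET {key} = {render_literal(value)}; "
        for key, value in (set_params or {}).items()
    )
    body = {"query": prefix + query}
    if context:
        body["context"] = dict(context)
    return body


def effective_context(context=None, set_params=None):
    """Model the documented precedence: SET overrides the same key in `context`."""
    return {**(context or {}), **(set_params or {})}


request = build_sql_request(
    "SELECT COUNT(*) FROM data_source WHERE foo = 'bar'",
    context={"sqlTimeZone": "UTC", "useCache": True},
    set_params={"sqlTimeZone": "America/Los_Angeles", "useCache": False},
)
print(json.dumps(request, indent=2))
```

Here `effective_context` only models the rule on the client side for illustration; Druid applies the actual precedence when it processes the request.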
+For other approaches to set the query context, see [Set query
context](./query-context.md).
+
## Identifiers and literals
Identifiers like datasource and column names can optionally be quoted using
double quotes. To escape a double quote
diff --git a/docs/querying/timeboundaryquery.md
b/docs/querying/timeboundaryquery.md
index 81fece0e60e..16166203942 100644
--- a/docs/querying/timeboundaryquery.md
+++ b/docs/querying/timeboundaryquery.md
@@ -48,7 +48,7 @@ There are 3 main parts to a time boundary query:
|dataSource|A String or Object defining the data source to query, very similar
to a table in a relational database. See
[DataSource](../querying/datasource.md) for more information.|yes|
|bound | Optional, set to `maxTime` or `minTime` to return only the latest
or earliest timestamp. Defaults to returning both if not set| no |
|filter|See [Filters](../querying/filters.md)|no|
-|context|See [Context](../querying/query-context.md)|no|
+|context|See [Query context
reference](../querying/query-context-reference.md)|no|
The format of the result is:
diff --git a/docs/querying/timeseriesquery.md b/docs/querying/timeseriesquery.md
index 4266a316585..6d093f2a749 100644
--- a/docs/querying/timeseriesquery.md
+++ b/docs/querying/timeseriesquery.md
@@ -84,7 +84,7 @@ There are 7 main parts to a timeseries query:
|aggregations|See [Aggregations](../querying/aggregations.md)|no|
|postAggregations|See [Post Aggregations](../querying/post-aggregations.md)|no|
|limit|An integer that limits the number of results. The default is
unlimited.|no|
-|context|Can be used to modify query behavior, including [grand
totals](#grand-totals) and [empty bucket values](#empty-bucket-values). See
also [Context](../querying/query-context.md) for parameters that apply to all
query types.|no|
+|context|Can be used to modify query behavior, including [grand
totals](#grand-totals) and [empty bucket values](#empty-bucket-values). See
also [Query context reference](../querying/query-context-reference.md) for
parameters that apply to all query types.|no|
To pull it all together, the above query would return 2 data points, one for
each day between 2012-01-01 and 2012-01-03, from the "sample\_datasource"
table. Each data point would be the (long) sum of sample\_fieldName1, the
(double) sum of sample\_fieldName2 and the (double) result of
sample\_fieldName1 divided by sample\_fieldName2 for the filter set. The output
looks like this:
diff --git a/docs/querying/topnquery.md b/docs/querying/topnquery.md
index 25d8b224802..fe03532314e 100644
--- a/docs/querying/topnquery.md
+++ b/docs/querying/topnquery.md
@@ -111,7 +111,7 @@ There are 11 parts to a topN query.
|dimension|A String or JSON object defining the dimension that you want the
top taken for. For more info, see
[DimensionSpecs](../querying/dimensionspecs.md)|yes|
|threshold|An integer defining the N in the topN (i.e. how many results you
want in the top list)|yes|
|metric|A String or JSON object specifying the metric to sort by for the top
list. For more info, see [TopNMetricSpec](../querying/topnmetricspec.md).|yes|
-|context|See [Context](../querying/query-context.md)|no|
+|context|See [Query context
reference](../querying/query-context-reference.md)|no|
Please note the context JSON object is also available for topN queries and
should be used with the same caution as the timeseries case.
The format of the results would look like so:
diff --git a/docs/querying/using-caching.md b/docs/querying/using-caching.md
index 85ac33126e9..02561fac59a 100644
--- a/docs/querying/using-caching.md
+++ b/docs/querying/using-caching.md
@@ -72,7 +72,8 @@ druid.broker.cache.populateResultLevelCache=true
See [Broker caching](../configuration/index.md#broker-caching) for a
description of all available Broker cache configurations.
## Enabling caching in the query context
-As long as the service is set to populate the cache, you can set cache options
for individual queries in the query [context](./query-context.md). For example,
you can `POST` a Druid SQL request to the HTTP POST API and include the context
as a JSON object:
+
+As long as the service is set to populate the cache, you can set cache options
for individual queries in the [query context](./query-context-reference.md).
For example, you can send a POST request to the Druid SQL API and include the
context as a JSON object:
```
{
@@ -91,5 +92,5 @@ You can also use the SET command to specify cache options
directly within your S
## Learn more
See the following topics for more information:
- [Query caching](./caching.md) for an overview of caching.
-- [Query context](./query-context.md) for more details and usage for the query
context.
+- [Query context reference](./query-context-reference.md) for more details
about query context parameters.
- [Cache configuration](../configuration/index.md#cache-configuration) for
information about different cache types and additional configuration options.
diff --git a/docs/release-info/migr-subquery-limit.md
b/docs/release-info/migr-subquery-limit.md
index adf600cdc67..cad8ef63389 100644
--- a/docs/release-info/migr-subquery-limit.md
+++ b/docs/release-info/migr-subquery-limit.md
@@ -60,5 +60,5 @@ See [Metrics
monitors](../configuration/index.md#metrics-monitors-for-each-servi
See the following topics for more information:
-- [Query context](../querying/query-context.md) for information on setting
query context parameters.
+- [Query context reference](../querying/query-context-reference.md) for
information on query context parameters.
- [Broker configuration
reference](../configuration#guardrails-for-materialization-of-subqueries) for
more information on `maxSubqueryRows` and `maxSubqueryBytes`.
diff --git a/docs/tutorials/tutorial-query.md b/docs/tutorials/tutorial-query.md
index 1ad7e8e28bf..54563513a48 100644
--- a/docs/tutorials/tutorial-query.md
+++ b/docs/tutorials/tutorial-query.md
@@ -143,7 +143,7 @@ from the command line or over HTTP.
:::
-9. Finally, click `...` and **Edit context** to see how you can add
additional parameters controlling the execution of the query execution. In the
field, enter query context options as JSON key-value pairs, as described in
[Context flags](../querying/query-context.md).
+9. Finally, click `...` and **Edit context** to see how you can add
additional parameters controlling the execution of the query. In the
field, enter query context options as JSON key-value pairs, as described in
[Set query context](../querying/query-context.md#web-console).
That's it! We've built a simple query using some of the query builder features
built into the web console. The following
sections provide a few more example queries you can try.
diff --git a/website/sidebars.json b/website/sidebars.json
index b7cf6675038..8cda4cec978 100644
--- a/website/sidebars.json
+++ b/website/sidebars.json
@@ -204,7 +204,8 @@
"querying/multitenancy",
"querying/caching",
"querying/using-caching",
- "querying/query-context"
+ "querying/query-context",
+ "querying/query-context-reference"
]
},
{