This is an automated email from the ASF dual-hosted git repository.
techdocsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/master by this push:
new 4648a41cab5 [Docs] Adds tutorial and stepwise instructions for EXTERN (#17501)
4648a41cab5 is described below
commit 4648a41cab55b2ac4d28e1f471c92da2402d68cb
Author: Charles Smith <[email protected]>
AuthorDate: Thu Dec 19 12:49:56 2024 -0800
[Docs] Adds tutorial and stepwise instructions for EXTERN (#17501)
* draft tutorial on extern:
* updated draft
* updates
* add sidebar, fix reference wording
* update reference
* final updates to reference
* update
* add cloud info to tutorial
* fix conflict
* fix link
* Update docs/multi-stage-query/reference.md
Co-authored-by: Katya Macedo <[email protected]>
* Update docs/multi-stage-query/reference.md
Co-authored-by: Katya Macedo <[email protected]>
* Apply suggestions from code review
Co-authored-by: Katya Macedo <[email protected]>
* Apply suggestions from code review
Co-authored-by: Katya Macedo <[email protected]>
* fixes
* make hyperlink to console
* Update docs/multi-stage-query/reference.md
Co-authored-by: Katya Macedo <[email protected]>
* Apply suggestions from code review
Co-authored-by: Katya Macedo <[email protected]>
* fix typos
---------
Co-authored-by: Katya Macedo <[email protected]>
---
docs/multi-stage-query/reference.md | 103 +++++++++---------
docs/multi-stage-query/security.md | 3 +
docs/tutorials/tutorial-extern.md | 206 ++++++++++++++++++++++++++++++++++++
website/sidebars.json | 3 +-
4 files changed, 265 insertions(+), 50 deletions(-)
diff --git a/docs/multi-stage-query/reference.md b/docs/multi-stage-query/reference.md
index 6facbaedcb3..d34ba1bdd4e 100644
--- a/docs/multi-stage-query/reference.md
+++ b/docs/multi-stage-query/reference.md
@@ -72,9 +72,8 @@ FROM TABLE(
`name` and a `type`. The type can be `string`, `long`, `double`, or `float`. This row signature is used to map the external data into the SQL layer.
-Variation 2, with the input schema expressed in SQL using an `EXTEND` clause. (See the next
-section for more detail on `EXTEND`). This format also uses named arguments to make the
-SQL a bit easier to read:
+Variation 2, with the input schema expressed in SQL using an `EXTEND` clause. See the next
+section for more detail on `EXTEND`. This format also uses named arguments to make the SQL easier to read:
```sql
SELECT
@@ -95,12 +94,12 @@ For more information, see [Read external data with EXTERN](concepts.md#read-exte
#### `EXTERN` to export to a destination
-`EXTERN` can be used to specify a destination where you want to export data to.
-This variation of EXTERN requires one argument, the details of the destination as specified below.
-This variation additionally requires an `AS` clause to specify the format of the exported rows.
+You can use `EXTERN` to specify a destination to export data.
+This variation of `EXTERN` accepts the details of the destination as the only argument and requires an `AS` clause to specify the format of the exported rows.
+
+When you export data, Druid creates metadata files in a subdirectory named `_symlink_format_manifest`.
+Within the `_symlink_format_manifest` directory, the `manifest` file lists absolute paths to exported files using the symlink manifest format. For example:
-While exporting data, some metadata files will also be created at the destination in addition to the data. These files will be created in a directory `_symlink_format_manifest`.
-- `_symlink_format_manifest/manifest`: Lists the files which were created as part of the export. The file is in the symlink manifest format, and consists of a list of absolute paths to the files created.
```text
s3://export-bucket/export/query-6564a32f-2194-423a-912e-eead470a37c4-worker2-partition2.csv
s3://export-bucket/export/query-6564a32f-2194-423a-912e-eead470a37c4-worker1-partition1.csv
@@ -112,8 +111,8 @@ s3://export-bucket/export/query-6564a32f-2194-423a-912e-eead470a37c4-worker0-par
Keep the following in mind when using EXTERN to export rows:
- Only INSERT statements are supported.
- Only `CSV` format is supported as an export format.
-- Partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) aren't supported with export statements.
-- You can export to Amazon S3 or local storage.
+- Partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) aren't supported with EXTERN statements.
+- You can export to Amazon S3, Google Cloud Storage (GCS), or local storage.
- The destination provided should contain no other files or directories.
When you export data, use the `rowsPerPage` context parameter to restrict the size of exported files.
@@ -128,10 +127,14 @@ SELECT
FROM <table>
```
-##### S3
+##### S3 - Amazon S3
+
+To export results to S3, pass the `s3()` function as an argument to the `EXTERN` function.
+Export to S3 requires the `druid-s3-extensions` extension.
+For a list of S3 permissions the MSQ task engine requires to perform export, see [Permissions for durable storage](./security.md#s3).
-Export results to S3 by passing the function `s3()` as an argument to the `EXTERN` function. Note that this requires the `druid-s3-extensions`.
-The `s3()` function is a Druid function that configures the connection. Arguments for `s3()` should be passed as named parameters with the value in single quotes like the following example:
+The `s3()` function configures the connection to AWS.
+Pass all arguments for `s3()` as named parameters with their values enclosed in single quotes. For example:
```sql
INSERT INTO
@@ -146,25 +149,26 @@ FROM <table>
Supported arguments for the function:
-| Parameter | Required | Description | Default |
-|-------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|
-| `bucket` | Yes | The S3 bucket to which the files are exported to. The bucket and prefix combination should be whitelisted in `druid.export.storage.s3.allowedExportPaths`. | n/a |
-| `prefix` | Yes | Path where the exported files would be created. The export query expects the destination to be empty. If the location includes other files, then the query will fail. The bucket and prefix combination should be whitelisted in `druid.export.storage.s3.allowedExportPaths`. | n/a |
+| Parameter | Required | Description | Default |
+|---|---|---|---|
+| `bucket` | Yes | S3 bucket destination for exported files. You must add the bucket and prefix combination to the `druid.export.storage.s3.allowedExportPaths` allow list. | n/a |
+| `prefix` | Yes | Destination path in the bucket to create exported files. The export query expects the destination path to be empty. If the location includes other files, the query will fail. You must add the bucket and prefix combination to the `druid.export.storage.s3.allowedExportPaths` allow list. | n/a |
-The following runtime parameters must be configured to export into an S3 destination:
+Configure the following runtime parameters to export to an S3 destination:
-| Runtime Parameter | Required | Description | Default |
-|----------------------------------------------|----------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|
-| `druid.export.storage.s3.allowedExportPaths` | Yes | An array of S3 prefixes that are whitelisted as export destinations. Export queries fail if the export destination does not match any of the configured prefixes. Example: `[\"s3://bucket1/export/\", \"s3://bucket2/export/\"]` | n/a |
-| `druid.export.storage.s3.tempLocalDir` | No | Directory used on the local storage of the worker to store temporary files required while uploading the data. Uses the task temporary directory by default. | n/a |
-| `druid.export.storage.s3.maxRetry` | No | Defines the max number times to attempt S3 API calls to avoid failures due to transient errors. | 10 |
-| `druid.export.storage.s3.chunkSize` | No | Defines the size of each chunk to temporarily store in `tempDir`. The chunk size must be between 5 MiB and 5 GiB. A large chunk size reduces the API calls to S3, however it requires more disk space to store the temporary chunks. | 100MiB |
+| Runtime parameter | Required | Description | Default |
+|---|---|---|---|
+| `druid.export.storage.s3.allowedExportPaths` | Yes | Array of S3 prefixes allowed as export destinations. Export queries fail if the export destination does not match any of the configured prefixes. For example: `[\"s3://bucket1/export/\", \"s3://bucket2/export/\"]` | n/a |
+| `druid.export.storage.s3.tempLocalDir` | No | Directory for local storage where the worker stores temporary files before uploading the data to S3. | n/a |
+| `druid.export.storage.s3.maxRetry` | No | Maximum number of attempts for S3 API calls to avoid failures due to transient errors. | 10 |
+| `druid.export.storage.s3.chunkSize` | No | Individual chunk size to store temporarily in `tempDir`. Large chunk sizes reduce the number of API calls to S3, but require more disk space to store temporary chunks. | 100MiB |
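+
+As an illustration, these runtime parameters might look like the following in `common.runtime.properties`. The bucket name and staging directory here are hypothetical placeholders, and the exact quoting of the array value can vary with how you manage your properties files:
+
+```
+# Allow exports only under this prefix
+druid.export.storage.s3.allowedExportPaths=["s3://example-bucket/export/"]
+# Optional: stage temporary chunks in a specific local directory
+druid.export.storage.s3.tempLocalDir=/tmp/druid-export-staging
+```
+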
+##### GOOGLE - Google Cloud Storage
-##### GS
+To export query results to Google Cloud Storage (GCS), pass the `google()` function as an argument to the `EXTERN` function.
+Export to GCS requires the `druid-google-extensions` extension.
-Export results to GCS by passing the function `google()` as an argument to the `EXTERN` function. Note that this requires the `druid-google-extensions`.
-The `google()` function is a Druid function that configures the connection. Arguments for `google()` should be passed as named parameters with the value in single quotes like the following example:
+The `google()` function configures the connection to GCS. Pass the arguments for `google()` as named parameters with their values enclosed in single quotes. For example:
```sql
INSERT INTO
@@ -179,29 +183,30 @@ FROM <table>
Supported arguments for the function:
-| Parameter | Required | Description | Default |
-|-------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|
-| `bucket` | Yes | The GS bucket to which the files are exported to. The bucket and prefix combination should be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
-| `prefix` | Yes | Path where the exported files would be created. The export query expects the destination to be empty. If the location includes other files, then the query will fail. The bucket and prefix combination should be whitelisted in `druid.export.storage.google.allowedExportPaths`. | n/a |
+| Parameter | Required | Description | Default |
+|---|---|---|---|
+| `bucket` | Yes | GCS bucket destination for exported files. You must add the bucket and prefix combination to the `druid.export.storage.google.allowedExportPaths` allow list. | n/a |
+| `prefix` | Yes | Destination path in the bucket to create exported files. The export query expects the destination path to be empty. If the location includes other files, the query will fail. You must add the bucket and prefix combination to the `druid.export.storage.google.allowedExportPaths` allow list. | n/a |
-The following runtime parameters must be configured to export into a GCS destination:
+Configure the following runtime parameters to export query results to a GCS destination:
-| Runtime Parameter | Required | Description | Default |
-|--------------------------------------------------|----------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|
-| `druid.export.storage.google.allowedExportPaths` | Yes | An array of GS prefixes that are allowed as export destinations. Export queries fail if the export destination does not match any of the configured prefixes. Example: `[\"gs://bucket1/export/\", \"gs://bucket2/export/\"]` | n/a |
-| `druid.export.storage.google.tempLocalDir` | No | Directory used on the local storage of the worker to store temporary files required while uploading the data. Uses the task temporary directory by default. | n/a |
-| `druid.export.storage.google.maxRetry` | No | Defines the max number times to attempt GS API calls to avoid failures due to transient errors. | 10 |
-| `druid.export.storage.google.chunkSize` | No | Defines the size of each chunk to temporarily store in `tempDir`. A large chunk size reduces the API calls to GS; however, it requires more disk space to store the temporary chunks. | 4MiB |
+| Runtime parameter | Required | Description | Default |
+|---|---|---|---|
+| `druid.export.storage.google.allowedExportPaths` | Yes | Array of GCS prefixes allowed as export destinations. Export queries fail if the export destination does not match any of the configured prefixes. For example: `[\"gs://bucket1/export/\", \"gs://bucket2/export/\"]` | n/a |
+| `druid.export.storage.google.tempLocalDir` | No | Directory for local storage where the worker stores temporary files before uploading the data to GCS. | n/a |
+| `druid.export.storage.google.maxRetry` | No | Maximum number of attempts for GCS API calls to avoid failures due to transient errors. | 10 |
+| `druid.export.storage.google.chunkSize` | No | Individual chunk size to store temporarily in `tempDir`. Large chunk sizes reduce the number of API calls to GCS, but require more disk space to store temporary chunks. | 4 MiB |
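+
+A comparable sketch for GCS, again with a hypothetical bucket. The chunk size is included only to show overriding the default:
+
+```
+druid.export.storage.google.allowedExportPaths=["gs://example-bucket/export/"]
+druid.export.storage.google.chunkSize=8MiB
+```
+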
-##### LOCAL
-You can export to the local storage, which exports the results to the filesystem of the MSQ worker.
+##### LOCAL - local file storage
+
+You can export query results to local storage. This process writes the results to the filesystem on the MSQ worker.
This is useful in a single node setup or for testing but is not suitable for production use cases.
-Export results to local storage by passing the function `LOCAL()` as an argument for the `EXTERN FUNCTION`.
-To use local storage as an export destination, the runtime property `druid.export.storage.baseDir` must be configured on the Indexer/Middle Manager.
-This value must be set to an absolute path on the local machine. Exporting data will be allowed to paths which match the prefix set by this value.
-Arguments to `LOCAL()` should be passed as named parameters with the value in single quotes in the following example:
+To export results to local storage, pass the `LOCAL()` function as an argument to the `EXTERN` function.
+You must configure the runtime property `druid.export.storage.baseDir` as an absolute path on the Indexer or Middle Manager to use local storage as an export destination.
+You can export data to paths that match this value as a prefix.
+Pass all arguments to `LOCAL()` as named parameters with values enclosed in single quotes. For example:
```sql
INSERT INTO
@@ -214,13 +219,13 @@ SELECT
FROM <table>
```
-Supported arguments to the function:
+Supported arguments for the function:
-| Parameter | Required | Description | Default |
-|-------------|--------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
-| `exportPath` | Yes | Absolute path to a subdirectory of `druid.export.storage.baseDir` used as the destination to export the results to. The export query expects the destination to be empty. If the location includes other files or directories, then the query will fail. | n/a |
+| Parameter | Required | Description | Default |
+|---|---|---|---|
+| `exportPath` | Yes | Absolute path to a subdirectory of `druid.export.storage.baseDir` where Druid exports the query results. The destination must be empty. If the location includes other files or directories, the query will fail. | n/a |
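+
+To illustrate how `exportPath` relates to the runtime property, suppose `druid.export.storage.baseDir` is set to the hypothetical path `/opt/druid/export`. An export to a subdirectory of that path would then be accepted:
+
+```sql
+INSERT INTO
+   EXTERN(
+      local(exportPath => '/opt/druid/export/query1')
+   )
+AS CSV
+SELECT
+ <column>
+FROM <table>
+```
+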
-For more information, see [Read external data with EXTERN](concepts.md#write-to-an-external-destination-with-extern).
+For more information, see [Export data with EXTERN](concepts.md#write-to-an-external-destination-with-extern).
### `INSERT`
diff --git a/docs/multi-stage-query/security.md b/docs/multi-stage-query/security.md
index 2aed00ab851..d98695ebedc 100644
--- a/docs/multi-stage-query/security.md
+++ b/docs/multi-stage-query/security.md
@@ -81,3 +81,6 @@ The MSQ task engine needs the following permissions for pushing, fetching, and r
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write` to write files in durable storage.
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action` to create files in durable storage.
- `Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete` to delete files when they're no longer needed.
+
+<!--TBD GCS-->
+
diff --git a/docs/tutorials/tutorial-extern.md b/docs/tutorials/tutorial-extern.md
new file mode 100644
index 00000000000..d44dd19a154
--- /dev/null
+++ b/docs/tutorials/tutorial-extern.md
@@ -0,0 +1,206 @@
+---
+id: tutorial-extern
+title: Export query results
+sidebar_label: Export results
+description: How to use EXTERN to export query results.
+---
+
+<!--
+ ~ Licensed to the Apache Software Foundation (ASF) under one
+ ~ or more contributor license agreements. See the NOTICE file
+ ~ distributed with this work for additional information
+ ~ regarding copyright ownership. The ASF licenses this file
+ ~ to you under the Apache License, Version 2.0 (the
+ ~ "License"); you may not use this file except in compliance
+ ~ with the License. You may obtain a copy of the License at
+ ~
+ ~ http://www.apache.org/licenses/LICENSE-2.0
+ ~
+ ~ Unless required by applicable law or agreed to in writing,
+ ~ software distributed under the License is distributed on an
+ ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ~ KIND, either express or implied. See the License for the
+ ~ specific language governing permissions and limitations
+ ~ under the License.
+ -->
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This tutorial demonstrates how to use the Apache Druid® SQL [EXTERN](../multi-stage-query/reference.md#extern-function) function to export data.
+
+## Prerequisites
+
+Before you follow the steps in this tutorial, download Druid as described in the [Local quickstart](index.md).
+Don't start Druid yet; you'll do that as part of the tutorial.
+
+You should be familiar with ingesting and querying data in Druid.
+If you haven't already, go through the [Query data](../tutorials/tutorial-query.md) tutorial first.
+
+## Export query results to the local file system
+
+This example demonstrates how to configure Druid to export data to the local file system.
+While you can use this approach to learn about EXTERN syntax for exporting data, it's not suitable for production scenarios.
+
+### Configure Druid local export directory
+
+The following commands set the base path for the Druid exports to `/tmp/druid/`.
+If the account running Druid doesn't have access to `/tmp/druid/`, change the path.
+For example: `/Users/Example/druid`.
+If you change the path in this step, use the updated path in all subsequent steps.
+
+From the root of the Druid distribution, run the following:
+
+```bash
+export export_path="/tmp/druid"
+sed -i -e $'$a\\\n\\\n\\\n#\\\n###Local export\\\n#\\\ndruid.export.storage.baseDir='$export_path' conf/druid/auto/_common/common.runtime.properties
+```
+
+This adds the following section to the Druid `common.runtime.properties` configuration file located in `conf/druid/auto/_common`:
+
+```
+#
+###Local export
+#
+druid.export.storage.baseDir=/tmp/druid/
+```
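+
+Optionally, confirm that the lines were appended by printing the end of the file. This quick check assumes you're still in the root of the Druid distribution:
+
+```bash
+tail -n 6 conf/druid/auto/_common/common.runtime.properties
+```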
+
+### Start Druid and load sample data
+
+1. From the root of the Druid distribution, launch Druid as follows:
+
+ ```bash
+ ./bin/start-druid
+ ```
+1. After Druid starts, open [http://localhost:8888/](http://localhost:8888/) in your browser to access the Web Console.
+1. From the [Query view](http://localhost:8888/unified-console.html#workbench), run the following command to load the Wikipedia example data set:
+ ```sql
+ REPLACE INTO "wikipedia" OVERWRITE ALL
+ WITH "ext" AS (
+ SELECT *
+ FROM TABLE(
+ EXTERN(
+     '{"type":"http","uris":["https://druid.apache.org/data/wikipedia.json.gz"]}',
+ '{"type":"json"}'
+ )
+  ) EXTEND ("isRobot" VARCHAR, "channel" VARCHAR, "timestamp" VARCHAR, "flags" VARCHAR, "isUnpatrolled" VARCHAR, "page" VARCHAR, "diffUrl" VARCHAR, "added" BIGINT, "comment" VARCHAR, "commentLength" BIGINT, "isNew" VARCHAR, "isMinor" VARCHAR, "delta" BIGINT, "isAnonymous" VARCHAR, "user" VARCHAR, "deltaBucket" BIGINT, "deleted" BIGINT, "namespace" VARCHAR, "cityName" VARCHAR, "countryName" VARCHAR, "regionIsoCode" VARCHAR, "metroCode" BIGINT, "countryIsoCode" VARCHAR, "regionName" VARCHAR)
+ )
+ SELECT
+ TIME_PARSE("timestamp") AS "__time",
+ "isRobot",
+ "channel",
+ "flags",
+ "isUnpatrolled",
+ "page",
+ "diffUrl",
+ "added",
+ "comment",
+ "commentLength",
+ "isNew",
+ "isMinor",
+ "delta",
+ "isAnonymous",
+ "user",
+ "deltaBucket",
+ "deleted",
+ "namespace",
+ "cityName",
+ "countryName",
+ "regionIsoCode",
+ "metroCode",
+ "countryIsoCode",
+ "regionName"
+ FROM "ext"
+ PARTITIONED BY DAY
+ ```
+
+### Query to export data
+
+Open a new tab and run the following query to export query results to the path `/tmp/druid/wiki_example`.
+The path must be a subdirectory of `druid.export.storage.baseDir`.
+
+
+```sql
+INSERT INTO
+ EXTERN(
+ local(exportPath => '/tmp/druid/wiki_example')
+ )
+AS CSV
+SELECT "channel",
+ SUM("delta") AS "changes"
+FROM "wikipedia"
+GROUP BY 1
+LIMIT 10
+```
+
+Druid exports the results of the query to the `/tmp/druid/wiki_example` directory.
+Run the following command to list the contents of the directory:
+```bash
+ls /tmp/druid/wiki_example
+```
+
+The results are a CSV file export of the data and the `_symlink_format_manifest` metadata directory.
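+
+The listing resembles the following, where the query ID portion of the file name differs for each run. This output is illustrative, not verbatim:
+
+```text
+_symlink_format_manifest
+query-<query_id>-worker0-partition1.csv
+```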
+
+## Export query results to cloud storage
+
+The steps to export to cloud storage are similar to exporting to the local file system.
+Druid supports Amazon S3 or Google Cloud Storage (GCS) as cloud storage destinations.
+
+1. Enable the extension for your cloud storage destination. See [Loading core extensions](../configuration/extensions.md#loading-core-extensions).
+   - **Amazon S3**: `druid-s3-extensions`
+   - **GCS**: `druid-google-extensions`
+1. Configure the additional properties for your cloud storage destination. Replace `{CLOUD}` with `s3` or `google` accordingly:
+   - `druid.export.storage.{CLOUD}.tempLocalDir`: Local temporary directory where the query engine stages files to export.
+   - `druid.export.storage.{CLOUD}.allowedExportPaths`: S3 or GCS prefixes allowed as Druid export locations. For example `[\"s3://bucket1/export/\",\"s3://bucket2/export/\"]` or `[\"gs://bucket1/export/\", \"gs://bucket2/export/\"]`.
+   - `druid.export.storage.{CLOUD}.maxRetry`: Maximum number of times to attempt cloud API calls to avoid failures from transient errors.
+   - `druid.export.storage.{CLOUD}.chunkSize`: Maximum size of individual data chunks to store in the temporary directory.
+1. Verify the instance role has the correct permissions to the bucket and folders: read, write, create, and delete. See [Permissions for durable storage](../multi-stage-query/security.md#permissions-for-durable-storage).
+1. Use the query syntax for your cloud storage type. For example:
+
+ <Tabs>
+
+ <TabItem value="1" label="S3">
+
+ ```sql
+ INSERT INTO
+ EXTERN(
+ s3(bucket => 'your_bucket', prefix => 'prefix/to/files'))
+ AS CSV
+ SELECT "channel",
+ SUM("delta") AS "changes"
+ FROM "wikipedia"
+ GROUP BY 1
+ LIMIT 10
+ ```
+
+ </TabItem>
+
+ <TabItem value="2" label="GCS">
+
+ ```sql
+ INSERT INTO
+    EXTERN(
+      google(bucket => 'your_bucket', prefix => 'prefix/to/files'))
+ AS CSV
+ SELECT "channel",
+ SUM("delta") AS "changes"
+ FROM "wikipedia"
+ GROUP BY 1
+ LIMIT 10
+ ```
+
+ </TabItem>
+
+ </Tabs>
+
+1. When querying, use the `rowsPerPage` query context parameter to restrict the output file size. While it's possible to add a very large LIMIT at the end of your query to force Druid to create a single file, we don't recommend this technique.
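+
+   As a sketch, one way to set this parameter is through the query context, either in the web console's context controls or in the `context` object of a SQL API request. The value below is illustrative, not a recommendation:
+
+   ```json
+   {
+     "rowsPerPage": 50000
+   }
+   ```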
+
+## Learn more
+
+See the following topics for more information:
+
+* [Export to a destination](../multi-stage-query/reference.md#extern-to-export-to-a-destination) for reference documentation on the EXTERN function.
+* [SQL-based ingestion security](../multi-stage-query/security.md#permissions-for-durable-storage) for the cloud permission requirements for MSQ.
diff --git a/website/sidebars.json b/website/sidebars.json
index e5304006318..b7cf6675038 100644
--- a/website/sidebars.json
+++ b/website/sidebars.json
@@ -38,7 +38,8 @@
"tutorials/tutorial-sql-query-view",
"tutorials/tutorial-unnest-arrays",
"tutorials/tutorial-query-deep-storage",
- "tutorials/tutorial-latest-by"]
+ "tutorials/tutorial-latest-by",
+ "tutorials/tutorial-extern"]
},
"tutorials/tutorial-sketches-theta",
"tutorials/tutorial-jdbc",
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]