adarshsanjeev commented on code in PR #17501:
URL: https://github.com/apache/druid/pull/17501#discussion_r1891185789


##########
docs/tutorials/tutorial-extern.md:
##########
@@ -0,0 +1,206 @@
+---
+id: tutorial-extern
+title: Export query results
+sidebar_label: Export results
+description: How to use EXTERN to export query results.
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This tutorial demonstrates how to use the Apache Druid&circledR; SQL 
[EXTERN](../multi-stage-query/reference.md#extern-function) function to export 
data.
+
+## Prerequisites
+
+Before you follow the steps in this tutorial, download Druid as described in 
the [Local quickstart](index.md).
+Don't start Druid yet. You'll do that as part of the tutorial.
+
+You should be familiar with ingesting and querying data in Druid.
+If you haven't already, go through the [Query 
data](../tutorials/tutorial-query.md) tutorial first.
+
+## Export query results to the local file system
+
+This example demonstrates how to configure Druid to export data to the local 
file system.
+While you can use this approach to learn about EXTERN syntax for exporting 
data, it's not suitable for production scenarios.
+
+### Configure Druid local export directory 
+
+The following commands set the base path for the Druid exports to 
`/tmp/druid/`.
+If the account running Druid doesn't have access to `/tmp/druid/`, change the 
path.
+For example: `/Users/Example/druid`.
+If you change the path in this step, use the updated path in all subsequent 
steps.
+
+From the root of the Druid distribution, run the following:
+
+```bash
+export export_path="/tmp/druid"
+sed -i -e $'$a\\\n\\\n\\\n#\\\n###Local export\\\n#\\\ndruid.export.storage.baseDir='"$export_path" conf/druid/auto/_common/common.runtime.properties
+```
+
+This adds the following section to the Druid `common.runtime.properties` 
configuration file located in `conf/druid/auto/_common`:
+
+```
+#
+###Local export
+#
+druid.export.storage.baseDir=/tmp/druid
+```
+
+### Start Druid and load sample data
+
+1. From the root of the Druid distribution, launch Druid as follows:
+
+     ```bash
+    ./bin/start-druid
+     ```
+1. After Druid starts, open [http://localhost:8888/](http://localhost:8888/) 
in your browser to access the Web Console.
+1. From the [Query 
view](http://localhost:8888/unified-console.html#workbench), run the following 
command to load the Wikipedia example data set:
+     ```sql
+     REPLACE INTO "wikipedia" OVERWRITE ALL
+     WITH "ext" AS (
+       SELECT *
+       FROM TABLE(
+         EXTERN(
+           
'{"type":"http","uris":["https://druid.apache.org/data/wikipedia.json.gz"]}',
+           '{"type":"json"}'
+         )
+       ) EXTEND ("isRobot" VARCHAR, "channel" VARCHAR, "timestamp" VARCHAR, 
"flags" VARCHAR, "isUnpatrolled" VARCHAR, "page" VARCHAR, "diffUrl" VARCHAR, 
"added" BIGINT, "comment" VARCHAR, "commentLength" BIGINT, "isNew" VARCHAR, 
"isMinor" VARCHAR, "delta" BIGINT, "isAnonymous" VARCHAR, "user" VARCHAR, 
"deltaBucket" BIGINT, "deleted" BIGINT, "namespace" VARCHAR, "cityName" 
VARCHAR, "countryName" VARCHAR, "regionIsoCode" VARCHAR, "metroCode" BIGINT, 
"countryIsoCode" VARCHAR, "regionName" VARCHAR)
+     )
+     SELECT
+       TIME_PARSE("timestamp") AS "__time",
+       "isRobot",
+       "channel",
+       "flags",
+       "isUnpatrolled",
+       "page",
+       "diffUrl",
+       "added",
+       "comment",
+       "commentLength",
+       "isNew",
+       "isMinor",
+       "delta",
+       "isAnonymous",
+       "user",
+       "deltaBucket",
+       "deleted",
+       "namespace",
+       "cityName",
+       "countryName",
+       "regionIsoCode",
+       "metroCode",
+       "countryIsoCode",
+       "regionName"
+     FROM "ext"
+     PARTITIONED BY DAY
+     ```
+
+### Query to export data
+
+Open a new tab and run the following query to export query results to the path:
+`/tmp/druid/wiki_example`.
+The path must be a subdirectory of the `druid.export.storage.baseDir`.
+
+
+```sql
+INSERT INTO
+  EXTERN(
+    local(exportPath => '/tmp/druid/wiki_example')
+  )
+AS CSV
+SELECT "channel",
+  SUM("delta") AS "changes"
+FROM "wikipedia"
+GROUP BY 1
+LIMIT 10
+```
+
+Druid exports the results of the query to the `/tmp/druid/wiki_example` 
directory.
+Run the following command to list the contents of the directory:
+
+```bash
+ls /tmp/druid/wiki_example
+```
+
+The output lists a CSV file containing the exported data and a directory.
+
+## Export query results to cloud storage
+
+The steps to export to cloud storage are similar to exporting to the local 
file system.
+Druid supports Amazon S3 and Google Cloud Storage (GCS) as cloud storage 
destinations.
+
+1. Enable the extension for your cloud storage destination. See [Loading core 
extensions](../configuration/extensions.md#loading-core-extensions).
+   - **Amazon S3**: `druid-s3-extensions`
+   - **GCS**: `google-extensions`
+  See [Loading core 
extensions](../configuration/extensions.md#loading-core-extensions) for more 
information.
+1. Configure the additional properties for your cloud storage destination. 
Replace `{CLOUD}` with `s3` or `google` accordingly:

Review Comment:
   The numbering seems off here and in other places in the file.
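
   One possible reading, offered as a sketch and not a confirmed diagnosis: the second `See [Loading core extensions]...` line under the first cloud storage step is indented two spaces instead of three, so it doesn't line up with the step's text, and it repeats a link the step already contains. Assuming that is the cause, the step could be laid out like this, with the duplicate sentence dropped:

   ```markdown
   1. Enable the extension for your cloud storage destination. See [Loading core extensions](../configuration/extensions.md#loading-core-extensions).
      - **Amazon S3**: `druid-s3-extensions`
      - **GCS**: `google-extensions`
   1. Configure the additional properties for your cloud storage destination. Replace `{CLOUD}` with `s3` or `google` accordingly:
   ```

   Content nested under a numbered step is indented to the column where the step's text starts, which keeps it inside that list item.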



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

