This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/drill-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 414541d Automatic Site Publish by Buildbot
414541d is described below
commit 414541df979c239ff7be43cb28aae645547bff00
Author: buildbot <[email protected]>
AuthorDate: Mon Jan 3 08:12:03 2022 +0000
Automatic Site Publish by Buildbot
---
.../index.html | 3 +-
.../index.html | 68 +++++++++-------------
output/feed.xml | 4 +-
.../index.html | 3 +-
.../index.html | 68 +++++++++-------------
output/zh/feed.xml | 4 +-
6 files changed, 60 insertions(+), 90 deletions(-)
diff --git a/output/docs/configuring-hashicorp-vault-authentication/index.html
b/output/docs/configuring-hashicorp-vault-authentication/index.html
index f3dabaa..4ab9665 100644
--- a/output/docs/configuring-hashicorp-vault-authentication/index.html
+++ b/output/docs/configuring-hashicorp-vault-authentication/index.html
@@ -1470,7 +1470,7 @@
</tbody>
</table>
-<p>To enable Drill’s Vault authenticator, add the following configuration
based on the example below to the <code class="language-plaintext
highlighter-rouge">drill.exec</code> block in the <code
class="language-plaintext
highlighter-rouge"><DRILL_HOME>/conf/drill-override.conf</code> file and
restart every Drillbit.</p>
+<p>Note that in the current implementation, Drill does not preserve the access
token returned by Vault after a successful authentication. It merely uses the
success or failure status returned by Vault to decide whether user gets logged
in. To enable Drill’s Vault authenticator, add the following configuration
based on the example below to the <code class="language-plaintext
highlighter-rouge">drill.exec</code> block in the <code
class="language-plaintext highlighter-rouge"><DRILL_HO [...]
<div class="language-hocon highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="nl">drill.exec</span><span
class="p">:</span><span class="w"> </span><span class="p">{</span><span
class="w">
</span><span class="nl">cluster-id</span><span class="p">:</span><span
class="w"> </span><span class="s2">"drillbits1"</span><span
class="p">,</span><span class="w">
@@ -1487,7 +1487,6 @@
</span><span class="l">packages</span><span class="w"> </span><span
class="err">+</span><span class="p">=</span><span class="w"> </span><span
class="s2">"org.apache.drill.exec.rpc.user.security"</span><span
class="p">,</span><span class="w">
</span><span class="nl">impl</span><span class="p">:</span><span
class="w"> </span><span class="s2">"vault"</span><span class="p">,</span><span
class="w">
</span><span class="nl">vault.address</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"http://localhost:8200"</span><span class="p">,</span><span
class="w">
- </span><span class="nl">vault.token</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"drill_vault_token_123"</span><span class="p">,</span><span
class="w">
</span><span class="nl">vault.method</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"USER_PASS"</span><span class="w"> </span><span class="c1">#
supported values: APP_ROLE, LDAP, USER_PASS, VAULT_TOKEN</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
diff --git
a/output/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
b/output/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
index 91c122b..abf4f44 100644
---
a/output/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
+++
b/output/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
@@ -1480,11 +1480,11 @@ Drill uses the Hash-Join operator to join data. Drill
1.15 introduces semi-join
<ul>
<li>
- <p><strong>drill.exec.spill.fs</strong><br />
+ <p><strong>drill.exec.spill.fs</strong>
Introduced in Drill 1.11. The default file system on the local machine into
which the spillable operators spill data. You can configure this option so that
data spills into a distributed file system, such as hdfs. For example,
“hdfs:///”. The default setting is “file:///”.</p>
</li>
<li>
- <p><strong>drill.exec.spill.directories</strong><br />
+ <p><strong>drill.exec.spill.directories</strong>
Introduced in Drill 1.11. The list of directories into which the spillable
operators spill data. The list must be an array with directories separated by a
comma, for example [“/fs1/drill/spill” , “/fs2/drill/spill” ,
“/fs3/drill/spill”]. The default setting is [“/tmp/drill/spill”].</p>
</li>
</ul>
@@ -1507,83 +1507,69 @@ Introduced in Drill 1.11. The list of directories into
which the spillable opera
<ul>
<li>
- <p><strong>planner.memory.max_query_memory_per_node</strong><br />
+ <p><strong>planner.memory.max_query_memory_per_node</strong>
The <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> option is
the minimum amount of memory available to Drill per query on a node. The
default of 2 GB typically allows between two and three concurrent queries to
run when the JVM is configured to use 8 GB of direct memory (default). When the
memory requirement for Drill increases, the default of 2 GB is constraining.
You must increase the amount of memory for queries to complete, unless t [...]
</li>
<li>
- <p><strong>planner.memory.percent_per_query</strong><br />
+ <p><strong>planner.memory.percent_per_query</strong>
Alternatively, the <code class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> option sets the
memory as a percentage of the total direct memory. The default is 5%. This
value is only used when throttling is disabled. Setting the value to 0 disables
the option. You can increase or decrease the value; however, you should set the
percentage well below the JVM direct memory to account for the cases where
Drill does not manage memory, such as for the less memor [...]
- <div class="language-plaintext highlighter-rouge"><div
class="highlight"><pre class="highlight"><code> - The percentage is calculated
using the following formula:
+ <div class="language-plaintext highlighter-rouge"><div
class="highlight"><pre class="highlight"><code> - The percentage is calculated
using the following formula:
- (1 - non-managed allowance)/concurrency
+ (1 - non-managed allowance)/concurrency
- - The non-managed allowance is an assumed amount of system memory that
non-managed operators will use. Non-managed operators do not spill to disk. The
conservative assumption for the non-managed allowance is 50% of the total
system memory. Concurrency is the number of concurrent queries that may run.
The default assumption is 10 concurrent queries.
+ - The non-managed allowance is an assumed amount of system memory that
non-managed operators will use. Non-managed operators do not spill to disk. The
conservative assumption for the non-managed allowance is 50% of the total
system memory. Concurrency is the number of concurrent queries that may run.
The default assumption is 10 concurrent queries.
- - Based on the default assumptions, the default value of 5% is calculated, as
shown:
+ - Based on the default assumptions, the default value of 5% is calculated, as
shown:
- (1 - .50)/10 = 0.05
+ (1 - .50)/10 = 0.05
</code></pre></div> </div>
</li>
</ul>
<p><strong>Increasing the Available Memory</strong></p>
-<table>
- <tbody>
- <tr>
- <td>You can increase the amount of available memory to Drill using the
ALTER SYSTEM</td>
- <td>SESSION SET commands with the <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> or <code
class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> options, as
shown:</td>
- </tr>
- </tbody>
-</table>
-
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code> ALTER SYSTEM|SESSION SET
`planner.memory.max_query_memory_per_node` = <new_value>
- //The default value is to 2147483648 bytes (2GB).
-
- ALTER SYSTEM|SESSION SET `planner.memory.percent_per_query` =
<new_value>
- //The default value is 0.05.
+<p>You can increase the amount of available memory to Drill using the ALTER
SYSTEM/SESSION SET commands with the <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> or <code
class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> options, as
shown:</p>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code> ALTER SYSTEM/SESSION SET
`planner.memory.max_query_memory_per_node` = <new_value>
+ //The default value is to 2147483648 bytes (2GB).
+
+ ALTER SYSTEM/SESSION SET `planner.memory.percent_per_query` =
<new_value>
+ //The default value is 0.05.
</code></pre></div></div>
<h2 id="disabling-the-hash-operators">Disabling the Hash Operators</h2>
<p>You can disable the Hash Aggregate and Hash Join operators. When you
disable these operators, Drill creates alternative query plans that use the
Sort operator and the Streaming Aggregate or the Merge Join operator.</p>
-<table>
- <tbody>
- <tr>
- <td>Use the ALTER SYSTEM</td>
- <td>SESSION SET commands with the following options to disable the Hash
Aggregate and Hash Join operators. Typically, you set the options at the
session level unless you want the setting to persist across all sessions.</td>
- </tr>
- </tbody>
-</table>
+<p>Use the ALTER SYSTEM/SESSION SET commands with the following options to
disable the Hash Aggregate and Hash Join operators. Typically, you set the
options at the session level unless you want the setting to persist across all
sessions.</p>
<p>The following options control the hash-based operators:</p>
<ul>
<li>
- <p><strong>planner.enable_hashagg</strong><br />
+ <p><strong>planner.enable_hashagg</strong>
Enables or disables hash aggregation; otherwise, Drill does a sort-based
aggregation. This option is enabled by default. The default, and recommended,
setting is true. Prior to Drill 1.11, the Hash Aggregate operator used an
uncontrolled amount of memory (up to 10 GB), after which the operator ran out
of memory. As of Drill 1.11, the Hash Aggregate operator can spill to disk.</p>
</li>
<li>
- <p><strong>planner.enable_hashjoin</strong><br />
+ <p><strong>planner.enable_hashjoin</strong>
Enables or disables hash joins. This option is enabled by default. Drill
assumes that a query will have adequate memory to complete and tries to use the
fastest operations possible. Prior to Drill 1.14, the Hash-Join operator used
an uncontrolled amount of memory (up to 10 GB), after which the operator ran
out of memory. As of Drill 1.14, this operator can spill to disk. This option
is enabled by default.</p>
</li>
<li>
- <p><strong>planner.enable_semijoin</strong><br />
+ <p><strong>planner.enable_semijoin</strong>
Enables or disables semi-join functionality inside the Hash Join. This option
is enabled by default. When enabled, a semi-join flag inside the HashJoin flag
is set to true, and Drill uses a semi-join to remove the distinct processing
below the Hash Join. When disabled, Drill can still perform semi-joins, but the
semi-joins are performed outside of the Hash Join, as shown in the following
example:</p>
</li>
</ul>
<h3 id="example-query-plan-with-and-without-semi-join">Example: Query Plan
with and without Semi-Join</h3>
-<p><strong>Semi-Join Disabled</strong> <br />
+<p><strong>Semi-Join Disabled</strong>
In the following query plan, you can see the HashAgg before the HashJoin. In
the HashJoin flag, you can see that semi-join flag is set to false, indicating
that a semi-join was not used.</p>
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
-| text
|
+| text
|
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| 00-00 Screen
00-01 Project(employee_id=[$0], full_name=[$1])
@@ -1593,16 +1579,16 @@ In the following query plan, you can see the HashAgg
before the HashJoin. In the
00-04 Project(employee_id0=[$0])
00-06 HashAgg(group=[{0}])
00-07 Scan(table=[[cp, employee.json]],
groupscan=[EasyGroupScan [selectionRoot=classpath:/employee.json, numFiles=1,
columns=[`employee_id`], files=[classpath:/employee.json]]])
-planner.enable_semijoin
+planner.enable_semijoin
</code></pre></div></div>
-<p><strong>Semi-Join Enabled</strong> <br />
+<p><strong>Semi-Join Enabled</strong>
In the following query plan, you can see that the HashAgg is absent. In the
HashJoin flag, you can see that semi-join flag is set to true, indicating that
a semi-join was used. Using the semi-join optimizes the query by reducing the
amount of processing that Drill must perform on data.</p>
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
-| text
|
+| text
|
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| 00-00 Screen
00-01 Project(employee_id=[$0], full_name=[$1])
diff --git a/output/feed.xml b/output/feed.xml
index 1346d13..3dec316 100644
--- a/output/feed.xml
+++ b/output/feed.xml
@@ -6,8 +6,8 @@
</description>
<link>/</link>
<atom:link href="/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Wed, 22 Dec 2021 13:13:48 +0000</pubDate>
- <lastBuildDate>Wed, 22 Dec 2021 13:13:48 +0000</lastBuildDate>
+ <pubDate>Mon, 03 Jan 2022 08:09:35 +0000</pubDate>
+ <lastBuildDate>Mon, 03 Jan 2022 08:09:35 +0000</lastBuildDate>
<generator>Jekyll v3.9.1</generator>
<item>
diff --git
a/output/zh/docs/configuring-hashicorp-vault-authentication/index.html
b/output/zh/docs/configuring-hashicorp-vault-authentication/index.html
index 9f9a7a3..196ac98 100644
--- a/output/zh/docs/configuring-hashicorp-vault-authentication/index.html
+++ b/output/zh/docs/configuring-hashicorp-vault-authentication/index.html
@@ -1470,7 +1470,7 @@
</tbody>
</table>
-<p>To enable Drill’s Vault authenticator, add the following configuration
based on the example below to the <code class="language-plaintext
highlighter-rouge">drill.exec</code> block in the <code
class="language-plaintext
highlighter-rouge"><DRILL_HOME>/conf/drill-override.conf</code> file and
restart every Drillbit.</p>
+<p>Note that in the current implementation, Drill does not preserve the access
token returned by Vault after a successful authentication. It merely uses the
success or failure status returned by Vault to decide whether user gets logged
in. To enable Drill’s Vault authenticator, add the following configuration
based on the example below to the <code class="language-plaintext
highlighter-rouge">drill.exec</code> block in the <code
class="language-plaintext highlighter-rouge"><DRILL_HO [...]
<div class="language-hocon highlighter-rouge"><div class="highlight"><pre
class="highlight"><code><span class="nl">drill.exec</span><span
class="p">:</span><span class="w"> </span><span class="p">{</span><span
class="w">
</span><span class="nl">cluster-id</span><span class="p">:</span><span
class="w"> </span><span class="s2">"drillbits1"</span><span
class="p">,</span><span class="w">
@@ -1487,7 +1487,6 @@
</span><span class="l">packages</span><span class="w"> </span><span
class="err">+</span><span class="p">=</span><span class="w"> </span><span
class="s2">"org.apache.drill.exec.rpc.user.security"</span><span
class="p">,</span><span class="w">
</span><span class="nl">impl</span><span class="p">:</span><span
class="w"> </span><span class="s2">"vault"</span><span class="p">,</span><span
class="w">
</span><span class="nl">vault.address</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"http://localhost:8200"</span><span class="p">,</span><span
class="w">
- </span><span class="nl">vault.token</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"drill_vault_token_123"</span><span class="p">,</span><span
class="w">
</span><span class="nl">vault.method</span><span
class="p">:</span><span class="w"> </span><span
class="s2">"USER_PASS"</span><span class="w"> </span><span class="c1">#
supported values: APP_ROLE, LDAP, USER_PASS, VAULT_TOKEN</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
diff --git
a/output/zh/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
b/output/zh/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
index a851a61..930759f 100644
---
a/output/zh/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
+++
b/output/zh/docs/sort-based-and-hash-based-memory-constrained-operators/index.html
@@ -1480,11 +1480,11 @@ Drill uses the Hash-Join operator to join data. Drill
1.15 introduces semi-join
<ul>
<li>
- <p><strong>drill.exec.spill.fs</strong><br />
+ <p><strong>drill.exec.spill.fs</strong>
Introduced in Drill 1.11. The default file system on the local machine into
which the spillable operators spill data. You can configure this option so that
data spills into a distributed file system, such as hdfs. For example,
“hdfs:///”. The default setting is “file:///”.</p>
</li>
<li>
- <p><strong>drill.exec.spill.directories</strong><br />
+ <p><strong>drill.exec.spill.directories</strong>
Introduced in Drill 1.11. The list of directories into which the spillable
operators spill data. The list must be an array with directories separated by a
comma, for example [“/fs1/drill/spill” , “/fs2/drill/spill” ,
“/fs3/drill/spill”]. The default setting is [“/tmp/drill/spill”].</p>
</li>
</ul>
@@ -1507,83 +1507,69 @@ Introduced in Drill 1.11. The list of directories into
which the spillable opera
<ul>
<li>
- <p><strong>planner.memory.max_query_memory_per_node</strong><br />
+ <p><strong>planner.memory.max_query_memory_per_node</strong>
The <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> option is
the minimum amount of memory available to Drill per query on a node. The
default of 2 GB typically allows between two and three concurrent queries to
run when the JVM is configured to use 8 GB of direct memory (default). When the
memory requirement for Drill increases, the default of 2 GB is constraining.
You must increase the amount of memory for queries to complete, unless t [...]
</li>
<li>
- <p><strong>planner.memory.percent_per_query</strong><br />
+ <p><strong>planner.memory.percent_per_query</strong>
Alternatively, the <code class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> option sets the
memory as a percentage of the total direct memory. The default is 5%. This
value is only used when throttling is disabled. Setting the value to 0 disables
the option. You can increase or decrease the value; however, you should set the
percentage well below the JVM direct memory to account for the cases where
Drill does not manage memory, such as for the less memor [...]
- <div class="language-plaintext highlighter-rouge"><div
class="highlight"><pre class="highlight"><code> - The percentage is calculated
using the following formula:
+ <div class="language-plaintext highlighter-rouge"><div
class="highlight"><pre class="highlight"><code> - The percentage is calculated
using the following formula:
- (1 - non-managed allowance)/concurrency
+ (1 - non-managed allowance)/concurrency
- - The non-managed allowance is an assumed amount of system memory that
non-managed operators will use. Non-managed operators do not spill to disk. The
conservative assumption for the non-managed allowance is 50% of the total
system memory. Concurrency is the number of concurrent queries that may run.
The default assumption is 10 concurrent queries.
+ - The non-managed allowance is an assumed amount of system memory that
non-managed operators will use. Non-managed operators do not spill to disk. The
conservative assumption for the non-managed allowance is 50% of the total
system memory. Concurrency is the number of concurrent queries that may run.
The default assumption is 10 concurrent queries.
- - Based on the default assumptions, the default value of 5% is calculated, as
shown:
+ - Based on the default assumptions, the default value of 5% is calculated, as
shown:
- (1 - .50)/10 = 0.05
+ (1 - .50)/10 = 0.05
</code></pre></div> </div>
</li>
</ul>
<p><strong>Increasing the Available Memory</strong></p>
-<table>
- <tbody>
- <tr>
- <td>You can increase the amount of available memory to Drill using the
ALTER SYSTEM</td>
- <td>SESSION SET commands with the <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> or <code
class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> options, as
shown:</td>
- </tr>
- </tbody>
-</table>
-
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code> ALTER SYSTEM|SESSION SET
`planner.memory.max_query_memory_per_node` = <new_value>
- //The default value is to 2147483648 bytes (2GB).
-
- ALTER SYSTEM|SESSION SET `planner.memory.percent_per_query` =
<new_value>
- //The default value is 0.05.
+<p>You can increase the amount of available memory to Drill using the ALTER
SYSTEM/SESSION SET commands with the <code class="language-plaintext
highlighter-rouge">planner.memory.max_query_memory_per_node</code> or <code
class="language-plaintext
highlighter-rouge">planner.memory.percent_per_query</code> options, as
shown:</p>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code> ALTER SYSTEM/SESSION SET
`planner.memory.max_query_memory_per_node` = <new_value>
+ //The default value is to 2147483648 bytes (2GB).
+
+ ALTER SYSTEM/SESSION SET `planner.memory.percent_per_query` =
<new_value>
+ //The default value is 0.05.
</code></pre></div></div>
<h2 id="disabling-the-hash-operators">Disabling the Hash Operators</h2>
<p>You can disable the Hash Aggregate and Hash Join operators. When you
disable these operators, Drill creates alternative query plans that use the
Sort operator and the Streaming Aggregate or the Merge Join operator.</p>
-<table>
- <tbody>
- <tr>
- <td>Use the ALTER SYSTEM</td>
- <td>SESSION SET commands with the following options to disable the Hash
Aggregate and Hash Join operators. Typically, you set the options at the
session level unless you want the setting to persist across all sessions.</td>
- </tr>
- </tbody>
-</table>
+<p>Use the ALTER SYSTEM/SESSION SET commands with the following options to
disable the Hash Aggregate and Hash Join operators. Typically, you set the
options at the session level unless you want the setting to persist across all
sessions.</p>
<p>The following options control the hash-based operators:</p>
<ul>
<li>
- <p><strong>planner.enable_hashagg</strong><br />
+ <p><strong>planner.enable_hashagg</strong>
Enables or disables hash aggregation; otherwise, Drill does a sort-based
aggregation. This option is enabled by default. The default, and recommended,
setting is true. Prior to Drill 1.11, the Hash Aggregate operator used an
uncontrolled amount of memory (up to 10 GB), after which the operator ran out
of memory. As of Drill 1.11, the Hash Aggregate operator can spill to disk.</p>
</li>
<li>
- <p><strong>planner.enable_hashjoin</strong><br />
+ <p><strong>planner.enable_hashjoin</strong>
Enables or disables hash joins. This option is enabled by default. Drill
assumes that a query will have adequate memory to complete and tries to use the
fastest operations possible. Prior to Drill 1.14, the Hash-Join operator used
an uncontrolled amount of memory (up to 10 GB), after which the operator ran
out of memory. As of Drill 1.14, this operator can spill to disk. This option
is enabled by default.</p>
</li>
<li>
- <p><strong>planner.enable_semijoin</strong><br />
+ <p><strong>planner.enable_semijoin</strong>
Enables or disables semi-join functionality inside the Hash Join. This option
is enabled by default. When enabled, a semi-join flag inside the HashJoin flag
is set to true, and Drill uses a semi-join to remove the distinct processing
below the Hash Join. When disabled, Drill can still perform semi-joins, but the
semi-joins are performed outside of the Hash Join, as shown in the following
example:</p>
</li>
</ul>
<h3 id="example-query-plan-with-and-without-semi-join">Example: Query Plan
with and without Semi-Join</h3>
-<p><strong>Semi-Join Disabled</strong> <br />
+<p><strong>Semi-Join Disabled</strong>
In the following query plan, you can see the HashAgg before the HashJoin. In
the HashJoin flag, you can see that semi-join flag is set to false, indicating
that a semi-join was not used.</p>
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
-| text
|
+| text
|
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| 00-00 Screen
00-01 Project(employee_id=[$0], full_name=[$1])
@@ -1593,16 +1579,16 @@ In the following query plan, you can see the HashAgg
before the HashJoin. In the
00-04 Project(employee_id0=[$0])
00-06 HashAgg(group=[{0}])
00-07 Scan(table=[[cp, employee.json]],
groupscan=[EasyGroupScan [selectionRoot=classpath:/employee.json, numFiles=1,
columns=[`employee_id`], files=[classpath:/employee.json]]])
-planner.enable_semijoin
+planner.enable_semijoin
</code></pre></div></div>
-<p><strong>Semi-Join Enabled</strong> <br />
+<p><strong>Semi-Join Enabled</strong>
In the following query plan, you can see that the HashAgg is absent. In the
HashJoin flag, you can see that semi-join flag is set to true, indicating that
a semi-join was used. Using the semi-join optimizes the query by reducing the
amount of processing that Drill must perform on data.</p>
-<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre
class="highlight"><code>EXPLAIN PLAN FOR SELECT employee_id, full_name FROM
cp.`employee.json` WHERE employee_id IN (SELECT employee_id FROM
cp.`employee.json`);
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
-| text
|
+| text
|
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| 00-00 Screen
00-01 Project(employee_id=[$0], full_name=[$1])
diff --git a/output/zh/feed.xml b/output/zh/feed.xml
index da33398..168c65e 100644
--- a/output/zh/feed.xml
+++ b/output/zh/feed.xml
@@ -6,8 +6,8 @@
</description>
<link>/</link>
<atom:link href="/zh/feed.xml" rel="self" type="application/rss+xml"/>
- <pubDate>Wed, 22 Dec 2021 13:13:48 +0000</pubDate>
- <lastBuildDate>Wed, 22 Dec 2021 13:13:48 +0000</lastBuildDate>
+ <pubDate>Mon, 03 Jan 2022 08:09:35 +0000</pubDate>
+ <lastBuildDate>Mon, 03 Jan 2022 08:09:35 +0000</lastBuildDate>
<generator>Jekyll v3.9.1</generator>
<item>