This is an automated email from the ASF dual-hosted git repository.
blue pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 86dd3aa Deployed 349e8e304 with MkDocs version: 1.0.4
86dd3aa is described below
commit 86dd3aab5a1c1e03df375f0b277a5ed297cb31c5
Author: Ryan Blue <[email protected]>
AuthorDate: Tue Jul 14 16:00:54 2020 -0800
Deployed 349e8e304 with MkDocs version: 1.0.4
---
configuration/index.html | 48 +++++++++++++++++++++++++++++++++++++++++++----
index.html | 2 +-
sitemap.xml.gz | Bin 225 -> 225 bytes
spark/index.html | 2 +-
4 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/configuration/index.html b/configuration/index.html
index 58aa1fb..69fa53f 100644
--- a/configuration/index.html
+++ b/configuration/index.html
@@ -346,10 +346,11 @@
<li class="third-level"><a href="#write-properties">Write properties</a></li>
<li class="third-level"><a href="#table-behavior-properties">Table behavior properties</a></li>
<li class="third-level"><a href="#compatibility-flags">Compatibility flags</a></li>
-<li class="second-level"><a href="#hadoop-options">Hadoop options</a></li>
+<li class="second-level"><a href="#hadoop-configuration">Hadoop configuration</a></li>
-<li class="second-level"><a href="#spark-options">Spark options</a></li>
+<li class="second-level"><a href="#spark-configuration">Spark configuration</a></li>
+<li class="third-level"><a href="#catalogs">Catalogs</a></li>
<li class="third-level"><a href="#read-options">Read options</a></li>
<li class="third-level"><a href="#write-options">Write options</a></li>
</ul>
@@ -554,7 +555,8 @@
</tr>
</tbody>
</table>
-<h2 id="hadoop-options">Hadoop options<a class="headerlink" href="#hadoop-options" title="Permanent link">¶</a></h2>
+<h2 id="hadoop-configuration">Hadoop configuration<a class="headerlink" href="#hadoop-configuration" title="Permanent link">¶</a></h2>
+<p>The following properties from the Hadoop configuration are used by the Hive Metastore connector.</p>
<table>
<thead>
<tr>
@@ -576,7 +578,45 @@
</tr>
</tbody>
</table>
-<h2 id="spark-options">Spark options<a class="headerlink" href="#spark-options" title="Permanent link">¶</a></h2>
+<h2 id="spark-configuration">Spark configuration<a class="headerlink" href="#spark-configuration" title="Permanent link">¶</a></h2>
+<h3 id="catalogs">Catalogs<a class="headerlink" href="#catalogs" title="Permanent link">¶</a></h3>
+<p><a href="../spark#configuring-catalogs">Spark catalogs</a> are configured using Spark session properties.</p>
+<p>A catalog is created and named by adding a property <code>spark.sql.catalog.(catalog-name)</code> with an implementation class for its value.</p>
+<p>Iceberg supplies two implementations:
+* <code>org.apache.iceberg.spark.SparkCatalog</code> supports a Hive Metastore or a Hadoop warehouse as a catalog
+* <code>org.apache.iceberg.spark.SparkSessionCatalog</code> adds support for Iceberg tables to Spark’s built-in catalog, and delegates to the built-in catalog for non-Iceberg tables</p>
+<p>Both catalogs are configured using properties nested under the catalog name:</p>
+<table>
+<thead>
+<tr>
+<th>Property</th>
+<th>Values</th>
+<th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>spark.sql.catalog.<em>catalog-name</em>.type</td>
+<td>hive or hadoop</td>
+<td>The underlying Iceberg catalog implementation</td>
+</tr>
+<tr>
+<td>spark.sql.catalog.<em>catalog-name</em>.default-namespace</td>
+<td>default</td>
+<td>The default current namespace for the catalog</td>
+</tr>
+<tr>
+<td>spark.sql.catalog.<em>catalog-name</em>.uri</td>
+<td>thrift://host:port</td>
+<td>URI for the Hive Metastore; default from <code>hive-site.xml</code> (Hive only)</td>
+</tr>
+<tr>
+<td>spark.sql.catalog.<em>catalog-name</em>.warehouse</td>
+<td>hdfs://nn:8020/warehouse/path</td>
+<td>Base path for the warehouse directory (Hadoop only)</td>
+</tr>
+</tbody>
+</table>
<h3 id="read-options">Read options<a class="headerlink" href="#read-options" title="Permanent link">¶</a></h3>
<p>Spark read options are passed when configuring the DataFrameReader, like this:</p>
<pre><code class="scala">// time travel
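
The catalog properties added in the hunk above can be combined into a short sketch of a working setup. This is illustrative only: the catalog name <code>hive_prod</code>, the metastore host and port, and the use of <code>spark_catalog</code> as the built-in session catalog's name are assumptions, not values taken from this commit.

```plain
# Hive-backed Iceberg catalog (host/port are placeholders)
spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.hive_prod.type = hive
spark.sql.catalog.hive_prod.uri = thrift://metastore-host:9083
spark.sql.catalog.hive_prod.default-namespace = default

# Add Iceberg support to Spark's built-in catalog,
# delegating to it for non-Iceberg tables
spark.sql.catalog.spark_catalog = org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.spark_catalog.type = hive
```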
diff --git a/index.html b/index.html
index 44fe712..4031557 100644
--- a/index.html
+++ b/index.html
@@ -460,5 +460,5 @@
<!--
MkDocs version : 1.0.4
-Build Date UTC : 2020-07-14 23:39:15
+Build Date UTC : 2020-07-15 00:00:54
-->
diff --git a/sitemap.xml.gz b/sitemap.xml.gz
index d91c841..f19cdb3 100644
Binary files a/sitemap.xml.gz and b/sitemap.xml.gz differ
diff --git a/spark/index.html b/spark/index.html
index 965d9f5..39ba204 100644
--- a/spark/index.html
+++ b/spark/index.html
@@ -494,7 +494,7 @@
</tbody>
</table>
<h2 id="configuring-catalogs">Configuring catalogs<a class="headerlink" href="#configuring-catalogs" title="Permanent link">¶</a></h2>
-<p>Spark 3.0 adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting Spark properties under <code>spark.sql.catalog</code>.</p>
+<p>Spark 3.0 adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting <a href="../configuration#catalogs">Spark properties</a> under <code>spark.sql.catalog</code>.</p>
<p>This creates an Iceberg catalog named <code>hive_prod</code> that loads tables from a Hive metastore:</p>
<pre><code class="plain">spark.sql.catalog.hive_prod = org.apache.iceberg.spark.SparkCatalog
</code></pre>
spark.sql.catalog.hive_prod.type = hive
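
The <code>hive_prod</code> example above pairs <code>SparkCatalog</code> with <code>type = hive</code>; following the same pattern, a Hadoop-warehouse catalog sets <code>type = hadoop</code> plus a warehouse location. A sketch, echoing the property table in configuration/index.html (the <code>hadoop_prod</code> name and the HDFS path are illustrative):

```plain
# Hadoop-warehouse-backed Iceberg catalog (path is a placeholder)
spark.sql.catalog.hadoop_prod = org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.hadoop_prod.type = hadoop
spark.sql.catalog.hadoop_prod.warehouse = hdfs://nn:8020/warehouse/path
```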