This is an automated email from the ASF dual-hosted git repository.
blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/master by this push:
new 39348fe Update the site for the 0.9.0 release (#1205)
39348fe is described below
commit 39348fec94d08a9eb4f3da0b41e002adc8b832d9
Author: Ryan Blue <[email protected]>
AuthorDate: Wed Jul 15 11:58:45 2020 -0700
Update the site for the 0.9.0 release (#1205)
---
site/docs/api-quickstart.md | 176 -------------------------------------------
site/docs/configuration.md | 42 ++++++++---
site/docs/css/extra.css | 7 +-
site/docs/getting-started.md | 102 +++++++++++++++----------
site/docs/javadoc/index.html | 4 +-
site/docs/releases.md | 24 ++++--
site/docs/spark.md | 23 +++++-
site/mkdocs.yml | 7 +-
8 files changed, 140 insertions(+), 245 deletions(-)
diff --git a/site/docs/api-quickstart.md b/site/docs/api-quickstart.md
deleted file mode 100644
index 1926f4a..0000000
--- a/site/docs/api-quickstart.md
+++ /dev/null
@@ -1,176 +0,0 @@
-<!--
- - Licensed to the Apache Software Foundation (ASF) under one or more
- - contributor license agreements. See the NOTICE file distributed with
- - this work for additional information regarding copyright ownership.
- - The ASF licenses this file to You under the Apache License, Version 2.0
- - (the "License"); you may not use this file except in compliance with
- - the License. You may obtain a copy of the License at
- -
- - http://www.apache.org/licenses/LICENSE-2.0
- -
- - Unless required by applicable law or agreed to in writing, software
- - distributed under the License is distributed on an "AS IS" BASIS,
- - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- - See the License for the specific language governing permissions and
- - limitations under the License.
- -->
-
-# Spark API Quickstart
-
-## Create a table
-
-Tables are created using either a [`Catalog`](/javadoc/master/index.html?org/apache/iceberg/catalog/Catalog.html) or an implementation of the [`Tables`](/javadoc/master/index.html?org/apache/iceberg/Tables.html) interface.
-
-### Using a Hive catalog
-
-The Hive catalog connects to a Hive MetaStore to keep track of Iceberg tables. This example uses Spark's Hadoop configuration to get a Hive catalog:
-
-```scala
-import org.apache.iceberg.hive.HiveCatalog
-
-val catalog = new HiveCatalog(spark.sessionState.newHadoopConf())
-```
-
-The `Catalog` interface defines methods for working with tables, like `createTable`, `loadTable`, `renameTable`, and `dropTable`.
-
-To create a table, pass an `Identifier` and a `Schema` along with other initial metadata:
-
-```scala
-val name = TableIdentifier.of("logging", "logs")
-val table = catalog.createTable(name, schema, spec)
-
-// write into the new logs table with Spark 2.4
-logsDF.write
- .format("iceberg")
- .mode("append")
- .save("logging.logs")
-```
-
-The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
-
-### Using a Hadoop catalog
-
-A Hadoop catalog doesn't need to connect to a Hive MetaStore, but can only be used with HDFS or similar file systems that support atomic rename. Concurrent writes with a Hadoop catalog are not safe with a local FS or S3. To create a Hadoop catalog:
-
-```scala
-import org.apache.hadoop.conf.Configuration;
-import org.apache.iceberg.hadoop.HadoopCatalog;
-
-val conf = new Configuration();
-val warehousePath = "hdfs://host:8020/warehouse_path";
-val catalog = new HadoopCatalog(conf, warehousePath);
-```
-
-Like the Hive catalog, `HadoopCatalog` implements `Catalog`, so it also has methods for working with tables, like `createTable`, `loadTable`, and `dropTable`.
-
-This example creates a table with the Hadoop catalog:
-
-```scala
-val name = TableIdentifier.of("logging", "logs")
-val table = catalog.createTable(name, schema, spec)
-
-// write into the new logs table with Spark 2.4
-logsDF.write
- .format("iceberg")
- .mode("append")
- .save("hdfs://host:8020/warehouse_path/logging.db/logs")
-```
-
-The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
-
-### Using Hadoop tables
-
-Iceberg also supports tables that are stored in a directory in HDFS. Concurrent writes with Hadoop tables are not safe when stored in the local FS or S3. Directory tables don't support all catalog operations, like rename, so they use the `Tables` interface instead of `Catalog`.
-
-To create a table in HDFS, use `HadoopTables`:
-
-```scala
-import org.apache.iceberg.hadoop.HadoopTables
-
-val tables = new HadoopTables(spark.sessionState.newHadoopConf())
-
-val table = tables.create(schema, spec, "hdfs:/tables/logging/logs")
-
-// write into the new logs table with Spark 2.4
-logsDF.write
- .format("iceberg")
- .mode("append")
- .save("hdfs:/tables/logging/logs")
-```
-
-!!! Warning
-    Hadoop tables shouldn't be used with file systems that do not support atomic rename. Iceberg relies on rename to synchronize concurrent commits for directory tables.
-
-### Tables in Spark
-
-Spark uses both `HiveCatalog` and `HadoopTables` to load tables. Hive is used when the identifier passed to `load` or `save` is not a path, otherwise Spark assumes it is a path-based table.
-
-To read and write to tables from Spark see:
-
-* [Reading a table in Spark](../spark#reading-an-iceberg-table)
-* [Appending to a table in Spark](../spark#appending-data)
-* [Overwriting data in a table in Spark](../spark#overwriting-data)
-
-
-## Schemas
-
-### Create a schema
-
-This example creates a schema for a `logs` table:
-
-```scala
-import org.apache.iceberg.Schema
-import org.apache.iceberg.types.Types._
-
-val schema = new Schema(
- NestedField.required(1, "level", StringType.get()),
- NestedField.required(2, "event_time", TimestampType.withZone()),
- NestedField.required(3, "message", StringType.get()),
-    NestedField.optional(4, "call_stack", ListType.ofRequired(5, StringType.get()))
- )
-```
-
-When using the Iceberg API directly, type IDs are required. Conversions from other schema formats, like Spark, Avro, and Parquet, will automatically assign new IDs.
-
-When a table is created, all IDs in the schema are re-assigned to ensure uniqueness.
-
-### Convert a schema from Avro
-
-To create an Iceberg schema from an existing Avro schema, use converters in `AvroSchemaUtil`:
-
-```scala
-import org.apache.avro.Schema.Parser
-import org.apache.iceberg.avro.AvroSchemaUtil
-
-val avroSchema = new Parser().parse("""{"type": "record", ... }""")
-
-val icebergSchema = AvroSchemaUtil.toIceberg(avroSchema)
-```
-
-### Convert a schema from Spark
-
-To create an Iceberg schema from an existing table, use converters in `SparkSchemaUtil`:
-
-```scala
-import org.apache.iceberg.spark.SparkSchemaUtil
-
-val schema = SparkSchemaUtil.convert(spark.table("db.table").schema)
-```
-
-
-## Partitioning
-
-### Create a partition spec
-
-Partition specs describe how Iceberg should group records into data files. Partition specs are created for a table's schema using a builder.
-
-This example creates a partition spec for the `logs` table that partitions records by the hour of the log event's timestamp and by log level:
-
-```scala
-import org.apache.iceberg.PartitionSpec
-
-val spec = PartitionSpec.builderFor(schema)
- .hour("event_time")
- .identity("level")
- .build()
-```
diff --git a/site/docs/configuration.md b/site/docs/configuration.md
index 874b0e3..27dd60b 100644
--- a/site/docs/configuration.md
+++ b/site/docs/configuration.md
@@ -67,14 +67,36 @@ Iceberg tables support table properties to configure table behavior, like the de
| --------------------------------------------- | -------- | ------------------------------------------------------------- |
| compatibility.snapshot-id-inheritance.enabled | false    | Enables committing snapshots without explicit snapshot IDs     |
-## Hadoop options
+## Hadoop configuration
+
+The following properties from the Hadoop configuration are used by the Hive Metastore connector.

| Property                      | Default        | Description                                                  |
| ----------------------------- | -------------- | ------------------------------------------------------------ |
| iceberg.hive.client-pool-size | 5              | The size of the Hive client pool when tracking tables in HMS |
| iceberg.hive.lock-timeout-ms  | 180000 (3 min) | Maximum time in milliseconds to acquire a lock               |
-## Spark options
+## Spark configuration
+
+### Catalogs
+
+[Spark catalogs](../spark#configuring-catalogs) are configured using Spark session properties.
+
+A catalog is created and named by adding a property `spark.sql.catalog.(catalog-name)` with an implementation class for its value.
+
+Iceberg supplies two implementations:
+
+* `org.apache.iceberg.spark.SparkCatalog` supports a Hive Metastore or a Hadoop warehouse as a catalog
+* `org.apache.iceberg.spark.SparkSessionCatalog` adds support for Iceberg tables to Spark's built-in catalog, and delegates to the built-in catalog for non-Iceberg tables
+
+Both catalogs are configured using properties nested under the catalog name:
+
+| Property                                           | Values                        | Description                                                           |
+| -------------------------------------------------- | ----------------------------- | --------------------------------------------------------------------- |
+| spark.sql.catalog._catalog-name_.type              | hive or hadoop                | The underlying Iceberg catalog implementation                         |
+| spark.sql.catalog._catalog-name_.default-namespace | default                       | The default current namespace for the catalog                         |
+| spark.sql.catalog._catalog-name_.uri               | thrift://host:port            | URI for the Hive Metastore; default from `hive-site.xml` (Hive only)  |
+| spark.sql.catalog._catalog-name_.warehouse         | hdfs://nn:8020/warehouse/path | Base path for the warehouse directory (Hadoop only)                   |
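As a sketch, the properties in the table above could be set together in `spark-defaults.conf`; the catalog name `prod` and the metastore host below are illustrative assumptions, not values from this commit:

```
# Hive-backed Iceberg catalog named "prod" (name and URI are examples)
spark.sql.catalog.prod                    org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.prod.type               hive
spark.sql.catalog.prod.uri                thrift://metastore-host:9083
spark.sql.catalog.prod.default-namespace  db
```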
### Read options
@@ -83,9 +105,8 @@ Spark read options are passed when configuring the DataFrameReader, like this:
```scala
// time travel
spark.read
- .format("iceberg")
.option("snapshot-id", 10963874102873L)
- .load("db.table")
+ .table("catalog.db.table")
```
| Spark option | Default | Description |
@@ -103,14 +124,13 @@ Spark write options are passed when configuring the DataFrameWriter, like this:
```scala
// write with Avro instead of Parquet
df.write
- .format("iceberg")
.option("write-format", "avro")
- .save("db.table")
+ .insertInto("catalog.db.table")
```
-| Spark option | Default | Description |
-| ------------ | -------------------------- | ------------------------------------------------------------ |
-| write-format | Table write.format.default | File format to use for this write operation; parquet or avro |
-| target-file-size-bytes | As per table property | Overrides this table's write.target-file-size-bytes |
-| check-nullability | true | Sets the nullable check on fields |
+| Spark option           | Default                    | Description                                                   |
+| ---------------------- | -------------------------- | ------------------------------------------------------------- |
+| write-format           | Table write.format.default | File format to use for this write operation; parquet or avro  |
+| target-file-size-bytes | As per table property      | Overrides this table's write.target-file-size-bytes           |
+| check-nullability      | true                       | Sets the nullable check on fields                             |
diff --git a/site/docs/css/extra.css b/site/docs/css/extra.css
index de09c11..76545b0 100644
--- a/site/docs/css/extra.css
+++ b/site/docs/css/extra.css
@@ -41,6 +41,10 @@
opacity: 0;
}
+h2, h3, h4 {
+ padding-top: 1em;
+}
+
h2:target .headerlink {
color: #008cba;
opacity: 1;
@@ -79,7 +83,8 @@ pre {
.admonition {
margin: 0.5em;
- margin-left: 0em;
+ margin-top: 1.5em;
+ margin-left: 1em;
padding: 0.5em;
padding-left: 1em;
}
diff --git a/site/docs/getting-started.md b/site/docs/getting-started.md
index 0d15b4d..1f08521 100644
--- a/site/docs/getting-started.md
+++ b/site/docs/getting-started.md
@@ -17,82 +17,102 @@
# Getting Started
-## Using Iceberg in Spark
+## Using Iceberg in Spark 3
-The latest version of Iceberg is [0.8.0-incubating](../releases).
+The latest version of Iceberg is [0.9.0](../releases).
To use Iceberg in a Spark shell, use the `--packages` option:
```sh
-spark-shell --packages org.apache.iceberg:iceberg-spark-runtime:0.8.0-incubating
+spark-shell --packages org.apache.iceberg:iceberg-spark3-runtime:0.9.0
```
-You can also build Iceberg locally, and add the jar using `--jars`. This can be helpful to test unreleased features or while developing something new:
+!!! Note
+    If you want to include Iceberg in your Spark installation, add the [`iceberg-spark3-runtime` Jar][spark-runtime-jar] to Spark's `jars` folder.
-```sh
-./gradlew assemble
-spark-shell --jars spark-runtime/build/libs/iceberg-spark-runtime-8c05a2f.jar
-```
+[spark-runtime-jar]: https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark3-runtime/0.9.0/iceberg-spark3-runtime-0.9.0.jar
-## Installing with Spark
+### Adding catalogs
-If you want to include Iceberg in your Spark installation, add the [`iceberg-spark-runtime` Jar][spark-runtime-jar] to Spark's `jars` folder.
+Iceberg comes with [catalogs](../spark#configuring-catalogs) that enable SQL commands to manage tables and load them by name. Catalogs are configured using properties under `spark.sql.catalog.(catalog_name)`.
-Where you have to replace `8c05a2f` with the git hash that you're using.
+This command creates a path-based catalog named `local` for tables under `$PWD/warehouse` and adds support for Iceberg tables to Spark's built-in catalog:
-[spark-runtime-jar]: https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime/0.8.0-incubating/iceberg-spark-runtime-0.8.0-incubating.jar
+```sh
+spark-sql --packages org.apache.iceberg:iceberg-spark3-runtime:0.9.0 \
+  --conf spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
+ --conf spark.sql.catalog.spark_catalog.type=hive \
+ --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
+ --conf spark.sql.catalog.local.type=hadoop \
+  --conf spark.sql.catalog.local.warehouse=$PWD/warehouse
+```
-## Creating a table
+### Creating a table
-Spark 2.4 is limited to reading and writing existing Iceberg tables. Use the [Iceberg API](../api) to create Iceberg tables.
+To create your first Iceberg table in Spark, use the `spark-sql` shell or `spark.sql(...)` to run a [`CREATE TABLE`](../spark#create-table) command:
-Here's how to create your first Iceberg table in Spark, using a source Dataset
+```sql
+-- local is the path-based catalog defined above
+CREATE TABLE local.db.table (id bigint, data string) USING iceberg
+```
-First, import Iceberg classes and create a catalog client:
+Iceberg catalogs support the full range of SQL DDL commands, including:
-```scala
-import org.apache.iceberg.hive.HiveCatalog
-import org.apache.iceberg.catalog.TableIdentifier
-import org.apache.iceberg.spark.SparkSchemaUtil
+* [`CREATE TABLE ... PARTITIONED BY`](../spark#create-table)
+* [`CREATE TABLE ... AS SELECT`](../spark#create-table-as-select)
+* [`ALTER TABLE`](../spark#alter-table)
+* [`DROP TABLE`](../spark#drop-table)
-val catalog = new HiveCatalog(spark.sparkContext.hadoopConfiguration)
-```
+### Writing
-Next, create a dataset to write into your table and get an Iceberg schema for it:
+Once your table is created, insert data using [`INSERT INTO`](../spark#insert-into):
-```scala
-val data = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "data")
-val schema = SparkSchemaUtil.convert(data.schema)
+```sql
+INSERT INTO local.db.table VALUES (1, 'a'), (2, 'b'), (3, 'c');
+INSERT INTO local.db.table SELECT id, data FROM source WHERE length(data) = 1;
```
-Finally, create a table using the schema:
+Iceberg supports writing DataFrames using the new [v2 DataFrame write API](../spark#writing-with-dataframes):
```scala
-val name = TableIdentifier.of("default", "test_table")
-val table = catalog.createTable(name, schema)
+spark.table("source").select("id", "data")
+ .writeTo("local.db.table").append()
```
-### Reading and writing
+The old `write` API is supported, but _not_ recommended.
-Once your table is created, you can use it in `load` and `save` in Spark 2.4:
+### Reading
-```scala
-// write the dataset to the table
-data.write.format("iceberg").mode("append").save("default.test_table")
+To read with SQL, use the Iceberg table's name in a `SELECT` query:
-// read the table
-spark.read.format("iceberg").load("default.test_table")
+```sql
+SELECT count(1) as count, data
+FROM local.db.table
+GROUP BY data
```
-### Reading with SQL
+SQL is also the recommended way to [inspect tables](../spark#inspecting-tables). To view all of the snapshots in a table, use the `snapshots` metadata table:
+```sql
+SELECT * FROM local.db.table.snapshots
+```
+```
++-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+
+| committed_at            | snapshot_id    | parent_id | operation | manifest_list                                      | ... |
++-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+
+| 2019-02-08 03:29:51.215 | 57897183625154 | null      | append    | s3://.../table/metadata/snap-57897183625154-1.avro | ... |
+|                         |                |           |           | ...                                                | ... |
+|                         |                |           |           | ...                                                | ... |
+| ...                     | ...            | ...       | ...       | ...                                                | ... |
++-------------------------+----------------+-----------+-----------+----------------------------------------------------+-----+
+```
-You can also create a temporary view to use the table in SQL:
+[DataFrame reads](../spark#querying-with-dataframes) are supported and can now reference tables by name using `spark.table`:
```scala
-spark.read.format("iceberg").load("default.test_table").createOrReplaceTempView("test_table")
-spark.sql("""SELECT count(1) FROM test_table""")
+val df = spark.table("local.db.table")
+df.count()
```
### Next steps
-Next, you can learn more about the [Iceberg Table API](../api), or about [Iceberg tables in Spark](../spark)
+Next, you can learn more about [Iceberg tables in Spark](../spark), or about the [Iceberg Table API](../api).
diff --git a/site/docs/javadoc/index.html b/site/docs/javadoc/index.html
index 603c018..d18f655 100644
--- a/site/docs/javadoc/index.html
+++ b/site/docs/javadoc/index.html
@@ -1,9 +1,9 @@
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Iceberg Javadoc Redirect</title>
- <meta http-equiv="refresh" content="0;URL='/javadoc/0.8.0-incubating/'" />
+ <meta http-equiv="refresh" content="0;URL='/javadoc/0.9.0/'" />
</head>
<body>
-  <p>Redirecting to Javadoc for the 0.8.0-incubating release: <a href="/javadoc/0.8.0-incubating/">/javadoc/0.8.0-incubating</a>.</p>
+  <p>Redirecting to Javadoc for the 0.9.0 release: <a href="/javadoc/0.9.0/">/javadoc/0.9.0</a>.</p>
</body>
</html>
diff --git a/site/docs/releases.md b/site/docs/releases.md
index fbae902..8dd169d 100644
--- a/site/docs/releases.md
+++ b/site/docs/releases.md
@@ -1,26 +1,27 @@
## Downloads
-The latest version of Iceberg is [0.8.0-incubating](https://github.com/apache/iceberg/releases/tag/apache-iceberg-0.8.0-incubating).
+The latest version of Iceberg is [0.9.0](https://github.com/apache/iceberg/releases/tag/apache-iceberg-0.9.0).
-* [0.8.0-incubating source tar.gz](https://www.apache.org/dyn/closer.cgi/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz) -- [signature](https://downloads.apache.org/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz.asc) -- [sha512](https://downloads.apache.org/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz.sha512)
-* [0.8.0-incubating Spark 2.4 runtime Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime/0.8.0-incubating/iceberg-spark-runtime-0.8.0-incubating.jar)
+* [0.9.0 source tar.gz](https://www.apache.org/dyn/closer.cgi/iceberg/apache-iceberg-0.9.0/apache-iceberg-0.9.0.tar.gz) -- [signature](https://downloads.apache.org/iceberg/apache-iceberg-0.9.0/apache-iceberg-0.9.0.tar.gz.asc) -- [sha512](https://downloads.apache.org/iceberg/apache-iceberg-0.9.0/apache-iceberg-0.9.0.tar.gz.sha512)
+* [0.9.0 Spark 3.0 runtime Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark3-runtime/0.9.0/iceberg-spark3-runtime-0.9.0.jar)
+* [0.9.0 Spark 2.4 runtime Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime/0.9.0/iceberg-spark-runtime-0.9.0.jar)
-To use Iceberg in Spark 2.4, download the runtime Jar and add it to the jars folder of your Spark install.
+To use Iceberg in Spark, download the runtime Jar and add it to the jars folder of your Spark install. Use `iceberg-spark3-runtime` for Spark 3, and `iceberg-spark-runtime` for Spark 2.4.
-## Gradle
+### Gradle
To add a dependency on Iceberg in Gradle, add the following to `build.gradle`:
```
dependencies {
- compile 'org.apache.iceberg:iceberg-core:0.8.0-incubating'
+ compile 'org.apache.iceberg:iceberg-core:0.9.0'
}
```
You may also want to include `iceberg-parquet` for Parquet file support.
-## Maven
+### Maven
To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
@@ -30,7 +31,7 @@ To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-core</artifactId>
- <version>0.8.0-incubating</version>
+ <version>0.9.0</version>
</dependency>
...
</dependencies>
@@ -38,6 +39,13 @@ To add a dependency on Iceberg in Maven, add the following to your `pom.xml`:
## Past releases
+### 0.8.0
+
+* Git tag: [apache-iceberg-0.8.0-incubating](https://github.com/apache/iceberg/releases/tag/apache-iceberg-0.8.0-incubating)
+* [0.8.0-incubating source tar.gz](https://www.apache.org/dyn/closer.cgi/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz) -- [signature](https://downloads.apache.org/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz.asc) -- [sha512](https://downloads.apache.org/incubator/iceberg/apache-iceberg-0.8.0-incubating/apache-iceberg-0.8.0-incubating.tar.gz.sha512)
+* [0.8.0-incubating Spark 2.4 runtime Jar](https://search.maven.org/remotecontent?filepath=org/apache/iceberg/iceberg-spark-runtime/0.8.0-incubating/iceberg-spark-runtime-0.8.0-incubating.jar)
+
+
### 0.7.0
* Git tag: [apache-iceberg-0.7.0-incubating](https://github.com/apache/iceberg/releases/tag/apache-iceberg-0.7.0-incubating)
diff --git a/site/docs/spark.md b/site/docs/spark.md
index 5353c1d..120ff0d 100644
--- a/site/docs/spark.md
+++ b/site/docs/spark.md
@@ -37,7 +37,7 @@ Iceberg uses Apache Spark's DataSourceV2 API for data source and catalog implementations
## Configuring catalogs
-Spark 3.0 adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting Spark properties under `spark.sql.catalog`.
+Spark 3.0 adds an API to plug in table catalogs that are used to load, create, and manage Iceberg tables. Spark catalogs are configured by setting [Spark properties](../configuration#catalogs) under `spark.sql.catalog`.
This creates an Iceberg catalog named `hive_prod` that loads tables from a Hive metastore:
@@ -93,7 +93,7 @@ This configuration can use same Hive Metastore for both Iceberg and non-Iceberg tables
## DDL commands
!!! Note
-    Spark 2.4 can't create Iceberg tables with DDL, instead use the [Iceberg API](../api-quickstart).
+    Spark 2.4 can't create Iceberg tables with DDL, instead use the [Iceberg API](../java-api-quickstart).
### `CREATE TABLE`
@@ -286,6 +286,11 @@ val df = spark.read
.table("prod.db.table")
```
+!!! Warning
+    When reading with DataFrames in Spark 3, use `table` to load a table by name from a catalog.
+    Using `format("iceberg")` loads an isolated table reference that is not refreshed when other queries update the table.
+
+
### Time travel
To select a specific table snapshot or the snapshot at some time, Iceberg supports two Spark read options:
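As a sketch of how those two read options are passed (the snapshot ID, timestamp, and table name below are made-up values, assuming a Spark 3 session with an Iceberg catalog):

```scala
// read the table as of a specific snapshot ID (value is illustrative)
spark.read
    .option("snapshot-id", 10963874102873L)
    .table("prod.db.table")

// read the table as of a point in time, given in milliseconds (value is illustrative)
spark.read
    .option("as-of-timestamp", 1594675200000L)
    .table("prod.db.table")
```

Only one of the two options should be set on a given read; they both resolve to a single snapshot.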
@@ -353,6 +358,13 @@ To replace data in the table with the result of a query, use `INSERT OVERWRITE`.
The partitions that will be replaced by `INSERT OVERWRITE` depends on Spark's partition overwrite mode and the partitioning of a table.
+!!! Warning
+    Spark 3.0.0 has a correctness bug that affects dynamic `INSERT OVERWRITE` with hidden partitioning, [SPARK-32168][spark-32168].
+    For tables with [hidden partitions](../partitioning), wait for Spark 3.0.1.
+
+[spark-32168]: https://issues.apache.org/jira/browse/SPARK-32168
+
+
#### Overwrite behavior
Spark's default overwrite mode is **static**, but **dynamic overwrite mode is recommended when writing to Iceberg tables.** Static overwrite mode determines which partitions to overwrite in a table by converting the `PARTITION` clause to a filter, but the `PARTITION` clause can only reference table columns.
@@ -432,6 +444,13 @@ Spark 3 introduced the new `DataFrameWriterV2` API for writing to tables using data frames:
- `df.writeTo(t).append()` is equivalent to `INSERT INTO`
- `df.writeTo(t).overwritePartitions()` is equivalent to dynamic `INSERT OVERWRITE`
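The equivalences above can be sketched in Scala; the table names and source DataFrame are illustrative assumptions, not part of this commit:

```scala
// assumes a Spark 3 session with an Iceberg catalog named "prod"
val df = spark.table("prod.db.source")

// equivalent to: INSERT INTO prod.db.table ...
df.writeTo("prod.db.table").append()

// equivalent to a dynamic INSERT OVERWRITE of the partitions df writes into
df.writeTo("prod.db.table").overwritePartitions()
```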
+The v1 DataFrame `write` API is still supported, but is not recommended.
+
+!!! Warning
+    When writing with the v1 DataFrame API in Spark 3, use `saveAsTable` or `insertInto` to load tables with a catalog.
+    Using `format("iceberg")` loads an isolated table reference that will not automatically refresh tables used by queries.
+
+
### Appending data
To append a dataframe to an Iceberg table, use `append`:
diff --git a/site/mkdocs.yml b/site/mkdocs.yml
index 637c40c..5f1976e 100644
--- a/site/mkdocs.yml
+++ b/site/mkdocs.yml
@@ -48,16 +48,15 @@ nav:
- How to Release: how-to-release.md
- User docs:
- Getting Started: getting-started.md
+ - Spark: spark.md
+ - Presto: presto.md
- Configuration: configuration.md
- Schemas: schemas.md
- Partitioning: partitioning.md
+ - Table evolution: evolution.md
- Performance: performance.md
- Reliability: reliability.md
- - Table evolution: evolution.md
- Time Travel: spark#time-travel
- - Spark Quickstart: api-quickstart.md
- - Spark: spark.md
- - Presto: presto.md
- Java:
- Git Repo: https://github.com/apache/iceberg
- Quickstart: java-api-quickstart.md