This is an automated email from the ASF dual-hosted git repository.

blue pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/master by this push:
     new f3c54a8  Docs: Add HadoopCatalog example (#1095)
f3c54a8 is described below

commit f3c54a810ad06ab2ad7c42ca63eb5d8607690d0a
Author: hzfanxinxin <[email protected]>
AuthorDate: Thu Jun 18 01:56:08 2020 +0800

    Docs: Add HadoopCatalog example (#1095)
    
    Co-authored-by: 范欣欣 <[email protected]>
---
 site/docs/api-quickstart.md      | 32 +++++++++++++++++++++++++++++++-
 site/docs/java-api-quickstart.md | 30 +++++++++++++++++++++++++++++-
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/site/docs/api-quickstart.md b/site/docs/api-quickstart.md
index 00f7f35..1926f4a 100644
--- a/site/docs/api-quickstart.md
+++ b/site/docs/api-quickstart.md
@@ -48,9 +48,39 @@ logsDF.write
 
 The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
 
+### Using a Hadoop catalog
+
+A Hadoop catalog doesn't need to connect to a Hive MetaStore, but can only be used with HDFS or similar file systems that support atomic rename. Concurrent writes with a Hadoop catalog are not safe with a local FS or S3. To create a Hadoop catalog:
+
+```scala
+import org.apache.hadoop.conf.Configuration
+import org.apache.iceberg.hadoop.HadoopCatalog
+
+val conf = new Configuration()
+val warehousePath = "hdfs://host:8020/warehouse_path"
+val catalog = new HadoopCatalog(conf, warehousePath)
+```
+
+Like the Hive catalog, `HadoopCatalog` implements `Catalog`, so it also has methods for working with tables, like `createTable`, `loadTable`, and `dropTable`.
+
+This example creates a table with the Hadoop catalog:
+
+```scala
+import org.apache.iceberg.catalog.TableIdentifier
+
+val name = TableIdentifier.of("logging", "logs")
+val table = catalog.createTable(name, schema, spec)
+
+// write into the new logs table with Spark 2.4
+logsDF.write
+    .format("iceberg")
+    .mode("append")
+    .save("hdfs://host:8020/warehouse_path/logging.db/logs")
+```
+
+The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
+
 ### Using Hadoop tables
 
-Iceberg also supports tables that are stored in a directory in HDFS or the local file system. Directory tables don't support all catalog operations, like rename, so they use the `Tables` interface instead of `Catalog`.
+Iceberg also supports tables that are stored in a directory in HDFS. Concurrent writes with Hadoop tables are not safe when stored in the local FS or S3. Directory tables don't support all catalog operations, like rename, so they use the `Tables` interface instead of `Catalog`.
 
 To create a table in HDFS, use `HadoopTables`:
 
diff --git a/site/docs/java-api-quickstart.md b/site/docs/java-api-quickstart.md
index 8bcf080..826db6f 100644
--- a/site/docs/java-api-quickstart.md
+++ b/site/docs/java-api-quickstart.md
@@ -46,9 +46,37 @@ Table table = catalog.createTable(name, schema, spec);
 The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
 
 
+### Using a Hadoop catalog
+
+A Hadoop catalog doesn't need to connect to a Hive MetaStore, but can only be used with HDFS or similar file systems that support atomic rename. Concurrent writes with a Hadoop catalog are not safe with a local FS or S3. To create a Hadoop catalog:
+
+```java
+import org.apache.hadoop.conf.Configuration;
+import org.apache.iceberg.hadoop.HadoopCatalog;
+
+Configuration conf = new Configuration();
+String warehousePath = "hdfs://host:8020/warehouse_path";
+HadoopCatalog catalog = new HadoopCatalog(conf, warehousePath);
+```
+
+Like the Hive catalog, `HadoopCatalog` implements `Catalog`, so it also has methods for working with tables, like `createTable`, `loadTable`, and `dropTable`.
+
+This example creates a table with the Hadoop catalog:
+
+```java
+import org.apache.iceberg.Table;
+import org.apache.iceberg.catalog.TableIdentifier;
+
+TableIdentifier name = TableIdentifier.of("logging", "logs");
+Table table = catalog.createTable(name, schema, spec);
+```
+
+The logs [schema](#create-a-schema) and [partition spec](#create-a-partition-spec) are created below.
+
+
 ### Using Hadoop tables
 
-Iceberg also supports tables that are stored in a directory in HDFS or the local file system. Directory tables don't support all catalog operations, like rename, so they use the `Tables` interface instead of `Catalog`.
+Iceberg also supports tables that are stored in a directory in HDFS. Concurrent writes with Hadoop tables are not safe when stored in the local FS or S3. Directory tables don't support all catalog operations, like rename, so they use the `Tables` interface instead of `Catalog`.
 
 To create a table in HDFS, use `HadoopTables`:
 
