pvary commented on a change in pull request #1837:
URL: https://github.com/apache/iceberg/pull/1837#discussion_r532636366

##########
File path: site/docs/hive.md
##########
@@ -79,12 +79,45 @@ In order to query a Hive table created by either of the HiveCatalog methods desc
 ```sql
 SET iceberg.mr.catalog=hive;
 ```
-You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table. Both the Map Reduce and Tez query execution engines are supported.
+You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table.
 ```sql
 SELECT * from table_b;
 ```
+#### Using Hadoop Catalog
+Iceberg tables created using `HadoopCatalog` are stored entirely in a directory in a filesytem like HDFS.
+
+##### Create an Iceberg table
+The first step is to create an Iceberg table using the Spark/Java/Python API and `HadoopCatalog`. For the purposes of this documentation we will assume that the fully qualified table identifier is `database_a.table_c` and that the Hadoop Catalog warehouse location is `hdfs://some_bucket/path_to_hadoop_warehouse`. Iceberg will therefore create the table at the location `hdfs://some_bucket/path_to_hadoop_warehouse/database_a/table_c`.
+
+##### Create a Hive table
+Now overlay a Hive table on top of this Iceberg table by issuing Hive DDL like so:
+```sql
+CREATE EXTERNAL TABLE database_a.table_c
+STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
+LOCATION 'hdfs://some_bucket/path_to_hadoop_warehouse/database_a/table_c'
+TBLPROPERTIES (
+  'iceberg.mr.catalog'='hadoop',
+  'iceberg.mr.catalog.hadoop.warehouse.location'='hdfs://some_bucket/path_to_hadoop_warehouse')
+;
+```
+Note that the Hive database and table name *must* match the values used in the Iceberg `TableIdentifier` when the table was created.

Review comment:
It would be good to remove this restriction later
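
To make the restriction concrete: the documentation in the diff says a `HadoopCatalog` table lives at `<warehouse>/<database>/<table>`. Below is a minimal Python sketch of that layout, not Iceberg code. The helpers `hadoop_table_location` and `names_match` are hypothetical names introduced only for illustration, and the idea that the Hive name is resolved through the same layout is an inference from the restriction described above, not something the diff states.

```python
def hadoop_table_location(warehouse: str, identifier: str) -> str:
    """Join a warehouse location and a dot-separated table identifier.

    Hypothetical helper: mirrors the <warehouse>/<database>/<table> layout
    described in the documentation, nothing more.
    """
    parts = identifier.split(".")
    return "/".join([warehouse.rstrip("/")] + parts)


def names_match(hive_db: str, hive_table: str, identifier: str) -> bool:
    """Check that the Hive database and table name equal the Iceberg identifier.

    Hypothetical helper: expresses the restriction from the documentation.
    """
    return f"{hive_db}.{hive_table}" == identifier


# Values taken from the documentation in the diff.
warehouse = "hdfs://some_bucket/path_to_hadoop_warehouse"
identifier = "database_a.table_c"

print(hadoop_table_location(warehouse, identifier))
print(names_match("database_a", "table_c", identifier))
```

Under this assumed layout, a Hive table named `database_a.table_x` would point at a different directory than the one Iceberg wrote to, which is presumably why the names currently have to match.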