[ 
https://issues.apache.org/jira/browse/DRILL-3588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Barefoot updated DRILL-3588:
-----------------------------------
    Component/s: Storage - Hive

> Write back to Hive Metastore
> ----------------------------
>
>                 Key: DRILL-3588
>                 URL: https://issues.apache.org/jira/browse/DRILL-3588
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>            Reporter: Joseph Barefoot
>            Priority: Critical
>
> This feature is particularly important to us here at AtScale, because we 
> want to offer Drill as a query engine option for our BI on Hadoop solution. 
> Currently you can connect to and query databases/tables from the Hive 
> Metastore without problems. However, if you create a table, its data is 
> written to HDFS but no metadata is written to the Hive Metastore. That means 
> those tables won't be easily visible to any other tool. 
> When you read schemas from a Hive data source via Drill, they are prefixed 
> with "hive.". This namespacing makes sense to us considering how Drill works, 
> and ideally it would work symmetrically when you create tables with the same 
> prefix, i.e., Drill would map the prefix to the target data source, in this 
> case Hive, and write the schema information back to the Hive Metastore. Our 
> specific use case is Create Table As Select; ideally, however, any DDL 
> statement against a Hive data source schema/table would write back to the 
> Hive Metastore. 
> The reason it's important to have the metadata in the Hive Metastore is that 
> we have found many of our customers use multiple SQL tools to access data 
> tracked in the Metastore. For example, even if Impala is their primary SQL on 
> Hadoop engine for clients/tools, they may run Spark jobs that manipulate data 
> via RDDs and locate that data by referencing the Metastore. Organizations 
> using a lot of SQL on Hadoop have come to expect this sort of 
> interoperability between Hive, Spark, and Impala, and supporting it within 
> Drill will help drive adoption within the Hadoop community (besides making it 
> a lot easier for us to use Drill effectively from within our BI engine).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
