[jira] [Updated] (DRILL-540) Allow querying hive views in drill

Arina Ielchiieva (JIRA) Wed, 14 Nov 2018 07:39:19 -0800


     [ 
https://issues.apache.org/jira/browse/DRILL-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Arina Ielchiieva updated DRILL-540:
-----------------------------------
    Description: 
Currently hive views cannot be queried from drill.

*Suggested approach*
 # Drill persists it's views metadata in file with suffix .view.drill using 
json format. For example: 

{noformat}
{
 "name" : "view_from_calcite_1_4",
 "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
 "fields" : [ {
 "name" : "*",
 "type" : "ANY",
 "isNullable" : true
 } ],
 "workspaceSchemaPath" : [ "dfs", "tmp" ]
}
{noformat}
Later Drill parses the metadata and uses it to treat view names in SQL as a 
subquery.

      2. In Apache Hive metadata about views is stored in similar way to 
tables. Below is example from metastore.TBLS :

 
{noformat}

TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID |TBL_NAME 
 |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
-------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
2      |1542111078  |1     |0                |mapr  |0         |2     |cview    
 |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |

{noformat}
      3. So in Hive metastore views are considered as tables of special type. 
And main benefit is that we also have expanded SQL definition of views (just 
like in view.drill files). Also reading of the metadata is already implemented 
in Drill with help of thrift Metastore API.

      4. To enable querying of Hive views I'll reuse existing code for Drill 
views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for 
_*HiveReadEntry*_ we'll convert the metadata to instance of _*View*_ (_which is 
actually model for data persisted in .view.drill files_) and then based on this 
instance return new _*DrillViewTable*_. Using this approach drill will handle 
hive views the same way as if it was initially defined in Drill and persisted 
in .view.drill file. 

     5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ 
we'll reuse existing code from _*DrillHiveTable*_, so the conversion 
functionality will be extracted and used for both (table and view) fields type 
conversions. 

 

  was:
Currently hive views cannot be queried from drill.

*Suggested approach*
 # Drill persists it's views metadata in file with suffix .view.drill using 
json format. For example: 

{noformat}
{
 "name" : "view_from_calcite_1_4",
 "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
 "fields" : [ {
 "name" : "*",
 "type" : "ANY",
 "isNullable" : true
 } ],
 "workspaceSchemaPath" : [ "dfs", "tmp" ]
}{noformat}
        Later drill parses the metadata and uses it to treat view names in SQL 
as a subquery.

      2. In Apache Hive metadata about views is stored in similar way to 
tables. Below is example from metastore.TBLS :

 
{noformat}
TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID |TBL_NAME 
 |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
-------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
2      |1542111078  |1     |0                |mapr  |0         |2     |cview    
 |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |

{noformat}
      3. So in Hive metastore views are considered as tables of special type. 
And main benefit is that we also have expanded SQL definition of views (just 
like in view.drill files). Also reading of the metadata is already implemented 
in Drill with help of thrift Metastore API.

      4. To enable querying of Hive views I'll reuse existing code for Drill 
views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for 
_*HiveReadEntry*_ we'll convert the metadata to instance of _*View*_ (_which is 
actually model for data persisted in .view.drill files_) and then based on this 
instance return new _*DrillViewTable*_. Using this approach drill will handle 
hive views the same way as if it was initially defined in Drill and persisted 
in .view.drill file. 

     5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ 
we'll reuse existing code from _*DrillHiveTable*_, so the conversion 
functionality will be extracted and used for both (table and view) fields type 
conversions. 

 


> Allow querying hive views in drill
> ----------------------------------
>
>                 Key: DRILL-540
>                 URL: https://issues.apache.org/jira/browse/DRILL-540
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Storage - Hive
>            Reporter: Ramana Inukonda Nagaraj
>            Assignee: Igor Guzenko
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.16.0
>
>
> Currently hive views cannot be queried from drill.
> *Suggested approach*
>  # Drill persists it's views metadata in file with suffix .view.drill using 
> json format. For example: 
> {noformat}
> {
>  "name" : "view_from_calcite_1_4",
>  "sql" : "SELECT * FROM `cp`.`store.json`WHERE `store_id` = 0",
>  "fields" : [ {
>  "name" : "*",
>  "type" : "ANY",
>  "isNullable" : true
>  } ],
>  "workspaceSchemaPath" : [ "dfs", "tmp" ]
> }
> {noformat}
> Later Drill parses the metadata and uses it to treat view names in SQL as a 
> subquery.
>       2. In Apache Hive metadata about views is stored in similar way to 
> tables. Below is example from metastore.TBLS :
>  
> {noformat}
> TBL_ID |CREATE_TIME |DB_ID |LAST_ACCESS_TIME |OWNER |RETENTION |SD_ID 
> |TBL_NAME  |TBL_TYPE      |VIEW_EXPANDED_TEXT                         |
> -------|------------|------|-----------------|------|----------|------|----------|--------------|-------------------------------------------|
> 2      |1542111078  |1     |0                |mapr  |0         |2     |cview  
>    |VIRTUAL_VIEW  |SELECT COUNT(*) FROM `default`.`customers` |
> {noformat}
>       3. So in Hive metastore views are considered as tables of special type. 
> And main benefit is that we also have expanded SQL definition of views (just 
> like in view.drill files). Also reading of the metadata is already 
> implemented in Drill with help of thrift Metastore API.
>       4. To enable querying of Hive views I'll reuse existing code for Drill 
> views as much as possible. First in *_HiveSchemaFactory.getDrillTable_* for 
> _*HiveReadEntry*_ we'll convert the metadata to instance of _*View*_ (_which 
> is actually model for data persisted in .view.drill files_) and then based on 
> this instance return new _*DrillViewTable*_. Using this approach drill will 
> handle hive views the same way as if it was initially defined in Drill and 
> persisted in .view.drill file. 
>      5. For conversion of Hive types: from _*FieldSchema*_ to _*RelDataType*_ 
> we'll reuse existing code from _*DrillHiveTable*_, so the conversion 
> functionality will be extracted and used for both (table and view) fields 
> type conversions. 
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (DRILL-540) Allow querying hive views in drill

Reply via email to