[ 
https://issues.apache.org/jira/browse/SPARK-30022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Ravi M V updated SPARK-30022:
----------------------------------
    Description: 
 

We have an environment where we use Apache Spark and Presto (both backed by 
Apache Hive Metastore). Currently, views created from Presto fail to parse 
in Apache Spark. This is because Presto stores the view definition and view 
schema base64-encoded, and Spark is unable to process them. I would 
like to propose a minor change that will allow us to read these encoded 
definitions created by Presto in a Spark program.

Assuming that the UDFs used by a view are made available, the user should be 
able to read Presto views after the fix.

 

I would like to propose a change to 
[https://github.com/apache/spark/blob/9459833eae7fae887af560f3127997e023c51d00/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L440]

to support creating a CatalogTable for views created from Presto.

 

In the Hive Metastore DB, the table definition for Presto views (select * from 
TBLS where `TBL_TYPE` like '%VIRTUAL_VIEW%') shows that 
VIEW_EXPANDED_TEXT is hardcoded as `/* Presto View */` and VIEW_ORIGINAL_TEXT 
is `/* Presto View: base64({ "originalSql": "", "catalog": "", "schema": "", 
"columns": [ { "name": "", "type": "" } ], "owner": "" }) */`.

Refer: 
[https://github.com/prestodb/presto/blob/3242715959a169dbcdd88946c28488d2365c8886/presto-hive/src/main/java/com/facebook/presto/hive/HiveUtil.java#L614]
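To illustrate the encoding described above, here is a minimal sketch in Python of how VIEW_ORIGINAL_TEXT could be recognized and decoded. It is not the proposed Spark change itself (that would live in HiveClientImpl.scala). The function names `is_presto_view` and `decode_presto_view` are hypothetical, the prefix and suffix strings are taken from the format shown above, and the sample values are made up for the round trip.

```python
import base64
import json

# Markers taken from the VIEW_ORIGINAL_TEXT format shown in this issue.
PRESTO_VIEW_PREFIX = "/* Presto View: "
PRESTO_VIEW_SUFFIX = " */"


def is_presto_view(view_original_text: str) -> bool:
    """Return True if the text looks like an encoded Presto view definition."""
    text = view_original_text.strip()
    return text.startswith(PRESTO_VIEW_PREFIX) and text.endswith(PRESTO_VIEW_SUFFIX)


def decode_presto_view(view_original_text: str) -> dict:
    """Strip the comment markers, base64-decode the payload, and parse the JSON."""
    text = view_original_text.strip()
    if not is_presto_view(text):
        raise ValueError("not a Presto view definition")
    payload = text[len(PRESTO_VIEW_PREFIX):-len(PRESTO_VIEW_SUFFIX)]
    return json.loads(base64.b64decode(payload).decode("utf-8"))


# Round trip with made-up sample values (assumption: illustrative only).
sample = {
    "originalSql": "SELECT 1 AS x",
    "catalog": "hive",
    "schema": "default",
    "columns": [{"name": "x", "type": "integer"}],
    "owner": "someone",
}
encoded = (
    PRESTO_VIEW_PREFIX
    + base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
    + PRESTO_VIEW_SUFFIX
)
view = decode_presto_view(encoded)
print(view["originalSql"])
```

With the decoded definition in hand, the `originalSql` and `columns` fields are presumably what Spark would need to build the view's text and schema.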

 


> Supporting Parsing of Simple Hive Virtual View created from Presto
> ------------------------------------------------------------------
>
>                 Key: SPARK-30022
>                 URL: https://issues.apache.org/jira/browse/SPARK-30022
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Arun Ravi M V
>            Priority: Major
>


