[ 
https://issues.apache.org/jira/browse/SPARK-29018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated SPARK-29018:
------------------------------
    Description: 
### What changes were proposed in this pull request?

With the ongoing development of Spark and Hive, the current `sql/hive-thriftserver` module requires a lot of work to resolve code conflicts between different Hive versions. This is an annoying and never-ending task in the current approach, and these conflicts also hamper development of new features for SparkThriftServer2. We propose to implement a new thrift server based on the latest v11 `TCLIService.thrift` protocol, implementing the entire API in Spark's own code so that we no longer depend on Hive code.
Finally, the new thrift server will have the following features:
1. support all functions that the current `hive-thriftserver` supports
2. use only code maintained by Spark itself
3. reimplement the original functionality to fit Spark's own features, no longer limited by Hive's code
4. support running with or without a Hive metastore
5. support user impersonation through multi-tenant authority separation, Hive authentication and DFS authentication
6. add a new module `spark-jdbc` with connection URL `jdbc:spark:<host>:<port>/<db>`, supporting everything `hive-jdbc` supports
7. support both `hive-jdbc` and `spark-jdbc` clients for compatibility with most clients
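As a rough illustration of the proposed `jdbc:spark:<host>:<port>/<db>` connection URL scheme from item 6 (the class and regex below are hypothetical sketches, not part of the actual patch), the URL could be decomposed into its components like this:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: decompose a spark-jdbc connection URL of the form
// jdbc:spark:<host>:<port>/<db> into host, port, and database name.
public class SparkJdbcUrl {
    private static final Pattern URL =
        Pattern.compile("jdbc:spark:([^:/]+):(\\d+)/(\\w+)");

    public final String host;
    public final int port;
    public final String db;

    public SparkJdbcUrl(String url) {
        Matcher m = URL.matcher(url);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid spark-jdbc URL: " + url);
        }
        host = m.group(1);
        port = Integer.parseInt(m.group(2));
        db = m.group(3);
    }
}
```

For example, `new SparkJdbcUrl("jdbc:spark:localhost:10000/default")` would yield host `localhost`, port `10000`, and database `default`.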


We have done all of this work in our repo, and we now plan to merge our code into master step by step.

1. **phase 1**: PR to build the new module `spark-service` in folder `sql/service`
2. **phase 2**: PR with the thrift protocol and the generated thrift protocol Java code
3. **phase 3**: PR with all `spark-service` module code, a description of the design, and UTs
4. **phase 4**: PR to build the new module `spark-jdbc` in folder `sql/jdbc`
5. **phase 5**: PR with all `spark-jdbc` module code and UTs
6. **phase 6**: PR to support thrift server impersonation
7. **phase 7**: PR to build Spark's own beeline client, `spark-beeline`
8. **phase 8**: PR for Spark's own CLI client (`Spark SQL CLI`), `spark-cli`
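The protocol work in phase 2 centers on the `TCLIService`-style RPCs. As a rough sketch of the session life cycle a client goes through (the interface and in-memory stub below are illustrative assumptions, not the actual thrift-generated code, which is far richer), a session typically flows OpenSession → ExecuteStatement → FetchResults → CloseSession:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Illustrative sketch of the core TCLIService-style session flow; the real
// interface is generated from TCLIService.thrift.
interface CliService {
    String openSession(String user);                     // returns a session handle
    String executeStatement(String session, String sql); // returns an operation handle
    List<String> fetchResults(String operation);
    void closeSession(String session);
}

// Trivial in-memory stub so the flow can be exercised end to end.
class InMemoryCliService implements CliService {
    private final Map<String, String> sessions = new HashMap<>();
    private final Map<String, List<String>> results = new HashMap<>();

    public String openSession(String user) {
        String handle = UUID.randomUUID().toString();
        sessions.put(handle, user);
        return handle;
    }

    public String executeStatement(String session, String sql) {
        if (!sessions.containsKey(session)) {
            throw new IllegalStateException("no such session: " + session);
        }
        String op = UUID.randomUUID().toString();
        List<String> rows = new ArrayList<>();
        rows.add("echo: " + sql);  // stand-in for real query execution
        results.put(op, rows);
        return op;
    }

    public List<String> fetchResults(String operation) {
        return results.getOrDefault(operation, new ArrayList<>());
    }

    public void closeSession(String session) {
        sessions.remove(session);
    }
}
```

Both `hive-jdbc` and the proposed `spark-jdbc` clients would drive the server through this same handle-based session/operation flow, which is what makes the dual-client compatibility in item 7 feasible.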

### Why are the changes needed?

Build a completely new thrift server based on Spark's own code and features, so that we no longer rely on Hive code.


### Does this PR introduce any user-facing change?


### How was this patch tested?
No UT is needed yet.


  was:
The current SparkThriftServer relies too heavily on HiveServer2: whenever the Hive version changes, we have to change a lot of code to adapt.

It would be best to use only Hive's thrift interface and implement our own API for SparkThriftServer.

We should also remove code paths that Spark Thrift Server does not use.


> Spark ThriftServer change to it's own API
> -----------------------------------------
>
>                 Key: SPARK-29018
>                 URL: https://issues.apache.org/jira/browse/SPARK-29018
>             Project: Spark
>          Issue Type: Umbrella
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: angerszhu
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
