Hi Bill!

Please check out the scope document attached to HBASE-18405 "Track
scope for HBase-Spark module". It's the result of the last time the
community went through discussing what was needed for a release-worthy
integration.

pdf: https://s.apache.org/fejd

I haven't looked at the scope specifically in about a year, but it'd
be great to get renewed effort going. It'd be simpler and faster to
propose things in terms of updates to that scope document.

Also, please note that the issue of moving the Spark integration out
of the main repo recently came up again as part of a wider discussion
about moving integrations with various other systems into a different
repo (thread on dev@hbase with subject "[DISCUSS] Kafka Connection,
HBASE-15320").

On Sun, Jul 29, 2018 at 2:53 AM, bill.yunfu
<guangcheng....@alibaba-inc.com> wrote:
> May I take this issue --hbase-spark
>
> Hi community,
>    I work on an HBase team that serves hundreds of customers. We find
> that as the amount of data in HBase grows, many customers need to
> analyze their data in HBase. For example, they want to use Spark to
> run analyses that may query more data from HBase and may also join
> with other tables, which may be in HBase or Spark.
>    But HBase does not support this scenario very well, so we plan to
> use Spark to support it.
>    We found that Apache HBase already has a module called hbase-spark,
> but this module has not been updated recently and has not been formally
> released. We also found other projects that support SQL on HBase, for
> example Hive on HBase, which offers good SQL syntax support.
>    Even though there are many projects for Spark on HBase, I think none
> of them is well known to users. Because our customers have more and
> more requirements for Spark on HBase, we want to take on this issue.
> The initial goal is to make a standard, publicly known Spark on HBase
> integration in the Apache HBase community.
>    Our initial idea is:
>    SQL support: Currently the hbase-spark module cannot use a spark-sql
> command to create a table. We want to make it support SQL commands,
> which may resemble the SQL syntax from Hive on HBase or the SQL syntax
> from SHC.
>    Performance improvement: this part is not very clear yet; the goal
> is for Spark SQL queries on HBase data to perform well.
>
> We would like to get some suggestions from the community. Then I will
> open a JIRA to track it and post a design document.
>
> Best Regards
> Bill
>
>
>
>
> --
> Sent from: 
> http://apache-hbase.679495.n3.nabble.com/HBase-Developer-f679493.html
