Re: Reply: Package Release Announcement: Spark SQL on HBase "Astro"

Ted Yu Tue, 11 Aug 2015 00:29:10 -0700

HBase will not have a query engine.

It will provide better support for query engines.


Cheers



> On Aug 10, 2015, at 11:11 PM, Yan Zhou.sc <[email protected]> wrote:
> 
> Ted,
>  
> I’m in China now and seem to have difficulty accessing Apache JIRA.
> Anyway, it appears to me that HBASE-14181 attempts to support Spark
> DataFrame inside HBase.
> If so, my question is whether HBase is intended to have a built-in
> query engine, or whether it will stick with the current approach as a
> k-v store with some built-in processing capabilities in the form of
> coprocessors, custom filters, etc., which allows loosely coupled query
> engines to be built on top of it.
>  
> Thanks,
>  
> From: Ted Yu [mailto:[email protected]]
> Sent: August 11, 2015 8:54
> To: Bing Xiao (Bing)
> Cc: [email protected]; [email protected]; Yan Zhou.sc
> Subject: Re: Package Release Announcement: Spark SQL on HBase "Astro"
>  
> Yan / Bing:
> Would you mind taking a look at HBASE-14181, 'Add Spark DataFrame DataSource
> to HBase-Spark Module'?
>  
> Thanks
>  
> On Wed, Jul 22, 2015 at 4:53 PM, Bing Xiao (Bing) <[email protected]> 
> wrote:
> We are happy to announce the availability of the Spark SQL on HBase 1.0.0 
> release.  http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase
> The main features in this package, dubbed “Astro”, include:
> · Systematic and powerful data pruning and intelligent scans, based on
>   the partial evaluation technique
> · HBase pushdown capabilities, such as custom filters and coprocessors, to
>   support ultra-low-latency processing
> · SQL and DataFrame support
> · More SQL capabilities (secondary index, Bloom filter, primary key,
>   bulk load, update)
> · Joins with data from other sources
> · Python/Java/Scala support
> · Support for the latest Spark 1.4.0 release
> 
>  
> 
> The tests by the Huawei team and community contributors covered the following
> areas: bulk load; projection pruning; partition pruning; partial evaluation;
> code generation; coprocessors; custom filtering; DML; complex filtering on
> keys and non-keys; joins/unions with non-HBase data; DataFrame; and
> multi-column-family tests.  We will post the test results, including
> performance tests, in the middle of August.
> You are very welcome to try out or deploy the package, and to help improve the
> integration tests with various combinations of settings, extensive DataFrame
> tests, complex join/union tests, and extensive performance tests.  Please
> use the “Issues” and “Pull Requests” links on the package homepage if you want
> to report bugs or submit improvement or feature requests.
> Special thanks to the project owner and technical leader Yan Zhou, the Huawei
> global team, community contributors, and Databricks.  Databricks has provided
> great assistance from design to release.
> “Astro”, the Spark SQL on HBase package, will be useful for ultra-low-latency
> queries and analytics on large-scale data sets in vertical enterprises. We will
> continue to work with the community to develop new features and improve the
> code base.  Your comments and suggestions are greatly appreciated.
>  
> Yan Zhou / Bing Xiao
> Huawei Big Data team
>  
>
