HBase will not have a query engine. It will provide better support for query engines.
Cheers

> On Aug 10, 2015, at 11:11 PM, Yan Zhou.sc <yan.zhou...@huawei.com> wrote:
>
> Ted,
>
> I’m in China now, and seem to have difficulty accessing Apache JIRA.
> Anyway, it appears to me that HBASE-14181 attempts to support Spark
> DataFrame inside HBase.
> If so, my question is whether HBase is intended to have a built-in
> query engine, or whether it will stay as it is now, a k-v store with
> some built-in processing capabilities in the form of coprocessors,
> custom filters, etc., which allows loosely coupled query engines to be
> built on top of it.
>
> Thanks,
>
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: August 11, 2015 8:54
> To: Bing Xiao (Bing)
> Cc: d...@spark.apache.org; user@spark.apache.org; Yan Zhou.sc
> Subject: Re: Package Release Annoucement: Spark SQL on HBase "Astro"
>
> Yan / Bing:
> Mind taking a look at HBASE-14181 'Add Spark DataFrame DataSource to
> HBase-Spark Module'?
>
> Thanks
>
> On Wed, Jul 22, 2015 at 4:53 PM, Bing Xiao (Bing) <bing.x...@huawei.com> wrote:
> We are happy to announce the availability of the Spark SQL on HBase
> 1.0.0 release.
> http://spark-packages.org/package/Huawei-Spark/Spark-SQL-on-HBase
>
> The main features in this package, dubbed “Astro”, include:
>
> · Systematic and powerful handling of data pruning and intelligent
>   scan, based on a partial evaluation technique
> · HBase pushdown capabilities, such as custom filters and
>   coprocessors, to support ultra-low-latency processing
> · SQL and DataFrame support
> · More SQL capabilities made possible (secondary index, bloom filter,
>   primary key, bulk load, update)
> · Joins with data from other sources
> · Python/Java/Scala support
> · Support for the latest Spark 1.4.0 release
>
> The tests by the Huawei team and community contributors covered the
> following areas: bulk load; projection pruning; partition pruning;
> partial evaluation; code generation; coprocessor; custom filtering;
> DML; complex filtering on keys and non-keys; join/union with non-HBase
> data; DataFrame; multi-column-family test. We will post the test
> results, including performance tests, in the middle of August.
>
> You are very welcome to try out or deploy the package, and to help
> improve the integration tests with various combinations of settings,
> extensive DataFrame tests, complex join/union tests and extensive
> performance tests. Please use the “Issues” and “Pull Requests” links
> on the package homepage if you want to report bugs or request
> improvements or features.
>
> Special thanks to project owner and technical leader Yan Zhou, the
> Huawei global team, community contributors and Databricks. Databricks
> has provided great assistance from the design through to the release.
>
> “Astro”, the Spark SQL on HBase package, will be useful for
> ultra-low-latency query and analytics of large-scale data sets in
> vertical enterprises. We will continue to work with the community to
> develop new features and improve the code base. Your comments and
> suggestions are greatly appreciated.
>
> Yan Zhou / Bing Xiao
> Huawei Big Data team