Mich,

I have not come across many happy users of Hive on top of HBase in my experience. 
For every Hive query, the data has to be read from the filesystem by HBase 
and then serialized through an HBase scanner into Hive.  Throughput through 
this mechanism is pretty poor, and when you read 1 million records you 
actually read 1 million records in HBase and then 1 million records again in 
Hive.  There are significant resource management issues with this approach 
as well.
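For context, the setup being discussed is the standard Hive-on-HBase storage handler mapping. A minimal sketch (the table and column names here are hypothetical, not from the thread):

```sql
-- Hypothetical example: expose an existing HBase table "trades"
-- to Hive as an external table via the HBase storage handler.
CREATE EXTERNAL TABLE trades_hive (
  rowkey STRING,
  price  DOUBLE,
  volume BIGINT
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  -- map the HBase row key and two columns in family "cf"
  "hbase.columns.mapping" = ":key,cf:price,cf:volume"
)
TBLPROPERTIES ("hbase.table.name" = "trades");
```

Every query against trades_hive is executed as an HBase scan, which is where the double read cost comes from.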

At Splice Machine (open source), we have written an implementation that reads 
the store files directly from the filesystem (via embedded Spark) and then 
applies incremental deltas from HBase to maintain consistency.  When we read 1 
million records, Spark reads most of them directly from the filesystem.  Spark 
also provides resource management and fair scheduling for those queries.

We released some of our performance results at HBaseCon East in NYC.  Here is 
the video: https://www.youtube.com/watch?v=cgIz-cjehJ0

Regards,
John Leach

> On Nov 17, 2016, at 6:09 AM, Mich Talebzadeh <mich.talebza...@gmail.com> 
> wrote:
> 
> Hi,
> 
> My approach to putting a SQL engine on top of HBase (excluding Spark
> & Phoenix for now) is to create the HBase table as is, then create an EXTERNAL
> Hive table on top of it, using org.apache.hadoop.hive.hbase.HBaseStorageHandler
> to interface with the HBase table.
> 
> My reasoning for creating a Hive external table is to avoid accidentally
> dropping the HBase table, etc. Is this a reasonable approach?
> 
> Then that Hive table can be used by a variety of tools like Spark, Tableau,
> Zeppelin.
> 
> Is this a viable solution, given that Hive seems to be preferred on top of
> HBase compared to Phoenix etc.?
> 
> Thanks
> 
> Dr Mich Talebzadeh
> 
> 
> 
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> 
> 
> 
> http://talebzadehmich.wordpress.com
> 
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
