Re: Hive footprint

Jörn Franke Wed, 20 Apr 2016 09:13:49 -0700

It really depends on what you want to do. Hive is better suited to queries over
large volumes of data, whereas HBase+Phoenix is better suited to OLTP scenarios
or sensor ingestion.


I think the reason is that Hive has been the entry point for many engines and 
formats. Additionally, there are many tuning options, from hardware through to 
software, to make it fast. As a result, other software has always had a 
somewhat harder time competing.


> On 19 Apr 2016, at 00:34, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
> 
> Hi,
> 
> I notice that Impala is rarely mentioned these days. I may be missing 
> something, but I gather it is coming to an end, as I don't recall many 
> use cases for it (or customers asking for it). In contrast, Hive has held its 
> ground with the addition of Spark and Tez as execution engines, support 
> for ACID and ORC, and the new features in Hive 2. In addition, provided a good 
> choice is made for its metastore, it scales well.
> 
> If Hive had native support for local variables and stored procedures, it 
> would be a top-notch data warehouse. Given its metastore, I don't see any 
> technical reason why it cannot support these constructs.
> 
> I was recently asked to comment on migration from commercial DWs to Big Data 
> (primarily for TCO reasons) and really could not recall any better candidate 
> than Hive. Is HBase a viable alternative? Obviously, whatever one decides, 
> there is still HDFS, a good engine for Hive (it sounds like many prefer Tez, 
> although I am a Spark fan) and the ubiquitous YARN.
> 
> Let me know your thoughts.
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> http://talebzadehmich.wordpress.com
>
