Thank you both for the answers.  We are trying to find out whether Hive can
be used as a replacement for Netezza, but if there are no indexes then I
don't see how it will beat Netezza on performance.  It sounds like it
certainly can't be used for a quick lookup from a webapp, the way Netezza
can.

If performance isn't a concern, then I guess it could be a useful tool.
Will try it out & see how it works out.  Thanks.


On Sun, Sep 16, 2012 at 10:51 PM, Tim Robertson
<timrobertson...@gmail.com> wrote:

>> Note:  I am a newbie to Hive.
>>
>> Can someone please answer the following questions?
>>
>> 1)  Does Hive provide APIs (like HBase does) that can be used to retrieve
>> data from the tables in Hive from a Java program?  I heard somewhere that
>> the data can be accessed with JDBC (style) APIs.  True?
>>
>
> True.
> https://cwiki.apache.org/Hive/hiveclient.html#HiveClient-JDBC
>
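
For the JDBC route in question 1, here is a minimal sketch of what the Java
side could look like.  It uses only standard java.sql calls.  The
"jdbc:hive2://" URL scheme assumes a HiveServer2 endpoint with the Hive JDBC
driver on the classpath (the older HiveServer1 driver used "jdbc:hive://").
The class name, host, port, database, and user below are placeholders chosen
for illustration, not values from this thread.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class HiveJdbcSketch {

    // Builds a HiveServer2-style JDBC URL (assumption: HiveServer2 is in use).
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs the query from question 2 and collects the foo column as strings.
    static List<String> fetchFoo(Connection conn, String ds) throws SQLException {
        List<String> values = new ArrayList<>();
        String sql = "SELECT a.foo FROM invites a WHERE a.ds = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, ds);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; adjust for the actual cluster.
        String url = jdbcUrl("localhost", 10000, "default");
        try (Connection conn = DriverManager.getConnection(url, "hive", "")) {
            System.out.println(fetchFoo(conn, "2008-08-15"));
        }
    }
}
```

Each such call is still compiled into a job on the cluster, so this is a way
to reach the data from Java, not a low-latency lookup path.
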
>
>> 2)  I don't see how I can add indexes on the tables, so does that mean a
>> query such as the following will trigger a MR job that will search files on
>> HDFS sequentially?
>>
>> hive> SELECT a.foo FROM invites a WHERE a.ds='2008-08-15';
>>
>>
> There are some index implementations in Hive, but it is not as simple as in
> a traditional database.
> E.g. search Jira to see some of the work:
> https://issues.apache.org/jira/browse/HIVE-417
>
> You are correct that the above would do a full table scan.
>
>> 3)  Has anyone compared performance of Hive against other NOSQL databases
>> such as HBase, MongoDB?  I understand it's not exactly an apples-to-apples
>> comparison, but still...
>>
>
> I think you misunderstand what Hive is.  It is basically a SQL-to-MR
> translation engine with adapters for the input source.  By default it
> uses simple files on HDFS, but there are (e.g.) HBase adapters, so you
> can use it to run SQL on HBase tables, for example (which works great).
> Regarding performance, a Hive query over HBase is the same operation as
> running a normal HBase MR scan, so the performance is the same.
>
>
>>
>> Thanks.
>
>
>
