Hi,

I am working on a use case that has the following characteristics.
1) Data volume is on the order of 30 million records
2) Data schema is known & is fixed (for the application we are building)
3) Data is NOT multi-format. Each key has integer values for
different aspects of that key
4) Data will be incrementally updated (some column values will be updated at
different points in time)
5) There is a need to support ad hoc querying of the data (queries are not
known ahead of time) without writing MapReduce jobs
6) Queries are likely to have a lot of joins & aggregations

Could you please suggest which of the following I should use?
1) Hive
2) HBase
3) Hive over HBase
4) Pig over HBase

Thanks
Vivek
