If we have a huge table, and every hour only 1% of it has some updates,
it would be a huge waste to slurp in the whole table through a MapReduce job and
write out a new table.
Instead, if we store this table in HBase and use the current HBase+Hive
integration, as long as we can do upserts, then we
Hi Yang. That's correct. You should check out the HBase UDFs in Klout's
Brickhouse library
https://github.com/klout/brickhouse/tree/master/src/main/java/brickhouse/hbase
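
To make the cost argument concrete, here is a toy sketch in Python. It is not Hive, HBase, or Brickhouse code: a plain dict stands in for a table keyed by row key, and the names `full_rewrite` and `upsert` are made up for illustration. It only shows that a keyed upsert touches the changed rows, while the batch approach writes every row again.

```python
# Toy model only: a dict stands in for a key-value table such as HBase.
# Function names are hypothetical, not part of any library.

def full_rewrite(table, updates):
    """Batch approach: read every row and write out a whole new table."""
    new_table = {}
    rows_written = 0
    for key, value in table.items():
        # Take the updated value if there is one, else copy the old row.
        new_table[key] = updates.get(key, value)
        rows_written += 1
    for key, value in updates.items():
        if key not in new_table:  # rows that did not exist before
            new_table[key] = value
            rows_written += 1
    return new_table, rows_written


def upsert(table, updates):
    """Keyed approach: insert or overwrite only the changed rows, in place."""
    for key, value in updates.items():
        table[key] = value
    return len(updates)


table = {f"row{i:04d}": "old" for i in range(1000)}
updates = {f"row{i:04d}": "new" for i in range(10)}  # 1% of existing rows
updates["row9999"] = "new"                           # plus one new row

rewritten, rewrite_cost = full_rewrite(table, updates)
upsert_cost = upsert(table, updates)

print(rewrite_cost, upsert_cost)
```

Both paths end with the same table contents; the difference is how many rows get written per refresh.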
On Jul 24, 2014 8:07 PM, Yang tedd...@gmail.com wrote:
If we have a huge table, and every hour only 1% of it has some