Hi,
As I understand it, the compression codecs available for Hive tables stored as
ORC are either Zlib or Snappy. Both compression techniques are non-splittable.
Does that mean that queries on a Hive table stored as ORC and compressed will
not run multiple map tasks in parallel?
I know
Hi,
I have read a couple of articles that say Pig joins perform better than Hive
joins. Is that true? If yes, could you please explain the reason?
Thanks
Hi,
I am new to Hive. Please help me understand the benefit of the ORC file format
storing Sum, Min, and Max values. Whenever we try to find the sum of the values
in a particular column, Hive still runs a MapReduce job.
select sum(col1) from orctable;
select sum(col1) from txttable;
For a sample file with