Understanding ORC file format compression

2015-06-21 Thread sreejesh s
Hi, As per my understanding, the available codecs for ORC file format Hive table compression are either Zlib or Snappy. Both compression techniques are non-splittable. Does this mean that queries on a Hive table stored as ORC and compressed will not run multiple maps in parallel? I know

Hive Joins vs Pig Joins

2015-06-02 Thread sreejesh s
Hi, I have read a couple of articles that say Pig joins perform better than Hive joins. Is that true? If yes, could you please explain the reason? Thanks

Benefit of ORC format storing Sum, Min, Max...

2015-05-29 Thread sreejesh s
Hi, I am new to Hive. Please help me understand the benefit of the ORC file format storing Sum, Min, Max values. Whenever we try to find the sum of values in a particular column, it still runs a MapReduce job. select sum(col1) from orctable; select sum(col1) from txttable; For a sample file with