Re: hive on Tez - merging orc files

2015-04-24 Thread Prasanth Jayachandran
You can download the branch-0.14 source code from 
https://github.com/apache/hive/tree/branch-0.14, apply 
HIVE-9529-branch-1.0.0.patch from 
https://issues.apache.org/jira/browse/HIVE-9529, and compile it using "mvn clean 
install -DskipTests -Phadoop-2,dist". This will generate a tar file under 
hive/packaging/target. Extract the tar file and copy 
hive-exec-x.x.x.jar into /usr/hdp/2.2.*.*/hive/lib/ (back up the existing 
hive-exec.jar and replace it with the new one). Rerunning the hive cli should 
then pick up the new hive-exec jar with the patch.
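
The steps above could be scripted roughly as follows. This is only a sketch: the function names, the scratch directory, and the patch strip level (-p1) are assumptions and not part of the original instructions, and the patch file is expected to have been downloaded from the JIRA page beforehand.

```shell
# Build branch-0.14 with the HIVE-9529 patch applied.
# $1: scratch directory (hypothetical), $2: absolute path to the downloaded patch file.
build_patched_hive() (
  set -euo pipefail
  workdir="$1"
  patch_file="$2"
  git clone --branch branch-0.14 --depth 1 https://github.com/apache/hive.git "$workdir/hive"
  cd "$workdir/hive"
  # Assumption: the patch uses a/ b/ prefixes (-p1); try -p0 if the dry run fails.
  patch -p1 --dry-run < "$patch_file"
  patch -p1 < "$patch_file"
  mvn clean install -DskipTests -Phadoop-2,dist
  # The distribution tar file should appear here.
  ls packaging/target/
)

# Back up the installed hive-exec jar and replace it with the patched one.
# $1: patched hive-exec-x.x.x.jar, $2: installed hive-exec jar to replace.
replace_hive_exec_jar() (
  set -euo pipefail
  new_jar="$1"
  target_jar="$2"
  cp -p "$target_jar" "$target_jar.bak"   # keep a copy of the original
  cp "$new_jar" "$target_jar"
)

# Usage (paths are placeholders, adjust to your install):
#   build_patched_hive /tmp/hive-build /tmp/HIVE-9529-branch-1.0.0.patch
#   replace_hive_exec_jar <extracted-tar>/lib/hive-exec-x.x.x.jar <hive-lib-dir>/hive-exec.jar
```

Keeping the backup next to the original with a .bak suffix makes it easy to roll back if the patched jar misbehaves.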

Thanks
Prasanth

 On Apr 24, 2015, at 1:15 AM, patcharee patcharee.thong...@uni.no wrote:
 
 Hi,
 
 The sandbox 2.2 comes with hive 0.14. Does it also have the bug? If so, how 
 can I patch hive on the sandbox?
 
 BR,
 Patcharee
 
 On 24. april 2015 09:42, Prasanth Jayachandran wrote:
 Hi
 
 This was fixed recently in 
 https://issues.apache.org/jira/browse/HIVE-9529. Merging is triggered in two 
 different ways: INSERT/CTAS can trigger merging of small files, and 
 CONCATENATE can trigger merging of small files. The latter had a bug that 
 generated an MR task instead of a TEZ task, which was fixed recently. The 
 former always uses a TEZ task.
 
 Thanks
 Prasanth
 
 On Apr 24, 2015, at 12:33 AM, patcharee patcharee.thong...@uni.no wrote:
 
 Hi,
 
 Is anyone using Hortonworks sandbox 2.2? I am trying to use hive on 
 Tez on the sandbox. I set the execution engine in hive-site.xml to Tez.
 
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
 
 Then I ran the script that alters a table to merge small orc files (alter 
 table orc_merge5a partition(st=0.8) concatenate;). The merging feature 
 worked, but Hive did not use Tez; it used MapReduce, which is weird!
 
 Another point: I tried to run the same script on the production cluster, 
 which always runs on Tez. There the merging feature sometimes worked and 
 sometimes did not.
 
 I would appreciate any suggestions.
 
 BR,
 Patcharee
 



hive on Tez - merging orc files

2015-04-24 Thread patcharee

Hi,

Is anyone using Hortonworks sandbox 2.2? I am trying to use hive 
on Tez on the sandbox. I set the execution engine in hive-site.xml to Tez.


<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>

Then I ran the script that alters a table to merge small orc files 
(alter table orc_merge5a partition(st=0.8) concatenate;). The merging 
feature worked, but Hive did not use Tez; it used MapReduce, which is weird!


Another point: I tried to run the same script on the production cluster, 
which always runs on Tez. There the merging feature sometimes worked and 
sometimes did not.


I would appreciate any suggestions.

BR,
Patcharee


Re: hive on Tez - merging orc files

2015-04-24 Thread Prasanth Jayachandran
Hi

This was fixed recently in https://issues.apache.org/jira/browse/HIVE-9529. 
Merging is triggered in two different ways: INSERT/CTAS can trigger merging of 
small files, and CONCATENATE can trigger merging of small files. The latter had 
a bug that generated an MR task instead of a TEZ task, which was fixed recently. 
The former always uses a TEZ task.
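
To make the two merge paths concrete, here is a small sketch that writes out the HiveQL for both. The helper name merge_hql is hypothetical, and hive.merge.tezfiles is my assumption about which setting controls merging at the end of an INSERT/CTAS on Tez; it is not mentioned in this thread, so please check it against your Hive version.

```shell
# Print HiveQL exercising both merge paths for a given table and partition spec.
# $1: table name, $2: partition spec, e.g. "st=0.8"
merge_hql() {
  # Statement 1 selects the Tez engine for the session.
  # Statement 2 (assumed setting) enables small-file merging after INSERT/CTAS on Tez.
  # Statement 3 is the explicit CONCATENATE path, the one affected by HIVE-9529.
  cat <<EOF
SET hive.execution.engine=tez;
SET hive.merge.tezfiles=true;
ALTER TABLE $1 PARTITION ($2) CONCATENATE;
EOF
}

# Usage (requires the hive cli on the PATH):
#   merge_hql orc_merge5a "st=0.8" > merge.hql && hive -f merge.hql
```

On an unpatched 0.14 build the CONCATENATE statement would still be expected to launch an MR task, even with the engine set to Tez.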

Thanks
Prasanth

 On Apr 24, 2015, at 12:33 AM, patcharee patcharee.thong...@uni.no wrote:
 
 Hi,
 
 Is anyone using Hortonworks sandbox 2.2? I am trying to use hive on Tez 
 on the sandbox. I set the execution engine in hive-site.xml to Tez.
 
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
 
 Then I ran the script that alters a table to merge small orc files (alter 
 table orc_merge5a partition(st=0.8) concatenate;). The merging feature 
 worked, but Hive did not use Tez; it used MapReduce, which is weird!
 
 Another point: I tried to run the same script on the production cluster, which 
 always runs on Tez. There the merging feature sometimes worked and sometimes 
 did not.
 
 I would appreciate any suggestions.
 
 BR,
 Patcharee