Even 500 reducers sounds like a high number, but I don't know the details of your 
cluster. Can you provide some details?
How many nodes in cluster 
Hive version 
Which distribution (Hortonworks, Apache, CDH, Amazon)
Node specs
Partitions in the table 
Number of records. 
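
Also, one reducer stuck at 99.97% usually points to skew on the join key rather 
than the reducer count itself. As a rough check (just a sketch using the table 
and column names from your query), you could see how the rows distribute over 
cgi on the large side:

select cgi, count(*) as cnt
from sample.dpi_large
where msisdn != ''
group by cgi
order by cnt desc
limit 20;

If a single value (often the empty string or NULL) accounts for most of the 
rows, that is the reducer that never finishes.
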
Thanks
Sanjay

Sent from my iPhone

> On Mar 2, 2014, at 3:09 PM, Siddharth Tiwari <siddharth.tiw...@live.com> wrote:
> 
> Hi team,
> 
> the following query hangs at 99.97% for one reducer; kindly help or point to what the cause could be:
> 
> drop table if exists sample.dpi_short_lt;
> create table sample.dpi_short_lt as
> select b.msisdn,
>        a.area_erb,
>        a.longitude,
>        a.latitude,
>        substring(b.msisdn,1,2) as country,
>        substring(b.msisdn,3,2) as area_code,
>        substring(b.start_time,1,4) as year,
>        substring(b.start_time,6,2) as month,
>        substring(b.start_time,9,2) as day,
>        substring(b.start_time,12,2) as hour,
>        cast(b.procedure_duration as double) as duracao_ms,
>        cast(b.internet_latency as double) as int_internet_latency,
>        cast(b.ran_latency as double) as int_ran_latency,
>        cast(b.http_latency as double) as int_http_latency,
>        (case when b.internet_latency='' then 1 else 0 end) as internet_latency_missing,
>        (case when b.ran_latency='' then 1 else 0 end) as ran_latency_missing,
>        (case when b.http_latency='' then 1 else 0 end) as http_latency_missing,
>        (cast(b.mean_throughput_ul as int) * cast(procedure_duration as int) / 1000) as total_up_bytes,
>        (cast(b.mean_throughput_dl as int) * cast(procedure_duration as int) / 1000) as total_dl_bytes,
>        cast(b.missing_packets_ul as int) as int_missing_packets_ul,
>        cast(b.missing_packets_dl as int) as int_missing_packets_dl
> from sample.dpi_large b
> left outer join sample.science_new a
>   on b.cgi = regexp_replace(a.codigo_cgi_ecgi,'-','')
> where msisdn!='';
> 
> Hive was heuristically selecting 1000 reducers and it was hanging at 99.97 
> percent on one reduce task. I then changed the above values to 3GB per 
> reducer and 500 reducers and started hitting this error.
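
For reference, "3GB per reducer and 500 reducers" would normally correspond to
settings along the lines of

set hive.exec.reducers.bytes.per.reducer=3000000000;
set mapred.reduce.tasks=500;

in the Hive session (assuming that is how they were set); note that capping the
reducer count will not help if most rows hash to a single key.
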
> 
> java.lang.RuntimeException: Hive Runtime Error while closing operators: 
> Unable to rename output from: 
> hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_task_tmp.-ext-10001/_tmp.000003_0
>  to: 
> hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_tmp.-ext-10001/000003_0
>       at 
> org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:313)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:516)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>       at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename 
> output from: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812
> 
> 
> I have a 22-node cluster running CDH 4.3. Please help locate what the issue 
> could be.
> 
> 
> *------------------------*
> Cheers !!!
> Siddharth Tiwari
> Have a refreshing day !!!
> "Every duty is holy, and devotion to duty is the highest form of worship of 
> God.” 
> "Maybe other people will try to limit me but I don't limit myself"
