Even 500 reducers sounds like a high number, but I don't know the details of your
cluster. Can you provide some details:
How many nodes are in the cluster
Hive version
Which distribution (Hortonworks, Apache, CDH, Amazon)
Node specs
Partitions in the table
Number of records
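
To collect the table-level numbers, something like the HiveQL sketch below may help. It also checks how the join keys are distributed, since a single reducer stuck near 100% is often a sign that one key value carries most of the rows. That is only a guess until the counts are in. Table and column names are taken from your query. Whether sample.dpi_large is partitioned is an assumption on my side, and SHOW PARTITIONS will fail if it is not.

```sql
-- Table layout, storage location and partition columns (if any).
DESCRIBE FORMATTED sample.dpi_large;

-- Assumes the table is partitioned; this statement errors out otherwise.
SHOW PARTITIONS sample.dpi_large;

-- Number of records on each side of the join.
SELECT COUNT(*) FROM sample.dpi_large;
SELECT COUNT(*) FROM sample.science_new;

-- Most frequent join-key values on the large side. One value with a very
-- large count (often NULL or '') would send most rows to a single reducer.
SELECT cgi, COUNT(*) AS cnt
FROM sample.dpi_large
WHERE msisdn != ''
GROUP BY cgi
ORDER BY cnt DESC
LIMIT 20;

-- Duplicate keys on the small side multiply the matching rows in the join.
SELECT regexp_replace(codigo_cgi_ecgi, '-', '') AS cgi_key, COUNT(*) AS cnt
FROM sample.science_new
GROUP BY regexp_replace(codigo_cgi_ecgi, '-', '')
ORDER BY cnt DESC
LIMIT 20;
```

If one key dominates in either result, please include those counts along with the details above.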

> On Mar 2, 2014, at 3:09 PM, Siddharth Tiwari <siddharth.tiw...@live.com> 
> wrote:
> Hi team,
> the following query hangs at 99.97% on one reducer. Kindly help or point to
> what could be the cause.
> drop table if exists sample.dpi_short_lt;
> create table sample.dpi_short_lt as
> select b.msisdn,
>        a.area_erb,
>        a.longitude,
>        a.latitude,
>        substring(b.msisdn,1,2) as country,
>        substring(b.msisdn,3,2) as area_code,
>        substring(b.start_time,1,4) as year,
>        substring(b.start_time,6,2) as month,
>        substring(b.start_time,9,2) as day,
>        substring(b.start_time,12,2) as hour,
>        cast(b.procedure_duration as double) as duracao_ms,
>        cast(b.internet_latency as double) as int_internet_latency,
>        cast(b.ran_latency as double) as int_ran_latency,
>        cast(b.http_latency as double) as int_http_latency,
>        (case when b.internet_latency='' then 1 else 0 end) as internet_latency_missing,
>        (case when b.ran_latency='' then 1 else 0 end) as ran_latency_missing,
>        (case when b.http_latency='' then 1 else 0 end) as http_latency_missing,
>        (cast(b.mean_throughput_ul as int) * cast(procedure_duration as int) / 1000) as total_up_bytes,
>        (cast(b.mean_throughput_dl as int) * cast(procedure_duration as int) / 1000) as total_dl_bytes,
>        cast(b.missing_packets_ul as int) as int_missing_packets_ul,
>        cast(b.missing_packets_dl as int) as int_missing_packets_dl
> from sample.dpi_large b
> left outer join sample.science_new a
> on b.cgi = regexp_replace(a.codigo_cgi_ecgi,'-','')
> where msisdn!='';
> Hive was heuristically selecting 1000 reducers and it was hanging at 99.97 
> percent on one reduce task. I then changed the above values to 3GB per 
> reducer and 500 reducers and started hitting this error.
> java.lang.RuntimeException: Hive Runtime Error while closing operators: Unable to rename output from: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_task_tmp.-ext-10001/_tmp.000003_0 to: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812_8390586541316719852-1/_tmp.-ext-10001/000003_0
>       at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:313)
>       at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:516)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>       at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename output from: hdfs://tlvcluster/tmp/hive-hadoop/hive_2014-03-01_03-14-36_812
> I have a 22-node cluster running CDH 4.3. Please help locate what the issue
> could be.
> *------------------------*
> Cheers !!!
> Siddharth Tiwari
> Have a refreshing day !!!
> "Every duty is holy, and devotion to duty is the highest form of worship of 
> God.” 
> "Maybe other people will try to limit me but I don't limit myself"
