Hi, 

We have an issue where a windowing function query never completes when
run against a large dataset (more than 25,000 rows). The single reducer
never exits and appears to be stuck in an infinite loop.
I watched the reducer counters, and they did not change during the 6 hours
the job was stuck.

When the dataset is small (fewer than 25K rows), the query runs fine.

Is there any workaround for this issue? We tested against
Hive 0.11, 0.12, and 0.13, and the result is the same on all three.

create table window_function_fail
as
select a.*,
       sum(case when bprice is not null then 1 else 0 end)
         over (partition by date, name
               order by otime, bprice, aprice desc
               rows between unbounded preceding and current row) bidpid
from large_table a;

create table large_table(
  date    string,
  name    string,
  stime   string,
  bprice  decimal,
  aprice  decimal,
  otime   double
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' stored as textfile;

Thanks in advance.


