Roadblock -- stuck for 10 days :( how come same hive udf giving different results in spark and hive

Alex Tue, 31 Jan 2017 06:34:31 -0800

Hi All,

I am trying to run a Hive UDF in spark-sql, and it returns different rows
in Hive and in Spark.


My UDF query looks something like this:

select col1,col2,col3, sum(col4) col4, sum(col5) col5,Group_name
from
(select inline(myudf('cons1',record))
from table1) test group by col1,col2,col3;

The results are the same up to this point: if I run only the subquery
below, Hive and Spark give the same output:

select inline(myudf('cons1',record))
from table1;

But if I run the entire query, Hive and Spark give different outputs:


select col1,col2,col3, sum(col4) col4, sum(col5) col5,Group_name
from
(select inline(myudf('cons1',record))
from table1) test group by col1,col2,col3;

how come? :(
