Re: Need help on Spark UDF (Join) Performance tuning .

2014-07-18 Thread S Malligarjunan
Hello Experts, Appreciate your input highly, please suggest/ give me hint, what would be the issue here?   Thanks and Regards, Malligarjunan S.   On Thursday, 17 July 2014, 22:47, S Malligarjunan smalligarju...@yahoo.com wrote: Hello Experts, I am facing performance problem when I use

Re: Need help on Spark UDF (Join) Performance tuning .

2014-07-18 Thread Michael Armbrust
It's likely that since your UDF is a black box to hive's query optimizer that it must choose a less efficient join algorithm that passes all possible matches to your function for comparison. This will happen any time your UDF touches attributes from both sides of the join. In general you can

Need help on Spark UDF (Join) Performance tuning .

2014-07-17 Thread S Malligarjunan
Hello Experts, I am facing performance problem when I use the UDF function call. Please help me to tune the query. Please find the details below shark select count(*) from table1; OK 151096 Time taken: 7.242 seconds shark select count(*) from table2;  OK 938 Time taken: 1.273 seconds Without