Hi, all What's the size of three tables? Also, what's the performance difference of the two queries?
On Tue, Mar 22, 2016 at 3:53 PM, Rishi Mishra <rmis...@snappydata.io> wrote: > What we have observed so far is Spark picks join order in the same order > as tables in from clause is specified. Sometimes reordering benefits the > join query. > This can be an inbuilt optimization in Spark. But again its not going to > be straight forward, where rather than table size, selectivity of Join is > important. > Probably some kind of heuristic might help. > > Regards, > Rishitesh Mishra, > SnappyData . (http://www.snappydata.io/) > > https://in.linkedin.com/in/rishiteshmishra > > On Mon, Mar 21, 2016 at 11:18 PM, gtinside <gtins...@gmail.com> wrote: > >> More details : >> >> Execution plan for Original query >> select distinct pge.portfolio_code >> from table1 pge join table2 p >> on p.perm_group = pge.anc_port_group >> join table3 uge >> on p.user_group=uge.anc_user_group >> where uge.user_name = 'user' and p.perm_type = 'TEST' >> >> == Physical Plan == >> TungstenAggregate(key=[portfolio_code#14119], functions=[], >> output=[portfolio_code#14119]) >> TungstenExchange hashpartitioning(portfolio_code#14119) >> TungstenAggregate(key=[portfolio_code#14119], functions=[], >> output=[portfolio_code#14119]) >> TungstenProject [portfolio_code#14119] >> BroadcastHashJoin [user_group#13665], [anc_user_group#13658], >> BuildRight >> TungstenProject [portfolio_code#14119,user_group#13665] >> BroadcastHashJoin [anc_port_group#14117], [perm_group#13667], >> BuildRight >> ConvertToUnsafe >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117] >> ConvertToUnsafe >> Project [user_group#13665,perm_group#13667] >> Filter (perm_type#13666 = TEST) >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][user_group#13665,perm_group#13667,perm_type#13666] >> ConvertToUnsafe >> Project [anc_user_group#13658] >> Filter (user_name#13659 = user) >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659] >> >> >> >> Execution plan for optimized query >> select distinct pge.portfolio_code >> from table1 uge, table2 p, table3 pge >> where uge.user_name = 'user' and p.perm_type = 'TEST' >> and p.perm_group = pge.anc_port_group >> and p.user_group=uge.anc_user_group >> >> == Physical Plan == >> TungstenAggregate(key=[portfolio_code#14119], functions=[], >> output=[portfolio_code#14119]) >> TungstenExchange hashpartitioning(portfolio_code#14119) >> TungstenAggregate(key=[portfolio_code#14119], functions=[], >> output=[portfolio_code#14119]) >> TungstenProject [portfolio_code#14119] >> BroadcastHashJoin [perm_group#13667], [anc_port_group#14117], >> BuildRight >> TungstenProject [perm_group#13667] >> BroadcastHashJoin [anc_user_group#13658], [user_group#13665], >> BuildRight >> ConvertToUnsafe >> Project [anc_user_group#13658] >> Filter (user_name#13659 = user) >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659] >> ConvertToUnsafe >> Project [perm_group#13667,user_group#13665] >> Filter (perm_type#13666 = TEST) >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][perm_group#13667,user_group#13665,perm_type#13666] >> ConvertToUnsafe >> Scan >> >> ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117] >> >> >> >> >> >> >> >> >> >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Optimization-tp26548p26553.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> >> > -- --- Takeshi Yamamuro