You can use EXPLAIN EXTENDED SELECT ….

From: luohui20...@sina.com [mailto:luohui20...@sina.com]
Sent: Tuesday, May 05, 2015 9:52 AM
To: Cheng, Hao; Olivier Girardot; user
Subject: Re: RE: Re: Re: sparksql running slow while joining_2_tables.

As far as I know, broadcast join is enabled automatically through spark.sql.autoBroadcastJoinThreshold; refer to
http://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options

And how do I check my app's physical plan, and other things like the optimized plan, executable plan, etc.?

Thanks

--------------------------------
Thanks&Best regards!
罗辉 San.Luo

----- Original Message -----
From: "Cheng, Hao" <hao.ch...@intel.com>
To: "Cheng, Hao" <hao.ch...@intel.com>, "luohui20...@sina.com" <luohui20...@sina.com>, Olivier Girardot <ssab...@gmail.com>, user <user@spark.apache.org>
Subject: RE: Re: Re: sparksql running slow while joining_2_tables.
Date: May 5, 2015, 08:38

Or, have you tried a broadcast join?

From: Cheng, Hao [mailto:hao.ch...@intel.com]
Sent: Tuesday, May 5, 2015 8:33 AM
To: luohui20...@sina.com; Olivier Girardot; user
Subject: RE: Re: Re: sparksql running slow while joining 2 tables.

Can you print out the physical plan?

EXPLAIN SELECT xxx…

From: luohui20...@sina.com [mailto:luohui20...@sina.com]
Sent: Monday, May 4, 2015 9:08 PM
To: Olivier Girardot; user
Subject: Re: Re: sparksql running slow while joining 2 tables.

Hi Olivier,

Spark 1.3.1, with Java 1.8.0.45; I have added 2 pics. It seems like a GC issue. I also tried different parameters such as the driver and executor memory size, memory fraction, and Java opts, but the issue still happens.

--------------------------------
Thanks&Best regards!
罗辉 San.Luo

----- Original Message -----
From: Olivier Girardot <ssab...@gmail.com>
To: luohui20...@sina.com, user <user@spark.apache.org>
Subject: Re: sparksql running slow while joining 2 tables.
Date: May 4, 2015, 20:46

Hi,
What is your Spark version?
Regards,
Olivier.

On Mon, May 4, 2015 at 11:03, <luohui20...@sina.com> wrote:

Hi guys,

When I run a SQL query like

"select a.name, a.startpoint, a.endpoint, a.piece from db a join sample b on (a.name = b.name) where (b.startpoint > a.startpoint + 25);"

I found Spark SQL running slowly, taking minutes, which may be caused by very long GC and shuffle times.

Table db is created from a 56 MB txt file, while table sample is 26 MB; both are small. My Spark cluster is a standalone pseudo-distributed cluster with an 8g executor and a 4g driver.

Any advice? Thank you, guys.

--------------------------------
Thanks&Best regards!
罗辉 San.Luo
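
The two suggestions made in the replies, printing the plan and trying a broadcast join, could be tried on the query from the original question roughly as follows. This is only a sketch. The threshold value 104857600 (100 MB) is an illustrative assumption that does not come from the thread; it is simply larger than both tables (56 MB and 26 MB), whereas the configuration guide linked in the thread lists a default of 10485760 (10 MB). Whether the planner then actually chooses a broadcast join also depends on it having a size estimate for the table, so the EXPLAIN output is the thing to check.

```sql
-- Raise the size limit (in bytes) under which a table may be broadcast.
-- 104857600 = 100 MB is an assumed, illustrative value.
SET spark.sql.autoBroadcastJoinThreshold=104857600;

-- Print the parsed, analyzed, and optimized logical plans
-- together with the physical plan for the slow query.
EXPLAIN EXTENDED
SELECT a.name, a.startpoint, a.endpoint, a.piece
FROM db a
JOIN sample b ON (a.name = b.name)
WHERE (b.startpoint > a.startpoint + 25);
```

Submit each statement on its own (for example through sqlContext.sql(...) or the spark-sql shell); the -- lines are annotations for the reader. In the physical plan, a broadcast join operator such as BroadcastHashJoin in place of a shuffled join such as ShuffledHashJoin would suggest that the setting took effect.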