Or, have you ever tried a broadcast join?
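To illustrate the idea, here is a toy single-process Python sketch of what a broadcast hash join does. It is not Spark code: the function name `broadcast_hash_join` and the sample rows are made up for illustration, and only the column names come from the query in this thread. The smaller table (sample) is indexed in memory once, and the larger table (db) is streamed against it, so the large side does not have to be shuffled.

```python
from collections import defaultdict


def broadcast_hash_join(large_rows, small_rows, key):
    """Join two lists of dicts on `key`, hashing the small side once."""
    # Build phase: index the small table by join key (the "broadcast" side).
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)
    # Probe phase: stream the large table and look up matches locally.
    for left in large_rows:
        for right in lookup.get(left[key], []):
            yield left, right


# Hypothetical sample rows; column names follow the query in this thread.
db = [
    {"name": "chr1", "startpoint": 10, "endpoint": 60, "piece": "A"},
    {"name": "chr2", "startpoint": 5, "endpoint": 40, "piece": "B"},
]
sample = [
    {"name": "chr1", "startpoint": 100},
    {"name": "chr3", "startpoint": 7},
]

joined = [
    (a["name"], a["piece"], b["startpoint"])
    for a, b in broadcast_hash_join(db, sample, "name")
]
print(joined)
```

In Spark SQL itself, as far as I know, the setting that governs automatic broadcasting is spark.sql.autoBroadcastJoinThreshold; whether it applies to your tables depends on whether size statistics are available for them.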
From: Cheng, Hao [mailto:hao.ch...@intel.com]
Sent: Tuesday, May 5, 2015 8:33 AM
To: luohui20...@sina.com; Olivier Girardot; user
Subject: RE: Reply: Re: sparksql running slow while joining 2 tables.
Can you print out the physical plan?
EXPLAIN SELECT xxx…
From: luohui20...@sina.com [mailto:luohui20...@sina.com]
Sent: Monday, May 4, 2015 9:08 PM
To: Olivier Girardot; user
Subject: Reply: Re: sparksql running slow while joining 2 tables.
Hi Olivier,
Spark 1.3.1, with Java 1.8.0_45.
I have attached 2 pictures.
It seems like a GC issue. I also tried different parameters, such as the
memory size of the driver and executor, the memory fraction, and Java opts,
but the issue still happens.
Thanks & Best regards!
罗辉 San.Luo
- Original Message -
From: Olivier Girardot <ssab...@gmail.com>
To: luohui20...@sina.com, user <user@spark.apache.org>
Subject: Re: sparksql running slow while joining 2 tables.
Date: May 4, 2015, 20:46
Hi,
What is your Spark version?
Regards,
Olivier.
On Mon, May 4, 2015 at 11:03, luohui20...@sina.com wrote:
Hi guys,
When I run a SQL query like
select a.name, a.startpoint, a.endpoint, a.piece from db a join sample b
on (a.name = b.name) where (b.startpoint a.startpoint + 25);
I find Spark SQL running slowly, taking minutes, which may be caused by very
long GC and shuffle times.
Table db is created from a 56 MB txt file, while table sample is 26 MB;
both are small.
My Spark cluster is a standalone pseudo-distributed cluster with an 8 GB
executor and a 4 GB driver.
Any advice? Thank you, guys.
Thanks & Best regards!
罗辉 San.Luo
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org