Re: How to control spark.sql.shuffle.partitions per query

2015-09-23 Thread Ted Yu
Please take a look at the following, for example:
sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveThriftServer2Suites.scala

Search for spark.sql.shuffle.partitions and SQLConf.SHUFFLE_PARTITIONS.key

FYI

On Wed, Sep 23, 2015 at 12:42 AM, tridib

How to control spark.sql.shuffle.partitions per query

2015-09-23 Thread tridib
I am having GC issues with the default value of spark.sql.shuffle.partitions (200). When I increase it to 2000, the shuffle join works fine. I want to use different values for spark.sql.shuffle.partitions, depending on data volume, for different queries that are fired from the same Spark SQL context. Thanks
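
One way to get a per-query value from a single context is to set the property just before the query runs and restore it afterwards. The following is only a minimal sketch, not something taken from the thread: the helper name shuffle_partitions and the fallback of "200" are assumptions made for illustration, and the only calls it relies on are setConf(key, value) and getConf(key, defaultValue) on the context object. As far as I know the setting is read when the query is actually planned and executed, so the action (collect, count, a write) should happen inside the with block.

```python
from contextlib import contextmanager

# Property named in the question; its default is 200.
SHUFFLE_KEY = "spark.sql.shuffle.partitions"


@contextmanager
def shuffle_partitions(sql_context, num_partitions, default="200"):
    """Temporarily override the shuffle partition count on sql_context.

    Hypothetical helper: sql_context is assumed to expose
    getConf(key, defaultValue) and setConf(key, value).
    """
    # Remember the current value so it can be restored afterwards.
    previous = sql_context.getConf(SHUFFLE_KEY, default)
    # Configuration values are passed as strings.
    sql_context.setConf(SHUFFLE_KEY, str(num_partitions))
    try:
        yield sql_context
    finally:
        # Restore the earlier value even if the query raised an error.
        sql_context.setConf(SHUFFLE_KEY, previous)


# Usage sketch (table names are placeholders):
#
# with shuffle_partitions(sqlContext, 2000) as ctx:
#     rows = ctx.sql("SELECT ... FROM big_a JOIN big_b ON ...").collect()
#
# with shuffle_partitions(sqlContext, 50) as ctx:
#     rows = ctx.sql("SELECT ... FROM small_a JOIN small_b ON ...").collect()
```

The same property can also be changed from SQL with a statement of the form SET spark.sql.shuffle.partitions=2000 issued before the query. Note that the setting is shared by the whole context, so queries submitted concurrently from other threads would see whichever value is in effect at the time.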