Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-18 Thread Vadim Semenov
You can create a super class "FunSuiteWithSparkContext" that's going to create a Spark session, Spark context, and SQLContext with all the desired properties. Then you add the class to all the relevant test suites, and that's pretty much it. The other option is to pass it as a VM
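A minimal sketch of such a base suite, assuming ScalaTest's FunSuite and Spark 2.x (the trait body, the `local[2]` master, and the partition count of 4 are illustrative, not prescribed by the post):

```scala
import org.apache.spark.sql.SparkSession
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// Base suite that builds one local SparkSession with test-friendly settings.
// Concrete test suites extend this and reuse `spark` / `sc` / `sqlContext`.
abstract class FunSuiteWithSparkContext extends FunSuite with BeforeAndAfterAll {

  @transient protected var spark: SparkSession = _

  override def beforeAll(): Unit = {
    super.beforeAll()
    spark = SparkSession.builder()
      .master("local[2]")
      .appName("unit-tests")
      // Keep shuffles small: the default of 200 partitions is overkill locally.
      .config("spark.sql.shuffle.partitions", "4")
      .getOrCreate()
  }

  // Convenience accessors for suites that still use the older APIs.
  protected def sc = spark.sparkContext
  protected def sqlContext = spark.sqlContext

  override def afterAll(): Unit = {
    try spark.stop()
    finally super.afterAll()
  }
}
```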

Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-16 Thread Femi Anthony
How are you specifying it, as an option to spark-submit? On Sat, Sep 16, 2017 at 12:26 PM, Akhil Das wrote: > spark.sql.shuffle.partitions is still used I believe. I can see it in the > code >
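For reference, both delivery mechanisms mentioned in the thread are standard: SparkConf loads any "spark."-prefixed JVM system property by default, so the setting can reach a test JVM as a VM option, or reach a job via spark-submit's --conf flag (the value 4 and the jar name below are illustrative):

```shell
# As a JVM system property on the test JVM (picked up by SparkConf):
sbt -Dspark.sql.shuffle.partitions=4 test

# As a --conf option to spark-submit for a packaged application:
spark-submit --conf spark.sql.shuffle.partitions=4 my-app.jar
```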

Re: Configuration for unit testing and sql.shuffle.partitions

2017-09-16 Thread Akhil Das
spark.sql.shuffle.partitions is still used, I believe. I can see it in the code and in the documentation page.

Configuration for unit testing and sql.shuffle.partitions

2017-09-12 Thread peay
Hello, I am running unit tests with Spark DataFrames, and I am looking for configuration tweaks that would make tests faster. Usually, I use a local[2] or local[4] master. Something that has been bothering me is that most of my stages end up using 200 partitions, independently of whether I
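The 200 partitions come from the default of spark.sql.shuffle.partitions, which applies to every shuffle in Spark SQL regardless of the master's core count. A hedged sketch of lowering it when building the test session (the exact values and the spark.ui.enabled tweak are illustrative suggestions, not from the post):

```scala
import org.apache.spark.sql.SparkSession

// Build a local session tuned for fast unit tests.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("fast-tests")
  // Shuffle stages default to 200 partitions; a handful is enough locally.
  .config("spark.sql.shuffle.partitions", "4")
  // The web UI adds startup cost and is rarely useful in tests.
  .config("spark.ui.enabled", "false")
  .getOrCreate()
```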