Thanks Michael, we'll give your options a try and aim for a three-way 
comparison: 2.0.0 tuned vs 2.0.0 default vs 1.6.2 default. For future 
reference, the defaults in Spark 2 RC2 look to be:

spark.sql.shuffle.partitions: 200
spark.sql.tungsten.enabled: true
spark.executor.memory: 1 GB (we set this to 18 GB)
spark.kryoserializer.buffer.max: 64m
spark.sql.codegen.wholeStage: on, I think; we turned it off while fixing a bug
spark.memory.offHeap.enabled: false
spark.memory.offHeap.size: 0
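
To double-check which values are actually in effect for each run, something 
like the following can dump the resolved configuration from the 2.0.0 session 
(a minimal sketch; it assumes a SparkSession named `spark` is already 
available):

    // Print the configuration the running session actually resolved, so the
    // "default" and "tuned" runs can be compared from their logs.
    spark.conf.getAll.toSeq.sorted.foreach { case (k, v) => println(s"$k=$v") }

    // Spot-check the settings discussed in this thread.
    println(spark.conf.get("spark.sql.shuffle.partitions"))
    println(spark.sparkContext.getConf.get("spark.executor.memory", "1g"))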

Cheers,




From:   Michael Allman <mich...@videoamp.com>
To:     Adam Roberts/UK/IBM@IBMGB
Cc:     dev <dev@spark.apache.org>
Date:   08/07/2016 17:05
Subject:        Re: Spark 2.0.0 performance; potential large Spark core 
regression



Here are some settings we use for some very large GraphX jobs. These are 
based on using EC2 c3.8xl workers:

    .set("spark.sql.shuffle.partitions", "1024")
    .set("spark.sql.tungsten.enabled", "true")
    .set("spark.executor.memory", "24g")
    .set("spark.kryoserializer.buffer.max","1g")
    .set("spark.sql.codegen.wholeStage", "true")
    .set("spark.memory.offHeap.enabled", "true")
    .set("spark.memory.offHeap.size", "25769803776") // 24 GB

Some of these are in fact default configurations. Some are not.
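
For context, settings like these would typically be applied when building the 
session, roughly as follows (an illustrative sketch only, not the exact job 
code; the app name is made up):

    import org.apache.spark.sql.SparkSession

    // Illustrative only: apply the tuning options above when constructing the session.
    val spark = SparkSession.builder()
      .appName("graphx-job")                              // made-up name
      .config("spark.sql.shuffle.partitions", "1024")
      .config("spark.executor.memory", "24g")
      .config("spark.kryoserializer.buffer.max", "1g")
      .config("spark.memory.offHeap.enabled", "true")
      .config("spark.memory.offHeap.size", "25769803776") // 24 GB
      .getOrCreate()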

Michael


On Jul 8, 2016, at 9:01 AM, Michael Allman <mich...@videoamp.com> wrote:

Hi Adam,

From our experience, we've found the default Spark 2.0 configuration to be 
highly suboptimal. I don't know if this affects your benchmarks, but I 
would consider running some tests with tuned and alternate configurations.

Michael


On Jul 8, 2016, at 8:58 AM, Adam Roberts <arobe...@uk.ibm.com> wrote:

Hi Michael, the two Spark configuration files aren't very exciting:

spark-env.sh 
Same as the template apart from a JAVA_HOME setting 

spark-defaults.conf 
spark.io.compression.codec lzf 

config.py has the Spark home set and runs Spark in standalone mode; we run 
and prep the Spark tests only, with an 8 GB driver, 16 GB executor memory, 
Kryo serialization, a 0.66 memory fraction, and 100 trials. The equivalent 
properties are sketched below.
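
For reference, those settings correspond roughly to the following 
spark-defaults.conf properties (a sketch, not our actual config.py; the 
memory-fraction key is spark.storage.memoryFraction on 1.x and 
spark.memory.fraction on 2.x):

    spark.driver.memory            8g
    spark.executor.memory          16g
    spark.serializer               org.apache.spark.serializer.KryoSerializer
    spark.storage.memoryFraction   0.66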

We can post the 1.6.2 comparison early next week; we'll run lots of 
iterations over the weekend once we get the dedicated time again.

Cheers, 





From:        Michael Allman <mich...@videoamp.com> 
To:        Adam Roberts/UK/IBM@IBMGB 
Cc:        dev <dev@spark.apache.org> 
Date:        08/07/2016 16:44 
Subject:        Re: Spark 2.0.0 performance; potential large Spark core 
regression 



Hi Adam, 

Do you have your spark confs and your spark-env.sh somewhere where we can 
see them? If not, can you make them available? 

Cheers, 

Michael 

On Jul 8, 2016, at 3:17 AM, Adam Roberts <arobe...@uk.ibm.com> wrote: 

Hi, we've been testing the performance of Spark 2.0 compared to previous 
releases. Unfortunately there are no Spark 2.0-compatible versions of 
HiBench and SparkPerf apart from those I'm working on (see 
https://github.com/databricks/spark-perf/issues/108).

With the Spark 2.0 version of SparkPerf we've noticed a 30% geomean 
regression at a very small scale factor, so we've generated a couple of 
profiles comparing 1.5.2 vs 2.0.0 (same JDK version and same platform). 
We will gather a 1.6.2 comparison and increase the scale factor. 

Has anybody noticed a similar problem? My changes for SparkPerf and Spark 
2.0 are very limited and AFAIK don't interfere with Spark core 
functionality, so any feedback on the changes would be much appreciated 
and welcome; I'd much prefer it if my changes were the problem. 

A summary follows for your convenience (this matches what I've mentioned 
on the SparkPerf issue above): 

1. spark-perf/config/config.py : SCALE_FACTOR=0.05
No. Of Workers: 1
Executor per Worker : 1
Executor Memory: 18G
Driver Memory : 8G
Serializer: kryo 

2. $SPARK_HOME/conf/spark-defaults.conf: executor Java Options: 
-Xdisableexplicitgc -Xcompressedrefs 
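
For anyone reproducing this, those IBM J9 options would be passed to the 
executors along these lines in spark-defaults.conf (a sketch; the exact file 
contents aren't shown here):

    spark.executor.extraJavaOptions   -Xdisableexplicitgc -Xcompressedrefs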

Main changes I made for the benchmark itself (a couple are sketched after 
this list): 
Use Scala 2.11.8 and Spark 2.0.0 RC2 on our local filesystem 
MLAlgorithmTests use Vectors.fromML 
For streaming-tests in HdfsRecoveryTest we use wordStream.foreachRDD not 
wordStream.foreach 
KVDataTest uses awaitTerminationOrTimeout in a SparkStreamingContext 
instead of awaitTermination 
Trivial: we use compact not compact.render for outputting json
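
For illustration only (this is not the actual spark-perf patch, and ssc below 
is just a placeholder StreamingContext name), the vector conversion and the 
streaming changes look roughly like this on Spark 2.0:

    import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
    import org.apache.spark.ml.linalg.{Vectors => NewVectors}

    // mllib code that still expects old-style vectors can convert via
    // Vectors.fromML, which was added in Spark 2.0.
    val mlVec = NewVectors.dense(1.0, 2.0, 3.0)
    val mllibVec = OldVectors.fromML(mlVec)

    // DStream.foreach is gone in 2.0, so use foreachRDD instead, e.g.:
    // wordStream.foreachRDD { rdd => rdd.count() }

    // Bounded wait instead of blocking forever on the StreamingContext:
    // ssc.awaitTerminationOrTimeout(timeoutMillis)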

In Spark 2.0 the top five methods where we spend our time are as follows; 
the percentage is how much of the overall processing time was spent in 
that particular method: 
1.        AppendOnlyMap.changeValue 44% 
2.        SortShuffleWriter.write 19% 
3.        SizeTracker.estimateSize 7.5% 
4.        SizeEstimator.estimate 5.36% 
5.        Range.foreach 3.6% 

and in 1.5.2 the top five methods are: 
1.        AppendOnlyMap.changeValue 38% 
2.        ExternalSorter.insertAll 33% 
3.        Range.foreach 4% 
4.        SizeEstimator.estimate 2% 
5.        SizeEstimator.visitSingleObject 2% 

I see the following scores; for each test the 1.5.2 time is listed first, 
then the 2.0.0 time:
scheduling throughput: 5.2s vs 7.08s
agg by key: 0.72s vs 1.01s
agg by key int: 0.93s vs 1.19s
agg by key naive: 1.88s vs 2.02s
sort by key: 0.64s vs 0.8s
sort by key int: 0.59s vs 0.64s
scala count: 0.09s vs 0.08s
scala count w fltr: 0.31s vs 0.47s 

This is only running the Spark core tests (scheduling throughput through 
scala-count-w-filtr, including everything in between). 

Cheers, 

