Thanks. I'm surprised to see such a big difference (4x); there could be something wrong in Spark (some contention between tasks).
On Fri, Sep 11, 2015 at 11:47 AM, Jesse F Chen <jfc...@us.ibm.com> wrote:
>
> @Davies...good question..
>
> > Just curious what the difference would be if you use 20 executors
> > and 20G of memory for each executor..
>
> So I tried the following combinations:
>
> (GB x # executors)    (query response time in secs)
> 20 x 20               415
> 10 x 40               230
>  5 x 80               141
>  4 x 100              128
>  2 x 200              104
>
> CPU utilization is high, so spreading more JVMs onto more vCores helps in
> this case. For other workloads where memory utilization outweighs CPU, I
> can see larger JVM sizes being more beneficial. It's for sure case by case.
>
> Seems the overhead for codegen and the scheduler is negligible.
>
>
> Davies Liu ---09/11/2015 10:41:23 AM---On Fri, Sep 11, 2015 at 10:31 AM,
> Jesse F Chen <jfc...@us.ibm.com> wrote:
>
> From: Davies Liu <dav...@databricks.com>
> To: Jesse F Chen/San Francisco/IBM@IBMUS
> Cc: "Cheng, Hao" <hao.ch...@intel.com>, Todd <bit1...@163.com>,
>     Michael Armbrust <mich...@databricks.com>,
>     "user@spark.apache.org" <user@spark.apache.org>
> Date: 09/11/2015 10:41 AM
> Subject: Re: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by 50%+
>     compared with spark 1.4.1 SQL
>
> ________________________________
>
> On Fri, Sep 11, 2015 at 10:31 AM, Jesse F Chen <jfc...@us.ibm.com> wrote:
> >
> > Thanks Hao!
> >
> > I tried your suggestion of setting
> > spark.shuffle.reduceLocality.enabled=false, and my initial tests showed
> > queries are on par between 1.5 and 1.4.1.
> >
> > Results:
> >
> > tpcds-query39b-141.out:query time: 129.106478631 sec
> > tpcds-query39b-150-reduceLocality-false.out:query time: 128.854284296 sec
> > tpcds-query39b-150.out:query time: 572.443151734 sec
> >
> > With the default spark.shuffle.reduceLocality.enabled=true, I am seeing
> > an across-the-board slowdown for the majority of the TPCDS queries.
> >
> > My test is on a bare-metal 20-node cluster. I ran my test as follows:
> >
> > /TestAutomation/spark-1.5/bin/spark-submit --master yarn-client
> >   --packages com.databricks:spark-csv_2.10:1.1.0 --name TPCDSSparkSQLHC
> >   --conf spark.shuffle.reduceLocality.enabled=false
> >   --executor-memory 4096m --num-executors 100
> >   --class org.apache.spark.examples.sql.hive.TPCDSSparkSQLHC
> >   /TestAutomation/databricks/spark-sql-perf-master/target/scala-2.10/tpcdssparksql_2.10-0.9.jar
> >   hdfs://rhel2.cisco.com:8020/user/bigsql/hadoopds100g
> >   /TestAutomation/databricks/spark-sql-perf-master/src/main/queries/jesse/query39b.sql
>
> Just curious what the difference would be if you use 20 executors and 20G
> of memory for each executor. Sharing the same JVM across tasks could
> reduce the overhead for codegen and JIT; it may also reduce the overhead
> of `reduceLocality` (it can be easier to schedule the tasks).
>
>
> > "Cheng, Hao" ---09/11/2015 01:00:28 AM---Can you confirm if the query
> > really run in the cluster mode? Not the local mode. Can you print the c
> >
> > From: "Cheng, Hao" <hao.ch...@intel.com>
> > To: Todd <bit1...@163.com>
> > Cc: Jesse F Chen/San Francisco/IBM@IBMUS, Michael Armbrust
> >     <mich...@databricks.com>, "user@spark.apache.org" <user@spark.apache.org>
> > Date: 09/11/2015 01:00 AM
> > Subject: RE: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by 50%+
> >     compared with spark 1.4.1 SQL
> >
> > ________________________________
> >
> > Can you confirm whether the query really runs in cluster mode, not local
> > mode? Can you print the call stack of the executor while the query is
> > running?
> >
> > BTW: spark.shuffle.reduceLocality.enabled is a configuration of Spark
> > core, not Spark SQL.
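For reference, a minimal way to sanity-check that question from the running
application itself; this is only a sketch, assuming the usual `sc` SparkContext
is in scope (e.g. in spark-shell), and the printed labels are illustrative:

    // Which master is the application actually bound to? "local[*]" would mean
    // the query never left the driver; "yarn-client" is what is expected here.
    println(s"master = ${sc.master}")

    // How many block managers have registered: the driver plus each live executor.
    println(s"registered block managers = ${sc.getExecutorMemoryStatus.size}")

The executor call stacks themselves can usually be grabbed from the Thread Dump
links on the Executors tab of the web UI while the query runs.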
> > From: Todd [mailto:bit1...@163.com]
> > Sent: Friday, September 11, 2015 3:39 PM
> > To: Todd
> > Cc: Cheng, Hao; Jesse F Chen; Michael Armbrust; user@spark.apache.org
> > Subject: Re:Re:RE: Re:RE: spark 1.5 SQL slows down dramatically by 50%+
> >     compared with spark 1.4.1 SQL
> >
> > I added the following two options:
> >
> >     spark.sql.planner.sortMergeJoin=false
> >     spark.shuffle.reduceLocality.enabled=false
> >
> > But it still performs the same as without setting them.
> >
> > One thing: on the Spark UI, when I click the SQL tab, it shows an empty
> > page with only the header title 'SQL'; there is no table showing queries
> > or execution plan information.
> >
> >
> > At 2015-09-11 14:39:06, "Todd" <bit1...@163.com> wrote:
> >
> > Thanks Hao.
> > Yes, it is still as slow as with SMJ. Let me try the option you suggested.
> >
> >
> > At 2015-09-11 14:34:46, "Cheng, Hao" <hao.ch...@intel.com> wrote:
> >
> > You mean the performance is still as slow as with SMJ in Spark 1.5?
> >
> > Can you set spark.shuffle.reduceLocality.enabled=false when you start the
> > spark-shell/spark-sql? It's a new feature in Spark 1.5, and it's true by
> > default, but we found it can degrade performance dramatically.
> >
> >
> > From: Todd [mailto:bit1...@163.com]
> > Sent: Friday, September 11, 2015 2:17 PM
> > To: Cheng, Hao
> > Cc: Jesse F Chen; Michael Armbrust; user@spark.apache.org
> > Subject: Re:RE: spark 1.5 SQL slows down dramatically by 50%+ compared with
> >     spark 1.4.1 SQL
> >
> > Thanks Hao for the reply.
> > I turned the sort merge join off; the physical plan is below, but the
> > performance is roughly the same as with it on...
> >
> > == Physical Plan ==
> > TungstenProject [ss_quantity#10,ss_list_price#12,ss_coupon_amt#19,ss_cdemo_sk#4,ss_item_sk#2,ss_promo_sk#8,ss_sold_date_sk#0]
> >  ShuffledHashJoin [ss_item_sk#2], [ss_item_sk#25], BuildRight
> >   TungstenExchange hashpartitioning(ss_item_sk#2)
> >    ConvertToUnsafe
> >     Scan ParquetRelation[hdfs://ns1/tmp/spark_perf/scaleFactor=30/useDecimal=true/store_sales][ss_promo_sk#8,ss_quantity#10,ss_cdemo_sk#4,ss_list_price#12,ss_coupon_amt#19,ss_item_sk#2,ss_sold_date_sk#0]
> >   TungstenExchange hashpartitioning(ss_item_sk#25)
> >    ConvertToUnsafe
> >     Scan ParquetRelation[hdfs://ns1/tmp/spark_perf/scaleFactor=30/useDecimal=true/store_sales][ss_item_sk#25]
> >
> > Code Generation: true
> >
> >
> > At 2015-09-11 13:48:23, "Cheng, Hao" <hao.ch...@intel.com> wrote:
> >
> > It is not a big surprise that SMJ is slower than HashJoin, as we do not
> > fully utilize the sorting yet; more details can be found at
> > https://issues.apache.org/jira/browse/SPARK-2926 .
> >
> > Anyway, can you disable the sort merge join with
> > "spark.sql.planner.sortMergeJoin=false" in Spark 1.5 and run the query
> > again? In our previous testing, sort merge join was about 20% slower.
> > I am not sure if anything else is slowing down the performance.
> >
> > Hao
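For reference, a minimal sketch of where the two settings discussed above would
be applied in a standalone driver; the application name is illustrative, and for
spark-shell/spark-sql the core setting can instead be passed with --conf at
launch, as in Jesse's spark-submit command earlier in the thread:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    // spark.shuffle.reduceLocality.enabled is a Spark core (scheduler) setting,
    // so it should be in place before the SparkContext is created.
    val conf = new SparkConf()
      .setAppName("tpcds-join-test")   // illustrative name
      .set("spark.shuffle.reduceLocality.enabled", "false")
    val sc = new SparkContext(conf)

    // The planner flag is a Spark SQL setting and can be toggled on the SQLContext.
    val sqlContext = new HiveContext(sc)
    sqlContext.setConf("spark.sql.planner.sortMergeJoin", "false")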
> > From: Jesse F Chen [mailto:jfc...@us.ibm.com]
> > Sent: Friday, September 11, 2015 1:18 PM
> > To: Michael Armbrust
> > Cc: Todd; user@spark.apache.org
> > Subject: Re: spark 1.5 SQL slows down dramatically by 50%+ compared with
> >     spark 1.4.1 SQL
> >
> > Could this be a build issue (i.e., sbt package)?
> >
> > If I run the same jar built for 1.4.1 on 1.5, I am seeing a large
> > regression too in queries (all other things identical)...
> >
> > I am curious: to build 1.5 (when it isn't released yet), what do I need
> > to do with the build.sbt file?
> >
> > Any special parameters I should be using to make sure I load the latest
> > Hive dependencies?
> >
> > Michael Armbrust ---09/10/2015 11:07:28 AM---I've been running TPC-DS
> > SF=1500 daily on Spark 1.4.1 and Spark 1.5 on S3, so this is surprising. I
> >
> > From: Michael Armbrust <mich...@databricks.com>
> > To: Todd <bit1...@163.com>
> > Cc: "user@spark.apache.org" <user@spark.apache.org>
> > Date: 09/10/2015 11:07 AM
> > Subject: Re: spark 1.5 SQL slows down dramatically by 50%+ compared with
> >     spark 1.4.1 SQL
> >
> > ________________________________
> >
> > I've been running TPC-DS SF=1500 daily on Spark 1.4.1 and Spark 1.5 on S3,
> > so this is surprising. In my experiments Spark 1.5 is either the same as
> > or faster than 1.4, with only small exceptions. A few thoughts:
> >
> >  - 600 partitions is probably way too many for 6G of data.
> >  - Providing the output of explain for both runs would be helpful whenever
> >    reporting performance changes.
> >
> > On Thu, Sep 10, 2015 at 1:24 AM, Todd <bit1...@163.com> wrote:
> >
> > Hi,
> >
> > I am using data generated with spark-sql-perf
> > (https://github.com/databricks/spark-sql-perf) to test Spark SQL
> > performance (Spark on YARN, with 10 nodes) with the following code.
> > The table store_sales is about 90 million records, 6G in size.
> >
> > val outputDir = "hdfs://tmp/spark_perf/scaleFactor=30/useDecimal=true/store_sales"
> > val name = "store_sales"
> > sqlContext.sql(
> >   s"""
> >     |CREATE TEMPORARY TABLE ${name}
> >     |USING org.apache.spark.sql.parquet
> >     |OPTIONS (
> >     |  path '${outputDir}'
> >     |)
> >   """.stripMargin)
> >
> > val sql = """
> >   |select
> >   |  t1.ss_quantity,
> >   |  t1.ss_list_price,
> >   |  t1.ss_coupon_amt,
> >   |  t1.ss_cdemo_sk,
> >   |  t1.ss_item_sk,
> >   |  t1.ss_promo_sk,
> >   |  t1.ss_sold_date_sk
> >   |from store_sales t1 join store_sales t2 on t1.ss_item_sk = t2.ss_item_sk
> >   |where
> >   |  t1.ss_sold_date_sk between 2450815 and 2451179
> > """.stripMargin
> >
> > val df = sqlContext.sql(sql)
> > df.rdd.foreach(row => Unit)
> >
> > With 1.4.1, I can finish the query in 6 minutes, but I need 10+ minutes
> > with 1.5.
> >
> > The configurations are basically the same, since I copied the
> > configuration from 1.4.1 to 1.5:
> >
> > sparkVersion                                   1.4.1   1.5.0
> > scaleFactor                                    30      30
> > spark.sql.shuffle.partitions                   600     600
> > spark.sql.sources.partitionDiscovery.enabled   true    true
> > spark.default.parallelism                      200     200
> > spark.driver.memory                            4G      4G
> > spark.executor.memory                          4G      4G
> > spark.executor.instances                       10      10
> > spark.shuffle.consolidateFiles                 true    true
> > spark.storage.memoryFraction                   0.4     0.4
> > spark.executor.cores                           3       3
> >
> > I am not sure what is going wrong; any ideas?
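As a closing note on Michael's request for plan output, a small sketch that
reuses the `sqlContext` and `sql` values from the snippet above; the lower
shuffle-partition value is only an illustrative starting point, not a tuned
recommendation:

    // Print the parsed, analyzed, optimized and physical plans for the join,
    // which makes 1.4.1-vs-1.5 comparisons much easier to diagnose.
    sqlContext.sql(sql).explain(true)

    // 600 shuffle partitions is likely far too many for ~6G of input; something
    // closer to the total executor core count (10 executors x 3 cores here) is a
    // reasonable value to start experimenting with.
    sqlContext.setConf("spark.sql.shuffle.partitions", "60")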