Opening Dynamic Scaling Executors on Yarn
Hi all,

SPARK-3174 (https://issues.apache.org/jira/browse/SPARK-3174) is a useful feature to save resources on YARN. We want to enable this feature on our YARN cluster.

I have a question about the version of the shuffle service. I'm now using the spark-1.5.1 shuffle service. If I want to upgrade to spark-1.6.0, should I replace the shuffle service jar and restart all the namenodes on YARN?

Thanks a lot.

Mars
RE: Opening Dynamic Scaling Executors on Yarn
Is it possible to support both spark-1.5.1 and spark-1.6.0 on one YARN cluster?

From: Saisai Shao [mailto:sai.sai.s...@gmail.com]
Sent: Monday, December 28, 2015 2:29 PM
To: Jeff Zhang
Cc: 顾亮亮; user@spark.apache.org; 刘骋昺
Subject: Re: Opening Dynamic Scaling Executors on Yarn

Replacing all the shuffle jars and restarting the NodeManagers is enough; there is no need to restart the NameNode.

On Mon, Dec 28, 2015 at 2:05 PM, Jeff Zhang <zjf...@gmail.com> wrote:
See http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation

--
Best Regards
Jeff Zhang
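For readers following the docs link above, this is roughly the configuration the Spark 1.x job-scheduling and running-on-YARN documentation describes for the external shuffle service plus dynamic allocation. The min/max executor counts below are illustrative, not values from this thread.

```xml
<!-- yarn-site.xml on every NodeManager: register Spark's auxiliary shuffle
     service (the jar replaced during the upgrade discussed above). -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

```properties
# spark-defaults.conf: enable dynamic allocation against that shuffle service.
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   1
spark.dynamicAllocation.maxExecutors   50
```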
Parquet partition sizes are different when using DataFrame's save append function
Hi,

When I use DataFrame's save append function, I find that the parquet partition sizes are very different. part-r-00001 to part-r-00021 were generated the first time the save append function was called; part-r-00022 to part-r-00042 were generated the second time. As you can see, the part-r-00001 to part-r-00021 files are about 200M each, while the part-r-00022 to part-r-00042 files are about 700M each. But the source data is the same, which confuses me.

    -rw-r--r-- 1 sysplatform sysplatform 2.0K Apr 8 10:01 _common_metadata
    -rw-r--r-- 1 sysplatform sysplatform 392K Apr 8 10:01 _metadata
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00001.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:44 part-r-00002.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00003.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00004.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00005.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00006.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00007.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00008.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00009.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00010.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00011.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00012.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00013.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00014.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00015.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00016.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00017.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00018.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00019.parquet
    -rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00020.parquet
    -rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00021.parquet
    -rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:01 part-r-00022.parquet
    -rw-r--r-- 1 sysplatform sysplatform 723M Apr 8 10:00 part-r-00023.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00024.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:00 part-r-00025.parquet
    -rw-r--r-- 1 sysplatform sysplatform 717M Apr 8 10:00 part-r-00026.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:00 part-r-00027.parquet
    -rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:01 part-r-00028.parquet
    -rw-r--r-- 1 sysplatform sysplatform 725M Apr 8 10:01 part-r-00029.parquet
    -rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:00 part-r-00030.parquet
    -rw-r--r-- 1 sysplatform sysplatform 725M Apr 8 10:01 part-r-00031.parquet
    -rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:01 part-r-00032.parquet
    -rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:00 part-r-00033.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00034.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00035.parquet
    -rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:00 part-r-00036.parquet
    -rw-r--r-- 1 sysplatform sysplatform 717M Apr 8 10:00 part-r-00037.parquet
    -rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:01 part-r-00038.parquet
    -rw-r--r-- 1 sysplatform sysplatform 722M Apr 8 10:01 part-r-00039.parquet
    -rw-r--r-- 1 sysplatform sysplatform 722M Apr 8 10:00 part-r-00040.parquet
    -rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00041.parquet
    -rw-r--r-- 1 sysplatform sysplatform 723M Apr 8 10:01 part-r-00042.parquet
    -rw-r--r-- 1 sysplatform sysplatform    0 Apr 8 10:01 _SUCCESS
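One plausible explanation (an assumption, not confirmed in this thread) is that each part file corresponds to one partition of the DataFrame at write time, so two appends run with different parallelism produce differently sized files even from identical source data. A minimal sketch of pinning the partition count before each append; the count (21) and output path are hypothetical:

```scala
import org.apache.spark.sql.SaveMode

// Each output part file maps to one DataFrame partition, so fixing the
// partition count before every append keeps file sizes comparable across runs.
// The partition count (21) and path are illustrative, not from the thread.
df.repartition(21)
  .write
  .mode(SaveMode.Append)
  .parquet("/path/to/output")
```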
Question about TTL with TorrentBroadcastFactory in Spark-1.2.0
Hi All,

I am facing a problem when using TTL with TorrentBroadcastFactory in Spark-1.2.0. My code is as follows:

    val conf = new SparkConf().
      setAppName("TTL_Broadcast_vars").
      setMaster("local").
      //set("spark.broadcast.factory", "org.apache.spark.broadcast.HttpBroadcastFactory").
      set("spark.cleaner.ttl", "2")
    val sc = new SparkContext(conf)
    val data = "TTL_Broadcast_vars"
    val bData = sc.broadcast(data)
    sc.parallelize(1 to 3, 3).map(v => {
      Thread.sleep(4 * 1000)
      bData.value
    }).collect().foreach(println)

I got the following error message:

    java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_0_piece0 of broadcast_0
        at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003)
        at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
        at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
        at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
        at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
        at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:17)
        at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:15)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
        at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
        at scala.collection.AbstractIterator.to(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
        at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
        at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
        at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
        at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)

But if I use HttpBroadcastFactory, it runs successfully. I'm wondering whether this is expected behavior of TorrentBroadcastFactory or a bug?

Mars Gu
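A common workaround, offered here as a sketch rather than an official fix from this thread: the TTL cleaner can remove broadcast blocks while slow tasks still need them, so instead of a global spark.cleaner.ttl one can manage the broadcast's lifetime explicitly and release it only after the jobs that use it complete.

```scala
// Sketch: explicit broadcast lifecycle management instead of spark.cleaner.ttl.
val bData = sc.broadcast(data)

val result = sc.parallelize(1 to 3, 3).map { v =>
  Thread.sleep(4 * 1000)
  bData.value // still registered, so the pieces can be re-fetched by executors
}.collect()
result.foreach(println)

// Remove the cached copies on the executors only once no job needs them.
bData.unpersist(blocking = true)
```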