Opening Dynamic Scaling Executors on Yarn

2015-12-27 Thread
Hi all,

SPARK-3174 (https://issues.apache.org/jira/browse/SPARK-3174) is a useful 
feature to save resources on yarn.
We want to open this feature on our yarn cluster.
I have a question about the version of shuffle service.

I’m now using spark-1.5.1 (shuffle service).
If I want to upgrade to spark-1.6.0, should I replace the shuffle service jar 
and restart all the namenode on yarn ?

Thanks a lot.

Mars



RE: Opening Dynamic Scaling Executors on Yarn

2015-12-27 Thread
Is it possible to support both spark-1.5.1 and spark-1.6.0 on one yarn cluster?

From: Saisai Shao [mailto:sai.sai.s...@gmail.com]
Sent: Monday, December 28, 2015 2:29 PM
To: Jeff Zhang
Cc: 顾亮亮; user@spark.apache.org; 刘骋昺
Subject: Re: Opening Dynamic Scaling Executors on Yarn

Replace all the shuffle jars and restart the NodeManager is enough, no need to 
restart NN.

On Mon, Dec 28, 2015 at 2:05 PM, Jeff Zhang 
<zjf...@gmail.com<mailto:zjf...@gmail.com>> wrote:
See 
http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation



On Mon, Dec 28, 2015 at 2:00 PM, 顾亮亮 
<guliangli...@qiyi.com<mailto:guliangli...@qiyi.com>> wrote:
Hi all,

SPARK-3174 (https://issues.apache.org/jira/browse/SPARK-3174) is a useful 
feature to save resources on yarn.
We want to open this feature on our yarn cluster.
I have a question about the version of shuffle service.

I’m now using spark-1.5.1 (shuffle service).
If I want to upgrade to spark-1.6.0, should I replace the shuffle service jar 
and restart all the namenode on yarn ?

Thanks a lot.

Mars




--
Best Regards

Jeff Zhang



Parquet Partition Size are different when using Dataframe's save append funciton

2015-04-15 Thread
Hi,

When I use Dataframe’s save append function, I find that the parquet partition 
size are very different.

Part-r-1 to 00021 are generated at the first time save append function is 
called.
Part-r-00022 to 00042 is generated at the second time save append function is 
called.

As you can see, the size of Part-r-1 to 00021 is 200M, while the size of 
Part-r-00022 to 00042 is 700M.
But the source data is the same, which confused me.

-rw-r--r-- 1 sysplatform sysplatform 2.0K Apr 8 10:01 _common_metadata
-rw-r--r-- 1 sysplatform sysplatform 392K Apr 8 10:01 _metadata
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-1.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:44 part-r-2.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-3.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-4.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-5.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-6.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-7.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-8.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-9.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00010.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00011.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00012.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00013.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00014.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00015.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00016.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00017.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00018.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00019.parquet
-rw-r--r-- 1 sysplatform sysplatform 199M Apr 8 09:43 part-r-00020.parquet
-rw-r--r-- 1 sysplatform sysplatform 200M Apr 8 09:43 part-r-00021.parquet
-rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:01 part-r-00022.parquet
-rw-r--r-- 1 sysplatform sysplatform 723M Apr 8 10:00 part-r-00023.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00024.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:00 part-r-00025.parquet
-rw-r--r-- 1 sysplatform sysplatform 717M Apr 8 10:00 part-r-00026.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:00 part-r-00027.parquet
-rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:01 part-r-00028.parquet
-rw-r--r-- 1 sysplatform sysplatform 725M Apr 8 10:01 part-r-00029.parquet
-rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:00 part-r-00030.parquet
-rw-r--r-- 1 sysplatform sysplatform 725M Apr 8 10:01 part-r-00031.parquet
-rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:01 part-r-00032.parquet
-rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:00 part-r-00033.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00034.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00035.parquet
-rw-r--r-- 1 sysplatform sysplatform 720M Apr 8 10:00 part-r-00036.parquet
-rw-r--r-- 1 sysplatform sysplatform 717M Apr 8 10:00 part-r-00037.parquet
-rw-r--r-- 1 sysplatform sysplatform 724M Apr 8 10:01 part-r-00038.parquet
-rw-r--r-- 1 sysplatform sysplatform 722M Apr 8 10:01 part-r-00039.parquet
-rw-r--r-- 1 sysplatform sysplatform 722M Apr 8 10:00 part-r-00040.parquet
-rw-r--r-- 1 sysplatform sysplatform 721M Apr 8 10:01 part-r-00041.parquet
-rw-r--r-- 1 sysplatform sysplatform 723M Apr 8 10:01 part-r-00042.parquet
-rw-r--r-- 1 sysplatform sysplatform 0 Apr 8 10:01 _SUCCESS


Question about TTL with TorrentBroadcastFactory in Spark-1.2.0

2014-12-21 Thread
Hi All,

I am facing a problem when using TTL with  TorrentBroadcastFactory in 
Spark-1.2.0.

My code is as follows:

val conf = new SparkConf().
  setAppName(TTL_Broadcast_vars).
  setMaster(local).
  //set(spark.broadcast.factory, 
org.apache.spark.broadcast.HttpBroadcastFactory).
  set(spark.cleaner.ttl, 2)
val sc = new SparkContext(conf)

val data = TTL_Broadcast_vars
val bData = sc.broadcast(data)
sc.parallelize(1 to 3, 3).map(v = {
  Thread.sleep(4 * 1000)
  bData.value
}).collect().foreach(println)


I got the following error message:

java.io.IOException: org.apache.spark.SparkException: Failed to get 
broadcast_0_piece0 of broadcast_0
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1003)
at 
org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
at 
org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at 
org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:17)
at TTL_Broadcast_vars$$anonfun$main$1.apply(TTL_Broadcast_vars.scala:15)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at 
scala.collection.TraversableOnce$class.tohttp://class.to/(TraversableOnce.scala:273)
at 
scala.collection.AbstractIterator.tohttp://scala.collection.abstractiterator.to/(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
at org.apache.spark.rdd.RDD$$anonfun$16.apply(RDD.scala:780)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
at 
org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1314)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)

But if i use HttpBroadcastFactory, it can run successfully.

I'm wondering whether it is a feature of TorrentBroadcastFactory or a bug?

Mars Gu