Can a SparkContext be shared across nodes/drivers?

2014-09-21 Thread
Hi all, as far as I know, a SparkContext instance takes charge of the cluster resources the master has assigned to it, and those resources can hardly be shared with other SparkContexts; meanwhile, scheduling between applications is also not easy. To address this without introducing extra resource schedule
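A minimal sketch (my own illustration, not from the thread) of the pattern the question pushes against: each Spark application builds its own SparkContext, and the executors and memory the master grants belong to that context alone, so another driver cannot attach to them. The app name and master URL below are assumptions.

import org.apache.spark.{SparkConf, SparkContext}

object MyApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("my-app")              // assumed application name
      .setMaster("spark://master:7077")  // assumed standalone master URL
    val sc = new SparkContext(conf)      // resources granted here are private to this app
    try {
      println(sc.parallelize(1 to 100).count())  // work runs on this app's executors only
    } finally {
      sc.stop()                          // hands the resources back to the master
    }
  }
}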

Driver cannot receive StatusUpdate message for FINISHED

2014-07-15 Thread
Hi all, I ran into a strange problem: I submit a reduce job (with a single split), and it finishes normally on the Executor; the log is:
14/07/15 21:08:56 INFO Executor: Serialized size of result for 0 is 10476031
14/07/15 21:08:56 INFO Executor: Sending result for 0 directly to driver
14/07/15 21:08:56 INFO Executor:
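Not a diagnosis from this thread, only a hedged guess for illustration: a ~10 MB serialized result sits right at the default Akka frame size of Spark in that era, and a common workaround at the time was to raise spark.akka.frameSize (value in MB). A minimal sketch, with app name and master assumed:

import org.apache.spark.{SparkConf, SparkContext}

// Assumption only: the driver may be dropping an oversized result message;
// raising the frame size was the usual first thing to try back then.
val conf = new SparkConf()
  .setAppName("large-result-job")      // hypothetical app name
  .setMaster("local[2]")               // assumed master for the sketch
  .set("spark.akka.frameSize", "64")   // raise from the 10 MB default
val sc = new SparkContext(conf)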

Reply: RDD usage

2014-03-24 Thread
Hi hequn, a related question: does that mean the memory usage will be doubled? And furthermore, if the compute function of an RDD is not idempotent, the RDD will change while the job is running, is that right? -Original Message- From: hequn cheng chenghe...@gmail.com Sent: 2014/3/25 9:35 To:

Reply: Reply: RDD usage

2014-03-24 Thread
and the memory will be freed soon. Only cache() will persist your RDD in memory for a long time. Second question: once an RDD has been created, it cannot be changed, because RDDs are immutable. You can only create a new RDD from an existing RDD or from the file system. 2014-03-25 9:45 GMT+08:00 林武康
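A minimal sketch of the two points above (input path and names are made up): cache() keeps an RDD in memory across jobs, and transformations never modify an RDD in place, they return a new one.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("rdd-usage").setMaster("local"))

val lines = sc.textFile("/tmp/input.txt")  // hypothetical input path
lines.cache()                              // kept in memory after the first action

val lengths = lines.map(_.length)          // a *new* RDD; `lines` itself is unchanged
println(lines.count())                     // first action materializes and caches `lines`
println(lengths.reduce(_ + _))             // later jobs reuse the cached partitions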

Reply: unable to build spark - sbt/sbt: line 50: killed

2014-03-22 Thread
A large amount of memory is needed to build Spark; I think you should make -Xmx larger, 2g for example. -Original Message- From: Bharath Bhushan manku.ti...@outlook.com Sent: 2014/3/22 12:50 To: user@spark.apache.org user@spark.apache.org Subject: unable to build spark - sbt/sbt: line 50: killed I am getting the

KryoSerializer returns null when deserializing the Task obj in the Executor

2014-03-18 Thread
Hi all, I changed spark.closure.serializer to kryo; when I try a count action in the spark shell, the Task obj deserialized in the Executor is null. The source line is: override def run() { .. task = ser.deserialize[Task[Any]](...) .. } where task is null. Can anyone help me? Thank you!
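For reference, a sketch of the configuration being described, switching the closure serializer to Kryo; this is my own reconstruction using the configuration keys of that era, not the poster's exact setup:

import org.apache.spark.{SparkConf, SparkContext}

// spark.closure.serializer controlled how tasks/closures were serialized
// (it defaulted to the Java serializer in Spark 0.9/1.x).
val conf = new SparkConf()
  .setAppName("kryo-closure-test")  // assumed app name
  .setMaster("local")               // assumed master for the sketch
  .set("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer")
val sc = new SparkContext(conf)

// The reported symptom: even a simple action's task deserializes to null on the executor.
println(sc.parallelize(1 to 10).count())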

Can two Spark applications share an RDD?

2014-03-15 Thread
Hi, I am a newbie to Spark, and the question below may seem foolish, but I really want some advice: since loading data from disk to generate an RDD is very costly in my applications, I hope I can generate it once and cache it in memory, so that any other Spark application can refer to this RDD. Can this
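Within a single application this is straightforward; a minimal sketch (path and parsing step are hypothetical) of generating the RDD once and keeping it in memory for reuse by later jobs in the same SparkContext. Sharing it with a different application's SparkContext is the part that does not come for free.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(new SparkConf().setAppName("cache-once").setMaster("local"))

// Load once, persist in memory, reuse across jobs in this same application.
// The cache lives in this application's executors, so another application
// (with its own SparkContext) cannot read it directly.
val data = sc.textFile("hdfs:///data/big-input")  // assumed input location
  .map(_.split(","))                              // assumed parsing step
data.persist(StorageLevel.MEMORY_ONLY)

println(data.count())                        // first job pays the load cost and fills the cache
println(data.filter(_.length > 3).count())   // later jobs reuse the cached partitions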