Hi all,
As far as I know, a SparkContext instance takes charge of the cluster resources the master assigns to it, and it can hardly be shared between different SparkContexts; meanwhile, scheduling between applications is not easy either. To address this without introducing an extra resource scheduler…
Hi all,
I hit a strange problem: I submitted a reduce job (with only one split), and it finished normally on the Executor. The log is:
14/07/15 21:08:56 INFO Executor: Serialized size of result for 0 is 10476031
14/07/15 21:08:56 INFO Executor: Sending result for 0 directly to driver
14/07/15 21:08:56 INFO Executor: F
Hi all,
We have run into some trouble with the driver's HA. We run a long-lived driver in Spark standalone mode which serves as a server, submitting jobs as requests arrive. We therefore face the driver process's HA problem, e.g. how to resume jobs after the driver process fails.
…s stage, and the memory will be freed soon.
Only cache() will persist your RDD in memory for a long time.
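A minimal Scala sketch of that behavior; the SparkContext setup and the input path are assumptions for illustration, not part of the original thread:

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical setup; app name, master URL and path are placeholders.
val sc = new SparkContext(
  new SparkConf().setAppName("cache-demo").setMaster("local[2]"))

val lines = sc.textFile("/tmp/input.txt")
lines.cache() // pin the RDD's blocks in memory across jobs

// Without cache(), each action recomputes the RDD from the file;
// with cache(), the second count() reuses the in-memory blocks.
val n1 = lines.count()
val n2 = lines.count()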
Second question:
Once an RDD is created, it cannot be changed, because RDDs are immutable. You can only create a new RDD from an existing RDD or from the file system.
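For example, a short sketch continuing the hypothetical lines RDD above; transformations never mutate their input:

val upper = lines.map(_.toUpperCase) // returns a brand-new RDD
// lines itself is unchanged; both RDDs stay usable independently.
println(lines.first())
println(upper.first())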
2014-03-25 9:45 GMT+08:00
Hi hequn, a related question: does that mean the memory usage will be doubled? And furthermore, if the compute function of an RDD is not idempotent, will the RDD change while the job is running?
-Original Message-
From: "hequn cheng"
Sent: 2014/3/25 9:35
To: "user@spark.apache.org"
Subject: R
Building Spark needs a lot of memory; I think you should make -Xmx larger, 2g for example.
-Original Message-
From: "Bharath Bhushan"
Sent: 2014/3/22 12:50
To: "user@spark.apache.org"
Subject: unable to build spark - sbt/sbt: line 50: killed
I am getting the following error when trying to build Spark.
unpersist(): Mark the RDD as non-persistent, and remove all blocks for it from memory and disk.
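A small Scala sketch of the cache/unpersist lifecycle; the input path and the job itself are placeholders:

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(
  new SparkConf().setAppName("unpersist-demo").setMaster("local[2]"))

val rdd = sc.textFile("/tmp/data.txt").map(_.length) // assumed input
rdd.cache()

// Use the RDD in the stages that need it...
val total = rdd.reduce(_ + _)

// ...then drop its blocks from memory and disk once no later stage needs it.
rdd.unpersist()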
2014-03-19 16:40 GMT+08:00 林武康 :
Hi, can anyone tell me about the lifecycle of an RDD? I searched through the official website and still can't figure it out. Can I use an RDD in some stages and then destroy it to release memory, given that no later stage will use this RDD any more? Is that possible?
Thanks!
Sincerely
Lin
Hi all, I changed spark.closure.serializer to kryo. When I try a count action in the spark shell, the Task object deserialized on the Executor is null. The source line is:
override def run() {
  ...
  task = ser.deserialize[Task[Any]](...)
  ...
}
where task is null.
Can anyone help me? Thank you!
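For reference, a sketch of how that setting would have been applied on a 1.x-era Spark; everything except the spark.closure.serializer key (which comes from the message above) is a hypothetical setup:

import org.apache.spark.{SparkConf, SparkContext}

// Note: this key was only honored by old Spark versions; Spark 2.0+
// always uses Java serialization for closures and removed the setting.
val conf = new SparkConf()
  .setAppName("kryo-closure-demo")
  .setMaster("local[2]")
  .set("spark.closure.serializer",
       "org.apache.spark.serializer.KryoSerializer")

val sc = new SparkContext(conf)
println(sc.parallelize(1 to 100).count()) // the action that hit the null task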
Hi, I am a newbie to Spark; the question below may seem foolish, but I really want some advice. Since loading data from disk to generate an RDD is very costly in my application, I hope I can generate it once and cache it in memory, so that any other Spark application can refer to this RDD. Is this possible?
…on YARN. Here is the documentation.
TD
On Thu, Feb 20, 2014 at 11:16 PM, 林武康 wrote:
hi all,
I am very new to Apache Spark. Recently I have tried Spark on YARN, and it works for batch processing. Now we want to try stream processing using Spark Streaming and still use YARN as the resource scheduler, as we…
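For context, a minimal Spark Streaming sketch in Scala; the socket source, host/port and batch interval are assumptions (submitting it to YARN happens via spark-submit, outside the code):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical word count over 10-second micro-batches.
val conf = new SparkConf().setAppName("streaming-on-yarn-demo")
val ssc = new StreamingContext(conf, Seconds(10))

val lines = ssc.socketTextStream("localhost", 9999) // assumed source
lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()

ssc.start()
ssc.awaitTermination()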