Hi,
We switched from ParallelGC to CMS, and the symptom is gone.
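For anyone hitting the same symptom, the collector switch can be made from spark-submit; a minimal sketch (the exact option set is an assumption, not quoted from this thread):

```shell
# Sketch: run driver and executors with CMS instead of the default ParallelGC.
# The jar name is a placeholder.
spark-submit \
  --driver-java-options "-XX:+UseConcMarkSweepGC" \
  --conf "spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC" \
  your-streaming-app.jar
```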
On Thu, Jun 4, 2015 at 3:37 PM, Ji ZHANG zhangj...@gmail.com wrote:
Glad to hear that. :)
On Thu, Jun 18, 2015 at 6:25 AM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
I set spark.shuffle.io.preferDirectBufs to false in SparkConf, and this
setting can be seen in the Web UI's Environment tab. But it still eats memory:
-Xmx is set to 512M but RES grows to 1.5G in half a day.
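For tracking that growth, a minimal RSS sampler (a sketch: the pid argument is whatever ps aux | grep spark-submit reports; it defaults to the current shell's pid only so the script runs standalone):

```shell
#!/bin/sh
# Print one timestamped RSS (resident set size, KB) sample for a process.
# RSS covers heap plus off-heap, so it is what to compare against -Xmx.
# Pass the Spark JVM's pid; defaulting to $$ (this shell) is just a demo.
PID="${1:-$$}"
RSS_KB=$(ps -o rss= -p "$PID" | tr -d ' ')
echo "$(date '+%F %T') pid=$PID rss_kb=$RSS_KB"
```

Run it every minute from cron or a while/sleep loop to get a time series of the 512M-to-1.5G climb.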
On Wed, Jun 3, 2015 at 12:02 PM, Shixiong Zhu zsxw...@gmail.com wrote:
Could you …
Hi,
Thanks for your information. I'll give Spark 1.4 a try when it's released.
On Wed, Jun 3, 2015 at 11:31 AM, Tathagata Das t...@databricks.com wrote:
Could you try it out with Spark 1.4 RC3?
Also pinging, Cloudera folks, they may be aware of something.
BTW, the way I have debugged memory leaks in the past is as follows.
Run with a small driver memory, say 1 GB. Periodically (maybe a script),
take snapshots of histogram and also do memory …
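The periodic snapshot step can be scripted around jmap (a sketch; the pid argument and output path are placeholders, not from this thread):

```shell
#!/bin/sh
# Take one live-object histogram snapshot of a JVM (note: -histo:live
# forces a full GC first). Run periodically, e.g. from cron, then diff
# successive snapshots to see which classes keep growing.
PID="$1"
[ -n "$PID" ] || { echo "usage: $0 <pid>" >&2; exit 1; }
OUT="/tmp/histo-$(date +%s).txt"
jmap -histo:live "$PID" > "$OUT"
echo "wrote $OUT"
```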
Hi,
Thanks for your reply. Here are the top 30 entries of the jmap -histo:live result:
 num     #instances         #bytes  class name
----------------------------------------------
   1:         40802      145083848  [B
   2:         99264       12716112  methodKlass
   3:         99264       12291480  …
Hi Zhang,
Could you paste your code in a gist? Not sure what you are doing inside the
code to fill up memory.
Thanks
Best Regards
On Thu, May 28, 2015 at 10:08 AM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
I wrote a simple test job; it only does very basic operations. For example:
val lines = KafkaUtils.createStream(ssc, zkQuorum, group, Map(topic -> 1)).map(_._2)
val logs = lines.flatMap { line =>
  try {
    Some(parse(line).extract[Impression])
  } catch {
    case _: Exception => None
  }
}
Can you replace your counting part with this?
logs.filter(_.s_id > 0).foreachRDD(rdd => logger.info(rdd.count()))
Thanks
Best Regards
On Thu, May 28, 2015 at 1:02 PM, Ji ZHANG zhangj...@gmail.com wrote:
Hi,
Unfortunately, they're still growing, on both the driver and the executors.
I ran the same job in local mode, and everything was fine.
On Thu, May 28, 2015 at 5:26 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
After submitting the job, if you do a ps aux | grep spark-submit then you
can see all the JVM params. Are you using the high-level consumer
(receiver-based) for receiving data from Kafka? In that case, if your
throughput is high and the processing delay exceeds the batch interval
then you will hit this …
Hi,
Yes, I'm using createStream, but the storageLevel param is by default
MEMORY_AND_DISK_SER_2. Besides, the driver's memory is also growing. I
don't think Kafka messages will be cached in the driver.
On Thu, May 28, 2015 at 12:24 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Hi Akhil,
Thanks for your reply. According to the Streaming tab of the Web UI, the
Processing Time is around 400 ms and there's no Scheduling Delay, so I
suppose it's not the Kafka messages that eat up the off-heap memory. Or
maybe it is, but how can I tell?
I googled how to check the off-heap …
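One way to break off-heap usage down is HotSpot's Native Memory Tracking (a sketch; it assumes a JDK 8+ JVM, and the pid is a placeholder):

```shell
# Enable Native Memory Tracking at JVM startup (small overhead), e.g. via
# spark.executor.extraJavaOptions / spark.driver.extraJavaOptions:
#   -XX:NativeMemoryTracking=summary
# Then query the running JVM; $PID is a placeholder for the Spark process id:
jcmd "$PID" VM.native_memory summary
```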