Hi,
I used 1g of memory for the driver Java process and got an OOM error on
the driver side before reduceByKey. After analyzing the heap dump, the biggest
object is org.apache.spark.MapStatus, which occupied over 900MB of memory.
Here's my question:
1. Are there any optimization switches that I can tune
reduceByKey(_ + _, 100) to use only 100 tasks).
Matei
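A rough back-of-envelope sketch of why fewer reduce tasks can shrink the driver-side map status data. It assumes about one byte of compressed size information per (map task, reduce task) pair, which is my understanding of how Spark of this era stored it, and it ignores per-object overhead. The task counts and the helper name map_status_bytes are hypothetical, not taken from the thread.

```python
def map_status_bytes(num_map_tasks: int, num_reduce_tasks: int,
                     bytes_per_entry: int = 1) -> int:
    """Estimate driver memory held by map output statuses.

    Assumption: each map task reports one size entry per reduce task,
    and each entry costs bytes_per_entry bytes (object overhead ignored).
    """
    return num_map_tasks * num_reduce_tasks * bytes_per_entry


# Hypothetical job: 30,000 map tasks, reduce side left at 30,000 tasks.
default_estimate = map_status_bytes(30_000, 30_000)

# Same job, but reduceByKey(_ + _, 100) caps the reduce side at 100 tasks.
tuned_estimate = map_status_bytes(30_000, 100)

print(f"default: {default_estimate / 1e6:.0f} MB, tuned: {tuned_estimate / 1e6:.0f} MB")
```

Under this assumption the estimate grows with the product of the two task counts, so cutting the reduce-side count cuts it proportionally.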
On May 29, 2014, at 2:03 AM, haitao .yao yao.e...@gmail.com wrote:
Hi,
I used 1g of memory for the driver Java process and got an OOM error on
the driver side before reduceByKey. After analyzing the heap dump, the biggest
object
Hi,
Amazon AWS has started to provide service in mainland China; the region
name is cn-north-1. But the script Spark provides, spark_ec2.py, queries the
AMI ID from https://github.com/mesos/spark-ec2/tree/v4/ami-list and there's
no AMI information for the cn-north-1 region.
Can anybody update the
://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
cn-north-1 is not a supported region for EC2, as far as I can tell. There
may be other AWS services that can use that region, but spark-ec2 relies on
EC2.
Nick
On Tue, Nov 4, 2014 at 8:09 PM, haitao .yao yao.e
/jira/secure/Dashboard.jspa to track this
request? I can do it if you've never opened a JIRA issue before.
Nick
On Tue, Nov 4, 2014 at 9:03 PM, haitao .yao yao.e...@gmail.com wrote:
I'm afraid not. We have been using EC2 instances in the cn-north-1 region for
a while. And the latest version of boto
Hey, guys. Here's my problem:
While using standalone mode, I always use the following args for the
executor:
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc
-Xloggc:/tmp/spark.executor.gc.log
But as we know, the HotSpot JVM does not support variable substitution in the
-Xloggc parameter, which
unsubscribe
--
haitao.yao