Executor and BlockManager memory size

2014-10-09 Thread Larry Xiao
Hi all, I'm confused about Executor and BlockManager, why they have different memory. 14/10/10 08:50:02 INFO AppClient$ClientActor: Executor added: app-20141010085001-/2 on worker-20141010004933-brick6-35657 (brick6:35657) with 6 cores 14/10/10 08:50:02 INFO SparkDeploySchedulerBackend:

PageRank execution imbalance, might hurt performance by 6x

2014-09-27 Thread Larry Xiao
Hi all! I'm running PageRank on GraphX, and I find on some tasks on one machine can spend 5~6 times more time than on others, others are perfectly balance (around 1 second to finish). And since time for a stage (iteration) is determined by the slowest task, the performance is undesirable. I

VertexRDD partition imbalance

2014-09-25 Thread Larry Xiao
Hi all VertexRDD is partitioned with HashPartitioner, and it exhibits some imbalance of tasks. For example, Connected Components with partition strategy Edge2D: Aggregated Metrics by Executor Executor ID Task Time Total Tasks Failed Tasks Succeeded Tasks Input Shuffle Read

Re: Specifying Spark Executor Java options using Spark Submit

2014-09-24 Thread Larry Xiao
Hi Arun! I think you can find info at https://spark.apache.org/docs/latest/configuration.html quote: Spark provides three locations to configure the system: * Spark properties https://spark.apache.org/docs/latest/configuration.html#spark-propertiescontrol most application

Re: spark.shuffle.consolidateFiles seems not working

2014-07-30 Thread Larry Xiao
Hi Jianshi, I've met similar situation before. And my solution was 'ulimit', you can use -a to see your current settings -n to set open files limit (and other limits also) And I set -n to 10240. I see spark.shuffle.consolidateFiles helps by reusing open files. (so I don't know to what extend