Re: RDD Partitions not distributed evenly to executors

2016-11-21 Thread Thunder Stumpges
Has anyone figured this out yet!? I have gone looking for this exact problem (Spark 1.6.1) and I cannot get my partitions to be distributed evenly across executors no matter what I've tried. It has been mentioned several other times in the user group as well as the dev group (as mentioned by Mike

spark 1.6 : RDD Partitions not distributed evenly to executors

2016-05-09 Thread prateek arora
-spark-user-list.1001560.n3.nabble.com/spark-1-6-RDD-Partitions-not-distributed-evenly-to-executors-tp26911.html

Re: RDD Partitions not distributed evenly to executors

2016-04-06 Thread Mike Hynes
Hello All (and Devs in particular), Thank you again for your further responses. Please find a detailed email below which identifies the cause (I believe) of the partition imbalance problem, which occurs in Spark 1.5, 1.6, and a 2.0-SNAPSHOT. This is followed by follow-up questions for the dev

Re: RDD Partitions not distributed evenly to executors

2016-04-05 Thread Khaled Ammar
I have had a similar experience. Using 32 machines, I can see that the number of tasks (partitions) assigned to executors (machines) is not even. Moreover, the distribution changes every stage (iteration). I wonder why Spark needs to move partitions around anyway; should not the scheduler reduce network

Re: RDD Partitions not distributed evenly to executors

2016-04-04 Thread Mike Hynes
Dear all, Thank you for your responses. Michael Slavitch: > Just to be sure: Have spark-env.sh and spark-defaults.conf been correctly > propagated to all nodes? Are they identical? Yes; these files are stored on a shared memory directory accessible to all nodes. Koert Kuipers: > we ran into

Re: RDD Partitions not distributed evenly to executors

2016-04-04 Thread Ted Yu
bq. the modifications do not touch the scheduler If the changes can be ported over to 1.6.1, do you mind reproducing the issue there? I ask because the master branch changes very fast. It would be good to narrow the scope where the behavior you observed started showing. On Mon, Apr 4, 2016 at 6:12

Re: RDD Partitions not distributed evenly to executors

2016-04-04 Thread Michael Slavitch
Just to be sure: Have spark-env.sh and spark-defaults.conf been correctly propagated to all nodes? Are they identical? > On Apr 4, 2016, at 9:12 AM, Mike Hynes <91m...@gmail.com> wrote: > > [ CC'ing dev list since nearly identical questions have occurred in > user list recently w/o

RDD Partitions not distributed evenly to executors

2016-04-04 Thread Mike Hynes
[ CC'ing dev list since nearly identical questions have occurred in user list recently w/o resolution; cf.: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-work-distribution-among-execs-tt26502.html