We keep running into https://issues.apache.org/jira/browse/SPARK-2823 when
trying to use GraphX. Repartitioning the data is very expensive for us
(lots of network traffic), which is killing job performance.

I understand the fix was reverted to stabilize the unit tests, but frankly
the limits this imposes make it very hard to tune Spark applications. What
is the process for getting a fix prioritized if we do not have the cycles
to do it ourselves?
