Aaron Davidson created SPARK-3029:
-------------------------------------

             Summary: Disable local execution of Spark jobs by default
                 Key: SPARK-3029
                 URL: https://issues.apache.org/jira/browse/SPARK-3029
             Project: Spark
          Issue Type: Improvement
            Reporter: Aaron Davidson
            Assignee: Aaron Davidson


Currently, local execution of Spark jobs is only used by take(), and it can be 
problematic because it can load a significant amount of data onto the driver. 
The worst-case scenarios occur if the RDD is cached (guaranteed to load the 
whole partition), has very large elements, or the partition is simply large 
and we apply a filter that is highly selective or computationally expensive.
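
The filter case can be illustrated with a small plain-Python model. This is 
only a sketch and not Spark code: take_locally is a hypothetical helper, the 
partition is modelled as an ordinary iterable, and "high selectivity" is 
assumed to mean that very few elements pass the filter. It counts how many 
elements the driver would have to read before take(n) is satisfied.

```python
from itertools import islice


def take_locally(partition, predicate, n):
    """Hypothetical model of take(n) over one filtered partition.

    Returns (matches, scanned): the first n elements that pass the
    predicate, and the number of elements read to find them.
    """
    scanned = 0

    def counted():
        # Wrap the partition so every element read is counted.
        nonlocal scanned
        for element in partition:
            scanned += 1
            yield element

    # islice stops pulling from the filtered stream once n matches are found.
    matches = list(islice((x for x in counted() if predicate(x)), n))
    return matches, scanned


# A filter that keeps half of the elements is satisfied almost immediately.
common_matches, common_scanned = take_locally(
    range(1_000_000), lambda x: x % 2 == 0, 1)

# A filter that keeps 1 element in 100,000 forces a long scan for one result.
rare_matches, rare_scanned = take_locally(
    range(1_000_000), lambda x: x % 100_000 == 99_999, 1)
```

In this model the rare filter reads 100,000 elements to return a single 
match, while the common filter reads one. On a real driver each of those 
reads would also pay the cost of materializing the element and evaluating 
the filter.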

Additionally, jobs that run locally in this manner do not show up in the web 
UI, which makes it harder to track them or understand what is occurring.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
