[ https://issues.apache.org/jira/browse/SPARK-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen updated SPARK-3750:
-----------------------------
    Priority: Minor  (was: Major)

Andrew do you still want to add this?

> Log ulimit settings at warning if they are too low
> --------------------------------------------------
>
>                 Key: SPARK-3750
>                 URL: https://issues.apache.org/jira/browse/SPARK-3750
>             Project: Spark
>          Issue Type: Improvement
>          Components: Deploy
>    Affects Versions: 1.1.0
>            Reporter: Andrew Ash
>            Priority: Minor
>
> In recent versions of Spark the shuffle implementation is much more
> aggressive about writing many files out to disk at once. Most Linux kernels
> have a default limit on the number of open files per process, and Spark can
> exhaust this limit. The current hash-based shuffle implementation requires
> as many files as the product of the map and reduce partition counts in a
> wide dependency.
> In order to reduce the errors we're seeing on the user list, we should
> determine a value that is considered "too low" for normal operations and log
> a warning on executor startup when that value isn't met.
> 1. determine what ulimit is acceptable
> 2. log when that value isn't met

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
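The startup check the ticket proposes could be sketched as follows. This is an illustrative Python sketch, not Spark's implementation: the `MIN_OPEN_FILES` threshold is an assumption (the ticket explicitly leaves the "acceptable" value to be determined), and the helper names are hypothetical. It also shows the file-count arithmetic the issue describes for the hash-based shuffle (one file per map/reduce partition pair in a wide dependency).

```python
import resource

# Assumed threshold -- the ticket leaves "what ulimit is acceptable" open.
MIN_OPEN_FILES = 10_000


def shuffle_files_needed(map_partitions: int, reduce_partitions: int) -> int:
    """Hash-based shuffle writes one file per (map, reduce) partition pair,
    so the file count is the product of the two partition counts."""
    return map_partitions * reduce_partitions


def ulimit_warning(soft_limit: int, minimum: int = MIN_OPEN_FILES):
    """Return a warning string if the open-file limit is below the
    (assumed) acceptable minimum, else None."""
    if soft_limit < minimum:
        return (f"WARN: open-file ulimit is {soft_limit}, below the "
                f"recommended minimum of {minimum}; shuffle may fail "
                f"with 'too many open files'")
    return None


if __name__ == "__main__":
    # Read the process's current soft limit on open file descriptors.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    msg = ulimit_warning(soft)
    if msg:
        print(msg)
```

For example, a job with 200 map and 200 reduce partitions needs 40,000 shuffle files, well above the common Linux default soft limit of 1024, which is why the warning would fire on most unconfigured machines.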