[ https://issues.apache.org/jira/browse/SPARK-39755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-39755:
-------------------------------------

    Assignee: pralabhkumar

> SPARK_LOCAL_DIRS locations are not randomized in K8s
> ----------------------------------------------------
>
>                 Key: SPARK-39755
>                 URL: https://issues.apache.org/jira/browse/SPARK-39755
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.3.0
>            Reporter: pralabhkumar
>            Assignee: pralabhkumar
>            Priority: Minor
>
> In org.apache.spark.util.Utils.getConfiguredLocalDirs:
>
> {code:java}
> if (isRunningInYarnContainer(conf)) {
>   // If we are in yarn mode, systems can have different disk layouts so we must set it
>   // to what Yarn on this system said was available. Note this assumes that Yarn has
>   // created the directories already, and that they are secured so that only the
>   // user has access to them.
>   randomizeInPlace(getYarnLocalDirs(conf).split(","))
> } else if (conf.getenv("SPARK_EXECUTOR_DIRS") != null) {
>   conf.getenv("SPARK_EXECUTOR_DIRS").split(File.pathSeparator)
> } else if (conf.getenv("SPARK_LOCAL_DIRS") != null) {
>   conf.getenv("SPARK_LOCAL_DIRS").split(",")
> }{code}
> randomizeInPlace is not called on conf.getenv("SPARK_LOCAL_DIRS").split(",").
> This is the branch used on K8s, so the shuffle locations are not randomized there.
> IMHO this should be randomized, so that all the directories have an equal
> chance of receiving data, as is already done on the YARN side.
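
For illustration only, here is a minimal, self-contained sketch of what randomizing the SPARK_LOCAL_DIRS branch could look like. It is not the patch attached to this issue. The object name LocalDirsSketch, the helper localDirsFromEnv, and the getenv function parameter (standing in for conf.getenv) are hypothetical, and the randomizeInPlace shown here is a plain Fisher-Yates shuffle written for this example rather than a copy of Spark's own helper.

```scala
import java.util.Random

object LocalDirsSketch {
  // Stand-in for Spark's randomizeInPlace (an assumption of this sketch):
  // a Fisher-Yates shuffle that permutes the array in place and returns it.
  def randomizeInPlace[T](arr: Array[T], rand: Random = new Random): Array[T] = {
    for (i <- (arr.length - 1) to 1 by -1) {
      val j = rand.nextInt(i + 1) // pick an index in [0, i]
      val tmp = arr(j)
      arr(j) = arr(i)
      arr(i) = tmp
    }
    arr
  }

  // Hypothetical equivalent of the SPARK_LOCAL_DIRS branch: split the
  // comma-separated value and shuffle the result, as the YARN branch does.
  // Returns None when the variable is unset (getenv returns null).
  def localDirsFromEnv(
      getenv: String => String,
      rand: Random = new Random): Option[Array[String]] =
    Option(getenv("SPARK_LOCAL_DIRS")).map(v => randomizeInPlace(v.split(","), rand))
}
```

With SPARK_LOCAL_DIRS set to "/d1,/d2,/d3", localDirsFromEnv returns the same three paths in a random order, so repeated calls do not always list "/d1" first.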