[ https://issues.apache.org/jira/browse/SPARK-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386585#comment-14386585 ]
Steve Loughran commented on SPARK-2356:
---------------------------------------

It's coming from {{UserGroupInformation.setConfiguration(conf)}}; UGI ends up using Hadoop's {{StringUtils}} (via {{Groups.parseStaticMapping}} in the quoted trace), which then initializes a static variable:

{code}
public static final Pattern ENV_VAR_PATTERN = Shell.WINDOWS ? WIN_ENV_VAR_PATTERN : SHELL_ENV_VAR_PATTERN;
{code}

Evaluating {{Shell.WINDOWS}} loads Hadoop's {{Shell}} class, whose static initializer does some work that depends on winutils.exe being found under the Hadoop home's bin directory. Convoluted, but there you go.

HADOOP-11293 proposes factoring out the {{Shell.WINDOWS}} code into something standalone. If that can be pushed into Hadoop 2.8, this problem will go away from then on.

> Exception: Could not locate executable null\bin\winutils.exe in the Hadoop
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-2356
>                 URL: https://issues.apache.org/jira/browse/SPARK-2356
>             Project: Spark
>          Issue Type: Bug
>          Components: Windows
>    Affects Versions: 1.0.0
>            Reporter: Kostiantyn Kudriavtsev
>            Priority: Critical
>
> I'm trying to run a transformation on Spark. It works fine on a cluster
> (YARN, Linux machines). However, when I run it on a local machine
> (Windows 7) under a unit test, I get errors (I don't use Hadoop; I'm reading
> a file from the local filesystem):
> {code}
> 14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
> java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
>     at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
>     at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>     at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
>     at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
>     at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
>     at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
>     at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
>     at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
> {code}
> This happens because the Hadoop config is initialized each time a Spark
> context is created, regardless of whether Hadoop is required.
> I propose adding a flag to indicate whether the Hadoop config is required
> (or starting this configuration manually).
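
For anyone trying to follow the initialization chain described in the comment, here is a minimal, self-contained sketch of the same pattern. It is not Hadoop code: {{ShellLike}}, {{StringUtilsLike}}, {{TRACE}} and the two regexes are hypothetical stand-ins, and the lookup of the home directory via the {{hadoop.home.dir}} system property or the {{HADOOP_HOME}} environment variable is an assumption about how the real class behaves. It only illustrates how reading one static field can force a second class to initialize and build a path from an unset home directory.

```java
import java.io.File;
import java.util.regex.Pattern;

public class StaticInitChain {
    // Records the order in which the static initializers start running.
    static final StringBuilder TRACE = new StringBuilder();

    // Hypothetical, simplified stand-in for Hadoop's Shell class.
    static class ShellLike {
        // Not a compile-time constant, so reading it forces class initialization.
        static final boolean WINDOWS =
                System.getProperty("os.name").startsWith("Windows");
        static final String WINUTILS;

        static {
            TRACE.append("ShellLike;");
            // Assumption: the home directory comes from a system property,
            // falling back to an environment variable.
            String home = System.getProperty("hadoop.home.dir");
            if (home == null) {
                home = System.getenv("HADOOP_HOME");
            }
            // If neither is set, string concatenation renders the null
            // reference as the literal text "null" at the start of the path.
            WINUTILS = home + File.separator + "bin" + File.separator + "winutils.exe";
        }
    }

    // Hypothetical, simplified stand-in for Hadoop's StringUtils class.
    static class StringUtilsLike {
        static final Pattern ENV_VAR_PATTERN;

        static {
            TRACE.append("StringUtilsLike;");
            // Reading ShellLike.WINDOWS here triggers ShellLike's initializer.
            ENV_VAR_PATTERN = ShellLike.WINDOWS
                    ? Pattern.compile("%(.*?)%")
                    : Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");
        }
    }

    public static void main(String[] args) {
        // Touching one static field is enough to run both initializers.
        Pattern p = StringUtilsLike.ENV_VAR_PATTERN;
        System.out.println("pattern: " + p.pattern());
        System.out.println("init order: " + TRACE);
        System.out.println("computed path: " + ShellLike.WINUTILS);
    }
}
```

Under the stated assumption, running this with neither the property nor the environment variable set gives a computed path that begins with the text "null", which would be consistent with the {{null\bin\winutils.exe}} in the reported exception.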