Hi guys, I'm a brand-new Spark user...
I'm trying to set up a Spark cluster with multiple nodes, starting with two. With a single node everything works fine. When I add a slave node, the slave is able to register with the master. However, when I launch spark-shell and an executor is started on the slave, I see the errors below on the slave node, under the $SPARK_HOME/work directory.

The first exception is Hadoop-related. I have set $HADOOP_HOME to /usr/local/hadoop, where Hadoop is installed (see the sketch in the P.S. after the log), so this first issue does not seem to be what is keeping Spark from working. The second exception is the real problem. Any idea why it happens and how I can resolve it? Much appreciated.

cody

14/09/28 00:28:23 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[GetGroups], always=false, type=DEFAULT, sampleName=Ops)
14/09/28 00:28:24 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
14/09/28 00:28:24 DEBUG KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
14/09/28 00:28:24 DEBUG Groups: Creating new Groups object
14/09/28 00:28:24 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
14/09/28 00:28:24 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
14/09/28 00:28:24 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
14/09/28 00:28:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/28 00:28:24 DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
14/09/28 00:28:24 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
14/09/28 00:28:24 DEBUG Shell: Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
    at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
    at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
    at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
    at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
14/09/28 00:28:24 DEBUG Shell: setsid exited with exit code 0
14/09/28 00:28:24 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
14/09/28 00:28:24 DEBUG SparkHadoopUtil: running as user: codeoedoc
14/09/28 00:28:24 DEBUG UserGroupInformation: hadoop login
14/09/28 00:28:24 DEBUG UserGroupInformation: hadoop login commit
14/09/28 00:28:24 DEBUG UserGroupInformation: using local user:UnixPrincipal: codeoedoc
14/09/28 00:28:24 DEBUG UserGroupInformation: UGI loginUser:codeoedoc (auth:SIMPLE)
14/09/28 00:28:24 DEBUG UserGroupInformation: PrivilegedAction as:codeoedoc (auth:SIMPLE) from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
14/09/28 00:28:24 INFO SecurityManager: Changing view acls to: codeoedoc
14/09/28 00:28:24 INFO SecurityManager: Changing modify acls to: codeoedoc
14/09/28 00:28:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(codeoedoc); users with modify permissions: Set(codeoedoc)
14/09/28 00:28:24 DEBUG AkkaUtils: In createActorSystem, requireCookie is: off
14/09/28 00:28:24 INFO Slf4jLogger: Slf4jLogger started
14/09/28 00:28:24 INFO Remoting: Starting remoting
14/09/28 00:28:24 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@localhost:40146]
14/09/28 00:28:24 INFO Remoting: Remoting now listens on addresses: [akka.tcp://driverPropsFetcher@localhost:40146]
14/09/28 00:28:24 INFO Utils: Successfully started service 'driverPropsFetcher' on port 40146.
14/09/28 00:28:54 WARN UserGroupInformation: PriviledgedActionException as:codeoedoc (auth:SIMPLE) cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1561)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:125)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:53)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:52)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
    ... 4 more
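
P.S. For completeness, this is roughly how HADOOP_HOME is set on the nodes. I'm not certain an export in the login shell ever reaches executors that the worker daemon launches, so putting it in conf/spark-env.sh (which the standalone scripts source on each node) is my working assumption rather than a known-good setup:

    # conf/spark-env.sh on each node (assumed location; Hadoop lives in
    # /usr/local/hadoop here, as mentioned above)
    export HADOOP_HOME=/usr/local/hadoop
    # Hadoop's Shell class alternatively reads the hadoop.home.dir JVM
    # system property, per the IOException message in the first trace.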
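
P.P.S. One thing I notice in the log: the slave's driverPropsFetcher binds to localhost (akka.tcp://driverPropsFetcher@localhost:40146). Could the timeout just mean the executor can't reach the driver over a routable address? A quick check I plan to run on each node follows; SPARK_LOCAL_IP is standard Spark configuration, but 192.168.1.10 below is only a placeholder for the node's real IP:

    # does the hostname resolve to a loopback entry in /etc/hosts?
    grep -n "$(hostname)" /etc/hosts
    # if so, force Spark to bind to the node's real interface, e.g. in
    # conf/spark-env.sh (placeholder address):
    export SPARK_LOCAL_IP=192.168.1.10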