BTW, I'm using a standalone deployment. (The name "standalone deployment" for a cluster is kind of misleading... I think the doc needs to be updated. It's not really standalone, just a plain Spark-only deployment.)
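To be concrete, by "standalone" I mean Spark's own built-in cluster manager, where the driver attaches to a spark:// master URL instead of running in local mode. Roughly like this (just a sketch; "master-host" is a placeholder for my master node, and 7077 is the default master port):

    import org.apache.spark.{SparkConf, SparkContext}

    // The driver/shell attaches to the standalone master over its spark:// URL.
    // "master-host" is a placeholder; 7077 is the default standalone master port.
    val conf = new SparkConf()
      .setAppName("cluster-test")
      .setMaster("spark://master-host:7077")
    val sc = new SparkContext(conf)

A couple of further notes on the two exceptions are at the bottom, below the quoted log.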
Thx, cody

On Sun, Sep 28, 2014 at 12:36 AM, codeoedoc <codeoe...@gmail.com> wrote:
> Hi guys,
>
> I'm a fresh Spark user...
>
> I'm trying to set up a Spark cluster with multiple nodes, starting with 2.
> With one node, it is working fine. When I add a slave node, the slave is able
> to register with the master node. However, when I launch a spark shell and
> the executor is launched on the slave, I see the error below on the slave
> node, in the $spark/work directory.
>
> So the first exception is Hadoop related. I have set $HADOOP_HOME to
> /usr/local/hadoop, where Hadoop is installed. It seems the first issue is not
> what is causing Spark not to work. The second exception is the problem.
>
> Any idea why this happens and how I can resolve it?
>
> Much appreciated.
> cody
>
> 14/09/28 00:28:23 INFO CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
> 14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of successful kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of failed kerberos logins and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops)
> 14/09/28 00:28:24 DEBUG MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[GetGroups], always=false, type=DEFAULT, sampleName=Ops)
> 14/09/28 00:28:24 DEBUG MetricsSystemImpl: UgiMetrics, User and group related metrics
> 14/09/28 00:28:24 DEBUG KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
> 14/09/28 00:28:24 DEBUG Groups: Creating new Groups object
> 14/09/28 00:28:24 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
> 14/09/28 00:28:24 DEBUG NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
> 14/09/28 00:28:24 DEBUG NativeCodeLoader: java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
> 14/09/28 00:28:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 14/09/28 00:28:24 DEBUG JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
> 14/09/28 00:28:24 DEBUG JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
> 14/09/28 00:28:24 DEBUG Shell: Failed to detect a valid hadoop home directory
> java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
>     at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
>     at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
>     at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
>     at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:92)
>     at org.apache.hadoop.security.Groups.<init>(Groups.java:76)
>     at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:239)
>     at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
>     at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
>     at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
>     at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> 14/09/28 00:28:24 DEBUG Shell: setsid exited with exit code 0
> 14/09/28 00:28:24 DEBUG Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
> 14/09/28 00:28:24 DEBUG SparkHadoopUtil: running as user: codeoedoc
> 14/09/28 00:28:24 DEBUG UserGroupInformation: hadoop login
> 14/09/28 00:28:24 DEBUG UserGroupInformation: hadoop login commit
> 14/09/28 00:28:24 DEBUG UserGroupInformation: using local user:UnixPrincipal: codeoedoc
> 14/09/28 00:28:24 DEBUG UserGroupInformation: UGI loginUser:codeoedoc (auth:SIMPLE)
> 14/09/28 00:28:24 DEBUG UserGroupInformation: PrivilegedAction as:codeoedoc (auth:SIMPLE) from:org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
> 14/09/28 00:28:24 INFO SecurityManager: Changing view acls to: codeoedoc
> 14/09/28 00:28:24 INFO SecurityManager: Changing modify acls to: codeoedoc
> 14/09/28 00:28:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(codeoedoc); users with modify permissions: Set(codeoedoc)
> 14/09/28 00:28:24 DEBUG AkkaUtils: In createActorSystem, requireCookie is: off
> 14/09/28 00:28:24 INFO Slf4jLogger: Slf4jLogger started
> 14/09/28 00:28:24 INFO Remoting: Starting remoting
> 14/09/28 00:28:24 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@localhost:40146]
> 14/09/28 00:28:24 INFO Remoting: Remoting now listens on addresses: [akka.tcp://driverPropsFetcher@localhost:40146]
> 14/09/28 00:28:24 INFO Utils: Successfully started service 'driverPropsFetcher' on port 40146.
> 14/09/28 00:28:54 WARN UserGroupInformation: PriviledgedActionException as:codeoedoc (auth:SIMPLE) cause:java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
> Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1561)
>     at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>     at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>     at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>     at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>     at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>     at scala.concurrent.Await$.result(package.scala:107)
>     at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:125)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:53)
>     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:52)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>     ... 4 more
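A couple of rough notes on my own reading of the two exceptions above, in case it helps:

On the first one: $HADOOP_HOME is exported in my login shell, but I suspect the executor JVM that the worker spawns doesn't inherit it (I believe it would have to go into conf/spark-env.sh on each node instead). A quick sanity check from the spark-shell, assuming the context is available as sc, would be to compare what the driver and the executors see (this obviously only works in a setup where executors do come up, e.g. the single-node case):

    // Driver-side view of HADOOP_HOME vs. what the executors see (diagnostic sketch only).
    println(sys.env.get("HADOOP_HOME"))
    sc.parallelize(1 to 2).map(_ => sys.env.get("HADOOP_HOME")).collect().foreach(println)

On the second one: the stack trace is the executor timing out while fetching the Spark properties back from the driver, and the Remoting lines just above show it listening on localhost, so my guess is a hostname-resolution problem between the two machines (e.g. /etc/hosts mapping the host name to 127.0.0.1). If that's it, one thing I plan to try is pinning an explicitly routable driver address, roughly like this (the IP is just a placeholder for the machine running the shell):

    import org.apache.spark.SparkConf

    // Same conf as the sketch near the top of this mail, with the driver address pinned
    // explicitly so the executors don't try to reach the driver on localhost.
    // "master-host" and "192.168.1.10" are placeholders.
    val conf = new SparkConf()
      .setAppName("cluster-test")
      .setMaster("spark://master-host:7077")
      .set("spark.driver.host", "192.168.1.10")

I think the same can be passed with --conf when launching spark-shell, and setting SPARK_LOCAL_IP in conf/spark-env.sh on the slave should keep its side from binding to localhost as well.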