I'm hitting an odd issue running Spark on Mesos together with
HA HDFS, and an even odder workaround.
In particular, I get an error that the HDFS nameservice can't be found
unless I first read a _broken_ URL (a workaround I discovered by
mistake!). core-site.xml and hdfs-site.xml are distributed to the slave
node, and those files are definitely being read: if I deliberately
break them, I get an error as you'd expect.
NB: this is slightly different to
http://mail-archives.us.apache.org/mod_mbox/spark-user/201402.mbox/%3c1392442185079-1549.p...@n3.nabble.com%3E
Spark 1.5.0:
t = sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
(fails with UnknownHostException: nameservice1)
t = sc.textFile("file://etc/passwd")
t.count()
(errors about a bad URL - it should have an extra /, of course)
t = sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
then it works!!!
I should add that using file:///etc/passwd or hdfs:///tmp/issue both
fail in the same way, unless preceded by a broken URL. I've tried
setting spark.hadoop.cloneConf to true - no change.
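As an aside, the reason file://etc/passwd is a broken URL: with only two
slashes, "etc" gets parsed as the URI authority (host) rather than as part
of the path, which is exactly what the "Wrong FS ... expected: file:///"
error below complains about. A quick stdlib illustration (not
Spark-specific):

```python
from urllib.parse import urlparse

# With "file://etc/passwd", "etc" becomes the authority (host) component,
# leaving only "/passwd" as the path. With the third slash the authority
# is empty and the full path survives.
bad = urlparse("file://etc/passwd")
good = urlparse("file:///etc/passwd")

print(bad.netloc, bad.path)    # etc /passwd
print(good.netloc, good.path)  #  /etc/passwd
```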
Sample (broken) run:
15/09/14 13:00:14 DEBUG HadoopRDD: Creating new JobConf and caching it
for later re-use
15/09/14 13:00:14 DEBUG : address: ip-10-1-200-165/10.1.200.165
isLoopbackAddress: false, with host 10.1.200.165 ip-10-1-200-165
15/09/14 13:00:14 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:00:14 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG HAUtil: No HA service delegation token found for
logical URI hdfs://nameservice1
15/09/14 13:00:14 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:00:14 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:00:14 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@6245f50b
15/09/14 13:00:14 DEBUG Client: getting client out of cache:
org.apache.hadoop.ipc.Client@267f0fd3
15/09/14 13:00:14 DEBUG NativeCodeLoader: Trying to load the
custom-built native-hadoop library...
15/09/14 13:00:14 DEBUG NativeCodeLoader: Loaded the native-hadoop library
...
15/09/14 13:00:14 DEBUG Client: Connecting to
mesos-1.example.com/10.1.200.165:8020
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having
connections 1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:00:14 DEBUG ProtobufRpcEngine: Call: getFileInfo took 36ms
15/09/14 13:00:14 DEBUG FileInputFormat: Time taken to get FileStatuses: 69
15/09/14 13:00:14 INFO FileInputFormat: Total input paths to process : 1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #1
15/09/14 13:00:14 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 1ms
15/09/14 13:00:14 DEBUG FileInputFormat: Total # of splits generated by
getSplits: 2, TimeTaken: 104
...
15/09/14 13:00:24 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu: closed
15/09/14 13:00:24 DEBUG Client: IPC Client (1739425103) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu: stopped, remaining
connections 0
15/09/14 13:00:24 DEBUG
AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received
message
AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true)
from Actor[akka://sparkDriver/temp/$g]
15/09/14 13:00:24 DEBUG
AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC
message:
AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true)
15/09/14 13:00:24 DEBUG
AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled
message (0.513851 ms)
AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true)
from Actor[akka://sparkDriver/temp/$g]
15/09/14 13:00:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID
0, 10.1.200.245): java.lang.IllegalArgumentException:
java.net.UnknownHostException: nameservice1
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at
org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at
org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
... 32 more
Sample working run:
15/09/14 13:00:43 DEBUG HadoopRDD: Creating new JobConf and caching it
for later re-use
15/09/14 13:00:43 DEBUG : address: ip-10-1-200-165/10.1.200.165
isLoopbackAddress: false, with host 10.1.200.165 ip-10-1-200-165
15/09/14 13:00:43 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:00:43 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:00:43 DEBUG HAUtil: No HA service delegation token found for
logical URI hdfs://nameservice1
15/09/14 13:00:43 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:00:43 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:00:43 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:00:43 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@114b3357
15/09/14 13:00:43 DEBUG Client: getting client out of cache:
org.apache.hadoop.ipc.Client@28a248cd
15/09/14 13:00:44 DEBUG NativeCodeLoader: Trying to load the
custom-built native-hadoop library...
15/09/14 13:00:44 DEBUG NativeCodeLoader: Loaded the native-hadoop library
15/09/14 13:00:44 DEBUG DomainSocketWatcher:
org.apache.hadoop.net.unix.DomainSocketWatcher$2@3962387d: starting with
interruptCheckPeriodMs = 60000
15/09/14 13:00:44 DEBUG PerformanceAdvisory: Both short-circuit local
reads and UNIX domain socket are disabled.
15/09/14 13:00:44 DEBUG DataTransferSaslUtil: DataTransferProtocol not
using SaslPropertiesResolver, no QOP found in configuration for
dfs.data.transfer.protection
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 1006, in count
return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 997, in sum
return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 871, in fold
vals = self.mapPartitions(func).collect()
File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 773, in collect
port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File
"/home/ubuntu/spark15/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
line 538, in __call__
File "/home/ubuntu/spark15/python/pyspark/sql/utils.py", line 42, in deco
raise IllegalArgumentException(s.split(': ', 1)[1])
pyspark.sql.utils.IllegalArgumentException: Wrong FS: file://etc/passwd,
expected: file:///
...
15/09/14 13:00:51 DEBUG HadoopRDD: Creating new JobConf and caching it
for later re-use
15/09/14 13:00:51 DEBUG Client: The ping interval is 60000 ms.
15/09/14 13:00:51 DEBUG Client: Connecting to
mesos-1.example.com/10.1.200.165:8020
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having
connections 1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:00:51 DEBUG ProtobufRpcEngine: Call: getFileInfo took 32ms
15/09/14 13:00:51 DEBUG FileInputFormat: Time taken to get FileStatuses: 64
15/09/14 13:00:51 INFO FileInputFormat: Total input paths to process : 1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #1
15/09/14 13:00:51 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 2ms
15/09/14 13:00:51 DEBUG FileInputFormat: Total # of splits generated by
getSplits: 2, TimeTaken: 95
2
(the answer!)
The Mesos slave logs are very slightly different (apologies - these
were from a different run). Notice that dfs.domain.socket.path is blank
(or cut off by the exception?) in the broken run.
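For what it's worth, nameservice1 is a logical HA name rather than a real
host, so UnknownHostException on the executor suggests the HA mapping keys
from hdfs-site.xml weren't visible when the DFSClient was created. Here's a
rough stdlib check I can run on a slave to see what the file actually
provides (the file path and nameservice name are assumptions for my setup):

```python
import xml.etree.ElementTree as ET

def ha_keys(hdfs_site_path, nameservice="nameservice1"):
    """Pull out the properties the HDFS client needs to resolve a
    logical URI like hdfs://nameservice1/ instead of treating it
    as a hostname."""
    props = {p.findtext("name"): p.findtext("value")
             for p in ET.parse(hdfs_site_path).getroot().iter("property")}
    wanted = ("dfs.nameservices",
              "dfs.ha.namenodes." + nameservice,
              "dfs.client.failover.proxy.provider." + nameservice)
    # Any None here means the loaded config can't resolve the nameservice.
    return {k: props.get(k) for k in wanted}

# e.g. ha_keys("/etc/hadoop/conf/hdfs-site.xml")
```

Of course this only shows what's on disk, not what config object the
executor actually ended up with, which seems to be the real question here.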
Broken:
15/09/14 13:48:30 DEBUG HadoopRDD: Cloning Hadoop Configuration
15/09/14 13:48:30 DEBUG : address: ip-10-1-200-245/10.1.200.245
isLoopbackAddress: false, with host 10.1.200.245 ip-10-1-200-245
15/09/14 13:48:30 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:48:30 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.domain.socket.path =
15/09/14 13:48:30 ERROR PythonRDD: Python worker exited unexpectedly
(crashed)
org.apache.spark.api.python.PythonException: Traceback (most recent call
last):
File
"/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/worker.py",
line 98, in main
command = pickleSer._read_with_length(infile)
File
"/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/serializers.py",
line 156, in _read_with_length
length = read_int(stream)
File
"/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/serializers.py",
line 544, in read_int
raise EOFError
EOFError
at
org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:138)
at
org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:179)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:97)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException:
java.net.UnknownHostException: nameservice1
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at
org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:157)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at
org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
... 32 more
15/09/14 13:48:30 ERROR PythonRDD: This may have been caused by a prior
exception:
java.lang.IllegalArgumentException: java.net.UnknownHostException:
nameservice1
at
org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
at
org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at
org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
at
org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
at
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
at
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:157)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at
org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at
org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
... 32 more
Working:
15/09/14 13:47:17 DEBUG HadoopRDD: Cloning Hadoop Configuration
15/09/14 13:47:17 DEBUG : address: ip-10-1-200-245/10.1.200.245
isLoopbackAddress: false, with host 10.1.200.245 ip-10-1-200-245
15/09/14 13:47:17 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:47:17 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:47:17 DEBUG HAUtil: No HA service delegation token found for
logical URI hdfs://nameservice1
15/09/14 13:47:17 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit =
false
15/09/14 13:47:17 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path =
/var/run/hdfs-sockets/dn
15/09/14 13:47:17 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:47:17 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@30b68416
15/09/14 13:47:17 DEBUG Client: getting client out of cache:
org.apache.hadoop.ipc.Client@4599b420
15/09/14 13:47:18 DEBUG NativeCodeLoader: Trying to load the
custom-built native-hadoop library...
15/09/14 13:47:18 DEBUG NativeCodeLoader: Loaded the native-hadoop library
15/09/14 13:47:18 DEBUG DomainSocketWatcher:
org.apache.hadoop.net.unix.DomainSocketWatcher$2@4ed189cf: starting with
interruptCheckPeriodMs = 60000
15/09/14 13:47:18 DEBUG PerformanceAdvisory: Both short-circuit local
reads and UNIX domain socket are disabled.
15/09/14 13:47:18 DEBUG DataTransferSaslUtil: DataTransferProtocol not
using SaslPropertiesResolver, no QOP found in configuration for
dfs.data.transfer.protection
15/09/14 13:47:18 INFO deprecation: mapred.tip.id is deprecated.
Instead, use mapreduce.task.id
15/09/14 13:47:18 INFO deprecation: mapred.task.id is deprecated.
Instead, use mapreduce.task.attempt.id
15/09/14 13:47:18 INFO deprecation: mapred.task.is.map is deprecated.
Instead, use mapreduce.task.ismap
15/09/14 13:47:18 INFO deprecation: mapred.task.partition is deprecated.
Instead, use mapreduce.task.partition
15/09/14 13:47:18 INFO deprecation: mapred.job.id is deprecated.
Instead, use mapreduce.job.id
15/09/14 13:47:18 DEBUG Client: The ping interval is 60000 ms.
15/09/14 13:47:18 DEBUG Client: Connecting to
mesos-1.example.com/10.1.200.165:8020
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having
connections 1
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to
mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:47:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 28ms
15/09/14 13:47:18 DEBUG DFSClient: newInfo = LocatedBlocks{
--
*Adrian Bridgett* | Sysadmin Engineer, OpenSignal
<http://www.opensignal.com>
_____________________________________________________
Office: First Floor, Scriptor Court, 155-157 Farringdon Road,
Clerkenwell, London, EC1R 3AD
Phone #: +44 777-377-8251
Skype: abridgett |@adrianbridgett <http://twitter.com/adrianbridgett>|
LinkedIn link <https://uk.linkedin.com/in/abridgett>
_____________________________________________________