I'm hitting an odd issue running Spark on Mesos together with HA-HDFS, and I've found an even odder workaround.

In particular, I get an error that it can't find the HDFS nameservice unless I first put in a _broken_ URL (I discovered that workaround by mistake!). core-site.xml and hdfs-site.xml are distributed to the slave node, and those files are definitely being read: if I deliberately break one of them I get an error, as you'd expect.
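
For reference, a quick driver-side check of the Hadoop Configuration (just a sketch, using the standard HDFS HA property names; it only covers the driver, not the executors where the task actually fails) looks like this in pyspark:

hconf = sc._jsc.hadoopConfiguration()   # the driver JVM's Hadoop Configuration
for key in ("fs.defaultFS",
            "dfs.nameservices",
            "dfs.ha.namenodes.nameservice1",
            "dfs.client.failover.proxy.provider.nameservice1"):
    # if any of these come back None, the client has no way to resolve hdfs://nameservice1
    print("%s = %s" % (key, hconf.get(key)))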

NB: This is a bit different to http://mail-archives.us.apache.org/mod_mbox/spark-user/201402.mbox/%3c1392442185079-1549.p...@n3.nabble.com%3E


Spark 1.5.0:

t=sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
(fails)

t=sc.textFile("file://etc/passwd")
t.count()
(errors about a bad URL - it should have an extra /, of course)
t=sc.textFile("hdfs://nameservice1/tmp/issue")
t.count()
then it works!!!

I should say that file:///etc/passwd and hdfs:///tmp/issue both fail as well, unless preceded by a broken URL. I've tried setting spark.hadoop.cloneConf to true - no change.
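
To make the kludge explicit, here is the whole accidental workaround in one snippet (purely a sketch; the helper name is just for illustration):

def hdfs_read_with_kludge(sc, path):
    # Deliberately hit a malformed file:// URL first (missing the third slash)
    # and swallow the expected "Wrong FS" error, then retry the real path.
    # This is the exact sequence that makes the HDFS read succeed above.
    try:
        sc.textFile("file://etc/passwd").count()
    except Exception:
        pass
    return sc.textFile(path)

t = hdfs_read_with_kludge(sc, "hdfs://nameservice1/tmp/issue")
t.count()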

Sample (broken) run:
15/09/14 13:00:14 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
15/09/14 13:00:14 DEBUG : address: ip-10-1-200-165/10.1.200.165 isLoopbackAddress: false, with host 10.1.200.165 ip-10-1-200-165
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG HAUtil: No HA service delegation token found for logical URI hdfs://nameservice1
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:14 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:14 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:00:14 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@6245f50b
15/09/14 13:00:14 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@267f0fd3
15/09/14 13:00:14 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/09/14 13:00:14 DEBUG NativeCodeLoader: Loaded the native-hadoop library
...
15/09/14 13:00:14 DEBUG Client: Connecting to mesos-1.example.com/10.1.200.165:8020
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having connections 1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:00:14 DEBUG ProtobufRpcEngine: Call: getFileInfo took 36ms
15/09/14 13:00:14 DEBUG FileInputFormat: Time taken to get FileStatuses: 69
15/09/14 13:00:14 INFO FileInputFormat: Total input paths to process : 1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #1
15/09/14 13:00:14 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #1
15/09/14 13:00:14 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 1ms
15/09/14 13:00:14 DEBUG FileInputFormat: Total # of splits generated by getSplits: 2, TimeTaken: 104
...
15/09/14 13:00:24 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: closed
15/09/14 13:00:24 DEBUG Client: IPC Client (1739425103) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: stopped, remaining connections 0
15/09/14 13:00:24 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] received message AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true) from Actor[akka://sparkDriver/temp/$g]
15/09/14 13:00:24 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: Received RPC message: AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true)
15/09/14 13:00:24 DEBUG AkkaRpcEnv$$anonfun$actorRef$lzycompute$1$1$$anon$1: [actor] handled message (0.513851 ms) AkkaMessage(ExecutorRemoved(20150826-133446-3217621258-5050-4064-S1),true) from Actor[akka://sparkDriver/temp/$g]
15/09/14 13:00:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 10.1.200.245): java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
    at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
    ... 32 more


Sample working run:
15/09/14 13:00:43 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
15/09/14 13:00:43 DEBUG : address: ip-10-1-200-165/10.1.200.165 isLoopbackAddress: false, with host 10.1.200.165 ip-10-1-200-165
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:43 DEBUG HAUtil: No HA service delegation token found for logical URI hdfs://nameservice1
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:00:43 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:00:43 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:00:43 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@114b3357
15/09/14 13:00:43 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@28a248cd
15/09/14 13:00:44 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/09/14 13:00:44 DEBUG NativeCodeLoader: Loaded the native-hadoop library
15/09/14 13:00:44 DEBUG DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@3962387d: starting with interruptCheckPeriodMs = 60000
15/09/14 13:00:44 DEBUG PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
15/09/14 13:00:44 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 1006, in count
    return self.mapPartitions(lambda i: [sum(1 for _ in i)]).sum()
  File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 997, in sum
    return self.mapPartitions(lambda x: [sum(x)]).fold(0, operator.add)
  File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 871, in fold
    vals = self.mapPartitions(func).collect()
  File "/home/ubuntu/spark15/python/pyspark/rdd.py", line 773, in collect
    port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "/home/ubuntu/spark15/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/home/ubuntu/spark15/python/pyspark/sql/utils.py", line 42, in deco
    raise IllegalArgumentException(s.split(': ', 1)[1])
pyspark.sql.utils.IllegalArgumentException: Wrong FS: file://etc/passwd, expected: file:///
...
15/09/14 13:00:51 DEBUG HadoopRDD: Creating new JobConf and caching it for later re-use
15/09/14 13:00:51 DEBUG Client: The ping interval is 60000 ms.
15/09/14 13:00:51 DEBUG Client: Connecting to mesos-1.example.com/10.1.200.165:8020
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having connections 1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:00:51 DEBUG ProtobufRpcEngine: Call: getFileInfo took 32ms
15/09/14 13:00:51 DEBUG FileInputFormat: Time taken to get FileStatuses: 64
15/09/14 13:00:51 INFO FileInputFormat: Total input paths to process : 1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #1
15/09/14 13:00:51 DEBUG Client: IPC Client (24266793) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #1
15/09/14 13:00:51 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 2ms
15/09/14 13:00:51 DEBUG FileInputFormat: Total # of splits generated by getSplits: 2, TimeTaken: 95
2
(the answer!)


The Mesos executor logs are very slightly different (apologies - these are from a different run). Notice that dfs.domain.socket.path is blank (or cut off by the exception?) in the broken run.
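
In case it's relevant, this is the sort of probe one could run to see what the executors themselves can read - purely a sketch, and the config path is a guess at wherever HADOOP_CONF_DIR points on the slaves:

def probe(_):
    # runs inside the Python worker on each Mesos slave
    import os, socket
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "/etc/hadoop/conf")   # assumption
    f = os.path.join(conf_dir, "hdfs-site.xml")
    txt = open(f).read() if os.path.exists(f) else ""
    yield (socket.gethostname(), f, "nameservice1" in txt)

sc.parallelize(range(2), 2).mapPartitions(probe).collect()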

Broken:
15/09/14 13:48:30 DEBUG HadoopRDD: Cloning Hadoop Configuration
15/09/14 13:48:30 DEBUG : address: ip-10-1-200-245/10.1.200.245 isLoopbackAddress: false, with host 10.1.200.245 ip-10-1-200-245
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:48:30 DEBUG BlockReaderLocal: dfs.domain.socket.path =
15/09/14 13:48:30 ERROR PythonRDD: Python worker exited unexpectedly (crashed)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/worker.py", line 98, in main
    command = pickleSer._read_with_length(infile)
  File "/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/serializers.py", line 156, in _read_with_length
    length = read_int(stream)
  File "/tmp/mesos/slaves/20150826-133446-3217621258-5050-4064-S2/frameworks/20150826-133446-3217621258-5050-4064-216556/executors/20150826-133446-3217621258-5050-4064-S2/runs/b31501ae-22d0-47dd-b4b6-2fb17717e1f8/spark15/python/lib/pyspark.zip/pyspark/serializers.py", line 544, in read_int
    raise EOFError
EOFError

    at org.apache.spark.api.python.PythonRDD$$anon$1.read(PythonRDD.scala:138)
    at org.apache.spark.api.python.PythonRDD$$anon$1.<init>(PythonRDD.scala:179)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:97)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:157)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
    at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
    ... 32 more
15/09/14 13:48:30 ERROR PythonRDD: This may have been caused by a prior exception:
java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:377)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:92)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2687)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2669)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:371)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:170)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:438)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:411)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$2.apply(HadoopRDD.scala:157)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:157)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
    at org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:249)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
    at org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:208)
Caused by: java.net.UnknownHostException: nameservice1
    ... 32 more

Working:
15/09/14 13:47:17 DEBUG HadoopRDD: Cloning Hadoop Configuration
15/09/14 13:47:17 DEBUG : address: ip-10-1-200-245/10.1.200.245 isLoopbackAddress: false, with host 10.1.200.245 ip-10-1-200-245
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:47:17 DEBUG HAUtil: No HA service delegation token found for logical URI hdfs://nameservice1
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.use.legacy.blockreader.local = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.read.shortcircuit = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.client.domain.socket.data.traffic = false
15/09/14 13:47:17 DEBUG BlockReaderLocal: dfs.domain.socket.path = /var/run/hdfs-sockets/dn
15/09/14 13:47:17 DEBUG RetryUtils: multipleLinearRandomRetry = null
15/09/14 13:47:17 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@30b68416
15/09/14 13:47:17 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@4599b420
15/09/14 13:47:18 DEBUG NativeCodeLoader: Trying to load the custom-built native-hadoop library...
15/09/14 13:47:18 DEBUG NativeCodeLoader: Loaded the native-hadoop library
15/09/14 13:47:18 DEBUG DomainSocketWatcher: org.apache.hadoop.net.unix.DomainSocketWatcher$2@4ed189cf: starting with interruptCheckPeriodMs = 60000
15/09/14 13:47:18 DEBUG PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
15/09/14 13:47:18 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
15/09/14 13:47:18 INFO deprecation: mapred.tip.id is deprecated. Instead, use mapreduce.task.id
15/09/14 13:47:18 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
15/09/14 13:47:18 INFO deprecation: mapred.task.is.map is deprecated. Instead, use mapreduce.task.ismap
15/09/14 13:47:18 INFO deprecation: mapred.task.partition is deprecated. Instead, use mapreduce.task.partition
15/09/14 13:47:18 INFO deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
15/09/14 13:47:18 DEBUG Client: The ping interval is 60000 ms.
15/09/14 13:47:18 DEBUG Client: Connecting to mesos-1.example.com/10.1.200.165:8020
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu: starting, having connections 1
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu sending #0
15/09/14 13:47:18 DEBUG Client: IPC Client (2055067800) connection to mesos-1.example.com/10.1.200.165:8020 from ubuntu got value #0
15/09/14 13:47:18 DEBUG ProtobufRpcEngine: Call: getBlockLocations took 28ms
15/09/14 13:47:18 DEBUG DFSClient: newInfo = LocatedBlocks{


--
*Adrian Bridgett* | Sysadmin Engineer, OpenSignal <http://www.opensignal.com>
_____________________________________________________
Office: First Floor, Scriptor Court, 155-157 Farringdon Road, Clerkenwell, London, EC1R 3AD
Phone #: +44 777-377-8251
Skype: abridgett | @adrianbridgett <http://twitter.com/adrianbridgett> | LinkedIn <https://uk.linkedin.com/in/abridgett>
_____________________________________________________
