That's strange; for some reason your Hadoop configuration is not being picked
up by Spark.
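
If the executors can't see your hdfs-site.xml (which is what defines the
"affinio" nameservice), one thing worth trying as a test is setting the HA
client properties directly on the Hadoop configuration in your app, before the
textFile call. A rough sketch only, assuming "affinio" is an HA nameservice --
the nn1/nn2 hostnames below are placeholders, copy the real values from your
hdfs-site.xml:

// Rough sketch, not a drop-in fix: mirror the HA settings from hdfs-site.xml.
// The namenode hostnames below are placeholders.
val hconf = sc.hadoopConfiguration
hconf.set("fs.defaultFS", "hdfs://affinio")
hconf.set("dfs.nameservices", "affinio")
hconf.set("dfs.ha.namenodes.affinio", "nn1,nn2")
hconf.set("dfs.namenode.rpc-address.affinio.nn1", "namenode1.example.com:8020")
hconf.set("dfs.namenode.rpc-address.affinio.nn2", "namenode2.example.com:8020")
hconf.set("dfs.client.failover.proxy.provider.affinio",
  "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider")
sc.textFile("hdfs://affinio/tmp/Input").collect().foreach(println)

If that runs, it confirms the executors are simply missing your Hadoop config.
It is also worth checking whether the Mesos executors have HADOOP_CONF_DIR set
in their environment at all (e.g. via spark.executorEnv.HADOOP_CONF_DIR in
spark-defaults.conf).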

Thanks
Best Regards

On Wed, Sep 30, 2015 at 9:11 PM, Stephen Hankinson <step...@affinio.com>
wrote:

> When I use hdfs://affinio/tmp/Input, it gives the same UnknownHostException:
> affinio error.
>
> However, from the command line I can run hdfs dfs -ls /tmp/Input or hdfs
> dfs -ls hdfs://affinio/tmp/Input and they work correctly.
>
> See more details here:
> http://stackoverflow.com/questions/32833860/unknownhostexception-with-mesos-spark-and-custom-jar
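>
> As a quick sanity check of what configuration the executors themselves see,
> something like this (rough sketch) prints fs.defaultFS from inside a task:
>
> // Rough diagnostic: prints fs.defaultFS as each executor's Configuration sees it.
> sc.parallelize(1 to 3, 3).map { _ =>
>   new org.apache.hadoop.conf.Configuration().get("fs.defaultFS")
> }.collect().foreach(println)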
>
> Stephen Hankinson, P. Eng.
> CTO
> Affinio Inc.
> 301 - 211 Horseshoe Lake Dr.
> Halifax, Nova Scotia, Canada
> B3S 0B9
>
> http://www.affinio.com
>
> On Wed, Sep 30, 2015 at 4:21 AM, Akhil Das <ak...@sigmoidanalytics.com>
> wrote:
>
>> Can you try replacing the path in your code with the full HDFS URI? Like:
>>
>> sc.textFile("hdfs://...").collect().foreach(println)
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Sep 29, 2015 at 1:45 AM, Stephen Hankinson <step...@affinio.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Wondering if anyone can help me with the issue I am having.
>>>
>>> I am receiving an UnknownHostException when running a custom jar with
>>> Spark on Mesos. The issue does not happen when running spark-shell.
>>>
>>> My spark-env.sh contains the following:
>>>
>>> export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
>>>
>>> export HADOOP_CONF_DIR=/hadoop-2.7.1/etc/hadoop/
>>>
>>> My spark-defaults.conf contains the following:
>>>
>>> spark.master                       mesos://zk://172.31.0.81:2181,172.31.16.81:2181,172.31.32.81:2181/mesos
>>>
>>> spark.mesos.executor.home          /spark-1.5.0-bin-hadoop2.6/
>>>
>>> Starting spark-shell as shown below and running this line works correctly:
>>>
>>> /spark-1.5.0-bin-hadoop2.6/bin/spark-shell
>>>
>>> sc.textFile("/tmp/Input").collect.foreach(println)
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: ensureFreeSpace(88528)
>>> called with curMem=0, maxMem=556038881
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: Block broadcast_0 stored as
>>> values in memory (estimated size 86.5 KB, free 530.2 MB)
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: ensureFreeSpace(20236)
>>> called with curMem=88528, maxMem=556038881
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: Block broadcast_0_piece0
>>> stored as bytes in memory (estimated size 19.8 KB, free 530.2 MB)
>>>
>>> 15/09/28 20:04:49 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on 172.31.21.104:49048 (size: 19.8 KB,
>>> free: 530.3 MB)
>>>
>>> 15/09/28 20:04:49 INFO spark.SparkContext: Created broadcast 0 from
>>> textFile at <console>:22
>>>
>>> 15/09/28 20:04:49 INFO mapred.FileInputFormat: Total input paths to
>>> process : 1
>>>
>>> 15/09/28 20:04:49 INFO spark.SparkContext: Starting job: collect at
>>> <console>:22
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Got job 0 (collect at
>>> <console>:22) with 3 output partitions
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Final stage: ResultStage
>>> 0(collect at <console>:22)
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Parents of final stage:
>>> List()
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Missing parents: List()
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Submitting ResultStage 0
>>> (MapPartitionsRDD[1] at textFile at <console>:22), which has no missing
>>> parents
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: ensureFreeSpace(3120) called
>>> with curMem=108764, maxMem=556038881
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: Block broadcast_1 stored as
>>> values in memory (estimated size 3.0 KB, free 530.2 MB)
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: ensureFreeSpace(1784) called
>>> with curMem=111884, maxMem=556038881
>>>
>>> 15/09/28 20:04:49 INFO storage.MemoryStore: Block broadcast_1_piece0
>>> stored as bytes in memory (estimated size 1784.0 B, free 530.2 MB)
>>>
>>> 15/09/28 20:04:49 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on 172.31.21.104:49048 (size: 1784.0 B,
>>> free: 530.3 MB)
>>>
>>> 15/09/28 20:04:49 INFO spark.SparkContext: Created broadcast 1 from
>>> broadcast at DAGScheduler.scala:861
>>>
>>> 15/09/28 20:04:49 INFO scheduler.DAGScheduler: Submitting 3 missing
>>> tasks from ResultStage 0 (MapPartitionsRDD[1] at textFile at <console>:22)
>>>
>>> 15/09/28 20:04:49 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0
>>> with 3 tasks
>>>
>>> 15/09/28 20:04:49 INFO scheduler.TaskSetManager: Starting task 0.0 in
>>> stage 0.0 (TID 0, ip-172-31-37-82.us-west-2.compute.internal, NODE_LOCAL,
>>> 2142 bytes)
>>>
>>> 15/09/28 20:04:49 INFO scheduler.TaskSetManager: Starting task 1.0 in
>>> stage 0.0 (TID 1, ip-172-31-21-104.us-west-2.compute.internal, NODE_LOCAL,
>>> 2142 bytes)
>>>
>>> 15/09/28 20:04:49 INFO scheduler.TaskSetManager: Starting task 2.0 in
>>> stage 0.0 (TID 2, ip-172-31-4-4.us-west-2.compute.internal, NODE_LOCAL,
>>> 2142 bytes)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-4-4.us-west-2.compute.internal:50648 with 530.3 MB
>>> RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S2,
>>> ip-172-31-4-4.us-west-2.compute.internal, 50648)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-37-82.us-west-2.compute.internal:52624 with 530.3
>>> MB RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S1,
>>> ip-172-31-37-82.us-west-2.compute.internal, 52624)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-21-104.us-west-2.compute.internal:56628 with 530.3
>>> MB RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S0,
>>> ip-172-31-21-104.us-west-2.compute.internal, 56628)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-37-82.us-west-2.compute.internal:52624 (size: 1784.0 B, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-21-104.us-west-2.compute.internal:56628 (size: 1784.0 B, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-4-4.us-west-2.compute.internal:50648 (size: 1784.0 B, free: 530.3
>>> MB)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-37-82.us-west-2.compute.internal:52624 (size: 19.8 KB, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-21-104.us-west-2.compute.internal:56628 (size: 19.8 KB, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:04:52 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-4-4.us-west-2.compute.internal:50648 (size: 19.8 KB, free: 530.3
>>> MB)
>>>
>>> 15/09/28 20:04:53 INFO scheduler.TaskSetManager: Finished task 0.0 in
>>> stage 0.0 (TID 0) in 3907 ms on ip-172-31-37-82.us-west-2.compute.internal
>>> (1/3)
>>>
>>> 15/09/28 20:04:53 INFO scheduler.TaskSetManager: Finished task 2.0 in
>>> stage 0.0 (TID 2) in 3884 ms on ip-172-31-4-4.us-west-2.compute.internal
>>> (2/3)
>>>
>>> 15/09/28 20:04:53 INFO scheduler.TaskSetManager: Finished task 1.0 in
>>> stage 0.0 (TID 1) in 3907 ms on ip-172-31-21-104.us-west-2.compute.internal
>>> (3/3)
>>>
>>> 15/09/28 20:04:53 INFO scheduler.DAGScheduler: ResultStage 0 (collect at
>>> <console>:22) finished in 3.940 s
>>>
>>> 15/09/28 20:04:53 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0,
>>> whose tasks have all completed, from pool
>>>
>>> 15/09/28 20:04:53 INFO scheduler.DAGScheduler: Job 0 finished: collect
>>> at <console>:22, took 4.019454 s
>>>
>>> pepsi
>>>
>>> cocacola
>>>
>>> The following sample code, compiled into a jar, fails.
>>>
>>> Sample code:
>>>
>>> import org.apache.spark.SparkContext
>>>
>>> import org.apache.spark.SparkContext._
>>>
>>> import org.apache.spark.SparkConf
>>>
>>> object SimpleApp {
>>>
>>>   def main(args: Array[String]) {
>>>
>>>     val conf = new SparkConf().setAppName("Simple Application")
>>>
>>>     val sc = new SparkContext(conf)
>>>
>>>     sc.textFile("/tmp/Input").collect.foreach(println)
>>>
>>>   }
>>>
>>> }
>>>
>>>
>>> /spark-1.5.0-bin-hadoop2.6/bin/spark-submit --class "SimpleApp"
>>> /home/hdfs/test_2.10-0.1.jar
>>>
>>> 15/09/28 20:07:38 INFO spark.SparkContext: Running Spark version 1.5.0
>>>
>>> 15/09/28 20:07:39 WARN util.NativeCodeLoader: Unable to load
>>> native-hadoop library for your platform... using builtin-java classes where
>>> applicable
>>>
>>> 15/09/28 20:07:39 INFO spark.SecurityManager: Changing view acls to: hdfs
>>>
>>> 15/09/28 20:07:39 INFO spark.SecurityManager: Changing modify acls to:
>>> hdfs
>>>
>>> 15/09/28 20:07:39 INFO spark.SecurityManager: SecurityManager:
>>> authentication disabled; ui acls disabled; users with view permissions:
>>> Set(hdfs); users with modify permissions: Set(hdfs)
>>>
>>> 15/09/28 20:07:40 INFO slf4j.Slf4jLogger: Slf4jLogger started
>>>
>>> 15/09/28 20:07:40 INFO Remoting: Starting remoting
>>>
>>> 15/09/28 20:07:40 INFO Remoting: Remoting started; listening on
>>> addresses :[akka.tcp://sparkDriver@172.31.21.104:39262]
>>>
>>> 15/09/28 20:07:40 INFO util.Utils: Successfully started service
>>> 'sparkDriver' on port 39262.
>>>
>>> 15/09/28 20:07:40 INFO spark.SparkEnv: Registering MapOutputTracker
>>>
>>> 15/09/28 20:07:40 INFO spark.SparkEnv: Registering BlockManagerMaster
>>>
>>> 15/09/28 20:07:40 INFO storage.DiskBlockManager: Created local directory
>>> at /tmp/blockmgr-236f6d4d-22fd-4f4d-9457-369cc846d790
>>>
>>> 15/09/28 20:07:40 INFO storage.MemoryStore: MemoryStore started with
>>> capacity 530.3 MB
>>>
>>> 15/09/28 20:07:40 INFO spark.HttpFileServer: HTTP File server directory
>>> is
>>> /tmp/spark-e0c1b94b-e901-4f2f-8d0d-2a16b23acda7/httpd-c8bba444-d177-4e85-8014-93a356d74241
>>>
>>> 15/09/28 20:07:40 INFO spark.HttpServer: Starting HTTP Server
>>>
>>> 15/09/28 20:07:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
>>>
>>> 15/09/28 20:07:40 INFO server.AbstractConnector: Started
>>> SocketConnector@0.0.0.0:33504
>>>
>>> 15/09/28 20:07:40 INFO util.Utils: Successfully started service 'HTTP
>>> file server' on port 33504.
>>>
>>> 15/09/28 20:07:40 INFO spark.SparkEnv: Registering
>>> OutputCommitCoordinator
>>>
>>> 15/09/28 20:07:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
>>>
>>> 15/09/28 20:07:40 INFO server.AbstractConnector: Started
>>> SelectChannelConnector@0.0.0.0:4040
>>>
>>> 15/09/28 20:07:40 INFO util.Utils: Successfully started service
>>> 'SparkUI' on port 4040.
>>>
>>> 15/09/28 20:07:40 INFO ui.SparkUI: Started SparkUI at
>>> http://172.31.21.104:4040
>>>
>>> 15/09/28 20:07:40 INFO spark.SparkContext: Added JAR
>>> file:/home/hdfs/test_2.10-0.1.jar at
>>> http://172.31.21.104:33504/jars/test_2.10-0.1.jar with timestamp
>>> 1443470860701
>>>
>>> 15/09/28 20:07:40 WARN metrics.MetricsSystem: Using default name
>>> DAGScheduler for source because spark.app.id is not set.
>>>
>>> I0928 20:07:40.925557  6586 sched.cpp:137] Version: 0.21.0
>>>
>>> 2015-09-28 20:07:40,925:6491(0x7f3dbfc33700):ZOO_INFO@log_env@712:
>>> Client environment:zookeeper.version=zookeeper C client 3.4.5
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@716:
>>> Client environment:host.name=ip-172-31-21-104
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@723:
>>> Client environment:os.name=Linux
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@724:
>>> Client environment:os.arch=4.1.7-15.23.amzn1.x86_64
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@725:
>>> Client environment:os.version=#1 SMP Mon Sep 14 23:20:33 UTC 2015
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@733:
>>> Client environment:user.name=ec2-user
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@741:
>>> Client environment:user.home=/home/hdfs
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@log_env@753:
>>> Client environment:user.dir=/home/hdfs
>>>
>>> 2015-09-28 20:07:40,926:6491(0x7f3dbfc33700):ZOO_INFO@zookeeper_init@786:
>>> Initiating client connection, host=172.31.0.81:2181,172.31.16.81:2181,
>>> 172.31.32.81:2181 sessionTimeout=10000 watcher=0x7f3dc66f153a
>>> sessionId=0 sessionPasswd=<null> context=0x7f3e14002a20 flags=0
>>>
>>> 2015-09-28 20:07:40,927:6491(0x7f3dbdb2e700):ZOO_INFO@check_events@1703:
>>> initiated connection to server [172.31.32.81:2181]
>>>
>>> 2015-09-28 20:07:40,932:6491(0x7f3dbdb2e700):ZOO_INFO@check_events@1750:
>>> session establishment complete on server [172.31.32.81:2181],
>>> sessionId=0x2501545b3da0006, negotiated timeout=10000
>>>
>>> I0928 20:07:40.932901  6579 group.cpp:313] Group process (group(1)@
>>> 172.31.21.104:37721) connected to ZooKeeper
>>>
>>> I0928 20:07:40.932950  6579 group.cpp:790] Syncing group operations:
>>> queue size (joins, cancels, datas) = (0, 0, 0)
>>>
>>> I0928 20:07:40.932970  6579 group.cpp:385] Trying to create path
>>> '/mesos' in ZooKeeper
>>>
>>> I0928 20:07:40.945708  6578 detector.cpp:138] Detected a new leader:
>>> (id='0')
>>>
>>> I0928 20:07:40.946292  6578 group.cpp:659] Trying to get
>>> '/mesos/info_0000000000' in ZooKeeper
>>>
>>> I0928 20:07:40.948209  6577 detector.cpp:433] A new leading master (UPID=
>>> master@172.31.0.81:5050) is detected
>>>
>>> I0928 20:07:40.948470  6578 sched.cpp:234] New master detected at
>>> master@172.31.0.81:5050
>>>
>>> I0928 20:07:40.948700  6578 sched.cpp:242] No credentials provided.
>>> Attempting to register without authentication
>>>
>>> I0928 20:07:40.953208  6578 sched.cpp:408] Framework registered with
>>> 20150928-190245-1358962604-5050-11297-0009
>>>
>>> 15/09/28 20:07:40 INFO mesos.MesosSchedulerBackend: Registered as
>>> framework ID 20150928-190245-1358962604-5050-11297-0009
>>>
>>> 15/09/28 20:07:41 INFO util.Utils: Successfully started service
>>> 'org.apache.spark.network.netty.NettyBlockTransferService' on port 48820.
>>>
>>> 15/09/28 20:07:41 INFO netty.NettyBlockTransferService: Server created
>>> on 48820
>>>
>>> 15/09/28 20:07:41 INFO storage.BlockManagerMaster: Trying to register
>>> BlockManager
>>>
>>> 15/09/28 20:07:41 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager 172.31.21.104:48820 with 530.3 MB RAM,
>>> BlockManagerId(driver, 172.31.21.104, 48820)
>>>
>>> 15/09/28 20:07:41 INFO storage.BlockManagerMaster: Registered
>>> BlockManager
>>>
>>> 15/09/28 20:07:41 INFO storage.MemoryStore: ensureFreeSpace(157320)
>>> called with curMem=0, maxMem=556038881
>>>
>>> 15/09/28 20:07:41 INFO storage.MemoryStore: Block broadcast_0 stored as
>>> values in memory (estimated size 153.6 KB, free 530.1 MB)
>>>
>>> 15/09/28 20:07:41 INFO storage.MemoryStore: ensureFreeSpace(14306)
>>> called with curMem=157320, maxMem=556038881
>>>
>>> 15/09/28 20:07:41 INFO storage.MemoryStore: Block broadcast_0_piece0
>>> stored as bytes in memory (estimated size 14.0 KB, free 530.1 MB)
>>>
>>> 15/09/28 20:07:41 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on 172.31.21.104:48820 (size: 14.0 KB,
>>> free: 530.3 MB)
>>>
>>> 15/09/28 20:07:41 INFO spark.SparkContext: Created broadcast 0 from
>>> textFile at SimpleApp.scala:10
>>>
>>> 15/09/28 20:07:42 INFO mapred.FileInputFormat: Total input paths to
>>> process : 1
>>>
>>> 15/09/28 20:07:42 INFO spark.SparkContext: Starting job: collect at
>>> SimpleApp.scala:10
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Got job 0 (collect at
>>> SimpleApp.scala:10) with 3 output partitions
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Final stage: ResultStage
>>> 0(collect at SimpleApp.scala:10)
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Parents of final stage:
>>> List()
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Missing parents: List()
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Submitting ResultStage 0
>>> (MapPartitionsRDD[1] at textFile at SimpleApp.scala:10), which has no
>>> missing parents
>>>
>>> 15/09/28 20:07:42 INFO storage.MemoryStore: ensureFreeSpace(3120) called
>>> with curMem=171626, maxMem=556038881
>>>
>>> 15/09/28 20:07:42 INFO storage.MemoryStore: Block broadcast_1 stored as
>>> values in memory (estimated size 3.0 KB, free 530.1 MB)
>>>
>>> 15/09/28 20:07:42 INFO storage.MemoryStore: ensureFreeSpace(1784) called
>>> with curMem=174746, maxMem=556038881
>>>
>>> 15/09/28 20:07:42 INFO storage.MemoryStore: Block broadcast_1_piece0
>>> stored as bytes in memory (estimated size 1784.0 B, free 530.1 MB)
>>>
>>> 15/09/28 20:07:42 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on 172.31.21.104:48820 (size: 1784.0 B,
>>> free: 530.3 MB)
>>>
>>> 15/09/28 20:07:42 INFO spark.SparkContext: Created broadcast 1 from
>>> broadcast at DAGScheduler.scala:861
>>>
>>> 15/09/28 20:07:42 INFO scheduler.DAGScheduler: Submitting 3 missing
>>> tasks from ResultStage 0 (MapPartitionsRDD[1] at textFile at
>>> SimpleApp.scala:10)
>>>
>>> 15/09/28 20:07:42 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0
>>> with 3 tasks
>>>
>>> 15/09/28 20:07:42 INFO scheduler.TaskSetManager: Starting task 0.0 in
>>> stage 0.0 (TID 0, ip-172-31-4-4.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:42 INFO scheduler.TaskSetManager: Starting task 1.0 in
>>> stage 0.0 (TID 1, ip-172-31-37-82.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:42 INFO scheduler.TaskSetManager: Starting task 2.0 in
>>> stage 0.0 (TID 2, ip-172-31-21-104.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-37-82.us-west-2.compute.internal:57573 with 530.3
>>> MB RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S1,
>>> ip-172-31-37-82.us-west-2.compute.internal, 57573)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-4-4.us-west-2.compute.internal:43269 with 530.3 MB
>>> RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S2,
>>> ip-172-31-4-4.us-west-2.compute.internal, 43269)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerMasterEndpoint: Registering
>>> block manager ip-172-31-21-104.us-west-2.compute.internal:49173 with 530.3
>>> MB RAM, BlockManagerId(20150928-190245-1358962604-5050-11297-S0,
>>> ip-172-31-21-104.us-west-2.compute.internal, 49173)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-37-82.us-west-2.compute.internal:57573 (size: 1784.0 B, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-4-4.us-west-2.compute.internal:43269 (size: 1784.0 B, free: 530.3
>>> MB)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_1_piece0 in memory on
>>> ip-172-31-21-104.us-west-2.compute.internal:49173 (size: 1784.0 B, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-4-4.us-west-2.compute.internal:43269 (size: 14.0 KB, free: 530.3
>>> MB)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-37-82.us-west-2.compute.internal:57573 (size: 14.0 KB, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:07:45 INFO storage.BlockManagerInfo: Added
>>> broadcast_0_piece0 in memory on
>>> ip-172-31-21-104.us-west-2.compute.internal:49173 (size: 14.0 KB, free:
>>> 530.3 MB)
>>>
>>> 15/09/28 20:07:46 WARN scheduler.TaskSetManager: Lost task 1.0 in stage
>>> 0.0 (TID 1, ip-172-31-37-82.us-west-2.compute.internal):
>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: affinio
>>>
>>> at
>>> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:665)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
>>>
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
>>>
>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
>>>
>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>>>
>>> at
>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>>>
>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>>>
>>> at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:436)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:409)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at scala.Option.map(Option.scala:145)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
>>>
>>> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>
>>> at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>>
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Caused by: java.net.UnknownHostException: affinio
>>>
>>> ... 35 more
>>>
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 2.0 in stage
>>> 0.0 (TID 2) on executor ip-172-31-21-104.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 1]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 0.0 in stage
>>> 0.0 (TID 0) on executor ip-172-31-4-4.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 2]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 0.1 in
>>> stage 0.0 (TID 3, ip-172-31-37-82.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 2.1 in
>>> stage 0.0 (TID 4, ip-172-31-21-104.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 1.1 in
>>> stage 0.0 (TID 5, ip-172-31-4-4.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 0.1 in stage
>>> 0.0 (TID 3) on executor ip-172-31-37-82.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 3]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 2.1 in stage
>>> 0.0 (TID 4) on executor ip-172-31-21-104.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 4]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 1.1 in stage
>>> 0.0 (TID 5) on executor ip-172-31-4-4.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 5]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 1.2 in
>>> stage 0.0 (TID 6, ip-172-31-37-82.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 2.2 in
>>> stage 0.0 (TID 7, ip-172-31-4-4.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 0.2 in
>>> stage 0.0 (TID 8, ip-172-31-21-104.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 1.2 in stage
>>> 0.0 (TID 6) on executor ip-172-31-37-82.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 6]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 2.2 in stage
>>> 0.0 (TID 7) on executor ip-172-31-4-4.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 7]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 0.2 in stage
>>> 0.0 (TID 8) on executor ip-172-31-21-104.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 8]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 0.3 in
>>> stage 0.0 (TID 9, ip-172-31-21-104.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 2.3 in
>>> stage 0.0 (TID 10, ip-172-31-37-82.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Starting task 1.3 in
>>> stage 0.0 (TID 11, ip-172-31-4-4.us-west-2.compute.internal, NODE_LOCAL,
>>> 2201 bytes)
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 0.3 in stage
>>> 0.0 (TID 9) on executor ip-172-31-21-104.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 9]
>>>
>>> 15/09/28 20:07:46 ERROR scheduler.TaskSetManager: Task 0 in stage 0.0
>>> failed 4 times; aborting job
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSchedulerImpl: Stage 0 was cancelled
>>>
>>> 15/09/28 20:07:46 INFO scheduler.DAGScheduler: ResultStage 0 (collect at
>>> SimpleApp.scala:10) failed in 3.811 s
>>>
>>> 15/09/28 20:07:46 INFO scheduler.DAGScheduler: Job 0 failed: collect at
>>> SimpleApp.scala:10, took 3.867348 s
>>>
>>> Exception in thread "main" org.apache.spark.SparkException: Job aborted
>>> due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent
>>> failure: Lost task 0.3 in stage 0.0 (TID 9,
>>> ip-172-31-21-104.us-west-2.compute.internal):
>>> java.lang.IllegalArgumentException: java.net.UnknownHostException: affinio
>>>
>>> at
>>> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:665)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
>>>
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
>>>
>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
>>>
>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>>>
>>> at
>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>>>
>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>>>
>>> at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:436)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:409)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at scala.Option.map(Option.scala:145)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
>>>
>>> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>
>>> at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>>
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Caused by: java.net.UnknownHostException: affinio
>>>
>>> ... 35 more
>>>
>>>
>>> Driver stacktrace:
>>>
>>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267)
>>>
>>> at
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>
>>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
>>>
>>> at scala.Option.foreach(Option.scala:236)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455)
>>>
>>> at
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444)
>>>
>>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>>
>>> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
>>>
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813)
>>>
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1826)
>>>
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1839)
>>>
>>> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1910)
>>>
>>> at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
>>>
>>> at
>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
>>>
>>> at
>>> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
>>>
>>> at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
>>>
>>> at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
>>>
>>> at SimpleApp$.main(SimpleApp.scala:10)
>>>
>>> at SimpleApp.main(SimpleApp.scala)
>>>
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>
>>> at
>>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
>>>
>>> at
>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
>>>
>>> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>>>
>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>>>
>>> Caused by: java.lang.IllegalArgumentException:
>>> java.net.UnknownHostException: affinio
>>>
>>> at
>>> org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:374)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:312)
>>>
>>> at
>>> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:178)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:665)
>>>
>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:601)
>>>
>>> at
>>> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:148)
>>>
>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
>>>
>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
>>>
>>> at
>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
>>>
>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
>>>
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
>>>
>>> at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:656)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:436)
>>>
>>> at
>>> org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:409)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$32.apply(SparkContext.scala:1007)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at
>>> org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
>>>
>>> at scala.Option.map(Option.scala:145)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
>>>
>>> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:220)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
>>>
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>>
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297)
>>>
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
>>>
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>>
>>> at org.apache.spark.scheduler.Task.run(Task.scala:88)
>>>
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> Caused by: java.net.UnknownHostException: affinio
>>>
>>> ... 35 more
>>>
>>> 15/09/28 20:07:46 INFO spark.SparkContext: Invoking stop() from shutdown
>>> hook
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/metrics/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/api,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/static,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/executors/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/executors,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/environment/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/environment,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/storage/rdd,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/storage/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/storage,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/pool,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/stage,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/stages,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/jobs/job,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/jobs/json,null}
>>>
>>> 15/09/28 20:07:46 INFO handler.ContextHandler: stopped
>>> o.s.j.s.ServletContextHandler{/jobs,null}
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 1.3 in stage
>>> 0.0 (TID 11) on executor ip-172-31-4-4.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 10]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0,
>>> whose tasks have all completed, from pool
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSetManager: Lost task 2.3 in stage
>>> 0.0 (TID 10) on executor ip-172-31-37-82.us-west-2.compute.internal:
>>> java.lang.IllegalArgumentException (java.net.UnknownHostException: affinio)
>>> [duplicate 11]
>>>
>>> 15/09/28 20:07:46 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0,
>>> whose tasks have all completed, from pool
>>>
>>> 15/09/28 20:07:46 INFO ui.SparkUI: Stopped Spark web UI at
>>> http://172.31.21.104:4040
>>>
>>> 15/09/28 20:07:46 INFO scheduler.DAGScheduler: Stopping DAGScheduler
>>>
>>> I0928 20:07:46.327116  6694 sched.cpp:1286] Asked to stop the driver
>>>
>>> I0928 20:07:46.327282  6584 sched.cpp:752] Stopping framework
>>> '20150928-190245-1358962604-5050-11297-0009'
>>>
>>> 15/09/28 20:07:46 INFO mesos.MesosSchedulerBackend: driver.run()
>>> returned with code DRIVER_STOPPED
>>>
>>> 15/09/28 20:07:46 INFO spark.MapOutputTrackerMasterEndpoint:
>>> MapOutputTrackerMasterEndpoint stopped!
>>>
>>> 15/09/28 20:07:46 INFO storage.MemoryStore: MemoryStore cleared
>>>
>>> 15/09/28 20:07:46 INFO storage.BlockManager: BlockManager stopped
>>>
>>> 15/09/28 20:07:46 INFO storage.BlockManagerMaster: BlockManagerMaster
>>> stopped
>>>
>>> 15/09/28 20:07:46 INFO
>>> scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:
>>> OutputCommitCoordinator stopped!
>>>
>>> 15/09/28 20:07:46 INFO spark.SparkContext: Successfully stopped
>>> SparkContext
>>>
>>> 15/09/28 20:07:46 INFO util.ShutdownHookManager: Shutdown hook called
>>>
>>> 15/09/28 20:07:46 INFO util.ShutdownHookManager: Deleting directory
>>> /tmp/spark-e0c1b94b-e901-4f2f-8d0d-2a16b23acda7
>>>
>>> 15/09/28 20:07:46 INFO remote.RemoteActorRefProvider$RemotingTerminator:
>>> Shutting down remote daemon.
>>>
>>> 15/09/28 20:07:46 INFO remote.RemoteActorRefProvider$RemotingTerminator:
>>> Remote daemon shut down; proceeding with flushing remote transports.
>>>
>>> 15/09/28 20:07:46 INFO remote.RemoteActorRefProvider$RemotingTerminator:
>>> Remoting shut down.
>>>
>>> Any thoughts?
>>>
>>> Thanks
>>> Stephen
>>>
>>
>>
>
