Anyone can help me?
Log Type: stderr Log Upload Time: 星期一 四月 11 19:55:35 +0800 2016 Log Length: 65816 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/hadoop/yarn/local/filecache/10/spark-hdp-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/hdp/2.4.0.0-169/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 16/04/11 19:54:33 INFO ApplicationMaster: Registered signal handlers for [TERM, HUP, INT] 16/04/11 19:54:34 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1460348986844_0004_000002 16/04/11 19:54:35 INFO SecurityManager: Changing view acls to: yarn,spark 16/04/11 19:54:35 INFO SecurityManager: Changing modify acls to: yarn,spark 16/04/11 19:54:35 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, spark); users with modify permissions: Set(yarn, spark) 16/04/11 19:54:36 INFO ApplicationMaster: Starting the user application in a separate Thread 16/04/11 19:54:36 INFO ApplicationMaster: Waiting for spark context initialization 16/04/11 19:54:36 INFO ApplicationMaster: Waiting for spark context initialization ... 16/04/11 19:54:36 INFO SparkContext: Running Spark version 1.6.0 16/04/11 19:54:36 INFO SecurityManager: Changing view acls to: yarn,spark 16/04/11 19:54:36 INFO SecurityManager: Changing modify acls to: yarn,spark 16/04/11 19:54:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, spark); users with modify permissions: Set(yarn, spark) 16/04/11 19:54:36 INFO Utils: Successfully started service 'sparkDriver' on port 52671. 16/04/11 19:54:37 INFO Slf4jLogger: Slf4jLogger started 16/04/11 19:54:37 INFO Remoting: Starting remoting 16/04/11 19:54:38 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.1.193:47695] 16/04/11 19:54:38 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 47695. 16/04/11 19:54:38 INFO SparkEnv: Registering MapOutputTracker 16/04/11 19:54:38 INFO SparkEnv: Registering BlockManagerMaster 16/04/11 19:54:38 INFO DiskBlockManager: Created local directory at /hadoop/yarn/local/usercache/spark/appcache/application_1460348986844_0004/blockmgr-bade5582-438f-4035-9545-ea9f06f2f201 16/04/11 19:54:38 INFO MemoryStore: MemoryStore started with capacity 116.6 MB 16/04/11 19:54:38 INFO SparkEnv: Registering OutputCommitCoordinator 16/04/11 19:54:38 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 16/04/11 19:54:38 INFO Server: jetty-8.y.z-SNAPSHOT 16/04/11 19:54:38 INFO AbstractConnector: Started SelectChannelConnector@0.0.0.0:33446 16/04/11 19:54:38 INFO Utils: Successfully started service 'SparkUI' on port 33446. 16/04/11 19:54:38 INFO SparkUI: Started SparkUI at http://192.168.1.193:33446 16/04/11 19:54:39 INFO YarnClusterScheduler: Created YarnClusterScheduler 16/04/11 19:54:39 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1460348986844_0004 and attemptId Some(appattempt_1460348986844_0004_000002) 16/04/11 19:54:39 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43352. 16/04/11 19:54:39 INFO NettyBlockTransferService: Server created on 43352 16/04/11 19:54:39 INFO BlockManagerMaster: Trying to register BlockManager 16/04/11 19:54:39 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.193:43352 with 116.6 MB RAM, BlockManagerId(driver, 192.168.1.193, 43352) 16/04/11 19:54:39 INFO BlockManagerMaster: Registered BlockManager 16/04/11 19:54:39 INFO EventLoggingListener: Logging events to hdfs:///spark-history/application_1460348986844_0004_appattempt_1460348986844_0004_000002 16/04/11 19:54:39 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark://YarnAM@192.168.1.193:52671) 16/04/11 19:54:39 INFO RMProxy: Connecting to ResourceManager at hadoop1/192.168.1.191:8030 16/04/11 19:54:39 INFO YarnRMClient: Registering the ApplicationMaster 16/04/11 19:54:40 INFO YarnAllocator: Will request 2 executor containers, each with 1 cores and 896 MB memory including 384 MB overhead 16/04/11 19:54:40 INFO YarnAllocator: Container request (host: Any, capability: <memory:896, vCores:1>) 16/04/11 19:54:40 INFO YarnAllocator: Container request (host: Any, capability: <memory:896, vCores:1>) 16/04/11 19:54:40 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 5000, initial allocation : 200) intervals 16/04/11 19:54:41 INFO AMRMClientImpl: Received new token for : hadoop3:45454 16/04/11 19:54:41 INFO YarnAllocator: Launching container container_e05_1460348986844_0004_02_000002 for on host hadoop3 16/04/11 19:54:41 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@192.168.1.193:52671, executorHostname: hadoop3 16/04/11 19:54:41 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 16/04/11 19:54:41 INFO ExecutorRunnable: Starting Executor Container 16/04/11 19:54:41 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 16/04/11 19:54:41 INFO ExecutorRunnable: Setting up ContainerLaunchContext 16/04/11 19:54:41 INFO ExecutorRunnable: Preparing Local resources 16/04/11 19:54:41 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 8020 file: "/user/spark/.sparkStaging/application_1460348986844_0004/spark_to_parquet-0.0.1-SNAPSHOT.jar" } size: 13254271 timestamp: 1460369291325 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 8020 file: "/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar" } size: 191724642 timestamp: 1460287769119 type: FILE visibility: PUBLIC) 16/04/11 19:54:41 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://hadoop1:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar 16/04/11 19:54:41 INFO ExecutorRunnable: =============================================================================== YARN executor launch context: env: CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>/usr/hdp/current/hadoop-client/*<CPS>/usr/hdp/current/hadoop-client/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.4.0.0-169/hadoop/lib/hadoop-lzo-0.6.0.2.4.0.0-169.jar:/etc/hadoop/conf/secure SPARK_LOG_URL_STDERR -> http://hadoop3:8042/node/containerlogs/container_e05_1460348986844_0004_02_000002/spark/stderr?start=-4096 SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1460348986844_0004 SPARK_YARN_CACHE_FILES_FILE_SIZES -> 191724642,13254271 SPARK_USER -> spark SPARK_YARN_CACHE_FILES_VISIBILITIES -> PUBLIC,PRIVATE SPARK_YARN_MODE -> true SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1460287769119,1460369291325 SPARK_LOG_URL_STDOUT -> http://hadoop3:8042/node/containerlogs/container_e05_1460348986844_0004_02_000002/spark/stdout?start=-4096 SPARK_YARN_CACHE_FILES -> hdfs://hadoop1:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar#__spark__.jar,hdfs://hadoop1:8020/user/spark/.sparkStaging/application_1460348986844_0004/spark_to_parquet-0.0.1-SNAPSHOT.jar#__app__.jar command: {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms512m -Xmx512m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=52671' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.1.193:52671 --executor-id 1 --hostname hadoop3 --cores 1 --app-id application_1460348986844_0004 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr =============================================================================== 16/04/11 19:54:41 INFO ContainerManagementProtocolProxy: Opening proxy : hadoop3:45454 16/04/11 19:54:43 INFO YarnAllocator: Launching container container_e05_1460348986844_0004_02_000003 for on host hadoop3 16/04/11 19:54:43 INFO YarnAllocator: Launching ExecutorRunnable. driverUrl: spark://CoarseGrainedScheduler@192.168.1.193:52671, executorHostname: hadoop3 16/04/11 19:54:43 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. 16/04/11 19:54:43 INFO ExecutorRunnable: Starting Executor Container 16/04/11 19:54:43 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0 16/04/11 19:54:43 INFO ExecutorRunnable: Setting up ContainerLaunchContext 16/04/11 19:54:43 INFO ExecutorRunnable: Preparing Local resources 16/04/11 19:54:43 INFO ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 8020 file: "/user/spark/.sparkStaging/application_1460348986844_0004/spark_to_parquet-0.0.1-SNAPSHOT.jar" } size: 13254271 timestamp: 1460369291325 type: FILE visibility: PRIVATE, __spark__.jar -> resource { scheme: "hdfs" host: "hadoop1" port: 8020 file: "/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar" } size: 191724642 timestamp: 1460287769119 type: FILE visibility: PUBLIC) 16/04/11 19:54:43 INFO Client: Using the spark assembly jar on HDFS because you are using HDP, defaultSparkAssembly:hdfs://hadoop1:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar 16/04/11 19:54:43 INFO ExecutorRunnable: =============================================================================== YARN executor launch context: env: CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>/usr/hdp/current/hadoop-client/*<CPS>/usr/hdp/current/hadoop-client/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.4.0.0-169/hadoop/lib/hadoop-lzo-0.6.0.2.4.0.0-169.jar:/etc/hadoop/conf/secure SPARK_LOG_URL_STDERR -> http://hadoop3:8042/node/containerlogs/container_e05_1460348986844_0004_02_000003/spark/stderr?start=-4096 SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1460348986844_0004 SPARK_YARN_CACHE_FILES_FILE_SIZES -> 191724642,13254271 SPARK_USER -> spark SPARK_YARN_CACHE_FILES_VISIBILITIES -> PUBLIC,PRIVATE SPARK_YARN_MODE -> true SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1460287769119,1460369291325 SPARK_LOG_URL_STDOUT -> http://hadoop3:8042/node/containerlogs/container_e05_1460348986844_0004_02_000003/spark/stdout?start=-4096 SPARK_YARN_CACHE_FILES -> hdfs://hadoop1:8020/hdp/apps/2.4.0.0-169/spark/spark-hdp-assembly.jar#__spark__.jar,hdfs://hadoop1:8020/user/spark/.sparkStaging/application_1460348986844_0004/spark_to_parquet-0.0.1-SNAPSHOT.jar#__app__.jar command: {{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms512m -Xmx512m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=52671' '-Dspark.history.ui.port=18080' '-Dspark.ui.port=0' -Dspark.yarn.app.container.log.dir=<LOG_DIR> org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.1.193:52671 --executor-id 2 --hostname hadoop3 --cores 1 --app-id application_1460348986844_0004 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr =============================================================================== 16/04/11 19:54:43 INFO ContainerManagementProtocolProxy: Opening proxy : hadoop3:45454 16/04/11 19:54:48 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 0 of them. 16/04/11 19:54:48 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (hadoop3:36455) with ID 1 16/04/11 19:54:48 INFO BlockManagerMasterEndpoint: Registering block manager hadoop3:59480 with 143.3 MB RAM, BlockManagerId(1, hadoop3, 59480) 16/04/11 19:54:49 INFO YarnClusterSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (hadoop3:36456) with ID 2 16/04/11 19:54:49 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 16/04/11 19:54:49 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done 16/04/11 19:54:49 INFO BlockManagerMasterEndpoint: Registering block manager hadoop3:60869 with 143.3 MB RAM, BlockManagerId(2, hadoop3, 60869) 16/04/11 19:54:50 INFO VerifiableProperties: Verifying properties 16/04/11 19:54:50 INFO VerifiableProperties: Property group.id is overridden to 16/04/11 19:54:50 INFO VerifiableProperties: Property zookeeper.connect is overridden to 16/04/11 19:54:51 INFO ForEachDStream: metadataCleanupDelay = -1 16/04/11 19:54:51 INFO FlatMappedDStream: metadataCleanupDelay = -1 16/04/11 19:54:51 INFO MappedDStream: metadataCleanupDelay = -1 16/04/11 19:54:51 INFO DirectKafkaInputDStream: metadataCleanupDelay = -1 16/04/11 19:54:51 INFO DirectKafkaInputDStream: Slide time = 30000 ms 16/04/11 19:54:51 INFO DirectKafkaInputDStream: Storage level = StorageLevel(false, false, false, false, 1) 16/04/11 19:54:51 INFO DirectKafkaInputDStream: Checkpoint interval = null 16/04/11 19:54:51 INFO DirectKafkaInputDStream: Remember duration = 30000 ms 16/04/11 19:54:51 INFO DirectKafkaInputDStream: Initialized and validated org.apache.spark.streaming.kafka.DirectKafkaInputDStream@11d5086 16/04/11 19:54:51 INFO MappedDStream: Slide time = 30000 ms 16/04/11 19:54:51 INFO MappedDStream: Storage level = StorageLevel(false, false, false, false, 1) 16/04/11 19:54:51 INFO MappedDStream: Checkpoint interval = null 16/04/11 19:54:51 INFO MappedDStream: Remember duration = 30000 ms 16/04/11 19:54:51 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@3d71b12c 16/04/11 19:54:51 INFO FlatMappedDStream: Slide time = 30000 ms 16/04/11 19:54:51 INFO FlatMappedDStream: Storage level = StorageLevel(false, true, false, false, 1) 16/04/11 19:54:51 INFO FlatMappedDStream: Checkpoint interval = null 16/04/11 19:54:51 INFO FlatMappedDStream: Remember duration = 30000 ms 16/04/11 19:54:51 INFO FlatMappedDStream: Initialized and validated org.apache.spark.streaming.dstream.FlatMappedDStream@b74bccf 16/04/11 19:54:51 INFO ForEachDStream: Slide time = 30000 ms 16/04/11 19:54:51 INFO ForEachDStream: Storage level = StorageLevel(false, false, false, false, 1) 16/04/11 19:54:51 INFO ForEachDStream: Checkpoint interval = null 16/04/11 19:54:51 INFO ForEachDStream: Remember duration = 30000 ms 16/04/11 19:54:51 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@6cbe0b89 16/04/11 19:54:51 INFO RecurringTimer: Started timer for JobGenerator at time 1460375700000 16/04/11 19:54:51 INFO JobGenerator: Started JobGenerator at 1460375700000 ms 16/04/11 19:54:51 INFO JobScheduler: Started JobScheduler 16/04/11 19:54:51 INFO StreamingContext: StreamingContext started 16/04/11 19:55:00 INFO VerifiableProperties: Verifying properties 16/04/11 19:55:00 INFO VerifiableProperties: Property group.id is overridden to 16/04/11 19:55:00 INFO VerifiableProperties: Property zookeeper.connect is overridden to 16/04/11 19:55:00 INFO JobScheduler: Added jobs for time 1460375700000 ms 16/04/11 19:55:00 INFO JobScheduler: Starting job streaming job 1460375700000 ms.0 from job set of time 1460375700000 ms 16/04/11 19:55:02 INFO SparkContext: Starting job: count at AppMain.java:93 16/04/11 19:55:02 INFO DAGScheduler: Registering RDD 6 (count at AppMain.java:93) 16/04/11 19:55:02 INFO DAGScheduler: Got job 0 (count at AppMain.java:93) with 1 output partitions 16/04/11 19:55:02 INFO DAGScheduler: Final stage: ResultStage 1 (count at AppMain.java:93) 16/04/11 19:55:02 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0) 16/04/11 19:55:02 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0) 16/04/11 19:55:02 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[6] at count at AppMain.java:93), which has no missing parents 16/04/11 19:55:02 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 11.8 KB, free 11.8 KB) 16/04/11 19:55:02 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.8 KB, free 17.6 KB) 16/04/11 19:55:02 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.193:43352 (size: 5.8 KB, free: 116.6 MB) 16/04/11 19:55:02 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006 16/04/11 19:55:02 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[6] at count at AppMain.java:93) 16/04/11 19:55:02 INFO YarnClusterScheduler: Adding task set 0.0 with 1 tasks 16/04/11 19:55:02 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, hadoop3, partition 0,RACK_LOCAL, 1994 bytes) 16/04/11 19:55:03 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop3:59480 (size: 5.8 KB, free: 143.2 MB) 16/04/11 19:55:04 INFO BlockManagerInfo: Added rdd_2_0 in memory on hadoop3:59480 (size: 4.0 B, free: 143.2 MB) 16/04/11 19:55:05 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2707 ms on hadoop3 (1/1) 16/04/11 19:55:05 INFO DAGScheduler: ShuffleMapStage 0 (count at AppMain.java:93) finished in 2.719 s 16/04/11 19:55:05 INFO DAGScheduler: looking for newly runnable stages 16/04/11 19:55:05 INFO DAGScheduler: running: Set() 16/04/11 19:55:05 INFO DAGScheduler: waiting: Set(ResultStage 1) 16/04/11 19:55:05 INFO DAGScheduler: failed: Set() 16/04/11 19:55:05 INFO YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 16/04/11 19:55:05 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[9] at count at AppMain.java:93), which has no missing parents 16/04/11 19:55:05 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 12.7 KB, free 30.3 KB) 16/04/11 19:55:05 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 6.2 KB, free 36.5 KB) 16/04/11 19:55:05 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.1.193:43352 (size: 6.2 KB, free: 116.6 MB) 16/04/11 19:55:05 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006 16/04/11 19:55:05 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[9] at count at AppMain.java:93) 16/04/11 19:55:05 INFO YarnClusterScheduler: Adding task set 1.0 with 1 tasks 16/04/11 19:55:05 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, hadoop3, partition 0,NODE_LOCAL, 1999 bytes) 16/04/11 19:55:05 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on hadoop3:59480 (size: 6.2 KB, free: 143.2 MB) 16/04/11 19:55:05 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to hadoop3:36455 16/04/11 19:55:05 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 136 bytes 16/04/11 19:55:06 INFO DAGScheduler: ResultStage 1 (count at AppMain.java:93) finished in 0.407 s 16/04/11 19:55:06 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 409 ms on hadoop3 (1/1) 16/04/11 19:55:06 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 16/04/11 19:55:06 INFO DAGScheduler: Job 0 finished: count at AppMain.java:93, took 3.483560 s 16/04/11 19:55:06 INFO JobScheduler: Finished job streaming job 1460375700000 ms.0 from job set of time 1460375700000 ms 16/04/11 19:55:06 INFO JobScheduler: Total delay: 6.069 s for time 1460375700000 ms (execution: 5.758 s) 16/04/11 19:55:06 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer() 16/04/11 19:55:06 INFO InputInfoTracker: remove old batch metadata: 16/04/11 19:55:30 INFO JobScheduler: Added jobs for time 1460375730000 ms 16/04/11 19:55:30 INFO JobScheduler: Starting job streaming job 1460375730000 ms.0 from job set of time 1460375730000 ms 16/04/11 19:55:30 INFO SparkContext: Starting job: count at AppMain.java:93 16/04/11 19:55:30 INFO DAGScheduler: Registering RDD 16 (count at AppMain.java:93) 16/04/11 19:55:30 INFO DAGScheduler: Got job 1 (count at AppMain.java:93) with 1 output partitions 16/04/11 19:55:30 INFO DAGScheduler: Final stage: ResultStage 3 (count at AppMain.java:93) 16/04/11 19:55:30 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 2) 16/04/11 19:55:30 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 2) 16/04/11 19:55:30 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[16] at count at AppMain.java:93), which has no missing parents 16/04/11 19:55:30 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 11.8 KB, free 48.2 KB) 16/04/11 19:55:30 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 5.8 KB, free 54.0 KB) 16/04/11 19:55:30 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.193:43352 (size: 5.8 KB, free: 116.6 MB) 16/04/11 19:55:30 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006 16/04/11 19:55:30 INFO DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[16] at count at AppMain.java:93) 16/04/11 19:55:30 INFO YarnClusterScheduler: Adding task set 2.0 with 1 tasks 16/04/11 19:55:30 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, hadoop3, partition 0,RACK_LOCAL, 1994 bytes) 16/04/11 19:55:30 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on hadoop3:60869 (size: 5.8 KB, free: 143.2 MB) 16/04/11 19:55:32 INFO BlockManagerInfo: Added rdd_12_0 in memory on hadoop3:60869 (size: 6.0 B, free: 143.2 MB) 16/04/11 19:55:33 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 16/04/11 19:55:33 INFO TaskSetManager: Starting task 0.1 in stage 2.0 (TID 3, hadoop3, partition 0,RACK_LOCAL, 1994 bytes) 16/04/11 19:55:33 INFO TaskSetManager: Lost task 0.1 in stage 2.0 (TID 3) on executor hadoop3: java.lang.NullPointerException (null) [duplicate 1] 16/04/11 19:55:33 INFO TaskSetManager: Starting task 0.2 in stage 2.0 (TID 4, hadoop3, partition 0,RACK_LOCAL, 1994 bytes) 16/04/11 19:55:33 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on hadoop3:59480 (size: 5.8 KB, free: 143.2 MB) 16/04/11 19:55:33 INFO TaskSetManager: Lost task 0.2 in stage 2.0 (TID 4) on executor hadoop3: java.lang.NullPointerException (null) [duplicate 2] 16/04/11 19:55:33 INFO TaskSetManager: Starting task 0.3 in stage 2.0 (TID 5, hadoop3, partition 0,RACK_LOCAL, 1994 bytes) 16/04/11 19:55:33 INFO TaskSetManager: Lost task 0.3 in stage 2.0 (TID 5) on executor hadoop3: java.lang.NullPointerException (null) [duplicate 3] 16/04/11 19:55:33 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4 times; aborting job 16/04/11 19:55:33 INFO YarnClusterScheduler: Removed TaskSet 2.0, whose tasks have all completed, from pool 16/04/11 19:55:33 INFO YarnClusterScheduler: Cancelling stage 2 16/04/11 19:55:33 INFO DAGScheduler: ShuffleMapStage 2 (count at AppMain.java:93) failed in 3.682 s 16/04/11 19:55:33 INFO DAGScheduler: Job 1 failed: count at AppMain.java:93, took 3.712123 s 16/04/11 19:55:33 INFO JobScheduler: Finished job streaming job 1460375730000 ms.0 from job set of time 1460375730000 ms 16/04/11 19:55:33 INFO JobScheduler: Total delay: 3.858 s for time 1460375730000 ms (execution: 3.837 s) 16/04/11 19:55:33 INFO MapPartitionsRDD: Removing RDD 2 from persistence list 16/04/11 19:55:33 ERROR JobScheduler: Error running job streaming job 1460375730000 ms.0 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.collect(RDD.scala:926) at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166) at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174) at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538) at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56) at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125) at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1537) at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1544) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1554) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1553) at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138) at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1553) at com.sectong.spark_to_parquet.AppMain.lambda$1(AppMain.java:93) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:316) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:316) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:49) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:224) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:223) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) ... 3 more 16/04/11 19:55:33 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1431) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799) at scala.Option.foreach(Option.scala:236) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.collect(RDD.scala:926) at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166) at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174) at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538) at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1538) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56) at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2125) at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1537) at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1544) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1554) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1553) at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2138) at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1553) at com.sectong.spark_to_parquet.AppMain.lambda$1(AppMain.java:93) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:316) at org.apache.spark.streaming.api.java.JavaDStreamLike$$anonfun$foreachRDD$1.apply(JavaDStreamLike.scala:316) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661) at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1$$anonfun$apply$mcV$sp$3.apply(DStream.scala:661) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1$$anonfun$apply$mcV$sp$1.apply(ForEachDStream.scala:50) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:49) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:49) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:39) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:224) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:224) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:223) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) ... 3 more 16/04/11 19:55:33 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace:) 16/04/11 19:55:33 INFO MapPartitionsRDD: Removing RDD 1 from persistence list 16/04/11 19:55:33 INFO BlockManager: Removing RDD 2 16/04/11 19:55:33 INFO BlockManager: Removing RDD 1 16/04/11 19:55:33 INFO KafkaRDD: Removing RDD 0 from persistence list 16/04/11 19:55:33 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook 16/04/11 19:55:33 INFO JobGenerator: Stopping JobGenerator immediately 16/04/11 19:55:33 INFO ReceivedBlockTracker: Deleting batches ArrayBuffer() 16/04/11 19:55:33 INFO BlockManager: Removing RDD 0 16/04/11 19:55:33 INFO InputInfoTracker: remove old batch metadata: 16/04/11 19:55:33 INFO RecurringTimer: Stopped timer for JobGenerator after time 1460375730000 16/04/11 19:55:33 INFO JobGenerator: Stopped JobGenerator 16/04/11 19:55:33 INFO JobScheduler: Stopped JobScheduler 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/streaming,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/streaming/batch,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/streaming,null} 16/04/11 19:55:33 INFO StreamingContext: StreamingContext stopped successfully 16/04/11 19:55:33 INFO SparkContext: Invoking stop() from shutdown hook 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/streaming/batch/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/streaming/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null} 16/04/11 19:55:33 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null} 16/04/11 19:55:34 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null} 16/04/11 19:55:34 INFO SparkUI: Stopped Spark web UI at http://192.168.1.193:33446 16/04/11 19:55:34 INFO YarnClusterSchedulerBackend: Shutting down all executors 16/04/11 19:55:34 INFO YarnClusterSchedulerBackend: Asking each executor to shut down 16/04/11 19:55:34 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices (serviceOption=None, services=List(), started=false) 16/04/11 19:55:34 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 16/04/11 19:55:34 INFO MemoryStore: MemoryStore cleared 16/04/11 19:55:34 INFO BlockManager: BlockManager stopped 16/04/11 19:55:34 INFO BlockManagerMaster: BlockManagerMaster stopped 16/04/11 19:55:34 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 16/04/11 19:55:34 INFO SparkContext: Successfully stopped SparkContext 16/04/11 19:55:34 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 5, hadoop3): java.lang.NullPointerException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1$$anonfun$apply$1.apply(SQLContext.scala:1358) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33) at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1358) at org.apache.spark.sql.SQLContext$$anonfun$org$apache$spark$sql$SQLContext$$beansToRows$1.apply(SQLContext.scala:1356) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.processInputs(TungstenAggregationIterator.scala:505) at org.apache.spark.sql.execution.aggregate.TungstenAggregationIterator.<init>(TungstenAggregationIterator.scala:686) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:95) at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306) at org.apache.spark.rdd.RDD.iterator(RDD.scala:270) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:89) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Driver stacktrace:) 16/04/11 19:55:34 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon. 16/04/11 19:55:34 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports. 16/04/11 19:55:34 INFO AMRMClientImpl: Waiting for application to be successfully unregistered. 16/04/11 19:55:34 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down. 16/04/11 19:55:34 INFO ApplicationMaster: Deleting staging directory .sparkStaging/application_1460348986844_0004 16/04/11 19:55:34 INFO ShutdownHookManager: Shutdown hook called 16/04/11 19:55:34 INFO ShutdownHookManager: Deleting directory /hadoop/yarn/local/usercache/spark/appcache/application_1460348986844_0004/spark-7d07ba73-cc49-43c4-828b-82aa16199496 -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-can-t-submit-App-to-yarn-tp26745.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org