Hi - I'm running a Spark Streaming job with PySpark 1.3 in yarn-client mode on CDH 5.4.4. The job sometimes runs a full 24 hours, but more often it fails at some point during the day.
I'm getting several vague errors that don't turn up much when searching online:

- py4j.Py4JException: Error while obtaining a new communication channel
- java.net.ConnectException: Connection timed out
- py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: null

What could be happening here, and is there a workaround for keeping this process up a full 24 hours at a time? Driver log output from one of the failures follows:

15/12/16 23:17:53 INFO TaskSetManager: Finished task 0.0 in stage 948.0 (TID 986) in 79670 ms on phd40010043.xyz.com (1/1)
15/12/16 23:17:53 INFO YarnScheduler: Removed TaskSet 948.0, whose tasks have all completed, from pool
15/12/16 23:17:53 INFO DAGScheduler: Stage 948 (repartition at DataFrame.scala:835) finished in 79.673 s
15/12/16 23:17:53 INFO DAGScheduler: looking for newly runnable stages
15/12/16 23:17:53 INFO DAGScheduler: running: Set()
15/12/16 23:17:53 INFO DAGScheduler: waiting: Set(Stage 949)
15/12/16 23:17:53 INFO DAGScheduler: failed: Set()
15/12/16 23:17:53 INFO DAGScheduler: Missing parents for Stage 949: List()
15/12/16 23:17:53 INFO DAGScheduler: Submitting Stage 949 (MapPartitionsRDD[3944] at repartition at DataFrame.scala:835), which is now runnable
15/12/16 23:17:53 INFO MemoryStore: ensureFreeSpace(77816) called with curMem=51550693, maxMem=2778778828
15/12/16 23:17:53 INFO MemoryStore: Block broadcast_1156 stored as values in memory (estimated size 76.0 KB, free 2.5 GB)
15/12/16 23:17:53 INFO MemoryStore: ensureFreeSpace(27738) called with curMem=51628509, maxMem=2778778828
15/12/16 23:17:53 INFO MemoryStore: Block broadcast_1156_piece0 stored as bytes in memory (estimated size 27.1 KB, free 2.5 GB)
15/12/16 23:17:53 INFO BlockManagerInfo: Added broadcast_1156_piece0 in memory on phe40010004.xyz.com:36705 (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:17:53 INFO BlockManagerMaster: Updated info of block broadcast_1156_piece0
15/12/16 23:17:53 INFO SparkContext: Created broadcast 1156 from broadcast at DAGScheduler.scala:839
15/12/16 23:17:53 INFO DAGScheduler: Submitting 1 missing tasks from Stage 949 (MapPartitionsRDD[3944] at repartition at DataFrame.scala:835)
15/12/16 23:17:53 INFO YarnScheduler: Adding task set 949.0 with 1 tasks
15/12/16 23:17:53 INFO TaskSetManager: Starting task 0.0 in stage 949.0 (TID 987, phd40010020.xyz.com, PROCESS_LOCAL, 1345 bytes)
15/12/16 23:17:53 INFO BlockManagerInfo: Added broadcast_1156_piece0 in memory on phd40010020.xyz.com:40560 (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:17:53 INFO MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 314 to sparkexecu...@phd40010020.xyz.com:17913
15/12/16 23:17:53 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 314 is 154 bytes
15/12/16 23:17:56 INFO TaskSetManager: Finished task 0.0 in stage 949.0 (TID 987) in 2824 ms on phd40010020.xyz.com (1/1)
15/12/16 23:17:56 INFO DAGScheduler: Stage 949 (runJob at newParquet.scala:646) finished in 2.827 s
15/12/16 23:17:56 INFO YarnScheduler: Removed TaskSet 949.0, whose tasks have all completed, from pool
15/12/16 23:17:56 INFO DAGScheduler: Job 636 finished: runJob at newParquet.scala:646, took 82.548844 s
15/12/16 23:17:56 INFO ParquetFileReader: Initiating action with parallelism: 5
15/12/16 23:17:56 INFO SparkContext: Starting job: collect at SparkPlan.scala:83
15/12/16 23:17:56 INFO DAGScheduler: Registering RDD 3948 (mapPartitions at Exchange.scala:100)
15/12/16 23:17:56 INFO DAGScheduler: Got job 637 (collect at SparkPlan.scala:83) with 1 output partitions (allowLocal=false)
15/12/16 23:17:56 INFO DAGScheduler: Final stage: Stage 951(collect at SparkPlan.scala:83)
15/12/16 23:17:56 INFO DAGScheduler: Parents of final stage: List(Stage 950)
15/12/16 23:17:56 INFO DAGScheduler: Missing parents: List(Stage 950)
15/12/16 23:17:56 INFO DAGScheduler: Submitting Stage 950 (MapPartitionsRDD[3948] at mapPartitions at Exchange.scala:100), which has no missing parents
15/12/16 23:17:56 INFO MemoryStore: ensureFreeSpace(56016) called with curMem=51656247, maxMem=2778778828
15/12/16 23:17:56 INFO MemoryStore: Block broadcast_1157 stored as values in memory (estimated size 54.7 KB, free 2.5 GB)
15/12/16 23:17:56 INFO MemoryStore: ensureFreeSpace(35457) called with curMem=51712263, maxMem=2778778828
15/12/16 23:17:56 INFO MemoryStore: Block broadcast_1157_piece0 stored as bytes in memory (estimated size 34.6 KB, free 2.5 GB)
15/12/16 23:17:56 INFO BlockManagerInfo: Added broadcast_1157_piece0 in memory on phe40010004.xyz.com:36705 (size: 34.6 KB, free: 2.6 GB)
15/12/16 23:17:56 INFO BlockManagerMaster: Updated info of block broadcast_1157_piece0
15/12/16 23:17:56 INFO SparkContext: Created broadcast 1157 from broadcast at DAGScheduler.scala:839
15/12/16 23:17:56 INFO DAGScheduler: Submitting 1 missing tasks from Stage 950 (MapPartitionsRDD[3948] at mapPartitions at Exchange.scala:100)
15/12/16 23:17:56 INFO YarnScheduler: Adding task set 950.0 with 1 tasks
15/12/16 23:17:56 INFO TaskSetManager: Starting task 0.0 in stage 950.0 (TID 988, phd40010043.xyz.com, NODE_LOCAL, 1496 bytes)
15/12/16 23:17:56 INFO BlockManagerInfo: Added broadcast_1157_piece0 in memory on phd40010043.xyz.com:15447 (size: 34.6 KB, free: 2.6 GB)
15/12/16 23:18:00 INFO FileInputDStream: Finding new files took 15 ms
15/12/16 23:18:00 INFO FileInputDStream: New files at time 1450325880000 ms: hdfs://nameservice1/data/raw/weblogs/bf_access_rt/2015/12/16/instance10a-prd-apache.1450325556913.gz
15/12/16 23:18:00 INFO MemoryStore: ensureFreeSpace(308525) called with curMem=51747720, maxMem=2778778828
15/12/16 23:18:00 INFO MemoryStore: Block broadcast_1158 stored as values in memory (estimated size 301.3 KB, free 2.5 GB)
15/12/16 23:18:00 INFO MemoryStore: ensureFreeSpace(24304) called with curMem=52056245, maxMem=2778778828
15/12/16 23:18:00 INFO MemoryStore: Block broadcast_1158_piece0 stored as bytes in memory (estimated size 23.7 KB, free 2.5 GB)
15/12/16 23:18:00 INFO BlockManagerInfo: Added broadcast_1158_piece0 in memory on phe40010004.xyz.com:36705 (size: 23.7 KB, free: 2.6 GB)
15/12/16 23:18:00 INFO BlockManagerMaster: Updated info of block broadcast_1158_piece0
15/12/16 23:18:00 INFO SparkContext: Created broadcast 1158 from textFileStream at NativeMethodAccessorImpl.java:-2
15/12/16 23:18:00 INFO FileInputFormat: Total input paths to process : 1
15/12/16 23:19:03 ERROR JobScheduler: Error generating jobs for time 1450325880000 ms
py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonTransformedDStream.compute(PythonDStream.scala:197)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:300)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:300)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:299)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:287)
    at scala.Option.orElse(Option.scala:257)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:284)
    at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:116)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:116)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:116)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:243)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:241)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:241)
    at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:177)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$start$1$$anon$1$$anonfun$receive$1.applyOrElse(JobGenerator.scala:86)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    ... 37 more
Traceback (most recent call last):
  File "/hdata/tools/azkaban/at_bi_core/scripts/at_bi_page_instance_rt/at_sa_page_instance_rt_withdups.py", line 353, in <module>
    main(sys.argv[1:])
  File "/hdata/tools/azkaban/at_bi_core/scripts/at_bi_page_instance_rt/at_sa_page_instance_rt_withdups.py", line 348, in main
    ssc.awaitTermination()
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p894.568/lib/spark/python/pyspark/streaming/context.py", line 190, in awaitTermination
    self._jssc.awaitTermination()
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p894.568/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/opt/cloudera/parcels/CDH-5.4.4-1.cdh5.4.4.p894.568/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o32.awaitTermination.
: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonTransformedDStream.compute(PythonDStream.scala:197)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:300)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:300)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:299)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:287)
    at scala.Option.orElse(Option.scala:257)
    at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:284)
    at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:116)
    at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:116)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:116)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:243)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:241)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:241)
    at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:177)
    at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$start$1$$anon$1$$anonfun$receive$1.applyOrElse(JobGenerator.scala:86)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1122
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1122
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1122 of size 57520 dropped from memory (free 2726755799)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1122_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1122_piece0 of size 36302 dropped from memory (free 2726792101)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1122_piece0 on phe40010004.xyz.com:36705 in memory (size: 35.5 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1122_piece0
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1122_piece0 on phd40010050.xyz.com:18737 in memory (size: 35.5 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1122
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1120
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1120_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1120_piece0 of size 35534 dropped from memory (free 2726827635)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1120_piece0 on phe40010004.xyz.com:36705 in memory (size: 34.7 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1120_piece0
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1120
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1120 of size 56152 dropped from memory (free 2726883787)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1120_piece0 on phd40010020.xyz.com:40560 in memory (size: 34.7 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1120_piece0 on phd40010043.xyz.com:15447 in memory (size: 34.7 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1120
15/12/16 23:19:03 INFO ContextCleaner: Cleaned shuffle 305
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1119
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1119
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1119 of size 77816 dropped from memory (free 2726961603)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1119_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1119_piece0 of size 27744 dropped from memory (free 2726989347)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1119_piece0 on phe40010004.xyz.com:36705 in memory (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1119_piece0
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1119_piece0 on phd40010050.xyz.com:18737 in memory (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1119
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1117
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1117
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1117 of size 52056 dropped from memory (free 2727041403)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1117_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1117_piece0 of size 33741 dropped from memory (free 2727075144)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1117_piece0 on phe40010004.xyz.com:36705 in memory (size: 33.0 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1117_piece0
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1117_piece0 on phd40010020.xyz.com:40560 in memory (size: 33.0 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1117_piece0 on phd40010007.xyz.com:37078 in memory (size: 33.0 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1117
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1156
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1156
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1156 of size 77816 dropped from memory (free 2727152960)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1156_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1156_piece0 of size 27738 dropped from memory (free 2727180698)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1156_piece0 on phe40010004.xyz.com:36705 in memory (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1156_piece0
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1156_piece0 on phd40010020.xyz.com:40560 in memory (size: 27.1 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1156
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1155
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1155
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1155 of size 51920 dropped from memory (free 2727232618)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1155_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1155_piece0 of size 33649 dropped from memory (free 2727266267)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1155_piece0 on phe40010004.xyz.com:36705 in memory (size: 32.9 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1155_piece0
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1155_piece0 on phd40010043.xyz.com:15447 in memory (size: 32.9 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1155
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1154
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1154
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1154 of size 48640 dropped from memory (free 2727314907)
15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1154_piece0
15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1154_piece0 of size 32071 dropped from memory (free 2727346978)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1154_piece0 on phe40010004.xyz.com:36705 in memory (size: 31.3 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1154_piece0
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450320120000 ms.0 from job set of time 1450320120000 ms
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450319940000 ms.0
py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: null
    at py4j.Protocol.getReturnValue(Protocol.java:417)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:113)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450320300000 ms.0 from job set of time 1450320300000 ms
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450320120000 ms.0
py4j.Py4JException: An exception was raised by the Python Proxy. Return Message: null
    at py4j.Protocol.getReturnValue(Protocol.java:417)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:113)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1154_piece0 on phd40010043.xyz.com:15447 in memory (size: 31.3 KB, free: 2.6 GB)
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450320480000 ms.0 from job set of time 1450320480000 ms
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450320300000 ms.0
py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:240)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    ... 20 more
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450320660000 ms.0 from job set of time 1450320660000 ms
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450320840000 ms.0 from job set of time 1450320840000 ms
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321020000 ms.0 from job set of time 1450321020000 ms
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450320480000 ms.0
py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    ... 19 more
15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1154
15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321200000 ms.0 from job set of time 1450321200000 ms
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450320660000 ms.0
py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    ... 19 more
15/12/16 23:19:03 INFO BlockManager: Removing broadcast 1153
15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450320840000 ms.0
py4j.Py4JException: Error while obtaining a new communication channel
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155)
    at py4j.CallbackClient.sendCommand(CallbackClient.java:229)
    at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111)
    at com.sun.proxy.$Proxy27.call(Unknown Source)
    at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:241)
    at py4j.CallbackConnection.start(CallbackConnection.java:104)
    at py4j.CallbackClient.getConnection(CallbackClient.java:134)
    at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146)
    ...
19 more 15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1153 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321380000 ms.0 from job set of time 1450321380000 ms 15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1153 of size 49576 dropped from memory (free 2727396554) 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321560000 ms.0 from job set of time 1450321560000 ms 15/12/16 23:19:03 INFO BlockManager: Removing block broadcast_1153_piece0 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321740000 ms.0 from job set of time 1450321740000 ms 15/12/16 23:19:03 INFO MemoryStore: Block broadcast_1153_piece0 of size 32564 dropped from memory (free 2727429118) 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321020000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at 
org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 
19 more 15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1153_piece0 on phe40010004.xyz.com:36705 in memory (size: 31.8 KB, free: 2.6 GB) 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450321920000 ms.0 from job set of time 1450321920000 ms 15/12/16 23:19:03 INFO BlockManagerMaster: Updated info of block broadcast_1153_piece0 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450322100000 ms.0 from job set of time 1450322100000 ms 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450322280000 ms.0 from job set of time 1450322280000 ms 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450322460000 ms.0 from job set of time 1450322460000 ms 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450322640000 ms.0 from job set of time 1450322640000 ms 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321200000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at 
org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 
19 more 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321380000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 19 more 15/12/16 23:19:03 INFO BlockManagerInfo: Removed broadcast_1153_piece0 on phd40010043.xyz.com:15447 in memory (size: 31.8 KB, free: 2.6 GB) 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450322820000 ms.0 from job set of time 1450322820000 ms 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321560000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at 
org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 
19 more 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450323000000 ms.0 from job set of time 1450323000000 ms 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450323180000 ms.0 from job set of time 1450323180000 ms 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450323360000 ms.0 from job set of time 1450323360000 ms 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321740000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 19 more 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450321920000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at 
org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 
19 more 15/12/16 23:19:03 INFO ContextCleaner: Cleaned broadcast 1153 15/12/16 23:19:03 INFO JobScheduler: Starting job streaming job 1450323540000 ms.0 from job set of time 1450323540000 ms 15/12/16 23:19:03 ERROR JobScheduler: Error running job streaming job 1450322100000 ms.0 py4j.Py4JException: Error while obtaining a new communication channel at py4j.CallbackClient.getConnectionLock(CallbackClient.java:155) at py4j.CallbackClient.sendCommand(CallbackClient.java:229) at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:111) at com.sun.proxy.$Proxy27.call(Unknown Source) at org.apache.spark.streaming.api.python.TransformFunction.apply(PythonDStream.scala:63) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.api.python.PythonDStream$$anonfun$callForeachRDD$1.apply(PythonDStream.scala:156) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply$mcV$sp(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler$$anonfun$run$1.apply(JobScheduler.scala:176) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57) at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:175) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:745) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at java.net.Socket.connect(Socket.java:528) at java.net.Socket.<init>(Socket.java:425) at java.net.Socket.<init>(Socket.java:241) at py4j.CallbackConnection.start(CallbackConnection.java:104) at py4j.CallbackClient.getConnection(CallbackClient.java:134) at py4j.CallbackClient.getConnectionLock(CallbackClient.java:146) ... 19 more -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-PySpark-1-3-randomly-losing-connection-tp25742.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org