Hi Yana,

As per my custom split code, only three splits are submitted to the system, so three executors should be sufficient. However, it ran 8 executors. The logs of the first three executors show exactly the output I want (I put some System.out.println statements in the code to debug it on the console), but the next five show various seemingly random exceptions.
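For reference, the shape of my split logic is roughly like the sketch below. This is a simplified stand-in, not the real Hadoop InputFormat API (the real `org.apache.hadoop.mapreduce.InputFormat.getSplits` takes a `JobContext` and returns `List<InputSplit>`), and the class and field names here are illustrative, not my actual code. The point is only that the number of splits returned is what fixes the number of read tasks:

```java
// Schematic stand-in for a custom getSplits(); names are hypothetical.
import java.util.ArrayList;
import java.util.List;

public class SplitCountDemo {
    // Hypothetical split descriptor: a row range in the source DB.
    static class DbSplit {
        final long startRow, endRow;
        DbSplit(long startRow, long endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    // Returns exactly three splits, so only three read tasks are expected,
    // regardless of how many executors the cluster happens to launch.
    static List<DbSplit> getSplits(long totalRows) {
        List<DbSplit> splits = new ArrayList<>();
        long chunk = totalRows / 3;
        for (int i = 0; i < 3; i++) {
            long start = i * chunk;
            long end = (i == 2) ? totalRows : start + chunk;
            splits.add(new DbSplit(start, end));
        }
        return splits;
    }

    public static void main(String[] args) {
        System.out.println("splits=" + getSplits(3000L).size()); // prints "splits=3"
    }
}
```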
I think the first three executors did not exit properly, so the driver launched more executors on top of them; this created many processes hitting the same application, and the overall result is that it fails. From the logs I can see the first three executors returned with exit status 1. The logs are below:

15/01/23 15:51:39 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/01/23 15:51:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/01/23 15:51:40 INFO spark.SecurityManager: Changing view acls to: sparkAdmin
15/01/23 15:51:40 INFO spark.SecurityManager: Changing modify acls to: sparkAdmin
15/01/23 15:51:40 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(sparkAdmin); users with modify permissions: Set(sparkAdmin)
15/01/23 15:51:40 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/01/23 15:51:40 INFO Remoting: Starting remoting
15/01/23 15:51:41 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@VM219:40166]
15/01/23 15:51:41 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 40166.
15/01/23 15:51:41 INFO spark.SecurityManager: Changing view acls to: sparkAdmin
15/01/23 15:51:41 INFO spark.SecurityManager: Changing modify acls to: sparkAdmin
15/01/23 15:51:41 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(sparkAdmin); users with modify permissions: Set(sparkAdmin)
15/01/23 15:51:41 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/23 15:51:41 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/23 15:51:41 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/01/23 15:51:41 INFO Remoting: Starting remoting
15/01/23 15:51:41 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
15/01/23 15:51:41 INFO util.Utils: Successfully started service 'sparkExecutor' on port 57695.
15/01/23 15:51:41 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutor@VM219:57695]
15/01/23 15:51:41 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: akka.tcp://sparkDriver@VM220:53484/user/CoarseGrainedScheduler
15/01/23 15:51:41 INFO worker.WorkerWatcher: Connecting to worker akka.tcp://sparkWorker@VM219:44826/user/Worker
15/01/23 15:51:41 INFO worker.WorkerWatcher: Successfully connected to akka.tcp://sparkWorker@VM219:44826/user/Worker
15/01/23 15:51:41 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
15/01/23 15:51:41 INFO spark.SecurityManager: Changing view acls to: sparkAdmin
15/01/23 15:51:41 INFO spark.SecurityManager: Changing modify acls to: sparkAdmin
15/01/23 15:51:41 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(sparkAdmin); users with modify permissions: Set(sparkAdmin)
15/01/23 15:51:41 INFO util.AkkaUtils: Connecting to MapOutputTracker: akka.tcp://sparkDriver@VM220:53484/user/MapOutputTracker
15/01/23 15:51:41 INFO util.AkkaUtils: Connecting to BlockManagerMaster: akka.tcp://sparkDriver@VM220:53484/user/BlockManagerMaster
15/01/23 15:51:41 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-local-20150123155141-b237
15/01/23 15:51:41 INFO storage.MemoryStore: MemoryStore started with capacity 529.9 MB
15/01/23 15:51:41 INFO netty.NettyBlockTransferService: Server created on 54273
15/01/23 15:51:41 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/01/23 15:51:41 INFO storage.BlockManagerMaster: Registered BlockManager
15/01/23 15:51:41 INFO util.AkkaUtils: Connecting
to HeartbeatReceiver: akka.tcp://sparkDriver@VM220:53484/user/HeartbeatReceiver
15/01/23 15:51:47 ERROR executor.CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@VM219:57695] -> [akka.tcp://sparkDriver@VM220:53484] disassociated! Shutting down.
15/01/23 15:51:47 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@VM220:53484] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

On 24 January 2015 at 06:37, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:

> It looks to me like your executor actually crashed and didn't just finish
> properly.
>
> Can you check the executor log?
>
> It is available in the UI, or on the worker machine, under
> $SPARK_HOME/work/app-20150123155114-0000/6/stderr (unless you manually
> changed the work directory location, but in that case I'd assume you know
> where to find the log)
>
> On Thu, Jan 22, 2015 at 10:54 PM, Harihar Nahak <hna...@wynyardgroup.com>
> wrote:
>
>> Hi All,
>>
>> I wrote a custom reader to read a DB, and it is able to return key and
>> value as expected, but after it finished it never returned to the driver.
>>
>> Here is the output of the worker log:
>>
>> 15/01/23 15:51:38 INFO worker.ExecutorRunner: Launch command: "java" "-cp" "::/usr/local/spark-1.2.0-bin-hadoop2.4/sbin/../conf:/usr/local/spark-1.2.0-bin-hadoop2.4/lib/spark-assembly-1.2.0-hadoop2.4.0.jar:/usr/local/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/spark-1.2.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/etc/hadoop" "-XX:MaxPermSize=128m" "-Dspark.driver.port=53484" "-Xms1024M" "-Xmx1024M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver@VM90:53484/user/CoarseGrainedScheduler" "6" "VM99" "4" "app-20150123155114-0000" "akka.tcp://sparkWorker@VM99:44826/user/Worker"
>> 15/01/23 15:51:47 INFO worker.Worker:
Executor app-20150123155114-0000/6 finished with state EXITED message Command exited with code 1 exitStatus 1
>> 15/01/23 15:51:47 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkExecutor@VM99:57695] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>> 15/01/23 15:51:47 INFO actor.LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkWorker/deadLetters] to Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%40143.96.25.29%3A35065-4#-915179653] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>> 15/01/23 15:51:49 INFO worker.Worker: Asked to kill unknown executor app-20150123155114-0000/6
>>
>> If someone notices any clue to fixing this, it will be really appreciated.
>>
>> -----
>> --Harihar
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Results-never-return-to-driver-Spark-Custom-Reader-tp21328.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
Regards,
Harihar Nahak
BigData Developer, Wynyard
Email: hna...@wynyardgroup.com | Extn: 8019
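P.S. For anyone following this thread, the executor log path Yana points at can be assembled like this on the worker machine. The app id and executor id below are the ones from this thread (substitute your own from the Master UI), and the SPARK_HOME fallback is an assumption based on the classpath in the launch command above:

```shell
# Build the path to an executor's stderr under the standard work directory.
SPARK_HOME=${SPARK_HOME:-/usr/local/spark-1.2.0-bin-hadoop2.4}  # assumed install dir
APP_ID=app-20150123155114-0000   # from the Master UI / worker log
EXEC_ID=6                        # executor number within that application
LOG="$SPARK_HOME/work/$APP_ID/$EXEC_ID/stderr"
echo "$LOG"
# tail -n 100 "$LOG"   # run on the worker itself to read the log
```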