Hi All,

Please help me with this error:
17/12/20 11:07:16 INFO executor.CoarseGrainedExecutorBackend: Started
daemon with process name: 19581@ddh-dev-dataproc-sw-hdgx
17/12/20 11:07:16 INFO util.SignalUtils: Registered signal handler for TERM
17/12/20 11:07:16 INFO util.SignalUtils: Registered signal handler for HUP
17/12/20 11:07:16 INFO util.SignalUtils: Registered signal handler for INT
17/12/20 11:07:16 INFO spark.SecurityManager: Changing view acls to:
yarn,tkmafag
17/12/20 11:07:16 INFO spark.SecurityManager: Changing modify acls to:
yarn,tkmafag
17/12/20 11:07:16 INFO spark.SecurityManager: Changing view acls groups to:
17/12/20 11:07:16 INFO spark.SecurityManager: Changing modify acls groups to:
17/12/20 11:07:16 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users  with view
permissions: Set(yarn, tkmafag); groups with view permissions: Set();
users  with modify permissions: Set(yarn, tkmafag); groups with modify
permissions: Set()
17/12/20 11:07:16 INFO client.TransportClientFactory: Successfully
created connection to /10.206.52.20:35617 after 48 ms (0 ms spent in
bootstraps)
17/12/20 11:07:17 INFO spark.SecurityManager: Changing view acls to:
yarn,tkmafag
17/12/20 11:07:17 INFO spark.SecurityManager: Changing modify acls to:
yarn,tkmafag
17/12/20 11:07:17 INFO spark.SecurityManager: Changing view acls groups to:
17/12/20 11:07:17 INFO spark.SecurityManager: Changing modify acls groups to:
17/12/20 11:07:17 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users  with view
permissions: Set(yarn, tkmafag); groups with view permissions: Set();
users  with modify permissions: Set(yarn, tkmafag); groups with modify
permissions: Set()
17/12/20 11:07:17 INFO client.TransportClientFactory: Successfully
created connection to /10.206.52.20:35617 after 1 ms (0 ms spent in
bootstraps)
17/12/20 11:07:17 INFO storage.DiskBlockManager: Created local
directory at 
/hadoop/yarn/nm-local-dir/usercache/tkmafag/appcache/application_1512677738429_16167/blockmgr-d585ecec-829a-432b-a8f1-89503359510e
17/12/20 11:07:17 INFO memory.MemoryStore: MemoryStore started with
capacity 7.8 GB
17/12/20 11:07:17 INFO executor.CoarseGrainedExecutorBackend:
Connecting to driver:
spark://CoarseGrainedScheduler@10.206.52.20:35617
17/12/20 11:07:17 INFO executor.CoarseGrainedExecutorBackend:
Successfully registered with driver
17/12/20 11:07:17 INFO executor.Executor: Starting executor ID 1 on
host ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal
17/12/20 11:07:17 INFO util.Utils: Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port
60054.
17/12/20 11:07:17 INFO netty.NettyBlockTransferService: Server created
on ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal:60054
17/12/20 11:07:17 INFO storage.BlockManager: Using
org.apache.spark.storage.RandomBlockReplicationPolicy for block
replication policy
17/12/20 11:07:17 INFO storage.BlockManagerMaster: Registering
BlockManager BlockManagerId(1,
ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal, 60054, None)
17/12/20 11:07:17 INFO storage.BlockManagerMaster: Registered
BlockManager BlockManagerId(1,
ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal, 60054, None)
17/12/20 11:07:17 INFO storage.BlockManager: external shuffle service
port = 7337
17/12/20 11:07:17 INFO storage.BlockManager: Registering executor with
local external shuffle service.
17/12/20 11:07:17 INFO client.TransportClientFactory: Successfully
created connection to
ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal/10.206.53.214:7337
after 1 ms (0 ms spent in bootstraps)
17/12/20 11:07:17 INFO storage.BlockManager: Initialized BlockManager:
BlockManagerId(1, ddh-dev-dataproc-sw-hdgx.c.kohls-ddh-lle.internal,
60054, None)
17/12/20 11:08:21 ERROR executor.CoarseGrainedExecutorBackend:
RECEIVED SIGNAL TERM
17/12/20 11:08:21 INFO storage.DiskBlockManager: Shutdown hook called
17/12/20 11:08:21 INFO util.ShutdownHookManager: Shutdown hook called

I am using the following Spark config:
 maxCores = 5
 driverMemory = 2g
 executorMemory = 17g
 executorInstances = 100
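This corresponds roughly to a submit command like the one below (the class name and jar are placeholders, and I am assuming maxCores maps to spark.cores.max):

```shell
# Placeholder class and jar; memory/executor settings match the list above.
spark-submit \
  --master yarn \
  --driver-memory 2g \
  --executor-memory 17g \
  --num-executors 100 \
  --conf spark.cores.max=5 \
  --class com.example.CountJob \
  count-job.jar
```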

Out of 100 executors, my job ends up with only 10 active executors, even
though enough memory is available. Even when I set the executor count to
250, only 10 remain active.

All I am trying to do is load a multi-partition Hive table and run
df.count over it.
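In code, the job is essentially just this (the database and table names below are placeholders; the executor settings come from the submit config listed above):

```python
from pyspark.sql import SparkSession

# Hive-enabled session; executor memory/instance settings are supplied
# at submit time rather than here.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Placeholder table name for the multi-partition Hive table.
df = spark.table("my_db.my_partitioned_table")
print(df.count())
```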

Please help me understand what is causing the executors to be killed.

Thanks & Regards,
*Vishal Verma*

