/u01/spark-1.6.0-hive/bin/spark-submit --driver-memory 4G --class com.ETLTransform --master yarn --executor-cores 4 --executor-memory 1000m --num-executors 20 --conf spark.rdd.compress=false --conf spark.shuffle.compress=false --conf spark.broadcast.compress=false /u01/spark_engine863.jar -quesize 10 -batchSize 5000 -writethread 30 -runningSeconds 20

args name=-quesize value=10
args name=-batchSize value=5000
args name=-writethread value=30
args name=-runningSeconds value=20
16/09/02 22:21:17 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
16/09/02 22:21:17 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
16/09/02 22:21:17 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
16/09/02 22:21:17 WARN Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
ignite:start======
*[Stage 0:==========================================> (15 + 5) / 20]*

The stage hangs at this point and makes no further progress. I opened one of the running executors in the Spark UI and found the following messages in its log:

16/09/02 22:21:43 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hbase); users with modify permissions: Set(hbase)
16/09/02 22:21:44 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/09/02 22:21:44 INFO Remoting: Starting remoting
16/09/02 22:21:44 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkExecutorActorSystem@vmsecdomain010194070063.cm10:56914]
16/09/02 22:21:44 INFO util.Utils: Successfully started service 'sparkExecutorActorSystem' on port 56914.
16/09/02 22:21:44 INFO storage.DiskBlockManager: Created local directory at /u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/blockmgr-218e5cde-129e-41f8-b05e-c262e24c346f
16/09/02 22:21:44 INFO storage.MemoryStore: MemoryStore started with capacity 6.8 GB
16/09/02 22:21:44 INFO executor.CoarseGrainedExecutorBackend: Connecting to driver: spark://CoarseGrainedScheduler@10.194.70.26:51811
16/09/02 22:21:44 INFO executor.CoarseGrainedExecutorBackend: Successfully registered with driver
16/09/02 22:21:44 INFO executor.Executor: Starting executor ID 2 on host xxxxxxx
16/09/02 22:21:45 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41006.
16/09/02 22:21:45 INFO netty.NettyBlockTransferService: Server created on 41006
16/09/02 22:21:45 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/09/02 22:21:45 INFO storage.BlockManagerMaster: Registered BlockManager
16/09/02 22:21:50 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 1
16/09/02 22:21:50 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 9
16/09/02 22:21:50 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 17
16/09/02 22:21:50 INFO executor.Executor: Running task 9.0 in stage 0.0 (TID 9)
16/09/02 22:21:50 INFO executor.Executor: Running task 1.0 in stage 0.0 (TID 1)
16/09/02 22:21:50 INFO executor.Executor: Running task 17.0 in stage 0.0 (TID 17)
16/09/02 22:21:50 INFO executor.Executor: Fetching http://10.194.70.26:48676/jars/spark_zmqpull_engine863.jar with timestamp 1472826078761
16/09/02 22:21:50 INFO util.Utils: Fetching http://10.194.70.26:48676/jars/spark_zmqpull_engine863.jar to /u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/spark-50073b87-72e3-4f4a-84cd-f01a7d5061dd/fetchFileTemp8818016450869617667.tmp
16/09/02 22:21:58 INFO util.Utils: Copying /u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/spark-50073b87-72e3-4f4a-84cd-f01a7d5061dd/-10798427751472826078761_cache to /u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/container_1455892346017_5645_01_000009/./spark_zmqpull_engine863.jar
16/09/02 22:21:59 INFO executor.Executor: Adding file:/u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/container_1455892346017_5645_01_000009/./spark_zmqpull_engine863.jar to class loader
16/09/02 22:21:59 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
16/09/02 22:21:59 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1568.0 B, free 1568.0 B)
16/09/02 22:21:59 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 129 ms
16/09/02 22:21:59 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1608.0 B, free 3.1 KB)
16/09/02 22:21:59 INFO internal.IgniteKernal:
>>> __________ ________________
>>> / _/ ___/ |/ / _/_ __/ __/
>>> _/ // (7 7 // / / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 1.6.0#20160518-sha1:0b22c45b
>>> 2016 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org
16/09/02 22:21:59 INFO internal.IgniteKernal: Config URL: n/a
16/09/02 22:21:59 INFO internal.IgniteKernal: Daemon mode: off
16/09/02 22:21:59 INFO internal.IgniteKernal: OS: Linux 2.6.32-220.23.2.ali878.el6.x86_64 amd64
16/09/02 22:21:59 INFO internal.IgniteKernal: OS user: hbase
16/09/02 22:21:59 INFO internal.IgniteKernal: Language runtime: Java Platform API Specification ver. 1.7
16/09/02 22:21:59 INFO internal.IgniteKernal: VM information: Java(TM) SE Runtime Environment 1.7.0_79-b15 Oracle Corporation OpenJDK (Alibaba) 64-Bit Server VM 24.79-b02-internal
16/09/02 22:21:59 INFO internal.IgniteKernal: VM total memory: 9.4GB
16/09/02 22:21:59 INFO internal.IgniteKernal: Remote Management [restart: off, REST: on, JMX (remote: off)]
16/09/02 22:21:59 INFO internal.IgniteKernal: IGNITE_HOME=null
16/09/02 22:21:59 INFO internal.IgniteKernal: VM arguments: [-XX:OnOutOfMemoryError=kill %p, -Xms10000m, -Xmx10000m, -XX:MaxPermSize=256M, -Djava.io.tmpdir=/u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/container_1455892346017_5645_01_000009/tmp, -Dspark.driver.port=51811, -Dspark.yarn.app.container.log.dir=/u01/hbase/hadoop-2.5.0-cdh5.3.0/logs/userlogs/application_1455892346017_5645/container_1455892346017_5645_01_000009, -XX:MaxPermSize=256m]
16/09/02 22:21:59 INFO internal.IgniteKernal: Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache', 'ignite-atomics-sys-cache']
16/09/02 22:22:00 INFO internal.IgniteKernal: Non-loopback local IPs: 10.194.70.63
16/09/02 22:22:00 INFO internal.IgniteKernal: Enabled local MACs: 283152A77F79
16/09/02 22:22:00 INFO plugin.IgnitePluginProcessor: Configured plugins:
16/09/02 22:22:00 INFO plugin.IgnitePluginProcessor: ^-- None
16/09/02 22:22:00 INFO plugin.IgnitePluginProcessor:
16/09/02 22:22:00 INFO tcp.TcpCommunicationSpi: IPC shared memory server endpoint started [port=48101, tokDir=/u01/hbase/tmp/nm-local-dir/usercache/hbase/appcache/application_1455892346017_5645/container_1455892346017_5645_01_000009/tmp/ignite/work/ipc/shmem/cad18430-c808-4c97-a6b1-ecff919b0108-82210]
16/09/02 22:22:00 INFO tcp.TcpCommunicationSpi: Successfully bound shared memory communication to TCP port [port=48101, locHost=0.0.0.0/0.0.0.0]
16/09/02 22:22:00 INFO tcp.TcpCommunicationSpi: Successfully bound to TCP port [port=47101, locHost=0.0.0.0/0.0.0.0]
16/09/02 22:22:00 WARN noop.NoopCheckpointSpi: Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
16/09/02 22:22:00 WARN collision.GridCollisionManager: Collision resolution is disabled (all jobs will be activated upon arrival).
16/09/02 22:22:00 WARN noop.NoopSwapSpaceSpi: Swap space is disabled. To enable use FileSwapSpaceSpi.
16/09/02 22:22:00 INFO internal.IgniteKernal: Security status [authentication=off, tls/ssl=off]
16/09/02 22:22:00 INFO tcp.GridTcpRestProtocol: Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11212]
16/09/02 22:22:00 INFO tcp.TcpDiscoverySpi: Successfully bound to TCP port [port=47501, localHost=0.0.0.0/0.0.0.0]
16/09/02 22:22:00 WARN multicast.TcpDiscoveryMulticastIpFinder: TcpDiscoveryMulticastIpFinder has no pre-configured addresses (it is recommended in production to specify at least one address in TcpDiscoveryMulticastIpFinder.getAddresses() configuration property)
16/09/02 22:22:05 INFO cache.GridCacheProcessor: Started cache [name=ignite-marshaller-sys-cache, mode=REPLICATED]
16/09/02 22:22:05 INFO cache.GridCacheProcessor: Started cache [name=embedCache, mode=PARTITIONED]
16/09/02 22:22:05 INFO cache.GridCacheProcessor: Started cache [name=ignite-atomics-sys-cache, mode=PARTITIONED]
16/09/02 22:22:05 INFO cache.GridCacheProcessor: Started cache [name=ignite-sys-cache, mode=REPLICATED]
16/09/02 22:22:09 INFO discovery.GridDiscoveryManager: Added new node to topology: TcpDiscoveryNode [id=68644149-534e-4a21-932d-43c02062d085, addrs=[10.194.70.71, 127.0.0.1], sockAddrs=[xx:47500, /xx:47500, /127.0.0.1:47500], discPort=47500, order=1187, intOrder=640, lastExchangeTime=1472826127728, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
16/09/02 22:22:09 INFO discovery.GridDiscoveryManager: Topology snapshot [ver=1187, servers=92, clients=1, CPUs=1824, heap=600.0GB]
16/09/02 22:22:35 WARN cache.GridCachePartitionExchangeManager: Failed to wait for initial partition map exchange. Possible reasons are:
    ^-- Transactions in deadlock.
    ^-- Long running transactions (ignore if this is the case).
    ^-- Unreleased explicit locks.
16/09/02 22:23:05 WARN cache.GridCachePartitionExchangeManager: Still waiting for initial partition map exchange [fut=GridDhtPartitionsExchangeFuture [dummy=false, forcePreload=false, reassign=false, discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=cad18430-c808-4c97-a6b1-ecff919b0108, addrs=[10.194.70.63, 127.0.0.1], sockAddrs=[xx.cm10/yy:47501, /yy:47501, /127.0.0.1:47501], discPort=47501, order=1186, intOrder=639, lastExchangeTime=1472826184572, loc=true, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false], topVer=1186, nodeId8=cad18430, msg=null, type=NODE_JOINED, tstamp=1472826124647], crd=TcpDiscoveryNode [id=e13a3446-b45a-4f31-9740-f8a75ac3fd78, addrs=[10.194.70.60, 127.0.0.1], sockAddrs=[xx/bb:47500, /bb:47500, /127.0.0.1:47500], discPort=47500, order=876, intOrder=459, lastExchangeTime=1472826122976, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false], exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=1186, minorTopVer=0], nodeId=cad18430, evt=NODE_JOINED], added=false, initFut=GridFutureAdapter [resFlag=0, res=null, startTime=1472826125231, endTime=0, ignoreInterrupts=false, state=INIT], init=false, topSnapshot=null, lastVer=null, partReleaseFut=null, affChangeMsg=null, skipPreload=false, clientOnlyExchange=false, initTs=1472826125231, centralizedAff=false, evtLatch=0, remaining=[c8c7f8c4-3db4-4215-ba1c-dc45502d6327, 4aaca392-4985-4f4d-8ea3-66281584be36, 0d45635b-fc3e-4650-8f46-6336caae9b89, 1e7ebe14-ba8a-4711-93d3-11cf2c931d50, 5d0949e6-730d-4eef-90af-1fe1ada4fef2, 5698e9ab-ae20-4155-bcf7-f60945c7b688, 30a42087-2cde-4937-a3f1-4924e4a947a3, 588cf5af-5f39-405c-bdf5-adc9030375d2, 44c03c97-fb32-4fc0-beb5-4551c2857183, 0847fce3-56ef-48eb-91ae-c07ac45c8340, 9e11121e-e928-4986-971a-f5f82fc44ab8, 3b2c76ff-d384-443f-9575-59048d8631cf, e3298869-f1e3-4d88-93c5-c22b0e3f05c7, c97a064a-a4d2-4090-80cb-bf34f7aaf6a3, 44b9a00d-5f01-46a2-9709-7be87128d32b, 465bd409-8ea9-4d23-9ad6-8dfceef6f852, 1d4250aa-d289-4ebe-903d-08aafa5e10b1, 
093185dd-9934-4fe5-b60e-13871c35259a, c19e588c-6a94-4276-8238-9b45aa6cfdee, cec1a06a-c944-487a-8ac6-90492e3481cc, 43f4b602-f0d0-4453-8c56-2c480b4dea98, 44aaf175-f371-4dee-9e06-e484fe0e47ed, 4678a454-c8c1-4485-8780-bec84528ca62, 2a3a6591-2f2b-4f7a-b02d-c31b16b8aa6d, 90c0446c-ad67-4cd7-badb-44d944b068ec, d3ac0e36-e631-4c19-8a0c-65bd933dd985, 0f13528a-509e-472c-8a05-65a384cbff52, a302ee03-111a-499b-b090-1d2bf9f2c6cc, f7ecaea0-2ec4-4260-9603-8c0789a92a94, 9a8a4150-49ef-4b5b-b6a5-6f4931adaddc, ae7cfab9-34c8-4a24-8105-1b9820e7614f, 9bb3a6cd-900f-496e-b0ab-df08d2b4b0de, c1e4ae9b-2c2b-4aff-8503-733ad127d63c, c36eb7dc-db09-4238-afe6-3cbef603feaf, 74e3ab66-dffd-4717-aaca-ddb731dc77b0, 1c0ac9c2-cbc5-4091-9706-dd07259c1cf0, f8ac6957-9232-4190-9bdf-9c54bbc4cacb, ae4070b0-cd16-4a30-aefa-4b3b6286fa10, c29bfc50-1e1b-4d4a-bd59-b35ce54ca00d, 52886c04-9f19-4a7f-a512-a91835c11afb, 4ba28d00-4415-4f17-ac7a-6c8145e444a6, 32d460a8-c47d-4e30-888a-f9ca7c5d0fa4, 2c08987e-e22c-4243-8950-2c6482f36ad1, 7ceb7755-4597-4ee2-a29b-49b114c6137d, 78b37939-8917-42a2-8b6b-a0e10e0ebd15, 7f96eb82-1c17-45d1-a9d7-5a592ecb069e, a95afa33-8060-4872-9d69-97022cd83c70, f38a66ac-ed39-4f1a-bc68-528a6283c377, e6f43a4b-12be-4b8c-a108-baa385063729, 84bd0196-fdbb-4c10-86ed-9d4412057835, f627c4bc-fecc-4293-9218-f5ca13e5c841, b75a22fa-bead-49ae-8a0d-38aa06ccc683, 201cea0f-8b07-4b22-9b92-ff338c3de827, 787117ee-544b-4519-b35a-a7264c07bc1e, 3998dc72-b2fc-4338-b7a8-94fbd704c1b3, 8e11d28a-30a1-40c4-a34e-5ea8128ae099, 761270ac-5b5e-45fe-af43-8d4252c396b3, 01899710-287f-4c15-b567-58281c69857a, 1085f698-3c03-42d1-96af-816c1eb9eb4e, 76e0979b-40d5-4bf8-a8fd-f623c8a0e200, e77e58e5-6f30-41a0-b5a5-5b59405353cc, ce5b8430-8061-406a-ae29-ced20d5a21f4, 5492c340-1441-429a-9a01-9741a84c5e2d, 3980e286-125b-4370-a785-0bf229ab223a, 1b63cbca-b944-4511-b5f8-f7205f8e8377, f7c8ee22-f18a-4d5e-a500-40754786cab9, f2deea1b-282d-4a1f-9edc-b07bdd6f478e, cd736923-0978-4ecb-9a1a-ebf87d760bd6, 73d3bfd4-0eb2-40d6-8f47-a65b4adc5cbc, 
28498f9a-983c-4683-b588-dc37e15ef84c, 3b436182-790f-48bf-bb07-0dd3f3ef013e, d96425ef-15e8-437f-8402-8273f0b49c7b, f492bab3-986c-4d51-aa00-99f71ff87660, 454c3ac1-1946-4289-921b-33868d278ac3, b0521958-7fbc-464d-a135-601fd78a0cdf, 9a691f5e-cb6d-46a0-b99d-ea0aee88c2c4, 0383e4bb-c560-4e54-a5c0-7a8042a9c983, 42566a31-ee1d-4abf-ad42-78ba9f1bc16c, e300013c-df6e-4bba-bc9a-b83441cb6b59, 6b189281-fb50-4a7a-b007-1365a535dd8a, 44a8749e-aa3e-42de-8d77-1fb7beb746a5, e13a3446-b45a-4f31-9740-f8a75ac3fd78, 538cb865-05a5-477b-97fb-6ad4bc72149b, 72228a4b-6059-4f9c-b9c8-2319e2461708, bf51c304-6b3d-4f25-bdfe-98b5f0bf5176, 52fade64-efc7-4b8d-bda2-017b3aa53f42, 0b16dec0-ec90-46f8-ac71-9c5bb833e5e1, b31e02df-91d3-4308-adf0-1a25e3b82642, 807d4d47-87b4-43ac-9bbf-e31745cbd5f2, c72a35e4-021c-483c-b8ea-81e8099eabce], srvNodes=[TcpDiscoveryNode [id=e13a3446-b45a-4f31-9740-f8a75ac3fd78, addrs=[10.194.70.60, 127.0.0.1], sockAddrs=[zzz/yyy:47500, /xxx:47500, /127.0.0.1:47500], discPort=47500, order=876, intOrder=459, lastExchangeTime=1472826122976, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false], TcpDiscoveryNode [id=e77e58e5-6f30-41a0-b5a5-5b59405353cc, addrs=[10.194.78.21, 127.0.0.1], sockAddrs=[yyyy.cm10/mm:47500, /10.194.78.21:47500, /127.0.0.1:47500], discPort=47500, order=877, intOrder=460, lastExchangeTime=1472826122976, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false], TcpDiscoveryNode [id=6b189281-fb50-4a7a-b007-1365a535dd8a, addrs=[10.194.62.38, 127.0.0.1], sockAddrs=[xxx.cm10/xxx:47500, /yy:47500, /127.0.0.1:47500], discPort=47500, order=878, intOrder=461, lastExchangeTime=1472826122986, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false], TcpDiscoveryNode [id=3998dc72-b2fc-4338-b7a8-94fbd704c1b3, addrs=[10.194.54.71, 127.0.0.1],

Can anyone help me with this? Thanks!