Hello,

Integrating Ignite with YARN using *Multicast Based Discovery* works fine in my local Spark and YARN cluster. In our production environment, however, some ports cannot be opened, so I need to specify static IP addresses for the nodes to discover each other.
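For reference, a static address list can also be set up in code rather than in Spring XML. The following is only a minimal sketch, assuming the three host addresses and the cache name used in this post and the closure-based `IgniteContext` constructor from Ignite 1.6; the object name `StaticDiscoverySketch` and the helper `staticDiscoveryConfig` are hypothetical names chosen for illustration.

```scala
import java.util.Arrays

import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.IgniteContext
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder
import org.apache.spark.{SparkConf, SparkContext}

object StaticDiscoverySketch {
  // Hypothetical helper (not part of Ignite): builds a configuration whose
  // discovery SPI uses a fixed address list instead of multicast.
  def staticDiscoveryConfig(): IgniteConfiguration = {
    val ipFinder = new TcpDiscoveryVmIpFinder()
    // Assumed addresses: the same hosts and port range as in the XML config.
    ipFinder.setAddresses(Arrays.asList(
      "172.16.186.200:47500..47509",
      "172.16.186.201:47500..47509",
      "172.16.186.202:47500..47509"))

    val discoverySpi = new TcpDiscoverySpi()
    discoverySpi.setIpFinder(ipFinder)

    val cfg = new IgniteConfiguration()
    cfg.setDiscoverySpi(discoverySpi)
    cfg
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("StaticDiscoverySketch"))
    // Pass a closure that builds the configuration instead of an XML file path.
    val ic = new IgniteContext[Integer, Integer](sc, () => staticDiscoveryConfig())
    val sharedRDD = ic.fromCache("sharedIgniteRDD-ling-sha111o")
    println("count => " + sharedRDD.count())
  }
}
```

With this variant the configuration is built in code wherever the closure runs, so no Spring XML file has to be parsed there. Whether that is preferable to the XML file depends on how the rest of the cluster is configured.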
However, when I ran with my configuration I encountered the following issue. My detailed steps are listed below.

1. config/default-config.xml

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="
        http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
                        <property name="addresses">
                            <list>
                                <value>172.16.186.200:47500..47509</value>
                                <value>172.16.186.201:47500..47509</value>
                                <value>172.16.186.202:47500..47509</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</beans>

2. My Scala code in IDEA

package com.ignite

import org.apache.ignite.spark._
import org.apache.ignite.configuration._
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by limu on 2016/8/14.
 */
object testIgniteSharedRDD {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("testIgniteSharedRDD")
    val sc = new SparkContext(conf)

    /*
    val cfg = new IgniteConfiguration()
    cfg.setIgniteHome("/usr/apache-ignite-fabric-1.6.0-bin")
    */

    //val ic = new IgniteContext[Integer, Integer](sc, () => new IgniteConfiguration())
    // Load the Ignite configuration from the Spring XML file shown in step 1.
    val ic = new IgniteContext[Integer, Integer](sc, "/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml")
    val sharedRDD = ic.fromCache("sharedIgniteRDD-ling-sha111o")
    println("original.sharedCounter=> " + sharedRDD.count())
    sharedRDD.savePairs(sc.parallelize(1 to 77000, 10).map(i => (new Integer(i), new Integer(i))))
    println("final.sharedCounter=> " + sharedRDD.count())
    println("final.condition.couner=> " + sharedRDD.filter(_._2 > 21000).count)
  }
}

3. YARN container logs

Logs for container_1471869381289_0001_01_000001

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop-2.4.1/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hadoop-2.4.1/tmp/nm-local-dir/usercache/root/appcache/application_1471869381289_0001/filecache/10/ignite-yarn.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
16/08/22 05:38:52 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 500
16/08/22 05:38:52 INFO client.RMProxy: Connecting to ResourceManager at sparkup1/172.16.186.200:8030
Aug 22, 2016 5:38:53 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Application master registered.
Aug 22, 2016 5:38:53 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 1,908, cpu 1.
Aug 22, 2016 5:38:53 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 1,908, cpu 1.
Aug 22, 2016 5:38:53 AM org.apache.ignite.yarn.ApplicationMaster run
INFO: Making request. Memory: 1,908, cpu 1.
16/08/22 05:38:54 INFO impl.AMRMClientImpl: Received new token for : sparkup3:46170
16/08/22 05:38:54 INFO impl.AMRMClientImpl: Received new token for : sparkup2:55406
Aug 22, 2016 5:38:54 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_1471869381289_0001_01_000002.
16/08/22 05:38:54 INFO impl.ContainerManagementProtocolProxy: Opening proxy : sparkup3:46170
16/08/22 05:38:54 INFO impl.AMRMClientImpl: Received new token for : sparkup1:53711
Aug 22, 2016 5:38:54 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_1471869381289_0001_01_000003.
16/08/22 05:38:54 INFO impl.ContainerManagementProtocolProxy: Opening proxy : sparkup2:55406
Aug 22, 2016 5:38:55 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
INFO: Launching container: container_1471869381289_0001_01_000004.
16/08/22 05:38:55 INFO impl.ContainerManagementProtocolProxy: Opening proxy : sparkup1:53711

4. spark-submit errors

[root@sparkup1 config]# clear
[root@sparkup1 config]# spark-submit --driver-memory 2G --class com.ignite.testIgniteSharedRDD --master yarn --executor-cores 2 --executor-memory 1000m --num-executors 2 --conf spark.rdd.compress=false --conf spark.shuffle.compress=false --conf spark.broadcast.compress=false /root/limu/ignite/spark-project-jar-with-dependencies.jar
16/08/22 05:44:36 INFO spark.SparkContext: Running Spark version 1.6.1
16/08/22 05:44:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/22 05:44:37 INFO spark.SecurityManager: Changing view acls to: root
16/08/22 05:44:37 INFO spark.SecurityManager: Changing modify acls to: root
16/08/22 05:44:37 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/22 05:44:38 INFO util.Utils: Successfully started service 'sparkDriver' on port 43868.
16/08/22 05:44:38 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/08/22 05:44:39 INFO Remoting: Starting remoting
16/08/22 05:44:39 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@172.16.186.200:47472]
16/08/22 05:44:39 INFO util.Utils: Successfully started service 'sparkDriverActorSystem' on port 47472.
16/08/22 05:44:39 INFO spark.SparkEnv: Registering MapOutputTracker
16/08/22 05:44:39 INFO spark.SparkEnv: Registering BlockManagerMaster
16/08/22 05:44:39 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-d1426202-a59e-4436-82b3-5142d5861e0e
16/08/22 05:44:39 INFO storage.MemoryStore: MemoryStore started with capacity 1259.8 MB
16/08/22 05:44:39 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/08/22 05:44:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/08/22 05:44:40 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
16/08/22 05:44:40 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
16/08/22 05:44:40 INFO ui.SparkUI: Started SparkUI at http://172.16.186.200:4040
16/08/22 05:44:40 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-70fba93f-d31c-44d7-ada1-5fe0b9dae5cc/httpd-91da8e86-bee0-4c5a-868a-ea7d35a2536d
16/08/22 05:44:40 INFO spark.HttpServer: Starting HTTP Server
16/08/22 05:44:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/08/22 05:44:40 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:56870
16/08/22 05:44:40 INFO util.Utils: Successfully started service 'HTTP file server' on port 56870.
16/08/22 05:44:41 INFO spark.SparkContext: Added JAR file:/root/limu/ignite/spark-project-jar-with-dependencies.jar at http://172.16.186.200:56870/jars/spark-project-jar-with-dependencies.jar with timestamp 1471869881473
16/08/22 05:44:41 INFO client.RMProxy: Connecting to ResourceManager at sparkup1/172.16.186.200:8032
16/08/22 05:44:42 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
16/08/22 05:44:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/08/22 05:44:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/08/22 05:44:42 INFO yarn.Client: Setting up container launch context for our AM
16/08/22 05:44:42 INFO yarn.Client: Setting up the launch environment for our AM container
16/08/22 05:44:42 INFO yarn.Client: Preparing resources for our AM container
16/08/22 05:44:43 INFO yarn.Client: Uploading resource file:/usr/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar -> hdfs://sparkup1:9000/user/root/.sparkStaging/application_1471869381289_0003/spark-assembly-1.6.1-hadoop2.6.0.jar
16/08/22 05:44:48 INFO yarn.Client: Uploading resource file:/tmp/spark-70fba93f-d31c-44d7-ada1-5fe0b9dae5cc/__spark_conf__522717764919469546.zip -> hdfs://sparkup1:9000/user/root/.sparkStaging/application_1471869381289_0003/__spark_conf__522717764919469546.zip
16/08/22 05:44:48 INFO spark.SecurityManager: Changing view acls to: root
16/08/22 05:44:48 INFO spark.SecurityManager: Changing modify acls to: root
16/08/22 05:44:48 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/08/22 05:44:48 INFO yarn.Client: Submitting application 3 to ResourceManager
16/08/22 05:44:48 INFO impl.YarnClientImpl: Submitted application application_1471869381289_0003
16/08/22 05:44:49 INFO yarn.Client: Application report for application_1471869381289_0003 (state: ACCEPTED)
16/08/22 05:44:49 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: -1
    queue: default
    start time: 1471869888680
    final status: UNDEFINED
    tracking URL: http://sparkup1:8088/proxy/application_1471869381289_0003/
    user: root
16/08/22 05:44:50 INFO yarn.Client: Application report for application_1471869381289_0003 (state: ACCEPTED)
16/08/22 05:44:51 INFO yarn.Client: Application report for application_1471869381289_0003 (state: ACCEPTED)
16/08/22 05:44:52 INFO yarn.Client: Application report for application_1471869381289_0003 (state: ACCEPTED)
16/08/22 05:44:53 INFO yarn.Client: Application report for application_1471869381289_0003 (state: ACCEPTED)
16/08/22 05:44:54 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
16/08/22 05:44:54 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> sparkup1, PROXY_URI_BASES -> http://sparkup1:8088/proxy/application_1471869381289_0003), /proxy/application_1471869381289_0003
16/08/22 05:44:54 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/08/22 05:44:54 INFO yarn.Client: Application report for application_1471869381289_0003 (state: RUNNING)
16/08/22 05:44:54 INFO yarn.Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: 172.16.186.200
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1471869888680
    final status: UNDEFINED
    tracking URL: http://sparkup1:8088/proxy/application_1471869381289_0003/
    user: root
16/08/22 05:44:54 INFO cluster.YarnClientSchedulerBackend: Application application_1471869381289_0003 has started running.
16/08/22 05:44:54 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34347.
16/08/22 05:44:54 INFO netty.NettyBlockTransferService: Server created on 34347
16/08/22 05:44:54 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/08/22 05:44:54 INFO storage.BlockManagerMasterEndpoint: Registering block manager 172.16.186.200:34347 with 1259.8 MB RAM, BlockManagerId(driver, 172.16.186.200, 34347)
16/08/22 05:44:54 INFO storage.BlockManagerMaster: Registered BlockManager
16/08/22 05:45:02 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (sparkup1:36133) with ID 1
16/08/22 05:45:02 INFO storage.BlockManagerMasterEndpoint: Registering block manager sparkup1:37604 with 500.0 MB RAM, BlockManagerId(1, sparkup1, 37604)
16/08/22 05:45:02 INFO cluster.YarnClientSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (sparkup3:35841) with ID 2
16/08/22 05:45:02 INFO storage.BlockManagerMasterEndpoint: Registering block manager sparkup3:53373 with 500.0 MB RAM, BlockManagerId(2, sparkup3, 53373)
16/08/22 05:45:02 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/08/22 05:45:03 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from URL [file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml]
16/08/22 05:45:04 INFO support.GenericApplicationContext: Refreshing org.springframework.context.support.GenericApplicationContext@529c42c6: startup date [Mon Aug 22 05:45:04 PDT 2016]; root of context hierarchy
16/08/22 05:45:04 INFO internal.IgniteKernal:
>>>    __________  ________________
>>>   /  _/ ___/ |/ /  _/_  __/ __/
>>>  _/ // (7 7    // /  / / / _/
>>> /___/\___/_/|_/___/ /_/ /___/
>>>
>>> ver. 1.6.0#20160518-sha1:0b22c45b
>>> 2016 Copyright(C) Apache Software Foundation
>>>
>>> Ignite documentation: http://ignite.apache.org
16/08/22 05:45:04 INFO internal.IgniteKernal: Config URL: n/a
16/08/22 05:45:04 INFO internal.IgniteKernal: Daemon mode: off
16/08/22 05:45:04 INFO internal.IgniteKernal: OS: Linux 2.6.32-431.el6.x86_64 amd64
16/08/22 05:45:04 INFO internal.IgniteKernal: OS user: root
16/08/22 05:45:04 INFO internal.IgniteKernal: Language runtime: Java Platform API Specification ver. 1.7
16/08/22 05:45:04 INFO internal.IgniteKernal: VM information: Java(TM) SE Runtime Environment 1.7.0_71-b14 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 24.71-b01
16/08/22 05:45:04 INFO internal.IgniteKernal: VM total memory: 1.9GB
16/08/22 05:45:04 INFO internal.IgniteKernal: Remote Management [restart: off, REST: on, JMX (remote: off)]
16/08/22 05:45:04 INFO internal.IgniteKernal: IGNITE_HOME=/usr/apache-ignite-fabric-1.6.0-bin
16/08/22 05:45:04 INFO internal.IgniteKernal: VM arguments: [-Xms2G, -Xmx2G, -XX:MaxPermSize=256m]
16/08/22 05:45:04 INFO internal.IgniteKernal: Configured caches ['ignite-marshaller-sys-cache', 'ignite-sys-cache', 'ignite-atomics-sys-cache']
16/08/22 05:45:05 INFO internal.IgniteKernal: 3-rd party licenses can be found at: /usr/apache-ignite-fabric-1.6.0-bin/libs/licenses
16/08/22 05:45:05 INFO internal.IgniteKernal: Non-loopback local IPs: 172.16.186.200, fe80:0:0:0:20c:29ff:fecd:2f25%2
16/08/22 05:45:05 INFO internal.IgniteKernal: Enabled local MACs: 000C29CD2F25
16/08/22 05:45:05 INFO plugin.IgnitePluginProcessor: Configured plugins:
16/08/22 05:45:05 INFO plugin.IgnitePluginProcessor: ^-- None
16/08/22 05:45:05 INFO plugin.IgnitePluginProcessor:
16/08/22 05:45:05 INFO tcp.TcpCommunicationSpi: IPC shared memory server endpoint started [port=48101, tokDir=/usr/apache-ignite-fabric-1.6.0-bin/work/ipc/shmem/d02e3ed0-a14a-4331-9403-822aa7bd2c9f-6908]
16/08/22 05:45:05 INFO tcp.TcpCommunicationSpi: Successfully bound shared memory communication to TCP port [port=48101, locHost=0.0.0.0/0.0.0.0]
16/08/22 05:45:05 INFO tcp.TcpCommunicationSpi: Successfully bound to TCP port [port=47101, locHost=0.0.0.0/0.0.0.0]
16/08/22 05:45:05 WARN noop.NoopCheckpointSpi: Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
16/08/22 05:45:05 WARN collision.GridCollisionManager: Collision resolution is disabled (all jobs will be activated upon arrival).
16/08/22 05:45:05 WARN noop.NoopSwapSpaceSpi: Swap space is disabled. To enable use FileSwapSpaceSpi.
16/08/22 05:45:05 INFO internal.IgniteKernal: Security status [authentication=off, tls/ssl=off]
16/08/22 05:45:05 INFO tcp.GridTcpRestProtocol: Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11212]
16/08/22 05:45:06 WARN internal.IgniteKernal: Attempting to start more nodes than physical RAM available on current host (this can cause significant slowdown)
16/08/22 05:45:06 INFO cache.GridCacheProcessor: Started cache [name=ignite-marshaller-sys-cache, mode=REPLICATED]
16/08/22 05:45:06 INFO cache.GridCacheProcessor: Started cache [name=ignite-atomics-sys-cache, mode=PARTITIONED]
16/08/22 05:45:06 INFO cache.GridCacheProcessor: Started cache [name=ignite-sys-cache, mode=REPLICATED]
16/08/22 05:45:07 INFO internal.IgniteKernal: To start Console Management & Monitoring run ignitevisorcmd.{sh|bat}
16/08/22 05:45:07 INFO internal.IgniteKernal:
16/08/22 05:45:07 INFO internal.IgniteKernal:
>>> +----------------------------------------------------------------------+
>>> Ignite ver. 1.6.0#20160518-sha1:0b22c45bb9b97692208fd0705ddf8045ff34a031
>>> +----------------------------------------------------------------------+
>>> OS name: Linux 2.6.32-431.el6.x86_64 amd64
>>> CPU(s): 1
>>> Heap: 2.0GB
>>> VM name: 6908@sparkup1
>>> Local node [ID=D02E3ED0-A14A-4331-9403-822AA7BD2C9F, order=8,
>>> clientMode=true]
>>> Local node addresses: [sparkup1/0:0:0:0:0:0:0:1%1, /127.0.0.1,
>>> /172.16.186.200]
>>> Local ports: TCP:11212 TCP:47101 TCP:48101
16/08/22 05:45:07 INFO discovery.GridDiscoveryManager: Topology snapshot [ver=8, servers=3, clients=1, CPUs=3, heap=6.5GB]
16/08/22 05:45:07 INFO cache.GridCacheProcessor: Started cache [name=sharedIgniteRDD-ling-sha111o, mode=PARTITIONED]
16/08/22 05:45:07 INFO spark.SparkContext: Starting job: count at testIgniteSharedRDD.scala:19
16/08/22 05:45:07 INFO scheduler.DAGScheduler: Got job 0 (count at testIgniteSharedRDD.scala:19) with 1024 output partitions
16/08/22 05:45:07 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (count at testIgniteSharedRDD.scala:19)
16/08/22 05:45:07 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/08/22 05:45:07 INFO scheduler.DAGScheduler: Missing parents: List()
16/08/22 05:45:07 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (IgniteRDD[0] at RDD at IgniteAbstractRDD.scala:31), which has no missing parents
16/08/22 05:45:08 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1872.0 B, free 1872.0 B)
16/08/22 05:45:08 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1853.0 B, free 3.6 KB)
16/08/22 05:45:08 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.16.186.200:34347 (size: 1853.0 B, free: 1259.8 MB)
16/08/22 05:45:08 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/08/22 05:45:08 INFO scheduler.DAGScheduler: Submitting 1024 missing tasks from ResultStage 0 (IgniteRDD[0] at RDD at IgniteAbstractRDD.scala:31)
16/08/22 05:45:08 INFO cluster.YarnScheduler: Adding task set 0.0 with 1024 tasks
16/08/22 05:45:08 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 0, sparkup1, partition 1,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:08 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 1, sparkup3, partition 0,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:08 INFO scheduler.TaskSetManager: Starting task 3.0 in stage 0.0 (TID 2, sparkup1, partition 3,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:08 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 0.0 (TID 3, sparkup3, partition 2,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:11 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sparkup3:53373 (size: 1853.0 B, free: 500.0 MB)
16/08/22 05:45:11 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on sparkup1:37604 (size: 1853.0 B, free: 500.0 MB)
16/08/22 05:45:19 INFO discovery.GridDiscoveryManager: Added new node to topology: TcpDiscoveryNode [id=d7f0a10f-68d7-4c8f-ab1d-d59c17b5bef7, addrs=[0:0:0:0:0:0:0:1%1, 127.0.0.1, 172.16.186.200], sockAddrs=[sparkup1/172.16.186.200:47501, /0:0:0:0:0:0:0:1%1:47501, /127.0.0.1:47501, /172.16.186.200:47501], discPort=47501, order=9, intOrder=7, lastExchangeTime=1471869918832, loc=false, ver=1.6.0#20160518-sha1:0b22c45b, isClient=false]
16/08/22 05:45:19 INFO discovery.GridDiscoveryManager: Topology snapshot [ver=9, servers=4, clients=1, CPUs=3, heap=7.4GB]
16/08/22 05:45:20 INFO scheduler.TaskSetManager: Starting task 5.0 in stage 0.0 (TID 4, sparkup3, partition 5,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:20 INFO scheduler.TaskSetManager: Starting task 9.0 in stage 0.0 (TID 5, sparkup3, partition 9,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:20 INFO scheduler.TaskSetManager: Finished task 2.0 in stage 0.0 (TID 3) in 12285 ms on sparkup3 (1/1024)
16/08/22 05:45:20 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 1) in 12308 ms on sparkup3 (2/1024)
16/08/22 05:45:22 INFO scheduler.TaskSetManager: Starting task 4.0 in stage 0.0 (TID 6, sparkup1, partition 4,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:22 INFO scheduler.TaskSetManager: Starting task 6.0 in stage 0.0 (TID 7, sparkup1, partition 6,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:22 INFO scheduler.TaskSetManager: Finished task 3.0 in stage 0.0 (TID 2) in 13752 ms on sparkup1 (3/1024)
16/08/22 05:45:22 INFO scheduler.TaskSetManager: Finished task 1.0 in stage 0.0 (TID 0) in 13782 ms on sparkup1 (4/1024)
16/08/22 05:45:43 INFO scheduler.TaskSetManager: Starting task 10.0 in stage 0.0 (TID 8, sparkup1, partition 10,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:43 INFO scheduler.TaskSetManager: Starting task 11.0 in stage 0.0 (TID 9, sparkup1, partition 11,NODE_LOCAL, 1967 bytes)
16/08/22 05:45:43 WARN scheduler.TaskSetManager: Lost task 6.0 in stage 0.0 (TID 7, sparkup1): class org.apache.ignite.IgniteCheckedException: Failed to instantiate Spring XML application context [springUrl=file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml, err=Line 6 in XML document from URL [file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 6; columnNumber: 71; cvc-elt.1: Cannot find the declaration of element 'beans'.]
    at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:391)
    at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:104)
    at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.loadConfigurations(IgniteSpringHelperImpl.java:98)
    at org.apache.ignite.internal.IgnitionEx.loadConfigurations(IgnitionEx.java:639)
    at org.apache.ignite.internal.IgnitionEx.loadConfigurations(IgnitionEx.java:678)
    at org.apache.ignite.internal.IgnitionEx.loadConfiguration(IgnitionEx.java:717)
    at org.apache.ignite.spark.IgniteContext$$anonfun$$lessinit$greater$2.apply(IgniteContext.scala:85)
    at org.apache.ignite.spark.IgniteContext$$anonfun$$lessinit$greater$2.apply(IgniteContext.scala:85)
    at org.apache.ignite.spark.Once.apply(IgniteContext.scala:198)
    at org.apache.ignite.spark.IgniteContext.ignite(IgniteContext.scala:138)
    at org.apache.ignite.spark.impl.IgniteAbstractRDD.ensureCache(IgniteAbstractRDD.scala:37)
    at org.apache.ignite.spark.IgniteRDD.compute(IgniteRDD.scala:58)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 6 in XML document from URL [file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 6; columnNumber: 71; cvc-elt.1: Cannot find the declaration of element 'beans'.
    at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:398)
    at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:335)
    at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:303)
    at org.apache.ignite.internal.util.spring.IgniteSpringHelperImpl.applicationContext(IgniteSpringHelperImpl.java:379)
    ... 19 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 6; columnNumber: 71; cvc-elt.1: Cannot find the declaration of element 'beans'.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.handleStartElement(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.startElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at org.apache.xerces.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at org.springframework.beans.factory.xml.DefaultDocumentLoader.loadDocument(DefaultDocumentLoader.java:76)
    at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadDocument(XmlBeanDefinitionReader.java:428)
    at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:390)
    ... 22 more
16/08/22 05:45:43 INFO scheduler.TaskSetManager: Lost task 4.0 in stage 0.0 (TID 6) on executor sparkup1: org.apache.ignite.IgniteCheckedException (Failed to instantiate Spring XML application context [springUrl=file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml, err=Line 6 in XML document from URL [file:/usr/apache-ignite-fabric-1.6.0-bin/config/default-config.xml] is invalid; nested exception is org.xml.sax.SAXParseException; lineNumber: 6; columnNumber: 71; cvc-elt.1: Cannot find the declaration of element 'beans'.]) [duplicate 1]
^C16/08/22 05:46:02 INFO typedef.G: Invoking shutdown hook...
16/08/22 05:46:02 INFO spark.SparkContext: Invoking stop() from shutdown hook
16/08/22 05:46:02 INFO tcp.GridTcpRestProtocol: Command protocol successfully stopped: TCP binary
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/08/22 05:46:02 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/08/22 05:46:02 INFO ui.SparkUI: Stopped Spark web UI at http://172.16.186.200:4040
^C16/08/22 05:46:02 INFO cache.GridCacheProcessor: Stopped cache: ignite-marshaller-sys-cache
16/08/22 05:46:02 INFO cache.GridCacheProcessor: Stopped cache: ignite-sys-cache
16/08/22 05:46:02 INFO cache.GridCacheProcessor: Stopped cache: ignite-atomics-sys-cache
16/08/22 05:46:02 INFO cache.GridCacheProcessor: Stopped cache: sharedIgniteRDD-ling-sha111o
16/08/22 05:46:02 INFO scheduler.DAGScheduler: Job 0 failed: count at testIgniteSharedRDD.scala:19, took 54.886757 s
Exception in thread "main" org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:806)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:804)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:804)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1581)
    at org.apache.spark.SparkContext$$anonfun$stop$9.apply$mcV$sp(SparkContext.scala:1740)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1739)
    at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:596)
    at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:267)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1765)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:239)
    at scala.util.Try$.apply(Try.scala:161)
    at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:239)
    at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:218)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
    at com.ignite.testIgniteSharedRDD$.main(testIgniteSharedRDD.scala:19)
    at com.ignite.testIgniteSharedRDD.main(testIgniteSharedRDD.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/08/22 05:46:02 INFO scheduler.DAGScheduler: ResultStage 0 (count at testIgniteSharedRDD.scala:19) failed in 54.382 s
^C^C^C16/08/22 05:46:03 INFO internal.IgniteKernal:
>>> +---------------------------------------------------------------------------------+
>>> Ignite ver. 1.6.0#20160518-sha1:0b22c45bb9b97692208fd0705ddf8045ff34a031
>>> stopped OK
>>> +---------------------------------------------------------------------------------+
>>> Grid uptime: 00:00:56:139
16/08/22 05:46:03 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerStageCompleted(org.apache.spark.scheduler.StageInfo@6632c145)
16/08/22 05:46:03 ERROR scheduler.LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerJobEnd(0,1471869963305,JobFailed(org.apache.spark.SparkException: Job 0 cancelled because SparkContext was shut down))
16/08/22 05:46:03 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
16/08/22 05:46:03 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/08/22 05:46:03 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/08/22 05:46:03 INFO cluster.YarnClientSchedulerBackend: Stopped
16/08/22 05:46:03 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
^C16/08/22 05:46:03 INFO storage.MemoryStore: MemoryStore cleared
16/08/22 05:46:03 INFO storage.BlockManager: BlockManager stopped
16/08/22 05:46:03 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/08/22 05:46:03 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/08/22 05:46:03 INFO spark.SparkContext: Successfully stopped SparkContext
16/08/22 05:46:03 INFO util.ShutdownHookManager: Shutdown hook called
16/08/22 05:46:03 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-70fba93f-d31c-44d7-ada1-5fe0b9dae5cc
16/08/22 05:46:03 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/08/22 05:46:03 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/08/22 05:46:03 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-70fba93f-d31c-44d7-ada1-5fe0b9dae5cc/httpd-91da8e86-bee0-4c5a-868a-ea7d35a2536d
[root@sparkup1 config]#

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Yarn-deployment-for-static-TcpDiscoverySpi-issues-Urgent-In-Production-tp7205.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.