Hi,

Could anyone help a newbie getting started with Ignite on YARN?
Thanks,

Juan

On Sun, Nov 19, 2017 at 7:34 PM, Juan Rodríguez Hortalá <[email protected]> wrote:

> Hi,
>
> I'm trying to run Ignite on AWS EMR as a YARN application, using ZooKeeper
> for node discovery. I have compiled Ignite with
>
> ```
> mvn clean package -DskipTests -Dignite.edition=hadoop -Dhadoop.version=2.7.3
> ```
>
> I'm using ignite_yarn.properties
>
> ```
> # The number of nodes in the cluster.
> IGNITE_NODE_COUNT=3
>
> # The number of CPU Cores for each Apache Ignite node.
> IGNITE_RUN_CPU_PER_NODE=1
>
> # The number of Megabytes of RAM for each Apache Ignite node.
> IGNITE_MEMORY_PER_NODE=500
>
> IGNITE_PATH=hdfs:///user/hadoop/ignite/apache-ignite-2.3.0-hadoop-2.7.3.zip
>
> IGNITE_XML_CONFIG=hdfs:///user/hadoop/ignite/ignite_conf.xml
>
> # Local path
> IGNITE_WORK_DIR=/mnt
>
> # Local path
> IGNITE_RELEASES_DIR=/mnt
>
> IGNITE_WORKING_DIR=/mnt
> ```
>
> and ignite_conf.xml as
>
> ```
> <?xml version="1.0" encoding="UTF-8"?>
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xsi:schemaLocation="
>         http://www.springframework.org/schema/beans
>         http://www.springframework.org/schema/beans/spring-beans.xsd">
>     <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
>         <property name="cacheConfiguration">
>             <list>
>                 <!-- Partitioned replicated cache configuration (Atomic mode). -->
>                 <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                     <property name="name" value="default"/>
>                     <property name="atomicityMode" value="ATOMIC"/>
>                     <property name="backups" value="3"/>
>                     <property name="cacheMode" value="PARTITIONED"/>
>                 </bean>
>             </list>
>         </property>
>
>         <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
>         <property name="discoverySpi">
>             <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>                 <property name="ipFinder">
>                     <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.zk.TcpDiscoveryZookeeperIpFinder">
>                         <!-- FIXME change to master internal API (as used by YARN), e.g. ip-10-0-0-154.ec2.internal:2181 -->
>                         <property name="zkConnectionString" value="ip-10-0-0-173.ec2.internal:2181"/>
>                     </bean>
>                 </property>
>             </bean>
>         </property>
>         <property name="gridLogger">
>             <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
>                 <!-- Default path relative to IGNITE_HOME, assuming IGNITE_HOME is set to the
>                      root of the Ignite installation -->
>                 <constructor-arg type="java.lang.String" value="config/ignite-log4j2.xml"/>
>             </bean>
>         </property>
>     </bean>
> </beans>
> ```
>
> Then I launch the YARN job as
>
> ```
> IGNITE_YARN_JAR=/mnt/ignite/apache-ignite-2.3.0-src/modules/yarn/target/ignite-yarn-2.3.0.jar
> yarn jar ${IGNITE_YARN_JAR} ${IGNITE_YARN_JAR} /mnt/ignite/ignite_yarn.properties
> ```
>
> The app launches and the application master is outputting logs, but
> containers only run for a few seconds, and the application is
> constantly asking for more containers. For example, in the application
> master log
>
> ```
>
> Nov 20, 2017 3:08:30 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
> INFO: Launching container: container_1511142795395_0005_01_017079.
> 17/11/20 03:08:30 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ip-10-0-0-230.ec2.internal:8041
> Nov 20, 2017 3:08:30 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
> INFO: Launching container: container_1511142795395_0005_01_017080.
> 17/11/20 03:08:30 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ip-10-0-0-78.ec2.internal:8041
> Nov 20, 2017 3:08:30 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
> INFO: Launching container: container_1511142795395_0005_01_017081.
> 17/11/20 03:08:30 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ip-10-0-0-193.ec2.internal:8041
> Nov 20, 2017 3:08:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
> INFO: Container completed. Container id: container_1511142795395_0005_01_017080. State: COMPLETE.
> Nov 20, 2017 3:08:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
> INFO: Container completed. Container id: container_1511142795395_0005_01_017081. State: COMPLETE.
> Nov 20, 2017 3:08:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersCompleted
> INFO: Container completed. Container id: container_1511142795395_0005_01_017079. State: COMPLETE.
> Nov 20, 2017 3:08:31 AM org.apache.ignite.yarn.ApplicationMaster onContainersAllocated
>
> ```
>
> In the logs for a node manager I see that containers seem to fail when they
> are launched, because the corresponding bash command is not well formed
>
> ```
> 2017-11-20 03:08:47,810 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl (AsyncDispatcher event handler): Container container_1511142795395_0005_01_017281 transitioned from LOCALIZED to RUNNING
> 2017-11-20 03:08:47,811 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (ContainersLauncher #4): launchContainer: [bash, /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/default_container_executor.sh]
> 2017-11-20 03:08:47,819 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (ContainersLauncher #4): Exit code from container container_1511142795395_0005_01_017281 is : 2
> 2017-11-20 03:08:47,819 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor (ContainersLauncher #4): Exception from container-launch with container ID: container_1511142795395_0005_01_017281 and exit code: 2
> ExitCodeException exitCode=2: /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/launch_container.sh: line 4: syntax error near unexpected token `('
> /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/launch_container.sh: line 4: `export BASH_FUNC_run_prestart()="() { su -s /bin/bash $SVC_USER -c "cd $WORKING_DIR && $EXEC_PATH --config '$CONF_DIR' start $DAEMON_FLAGS"'
>
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
>     at org.apache.hadoop.util.Shell.run(Shell.java:479)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): Exception from container-launch.
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): Container id: container_1511142795395_0005_01_017281
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): Exit code: 2
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): Exception message: /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/launch_container.sh: line 4: syntax error near unexpected token `('
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/launch_container.sh: line 4: `export BASH_FUNC_run_prestart()="() { su -s /bin/bash $SVC_USER -c "cd $WORKING_DIR && $EXEC_PATH --config '$CONF_DIR' start $DAEMON_FLAGS"'
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4):
> 2017-11-20 03:08:47,819 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor (ContainersLauncher #4): Stack trace: ExitCodeException exitCode=2: /mnt/yarn/usercache/hadoop/appcache/application_1511142795395_0005/container_1511142795395_0005_01_017281/launch_container.sh: line 4: syntax error near unexpected token `('
> ```
>
> When I launch Ignite manually on the master it starts fine and connects
> to ZooKeeper, but I see a topology with just 1 node.
>
> Any thoughts on what I might be doing wrong here?
>
> Thanks in advance.
>
> Juan Rodriguez
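A possible first diagnostic step for the node manager error above: the failing line 4 of `launch_container.sh` tries to `export` a variable named `BASH_FUNC_run_prestart()`, which is not a valid shell variable name. Bash stores exported functions in the environment under names of the form `BASH_FUNC_<name>()` or `BASH_FUNC_<name>%%`, depending on the bash version, so one plausible reading (an assumption, not something the logs confirm) is that an exported function from some process's environment ended up among the variables written into the launch script. The sketch below lists such entries in a NUL-separated environment dump. The helper name `exported_funcs_in_environ` is hypothetical and is not part of Ignite or Hadoop.

```bash
#!/usr/bin/env bash
# Hypothetical diagnostic helper (not part of Ignite or Hadoop).
# Prints the names of environment entries that look like exported bash
# functions, i.e. entries whose name starts with BASH_FUNC_.
# Argument: path to a NUL-separated environment dump such as
# /proc/<pid>/environ on Linux (defaults to the caller's own environment).
exported_funcs_in_environ() {
    # Turn NUL separators into newlines, then keep only the variable name
    # (everything before the first '=') of entries starting with BASH_FUNC_.
    # '|| true' keeps the exit status at 0 when nothing matches.
    tr '\0' '\n' < "${1:-/proc/self/environ}" | grep -o '^BASH_FUNC_[^=]*' || true
}
```

For example, `exported_funcs_in_environ /proc/<pid>/environ` could be run against the NodeManager or application master process (reading another user's `environ` file usually requires root). If it prints `BASH_FUNC_run_prestart()`, that would at least show where the variable enters the picture; how it reaches the container launch script would still need to be confirmed.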
