Hi,
I am trying to set up Flink on YARN on a MapR cluster. I built Flink
(flink-1.3-SNAPSHOT) as follows:
mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.0-mapr-1607
The build succeeds. I then run ./bin/yarn-session.sh -n 4 (without changing
any configuration) and get the following two errors:
1. This one is a minor error (or bug?)
Error while trying to split key and value in configuration file
/conf/flink-conf.yaml:
2. The second error is more serious:
Error while deploying YARN cluster: Couldn't deploy Yarn cluster
java.lang.RuntimeException: Couldn't deploy Yarn cluster
at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:425)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:620)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: java.lang.NumberFormatException: For input string: "${nodemanager.resource.cpu-vcores}"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:569)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.hadoop.conf.Configuration.getInt(Configuration.java:1271)
at org.apache.flink.yarn.AbstractYarnClusterDescriptor.isReadyForDeployment(AbstractYarnClusterDescriptor.java:315)
at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:434)
at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:423)
... 9 more
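The bottom of the trace suggests that the literal placeholder text, not a number, reached Integer.parseInt via Configuration.getInt. The following minimal sketch reproduces only that last step in isolation. It does not use Hadoop at all; the class name VcoresParseDemo and the helper tryParse are made up for illustration.

```java
public class VcoresParseDemo {
    // Hypothetical helper: parses a raw configuration value the way the
    // last frames of the trace do (a plain Integer.parseInt on the string).
    static String tryParse(String raw) {
        try {
            return "parsed " + Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A resolved value parses fine.
        System.out.println(tryParse("4"));
        // An unexpanded placeholder fails in the same way as the trace.
        System.out.println(tryParse("${nodemanager.resource.cpu-vcores}"));
    }
}
```

If this reading is right, the question becomes where the unexpanded "${...}" value comes from, rather than whether the number itself is valid.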
The property named in this error, nodemanager.resource.cpu-vcores, is set
appropriately in yarn-site.xml. The cluster has 3 ResourceManagers (2 on
standby) and 5 NodeManagers. To be extra safe, I changed the value of this
property in the yarn-site.xml of ALL the NodeManagers.
I believe that this property is set to 4 by default, according to this blog [
https://www.mapr.com/blog/best-practices-yarn-resource-management ]. So I am
trying to understand why this error is cropping up.
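One way to narrow this down is to list every property in a given yarn-site.xml whose value still contains a "${...}" reference. The sketch below does that with the JDK's XML parser only. The class name PlaceholderScan, its methods, and the default file path (the YARN_CONF_DIR used here plus yarn-site.xml) are assumptions for illustration. A hit is not necessarily a problem, since such references can be expanded at load time, and the offending value may also come from a different file on the classpath.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class PlaceholderScan {
    // Returns "name = value" for every <property> whose value contains "${".
    static List<String> findUnresolved(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(xml);
        NodeList props = doc.getElementsByTagName("property");
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String value = text(p, "value");
            if (value.contains("${")) {
                hits.add(text(p, "name") + " = " + value);
            }
        }
        return hits;
    }

    // Text of the first child element with the given tag, or "" if absent.
    static String text(Element parent, String tag) {
        NodeList n = parent.getElementsByTagName(tag);
        return n.getLength() == 0 ? "" : n.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Assumed location; pass a different path as the first argument.
        String path = args.length > 0 ? args[0]
                : "/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/yarn-site.xml";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            findUnresolved(in).forEach(System.out::println);
        }
    }
}
```

Running it against the file on the machine where yarn-session.sh is started, not only on the NodeManagers, would show whether the client-side configuration is the one carrying the placeholder.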
The required environment variable is set as follows:
YARN_CONF_DIR=/opt/mapr/hadoop/hadoop-2.7.0/etc/hadoop/
I also tried setting the fs.hdfs.hadoopconf property (pointing to the Hadoop
conf directory) in flink-conf.yaml, but I still get the same error.
Any help with these (especially the latter) errors would be greatly appreciated.
Thanks in advance,
Aniket D