[jira] [Commented] (FLINK-1914) Wrong FS while starting YARN session without correct HADOOP_HOME
    [ https://issues.apache.org/jira/browse/FLINK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677519#comment-15677519 ]

Malte Schwarzer commented on FLINK-1914:

I'm having the same issue with Flink 1.1.3 and Hadoop 2.7.3 when I try to run a Flink job on YARN. HADOOP_HOME is not set.

{code}
flink/bin/flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 4096 flink/examples/batch/WordCount.jar hdfs://power1:55000/info.txt
{code}

{code}
2016-11-18 20:06:48,987 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner - Setting up resources for TaskManagers
2016-11-18 20:06:49,989 ERROR org.apache.flink.yarn.YarnApplicationMasterRunner - YARN Application Master initialization failed
java.lang.IllegalArgumentException: Wrong FS: file:/home/hadoop/.flink/application_1479495922304_0001/flink-dist_2.11-1.1.3.jar, expected: hdfs://ibm-power-1.dima.tu-berlin.de:55000
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:191)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:102)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
    at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:135)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.createTaskManagerContext(YarnApplicationMasterRunner.java:543)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:261)
    at org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:153)
    at org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:150)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1536)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:150)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:112)
{code}

> Wrong FS while starting YARN session without correct HADOOP_HOME
>
> Key: FLINK-1914
> URL: https://issues.apache.org/jira/browse/FLINK-1914
> Project: Flink
> Issue Type: Bug
> Components: YARN Client
> Reporter: Zoltán Zvara
> Priority: Trivial
> Labels: yarn, yarn-client
>
> When a YARN session is invoked ({{yarn-session.sh}}) without a correct
> {{HADOOP_HOME}}, the AM is still deployed (for example to {{0.0.0.0:8032}}),
> but the deployed AM fails with an {{IllegalArgumentException}}:
> {code}
> java.lang.IllegalArgumentException: Wrong FS: file:/home/.../flink-dist-0.9-SNAPSHOT.jar, expected: hdfs://localhost:9000
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:642)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:181)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:92)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>     at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:105)
>     at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:436)
>     at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:371)
>     at scala.util.Try$.apply(Try.scala:161)
>     at org.apache.flink.yarn.ApplicationMasterActor$class.org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession(ApplicationMasterActor.scala:371)
>     at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$receiveYarnMessages$1.applyOrElse(ApplicationMasterActor.scala:155)
>     at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>     at
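
The "Wrong FS" message in both traces is raised when a path with a `file:` scheme is handed to a file system whose URI is `hdfs://...`. The following is a minimal Python sketch of that kind of scheme/authority comparison. It is not Hadoop's actual `FileSystem.checkPath` code: the function name `check_path` is hypothetical, and the real check is more involved (for example around default ports), which this sketch ignores.

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Simplified stand-in for a 'does this path belong to this file system' check.

    Hypothetical helper for illustration only; not Hadoop's implementation.
    """
    fs = urlparse(fs_uri)
    p = urlparse(path)

    # A path without a scheme is resolved against the file system itself.
    if not p.scheme:
        return

    same_scheme = p.scheme.lower() == fs.scheme.lower()
    # A path without an authority (e.g. file:/home/...) only passes if the schemes agree.
    same_authority = not p.netloc or p.netloc.lower() == fs.netloc.lower()

    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


if __name__ == "__main__":
    hdfs = "hdfs://ibm-power-1.dima.tu-berlin.de:55000"
    jar = "file:/home/hadoop/.flink/application_1479495922304_0001/flink-dist_2.11-1.1.3.jar"
    try:
        check_path(hdfs, jar)
    except ValueError as err:
        print(err)
```

With the values from the comment above, the sketch rejects the local `flink-dist` jar because its scheme is `file` while the file system is `hdfs`, which mirrors the reported failure.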
[jira] [Updated] (FLINK-1435) TaskManager does not log missing memory error on start up
    [ https://issues.apache.org/jira/browse/FLINK-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Malte Schwarzer updated FLINK-1435:
-----------------------------------
    Description:
When using bin/start-cluster.sh to start TaskManagers and a worker node is failing to start because of missing memory, you do not receive any error messages in log files.

Worker node has only 15000M memory available, but it is configured with Maximum heap size: 4 MiBytes.

Task manager does not join the cluster. Process hangs.

The last lines of the log look like this:

...
... - - Starting with 12 incoming and 12 outgoing connection threads.
... - Setting low water mark to 16384 and high water mark to 32768 bytes.
... - Instantiated PooledByteBufAllocator with direct arenas: 24, heap arenas: 0, page size (bytes): 65536, chunk size (bytes): 16777216.
... - Using 0.7 of the free heap space for managed memory.
... - Initializing memory manager with 24447 megabytes of memory. Page size is 32768 bytes.
(END)

Error message about not enough memory is missing.

  was:
When using bin/start-cluster.sh to start TaskManagers and a worker node is failing to start because of missing memory, you do not receive any error messages in log files.

Worker node has only 15000M memory available, but it is configured with Maximum heap size: 4 MiBytes.

Task manager does not join the cluster. Process seems to stuck.

Last line of log looks like this:

... - Initializing memory manager with 24447 megabytes of memory. Page size is 32768 bytes.


TaskManager does not log missing memory error on start up
---------------------------------------------------------

Key: FLINK-1435
URL: https://issues.apache.org/jira/browse/FLINK-1435
Project: Flink
Issue Type: Bug
Components: TaskManager
Affects Versions: 0.7.0-incubating
Reporter: Malte Schwarzer
Priority: Minor
Labels: memorymanager

When using bin/start-cluster.sh to start TaskManagers and a worker node is failing to start because of missing memory, you do not receive any error messages in log files.

Worker node has only 15000M memory available, but it is configured with Maximum heap size: 4 MiBytes.

Task manager does not join the cluster. Process hangs.

The last lines of the log look like this:

...
... - - Starting with 12 incoming and 12 outgoing connection threads.
... - Setting low water mark to 16384 and high water mark to 32768 bytes.
... - Instantiated PooledByteBufAllocator with direct arenas: 24, heap arenas: 0, page size (bytes): 65536, chunk size (bytes): 16777216.
... - Using 0.7 of the free heap space for managed memory.
... - Initializing memory manager with 24447 megabytes of memory. Page size is 32768 bytes.
(END)

Error message about not enough memory is missing.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
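
The two log lines quoted in FLINK-1435 ("Using 0.7 of the free heap space" and "Initializing memory manager with 24447 megabytes") allow a rough back-of-the-envelope check of how large the JVM heap must have been. The Python sketch below assumes that managed memory is simply the logged fraction times the free heap, and that the "15000M" and "megabytes" figures use the same unit; both are assumptions, not statements about Flink's actual memory calculation, and the helper name `implied_free_heap_mb` is hypothetical.

```python
# Values taken from the issue description and the quoted log lines.
MANAGED_MB = 24447     # "Initializing memory manager with 24447 megabytes of memory"
FRACTION = 0.7         # "Using 0.7 of the free heap space for managed memory"
AVAILABLE_MB = 15000   # "Worker node has only 15000M memory available"


def implied_free_heap_mb(managed_mb: float, fraction: float) -> float:
    """Free heap implied by the log, assuming managed = fraction * free heap."""
    return managed_mb / fraction


free_heap_mb = implied_free_heap_mb(MANAGED_MB, FRACTION)
print(round(free_heap_mb))            # 34924
print(free_heap_mb > AVAILABLE_MB)    # True
```

Under these assumptions the free heap would be roughly 34924 MB, more than twice the 15000M reported as available on the worker, which is consistent with the TaskManager stalling while initializing the memory manager.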