[jira] [Commented] (FLINK-1914) Wrong FS while starting YARN session without correct HADOOP_HOME

2016-11-18 Thread Malte Schwarzer (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15677519#comment-15677519
 ] 

Malte Schwarzer commented on FLINK-1914:


I'm having the same issue with Flink 1.1.3 and Hadoop 2.7.3 when I try to run 
a Flink job on YARN. HADOOP_HOME is not set.

{code}
flink/bin/flink run -m yarn-cluster -yn 1 -yjm 1024 -ytm 4096 \
    flink/examples/batch/WordCount.jar hdfs://power1:55000/info.txt
{code}

{code}
2016-11-18 20:06:48,987 INFO  org.apache.flink.yarn.YarnApplicationMasterRunner - Setting up resources for TaskManagers
2016-11-18 20:06:49,989 ERROR org.apache.flink.yarn.YarnApplicationMasterRunner - YARN Application Master initialization failed
java.lang.IllegalArgumentException: Wrong FS: file:/home/hadoop/.flink/application_1479495922304_0001/flink-dist_2.11-1.1.3.jar, expected: hdfs://ibm-power-1.dima.tu-berlin.de:55000
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:191)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:102)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1124)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
    at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:135)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.createTaskManagerContext(YarnApplicationMasterRunner.java:543)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.runApplicationMaster(YarnApplicationMasterRunner.java:261)
    at org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:153)
    at org.apache.flink.yarn.YarnApplicationMasterRunner$1.run(YarnApplicationMasterRunner.java:150)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1536)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.run(YarnApplicationMasterRunner.java:150)
    at org.apache.flink.yarn.YarnApplicationMasterRunner.main(YarnApplicationMasterRunner.java:112)
{code}
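
For readers following the trace: the exception is raised by a check that compares the scheme and authority of a path against the URI of the file system the path was handed to. The Python sketch below is a simplified illustration of that kind of check, not Hadoop's actual code. The function name check_path and the use of ValueError are assumptions made for the example; the path and the file system URI are copied from the log above.

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Simplified stand-in for the scheme/authority check behind "Wrong FS".

    Illustration only; names and error type are hypothetical.
    """
    fs = urlparse(fs_uri)
    p = urlparse(path)
    if not p.scheme:
        # A path without a scheme is resolved against the file system itself.
        return
    same_scheme = p.scheme.lower() == fs.scheme.lower()
    # A path with a matching scheme but no authority is also accepted here.
    same_authority = not p.netloc or p.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


if __name__ == "__main__":
    fs_uri = "hdfs://ibm-power-1.dima.tu-berlin.de:55000"
    local_jar = (
        "file:/home/hadoop/.flink/application_1479495922304_0001/"
        "flink-dist_2.11-1.1.3.jar"
    )
    try:
        check_path(fs_uri, local_jar)
    except ValueError as err:
        print(err)
```

In this reading, the jar was registered with a file: path while the default file system resolved to HDFS, which is consistent with the Hadoop configuration not being picked up on the client side.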

> Wrong FS while starting YARN session without correct HADOOP_HOME
> 
>
> Key: FLINK-1914
> URL: https://issues.apache.org/jira/browse/FLINK-1914
> Project: Flink
>  Issue Type: Bug
>  Components: YARN Client
>Reporter: Zoltán Zvara
>Priority: Trivial
>  Labels: yarn, yarn-client
>
> When a YARN session is invoked ({{yarn-session.sh}}) without a correct 
> {{HADOOP_HOME}}, the AM is still deployed (for example to {{0.0.0.0:8032}}), 
> but the deployed AM fails with an {{IllegalArgumentException}}:
> {code}
> java.lang.IllegalArgumentException: Wrong FS: file:/home/.../flink-dist-0.9-SNAPSHOT.jar, expected: hdfs://localhost:9000
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:642)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:181)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:92)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
>   at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
>   at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
>   at org.apache.flink.yarn.Utils.registerLocalResource(Utils.java:105)
>   at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:436)
>   at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession$2.apply(ApplicationMasterActor.scala:371)
>   at scala.util.Try$.apply(Try.scala:161)
>   at org.apache.flink.yarn.ApplicationMasterActor$class.org$apache$flink$yarn$ApplicationMasterActor$$startYarnSession(ApplicationMasterActor.scala:371)
>   at org.apache.flink.yarn.ApplicationMasterActor$$anonfun$receiveYarnMessages$1.applyOrElse(ApplicationMasterActor.scala:155)
>   at scala.PartialFunction$OrElse.apply(PartialFunction.scala:162)
>   at 
> 

[jira] [Updated] (FLINK-1435) TaskManager does not log missing memory error on start up

2015-01-22 Thread Malte Schwarzer (JIRA)

 [ 
https://issues.apache.org/jira/browse/FLINK-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Malte Schwarzer updated FLINK-1435:
---
Description: 
When bin/start-cluster.sh is used to start TaskManagers and a worker node fails 
to start because of insufficient memory, no error message is written to the 
log files.

The worker node has only 15000M of memory available, but it is configured with 
Maximum heap size: 4 MiBytes. The TaskManager does not join the cluster and 
the process hangs.

The last lines of the log look like this:
...
... - - Starting with 12 incoming and 12 outgoing connection threads.
... - Setting low water mark to 16384 and high water mark to 32768 bytes.
... - Instantiated PooledByteBufAllocator with direct arenas: 24, heap arenas: 0, page size (bytes): 65536, chunk size (bytes): 16777216.
... - Using 0.7 of the free heap space for managed memory.
... - Initializing memory manager with 24447 megabytes of memory. Page size is 32768 bytes.
(END)

The error message about insufficient memory is missing.



  was:
When using bin/start-cluster.sh to start TaskManagers and a worker node is 
failing to start because of missing memory, you do not receive any error 
messages in log files.

Worker node has only 15000M memory available, but it is configured with Maximum 
heap size: 4 MiBytes. Task manager does not join the cluster. Process seems 
to be stuck.

Last line of log looks like this:

...  - Initializing memory manager with 24447 megabytes of memory. Page size is 
32768 bytes.




 TaskManager does not log missing memory error on start up
 -

 Key: FLINK-1435
 URL: https://issues.apache.org/jira/browse/FLINK-1435
 Project: Flink
  Issue Type: Bug
  Components: TaskManager
Affects Versions: 0.7.0-incubating
Reporter: Malte Schwarzer
Priority: Minor
  Labels: memorymanager

 When bin/start-cluster.sh is used to start TaskManagers and a worker node 
 fails to start because of insufficient memory, no error message is written 
 to the log files.
 The worker node has only 15000M of memory available, but it is configured 
 with Maximum heap size: 4 MiBytes. The TaskManager does not join the cluster 
 and the process hangs.
 The last lines of the log look like this:
 ...
 ... - - Starting with 12 incoming and 12 outgoing connection threads.
 ... - Setting low water mark to 16384 and high water mark to 32768 bytes.
 ... - Instantiated PooledByteBufAllocator with direct arenas: 24, heap arenas: 0, page size (bytes): 65536, chunk size (bytes): 16777216.
 ... - Using 0.7 of the free heap space for managed memory.
 ... - Initializing memory manager with 24447 megabytes of memory. Page size is 32768 bytes.
 (END)
 The error message about insufficient memory is missing.
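
A rough cross-check of the numbers in the log, written as a Python sketch. It assumes that the "megabytes" in the log and the "15000M" in the description are the same unit, and that the managed memory is exactly the logged fraction (0.7) of the free heap. The function names are made up for illustration and are not part of Flink.

```python
def implied_free_heap_mb(managed_mb: float, fraction: float) -> float:
    """Free heap implied by a managed-memory size and its heap fraction."""
    return managed_mb / fraction


def exceeds_available(managed_mb: float, fraction: float, available_mb: float) -> bool:
    """True if the implied free heap is larger than the memory on the node."""
    return implied_free_heap_mb(managed_mb, fraction) > available_mb


if __name__ == "__main__":
    managed_mb = 24447    # "Initializing memory manager with 24447 megabytes"
    fraction = 0.7        # "Using 0.7 of the free heap space for managed memory"
    available_mb = 15000  # "Worker node has only 15000M memory available"

    heap_mb = implied_free_heap_mb(managed_mb, fraction)
    print(f"implied free heap: about {heap_mb:.0f} MB")
    print(f"exceeds available memory: {exceeds_available(managed_mb, fraction, available_mb)}")
```

Under these assumptions the managed memory alone is already larger than the memory reported as available on the node, which is the condition the missing error message would be expected to report.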


