[ https://issues.apache.org/jira/browse/FLINK-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327487#comment-15327487 ]

ASF GitHub Bot commented on FLINK-3937:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2085
  
    While trying out your code, I noticed that the `.yarn-properties` file is 
not deleted, even though my previous YARN session shut down cleanly. The next 
`flink run` then picks up the stale file and fails:
    
    ```
    2016-06-13 07:16:51,885 INFO  org.apache.flink.client.program.ClusterClient                 - TaskManager status (0/1)
    TaskManager status (0/1)
    2016-06-13 07:16:52,386 INFO  org.apache.flink.client.program.ClusterClient                 - TaskManager status (0/1)
    TaskManager status (0/1)
    2016-06-13 07:16:52,887 INFO  org.apache.flink.client.program.ClusterClient                 - All TaskManagers are connected
    All TaskManagers are connected
    2016-06-13 07:16:52,887 INFO  org.apache.flink.client.program.ClusterClient                 - Looking up JobManager
    2016-06-13 07:16:52,912 INFO  org.apache.flink.client.program.ClusterClient                 - Looking up JobManager
    Flink JobManager is now running on 10.0.2.15:51747
    JobManager Web Interface: http://quickstart.cloudera:8088/proxy/application_1447844011707_0038/
    Number of connected TaskManagers changed to 1. Slots available: 1
    ^[[A^C2016-06-13 07:25:35,370 INFO  org.apache.flink.yarn.YarnClusterClient                       - Shutting down YarnClusterClient from the client shutdown hook
    2016-06-13 07:25:35,372 INFO  org.apache.flink.yarn.YarnClusterClient                       - Sending shutdown request to the Application Master
    2016-06-13 07:25:35,373 INFO  org.apache.flink.yarn.ApplicationClient                       - Sending StopCluster request to JobManager.
    2016-06-13 07:25:35,429 INFO  org.apache.flink.yarn.ApplicationClient                       - Stopped Application client.
    2016-06-13 07:25:35,431 INFO  org.apache.flink.yarn.ApplicationClient                       - Disconnect from JobManager Actor[akka.tcp://[email protected]:51747/user/jobmanager#1733798764].
    2016-06-13 07:25:35,469 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Shutting down remote daemon.
    2016-06-13 07:25:35,469 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remote daemon shut down; proceeding with flushing remote transports.
    2016-06-13 07:25:35,622 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator         - Remoting shut down.
    2016-06-13 07:25:35,701 INFO  org.apache.flink.yarn.YarnClusterClient                       - Deleting files in hdfs://quickstart.cloudera:8020/user/cloudera/.flink/application_1447844011707_0038
    2016-06-13 07:25:35,706 INFO  org.apache.flink.yarn.YarnClusterClient                       - Application application_1447844011707_0038 finished with state FINISHED and final state SUCCEEDED at 1465827935399
    2016-06-13 07:25:36,567 INFO  org.apache.flink.yarn.YarnClusterClient                       - YARN Client is shutting down
    (reverse-i-search)`yarn': ./bin/^Crn-session.sh -n 1
    [cloudera@quickstart build-target]$ ./bin/flink run -y yarn-cluster -yd -yn 1 ./examples/batch/WordCount.jar
    JAR file does not exist: -y
    
    Use the help option (-h or --help) to get help on the command.
    [cloudera@quickstart build-target]$ ./bin/flink run -m yarn-cluster -yd -yn 1 ./examples/batch/WordCount.jar
    2016-06-13 07:26:11,057 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found YARN properties file /tmp/.yarn-properties-cloudera
    2016-06-13 07:26:11,057 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Found YARN properties file /tmp/.yarn-properties-cloudera
    Found YARN properties file /tmp/.yarn-properties-cloudera
    2016-06-13 07:26:11,082 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Using JobManager address from YARN properties quickstart.cloudera/10.0.2.15:51747
    2016-06-13 07:26:11,082 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Using JobManager address from YARN properties quickstart.cloudera/10.0.2.15:51747
    Using JobManager address from YARN properties quickstart.cloudera/10.0.2.15:51747
    2016-06-13 07:26:11,270 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
    2016-06-13 07:26:11,795 INFO  org.apache.flink.yarn.YarnClusterDescriptor                   - Found application 'application_1447844011707_0038' with JobManager host name 'quickstart.cloudera' and port '51747' from Yarn properties file.
    2016-06-13 07:26:11,842 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
    2016-06-13 07:26:11,878 ERROR org.apache.flink.yarn.YarnClusterDescriptor                   - The application application_1447844011707_0038 doesn't run anymore. It has previously completed with final status: SUCCEEDED
    
    ------------------------------------------------------------
     The program finished with the following exception:
    
    java.lang.RuntimeException: The Yarn application application_1447844011707_0038 doesn't run anymore.
        at org.apache.flink.yarn.AbstractYarnClusterDescriptor.retrieve(AbstractYarnClusterDescriptor.java:377)
        at org.apache.flink.yarn.AbstractYarnClusterDescriptor.retrieveFromConfig(AbstractYarnClusterDescriptor.java:351)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.retrieveCluster(FlinkYarnSessionCli.java:464)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.retrieveCluster(FlinkYarnSessionCli.java:62)
        at org.apache.flink.client.CliFrontend.getClient(CliFrontend.java:862)
        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:228)
        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:983)
        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1034)
    [cloudera@quickstart build-target]$ 
    ```
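
The log above shows the client shutdown hook running to completion while `/tmp/.yarn-properties-cloudera` survives and is picked up by the next `flink run`. Below is a minimal sketch, in plain Java, of the kind of cleanup one would expect on that shutdown path. The names `YarnPropertiesCleanupSketch`, `propertiesFile` and `deleteProperties` are hypothetical and are not Flink's actual classes or methods. The file name pattern is only inferred from the path in the log, and the real location is whatever Flink's configuration decides.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Flink's implementation.
class YarnPropertiesCleanupSketch {

    // Builds a per-user file name of the shape seen in the log
    // (/tmp/.yarn-properties-cloudera). The directory is a parameter here.
    static Path propertiesFile(Path dir, String user) {
        return dir.resolve(".yarn-properties-" + user);
    }

    // Removes the properties file; returns true only if a file was deleted.
    static boolean deleteProperties(Path file) throws IOException {
        return Files.deleteIfExists(file);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("yarn-props-demo");
        Path file = propertiesFile(dir, "cloudera");
        // Empty stand-in for the file written when the session starts.
        Files.write(file, new byte[0]);

        // Delete the file when the client JVM exits, alongside the cluster shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                deleteProperties(file);
            } catch (IOException e) {
                System.err.println("Could not delete " + file + ": " + e.getMessage());
            }
        }));
    }
}
```

With a cleanup along these lines, the second `flink run` would presumably no longer find a properties file pointing at the finished application `application_1447844011707_0038`.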



> Make flink cli list, savepoint, cancel and stop work on Flink-on-YARN clusters
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-3937
>                 URL: https://issues.apache.org/jira/browse/FLINK-3937
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Sebastian Klemke
>            Assignee: Maximilian Michels
>            Priority: Trivial
>         Attachments: improve_flink_cli_yarn_integration.patch
>
>
> Currently, the Flink CLI cannot determine the JobManager RPC location for 
> Flink-on-YARN clusters. As a result, the list, savepoint, cancel and stop 
> subcommands are hard to invoke if you only know the YARN application ID. As 
> an improvement, I suggest adding a -yid <yarnApplicationId> option to these 
> subcommands, to be used together with -m yarn-cluster. The Flink CLI would 
> then retrieve the JobManager RPC location from the YARN ResourceManager.
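
The proposal in the issue description can be sketched as a two-step flow: read the application ID from `-yid`, then ask the ResourceManager for the JobManager address. The Java below is only an illustration under stated assumptions. `ResourceManagerLookup` is a hypothetical interface standing in for a real YARN ResourceManager query, `parseYid` and `resolve` are made-up helper names, and the in-memory map reuses the application ID and address from the log purely as sample data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the proposed -yid flow, not Flink's implementation.
class YarnIdLookupSketch {

    // Stand-in for a query against the YARN ResourceManager (assumption).
    interface ResourceManagerLookup {
        Optional<String> jobManagerAddress(String applicationId);
    }

    // Returns the value that follows "-yid" in the arguments, if any.
    static Optional<String> parseYid(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-yid".equals(args[i])) {
                return Optional.of(args[i + 1]);
            }
        }
        return Optional.empty();
    }

    // Resolves the JobManager address for the application named by -yid.
    static String resolve(String[] args, ResourceManagerLookup rm) {
        String appId = parseYid(args).orElseThrow(
                () -> new IllegalArgumentException("missing -yid <yarnApplicationId>"));
        return rm.jobManagerAddress(appId).orElseThrow(
                () -> new IllegalStateException("no running application " + appId));
    }

    public static void main(String[] args) {
        // Sample data only: one running application and its JobManager address.
        Map<String, String> running = new HashMap<>();
        running.put("application_1447844011707_0038", "10.0.2.15:51747");
        ResourceManagerLookup rm = id -> Optional.ofNullable(running.get(id));

        String[] cli = {"list", "-m", "yarn-cluster", "-yid", "application_1447844011707_0038"};
        System.out.println(resolve(cli, rm)); // prints 10.0.2.15:51747
    }
}
```

Failing with a clear error when the application is unknown keeps the behaviour explicit for the case where the ID refers to a session that has already finished.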


