[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628662#comment-16628662
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

asfgit closed pull request #6540: [FLINK-9891] Added hook to shutdown cluster 
if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
index 2e78e4a615b..780f8144029 100644
--- a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
+++ b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
@@ -58,6 +58,7 @@
 import org.apache.flink.util.ExceptionUtils;
 import org.apache.flink.util.FlinkException;
 import org.apache.flink.util.Preconditions;
+import org.apache.flink.util.ShutdownHookUtil;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.Options;
@@ -249,13 +250,22 @@ protected void run(String[] args) throws Exception {
LOG.info("Could not properly shut down 
the client.", e);
}
} else {
+   final Thread shutdownHook;
if (clusterId != null) {
client = 
clusterDescriptor.retrieve(clusterId);
+   shutdownHook = null;
} else {
// also in job mode we have to deploy a 
session cluster because the job
// might consist of multiple parts 
(e.g. when using collect)
final ClusterSpecification 
clusterSpecification = customCommandLine.getClusterSpecification(commandLine);
client = 
clusterDescriptor.deploySessionCluster(clusterSpecification);
+   // if not running in detached mode, add 
a shutdown hook to shut down cluster if client exits
+   // there's a race-condition here if cli 
is killed before shutdown hook is installed
+   if (!runOptions.getDetachedMode()) {
+   shutdownHook = 
ShutdownHookUtil.addShutdownHook(client::shutDownCluster, 
client.getClass().getSimpleName(), LOG);
+   } else {
+   shutdownHook = null;
+   }
}
 
try {
@@ -278,12 +288,12 @@ protected void run(String[] args) throws Exception {
 
executeProgram(program, client, 
userParallelism);
} finally {
-   if (clusterId == null && 
!client.isDetached()) {
+   if (shutdownHook != null) {
// terminate the cluster only 
if we have started it before and if it's not detached
try {
-   
client.shutDownCluster();
-   } catch (final Exception e) {
-   LOG.info("Could not 
properly terminate the Flink cluster.", e);
+   shutdownHook.run();
+   } finally {
+   
ShutdownHookUtil.removeShutdownHook(shutdownHook, 
client.getClass().getSimpleName(), LOG);
}
}
 


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>  Components: Client, YARN
>Affects Versions: 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-26 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628663#comment-16628663
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

asfgit closed pull request #6718: [FLINK-9891] Add optional hook to shutdown 
cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/docs/ops/cli.md b/docs/ops/cli.md
index f96ccf52ea9..9af5c5b5333 100644
--- a/docs/ops/cli.md
+++ b/docs/ops/cli.md
@@ -257,6 +257,11 @@ Action "run" compiles and runs a program.
  -s,--fromSavepointPath to a savepoint to restore the 
job
   from (for example
   hdfs:///flink/savepoint-1537).
+ -sae,--shutdownOnAttachedExitIf the job is submitted in attached
+  mode, perform a best-effort cluster
+  shutdown when the CLI is terminated
+  abruptly, e.g., in response to a user
+  interrupt, such as typing Ctrl + C.
   Options for yarn-cluster mode:
  -d,--detachedIf present, runs the job in detached
   mode
@@ -265,6 +270,11 @@ Action "run" compiles and runs a program.
   connect to a different JobManager 
than
   the one specified in the
   configuration.
+ -sae,--shutdownOnAttachedExitIf the job is submitted in attached
+  mode, perform a best-effort cluster
+  shutdown when the CLI is terminated
+  abruptly, e.g., in response to a user
+  interrupt, such as typing Ctrl + C.
  -yD  use value for given property
  -yd,--yarndetached   If present, runs the job in detached
   mode (deprecated; use non-YARN
diff --git 
a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java 
b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
index e2a260c5478..ae0052c3c38 100644
--- a/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
+++ b/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
@@ -58,6 +58,7 @@
 import org.apache.flink.util.ExceptionUtils;
 import org.apache.flink.util.FlinkException;
 import org.apache.flink.util.Preconditions;
+import org.apache.flink.util.ShutdownHookUtil;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.Options;
@@ -250,13 +251,22 @@ protected void run(String[] args) throws Exception {
LOG.info("Could not properly shut down 
the client.", e);
}
} else {
+   final Thread shutdownHook;
if (clusterId != null) {
client = 
clusterDescriptor.retrieve(clusterId);
+   shutdownHook = null;
} else {
// also in job mode we have to deploy a 
session cluster because the job
// might consist of multiple parts 
(e.g. when using collect)
final ClusterSpecification 
clusterSpecification = customCommandLine.getClusterSpecification(commandLine);
client = 
clusterDescriptor.deploySessionCluster(clusterSpecification);
+   // if not running in detached mode, add 
a shutdown hook to shut down cluster if client exits
+   // there's a race-condition here if cli 
is killed before shutdown hook is installed
+   if (!runOptions.getDetachedMode() && 
runOptions.isShutdownOnAttachedExit()) {
+   shutdownHook = 
ShutdownHookUtil.addShutdownHook(client::shutDownCluster, 
client.getClass().getSimpleName(), LOG);
+   } else {
+   shutdownHook = null;
+   }
}
 
try {
@@ -286,8 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627429#comment-16627429
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL edited a comment on issue #6540: [FLINK-9891] Added hook to shutdown 
cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#issuecomment-424360084
 
 
   Thank you for your contribution to Apache Flink, @packet23. LGTM, merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>  Components: Client, YARN
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Andrey Zagrebin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.2, 1.5.5
>
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627431#comment-16627431
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on issue #6718: [FLINK-9891] Add optional hook to shutdown 
cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#issuecomment-424360423
 
 
   Thank you for your contribution to Apache Flink, @azagrebin. LGTM, merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>  Components: Client, YARN
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Andrey Zagrebin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.2, 1.5.5
>
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16627428#comment-16627428
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on issue #6540: [FLINK-9891] Added hook to shutdown cluster if a 
session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#issuecomment-424360084
 
 
   LGTM, merging.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>  Components: Client, YARN
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Andrey Zagrebin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.0, 1.6.2, 1.5.5
>
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-21 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16623270#comment-16623270
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

azagrebin commented on a change in pull request #6718: [FLINK-9891] Add 
optional hook to shutdown cluster if a session was created in per-job mode in 
attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r219421119
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -492,11 +505,10 @@ private static void 
printJobStatusMessages(List jobs) {
jobsByState.entrySet().stream()
.sorted(statusComparator)

.map(Map.Entry::getValue).flatMap(List::stream).sorted(startTimeComparator)
-   .forEachOrdered(job -> {
-   System.out.println(dateFormat.format(new 
Date(job.getStartTime()))
-   + " : " + job.getJobId() + " : " + 
job.getJobName()
-   + " (" + job.getJobState() + ")");
-   });
+   .forEachOrdered(job ->
 
 Review comment:
   yes, redundant brackets in one-line lambda


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622298#comment-16622298
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r219225730
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -492,11 +505,10 @@ private static void 
printJobStatusMessages(List jobs) {
jobsByState.entrySet().stream()
.sorted(statusComparator)

.map(Map.Entry::getValue).flatMap(List::stream).sorted(startTimeComparator)
-   .forEachOrdered(job -> {
-   System.out.println(dateFormat.format(new 
Date(job.getStartTime()))
-   + " : " + job.getJobId() + " : " + 
job.getJobName()
-   + " (" + job.getJobState() + ")");
-   });
+   .forEachOrdered(job ->
 
 Review comment:
   nit: unrelated formatting change?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622296#comment-16622296
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r219231030
 
 

 ##
 File path: docs/ops/cli.md
 ##
 @@ -265,6 +269,10 @@ Action "run" compiles and runs a program.
   connect to a different JobManager 
than
   the one specified in the
   configuration.
+ -sae,--shutdownOnAttachedExitIf the job mode is not detached and a
 
 Review comment:
   I'd rephrase it a bit: _If the job is submitted in attached mode, perform a 
best-effort cluster shutdown when the CLI is terminated abruptly, e.g., in 
response to a user interrupt, such as typing Ctrl + C._ 
   
   If you don't like it, I'd at least change _the best effort cluster shutdown_ 
to _a best-effort cluster shutdown_


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622299#comment-16622299
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r219226146
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
 ##
 @@ -57,6 +58,8 @@
 
private final boolean detachedMode;
 
+   private final boolean shutdownOnAttachedExist;
 
 Review comment:
   _exit_


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622297#comment-16622297
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r219226199
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
 ##
 @@ -145,6 +149,10 @@ public boolean getDetachedMode() {
return detachedMode;
}
 
+   public boolean isShutdownOnAttachedExist() {
 
 Review comment:
   _exit_


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620770#comment-16620770
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r218859953
 
 

 ##
 File path: docs/ops/cli.md
 ##
 @@ -265,6 +268,9 @@ Action "run" compiles and runs a program.
   connect to a different JobManager 
than
   the one specified in the
   configuration.
+ -sae,--schutdownOnAttachedExist  If the job mode is not detached and
 
 Review comment:
   Is this supposed to be _exit_ instead of _exist_? How about 
`--shutdownOnExit`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620758#comment-16620758
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r218853732
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java
 ##
 @@ -57,6 +58,8 @@
 
private final boolean detachedMode;
 
+   private final boolean schutdownOnAttachedExist;
 
 Review comment:
   _shutdown_


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620755#comment-16620755
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

GJL commented on a change in pull request #6718: [FLINK-9891] Add optional hook 
to shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#discussion_r218853427
 
 

 ##
 File path: docs/ops/cli.md
 ##
 @@ -257,6 +257,9 @@ Action "run" compiles and runs a program.
  -s,--fromSavepointPath to a savepoint to restore the 
job
   from (for example
   hdfs:///flink/savepoint-1537).
+ -sae,--schutdownOnAttachedExist  If the job mode is not detached and
 
 Review comment:
   _shutdown_


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620750#comment-16620750
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

azagrebin commented on issue #6718: [FLINK-9891] Add optional hook to shutdown 
cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#issuecomment-422847652
 
 
   cc @tillrohrmann @GJL 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620752#comment-16620752
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

azagrebin edited a comment on issue #6718: [FLINK-9891] Add optional hook to 
shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718#issuecomment-422847652
 
 
   cc @tillrohrmann @GJL @packet23 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-19 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620749#comment-16620749
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

azagrebin opened a new pull request #6718: [FLINK-9891] Add optional hook to 
shutdown cluster if a session was created in per-job mode in attached mode
URL: https://github.com/apache/flink/pull/6718
 
 
   ## What is the purpose of the change
   
   This PR picks up #6540, adds a cli option to activate the cluster shutdown 
hook. By default, it preserves previous behaviour where cluster is not shutdown 
in case of abrupt cli exit.
   
   ## Brief change log
   
   - add SHUTDOWN_IF_ATTACHED_OPTION in cli
   
   ## Verifying this change
   
   Trivial option, can be checked running example job in local yarn cluster:
   HADOOP_CLASSPATH=`hadoop classpath` $FLINK/bin/flink run -m yarn-cluster 
$FLINK/examples/streaming/WordCount.jar
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes)
 - If yes, how is the feature documented? (docs)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-10 Thread Sebastian Klemke (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608917#comment-16608917
 ] 

Sebastian Klemke commented on FLINK-9891:
-

[~azagrebin] It's not only parsing log outputs -- it's also keeping state, 
which we try to avoid.

[~till.rohrmann] I can see where it fits, do you think it should go to 
flink-conf.yaml or be exposed as one of the yarn options for flink cli?

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-09-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607349#comment-16607349
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

tillrohrmann commented on issue #6540: [FLINK-9891] Added hook to shutdown 
cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#issuecomment-419500426
 
 
   I like the idea of making this behaviour configurable. @packet23 do you have 
time to add this functionality?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-31 Thread Andrey Zagrebin (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598966#comment-16598966
 ] 

Andrey Zagrebin commented on FLINK-9891:


[~packet], do you see something like this in the logs when run the job and 
cancel it?
{code:java}
Submitting application master {code}
from AbstractYarnClusterDescriptor.startAppMaster.

This should the Yarn application id.

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579332#comment-16579332
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

yanghua commented on a change in pull request #6540: [FLINK-9891] Added hook to 
shutdown cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#discussion_r209845260
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -1219,4 +1231,21 @@ public static void 
setJobManagerAddressInConfig(Configuration config, InetSocket
return constructor.newInstance(params);
}
 
+   private static class ClusterShutdownHook extends Thread {
+   private final ClusterClient client;
+
+   ClusterShutdownHook(ClusterClient client) {
 
 Review comment:
   Not replace, keep both of them.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16579331#comment-16579331
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

packet23 commented on a change in pull request #6540: [FLINK-9891] Added hook 
to shutdown cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#discussion_r209844574
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -1219,4 +1231,21 @@ public static void 
setJobManagerAddressInConfig(Configuration config, InetSocket
return constructor.newInstance(params);
}
 
+   private static class ClusterShutdownHook extends Thread {
+   private final ClusterClient client;
+
+   ClusterShutdownHook(ClusterClient client) {
 
 Review comment:
   Thanks @yanghua for reviewing. Do you mean @Nonnull in addition to 
Objects.requireNonNull() or as replacement?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-13 Thread Sebastian Klemke (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16578017#comment-16578017
 ] 

Sebastian Klemke commented on FLINK-9891:
-

Exactly, in the past, doing exactly that proved to be very fragile. How would I 
retrieve the Yarn application id given job was submitted via {code}flink run -m 
yarn-cluster ...{code} ?

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to ./conf/flink-conf.yaml
>  # Execute the following 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-13 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577908#comment-16577908
 ] 

Till Rohrmann commented on FLINK-9891:
--

Would it be an option to use `yarn application -kill` to terminate the Flink 
cluster instead of adding the shut down hook [~packet]? The only thing you need 
to retrieve is the corresponding application id of the running Flink cluster.

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577422#comment-16577422
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

yanghua commented on a change in pull request #6540: [FLINK-9891] Added hook to 
shutdown cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#discussion_r209443343
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -1219,4 +1231,21 @@ public static void 
setJobManagerAddressInConfig(Configuration config, InetSocket
return constructor.newInstance(params);
}
 
+   private static class ClusterShutdownHook extends Thread {
+   private final ClusterClient client;
+
+   ClusterShutdownHook(ClusterClient client) {
 
 Review comment:
   here use `@Nonnull` looks better to me


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577421#comment-16577421
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

yanghua commented on a change in pull request #6540: [FLINK-9891] Added hook to 
shutdown cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540#discussion_r209443483
 
 

 ##
 File path: 
flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java
 ##
 @@ -249,13 +250,24 @@ protected void run(String[] args) throws Exception {
LOG.info("Could not properly shut down 
the client.", e);
}
} else {
+   final ClusterShutdownHook shutdownHook;
if (clusterId != null) {
client = 
clusterDescriptor.retrieve(clusterId);
+   shutdownHook = null;
} else {
// also in job mode we have to deploy a 
session cluster because the job
// might consist of multiple parts 
(e.g. when using collect)
final ClusterSpecification 
clusterSpecification = customCommandLine.getClusterSpecification(commandLine);
client = 
clusterDescriptor.deploySessionCluster(clusterSpecification);
+   // if not running in detached mode, add 
a shutdown hook to shut down cluster if client exits
+   // there's a race-condition here if cli 
is killed before shutdown hook is installed
+   if (!runOptions.getDetachedMode()) {
+   shutdownHook = new 
ClusterShutdownHook(client);
+   
Runtime.getRuntime().addShutdownHook(shutdownHook);
 
 Review comment:
   In flink-core there is a `ShutdownHookUtil` util class you can consider


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-11 Thread Sebastian Klemke (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577359#comment-16577359
 ] 

Sebastian Klemke commented on FLINK-9891:
-

Thanks [~till.rohrmann] for the detailed explanation. Unfortunately, this bug 
limits our upgrade possibilities: Our batch jobs (DataSet API) are run by a 
service that orchestrates complex batch flows. This service needs to cancel 
running applications if SLA timeout kicks in. So far, we relied on Flink Cli in 
per-job, attached mode to manage Yarn application life-cycle also in the case 
of orchestrator cancelling a job.
[~suez1224]: Is it possible to merge our PR to 1.5 branch, 1.6 branch and 
master and release 1.5.3 + 1.6.1?

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-08-11 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577358#comment-16577358
 ] 

ASF GitHub Bot commented on FLINK-9891:
---

packet23 opened a new pull request #6540: [FLINK-9891] Added hook to shutdown 
cluster if a session was created in per-job mode.
URL: https://github.com/apache/flink/pull/6540
 
 
   ## What is the purpose of the change
   
   Change minimizes probability of a per-job yarn session cluster surviving 
after client exited. It restores the Flink 1.4 and below cli behaviour.
   
   
   ## Brief change log
   
 - when per-job yarn cluster is deployed, a shutdown hook installed
 - when shutdown hook is called, it terminates the deployed cluster
 - when cli terminates normally, shutdown hook is called by cli and 
deinstalled
 - otherwise, java runtime will call shutdown hook
   
   
   ## Verifying this change
   
   We verified the change manually by submitting a DataSet API application to 
Yarn and killing client before job terminated.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not documented
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>  Labels: pull-request-available
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-31 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563356#comment-16563356
 ] 

Till Rohrmann commented on FLINK-9891:
--

Hi [~krasovcheg], as far as I know the issue has not been fixed yet. I'm also 
not sure whether the community will fix it for {{1.5.x}}.

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to ./conf/flink-conf.yaml
>  # Execute the following command:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-31 Thread Sergey Krasovskiy (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16563275#comment-16563275
 ] 

Sergey Krasovskiy commented on FLINK-9891:
--

[~till.rohrmann] thank you for your response.

Are you aware if there is any progress on this task? And may we expect the fix 
in version of Flink 1.5.x?

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Major
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
> ('/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/conf')
>  contains both LOG4J and Logback configuration files. Please delete or rename 
> one of them.
> 2018-07-18 12:47:07,846 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Submitting application 
> master application_1531474158783_10814
> 2018-07-18 12:47:08,073 INFO 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application 
> application_1531474158783_10814
> 2018-07-18 12:47:08,074 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Waiting for the cluster 
> to be allocated
> 2018-07-18 12:47:08,076 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Deploying cluster, 
> current state ACCEPTED
> 2018-07-18 12:47:12,864 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has 
> been deployed successfully.
> {code}
> Job Manager logs:
> {code:java}
> 2018-07-18 12:47:09,913 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
> 
> 2018-07-18 12:47:09,915 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting 
> YarnSessionClusterEntrypoint (Version: 1.5.1, Rev:3488f8b, Date:10.07.2018 @ 
> 11:51:27 GMT)
> ...
> {code}
> Issues:
>  # Flink job is running as a Flink session
>  # Ctrl+C or 'stop' doesn't stop a job and YARN cluster
>  # Cancel job via Job Maanager web ui doesn't stop Flink cluster. To kill the 
> cluster we need to run: yarn application -kill 
> We also tried to run a flink job with 'mode: legacy' and we have the same 
> issues:
>  # Add property 'mode: legacy' to ./conf/flink-conf.yaml
>  # Execute the following command:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 

[jira] [Commented] (FLINK-9891) Flink cluster is not shutdown in YARN mode when Flink client is stopped

2018-07-22 Thread Till Rohrmann (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-9891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552017#comment-16552017
 ] 

Till Rohrmann commented on FLINK-9891:
--

Thanks for reporting the issue [~krasovcheg]. The behaviour you're observing is 
unfortunately a technical artifact. If you run the same command with the 
detached option {{-d}}, everything should work as expected.

Let me quickly explain what's happening behind the scenes: First of all, a 
Flink cluster is no longer stopped if the client is terminated because this is 
just one endpoint for the communication and the user might use a different way 
to retrieve the job result/wait for the job completion (e.g. calling the REST 
endpoint with his custom application).

If you start a Flink per job cluster with the above mentioned command (in 
attached mode), Flink will actually deploy a session cluster and then submits 
the program. The reason for this is that your Flink job might consists of 
multiple parts which the client submits sequentially to the cluster (e.g. a 
batch job which calls {{collect}} or {{print}}). Therefore, the Flink cluster 
needs to be stopped explicitly after a job has finished.

If you start a Flink per job cluster in detached mode via {{-d}}, then Flink 
knows that there is only a single job which it will include in the deployment 
message which it sends to Yarn. The cluster will then be started in a slightly 
different mode which makes sure that the cluster terminates after the job 
execution finishes (also if you cancel it).

In order to properly solve this problem, I think we need to rework the client 
to not start a session cluster if the user uses the above written command (per 
job mode). Instead, we should start for each job part a dedicated per job 
cluster. However, also if this works, then the per-job cluster will wait until 
it has served the execution result at least once before it terminates. 
Otherwise we won't be able to send the job results back to a potentially 
waiting client.

Moreover, it would be helpful to add a command/utility with which we can stop a 
Flink cluster (on Yarn, Mesos, etc.).

> Flink cluster is not shutdown in YARN mode when Flink client is stopped
> ---
>
> Key: FLINK-9891
> URL: https://issues.apache.org/jira/browse/FLINK-9891
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1
>Reporter: Sergey Krasovskiy
>Assignee: Shuyi Chen
>Priority: Blocker
>
> We are not using session mode and detached mode. The command to run Flink job 
> on YARN is:
> {code:java}
> /bin/flink run -m yarn-cluster -yn 1 -yqu flink -yjm 768 -ytm 
> 2048 -j ./flink-quickstart-java-1.0-SNAPSHOT.jar -c org.test.WordCount
> {code}
> Flink CLI logs:
> {code:java}
> Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink-streaming/flink-streaming-1.5.1-1.5.1-bin-hadoop27-scala_2.11-1531485329/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.4.2.10-1/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 2018-07-18 12:47:03,747 INFO 
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service 
> address: http://hmaster-1.ipbl.rgcloud.net:8188/ws/v1/timeline/
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,222 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - 
> No path for the flink jar passed. Using the location of class 
> org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2018-07-18 12:47:04,248 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Neither the 
> HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink 
> YARN Client needs one of these to be set to properly load the Hadoop 
> configuration for accessing YARN.
> 2018-07-18 12:47:04,409 INFO 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: 
> ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=2048, 
> numberTaskManagers=1, slotsPerTaskManager=1}
> 2018-07-18 12:47:04,783 WARN 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory - The short-circuit 
> local reads feature cannot be used because libhadoop cannot be loaded.
> 2018-07-18 12:47:04,788 WARN 
> org.apache.flink.yarn.AbstractYarnClusterDescriptor - The configuration 
> directory 
>