[jira] [Created] (FLINK-10694) ZooKeeperRunningJobsRegistry Cleanup

2018-10-26 Thread Mikhail Pryakhin (JIRA)
Mikhail Pryakhin created FLINK-10694:


 Summary: ZooKeeperRunningJobsRegistry Cleanup
 Key: FLINK-10694
 URL: https://issues.apache.org/jira/browse/FLINK-10694
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.6.1
Reporter: Mikhail Pryakhin


When a streaming job with ZooKeeper HA enabled is cancelled, not all of the
job-related ZooKeeper nodes are removed. Is there a reason for that?
 I noticed that the ZooKeeper paths are created as "Container Node" type (a
node that can have nested nodes and is removed by the server once its last
child is deleted), falling back to the Persistent node type if ZooKeeper
doesn't support container nodes.
 Either way, it is worth removing the job's ZooKeeper nodes when a job is
cancelled, isn't it?

zookeeper version 3.4.10
 flink version 1.6.1
 # The job is deployed as a YARN cluster with the following properties set
{noformat}
 high-availability: zookeeper
 high-availability.zookeeper.quorum: 
 high-availability.zookeeper.storageDir: hdfs:///
 high-availability.zookeeper.path.root: 
 high-availability.zookeeper.path.namespace: 
{noformat}
 # The job is cancelled via flink cancel  command.

What I've noticed:
 While the job is running, the following node structure is created in
ZooKeeper:
{noformat}
///leader/resource_manager_lock
///leader/rest_server_lock
///leader/dispatcher_lock
///leader/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
///leaderlatch/resource_manager_lock
///leaderlatch/rest_server_lock
///leaderlatch/dispatcher_lock
///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
///checkpoints/5c21f00b9162becf5ce25a1cf0e67cde/041
///checkpoint-counter/5c21f00b9162becf5ce25a1cf0e67cde
///running_job_registry/5c21f00b9162becf5ce25a1cf0e67cde
{noformat}
When the job is cancelled, some ephemeral nodes disappear, but most of the
nodes are still there:
{noformat}
///leader/5c21f00b9162becf5ce25a1cf0e67cde
///leaderlatch/resource_manager_lock
///leaderlatch/rest_server_lock
///leaderlatch/dispatcher_lock
///leaderlatch/5c21f00b9162becf5ce25a1cf0e67cde/job_manager_lock
///checkpoints/
///checkpoint-counter/
///running_job_registry/
{noformat}
Here is the method [1] responsible for cleaning up the ZooKeeper folders, which
is called when the job manager has stopped [2].
 It seems to clean up only the running_job_registry folder; the other folders
stay untouched. I expected everything under the
*///* folder to be cleaned up when the job is
cancelled.

[1] 
[https://github.com/apache/flink/blob/8674b69964eae50cad024f2c5caf92a71bf21a09/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/zookeeper/ZooKeeperRunningJobsRegistry.java#L107]
 [2] 
[https://github.com/apache/flink/blob/f087f57749004790b6f5b823d66822c36ae09927/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L332]
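
To make the expected cleanup concrete, here is a minimal sketch (not Flink code) that derives the per-job node paths from the layout listed above. The class name, the method name, and the "/ha-root/namespace" placeholder are assumptions made for illustration; the report elides the real root and namespace. The sketch only computes paths; an actual cleanup would still have to delete each of them, including children, through a ZooKeeper client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative helper, not part of Flink. */
class JobZNodePaths {

    // Sub-trees that contain a child named after the job ID in the listing above.
    static final List<String> PER_JOB_SUBTREES = Arrays.asList(
            "leader", "leaderlatch", "checkpoints",
            "checkpoint-counter", "running_job_registry");

    /** Returns the node paths that belong to one job under the given HA root. */
    static List<String> perJobPaths(String haRoot, String jobId) {
        // Drop a trailing slash so the joined paths contain no empty segment.
        String root = haRoot.endsWith("/")
                ? haRoot.substring(0, haRoot.length() - 1)
                : haRoot;
        List<String> paths = new ArrayList<>();
        for (String subtree : PER_JOB_SUBTREES) {
            paths.add(root + "/" + subtree + "/" + jobId);
        }
        return paths;
    }

    public static void main(String[] args) {
        // Placeholder root: the real root and namespace are elided in the report.
        String haRoot = "/ha-root/namespace";
        String jobId = "5c21f00b9162becf5ce25a1cf0e67cde";
        for (String path : perJobPaths(haRoot, jobId)) {
            System.out.println(path);
        }
    }
}
```

Of these five paths, only the one under running_job_registry appears to be handled by the method referenced in [1].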



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-6950) Add ability to specify single jar files to be shipped to YARN

2017-06-19 Thread Mikhail Pryakhin (JIRA)
Mikhail Pryakhin created FLINK-6950:
---

 Summary: Add ability to specify single jar files to be shipped to 
YARN
 Key: FLINK-6950
 URL: https://issues.apache.org/jira/browse/FLINK-6950
 Project: Flink
  Issue Type: Improvement
Reporter: Mikhail Pryakhin
Priority: Minor


When deploying a Flink job on YARN, it is not possible to specify multiple
yarnship folders.
Often, when submitting a Flink job, the job dependencies and job resources are
located in different local folders, and both need to be shipped to the YARN
cluster.
I think it would be great to be able to specify individual jars, rather than
folders, to be shipped to the YARN cluster (via a --yarnship-jars option).






[jira] [Created] (FLINK-6949) Make custom resource files to be shipped to YARN cluster

2017-06-19 Thread Mikhail Pryakhin (JIRA)
Mikhail Pryakhin created FLINK-6949:
---

 Summary: Make custom resource files to be shipped to YARN cluster
 Key: FLINK-6949
 URL: https://issues.apache.org/jira/browse/FLINK-6949
 Project: Flink
  Issue Type: Improvement
  Components: Client
Affects Versions: 1.3.0
Reporter: Mikhail Pryakhin


*The problem:*
When deploying a Flink job on YARN, it is not possible to specify custom
resource files to be shipped to the YARN cluster.
 
*The use case description:*
When running a Flink job in multiple environments, it becomes necessary to pass
environment-specific configuration files to the job's runtime. This can be
accomplished by packaging the configuration files within the job's jar, but with
tens of different environments one can easily end up building as many jars as
there are environments. It would be great to be able to separate the
configuration files from the job artifacts.
 
*The possible solution:*
Add a --yarnship-files option to the Flink CLI to specify files that should be
shipped to the YARN cluster.
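
As a rough sketch of the use case, the job could read its environment settings from a separate properties file at startup instead of from a resource baked into the jar. The class name, the file name job-env.properties, and the key environment.name are hypothetical, and the sketch assumes the shipped file ends up somewhere the job can read it by path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Illustrative helper, not part of Flink. */
class ExternalJobConfig {

    /** Loads key/value settings from a properties file outside the job jar. */
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass another path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "job-env.properties");
        if (!Files.exists(file)) {
            System.err.println("No configuration file found at " + file);
            return;
        }
        Properties props = load(file);
        // "environment.name" is a made-up key used only for this example.
        System.out.println(props.getProperty("environment.name", "<unset>"));
    }
}
```

With this split, one jar can serve every environment and only the shipped file differs between deployments.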






[jira] [Created] (FLINK-6912) Consider changing the RichFunction#open method signature to take no arguments.

2017-06-14 Thread Mikhail Pryakhin (JIRA)
Mikhail Pryakhin created FLINK-6912:
---

 Summary: Consider changing the RichFunction#open method signature 
to take no arguments.
 Key: FLINK-6912
 URL: https://issues.apache.org/jira/browse/FLINK-6912
 Project: Flink
  Issue Type: Sub-task
  Components: DataSet API, DataStream API
Affects Versions: 1.3.0
Reporter: Mikhail Pryakhin
Priority: Minor


RichFunction#open(org.apache.flink.configuration.Configuration) takes a
Configuration instance as an argument, which is always [passed as a new
instance|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/operators/AbstractUdfStreamOperator.java#L111]
 bearing no configuration parameters. As far as I can tell, this is a remnant of
the past, since the method signature originates from the Record API. Consider
changing the RichFunction#open method signature to take no arguments and
updating the Javadocs accordingly.
You can find the complete discussion
[here|http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/RichMapFunction-setup-method-td13696.html]
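
The effect can be shown with a small self-contained model. The nested Configuration and RichFunction types below are simplified stand-ins written for this example, not the Flink classes, and the "endpoint" key is made up. The model assumes only what is described above: the caller always passes a fresh, empty instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model, not Flink code. */
class OpenSignatureSketch {

    /** Stand-in for org.apache.flink.configuration.Configuration. */
    static final class Configuration {
        private final Map<String, String> values = new HashMap<>();

        String getString(String key, String defaultValue) {
            return values.getOrDefault(key, defaultValue);
        }

        boolean isEmpty() {
            return values.isEmpty();
        }
    }

    /** Stand-in for the current open(Configuration) contract. */
    interface RichFunction {
        void open(Configuration parameters);
    }

    /** A user function that tries to read a setting from the argument. */
    static final class MyFunction implements RichFunction {
        String endpoint;
        boolean sawEmptyConfiguration;

        @Override
        public void open(Configuration parameters) {
            sawEmptyConfiguration = parameters.isEmpty();
            // "endpoint" is a hypothetical key; with an empty argument the default wins.
            endpoint = parameters.getString("endpoint", "default-endpoint");
        }
    }

    /** Mimics the caller described above: it always passes a new, empty instance. */
    static MyFunction openWithFreshConfiguration() {
        MyFunction function = new MyFunction();
        function.open(new Configuration());
        return function;
    }

    public static void main(String[] args) {
        MyFunction function = openWithFreshConfiguration();
        System.out.println("empty argument: " + function.sawEmptyConfiguration);
        System.out.println("endpoint: " + function.endpoint);
    }
}
```

In this model the argument never carries a value, so a no-argument open() would lose nothing.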



