[jira] [Created] (FLINK-13941) Prevent data-loss by not cleaning up small part files from S3.

2019-09-02 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13941:
--

 Summary: Prevent data-loss by not cleaning up small part files 
from S3.
 Key: FLINK-13941
 URL: https://issues.apache.org/jira/browse/FLINK-13941
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem
Affects Versions: 1.9.0, 1.8.1, 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.2






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13946) Remove deactivated JobSession-related code.

2019-09-03 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13946:
--

 Summary: Remove deactivated JobSession-related code.
 Key: FLINK-13946
 URL: https://issues.apache.org/jira/browse/FLINK-13946
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue refers to removing the code related to job session as described in 
[FLINK-2097|https://issues.apache.org/jira/browse/FLINK-2097].  The feature is 
deactivated, as pointed by the comment 
[here|https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/ExecutionEnvironment.java#L285]
 and it complicates the code paths related to job submission, namely the 
lifecycle of the Remote and LocalExecutors.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13954) Clean up ExecutionEnvironment / JobSubmission code paths

2019-09-04 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13954:
--

 Summary: Clean up ExecutionEnvironment / JobSubmission code paths
 Key: FLINK-13954
 URL: https://issues.apache.org/jira/browse/FLINK-13954
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Affects Versions: 1.9.0
Reporter: Kostas Kloudas


This is an umbrella issue to serve as a hub for all issues related to job 
submission / (stream) execution environment refactoring.

 

This issue does not change any existing functionality, but it targets to clean 
up / rearrange the code in the relevant components so that further changes are 
easier to apply.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13959) Consolidate DetachedEnvironment and ContextEnvironment

2019-09-04 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13959:
--

 Summary: Consolidate DetachedEnvironment and ContextEnvironment
 Key: FLINK-13959
 URL: https://issues.apache.org/jira/browse/FLINK-13959
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


The code for the two environments is similar, so with a bit of work we can get 
rid of the {{DetachedEnvironment}} and maintain only the {{ContextEnvironment}}.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-13972) Move PreviewPlanEnvironment to flink-tests

2019-09-05 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-13972:
--

 Summary: Move PreviewPlanEnvironment to flink-tests
 Key: FLINK-13972
 URL: https://issues.apache.org/jira/browse/FLINK-13972
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


The PreviewPlanEnvironment is already used only in tests. This issue simply 
moves it in the package where it is used.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (FLINK-14157) Temporarily remove S3 StreamingFileSink end-to-end test

2019-09-20 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14157:
--

 Summary: Temporarily remove S3 StreamingFileSink end-to-end test
 Key: FLINK-14157
 URL: https://issues.apache.org/jira/browse/FLINK-14157
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / FileSystem, Tests
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue temporarily disables the failing test so that we can have a green 
travis build, until a proper solution is found.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14376) Introduce the Executor abstraction (FLIP-73)

2019-10-11 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14376:
--

 Summary: Introduce the Executor abstraction (FLIP-73)
 Key: FLINK-14376
 URL: https://issues.apache.org/jira/browse/FLINK-14376
 Project: Flink
  Issue Type: New Feature
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


This is the umbrella issue to track the progress of the implementation of 
[FLIP-73|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-73%3A+Introducing+Executors+for+job+submission]]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14377) Translate ProgramOptions relevant for job execution to ConfigOptions.

2019-10-11 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14377:
--

 Summary: Translate ProgramOptions relevant for job execution to 
ConfigOptions.
 Key: FLINK-14377
 URL: https://issues.apache.org/jira/browse/FLINK-14377
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


A subset of the {{ProgramOptions}} will be needed by the executor for 
{{JobGraph}} creation and/or {{Executor}} selection.
These parameters are the:
 * jars
 * classpaths
 * parallelism
 * savepointSettings
 * detached (or not)
 * shutdown_on_attached

This issue aims at introducing the {{ConfigOption}} counterparts of these 
parameters and mapping the one to the other.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14501) Move ClusteDescriptor/ClusterSpecification creation from the CustomCommandLine to a ClusterClientFactory

2019-10-23 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14501:
--

 Summary: Move ClusteDescriptor/ClusterSpecification creation from 
the CustomCommandLine to a ClusterClientFactory
 Key: FLINK-14501
 URL: https://issues.apache.org/jira/browse/FLINK-14501
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission, Command Line Client
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue targets to make the {{ClusterDescriptor}} and the 
{{ClusterSpecification}} independent from the {{CommandLine}} . Currently this 
is not the case, as the {{CustomCommandLine}} is responsible for instantiating 
them.

 

The correct {{ClusterClientFactory}} can be discovered based on Service 
Discovery.

 

This decoupling will enable the further decoupling of the cluster execution 
from the command line-related code, which apart from unnecessary, it is also a 
bad separation of concerns.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14502) Make ClusterDescriptor/ClusterSpecification creation configuration-based.

2019-10-23 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14502:
--

 Summary: Make ClusterDescriptor/ClusterSpecification creation 
configuration-based.
 Key: FLINK-14502
 URL: https://issues.apache.org/jira/browse/FLINK-14502
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Currently {{ClusterDescriptors}} are created directly from the command line. 
This issue aims at allowing their creation based on {{Configuration}} objects.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14548) Many tests in the nightly fail during cancellation.

2019-10-28 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14548:
--

 Summary: Many tests in the nightly fail during cancellation.
 Key: FLINK-14548
 URL: https://issues.apache.org/jira/browse/FLINK-14548
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination, Tests
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


The stacktrace often looks like:

{{{

2019-10-27 23:34:17,257 ERROR org.apache.flink.runtime.taskmanager.Task - Error 
while canceling the task Map -> Sink: Unnamed (1/1). 
java.util.concurrent.RejectedExecutionException: 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxStateException: Mailbox 
is in state CLOSED, but is required to be in state OPEN for put operations. at 
org.apache.flink.streaming.runtime.tasks.mailbox.execution.MailboxExecutorImpl.executeFirst(MailboxExecutorImpl.java:75)
 at 
org.apache.flink.streaming.runtime.tasks.mailbox.execution.MailboxProcessor.sendPriorityLetter(MailboxProcessor.java:176)
 at 
org.apache.flink.streaming.runtime.tasks.mailbox.execution.MailboxProcessor.allActionsCompleted(MailboxProcessor.java:172)
 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.cancel(StreamTask.java:540) 
at org.apache.flink.runtime.taskmanager.Task$TaskCanceler.run(Task.java:1330) 
at java.lang.Thread.run(Thread.java:748) Caused by: 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxStateException: Mailbox 
is in state CLOSED, but is required to be in state OPEN for put operations. at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.checkPutStateConditions(TaskMailboxImpl.java:199)
 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.putHeadInternal(TaskMailboxImpl.java:141)
 at 
org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.putFirst(TaskMailboxImpl.java:131)
 at 
org.apache.flink.streaming.runtime.tasks.mailbox.execution.MailboxExecutorImpl.executeFirst(MailboxExecutorImpl.java:73)
 ... 5 more
}}} 

 

For more details check here: 
[https://travis-ci.org/apache/flink/builds/603215648?utm_source=slack&utm_medium=notification]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14657) Generalize and move YarnConfigUtils from flink-yarn to flink-core

2019-11-07 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14657:
--

 Summary: Generalize and move YarnConfigUtils from flink-yarn to 
flink-core
 Key: FLINK-14657
 URL: https://issues.apache.org/jira/browse/FLINK-14657
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue is just about moving some utility methods from flink-yarn to 
flink-core because they could be of general interest.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14745) Put job jars from PackagedProgram to Configuration

2019-11-13 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14745:
--

 Summary: Put job jars from PackagedProgram to Configuration
 Key: FLINK-14745
 URL: https://issues.apache.org/jira/browse/FLINK-14745
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Currently, the {{PackagedProgram}} has methods that allow the "flattening" of 
the job jar, in case it contains other {{jars}}. This issue will allow to put 
the list of {{URLs}} in the configuration directly and independently from the 
{{PackagedProgram}}, so that then they can be picked up by the {{Executors}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14758) Add Executor-related interfaces and utilities

2019-11-13 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14758:
--

 Summary: Add Executor-related interfaces and utilities
 Key: FLINK-14758
 URL: https://issues.apache.org/jira/browse/FLINK-14758
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This is issue only targets at introducing the interfaces without (yet) using 
them. It will add the {{Executor}} interface, the {{ExecutorFactory}} and the 
{{ExecutorServiceLoader}} which will be able to "discover" the registered 
{{ExecutorFactories}}.

In addition, it will add the {{env.execute()}} (both batch and streaming) 
implementations as described in 
[FLIP-73|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-73%3A+Introducing+Executors+for+job+submission]],
 but this is not going to affect anything as the current environments override 
the {{execute()}} method.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14829) Add test that verifies that the CLI sets the correct classloader.

2019-11-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14829:
--

 Summary: Add test that verifies that the CLI sets the correct 
classloader.
 Key: FLINK-14829
 URL: https://issues.apache.org/jira/browse/FLINK-14829
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


Add a test that verifies that the 
[FLINK-14808|https://issues.apache.org/jira/browse/FLINK-14808] does not happen 
again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14850) Refactor Executor interface and introduce a minimal JobClient interface

2019-11-19 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14850:
--

 Summary: Refactor Executor interface and introduce a minimal 
JobClient interface
 Key: FLINK-14850
 URL: https://issues.apache.org/jira/browse/FLINK-14850
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue aims at decoupling job submission from result retrieval with the 
introduction of a {{JobClient}} which will be responsible for fetching the 
{{JobExecutionResult}}. The {{Executor.execute()}} will return a {{JobClient}} 
which can then be used to retrieve the result.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14851) Make the (Stream)ContextEnvironment use the Executors

2019-11-19 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14851:
--

 Summary: Make the (Stream)ContextEnvironment use the Executors
 Key: FLINK-14851
 URL: https://issues.apache.org/jira/browse/FLINK-14851
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-14972) Implement RemoteExecutor as a new Executor

2019-11-27 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-14972:
--

 Summary: Implement RemoteExecutor as a new Executor
 Key: FLINK-14972
 URL: https://issues.apache.org/jira/browse/FLINK-14972
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15033) Remove unused RemoteEnvirnment.executeRemotely() (FLINK-11048)

2019-12-03 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15033:
--

 Summary: Remove unused RemoteEnvirnment.executeRemotely() 
(FLINK-11048) 
 Key: FLINK-15033
 URL: https://issues.apache.org/jira/browse/FLINK-15033
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15039) Remove default close() implementation from ClusterClient

2019-12-03 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15039:
--

 Summary: Remove default close() implementation from ClusterClient
 Key: FLINK-15039
 URL: https://issues.apache.org/jira/browse/FLINK-15039
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Currently the {{ClusterClient}} interface extends {{AutoCloseable}} but 
provides a default, empty implementation for the {{close()}} method. This is 
error-prone as subclass implementations may not think that they need to 
actually implement a proper {{close()}}.

 

To this end, this issue aims at removing the default implementation and each 
implementation of the {{ClusterClient}} will have to explicitly provide its own 
implementation.
 
 
 
 
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15041) Remove default close() implementation from JobClient

2019-12-03 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15041:
--

 Summary: Remove default close() implementation from JobClient
 Key: FLINK-15041
 URL: https://issues.apache.org/jira/browse/FLINK-15041
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15042) Fix python compatibility by excluding the Env.executeAsync() (FLINK-14854)

2019-12-03 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15042:
--

 Summary: Fix python compatibility by excluding the 
Env.executeAsync() (FLINK-14854)
 Key: FLINK-15042
 URL: https://issues.apache.org/jira/browse/FLINK-15042
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15118) Make flink-scala-shell use Executors

2019-12-06 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15118:
--

 Summary: Make flink-scala-shell use Executors
 Key: FLINK-15118
 URL: https://issues.apache.org/jira/browse/FLINK-15118
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission, Scala Shell
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15119) Remove PlanExecutor and its subclasses

2019-12-06 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15119:
--

 Summary: Remove PlanExecutor and its subclasses
 Key: FLINK-15119
 URL: https://issues.apache.org/jira/browse/FLINK-15119
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15120) Make FlinkYarnCli#isActive() & #getApplicationId() respect config APPLICATION_ID

2019-12-06 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15120:
--

 Summary: Make FlinkYarnCli#isActive() & #getApplicationId() 
respect config APPLICATION_ID 
 Key: FLINK-15120
 URL: https://issues.apache.org/jira/browse/FLINK-15120
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission, Deployment / YARN
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15179) Kubernetes should not have a CustomCommandLine.

2019-12-10 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15179:
--

 Summary: Kubernetes should not have a CustomCommandLine.
 Key: FLINK-15179
 URL: https://issues.apache.org/jira/browse/FLINK-15179
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / Kubernetes
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


As part of FLIP-73, all command line options are mapped to config options. 
Given this 1-to-1 mapping, the Kubernetes command line could simply forward the 
command line arguments to ConfigOptions directly, instead of introducing new 
command line options. In this case, the user is expected to simply write:

 

{{bin/run -e (or --executor) kubernetes-session-cluster -D

kubernetes.container.image=MY_IMAGE ...}} 

and the CLI will parse the -e to figure out the correct 
{{ClusterClientFactory}} and {{ExecutorFactory}} and then forward to that the 
config options specified with {{-D}}. 

For this, we need to introduce a {{GenericCustomCommandLine}} that simply 
forward the specified parameters to the executors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15195) Remove unu

2019-12-11 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15195:
--

 Summary: Remove unu
 Key: FLINK-15195
 URL: https://issues.apache.org/jira/browse/FLINK-15195
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-12051) TaskExecutorTest.testFilterOutDuplicateJobMasterRegistrations() failed locally.

2019-03-28 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-12051:
--

 Summary: 
TaskExecutorTest.testFilterOutDuplicateJobMasterRegistrations() failed locally.
 Key: FLINK-12051
 URL: https://issues.apache.org/jira/browse/FLINK-12051
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.8.0
Reporter: Kostas Kloudas


The test failed locally with:

{code}

Wanted but not invoked:
 jobLeaderService.start(
 ,
 ,
 ,
 
 );
 -> at 
org.apache.flink.runtime.taskexecutor.TaskExecutorTest.testFilterOutDuplicateJobMasterRegistrations(TaskExecutorTest.java:1171)
 Actually, there were zero interactions with this mock.

Wanted but not invoked:
 jobLeaderService.start(
 ,
 ,
 ,
 
 );
 -> at 
org.apache.flink.runtime.taskexecutor.TaskExecutorTest.testFilterOutDuplicateJobMasterRegistrations(TaskExecutorTest.java:1171)
 Actually, there were zero interactions with this mock.

at 
org.apache.flink.runtime.taskexecutor.TaskExecutorTest.testFilterOutDuplicateJobMasterRegistrations(TaskExecutorTest.java:1171)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runners.Suite.runChild(Suite.java:128)
 at org.junit.runners.Suite.runChild(Suite.java:27)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
 at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
 at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-12226) Add documentation about SUSPEND/TERMINATE

2019-04-17 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-12226:
--

 Summary: Add documentation about SUSPEND/TERMINATE
 Key: FLINK-12226
 URL: https://issues.apache.org/jira/browse/FLINK-12226
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13103) Graceful Shutdown Handling by UDFs

2019-07-04 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13103:
--

 Summary: Graceful Shutdown Handling by UDFs
 Key: FLINK-13103
 URL: https://issues.apache.org/jira/browse/FLINK-13103
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This is an umbrella issue for 
[FLIP-46|[https://cwiki.apache.org/confluence/display/FLINK/FLIP-46%3A+Graceful+Shutdown+Handling+by+UDFs]]
 and it will be broken down into more fine grained JIRAs as the discussion on 
the FLIP evolves.

 

For more details on what the FLIP is about, please refer to the link above.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-13275) Race condition in SourceStreamTaskTest.finishingIgnoresExceptions()

2019-07-15 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13275:
--

 Summary: Race condition in 
SourceStreamTaskTest.finishingIgnoresExceptions()
 Key: FLINK-13275
 URL: https://issues.apache.org/jira/browse/FLINK-13275
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


The race condition consists of the following series of steps:

1) the `finishTask` of the source puts the {{POISON_PILL}} in the {{mailbox}}. 
This will be put as first letter in the queue because we call 
{{sendPriorityLetter()}} in the {{MailboxProcessor}}. 

2) then in the {{cancel()}} of the source (called by the {{cancelTask()}} leads 
the source out of the run-loop which throws the exception in the test source. 
The latter, calls again {{sendPriorityLetter()}} for the exception, which means 
that this exception letter may override the previously sent {{POISON_PILL}} 
(because it jumps the line), 

3) if the {{POISON_PILL}} has already been executed, we are good, if not, then 
the test harness sets the exception in the 
{{StreamTaskTestHarness.TaskThread.error}}.

To fix that, I would suggest to follow the same strategy for the root problem 
https://issues.apache.org/jira/browse/FLINK-13124 as in release-1.9.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13307) SourceStreamTaskTest test instability.

2019-07-17 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13307:
--

 Summary: SourceStreamTaskTest test instability.
 Key: FLINK-13307
 URL: https://issues.apache.org/jira/browse/FLINK-13307
 Project: Flink
  Issue Type: Improvement
  Components: Tests
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
 Fix For: 1.9.0






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13440) Add test that fails job when sync savepoint is discarded.

2019-07-26 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13440:
--

 Summary: Add test that fails job when sync savepoint is discarded.
 Key: FLINK-13440
 URL: https://issues.apache.org/jira/browse/FLINK-13440
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Checkpointing
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.9.0


This is a test for FLINK-12858



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13605) AsyncDataStreamITCase. testUnorderedWait failed on Travis

2019-08-06 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13605:
--

 Summary: AsyncDataStreamITCase. testUnorderedWait failed on Travis
 Key: FLINK-13605
 URL: https://issues.apache.org/jira/browse/FLINK-13605
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


An instance of the failure can be found here 
https://api.travis-ci.org/v3/job/568291353/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13713) Remove legacy Package interface.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13713:
--

 Summary: Remove legacy Package interface.
 Key: FLINK-13713
 URL: https://issues.apache.org/jira/browse/FLINK-13713
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.10.0


The Package interface is an interface used in the Stratosphere days in order to 
write Flink jobs. This is now outdated as users use the Environments to write 
jobs.

This Jira proposes the removal of this interface and the related code, as it is 
used nowhere in Flink itself but it complicates parts of the codebase.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13714) Remove Package-related code.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13714:
--

 Summary: Remove Package-related code.
 Key: FLINK-13714
 URL: https://issues.apache.org/jira/browse/FLINK-13714
 Project: Flink
  Issue Type: Sub-task
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13715) Remove Package-related english documentation.

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13715:
--

 Summary: Remove Package-related english documentation.
 Key: FLINK-13715
 URL: https://issues.apache.org/jira/browse/FLINK-13715
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-13716) Remove Package-related chinese documentation

2019-08-14 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-13716:
--

 Summary: Remove Package-related chinese documentation
 Key: FLINK-13716
 URL: https://issues.apache.org/jira/browse/FLINK-13716
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.9.0
Reporter: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (FLINK-10522) Check if RecoverableWriter supportsResume and accordingly.

2018-10-10 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10522:
--

 Summary: Check if RecoverableWriter supportsResume and accordingly.
 Key: FLINK-10522
 URL: https://issues.apache.org/jira/browse/FLINK-10522
 Project: Flink
  Issue Type: Sub-task
  Components: filesystem-connector
Affects Versions: 1.6.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.0


So far we assumed that all `RecoverableWriters` support "resuming", i.e. after 
recovering from a failure or from a savepoint they could keep writing to the 
previously "in-progress" file. This assumption holds for all current writers, 
but in order to be able to accommodate also filesystems that may not support 
this operation, we should check upon initialization if the writer supports 
resuming and if yes, we go as before, if not, we recover for commit and commit 
the previously in-progress file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10529) Add flink-s3-fs-base to the connectors in the travis stage file.

2018-10-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10529:
--

 Summary: Add flink-s3-fs-base to the connectors in the travis 
stage file.
 Key: FLINK-10529
 URL: https://issues.apache.org/jira/browse/FLINK-10529
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10530) ProcessFailureCancelingITCase.testCancelingOnProcessFailure failed on Travis.

2018-10-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10530:
--

 Summary: 
ProcessFailureCancelingITCase.testCancelingOnProcessFailure failed on Travis.
 Key: FLINK-10530
 URL: https://issues.apache.org/jira/browse/FLINK-10530
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
 Fix For: 1.7.0


The logs from Travis: https://api.travis-ci.org/v3/job/440109944/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10583) Add support for state retention to the Processing Time versioned joins.

2018-10-17 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10583:
--

 Summary: Add support for state retention to the Processing Time 
versioned joins.
 Key: FLINK-10583
 URL: https://issues.apache.org/jira/browse/FLINK-10583
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10584) Add support for state retention to the Event Time versioned joins.

2018-10-17 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10584:
--

 Summary: Add support for state retention to the Event Time 
versioned joins.
 Key: FLINK-10584
 URL: https://issues.apache.org/jira/browse/FLINK-10584
 Project: Flink
  Issue Type: Sub-task
  Components: Table API & SQL
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10733) Misleading clean_log_files() in common.sh

2018-10-31 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10733:
--

 Summary: Misleading clean_log_files() in common.sh
 Key: FLINK-10733
 URL: https://issues.apache.org/jira/browse/FLINK-10733
 Project: Flink
  Issue Type: Bug
  Components: E2E Tests
Affects Versions: 1.6.2
Reporter: Kostas Kloudas


In the `common.sh` base script of the end-to-end tests, there is a 
`clean_stdout_files` which cleans only the `*.out` files and a 
`clean_log_files` which cleans *both* `*.log` and `*.out` files.

Given the current behavior that at the end of a test, the logs are checked and 
if there are exceptions (even expected ones but not whitelisted), the tests 
fails, some tests chose to call the `clean_log_files` so that exceptions are 
ignored. In this case, also `*.out` files are cleaned so if a test was checking 
for errors in the `.out` files, then the test will falsely pass.

The solution is as simple as renaming the method to something more descriptive 
like `clean_logs_and_output_files`, but doing so, also includes checking if any 
existing tests were falsely passing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10739) Unstable ProcessFailureCancelingITCase.testCancelingOnProcessFailure

2018-10-31 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10739:
--

 Summary: Unstable 
ProcessFailureCancelingITCase.testCancelingOnProcessFailure
 Key: FLINK-10739
 URL: https://issues.apache.org/jira/browse/FLINK-10739
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.6.2
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.0


An instance can be found here https://api.travis-ci.org/v3/job/448844674/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10803) Add documentation about S3 support by the StreamingFileSink

2018-11-06 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10803:
--

 Summary: Add documentation about S3 support by the 
StreamingFileSink
 Key: FLINK-10803
 URL: https://issues.apache.org/jira/browse/FLINK-10803
 Project: Flink
  Issue Type: Bug
  Components: filesystem-connector
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
 Fix For: 1.7.0






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-10963) Cleanup small objects uploaded to S3 as independent objects

2018-11-21 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-10963:
--

 Summary: Cleanup small objects uploaded to S3 as independent 
objects
 Key: FLINK-10963
 URL: https://issues.apache.org/jira/browse/FLINK-10963
 Project: Flink
  Issue Type: Sub-task
  Components: filesystem-connector
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.1


The S3 {{RecoverableWriter}} uses the Multipart Upload (MPU) Feature of S3 in 
order to upload the different part files. This means that a large part is split 
in chunks of at least 5MB which are uploaded independently, whenever each one 
of them is ready.

This 5MB minimum size requires special handling of parts that are less than 5MB 
when a checkpoint barrier arrives. These small files are uploaded as 
independent objects (not associated with an active MPU). This way, when Flink 
needs to restore, it simply downloads them and resumes writing to them.

These small objects are currently not cleaned up, thus leading to wasted space 
on S3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11027) JobManagerHAProcessFailureRecoveryITCase#testDispatcherProcessFailure failed on Travis

2018-11-29 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11027:
--

 Summary: 
JobManagerHAProcessFailureRecoveryITCase#testDispatcherProcessFailure failed on 
Travis
 Key: FLINK-11027
 URL: https://issues.apache.org/jira/browse/FLINK-11027
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.7.0
Reporter: Kostas Kloudas


The failed run on travis: https://api.travis-ci.org/v3/job/460844327/log.txt



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11116) Clean-up temporary files that upon recovery, they belong to no checkpoint.

2018-12-10 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-6:
--

 Summary: Clean-up temporary files that upon recovery, they belong 
to no checkpoint.
 Key: FLINK-6
 URL: https://issues.apache.org/jira/browse/FLINK-6
 Project: Flink
  Issue Type: Improvement
  Components: filesystem-connector
Affects Versions: 1.7.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.7.1


In order to guarantee exactly-once semantics, the streaming file sink is 
implementing a two-phase commit protocol when writing files to the filesystem.

Initially data is written to in-progress files. These files are then put into 
"pending" state when they are completed (based on the rolling policy), and they 
are finally committed when the checkpoint that put them in the "pending" state 
is acknowledged as complete.

The above shows that in the case that we have:
1) checkpoints A, B, C coming 
2) checkpoint A being acknowledged and 
3) failure

Then we may have files that do not belong to any checkpoint (because B and C 
were not considered successful). These files are currently not cleaned up.

This issue aims at cleaning up these files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11458) Ensure sink commit side-effects when cancelling with savepoint.

2019-01-29 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11458:
--

 Summary: Ensure sink commit side-effects when cancelling with 
savepoint.
 Key: FLINK-11458
 URL: https://issues.apache.org/jira/browse/FLINK-11458
 Project: Flink
  Issue Type: New Feature
  Components: State Backends, Checkpointing
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.0


TBD.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11572) Add new types of CheckpointBarriers.

2019-02-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11572:
--

 Summary: Add new types of CheckpointBarriers.
 Key: FLINK-11572
 URL: https://issues.apache.org/jira/browse/FLINK-11572
 Project: Flink
  Issue Type: Sub-task
  Components: Core
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.0


Add the new type of barriers that will be sent to the JobManager to the 
TaskManagers and which will trigger the {{DRAINING}} or {{SUSPENSION}} of the 
pipeline.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11573) Make StreamTask properly handle the SUSPEND message.

2019-02-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11573:
--

 Summary: Make StreamTask properly handle the SUSPEND message.
 Key: FLINK-11573
 URL: https://issues.apache.org/jira/browse/FLINK-11573
 Project: Flink
  Issue Type: Sub-task
  Components: Core, Streaming, TaskManager
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.0


In this case, the {{StreamTask}} has to;
- launch the checkpointing process
- wait until also for the checkpoint notification callback to be successfully 
executed
- make sure no elements are processed by downstream operators (req. for 
exactly-once with idempotent updates)
- do not call {{close()}}
- let the job go gracefully to {{FINISHED}}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11575) Let the JobManager send the SUSPEND and DRAIN messages

2019-02-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11575:
--

 Summary: Let the JobManager send the SUSPEND and DRAIN messages
 Key: FLINK-11575
 URL: https://issues.apache.org/jira/browse/FLINK-11575
 Project: Flink
  Issue Type: Sub-task
  Components: Core, Streaming, TaskManager
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.0


Expose the functionality to the job manager and subsequently to the user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11574) Make StreamTask properly handle the DRAIN message.

2019-02-11 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11574:
--

 Summary: Make StreamTask properly handle the DRAIN message.
 Key: FLINK-11574
 URL: https://issues.apache.org/jira/browse/FLINK-11574
 Project: Flink
  Issue Type: Sub-task
  Components: Core, Streaming, TaskManager
Affects Versions: 1.8.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.8.0


In this case, the {{StreamTask}} has to;
- send {{MAX_WATERMARK}} so that timers fire
- launch the checkpointing process
- wait until also for the checkpoint notification callback to be successfully 
executed
- make sure no elements are processed by downstream operators (req. for 
exactly-once with idempotent updates)
-  call {{close()}}
- let the job go gracefully to {{FINISHED}}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11667) Add Synchronous Checkpoint handling in StreamTask

2019-02-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11667:
--

 Summary: Add Synchronous Checkpoint handling in StreamTask
 Key: FLINK-11667
 URL: https://issues.apache.org/jira/browse/FLINK-11667
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This is the basic building block for the SUSPEND/TERMINATE functionality. 

In case of a synchronous checkpoint barrier, the checkpointing thread will 
block (without holding the checkpoint lock) until the 
{{notifyCheckpointComplete}} is executed successfully. This  will allow the 
checkpoint to be considered successful ONLY when also the 
{{notifyCheckpointComplete}} is successfully executed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11668) Allow sources to advance time to max watermark on checkpoint.

2019-02-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11668:
--

 Summary: Allow sources to advance time to max watermark on 
checkpoint.
 Key: FLINK-11668
 URL: https://issues.apache.org/jira/browse/FLINK-11668
 Project: Flink
  Issue Type: Sub-task
  Components: State Backends, Checkpointing, Streaming
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This is needed for the TERMINATE case. It will allow the sources to inject the 
{{MAX_WATERMARK}} before the barrier that will trigger the last savepoint. This 
will fire any registered event-time timers and flush any state associated with 
these timers, e.g. windows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11669) Add Synchronous Checkpoint Triggering RPCs.

2019-02-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11669:
--

 Summary: Add Synchronous Checkpoint Triggering RPCs.
 Key: FLINK-11669
 URL: https://issues.apache.org/jira/browse/FLINK-11669
 Project: Flink
  Issue Type: Sub-task
  Components: JobManager, TaskManager
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Wires the triggering of the Synchronous {{Checkpoint/Savepoint}} from the 
{{JobMaster}} to the {{TaskExecutor}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11671) Expose SUSPEND/TERMINATE to CLI

2019-02-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11671:
--

 Summary: Expose SUSPEND/TERMINATE to CLI
 Key: FLINK-11671
 URL: https://issues.apache.org/jira/browse/FLINK-11671
 Project: Flink
  Issue Type: Sub-task
  Components: Client
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Expose the SUSPEND/TERMINATE functionality to the user through the command line.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11670) Add SUSPEND/TERMINATE calls to REST API

2019-02-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11670:
--

 Summary: Add SUSPEND/TERMINATE calls to REST API
 Key: FLINK-11670
 URL: https://issues.apache.org/jira/browse/FLINK-11670
 Project: Flink
  Issue Type: Sub-task
  Components: REST
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Exposes the SUSPEND/TERMINATE functionality to the user through the REST API.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-11984) StreamingFileSink docs do not mention S3 savepoint caveats.

2019-03-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-11984:
--

 Summary: StreamingFileSink docs do not mention S3 savepoint 
caveats.
 Key: FLINK-11984
 URL: https://issues.apache.org/jira/browse/FLINK-11984
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem, Documentation
Affects Versions: 1.7.2
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (FLINK-15260) Uniformize Kubernetes executor name with the rest

2019-12-14 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15260:
--

 Summary: Uniformize Kubernetes executor name with the rest
 Key: FLINK-15260
 URL: https://issues.apache.org/jira/browse/FLINK-15260
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / Kubernetes
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.10.0


Currently the  Kubernetes executor is named "kubernetes-session-cluster" while 
all the others have the word executor in their name. This issue will rename it 
to "kubernetes-session-executor".



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15273) Complete and Update Documentation

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15273:
--

 Summary: Complete and Update Documentation
 Key: FLINK-15273
 URL: https://issues.apache.org/jira/browse/FLINK-15273
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


This is an umbrella issue that serves at monitoring the progress in documenting 
the new features that were introduced in {{Flink-1.10}} and also the overall 
improvement of the documentation before the release, based on user feedback.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15274) Update Filesystem documentation to reflect changes in shading

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15274:
--

 Summary: Update Filesystem documentation to reflect changes in 
shading
 Key: FLINK-15274
 URL: https://issues.apache.org/jira/browse/FLINK-15274
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, FileSystems
Affects Versions: 1.10.0
Reporter: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15275) Update CLI documentation to include only current valid options

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15275:
--

 Summary: Update CLI documentation to include only current valid 
options
 Key: FLINK-15275
 URL: https://issues.apache.org/jira/browse/FLINK-15275
 Project: Flink
  Issue Type: Sub-task
  Components: Command Line Client, Documentation
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


Currently the documentation for the CLI contains outdated/invalid information, 
such as deprecated and removed options. This has to be fixed before the release.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15276) Update CLI documentation to reflect the addition of the ExecutorCLI

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15276:
--

 Summary: Update CLI documentation to reflect the addition of the 
ExecutorCLI
 Key: FLINK-15276
 URL: https://issues.apache.org/jira/browse/FLINK-15276
 Project: Flink
  Issue Type: Sub-task
  Components: Command Line Client, Documentation
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


The addition of the {{ExecutorCLI}} includes a new, alternative way to specify 
command-line parameters for _any_ execution target. This should be documented, 
and the other CLI documentation pages should be updated to point to also this 
one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15278) Update StreamingFileSink documentation

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15278:
--

 Summary: Update StreamingFileSink documentation
 Key: FLINK-15278
 URL: https://issues.apache.org/jira/browse/FLINK-15278
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas


Many times in the ML we have seen questions about the {{StreamingFileSink}} 
that could have been answered with better documentation that includes:
1) shortcomings (especially in the case of S3 and also bulk formats)
2) file lifecycle



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15279) Document new `executeAsync()` method and the newly introduced `JobClient`

2019-12-16 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15279:
--

 Summary: Document new `executeAsync()` method and the newly 
introduced `JobClient`
 Key: FLINK-15279
 URL: https://issues.apache.org/jira/browse/FLINK-15279
 Project: Flink
  Issue Type: Sub-task
  Components: Client / Job Submission, Documentation
Affects Versions: 1.10.0
Reporter: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15292) Rename Executor interface to PipelineExecutor

2019-12-17 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15292:
--

 Summary: Rename Executor interface to PipelineExecutor
 Key: FLINK-15292
 URL: https://issues.apache.org/jira/browse/FLINK-15292
 Project: Flink
  Issue Type: Improvement
  Components: Client / Job Submission
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas
 Fix For: 1.10.0


This is a trivial renaming task which renames the newly introduced {{Executor}} 
interface to {{PipelineExecutor}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15687) Potential test instabilities due to concurrent access to TaskSlotTable.

2020-01-20 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15687:
--

 Summary: Potential test instabilities due to concurrent access to 
TaskSlotTable.
 Key: FLINK-15687
 URL: https://issues.apache.org/jira/browse/FLINK-15687
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination, Tests
Affects Versions: 1.10.0
Reporter: Kostas Kloudas


Working on [FLINK-14742|https://issues.apache.org/jira/browse/FLINK-14742] 
revealed that the problem with that test instability was the modification of 
the {{taskSlotTable}} of the {{TaskManager}} under test from multiple threads, 
namely the test thread and the main thread of the {{rpcEnpoint}}. This 
data-structure is not thread-safe and this should not happen.

This anti-pattern seems to be repeated in multiple tests like most of the tests 
in the {{TaskExecutorSubmissionTest}} (look for the call to the 
{{TaskSlotTable.allocateSlot()}}). There we seem to call 
{{taskSlotTable.allocateSlot()}} and then \{{tmGateway.submitTask()}} which is 
essentially accessing the slot table from within the main rpc-endpoint thread.

This JIRA is just to investigate if this is also a problem in those tests or 
not.

cc [~trohrmann], [~chesnay] , [~yangwang166]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-15805) StreamingFileSink on S3 fails when firing processing time timers after recovery.

2020-01-29 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15805:
--

 Summary: StreamingFileSink on S3 fails when firing processing time 
timers after recovery.
 Key: FLINK-15805
 URL: https://issues.apache.org/jira/browse/FLINK-15805
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / FileSystem
Affects Versions: 1.9.1, 1.10.0
Reporter: Kostas Kloudas



{code:java}
org.apache.flink.util.SerializedThrowable: Caught exception while processing 
timer.
at 
org.apache.flink.streaming.runtime.tasks.StreamTask$StreamTaskAsyncExceptionHandler.handleAsyncException(StreamTask.java:958)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.handleAsyncException(StreamTask.java:932)
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:285)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
~[?:1.8.0_232]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 ~[?:1.8.0_232]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 ~[?:1.8.0_232]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_232]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]
Caused by: org.apache.flink.util.SerializedThrowable: java.io.IOException: 
Uploading parts failed
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:284)
 
... 7 more
Caused by: org.apache.flink.util.SerializedThrowable: Uploading parts failed
at 
org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:231)
 ~[?:?]
at 
org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:215)
 ~[?:?]
at 
org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:151)
 ~[?:?]
at 
org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:123)
 ~[?:?]
at 
org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:56)
 ~[?:?]
at 
org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:167)
 ~[?:?]
at 
org.apache.flink.streaming.api.functions.sink.filesystem.PartFileWriter.closeForCommit(PartFileWriter.java:71)
 ~[flink-dist_2.11-1.9.1-stream2.jar:?]
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:235)
 ~[flink-dist_2.11-1.9.1-stream2.jar:?]
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onProcessingTime(Bucket.java:334)
 ~[flink-dist_2.11-1.9.1-stream2.jar:?]
at 
org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.onProcessingTime(Buckets.java:297)
 ~[flink-dist_2.11-1.9.1-stream2.jar:?]
at 
org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.onProcessingTime(StreamingFileSink.java:364)
 ~[flink-dist_2.11-1.9.1-stream2.jar:?]
at 
org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeService$TriggerTask.run(SystemProcessingTimeService.java:281)
 ~[flink-dist_2.11-1.9.1-stream2.jar:1.9.1-stream2]
... 7 more
Caused by: org.apache.flink.util.SerializedThrowable: upload part on 
BUCKET/2019/12/05/22/04/part-5-30280: 
org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.model.AmazonS3Exception:
 The specified upload does not exist. The upload ID may be invalid, or the 
upload may have been aborted or completed. 
S3 Extended Request ID: 
tKO5p6woMJi1tDIp+KBnZ4BnkHMw1W1XIbF+B8YjxSIPG6NS21Nnh/RoYPumkvuP0WROIiCTj2k=:NoSuchUpload
at 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:225)
 ~[?:?]
at 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
 ~[?:?]
at 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260)
 ~[?:?]
at 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317)
 ~[?:?]
at 
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256)
 ~[?:?]
at 
org.apache.flink.fs.shaded.hadoop3.org.ap

[jira] [Created] (FLINK-15926) Add DataStream.broadcast(StateDescriptor) to available transformations docs

2020-02-05 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-15926:
--

 Summary: Add DataStream.broadcast(StateDescriptor) to available 
transformations docs
 Key: FLINK-15926
 URL: https://issues.apache.org/jira/browse/FLINK-15926
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, Documentation
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16369) Allow the StreamingFileSink to restore from a previous (old) savepoint.

2020-03-02 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16369:
--

 Summary: Allow the StreamingFileSink to restore from a previous 
(old) savepoint.
 Key: FLINK-16369
 URL: https://issues.apache.org/jira/browse/FLINK-16369
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / FileSystem
Affects Versions: 1.11.0
Reporter: Kostas Kloudas


Currently the StreamingFileSink will fail upon recovery if the "in-progress" 
temporary file is not there anymore. This can happen either because of an error 
(someone deleted the file) or because the file was eventually committed after 
the savepoint was taken and is when we try to restore, it is in the FINALIZED 
state.

This issue will track the progress of alleviating this limitation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16654) Implement Application Mode according to FLIP-85

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16654:
--

 Summary: Implement Application Mode according to FLIP-85
 Key: FLINK-16654
 URL: https://issues.apache.org/jira/browse/FLINK-16654
 Project: Flink
  Issue Type: New Feature
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16655) Introduce the EmbeddedExecutor

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16655:
--

 Summary: Introduce the EmbeddedExecutor
 Key: FLINK-16655
 URL: https://issues.apache.org/jira/browse/FLINK-16655
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Running the user's {{main}} method in "Application Mode" implies: 
1) launching a dedicated cluster for the application's jobs
2) running the user's {{main}} on the cluster, alongside the {{Dispatcher}}

The {{EmbeddedExecutor}} will conceptually be like the existing {{Executors}} 
for session clusters,  with the difference that this time there is no need to 
go through REST as this will be running already on the same machine as the 
{{Dispatcher}}.

This executor, apart from the application mode can also be used in the case of 
the web submission and the {{StandaloneClusterEntrypoint}} which so far do not 
use executors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16656) Introduce the DispatcherBootstrap

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16656:
--

 Summary: Introduce the DispatcherBootstrap
 Key: FLINK-16656
 URL: https://issues.apache.org/jira/browse/FLINK-16656
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This will be a "Container" of all the initial "state" of the dispatcher. 

In the current {{master}}, the bootstrap will only include the 
{{recoveredJobs}} which are passed directly to the constructor. 

For the Application Mode, this Bootstrap will also include the logic to launch 
the application and handle its recovery and resubmission upon failure.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16657) Wire the EmbeddedExecutor to the Web Submission logic.

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16657:
--

 Summary: Wire the EmbeddedExecutor to the Web Submission logic.
 Key: FLINK-16657
 URL: https://issues.apache.org/jira/browse/FLINK-16657
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode, Runtime / REST
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This will replace the current logic of web submission in the {{JarRunHandler}} 
with one that uses the newly introduced {{EmbeddedExecutor}} .



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16658) Wire the EmbeddedExecutor to the StandaloneClusterEntrypoint.

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16658:
--

 Summary: Wire the EmbeddedExecutor to the 
StandaloneClusterEntrypoint.
 Key: FLINK-16658
 URL: https://issues.apache.org/jira/browse/FLINK-16658
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16660) Introduce the ApplicationDispatcherBootstrap

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16660:
--

 Summary: Introduce the ApplicationDispatcherBootstrap
 Key: FLINK-16660
 URL: https://issues.apache.org/jira/browse/FLINK-16660
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This will be a {{DispatcherBootstrap}} with the necessary logic to run a 
{{PackagedProgram}} on the {{Dispatcher}} of an application cluster. This 
includes:
1) running the main and submitting the jobs of the application when required
2) handling the case of a failure where the jobgraphs are retrieved and 
re-submitted and the main has to detect it and continue as needed.

The scope of this task will be further clarified as the implementation unfolds. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-16661) Introduce the YarnApplicationClusterEntrypoint

2020-03-18 Thread Kostas Kloudas (Jira)
Kostas Kloudas created FLINK-16661:
--

 Summary: Introduce the YarnApplicationClusterEntrypoint
 Key: FLINK-16661
 URL: https://issues.apache.org/jira/browse/FLINK-16661
 Project: Flink
  Issue Type: Sub-task
  Components: Job Execution / Application Mode
Affects Versions: 1.10.0
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This will be the final step to support "Application Mode" for Yarn.

So far, the idea is to ship the jar to the cluster as a {{LocalResource}} and 
the entrypoint will take care of running the user's {{main}} and providing the 
desired guarantees.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-3683) Unstable tests: in the Kafka09ITCase, the testMultipleSourcesOnePartition() and the testOneToOneSources()

2016-03-30 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3683:
-

 Summary: Unstable tests: in the Kafka09ITCase, the 
testMultipleSourcesOnePartition() and the testOneToOneSources()
 Key: FLINK-3683
 URL: https://issues.apache.org/jira/browse/FLINK-3683
 Project: Flink
  Issue Type: Bug
  Components: Kafka Connector
Reporter: Kostas Kloudas


The aforementioned tests fail sometimes. To reproduce the behavior put them in 
a for-loop and let them run 100 times. In this case the problem seems to be 
that the topic was not deleted before being recreated for the next run.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3717) Add functionality to get the current offset of a source with a FileInputFormat

2016-04-08 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3717:
-

 Summary: Add functionality to get the current offset of a source 
with a FileInputFormat
 Key: FLINK-3717
 URL: https://issues.apache.org/jira/browse/FLINK-3717
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3808) Refactor the whole file monitoring source to take a fileInputFormat as an argument.

2016-04-25 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3808:
-

 Summary: Refactor the whole file monitoring source to take a 
fileInputFormat as an argument.
 Key: FLINK-3808
 URL: https://issues.apache.org/jira/browse/FLINK-3808
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This issue is just an intermediate step towards making the file source 
fault-tolerant.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3826) Broken test: StreamCheckpointingITCase

2016-04-27 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3826:
-

 Summary: Broken test: StreamCheckpointingITCase
 Key: FLINK-3826
 URL: https://issues.apache.org/jira/browse/FLINK-3826
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Kostas Kloudas


The purpose of the StreamCheckpointingITCase is to run a job, throw an 
exception to trigger a task failure, and test the recovery process. In the 
test, the line (270), which is throwing the exception to cause the task 
failure, is commented out.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3843) StreamFaultToleranceTestBase hangs in case of test failure.

2016-04-28 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3843:
-

 Summary: StreamFaultToleranceTestBase hangs in case of test 
failure.
 Key: FLINK-3843
 URL: https://issues.apache.org/jira/browse/FLINK-3843
 Project: Flink
  Issue Type: Bug
  Components: Tests
Reporter: Kostas Kloudas


The StreamFaultToleranceTestBase, which is the base class for checkpointing 
tests defines no maximum number of retries. This can lead to a test either 
succeeding and terminating, or failing silently and restarting indefinitely.

The change can just consist of adding something like:

env.getConfig().setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 0));

to the runCheckpointedProgram() method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3889) Make File Monitoring Function checkpointable.

2016-05-10 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3889:
-

 Summary: Make File Monitoring Function checkpointable.
 Key: FLINK-3889
 URL: https://issues.apache.org/jira/browse/FLINK-3889
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


This is essentially the combination of FLINK-3808 and FLINK-3717.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3896) Allow a StreamTask to be Externally Cancelled.

2016-05-10 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3896:
-

 Summary: Allow a StreamTask to be Externally Cancelled.
 Key: FLINK-3896
 URL: https://issues.apache.org/jira/browse/FLINK-3896
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3956) Make FileInputFormats in Streaming independent from the Configuration object

2016-05-23 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3956:
-

 Summary: Make FileInputFormats in Streaming independent from the 
Configuration object
 Key: FLINK-3956
 URL: https://issues.apache.org/jira/browse/FLINK-3956
 Project: Flink
  Issue Type: Bug
  Components: Streaming
Reporter: Kostas Kloudas
Assignee: Kostas Kloudas


Currently many input file input formats rely on a user specified Configuration 
object for their correct configuration. As an example, the DelimitedInputFormat 
can take its delimiter character from this object. 

This is happening for legacy reasons, and does not work in the streaming 
codebase. In streaming, the user-specified configuration object is over-written 
at the TaskManagers by a new Configuration object.

This issue aims at solving this by refactoring current file input formats to be 
able to set their parameters through setters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-3986) Rename the readFileStream(String filePath, long intervalMillis, WatchType watchType) from the StreamExecutionEnvironment

2016-05-28 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-3986:
-

 Summary: Rename the readFileStream(String filePath, long 
intervalMillis, WatchType watchType) from the StreamExecutionEnvironment
 Key: FLINK-3986
 URL: https://issues.apache.org/jira/browse/FLINK-3986
 Project: Flink
  Issue Type: Sub-task
Reporter: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4065) Unstable Kafka09ITCase.testMultipleSourcesOnePartition test

2016-06-13 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4065:
-

 Summary: Unstable Kafka09ITCase.testMultipleSourcesOnePartition 
test
 Key: FLINK-4065
 URL: https://issues.apache.org/jira/browse/FLINK-4065
 Project: Flink
  Issue Type: Test
  Components: Tests
Reporter: Kostas Kloudas
Priority: Critical


Sometime the Kafka09ITCase.testMultipleSourcesOnePartition test fails on 
travis. Here is a log of such a failure:
https://s3.amazonaws.com/archive.travis-ci.org/jobs/136758707/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4124) Unstable test WrapperSetupHelperTest.testCreateTopologyContext

2016-06-27 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4124:
-

 Summary: Unstable test 
WrapperSetupHelperTest.testCreateTopologyContext
 Key: FLINK-4124
 URL: https://issues.apache.org/jira/browse/FLINK-4124
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


Instances of the failure: 

view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/140554510/log.txt

view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/140558082/log.txt

view-source:https://s3.amazonaws.com/archive.travis-ci.org/jobs/140558119/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4126) Unstable test ZooKeeperLeaderElectionTest.testZooKeeperReelection

2016-06-27 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4126:
-

 Summary: Unstable test 
ZooKeeperLeaderElectionTest.testZooKeeperReelection
 Key: FLINK-4126
 URL: https://issues.apache.org/jira/browse/FLINK-4126
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas
Priority: Critical


Here is an instance of the failure:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/140558301/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4199) Wrong client behavior when submitting job to non-existing cluster

2016-07-12 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4199:
-

 Summary: Wrong client behavior when submitting job to non-existing 
cluster
 Key: FLINK-4199
 URL: https://issues.apache.org/jira/browse/FLINK-4199
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


Trying to submit a job jar from the client to a non-existing cluster gives the 
following messages. In particular the last line: "Job has been submitted with" 
is totally misleading.

Cluster retrieved: Standalone cluster with JobManager at 
localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Submitting job with JobID: 9c7120e5cc55b2a9157a7e2bc5a12c9d. Waiting for job 
completion.
org.apache.flink.client.program.ProgramInvocationException: The program 
execution failed: Communication with JobManager failed: Lost connection to the 
JobManager.
Job has been submitted with JobID 9c7120e5cc55b2a9157a7e2bc5a12c9d



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4224) Exception after successful execution when submitting job through the web interface.

2016-07-16 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4224:
-

 Summary: Exception after successful execution when submitting job 
through the web interface.
 Key: FLINK-4224
 URL: https://issues.apache.org/jira/browse/FLINK-4224
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Kostas Kloudas


When submitting the job below through the web interface to a local flink 
cluster:
{code}
public static void main(String[] args) throws Exception {

StreamExecutionEnvironment env =
  SreamExecutionEnvironment.getExecutionEnvironment();

TextInputFormat inputFormat = 
 new TextInputFormat(new 
Path("file:///YOUR_TEXT_FILE"));

DataStream> input = env.

readFile(inputFormat,inputFormat.getFilePath().toString()).
flatMap(new Tokenizer()).
keyBy(0).
sum(1);
input.print();
env.execute("WebClient Testing Example.");
}
{/code}

The job succeeds, but at the logs there is a NPE: 

{code}
2016-07-16 16:56:18,197 WARN  
org.apache.flink.runtime.webmonitor.RuntimeMonitorHandler - 
 Error while handling request java.lang.RuntimeException: 
 Couldn't deserialize ExecutionConfig.
...
Caused by: java.lang.NullPointerException at 
org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:55)at
 
org.apache.flink.runtime.webmonitor.handlers.JobConfigHandler.handleRequest(JobConfigHandler.java:50)
{/code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4237) ClassLoaderITCase.testDisposeSavepointWithCustomKvState fails due to Timeout Futures

2016-07-20 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4237:
-

 Summary: ClassLoaderITCase.testDisposeSavepointWithCustomKvState 
fails due to Timeout Futures
 Key: FLINK-4237
 URL: https://issues.apache.org/jira/browse/FLINK-4237
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


An instance of the problem can be found here:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/146080605/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4255) Unstable test WebRuntimeMonitorITCase.testNoEscape

2016-07-22 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4255:
-

 Summary: Unstable test WebRuntimeMonitorITCase.testNoEscape
 Key: FLINK-4255
 URL: https://issues.apache.org/jira/browse/FLINK-4255
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


An instance of the problem can be found here:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/146615994/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4398) Unstable test KvStateServerHandlerTest.testSimpleQuery

2016-08-15 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4398:
-

 Summary: Unstable test KvStateServerHandlerTest.testSimpleQuery
 Key: FLINK-4398
 URL: https://issues.apache.org/jira/browse/FLINK-4398
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


An instance can be found here:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/152392521/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4397) Unstable test SlotCountExceedingParallelismTest.tearDown

2016-08-15 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4397:
-

 Summary: Unstable test SlotCountExceedingParallelismTest.tearDown
 Key: FLINK-4397
 URL: https://issues.apache.org/jira/browse/FLINK-4397
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


An instance can be found here:

https://s3.amazonaws.com/archive.travis-ci.org/jobs/152392524/log.txt



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4407) Implement the trigger DSL

2016-08-17 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4407:
-

 Summary: Implement the trigger DSL
 Key: FLINK-4407
 URL: https://issues.apache.org/jira/browse/FLINK-4407
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4433) Refactor the StreamSource.

2016-08-19 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4433:
-

 Summary: Refactor the StreamSource.
 Key: FLINK-4433
 URL: https://issues.apache.org/jira/browse/FLINK-4433
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


With the addition of continuous file monitoring, apart from the 
{{StreamSource}} also the {{ContinuousFileReaderOperator}} uses a 
{{SourceContext}}. Given this, all the implementations of the {{SourceContext}} 
should be removed from the {{StreamSource}} and become independent classes.

In addition, the {{AsyncExceptionChecker}} interface should be removed as its 
functionality can be replaced by the {{task.failExternally()}} method. This 
also implies slight changes in the source context implementations. 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-4494) Expose the TimeServiceProvider from the Task to each Operator.

2016-08-25 Thread Kostas Kloudas (JIRA)
Kostas Kloudas created FLINK-4494:
-

 Summary: Expose the TimeServiceProvider from the Task to each 
Operator.
 Key: FLINK-4494
 URL: https://issues.apache.org/jira/browse/FLINK-4494
 Project: Flink
  Issue Type: Bug
Reporter: Kostas Kloudas


This change aims at simplifying the {{StreamTask}} class by exposing directly 
the {{TimeServiceProvider}} to the operators being executed. 

This implies removing the {{registerTimer()}} and 
{{getCurrentProcessingTime()}} methods from the {{StreamTask}}. Now, to 
register a timer and query the time, each operator will be able to get the 
{{TimeServiceProvider}} and call the corresponding methods directly on it.

In addition, this will simplify many of the tests which now implement their own 
time providers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   3   >