Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-22 Thread Márton Balassi
Hi Austin,

Thank you for your thoughts. This is where this discussion stops. This
email thread already contains more characters than the implementation and
what is needed for the next 20 years of maintenance.

It is great that you have a view on modern solutions and thank you for
offering your help with brainstorming solutions. I am responsible for Flink
at Cloudera and we do need an implementation like this and it is in fact
already in production at dozens of customers. We are open to adapting that
to expose a more generic API (and keeping Kerberos to our fork), to
contribute this to the community as others have asked for it and to protect
ourselves from occasionally having to update this critical implementation
path based on changes in the Apache codebase. I have worked with close to a
hundred Big Data customers as a consultant and an engineering manager and
committed hundreds of changes to Apache Flink over the past decade, please
trust my judgement on a simple matter like this.

Please forgive me for referencing authority, this discussion was getting
out of hand. Please keep vigilant.

Best,
Marton

On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Hi Gabor + Marton,
>
> I don't believe that the issue with this proposal is the specific mechanism
> proposed (Kerberos), but rather that it is not the level to implement it at
> (Flink). I'm just one voice, so please take this with a grain of salt.
>
> In the other solutions previously noted there is no need to instrument
> Flink which, in addition to reducing the maintenance burden, provides a
> better, decoupled end result.
>
> IMO we should not add any new API in Flink for this use case. I think it is
> unfortunate and sympathize with the work that has already been done on this
> feature – perhaps we could brainstorm ways to run this alongside Flink in
> your setup. Again, I don't think the proposed solution of an agnostic API
> would not work, nor is it a bad idea, but is not one that will make Flink
> more compatible with the modern solutions to this problem.
>
> Best,
> Austin
>
> On Mon, Jun 21, 2021 at 2:18 PM Márton Balassi 
> wrote:
>
> > Hi team,
> >
> > Thank you for your input. Based on this discussion I agree with G that
> > selecting and standardizing on a specific strong authentication mechanism
> > is more challenging than the whole rest of the scope of this
> authentication
> > story. :-) I suggest that G and I go back to the drawing board and come
> up
> > with an API that can support multiple authentication mechanisms, and we
> > would only merge said API to Flink. Specific implementations of it can be
> > maintained outside of the project. This way we tackle the main challenge
> in
> > a truly minimal way.
> >
> > Best,
> > Marton
> >
> > On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi  >
> > wrote:
> >
> > > Hi All,
> > >
> > > We see that adding any kind of specific authentication raises more
> > > questions than answers.
> > > What would be if a generic API would be added without any real
> > > authentication logic?
> > > That way every provider can add its own protocol implementation as
> > > additional jar.
> > >
> > > BR,
> > > G
> > >
> > >
> > > On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <
> > > austin.caw...@gmail.com> wrote:
> > >
> > >> Hi all,
> > >>
> > >> Sorry to be joining the conversation late. I'm also on the side of
> > >> Konstantin, generally, in that this seems to not be a core goal of
> Flink
> > >> as
> > >> a project and adds a maintenance burden.
> > >>
> > >> Would another con of Kerberos be that is likely a fading project in
> > terms
> > >> of network security? (serious question, please correct me if there is
> > >> reason to believe it is gaining adoption)
> > >>
> > >> The point about Kerberos being independent of infrastructure is a good
> > one
> > >> but is something that is also solved by modern sidecar proxies +
> service
> > >> meshes that can run across Kubernetes and bare-metal. These solutions
> > also
> > >> handle certificate provisioning, rotation, etc. in addition to
> > >> higher-level
> > >> authorization policies. Some examples of projects with this "universal
> > >> infrastructure support" are Kuma[1] (CNCF Sandbox, I'm a maintainer)
> and
> > >> Istio[2] (Google).
> > >>
> > >> Wondering out loud: has anyone tried to run Flink on top of cilium[3],
> > >> which also provides zero-trust networking at the kernel level without
> > >> needing to instrument applications? This currently only runs on
> > Kubernetes
> > >> on Linux, so that's a major limitation, but solves many of the request
> > >> forging concerns at all levels.
> > >>
> > >> Thanks,
> > >> Austin
> > >>
> > >> [1]: https://kuma.io/docs/1.1.6/quickstart/universal/
> > >> [2]: https://istio.io/latest/docs/setup/install/virtual-machine/
> > >> [3]: https://cilium.io/
> > >>
> > >> On Thu, Jun 17, 2021 at 1:50 PM Till Rohrmann 
> > >> wrote:
> > >>
> > >> > I left some comments in

[jira] [Created] (FLINK-23081) Move Executors#directExecutorContext to AkkaFutureUtils

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23081:


 Summary: Move Executors#directExecutorContext to AkkaFutureUtils
 Key: FLINK-23081
 URL: https://issues.apache.org/jira/browse/FLINK-23081
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-22 Thread Piotr Nowojski
Thanks for volunteering.

A quick update about FLINK-23011. It turned out to be only an issue in the
master, so we don't need to block bug fix releases on this issue.

Best,
Piotrek

wt., 22 cze 2021 o 05:20 Xintong Song  napisał(a):

> Thanks Dawid for starting the discussion, and thanks Yun, Godfrey and
> Jingsong for volunteering as release managers.
>
>
> +1 for the releases, and +1 for the release managers.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, Jun 22, 2021 at 10:15 AM Jingsong Li 
> wrote:
>
> > +1 to the release.
> >
> > Thanks Dawid for driving this discussion.
> >
> > I am willing to volunteer as the release manager too.
> >
> > Best,
> > Jingsong
> >
> > On Tue, Jun 22, 2021 at 9:58 AM godfrey he  wrote:
> >
> > > Thanks for driving this, Dawid. +1 to the release.
> > > I would also like to volunteer as the release manager for these
> versions.
> > >
> > >
> > > Best
> > > Godfrey
> > >
> > > Seth Wiesman  于2021年6月22日周二 上午8:39写道:
> > >
> > > > +1 to the release.
> > > >
> > > > It would be great if we could get FLINK-23073 into 1.13.2. There's
> > > already
> > > > an open PR and it unblocks upgrading the table API walkthrough in
> > > > apache/flink-playgrounds to 1.13.
> > > >
> > > > Seth
> > > >
> > > > On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:
> > > >
> > > > > Hi Dawid,
> > > > >
> > > > > Thanks for driving this discussion, I am willing to volunteer as
> the
> > > > > release manager for these versions.
> > > > >
> > > > >
> > > > > Best
> > > > > Yun Tang
> > > > > 
> > > > > From: Konstantin Knauf 
> > > > > Sent: Friday, June 18, 2021 22:35
> > > > > To: dev 
> > > > > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> > > > >
> > > > > Hi Dawid,
> > > > >
> > > > > Thank you for starting the discussion. I'd like to add
> > > > > https://issues.apache.org/jira/browse/FLINK-23025 to the list for
> > > Flink
> > > > > 1.13.2.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Konstantin
> > > > >
> > > > > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz <
> > > dwysakow...@apache.org
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > Quite recently we pushed, in our opinion, quite an important
> fix[1]
> > > for
> > > > > > unaligned checkpoints which disables UC for broadcast
> partitioning.
> > > > > > Without the fix there might be some broadcast state corruption.
> > > > > > Therefore we think it would be beneficial to release it soonish.
> > What
> > > > do
> > > > > > you think? Do you have other issues in mind you'd like to have
> > > included
> > > > > > in these versions.
> > > > > >
> > > > > > Would someone be willing to volunteer to help with the releases
> as
> > a
> > > > > > release manager? I guess there is a couple of spots to fill in
> here
> > > ;)
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Dawid
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Konstantin Knauf
> > > > >
> > > > > https://twitter.com/snntrable
> > > > >
> > > > > https://github.com/knaufk
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


[DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Chesnay Schepler
Last week I spent some time looking into making flink-runtime scala 
free, which effectively means to move the Akka-reliant classes to 
another module, and load that module along with Akka and all of it's 
dependencies (including Scala) through a separate classloader.


This would finally decouple the Scala versions required by the runtime 
and API, and would allow us to upgrade Akka as we'd no longer be limited 
to Scala 2.11. It would rid the classpath of a few dependencies, and 
remove the need for scala suffixes on quite a few modules.


However, our Mesos support has unfortunately a hard dependency on Akka, 
which naturally does not play well with the goal of isolating Akka in 
it's own ClassLoader.


To solve this issue I was thinking of simple dropping flink-mesos in 
1.14 (it was deprecated in 1.13).


Truth be told, I picked this option because it is the easiest to do. We 
_could_ probably make things work somehow (likely by shipping a second 
Akka version just for flink-mesos), but it doesn't seem worth the hassle 
and would void some of the benefits. So far we kept flink-mesos around, 
despite not really developing it further, because it didn't hurt to have 
it in still in Flink, but this has now changed.


Please tell me what you think.



[jira] [Created] (FLINK-23082) Split AkkaRpcServiceUtils

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23082:


 Summary: Split AkkaRpcServiceUtils
 Key: FLINK-23082
 URL: https://issues.apache.org/jira/browse/FLINK-23082
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The AkkaRpcServiceUtils should be split because it contains some generic 
methods (create(Random/Wildcard)name) that are not dependent on Akka.
The remainder of the AkkaRpcServiceUtils can then be moved into a separate 
module later on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23083) Isolate RpcConnectionTest from runtime implementations

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23083:


 Summary: Isolate RpcConnectionTest from runtime implementations
 Key: FLINK-23083
 URL: https://issues.apache.org/jira/browse/FLINK-23083
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination, Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The RpcConnectionTest relies a lot on Akka, but at the same time has a 
dependency on the TaskExecutorGateway.
We should replace the gateway usage with some dummy interface, such that this 
test can be moved in the future to a new akka-dependent module.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23084) Move ExecutorThreadFactory to flink-core

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23084:


 Summary: Move ExecutorThreadFactory to flink-core
 Key: FLINK-23084
 URL: https://issues.apache.org/jira/browse/FLINK-23084
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The ExecutorThreadFactory is both required by the Akka RPC service and 
flink-runtime, and in general could also be useful for other components not 
relying on flink-runtime (e.g., connectors).
We should move it to flink-runtime.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23085) Consolidate FutureUtils to flink-core

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23085:


 Summary: Consolidate FutureUtils to flink-core
 Key: FLINK-23085
 URL: https://issues.apache.org/jira/browse/FLINK-23085
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


The FutureUtils are both required by the Akka RPC service and flink-runtime, 
and in general could also be useful for other components not relying on 
flink-runtime.

We should move it to flink-core. Additionally, the FutureUtils class from 
flink-queryable-state-client should be moved as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23086) Add getRuntimeExecutionMode to StreamExecutionEnvironment

2021-06-22 Thread Shuo Cheng (Jira)
Shuo Cheng created FLINK-23086:
--

 Summary: Add getRuntimeExecutionMode to StreamExecutionEnvironment
 Key: FLINK-23086
 URL: https://issues.apache.org/jira/browse/FLINK-23086
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Affects Versions: 1.13.0
Reporter: Shuo Cheng
 Fix For: 1.14.0


`RuntimeExecutionMode` is used when creating a `StreamGraphGenerator` inside 
`StreamExecutionEnvironment`, however, when we want to create a 
`StreamGraphGenerator` outside `StreamExecutionEnvironment` (like 
ExecutorUtils#generateStreamGraph),  `getRuntimeExecutionMode` is needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23087) AddressResolution should be a top-level class

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23087:


 Summary: AddressResolution should be a top-level class
 Key: FLINK-23087
 URL: https://issues.apache.org/jira/browse/FLINK-23087
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


Make AddressResolution a top-level class so that it can be moved more easily to 
a separate module, because it is required both by the Akka RPC service and 
flink-runtime.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23088) Add flink-rpc-core module

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23088:


 Summary: Add flink-rpc-core module
 Key: FLINK-23088
 URL: https://issues.apache.org/jira/browse/FLINK-23088
 Project: Flink
  Issue Type: Sub-task
  Components: Build System, Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


Add a new module containing the fundamental interfaces required for 
implementing an RPC service.

Flink-runtime will depend on this module, as will the new akka-specific rpc 
module.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23089) Add flink-rpc-akka module

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23089:


 Summary: Add flink-rpc-akka module
 Key: FLINK-23089
 URL: https://issues.apache.org/jira/browse/FLINK-23089
 Project: Flink
  Issue Type: Sub-task
  Components: Build System, Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


Add a new module containing all Akka-reliant classes.

As a first step we will just move classes, dependencies and build stuff to this 
module.
Loading this module through a separate classloader will be handled in a 
follow-up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23090) Add RpcSystem abstraction

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23090:


 Summary: Add RpcSystem abstraction
 Key: FLINK-23090
 URL: https://issues.apache.org/jira/browse/FLINK-23090
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


Flink-runtime code does not only interact with akka-reliant code through the 
RpcService interface, but also through various static helper methods.

A new RpcSystem interface shall be added that serves both as a factory for 
RpcService and a facade for these utilities.
Through this interface we will later load the AkkaRpcSystem as a plugin.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23091) Bump Akka to 2.6

2021-06-22 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23091:


 Summary: Bump Akka to 2.6
 Key: FLINK-23091
 URL: https://issues.apache.org/jira/browse/FLINK-23091
 Project: Flink
  Issue Type: Sub-task
  Components: Build System, Runtime / Coordination
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


When loading akka 2.5 through a separate classloader a leak occurs because akka 
2.5 contains a custom ForkJoinPool implementation with a ThreadLocal, which 
results in a ClassLoaderLeak if any outside thread touches this implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23092) Fix built-in functions are unsupported in Python Group Window UDAF

2021-06-22 Thread Huang Xingbo (Jira)
Huang Xingbo created FLINK-23092:


 Summary: Fix built-in functions are unsupported in Python Group 
Window UDAF
 Key: FLINK-23092
 URL: https://issues.apache.org/jira/browse/FLINK-23092
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.1
Reporter: Huang Xingbo
Assignee: Huang Xingbo
 Fix For: 1.13.2






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread Xintong Song
Hi everyone,

In the past a couple of weeks, I've observed several times that PRs are
merged without a green light from the CI tests, where failure cases are
considered *unrelated*. This may not always cause problems, but would
increase the chance of breaking our code base. In fact, it has occurred to
me twice in the past few weeks that I had to revert a commit which breaks
the master branch due to this.

I think it would be nicer to enforce a stricter rule, that no PRs should be
merged without passing CI.

The problems of merging PRs with "unrelated" test failures are:
- It's not always straightforward to tell whether a test failures are
related or not.
- It prevents subsequent test cases from being executed, which may fail
relating to the PR changes.

To make things easier for the committers, the following exceptions might be
considered acceptable.
- The PR has passed CI in the contributor's personal workspace. Please post
the link in such cases.
- The CI tests have been triggered multiple times, on the same commit, and
each stage has at least passed for once. Please also comment in such cases.

If we all agree on this, I'd update the community guidelines for merging
PRs wrt. this proposal. [1]

Please let me know what do you think.

Thank you~

Xintong Song


[1] https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests


Re: [DISCUSS] FLIP-171: Async Sink

2021-06-22 Thread Hausmann, Steffen
Hey,

Agreed on starting with a blocking `write`. I've adapted the FLIP accordingly.

For now I've chosen to add the `InterruptedException` to the `write` method 
signature as I'm not fully understanding the implications of swallowing the 
exception. Depending on the details of  the code that is calling the write 
method, it may cause event loss. But this seems more of an implementation 
detail, that we can revisit once we are actually implementing the sink.

Unless there are additional comments, does it make sense to start the voting 
process in the next day or two?

Cheers, Steffen


On 21.06.21, 14:51, "Piotr Nowojski"  wrote:

CAUTION: This email originated from outside of the organization. Do not 
click links or open attachments unless you can confirm the sender and know the 
content is safe.



Hi,

Thanks Steffen for the explanations. I think it makes sense to me.

Re Arvid/Steffen:

- Keep in mind that even if we choose to provide a non blocking API using
the `isAvailable()`/`getAvailableFuture()` method, we would still need to
support blocking inside the sinks. For example at the very least, emitting
many records at once (`flatMap`) or firing timers are scenarios when output
availability would be ignored at the moment by the runtime. Also I would
imagine writing very large (like 1GB) records would be blocking on
something as well.
- Secondly, exposing availability to the API level might not be that
easy/trivial. The availability pattern as defined in `AvailabilityProvider`
class is quite complicated and not that easy to implement by a user.

Both of those combined with lack of a clear motivation for adding
`AvailabilityProvider` to the sinks/operators/functions,  I would vote on
just starting with blocking `write` calls. This can always be extended in
the future with availability if needed/motivated properly.

That would be aligned with either Arvid's option 1 or 2. I don't know what
are the best practices with `InterruptedException`, but I'm always afraid
of it, so I would feel personally safer with option 2.

I'm not sure what problem option 3 is helping to solve? Adding `wakeUp()`
would sound strange to me.

Best,
Piotrek

pon., 21 cze 2021 o 12:15 Arvid Heise  napisał(a):

> Hi Piotr,
>
> to pick up this discussion thread again:
> - This FLIP is about providing some base implementation for FLIP-143 sinks
> that make adding new implementations easier, similar to the
> SourceReaderBase.
> - The whole availability topic will most likely be a separate FLIP. The
> basic issue just popped up here because we currently have no way to signal
> backpressure in sinks except by blocking `write`. This feels quite natural
> in sinks with sync communication but quite unnatural in async sinks.
>
> Now we have a couple of options. In all cases, we would have some WIP
> limit on the number of records/requests being able to be processed in
> parallel asynchronously (similar to asyncIO).
> 1. We use some blocking queue in `write`, then we need to handle
> interruptions. In the easiest case, we extend `write` to throw the
> `InterruptedException`, which is a small API change.
> 2. We use a blocking queue, but handle interrupts and swallow/translate
> them. No API change.
> Both solutions block the task thread, so any RPC message / unaligned
> checkpoint would be processed only after the backpressure is temporarily
> lifted. That's similar to the discussions that you linked. Cancellation 
may
> also be a tad harder on 2.
> 3. We could also add some `wakeUp` to the `SinkWriter` similar to
> `SplitFetcher` [1]. Basically, you use a normal queue with a completeable
> future on which you block. Wakeup would be a clean way to complete it next
> to the natural completion through finished requests.
> 4. We add availability to the sink. However, this API change also requires
> that we allow operators to be available so it may be a bigger change with
> undesired side-effects. On the other hand, we could also use the same
> mechanism for asyncIO.
>
> For users of FLIP-171, none of the options are exposed. So we could also
> start with a simple solution (add `InterruptedException`) and later try to
> add availability. Option 1+2 would also not require an additional FLIP; we
> could add it as part of this FLIP.
>
> Best,
>
> Arvid
>
> [1]
> 
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/reader/fetcher/SplitFetcher.java#L258-L258
> On Thu, Jun 10, 2021 at 10:09 AM Hausmann, Steffen
>  wrote:
>
>> Hey Piotrek,
>>
>> Thanks for your comments on the FLIP. I'll address your second question
>> first, as I think it's more central to this FLIP. Jus

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-22 Thread Konstantin Knauf
Hi everyone,

First, Marton and I had a brief conversation yesterday offline and
discussed exploring the approach of exposing the authentication
functionality via an API. So, I am looking forward to your proposal in that
direction. The benefit of such a solution would be that it is extensible
for others and it does add a smaller maintenance (in particular testing)
footprint to Apache Flink itself. If we end up going down this route,
flink-packages.org would be a great way to promote these third party
"authentication modules".

Second, Marton, I understand your frustration about the long discussion on
this "simple matter", but the condescending tone of your last mail feels
uncalled for to me. Austin expressed a valid opinion on the topic, which is
based on his experience from other Open Source frameworks (CNCF mostly). I
am sure you agree that it is important for Apache Flink to stay open and to
consider different approaches and ideas and I don't think it helps the
culture of discussion to shoot it down like this ("This is where this
discussion stops.").

Let's continue to move this discussion forward and I am sure we'll find a
consensus based on product and technological considerations.

Thanks,

Konstantin

On Tue, Jun 22, 2021 at 9:31 AM Márton Balassi 
wrote:

> Hi Austin,
>
> Thank you for your thoughts. This is where this discussion stops. This
> email thread already contains more characters than the implementation and
> what is needed for the next 20 years of maintenance.
>
> It is great that you have a view on modern solutions and thank you for
> offering your help with brainstorming solutions. I am responsible for Flink
> at Cloudera and we do need an implementation like this and it is in fact
> already in production at dozens of customers. We are open to adapting that
> to expose a more generic API (and keeping Kerberos to our fork), to
> contribute this to the community as others have asked for it and to protect
> ourselves from occasionally having to update this critical implementation
> path based on changes in the Apache codebase. I have worked with close to a
> hundred Big Data customers as a consultant and an engineering manager and
> committed hundreds of changes to Apache Flink over the past decade, please
> trust my judgement on a simple matter like this.
>
> Please forgive me for referencing authority, this discussion was getting
> out of hand. Please keep vigilant.
>
> Best,
> Marton
>
> On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
> > Hi Gabor + Marton,
> >
> > I don't believe that the issue with this proposal is the specific
> mechanism
> > proposed (Kerberos), but rather that it is not the level to implement it
> at
> > (Flink). I'm just one voice, so please take this with a grain of salt.
> >
> > In the other solutions previously noted there is no need to instrument
> > Flink which, in addition to reducing the maintenance burden, provides a
> > better, decoupled end result.
> >
> > IMO we should not add any new API in Flink for this use case. I think it
> is
> > unfortunate and sympathize with the work that has already been done on
> this
> > feature – perhaps we could brainstorm ways to run this alongside Flink in
> > your setup. Again, I don't think the proposed solution of an agnostic API
> > would not work, nor is it a bad idea, but is not one that will make Flink
> > more compatible with the modern solutions to this problem.
> >
> > Best,
> > Austin
> >
> > On Mon, Jun 21, 2021 at 2:18 PM Márton Balassi  >
> > wrote:
> >
> > > Hi team,
> > >
> > > Thank you for your input. Based on this discussion I agree with G that
> > > selecting and standardizing on a specific strong authentication
> mechanism
> > > is more challenging than the whole rest of the scope of this
> > authentication
> > > story. :-) I suggest that G and I go back to the drawing board and come
> > up
> > > with an API that can support multiple authentication mechanisms, and we
> > > would only merge said API to Flink. Specific implementations of it can
> be
> > > maintained outside of the project. This way we tackle the main
> challenge
> > in
> > > a truly minimal way.
> > >
> > > Best,
> > > Marton
> > >
> > > On Mon, Jun 21, 2021 at 4:18 PM Gabor Somogyi <
> gabor.g.somo...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi All,
> > > >
> > > > We see that adding any kind of specific authentication raises more
> > > > questions than answers.
> > > > What would be if a generic API would be added without any real
> > > > authentication logic?
> > > > That way every provider can add its own protocol implementation as
> > > > additional jar.
> > > >
> > > > BR,
> > > > G
> > > >
> > > >
> > > > On Thu, Jun 17, 2021 at 7:53 PM Austin Cawley-Edwards <
> > > > austin.caw...@gmail.com> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> Sorry to be joining the conversation late. I'm also on the side of
> > > >> Konstantin, generally, in that this seems to not be a core goal of

Re: [DISCUSS] FLIP-171: Async Sink

2021-06-22 Thread Till Rohrmann
Adding the InterruptException to the write method would make it explicit
that the write call can block but must react to interruptions (e.g. when
Flink wants to cancel the operation). I think this makes the contract a bit
clearer.

I think starting simple and then extending the API as we see the need is a
good idea.

Cheers,
Till

On Tue, Jun 22, 2021 at 11:20 AM Hausmann, Steffen 
wrote:

> Hey,
>
> Agreed on starting with a blocking `write`. I've adapted the FLIP
> accordingly.
>
> For now I've chosen to add the `InterruptedException` to the `write`
> method signature as I'm not fully understanding the implications of
> swallowing the exception. Depending on the details of  the code that is
> calling the write method, it may cause event loss. But this seems more of
> an implementation detail, that we can revisit once we are actually
> implementing the sink.
>
> Unless there are additional comments, does it make sense to start the
> voting process in the next day or two?
>
> Cheers, Steffen
>
>
> On 21.06.21, 14:51, "Piotr Nowojski"  wrote:
>
> CAUTION: This email originated from outside of the organization. Do
> not click links or open attachments unless you can confirm the sender and
> know the content is safe.
>
>
>
> Hi,
>
> Thanks Steffen for the explanations. I think it makes sense to me.
>
> Re Arvid/Steffen:
>
> - Keep in mind that even if we choose to provide a non blocking API
> using
> the `isAvailable()`/`getAvailableFuture()` method, we would still need
> to
> support blocking inside the sinks. For example at the very least,
> emitting
> many records at once (`flatMap`) or firing timers are scenarios when
> output
> availability would be ignored at the moment by the runtime. Also I
> would
> imagine writing very large (like 1GB) records would be blocking on
> something as well.
> - Secondly, exposing availability to the API level might not be that
> easy/trivial. The availability pattern as defined in
> `AvailabilityProvider`
> class is quite complicated and not that easy to implement by a user.
>
> Both of those combined with lack of a clear motivation for adding
> `AvailabilityProvider` to the sinks/operators/functions,  I would vote
> on
> just starting with blocking `write` calls. This can always be extended
> in
> the future with availability if needed/motivated properly.
>
> That would be aligned with either Arvid's option 1 or 2. I don't know
> what
> are the best practices with `InterruptedException`, but I'm always
> afraid
> of it, so I would feel personally safer with option 2.
>
> I'm not sure what problem option 3 is helping to solve? Adding
> `wakeUp()`
> would sound strange to me.
>
> Best,
> Piotrek
>
> pon., 21 cze 2021 o 12:15 Arvid Heise  napisał(a):
>
> > Hi Piotr,
> >
> > to pick up this discussion thread again:
> > - This FLIP is about providing some base implementation for FLIP-143
> sinks
> > that make adding new implementations easier, similar to the
> > SourceReaderBase.
> > - The whole availability topic will most likely be a separate FLIP.
> The
> > basic issue just popped up here because we currently have no way to
> signal
> > backpressure in sinks except by blocking `write`. This feels quite
> natural
> > in sinks with sync communication but quite unnatural in async sinks.
> >
> > Now we have a couple of options. In all cases, we would have some WIP
> > limit on the number of records/requests being able to be processed in
> > parallel asynchronously (similar to asyncIO).
> > 1. We use some blocking queue in `write`, then we need to handle
> > interruptions. In the easiest case, we extend `write` to throw the
> > `InterruptedException`, which is a small API change.
> > 2. We use a blocking queue, but handle interrupts and
> swallow/translate
> > them. No API change.
> > Both solutions block the task thread, so any RPC message / unaligned
> > checkpoint would be processed only after the backpressure is
> temporarily
> > lifted. That's similar to the discussions that you linked.
> Cancellation may
> > also be a tad harder on 2.
> > 3. We could also add some `wakeUp` to the `SinkWriter` similar to
> > `SplitFetcher` [1]. Basically, you use a normal queue with a
> completeable
> > future on which you block. Wakeup would be a clean way to complete
> it next
> > to the natural completion through finished requests.
> > 4. We add availability to the sink. However, this API change also
> requires
> > that we allow operators to be available so it may be a bigger change
> with
> > undesired side-effects. On the other hand, we could also use the same
> > mechanism for asyncIO.
> >
> > For users of FLIP-171, none of the options are exposed. So we could
> also
> > start with a simple solution (add `InterruptedException`) and later
> try to

退订

2021-06-22 Thread lizikunn
退订


| |
lizikunn
|
|
lizik...@163.com
|
签名由网易邮箱大师定制



Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Robert Metzger
I'm adding Oleksandr from Criteo to this thread, they've mentioned in the
past that they are using Flink on Mesos (IIRC the only company that seemed
to be doing so)

I'm personally fine with dropping Mesos support.

On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler 
wrote:

> Last week I spent some time looking into making flink-runtime scala
> free, which effectively means to move the Akka-reliant classes to
> another module, and load that module along with Akka and all of it's
> dependencies (including Scala) through a separate classloader.
>
> This would finally decouple the Scala versions required by the runtime
> and API, and would allow us to upgrade Akka as we'd no longer be limited
> to Scala 2.11. It would rid the classpath of a few dependencies, and
> remove the need for scala suffixes on quite a few modules.
>
> However, our Mesos support has unfortunately a hard dependency on Akka,
> which naturally does not play well with the goal of isolating Akka in
> it's own ClassLoader.
>
> To solve this issue I was thinking of simple dropping flink-mesos in
> 1.14 (it was deprecated in 1.13).
>
> Truth be told, I picked this option because it is the easiest to do. We
> _could_ probably make things work somehow (likely by shipping a second
> Akka version just for flink-mesos), but it doesn't seem worth the hassle
> and would void some of the benefits. So far we kept flink-mesos around,
> despite not really developing it further, because it didn't hurt to have
> it in still in Flink, but this has now changed.
>
> Please tell me what you think.
>
>


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Till Rohrmann
I'd be ok with dropping support for Mesos if it helps us to clear our
dependencies in the flink-runtime module. If we do it, then we should
probably update our documentation with a pointer to the latest Flink
version that supports Mesos in case of users strictly need Mesos.

Cheers,
Till

On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler 
wrote:

> Last week I spent some time looking into making flink-runtime scala
> free, which effectively means to move the Akka-reliant classes to
> another module, and load that module along with Akka and all of it's
> dependencies (including Scala) through a separate classloader.
>
> This would finally decouple the Scala versions required by the runtime
> and API, and would allow us to upgrade Akka as we'd no longer be limited
> to Scala 2.11. It would rid the classpath of a few dependencies, and
> remove the need for scala suffixes on quite a few modules.
>
> However, our Mesos support has unfortunately a hard dependency on Akka,
> which naturally does not play well with the goal of isolating Akka in
> it's own ClassLoader.
>
> To solve this issue I was thinking of simple dropping flink-mesos in
> 1.14 (it was deprecated in 1.13).
>
> Truth be told, I picked this option because it is the easiest to do. We
> _could_ probably make things work somehow (likely by shipping a second
> Akka version just for flink-mesos), but it doesn't seem worth the hassle
> and would void some of the benefits. So far we kept flink-mesos around,
> despite not really developing it further, because it didn't hurt to have
> it in still in Flink, but this has now changed.
>
> Please tell me what you think.
>
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread Till Rohrmann
Thanks for bringing this topic to our attention Xintong. I think your
proposal makes a lot of sense and we should follow it. It will give us
confidence that our changes are working and it might be a good incentive to
quickly fix build instabilities. Hence, +1.

Cheers,
Till

On Tue, Jun 22, 2021 at 11:12 AM Xintong Song  wrote:

> Hi everyone,
>
> In the past a couple of weeks, I've observed several times that PRs are
> merged without a green light from the CI tests, where failure cases are
> considered *unrelated*. This may not always cause problems, but would
> increase the chance of breaking our code base. In fact, it has occurred to
> me twice in the past few weeks that I had to revert a commit which breaks
> the master branch due to this.
>
> I think it would be nicer to enforce a stricter rule, that no PRs should be
> merged without passing CI.
>
> The problems of merging PRs with "unrelated" test failures are:
> - It's not always straightforward to tell whether a test failures are
> related or not.
> - It prevents subsequent test cases from being executed, which may fail
> relating to the PR changes.
>
> To make things easier for the committers, the following exceptions might be
> considered acceptable.
> - The PR has passed CI in the contributor's personal workspace. Please post
> the link in such cases.
> - The CI tests have been triggered multiple times, on the same commit, and
> each stage has at least passed for once. Please also comment in such cases.
>
> If we all agree on this, I'd update the community guidelines for merging
> PRs wrt. this proposal. [1]
>
> Please let me know what do you think.
>
> Thank you~
>
> Xintong Song
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
>


Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-22 Thread Till Rohrmann
Thanks a lot, Yun, Godfrey, and Jingsong for being our release managers!
Creating these bug-fix releases will be super helpful for our users.

Cheers,
Till

On Tue, Jun 22, 2021 at 9:59 AM Piotr Nowojski  wrote:

> Thanks for volunteering.
>
> A quick update about FLINK-23011. It turned out to be only an issue in the
> master, so we don't need to block bug fix releases on this issue.
>
> Best,
> Piotrek
>
> wt., 22 cze 2021 o 05:20 Xintong Song  napisał(a):
>
> > Thanks Dawid for starting the discussion, and thanks Yun, Godfrey and
> > Jingsong for volunteering as release managers.
> >
> >
> > +1 for the releases, and +1 for the release managers.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Tue, Jun 22, 2021 at 10:15 AM Jingsong Li 
> > wrote:
> >
> > > +1 to the release.
> > >
> > > Thanks Dawid for driving this discussion.
> > >
> > > I am willing to volunteer as the release manager too.
> > >
> > > Best,
> > > Jingsong
> > >
> > > On Tue, Jun 22, 2021 at 9:58 AM godfrey he 
> wrote:
> > >
> > > > Thanks for driving this, Dawid. +1 to the release.
> > > > I would also like to volunteer as the release manager for these
> > versions.
> > > >
> > > >
> > > > Best
> > > > Godfrey
> > > >
> > > > Seth Wiesman  于2021年6月22日周二 上午8:39写道:
> > > >
> > > > > +1 to the release.
> > > > >
> > > > > It would be great if we could get FLINK-23073 into 1.13.2. There's
> > > > already
> > > > > an open PR and it unblocks upgrading the table API walkthrough in
> > > > > apache/flink-playgrounds to 1.13.
> > > > >
> > > > > Seth
> > > > >
> > > > > On Mon, Jun 21, 2021 at 6:28 AM Yun Tang  wrote:
> > > > >
> > > > > > Hi Dawid,
> > > > > >
> > > > > > Thanks for driving this discussion, I am willing to volunteer as
> > the
> > > > > > release manager for these versions.
> > > > > >
> > > > > >
> > > > > > Best
> > > > > > Yun Tang
> > > > > > 
> > > > > > From: Konstantin Knauf 
> > > > > > Sent: Friday, June 18, 2021 22:35
> > > > > > To: dev 
> > > > > > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> > > > > >
> > > > > > Hi Dawid,
> > > > > >
> > > > > > Thank you for starting the discussion. I'd like to add
> > > > > > https://issues.apache.org/jira/browse/FLINK-23025 to the list
> for
> > > > Flink
> > > > > > 1.13.2.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Konstantin
> > > > > >
> > > > > > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz <
> > > > dwysakow...@apache.org
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > Quite recently we pushed, in our opinion, quite an important
> > fix[1]
> > > > for
> > > > > > > unaligned checkpoints which disables UC for broadcast
> > partitioning.
> > > > > > > Without the fix there might be some broadcast state corruption.
> > > > > > > Therefore we think it would be beneficial to release it
> soonish.
> > > What
> > > > > do
> > > > > > > you think? Do you have other issues in mind you'd like to have
> > > > included
> > > > > > > in these versions.
> > > > > > >
> > > > > > > Would someone be willing to volunteer to help with the releases
> > as
> > > a
> > > > > > > release manager? I guess there is a couple of spots to fill in
> > here
> > > > ;)
> > > > > > >
> > > > > > > Best,
> > > > > > >
> > > > > > > Dawid
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Konstantin Knauf
> > > > > >
> > > > > > https://twitter.com/snntrable
> > > > > >
> > > > > > https://github.com/knaufk
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread Arvid Heise
I think this is overall a good idea. So +1 from my side.
However, I'd like to put a higher priority on infrastructure then, in
particular docker image/artifact caches.

On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann  wrote:

> Thanks for bringing this topic to our attention Xintong. I think your
> proposal makes a lot of sense and we should follow it. It will give us
> confidence that our changes are working and it might be a good incentive to
> quickly fix build instabilities. Hence, +1.
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 11:12 AM Xintong Song 
> wrote:
>
> > Hi everyone,
> >
> > In the past a couple of weeks, I've observed several times that PRs are
> > merged without a green light from the CI tests, where failure cases are
> > considered *unrelated*. This may not always cause problems, but would
> > increase the chance of breaking our code base. In fact, it has occurred
> to
> > me twice in the past few weeks that I had to revert a commit which breaks
> > the master branch due to this.
> >
> > I think it would be nicer to enforce a stricter rule, that no PRs should
> be
> > merged without passing CI.
> >
> > The problems of merging PRs with "unrelated" test failures are:
> > - It's not always straightforward to tell whether a test failures are
> > related or not.
> > - It prevents subsequent test cases from being executed, which may fail
> > relating to the PR changes.
> >
> > To make things easier for the committers, the following exceptions might
> be
> > considered acceptable.
> > - The PR has passed CI in the contributor's personal workspace. Please
> post
> > the link in such cases.
> > - The CI tests have been triggered multiple times, on the same commit,
> and
> > each stage has at least passed for once. Please also comment in such
> cases.
> >
> > If we all agree on this, I'd update the community guidelines for merging
> > PRs wrt. this proposal. [1]
> >
> > Please let me know what do you think.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> >
>


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Arvid Heise
+1 for dropping. Frankly speaking, I don't see it having any future (and D2iQ
agrees).

If there is a surprisingly huge demand, I'd try to evaluate plugins for it.

On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann  wrote:

> I'd be ok with dropping support for Mesos if it helps us to clear our
> dependencies in the flink-runtime module. If we do it, then we should
> probably update our documentation with a pointer to the latest Flink
> version that supports Mesos in case of users strictly need Mesos.
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler 
> wrote:
>
> > Last week I spent some time looking into making flink-runtime scala
> > free, which effectively means to move the Akka-reliant classes to
> > another module, and load that module along with Akka and all of it's
> > dependencies (including Scala) through a separate classloader.
> >
> > This would finally decouple the Scala versions required by the runtime
> > and API, and would allow us to upgrade Akka as we'd no longer be limited
> > to Scala 2.11. It would rid the classpath of a few dependencies, and
> > remove the need for scala suffixes on quite a few modules.
> >
> > However, our Mesos support has unfortunately a hard dependency on Akka,
> > which naturally does not play well with the goal of isolating Akka in
> > it's own ClassLoader.
> >
> > To solve this issue I was thinking of simple dropping flink-mesos in
> > 1.14 (it was deprecated in 1.13).
> >
> > Truth be told, I picked this option because it is the easiest to do. We
> > _could_ probably make things work somehow (likely by shipping a second
> > Akka version just for flink-mesos), but it doesn't seem worth the hassle
> > and would void some of the benefits. So far we kept flink-mesos around,
> > despite not really developing it further, because it didn't hurt to have
> > it in still in Flink, but this has now changed.
> >
> > Please tell me what you think.
> >
> >
>


Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-22 Thread Till Rohrmann
Hi everyone,

I do like the idea of keeping the actual change outside of Flink but to
enable Flink to support such a use case (different authentication
mechanisms). I think this is a good compromise for the community that
combines long-term maintainability with support for new use-cases. I am
looking forward to your proposal.

I also want to second Konstantin here that the tone of your last email,
Marton, does not reflect the values and manners of the Flink community and
is not representative of how we conduct discussions. Especially, the more
senior community members should know this and act accordingly in order to
be good role models for others in the community. Technical discussions
should not be decided by who wields presumably the greatest authority but
by the soundness of arguments and by what is the best solution for a
problem.

Let us now try to find the best solution for the problem at hand!

Cheers,
Till

On Tue, Jun 22, 2021 at 11:24 AM Konstantin Knauf  wrote:

> Hi everyone,
>
> First, Marton and I had a brief conversation yesterday offline and
> discussed exploring the approach of exposing the authentication
> functionality via an API. So, I am looking forward to your proposal in that
> direction. The benefit of such a solution would be that it is extensible
> for others and it does add a smaller maintenance (in particular testing)
> footprint to Apache Flink itself. If we end up going down this route,
> flink-packages.org would be a great way to promote these third party
> "authentication modules".
>
> Second, Marton, I understand your frustration about the long discussion on
> this "simple matter", but the condescending tone of your last mail feels
> uncalled for to me. Austin expressed a valid opinion on the topic, which is
> based on his experience from other Open Source frameworks (CNCF mostly). I
> am sure you agree that it is important for Apache Flink to stay open and to
> consider different approaches and ideas and I don't think it helps the
> culture of discussion to shoot it down like this ("This is where this
> discussion stops.").
>
> Let's continue to move this discussion forward and I am sure we'll find a
> consensus based on product and technological considerations.
>
> Thanks,
>
> Konstantin
>
> On Tue, Jun 22, 2021 at 9:31 AM Márton Balassi 
> wrote:
>
> > Hi Austin,
> >
> > Thank you for your thoughts. This is where this discussion stops. This
> > email thread already contains more characters than the implementation and
> > what is needed for the next 20 years of maintenance.
> >
> > It is great that you have a view on modern solutions and thank you for
> > offering your help with brainstorming solutions. I am responsible for
> Flink
> > at Cloudera and we do need an implementation like this and it is in fact
> > already in production at dozens of customers. We are open to adapting
> that
> > to expose a more generic API (and keeping Kerberos to our fork), to
> > contribute this to the community as others have asked for it and to
> protect
> > ourselves from occasionally having to update this critical implementation
> > path based on changes in the Apache codebase. I have worked with close
> to a
> > hundred Big Data customers as a consultant and an engineering manager and
> > committed hundreds of changes to Apache Flink over the past decade,
> please
> > trust my judgement on a simple matter like this.
> >
> > Please forgive me for referencing authority, this discussion was getting
> > out of hand. Please keep vigilant.
> >
> > Best,
> > Marton
> >
> > On Mon, Jun 21, 2021 at 10:50 PM Austin Cawley-Edwards <
> > austin.caw...@gmail.com> wrote:
> >
> > > Hi Gabor + Marton,
> > >
> > > I don't believe that the issue with this proposal is the specific
> > mechanism
> > > proposed (Kerberos), but rather that it is not the level to implement
> it
> > at
> > > (Flink). I'm just one voice, so please take this with a grain of salt.
> > >
> > > In the other solutions previously noted there is no need to instrument
> > > Flink which, in addition to reducing the maintenance burden, provides a
> > > better, decoupled end result.
> > >
> > > IMO we should not add any new API in Flink for this use case. I think
> it
> > is
> > > unfortunate and sympathize with the work that has already been done on
> > this
> > > feature – perhaps we could brainstorm ways to run this alongside Flink
> in
> > > your setup. Again, I don't think the proposed solution of an agnostic
> API
> > > would not work, nor is it a bad idea, but is not one that will make
> Flink
> > > more compatible with the modern solutions to this problem.
> > >
> > > Best,
> > > Austin
> > >
> > > On Mon, Jun 21, 2021 at 2:18 PM Márton Balassi <
> balassi.mar...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi team,
> > > >
> > > > Thank you for your input. Based on this discussion I agree with G
> that
> > > > selecting and standardizing on a specific strong authentication
> > mechanism
> > > > is more challenging than the whole r

Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread Till Rohrmann
One way to address the problem of regularly failing tests that block
merging of PRs is to disable the respective tests for the time being. Of
course, the failing test then needs to be fixed. But at least that way we
would not block everyone from making progress.

Cheers,
Till

On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise  wrote:

> I think this is overall a good idea. So +1 from my side.
> However, I'd like to put a higher priority on infrastructure then, in
> particular docker image/artifact caches.
>
> On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann 
> wrote:
>
> > Thanks for bringing this topic to our attention Xintong. I think your
> > proposal makes a lot of sense and we should follow it. It will give us
> > confidence that our changes are working and it might be a good incentive
> to
> > quickly fix build instabilities. Hence, +1.
> >
> > Cheers,
> > Till
> >
> > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song 
> > wrote:
> >
> > > Hi everyone,
> > >
> > > In the past a couple of weeks, I've observed several times that PRs are
> > > merged without a green light from the CI tests, where failure cases are
> > > considered *unrelated*. This may not always cause problems, but would
> > > increase the chance of breaking our code base. In fact, it has occurred
> > to
> > > me twice in the past few weeks that I had to revert a commit which
> breaks
> > > the master branch due to this.
> > >
> > > I think it would be nicer to enforce a stricter rule, that no PRs
> should
> > be
> > > merged without passing CI.
> > >
> > > The problems of merging PRs with "unrelated" test failures are:
> > > - It's not always straightforward to tell whether a test failures are
> > > related or not.
> > > - It prevents subsequent test cases from being executed, which may fail
> > > relating to the PR changes.
> > >
> > > To make things easier for the committers, the following exceptions
> might
> > be
> > > considered acceptable.
> > > - The PR has passed CI in the contributor's personal workspace. Please
> > post
> > > the link in such cases.
> > > - The CI tests have been triggered multiple times, on the same commit,
> > and
> > > each stage has at least passed for once. Please also comment in such
> > cases.
> > >
> > > If we all agree on this, I'd update the community guidelines for
> merging
> > > PRs wrt. this proposal. [1]
> > >
> > > Please let me know what do you think.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > > [1]
> > >
> https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > >
> >
>


[jira] [Created] (FLINK-23093) Limit number of I/O pool and Future threads in Mini Cluster

2021-06-22 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-23093:


 Summary: Limit number of I/O pool and Future threads in Mini 
Cluster
 Key: FLINK-23093
 URL: https://issues.apache.org/jira/browse/FLINK-23093
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.13.0
Reporter: Stephan Ewen
 Fix For: 1.14.0


When running tests on CI via the minicluster, the mini cluster typically spawns 
100s of I/O threads, both in the MiniCluster I/O pool and in the TM I/O pool.

The standard rule for the maximum pool size is 4*num-cores, but the number of 
cores can be fairly large these days. Various Java versions also mess up core 
counting when running in containers (JVM container might have been given 2 
cores as resource limits, but the JVM counts the system as a whole, like 64/128 
cores).

This is both a nuisance for debugging, and a big waste of memory (each thread 
takes by default around 1MB when spawned, so the test JVM wastes 100s of MBs 
for nothing).

I would suggest to set a default of 8 I/O threads for the Mini Cluster. The 
scaling-with-cores is important for proper TM/JM deployments, but not for the 
Mini Cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread 刘建刚
It is a good principle to run all tests successfully with any change.  This
means a lot for project's stability and development. I am big +1 for this
proposal.

Best
liujiangang

Till Rohrmann  于2021年6月22日周二 下午6:36写道:

> One way to address the problem of regularly failing tests that block
> merging of PRs is to disable the respective tests for the time being. Of
> course, the failing test then needs to be fixed. But at least that way we
> would not block everyone from making progress.
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise  wrote:
>
> > I think this is overall a good idea. So +1 from my side.
> > However, I'd like to put a higher priority on infrastructure then, in
> > particular docker image/artifact caches.
> >
> > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann 
> > wrote:
> >
> > > Thanks for bringing this topic to our attention Xintong. I think your
> > > proposal makes a lot of sense and we should follow it. It will give us
> > > confidence that our changes are working and it might be a good
> incentive
> > to
> > > quickly fix build instabilities. Hence, +1.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > In the past a couple of weeks, I've observed several times that PRs
> are
> > > > merged without a green light from the CI tests, where failure cases
> are
> > > > considered *unrelated*. This may not always cause problems, but would
> > > > increase the chance of breaking our code base. In fact, it has
> occurred
> > > to
> > > > me twice in the past few weeks that I had to revert a commit which
> > breaks
> > > > the master branch due to this.
> > > >
> > > > I think it would be nicer to enforce a stricter rule, that no PRs
> > should
> > > be
> > > > merged without passing CI.
> > > >
> > > > The problems of merging PRs with "unrelated" test failures are:
> > > > - It's not always straightforward to tell whether a test failures are
> > > > related or not.
> > > > - It prevents subsequent test cases from being executed, which may
> fail
> > > > relating to the PR changes.
> > > >
> > > > To make things easier for the committers, the following exceptions
> > might
> > > be
> > > > considered acceptable.
> > > > - The PR has passed CI in the contributor's personal workspace.
> Please
> > > post
> > > > the link in such cases.
> > > > - The CI tests have been triggered multiple times, on the same
> commit,
> > > and
> > > > each stage has at least passed for once. Please also comment in such
> > > cases.
> > > >
> > > > If we all agree on this, I'd update the community guidelines for
> > merging
> > > > PRs wrt. this proposal. [1]
> > > >
> > > > Please let me know what do you think.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > > [1]
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > >
> > >
> >
>


[VOTE] FLIP-171: Async Sink

2021-06-22 Thread Hausmann, Steffen
Hi there,

After the discussion in [1], I’d like to open a voting thread for FLIP-171 [2], 
which proposes a base implementation for sinks that support async requests.

The vote will be open until June 25 (72h), unless there is an objection or not 
enough votes.

Cheers, Steffen

[1] 
https://mail-archives.apache.org/mod_mbox/flink-dev/202106.mbox/%3CC83F4222-4D07-412D-9BD5-DB92D59DDF03%40amazon.de%3E
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink




Amazon Web Services EMEA SARL
38 avenue John F. Kennedy, L-1855 Luxembourg
Sitz der Gesellschaft: L-1855 Luxemburg
eingetragen im Luxemburgischen Handelsregister unter R.C.S. B186284

Amazon Web Services EMEA SARL, Niederlassung Deutschland
Marcel-Breuer-Str. 12, D-80807 Muenchen
Sitz der Zweigniederlassung: Muenchen
eingetragen im Handelsregister des Amtsgerichts Muenchen unter HRB 242240, 
USt-ID DE317013094





[jira] [Created] (FLINK-23094) encounter thread-safe problem when using StreamExecutionEnvironment#initializeContextEnvironment in multiple-threads environment

2021-06-22 Thread godfrey he (Jira)
godfrey he created FLINK-23094:
--

 Summary: encounter thread-safe problem when using 
StreamExecutionEnvironment#initializeContextEnvironment in multiple-threads 
environment
 Key: FLINK-23094
 URL: https://issues.apache.org/jira/browse/FLINK-23094
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Affects Versions: 1.13.0, 1.12.0, 1.11.0, 1.14.0
Reporter: godfrey he


encounter thread-safe problem when using 
StreamExecutionEnvironment#initializeContextEnvironment (or related code, such 
as PackagedProgramUtils#getPipelineFromProgram) in multiple-threads environment.

The reason is the {{initializeContextEnvironment}} method is not thread-safe:
{code:java}
protected static void 
initializeContextEnvironment(StreamExecutionEnvironmentFactory ctx) {
contextEnvironmentFactory = ctx;
threadLocalContextEnvironmentFactory.set(contextEnvironmentFactory);
}
{code}

{{contextEnvironmentFactory}} is a static variable, and when 
{{initializeContextEnvironment}} is executed in multiple-threads environment, 
the value of {{contextEnvironmentFactory}} may be changed by other thread when 
executing {{ 
threadLocalContextEnvironmentFactory.set(contextEnvironmentFactory);}}

The solution is: use the local variable {{ctx}} instead of the static variable 
{{contextEnvironmentFactory}}.
 
{code:java}
protected static void 
initializeContextEnvironment(StreamExecutionEnvironmentFactory ctx) {
contextEnvironmentFactory = ctx;
threadLocalContextEnvironmentFactory.set(ctx);
}
{code}

Another thing I doubt is whether {{contextEnvironmentFactory}} is really needed 
? Currently, {{contextEnvironmentFactory}} is not set or reset individually, it 
is always changed with {{threadLocalContextEnvironmentFactory}}.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23095) Close the channel unspilling thread pool after restore

2021-06-22 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23095:


 Summary: Close the channel unspilling thread pool after restore
 Key: FLINK-23095
 URL: https://issues.apache.org/jira/browse/FLINK-23095
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Task
Reporter: Dawid Wysakowicz
 Fix For: 1.14.0


Currently we keep around the {{StreamTask#channelIOExecutor}} for the entire 
life cycle of the {{StreamTask}} and we keep unnecessary alive threads. We 
could close it right after we finish the restore of the gates.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23096) HiveParser could not attach the sessionstate of hive

2021-06-22 Thread shizhengchao (Jira)
shizhengchao created FLINK-23096:


 Summary: HiveParser could not attach the sessionstate of hive
 Key: FLINK-23096
 URL: https://issues.apache.org/jira/browse/FLINK-23096
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.13.1
Reporter: shizhengchao


My sql code is as follows:
{code:java}
//代码占位符
CREATE CATALOG myhive WITH (
'type' = 'hive',
'default-database' = 'default',
'hive-conf-dir' = '/home/service/upload-job-file/1624269463008'
);

use catalog hive;

set 'table.sql-dialect' = 'hive';

create view if not exists view_test as
select
  cast(goods_id as string) as goods_id,
  cast(depot_id as string) as depot_id,
  cast(product_id as string) as product_id,
  cast(tenant_code as string) as tenant_code
from edw.dim_yezi_whse_goods_base_info/*+ 
OPTIONS('streaming-source.consume-start-offset'='dayno=20210621') */;
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23097) 'Queryable state (rocksdb) with TM restart end-to-end test' fails on azure

2021-06-22 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-23097:


 Summary: 'Queryable state (rocksdb) with TM restart end-to-end 
test' fails on azure
 Key: FLINK-23097
 URL: https://issues.apache.org/jira/browse/FLINK-23097
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Queryable State
Affects Versions: 1.14.0
Reporter: Dawid Wysakowicz


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19310&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=ff888d9b-cd34-53cc-d90f-3e446d355529&l=12333

{code}
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
Jun 22 14:04:55 MapState has 22 entries
Jun 22 14:04:56 TaskManager 422719 killed.
Jun 22 14:04:56 Number of running task managers 1 is not yet 0.
Jun 22 14:05:00 Number of running task managers 1 is not yet 0.
Jun 22 14:05:04 Number of running task managers 1 is not yet 0.
Jun 22 14:05:08 Number of running task managers has reached 0.
Jun 22 14:05:08 Latest snapshot count was 42
Jun 22 14:05:09 Starting taskexecutor daemon on host fv-az68-17.
Jun 22 14:05:09 Number of running task managers 0 is not yet 1.
Jun 22 14:05:13 Number of running task managers has reached 1.
Jun 22 14:05:15 Job (5b515e0f9168e338d1645bf2e9f92820) is running.
Jun 22 14:05:15 Starting to wait for completion of 18 checkpoints
Jun 22 14:05:15 13/18 completed checkpoints
Jun 22 14:05:17 13/18 completed checkpoints
Jun 22 14:05:19 17/18 completed checkpoints
Jun 22 14:05:21 17/18 completed checkpoints
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
Jun 22 14:05:24 after: 40
Jun 22 14:05:24 An error occurred
Jun 22 14:05:24 [FAIL] Test script contains errors.
Jun 22 14:05:24 Checking of logs skipped.
Jun 22 14:05:24 
Jun 22 14:05:24 [FAIL] 'Queryable state (rocksdb) with TM restart end-to-end 
test' failed after 0 minutes and 50 seconds! Test exited with exit code 1
Jun 22 14:05:24 
14:05:24 ##[group]Environment Information

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23098) Update operations playground for 1.13

2021-06-22 Thread David Anderson (Jira)
David Anderson created FLINK-23098:
--

 Summary: Update operations playground for 1.13
 Key: FLINK-23098
 URL: https://issues.apache.org/jira/browse/FLINK-23098
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation / Training / Exercises
Affects Versions: 1.13.0
Reporter: David Anderson
Assignee: David Anderson






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23099) Update table walkthrough playground for 1.13

2021-06-22 Thread David Anderson (Jira)
David Anderson created FLINK-23099:
--

 Summary: Update table walkthrough playground for 1.13
 Key: FLINK-23099
 URL: https://issues.apache.org/jira/browse/FLINK-23099
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation / Training / Exercises
Reporter: David Anderson
Assignee: David Anderson






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23100) Update pyflink walkthrough playground for 1.13

2021-06-22 Thread David Anderson (Jira)
David Anderson created FLINK-23100:
--

 Summary: Update pyflink walkthrough playground for 1.13
 Key: FLINK-23100
 URL: https://issues.apache.org/jira/browse/FLINK-23100
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation / Training / Exercises
Reporter: David Anderson






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Seth Wiesman
Sorry if this is a naive question, I don't have any real Mesos experience.
Is it possible to deploy a standalone cluster on top of Mesos in the same
way you can with Kubernetes? If so, and there is still Mesos demand from
the community, we could document that process as the recommended deployment
mode going forward.

Seth

On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise  wrote:

> +1 for dropping. Frankly speaking, I don't see it having any future (and
> D2iQ
> agrees).
>
> If there is a surprisingly huge demand, I'd try to evaluate plugins for it.
>
> On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann 
> wrote:
>
> > I'd be ok with dropping support for Mesos if it helps us to clear our
> > dependencies in the flink-runtime module. If we do it, then we should
> > probably update our documentation with a pointer to the latest Flink
> > version that supports Mesos in case of users strictly need Mesos.
> >
> > Cheers,
> > Till
> >
> > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler 
> > wrote:
> >
> > > Last week I spent some time looking into making flink-runtime scala
> > > free, which effectively means to move the Akka-reliant classes to
> > > another module, and load that module along with Akka and all of it's
> > > dependencies (including Scala) through a separate classloader.
> > >
> > > This would finally decouple the Scala versions required by the runtime
> > > and API, and would allow us to upgrade Akka as we'd no longer be
> limited
> > > to Scala 2.11. It would rid the classpath of a few dependencies, and
> > > remove the need for scala suffixes on quite a few modules.
> > >
> > > However, our Mesos support has unfortunately a hard dependency on Akka,
> > > which naturally does not play well with the goal of isolating Akka in
> > > it's own ClassLoader.
> > >
> > > To solve this issue I was thinking of simple dropping flink-mesos in
> > > 1.14 (it was deprecated in 1.13).
> > >
> > > Truth be told, I picked this option because it is the easiest to do. We
> > > _could_ probably make things work somehow (likely by shipping a second
> > > Akka version just for flink-mesos), but it doesn't seem worth the
> hassle
> > > and would void some of the benefits. So far we kept flink-mesos around,
> > > despite not really developing it further, because it didn't hurt to
> have
> > > it in still in Flink, but this has now changed.
> > >
> > > Please tell me what you think.
> > >
> > >
> >
>


[jira] [Created] (FLINK-23101) Flame Graphs initial view says it is 18800 days in the past

2021-06-22 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-23101:
---

 Summary: Flame Graphs initial view says it is 18800 days in the 
past
 Key: FLINK-23101
 URL: https://issues.apache.org/jira/browse/FLINK-23101
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Reporter: Nico Kruber
 Attachments: image.png

When you look at the Flame Graphs for a task for the first time, it will show 
an empty space and say that the measurement was ~18800 days in the past (see 
the attached image).

 

This should rather be something more useful like "no measurement yet" or so...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23102) Accessing FlameGraphs while not being enabled returns an exception

2021-06-22 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-23102:
---

 Summary: Accessing FlameGraphs while not being enabled returns an 
exception
 Key: FLINK-23102
 URL: https://issues.apache.org/jira/browse/FLINK-23102
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.13.1
Reporter: Nico Kruber
 Attachments: image-2021-06-22-17-36-47-730.png

Trying to retrieve the FlameGraph in a job that doesn't have it enabled returns 
this ugly exception:

!image-2021-06-22-17-36-47-730.png!

Instead, it could mention that this feature is not enabled and describe how to 
enable it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Israel Ekpo
I am in favor of dropping the support for Mesos.

In terms of the landscape for users leveraging Mesos for the kind of
workloads Flink is used, I think it is on the decline.

+1 from me

On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman  wrote:

> Sorry if this is a naive question, I don't have any real Mesos experience.
> Is it possible to deploy a standalone cluster on top of Mesos in the same
> way you can with Kubernetes? If so, and there is still Mesos demand from
> the community, we could document that process as the recommended deployment
> mode going forward.
>
> Seth
>
> On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise  wrote:
>
> > +1 for dropping. Frankly speaking, I don't see it having any future (and
> > D2iQ
> > agrees).
> >
> > If there is a surprisingly huge demand, I'd try to evaluate plugins for
> it.
> >
> > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann 
> > wrote:
> >
> > > I'd be ok with dropping support for Mesos if it helps us to clear our
> > > dependencies in the flink-runtime module. If we do it, then we should
> > > probably update our documentation with a pointer to the latest Flink
> > > version that supports Mesos in case of users strictly need Mesos.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler 
> > > wrote:
> > >
> > > > Last week I spent some time looking into making flink-runtime scala
> > > > free, which effectively means to move the Akka-reliant classes to
> > > > another module, and load that module along with Akka and all of it's
> > > > dependencies (including Scala) through a separate classloader.
> > > >
> > > > This would finally decouple the Scala versions required by the
> runtime
> > > > and API, and would allow us to upgrade Akka as we'd no longer be
> > limited
> > > > to Scala 2.11. It would rid the classpath of a few dependencies, and
> > > > remove the need for scala suffixes on quite a few modules.
> > > >
> > > > However, our Mesos support has unfortunately a hard dependency on
> Akka,
> > > > which naturally does not play well with the goal of isolating Akka in
> > > > it's own ClassLoader.
> > > >
> > > > To solve this issue I was thinking of simple dropping flink-mesos in
> > > > 1.14 (it was deprecated in 1.13).
> > > >
> > > > Truth be told, I picked this option because it is the easiest to do.
> We
> > > > _could_ probably make things work somehow (likely by shipping a
> second
> > > > Akka version just for flink-mesos), but it doesn't seem worth the
> > hassle
> > > > and would void some of the benefits. So far we kept flink-mesos
> around,
> > > > despite not really developing it further, because it didn't hurt to
> > have
> > > > it in still in Flink, but this has now changed.
> > > >
> > > > Please tell me what you think.
> > > >
> > > >
> > >
> >
>


Migration to akka 2.6

2021-06-22 Thread Etienne Chauchot

Hi everyone,

I was thinking today about the migration from akka 2.5.21 that we 
currently use to 2.6.x. I saw Chesnay has been thinking about this as 
well as he just opened this ticket (1)  :)


Besides the general good reasons to upgrade - among others things I'd 
add a security fix in akka 2.6.14 - , I was wondering about scala 2.12 
build profile and the ability to have a different akka version between 
scala 2.11 and scala 2.12 environments.


I commented the details in the ticket. Can you folks tell me what you 
think about it ?


Best

Etienne

[1] https://issues.apache.org/jira/browse/FLINK-23091



[jira] [Created] (FLINK-23103) Provide Inteceptor for connector

2021-06-22 Thread lqjacklee (Jira)
lqjacklee created FLINK-23103:
-

 Summary: Provide Inteceptor for connector
 Key: FLINK-23103
 URL: https://issues.apache.org/jira/browse/FLINK-23103
 Project: Flink
  Issue Type: New Feature
  Components: API / Core
Affects Versions: 1.14.0
Reporter: lqjacklee
 Fix For: 1.14.0


In order to provide the extension for 
org.apache.flink.api.common.io.InputFormat/org.apache.flink.api.common.io.OutputFormat/.
 

we can inject the log/metrics/.. to inject Interceptors to 
SourceFunction/SinkFunction

{code:java}
/**
*
**/
public interface Interceptor extends Closeable {

 void init(Configuration config);

 /**the method will be invoked before user function */
 void pre(Context context);

 /**the method will be invoked after user function*/
 void post(Context context);

 /**
  *  the runtime context for the interceptor
  **/
 interface Context extends Map{
 }

}
{code}

Once 




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Migration to akka 2.6

2021-06-22 Thread Chesnay Schepler

Targeted for 1.14.

On 6/22/2021 6:16 PM, Etienne Chauchot wrote:


Hi everyone,

I was thinking today about the migration from akka 2.5.21 that we 
currently use to 2.6.x. I saw Chesnay has been thinking about this as 
well as he just opened this ticket (1)  :)


Besides the general good reasons to upgrade - among others things I'd 
add a security fix in akka 2.6.14 - , I was wondering about scala 2.12 
build profile and the ability to have a different akka version between 
scala 2.11 and scala 2.12 environments.


I commented the details in the ticket. Can you folks tell me what you 
think about it ?


Best

Etienne

[1] https://issues.apache.org/jira/browse/FLINK-23091





Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2

2021-06-22 Thread Xintong Song
Seems there's no objections.

After reaching out to the release managers, the work will be divided as
follows.
- 1.11.4: Godfrey
- 1.12.5: Jingsong
- 1.13.2: Yun

I'll also be helping with the release managers where PMC authority is
required in the release process.

Thank you~

Xintong Song



On Tue, Jun 22, 2021 at 5:52 PM Till Rohrmann  wrote:

> Thanks a lot, Yun, Godfrey, and Jingsong for being our release managers!
> Creating these bug-fix releases will be super helpful for our users.
>
> Cheers,
> Till
>
> On Tue, Jun 22, 2021 at 9:59 AM Piotr Nowojski 
> wrote:
>
> > Thanks for volunteering.
> >
> > A quick update about FLINK-23011. It turned out to be only an issue in
> the
> > master, so we don't need to block bug fix releases on this issue.
> >
> > Best,
> > Piotrek
> >
> > wt., 22 cze 2021 o 05:20 Xintong Song 
> napisał(a):
> >
> > > Thanks Dawid for starting the discussion, and thanks Yun, Godfrey and
> > > Jingsong for volunteering as release managers.
> > >
> > >
> > > +1 for the releases, and +1 for the release managers.
> > >
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Tue, Jun 22, 2021 at 10:15 AM Jingsong Li 
> > > wrote:
> > >
> > > > +1 to the release.
> > > >
> > > > Thanks Dawid for driving this discussion.
> > > >
> > > > I am willing to volunteer as the release manager too.
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Tue, Jun 22, 2021 at 9:58 AM godfrey he 
> > wrote:
> > > >
> > > > > Thanks for driving this, Dawid. +1 to the release.
> > > > > I would also like to volunteer as the release manager for these
> > > versions.
> > > > >
> > > > >
> > > > > Best
> > > > > Godfrey
> > > > >
> > > > > Seth Wiesman  于2021年6月22日周二 上午8:39写道:
> > > > >
> > > > > > +1 to the release.
> > > > > >
> > > > > > It would be great if we could get FLINK-23073 into 1.13.2.
> There's
> > > > > already
> > > > > > an open PR and it unblocks upgrading the table API walkthrough in
> > > > > > apache/flink-playgrounds to 1.13.
> > > > > >
> > > > > > Seth
> > > > > >
> > > > > > On Mon, Jun 21, 2021 at 6:28 AM Yun Tang 
> wrote:
> > > > > >
> > > > > > > Hi Dawid,
> > > > > > >
> > > > > > > Thanks for driving this discussion, I am willing to volunteer
> as
> > > the
> > > > > > > release manager for these versions.
> > > > > > >
> > > > > > >
> > > > > > > Best
> > > > > > > Yun Tang
> > > > > > > 
> > > > > > > From: Konstantin Knauf 
> > > > > > > Sent: Friday, June 18, 2021 22:35
> > > > > > > To: dev 
> > > > > > > Subject: Re: [DISCUSS] Releasing Flink 1.11.4, 1.12.5, 1.13.2
> > > > > > >
> > > > > > > Hi Dawid,
> > > > > > >
> > > > > > > Thank you for starting the discussion. I'd like to add
> > > > > > > https://issues.apache.org/jira/browse/FLINK-23025 to the list
> > for
> > > > > Flink
> > > > > > > 1.13.2.
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Konstantin
> > > > > > >
> > > > > > > On Fri, Jun 18, 2021 at 3:26 PM Dawid Wysakowicz <
> > > > > dwysakow...@apache.org
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi devs,
> > > > > > > >
> > > > > > > > Quite recently we pushed, in our opinion, quite an important
> > > fix[1]
> > > > > for
> > > > > > > > unaligned checkpoints which disables UC for broadcast
> > > partitioning.
> > > > > > > > Without the fix there might be some broadcast state
> corruption.
> > > > > > > > Therefore we think it would be beneficial to release it
> > soonish.
> > > > What
> > > > > > do
> > > > > > > > you think? Do you have other issues in mind you'd like to
> have
> > > > > included
> > > > > > > > in these versions.
> > > > > > > >
> > > > > > > > Would someone be willing to volunteer to help with the
> releases
> > > as
> > > > a
> > > > > > > > release manager? I guess there is a couple of spots to fill
> in
> > > here
> > > > > ;)
> > > > > > > >
> > > > > > > > Best,
> > > > > > > >
> > > > > > > > Dawid
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/FLINK-22815
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > >
> > > > > > > Konstantin Knauf
> > > > > > >
> > > > > > > https://twitter.com/snntrable
> > > > > > >
> > > > > > > https://github.com/knaufk
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > > >
> > >
> >
>


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Xintong Song
+1 for dropping.

I like Seth's idea. I don't have any real Mesos experience either.
According to this article [1], it looks like we can deploy a standalone
cluster on Mesos similar to Kubernetes. However, we should only do it if
there's indeed a strong demand from the community for deploying a
latest version of Flink on Mesos.

Thank you~

Xintong Song


[1] https://www.baeldung.com/ops/mesos-kubernetes-comparison

On Tue, Jun 22, 2021 at 11:59 PM Israel Ekpo  wrote:

> I am in favor of dropping the support for Mesos.
>
> In terms of the landscape for users leveraging Mesos for the kind of
> workloads Flink is used, I think it is on the decline.
>
> +1 from me
>
> On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman  wrote:
>
> > Sorry if this is a naive question, I don't have any real Mesos
> experience.
> > Is it possible to deploy a standalone cluster on top of Mesos in the same
> > way you can with Kubernetes? If so, and there is still Mesos demand from
> > the community, we could document that process as the recommended
> deployment
> > mode going forward.
> >
> > Seth
> >
> > On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise  wrote:
> >
> > > +1 for dropping. Frankly speaking, I don't see it having any future
> (and
> > > D2iQ
> > > agrees).
> > >
> > > If there is a surprisingly huge demand, I'd try to evaluate plugins for
> > it.
> > >
> > > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann 
> > > wrote:
> > >
> > > > I'd be ok with dropping support for Mesos if it helps us to clear our
> > > > dependencies in the flink-runtime module. If we do it, then we should
> > > > probably update our documentation with a pointer to the latest Flink
> > > > version that supports Mesos in case of users strictly need Mesos.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler <
> ches...@apache.org>
> > > > wrote:
> > > >
> > > > > Last week I spent some time looking into making flink-runtime scala
> > > > > free, which effectively means to move the Akka-reliant classes to
> > > > > another module, and load that module along with Akka and all of
> it's
> > > > > dependencies (including Scala) through a separate classloader.
> > > > >
> > > > > This would finally decouple the Scala versions required by the
> > runtime
> > > > > and API, and would allow us to upgrade Akka as we'd no longer be
> > > limited
> > > > > to Scala 2.11. It would rid the classpath of a few dependencies,
> and
> > > > > remove the need for scala suffixes on quite a few modules.
> > > > >
> > > > > However, our Mesos support has unfortunately a hard dependency on
> > Akka,
> > > > > which naturally does not play well with the goal of isolating Akka
> in
> > > > > it's own ClassLoader.
> > > > >
> > > > > To solve this issue I was thinking of simple dropping flink-mesos
> in
> > > > > 1.14 (it was deprecated in 1.13).
> > > > >
> > > > > Truth be told, I picked this option because it is the easiest to
> do.
> > We
> > > > > _could_ probably make things work somehow (likely by shipping a
> > second
> > > > > Akka version just for flink-mesos), but it doesn't seem worth the
> > > hassle
> > > > > and would void some of the benefits. So far we kept flink-mesos
> > around,
> > > > > despite not really developing it further, because it didn't hurt to
> > > have
> > > > > it in still in Flink, but this has now changed.
> > > > >
> > > > > Please tell me what you think.
> > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Do not merge PRs with "unrelated" test failures.

2021-06-22 Thread Yangze Guo
+1 for appending this to community guidelines for merging PRs.

@Till Rohrmann
I agree that with this approach unstable tests will not block other
commit merges. However, it might be hard to prevent merging commits
that are related to those tests and should have been passed them. It's
true that this judgment can be made by the committers, but no one can
ensure the judgment is always precise and so that we have this
discussion thread.

Regarding the unstable tests, how about adding another exception:
committers verify it in their local environment and comment in such
cases?

Best,
Yangze Guo

On Tue, Jun 22, 2021 at 8:23 PM 刘建刚  wrote:
>
> It is a good principle to run all tests successfully with any change.  This
> means a lot for project's stability and development. I am big +1 for this
> proposal.
>
> Best
> liujiangang
>
> Till Rohrmann  于2021年6月22日周二 下午6:36写道:
>
> > One way to address the problem of regularly failing tests that block
> > merging of PRs is to disable the respective tests for the time being. Of
> > course, the failing test then needs to be fixed. But at least that way we
> > would not block everyone from making progress.
> >
> > Cheers,
> > Till
> >
> > On Tue, Jun 22, 2021 at 12:00 PM Arvid Heise  wrote:
> >
> > > I think this is overall a good idea. So +1 from my side.
> > > However, I'd like to put a higher priority on infrastructure then, in
> > > particular docker image/artifact caches.
> > >
> > > On Tue, Jun 22, 2021 at 11:50 AM Till Rohrmann 
> > > wrote:
> > >
> > > > Thanks for bringing this topic to our attention Xintong. I think your
> > > > proposal makes a lot of sense and we should follow it. It will give us
> > > > confidence that our changes are working and it might be a good
> > incentive
> > > to
> > > > quickly fix build instabilities. Hence, +1.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Tue, Jun 22, 2021 at 11:12 AM Xintong Song 
> > > > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > In the past a couple of weeks, I've observed several times that PRs
> > are
> > > > > merged without a green light from the CI tests, where failure cases
> > are
> > > > > considered *unrelated*. This may not always cause problems, but would
> > > > > increase the chance of breaking our code base. In fact, it has
> > occurred
> > > > to
> > > > > me twice in the past few weeks that I had to revert a commit which
> > > breaks
> > > > > the master branch due to this.
> > > > >
> > > > > I think it would be nicer to enforce a stricter rule, that no PRs
> > > should
> > > > be
> > > > > merged without passing CI.
> > > > >
> > > > > The problems of merging PRs with "unrelated" test failures are:
> > > > > - It's not always straightforward to tell whether a test failures are
> > > > > related or not.
> > > > > - It prevents subsequent test cases from being executed, which may
> > fail
> > > > > relating to the PR changes.
> > > > >
> > > > > To make things easier for the committers, the following exceptions
> > > might
> > > > be
> > > > > considered acceptable.
> > > > > - The PR has passed CI in the contributor's personal workspace.
> > Please
> > > > post
> > > > > the link in such cases.
> > > > > - The CI tests have been triggered multiple times, on the same
> > commit,
> > > > and
> > > > > each stage has at least passed for once. Please also comment in such
> > > > cases.
> > > > >
> > > > > If we all agree on this, I'd update the community guidelines for
> > > merging
> > > > > PRs wrt. this proposal. [1]
> > > > >
> > > > > Please let me know what do you think.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > > [1]
> > > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/Merging+Pull+Requests
> > > > >
> > > >
> > >
> >


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Yangze Guo
+1 for dropping if there is no strong demand from the community.

I'm willing to help with the removal of e2e tests part.

Best,
Yangze Guo

On Wed, Jun 23, 2021 at 10:09 AM Xintong Song  wrote:
>
> +1 for dropping.
>
> I like Seth's idea. I don't have any real Mesos experience either.
> According to this article [1], it looks like we can deploy a standalone
> cluster on Mesos similar to Kubernetes. However, we should only do it if
> there's indeed a strong demand from the community for deploying a
> latest version of Flink on Mesos.
>
> Thank you~
>
> Xintong Song
>
>
> [1] https://www.baeldung.com/ops/mesos-kubernetes-comparison
>
> On Tue, Jun 22, 2021 at 11:59 PM Israel Ekpo  wrote:
>
> > I am in favor of dropping the support for Mesos.
> >
> > In terms of the landscape for users leveraging Mesos for the kind of
> > workloads Flink is used, I think it is on the decline.
> >
> > +1 from me
> >
> > On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman  wrote:
> >
> > > Sorry if this is a naive question, I don't have any real Mesos
> > experience.
> > > Is it possible to deploy a standalone cluster on top of Mesos in the same
> > > way you can with Kubernetes? If so, and there is still Mesos demand from
> > > the community, we could document that process as the recommended
> > deployment
> > > mode going forward.
> > >
> > > Seth
> > >
> > > On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise  wrote:
> > >
> > > > +1 for dropping. Frankly speaking, I don't see it having any future
> > (and
> > > > D2iQ
> > > > agrees).
> > > >
> > > > If there is a surprisingly huge demand, I'd try to evaluate plugins for
> > > it.
> > > >
> > > > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > > > I'd be ok with dropping support for Mesos if it helps us to clear our
> > > > > dependencies in the flink-runtime module. If we do it, then we should
> > > > > probably update our documentation with a pointer to the latest Flink
> > > > > version that supports Mesos in case of users strictly need Mesos.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler <
> > ches...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Last week I spent some time looking into making flink-runtime scala
> > > > > > free, which effectively means to move the Akka-reliant classes to
> > > > > > another module, and load that module along with Akka and all of
> > it's
> > > > > > dependencies (including Scala) through a separate classloader.
> > > > > >
> > > > > > This would finally decouple the Scala versions required by the
> > > runtime
> > > > > > and API, and would allow us to upgrade Akka as we'd no longer be
> > > > limited
> > > > > > to Scala 2.11. It would rid the classpath of a few dependencies,
> > and
> > > > > > remove the need for scala suffixes on quite a few modules.
> > > > > >
> > > > > > However, our Mesos support has unfortunately a hard dependency on
> > > Akka,
> > > > > > which naturally does not play well with the goal of isolating Akka
> > in
> > > > > > it's own ClassLoader.
> > > > > >
> > > > > > To solve this issue I was thinking of simple dropping flink-mesos
> > in
> > > > > > 1.14 (it was deprecated in 1.13).
> > > > > >
> > > > > > Truth be told, I picked this option because it is the easiest to
> > do.
> > > We
> > > > > > _could_ probably make things work somehow (likely by shipping a
> > > second
> > > > > > Akka version just for flink-mesos), but it doesn't seem worth the
> > > > hassle
> > > > > > and would void some of the benefits. So far we kept flink-mesos
> > > around,
> > > > > > despite not really developing it further, because it didn't hurt to
> > > > have
> > > > > > it in still in Flink, but this has now changed.
> > > > > >
> > > > > > Please tell me what you think.
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >


[jira] [Created] (FLINK-23104) flink-statebackend-changelog does not build with scala 2.12

2021-06-22 Thread Xintong Song (Jira)
Xintong Song created FLINK-23104:


 Summary: flink-statebackend-changelog does not build with scala 
2.12
 Key: FLINK-23104
 URL: https://issues.apache.org/jira/browse/FLINK-23104
 Project: Flink
  Issue Type: Bug
  Components: Build System, Runtime / State Backends
Affects Versions: 1.14.0
Reporter: Xintong Song
 Fix For: 1.14.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19335&view=logs&j=ed6509f5-1153-558c-557a-5ee0afbcdf24&t=241b1e5e-1a8e-5e6a-469a-a9b8cad87065&l=4868



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23105) YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots fail

2021-06-22 Thread Xintong Song (Jira)
Xintong Song created FLINK-23105:


 Summary: 
YARNSessionCapacitySchedulerITCase.testVCoresAreSetCorrectlyAndJobManagerHostnameAreShownInWebInterfaceAndDynamicPropertiesAndYarnApplicationNameAndTaskManagerSlots
 fails on azure
 Key: FLINK-23105
 URL: https://issues.apache.org/jira/browse/FLINK-23105
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.14.0
Reporter: Xintong Song
 Fix For: 1.14.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19335&view=logs&j=a5ef94ef-68c2-57fd-3794-dc108ed1c495&t=9c1ddabe-d186-5a2c-5fcc-f3cafb3ec699&l=28045

{code}
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: The main method 
caused an error: Failed to execute job 'Windowed Join Example'.
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372)
at 
org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222)
at 
org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114)
at 
org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812)
at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246)
at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054)
at org.apache.flink.yarn.YarnTestBase$Runner.run(YarnTestBase.java:1144)
Caused by: org.apache.flink.util.FlinkException: Failed to execute job 
'Windowed Join Example'.
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.executeAsync(StreamExecutionEnvironment.java:2027)
at 
org.apache.flink.client.program.StreamContextEnvironment.executeAsync(StreamContextEnvironment.java:135)
at 
org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:76)
at 
org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1891)
at 
org.apache.flink.streaming.examples.join.WindowJoin.main(WindowJoin.java:88)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355)
... 6 more
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to 
submit JobGraph.
at 
org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$9(RestClusterClient.java:405)
at 
java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
at 
java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:386)
at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$1(RestClient.java:430)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:570)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:549)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:608)
at 
org.apache.flink.shaded.netty4.io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)
at 
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioChannel$

Re: Re: [DISCUSS] FLIP-169: DataStream API for Fine-Grained Resource Requirements

2021-06-22 Thread Yangze Guo
FYI, I updated the FLIP accordingly. To sum up, Flink will throw an
exception and tell user to configure an internal
"fine-grained.shuffle-mode.all-blocking" to be true in this scenario.

Best,
Yangze Guo

On Tue, Jun 22, 2021 at 2:20 PM Yangze Guo  wrote:
>
> Thanks for the comment, Xintong.
>
> I used to wonder if it was reasonable or worthwhile to introduce a
> configuration like "table.exec.shuffle-mode" for DataStream API.
> Narrow down the scope of effect sounds good to me.
>
> Best,
> Yangze Guo
>
> On Tue, Jun 22, 2021 at 2:08 PM Xintong Song  wrote:
> >
> > I second Zhu and Till's opinion.
> >
> > Failing with an exception that also includes how to resolve the problem
> > sounds better, in terms of making it explicit to users that pipelined edges
> > are replaced with blocking edges.
> >
> > Concerning absence of knobs tuning the edge types, we can introduce a
> > configuration option. Since currently the edge types are fixed based on the
> > job execution mode and are not exposed to users, I'd suggest introducing a
> > configuration option that only affects fine-grained resource management use
> > cases. To be specific, we can have something like
> > 'fine-grained.xxx.all-blocking'. The default value should be false, and we
> > can suggest users to set it to true in the error message. When set to true,
> > this should take effect only when fine-grained resource requirements are
> > detected. Thus, it should not affect the default execution-mode based edge
> > type strategy for non fine-grained use cases.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Jun 21, 2021 at 8:59 PM Yangze Guo  wrote:
> >
> > > Thanks for the feedback, Till!
> > >
> > > Actually, we cannot give user any resolution for this issue as there
> > > is no API for DataStream users to influence the edge types at the
> > > moment. The edge types are currently fixed based on the jobs' mode
> > > (batch or streaming).
> > > a) I think it might not confuse the user a lot as the behavior has
> > > never been documented or guaranteed to be unchanged.
> > > b) Thanks for your illustration. I agree that add complexity can make
> > > other feature development harder in the future. However, I think this
> > > might not introduce much complexity. In this case, we construct an
> > > all-edges-blocking job graph, which already exists since 1.11 and
> > > should have been considered by the following features. I admit we
> > > cannot assume the all-edges-blocking job graph will exist forever in
> > > Flink, but AFAIK there is no seeable feature that will intend to
> > > deprecate it.
> > >
> > > WDYT?
> > >
> > >
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Mon, Jun 21, 2021 at 6:10 PM Till Rohrmann 
> > > wrote:
> > > >
> > > > I would be more in favor of what Zhu Zhu proposed to throw an exception
> > > > with a meaningful and understandable explanation that also includes how
> > > to
> > > > resolve this problem. I do understand the reasoning behind automatically
> > > > switching the edge types in order to make things easier to use but a)
> > > this
> > > > can also be confusing if the user does not expect this to happen and b)
> > > it
> > > > can add some complexity which makes other feature development harder in
> > > the
> > > > future because users might rely on it. An example of such a case I
> > > stumbled
> > > > upon rather recently is that we adjust the maximum parallelism wrt the
> > > > given savepoint if it has not been explicitly configured. On the paper
> > > this
> > > > sounds like a good usability improvement, however, for the
> > > > AdaptiveScheduler it posed a quite annoying complexity. If instead, we
> > > said
> > > > that we fail the job submission if the max parallelism does not equal 
> > > > the
> > > > max parallelism of the savepoint, it would have been a lot easier.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Jun 21, 2021 at 9:36 AM Yangze Guo  wrote:
> > > >
> > > > > Thanks, I append it to the known limitations of this FLIP.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Mon, Jun 21, 2021 at 3:20 PM Zhu Zhu  wrote:
> > > > > >
> > > > > > Thanks for the quick response Yangze.
> > > > > > The proposal sounds good to me.
> > > > > >
> > > > > > Thanks,
> > > > > > Zhu
> > > > > >
> > > > > > Yangze Guo  于2021年6月21日周一 下午3:01写道:
> > > > > >>
> > > > > >> Thanks for the comments, Zhu!
> > > > > >>
> > > > > >> Yes, it is a known limitation for fine-grained resource management.
> > > We
> > > > > >> also have filed this issue in FLINK-20865 when we proposed 
> > > > > >> FLIP-156.
> > > > > >>
> > > > > >> As a first step, I agree that we can mark batch jobs with PIPELINED
> > > > > >> edges as an invalid case for this feature. However, just throwing 
> > > > > >> an
> > > > > >> exception, in that case, might confuse users who do not understand
> > > the
> > > > > >> concept of pipeline region. Maybe we can force all the edges in 
> > > > > >> this

[jira] [Created] (FLINK-23106) RetractableTopNFunction should send retract first and then send insert

2021-06-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-23106:


 Summary: RetractableTopNFunction should send retract first and 
then send insert
 Key: FLINK-23106
 URL: https://issues.apache.org/jira/browse/FLINK-23106
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Jingsong Lee
 Fix For: 1.14.0


Consider:

Source -> Rank(rownum=1) -> Join(key is partitionKey)

In this case, the downstream will produce the wrong result, even the wrong 
retraction (retract message that does not exist)
 * Rank output partition key + rownum , the rank here has row_ rownum=1
 * In the downstream join judgment rank output, rownum is 1, so assuming 
partition key is PK, do PK based optimization
 * Because select rownum, rank thinks that the PK of downstream data must be 
partition key + rownum, the retraction message of partition key is out of 
order, leading to problems in downstream PK based optimization

Fix: there is a problem with the rank implementation. It should not result in 
disordered retraction messages of partition key.

Bypass: do not select rownum of rank output



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23107) Separate deduplicate rank from rank functions

2021-06-22 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-23107:


 Summary: Separate deduplicate rank from rank functions
 Key: FLINK-23107
 URL: https://issues.apache.org/jira/browse/FLINK-23107
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Runtime
Reporter: Jingsong Lee
 Fix For: 1.14.0


SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY d ORDER BY e DESC) AS 
rownum from T) WHERE rownum=1

Actually above sql is a deduplicate rank instead of a normal rank. We should 
separate the implementation for optimize the deduplicate rank and reduce bugs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


退订

2021-06-22 Thread William王卫光
退订

Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-22 Thread Konstantin Knauf
Hi everyone,

I was hoping for more feedback from other committers, but seems like this
is not happening, so here's my proposal for immediate changes:

* Ignore tickets with a fixVersion for all rules but the stale-unassigned
role.

* We change the time intervals as follows, accepting reality a bit more ;)

* stale-assigned only after 30 days (instead of 14 days)
* stale-critical only after 14 days (instead of 7 days)
* stale-major only after 60 days (instead of 30 days)

Unless there are -1s, I'd implement the changes Monday next week.

Cheers,

Konstantin

On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski  wrote:

> Hi,
>
> I also think that the bot is a bit too aggressive/too quick with assigning
> stale issues/deprioritizing them, but that's not that big of a deal for me.
>
> What bothers me much more is that it's closing minor issues automatically.
> Depriotising issues makes sense to me. If a wish for improvement or a bug
> report has been opened a long time ago, and they got no attention over the
> time, sure depriotize them. But closing them is IMO a bad idea. Bug might
> be minor, but if it's not fixed it's still there - it shouldn't be closed.
> Closing with "won't fix" should be done for very good reasons and very
> rarely. Same applies to improvements/wishes. Furthermore, very often
> descriptions and comments have a lot of value, and if we keep closing minor
> issues I'm afraid that we end up with:
> - more duplication. I doubt anyone will be looking for prior "closed" bug
> reports/improvement requests. Definitely I'm only looking for open tickets
> when looking if a ticket for XYZ already exists or not
> - we will be losing knowledge
>
> Piotrek
>
> śr., 16 cze 2021 o 15:12 Robert Metzger  napisał(a):
>
> > Very sorry for the delayed response.
> >
> > Regarding tickets with the "test-instability" label (topic 1): I'm
> usually
> > assigning a fixVersion to the next release of the branch where the
> failure
> > occurred, when I'm opening a test failure ticket. Others seem to do that
> > too. Hence my comment that not checking tickets with a fixVersion set by
> > Flink bot is good (because test failures should always stay "Critical"
> > until we've understood what's going on)
> > I see that it is a bit contradicting that Critical test instabilities
> > receive no attention for 14 days, but that seems to be the norm given the
> > current number of incoming test instabilities.
> >
> > On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann 
> > wrote:
> >
> > > Another example for category 4 would be the ticket where we collect
> > > breaking API changes for Flink 2.0 [1]. The idea behind this ticket is
> to
> > > collect things to consider when developing the next major version.
> > > Admittedly, we have never seen the benefits of collecting the breaking
> > > changes because we haven't started Flink 2.x yet. Also, it is not clear
> > how
> > > relevant these tickets are right now.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-3957
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > thank you for all the feedback so far. I believe we have four
> different
> > > > topics by now:
> > > >
> > > > 1 about *test-instability tickets* raised by Robert. Waiting for
> > feedback
> > > > by Robert.
> > > >
> > > > 2 about *aggressiveness of stale-assigned *rule raised by Timo.
> Waiting
> > > > for feedback by Timo and others.
> > > >
> > > > 3 about *excluding issues with a fixVersion* raised by Konstantin,
> > Till.
> > > > Waiting for more feedback by the community as it involves general
> > changes
> > > > to how we deal with fixVersion.
> > > >
> > > > 4 about *excluding issues with a specific-label* raised by Arvid.
> > > >
> > > > I've already written something about 1-3. Regarding 4:
> > > >
> > > > How do we make sure that these don't become stale? I think, there
> have
> > > > been a few "long-term efforts" in the past that never got the
> attention
> > > > that we initially wanted. Is this just about the ability to collect
> > > tickets
> > > > under an umbrella to document a future effort? Maybe for the example
> of
> > > > DataStream replacing DataSet how would this look like in Jira?
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > >
> > > > On Tue, Jun 8, 2021 at 11:31 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > >> I like this idea. It would then be the responsibility of the
> component
> > > >> maintainers to manage the lifecycle explicitly.
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >> On Mon, Jun 7, 2021 at 1:48 PM Arvid Heise 
> wrote:
> > > >>
> > > >> > One more idea for the bot. Could we have a label to exclude
> certain
> > > >> tickets
> > > >> > from the life-cycle?
> > > >> >
> > > >> > I'm thinking about long-term tickets such as improving DataStream
> to
> > > >> > eventually replace DataSet. We would collect ideas over the next
> > > co

Re: [DISCUSS] Feedback Collection Jira Bot

2021-06-22 Thread Konstantin Knauf
Hi Piotr,

the bot does not close with "Won't fix", it closes with resolution
"Auto-Closed". It also says in a comment:

This issue was labeled "{warning_label}" {warning_days} days ago and has
not received any updates so I have gone ahead and closed it. If you are
still affected by this or would like to raise the priority of this ticket
please re-open, removing the label "{done_label}" and raise the ticket
priority accordingly.
In my opinion it is important that we close tickets eventually. There are a
lot of tickets (bugs, improvements, tech debt) that over time became
irrelevant, out-of-scope, irreproducible, etc.  In my experience, these
tickets are usually not closed by anyone but the bot.

Cheers,

Konstantin



On Thu, Jun 17, 2021 at 2:17 PM Piotr Nowojski  wrote:

> Hi,
>
> I also think that the bot is a bit too aggressive/too quick with assigning
> stale issues/deprioritizing them, but that's not that big of a deal for me.
>
> What bothers me much more is that it's closing minor issues automatically.
> Depriotising issues makes sense to me. If a wish for improvement or a bug
> report has been opened a long time ago, and they got no attention over the
> time, sure depriotize them. But closing them is IMO a bad idea. Bug might
> be minor, but if it's not fixed it's still there - it shouldn't be closed.
> Closing with "won't fix" should be done for very good reasons and very
> rarely. Same applies to improvements/wishes. Furthermore, very often
> descriptions and comments have a lot of value, and if we keep closing minor
> issues I'm afraid that we end up with:
> - more duplication. I doubt anyone will be looking for prior "closed" bug
> reports/improvement requests. Definitely I'm only looking for open tickets
> when looking if a ticket for XYZ already exists or not
> - we will be losing knowledge
>
> Piotrek
>
> śr., 16 cze 2021 o 15:12 Robert Metzger  napisał(a):
>
> > Very sorry for the delayed response.
> >
> > Regarding tickets with the "test-instability" label (topic 1): I'm
> usually
> > assigning a fixVersion to the next release of the branch where the
> failure
> > occurred, when I'm opening a test failure ticket. Others seem to do that
> > too. Hence my comment that not checking tickets with a fixVersion set by
> > Flink bot is good (because test failures should always stay "Critical"
> > until we've understood what's going on)
> > I see that it is a bit contradicting that Critical test instabilities
> > receive no attention for 14 days, but that seems to be the norm given the
> > current number of incoming test instabilities.
> >
> > On Wed, Jun 16, 2021 at 2:05 PM Till Rohrmann 
> > wrote:
> >
> > > Another example for category 4 would be the ticket where we collect
> > > breaking API changes for Flink 2.0 [1]. The idea behind this ticket is
> to
> > > collect things to consider when developing the next major version.
> > > Admittedly, we have never seen the benefits of collecting the breaking
> > > changes because we haven't started Flink 2.x yet. Also, it is not clear
> > how
> > > relevant these tickets are right now.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-3957
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Jun 16, 2021 at 11:42 AM Konstantin Knauf 
> > > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > thank you for all the feedback so far. I believe we have four
> different
> > > > topics by now:
> > > >
> > > > 1 about *test-instability tickets* raised by Robert. Waiting for
> > feedback
> > > > by Robert.
> > > >
> > > > 2 about *aggressiveness of stale-assigned *rule raised by Timo.
> Waiting
> > > > for feedback by Timo and others.
> > > >
> > > > 3 about *excluding issues with a fixVersion* raised by Konstantin,
> > Till.
> > > > Waiting for more feedback by the community as it involves general
> > changes
> > > > to how we deal with fixVersion.
> > > >
> > > > 4 about *excluding issues with a specific-label* raised by Arvid.
> > > >
> > > > I've already written something about 1-3. Regarding 4:
> > > >
> > > > How do we make sure that these don't become stale? I think, there
> have
> > > > been a few "long-term efforts" in the past that never got the
> attention
> > > > that we initially wanted. Is this just about the ability to collect
> > > tickets
> > > > under an umbrella to document a future effort? Maybe for the example
> of
> > > > DataStream replacing DataSet how would this look like in Jira?
> > > >
> > > > Cheers,
> > > >
> > > > Konstantin
> > > >
> > > >
> > > > On Tue, Jun 8, 2021 at 11:31 AM Till Rohrmann 
> > > > wrote:
> > > >
> > > >> I like this idea. It would then be the responsibility of the
> component
> > > >> maintainers to manage the lifecycle explicitly.
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >> On Mon, Jun 7, 2021 at 1:48 PM Arvid Heise 
> wrote:
> > > >>
> > > >> > One more idea for the bot. Could we have a label to exclude
> certain
> > > >> tickets
> > > >> > from the life-cycle?
> > > >> >
> > > >> > I'm 

Re: [ANNOUNCE] New PMC member: Xintong Song

2021-06-22 Thread Matthias Pohl
Congratulations, Xintong!

On Mon, Jun 21, 2021 at 5:28 AM Zhilong Hong  wrote:

> Congratulations, Xintong!
>
> Dawid Wysakowicz  于2021年6月16日周三 下午5:23写道:
>
> > Hi all!
> >
> > I'm very happy to announce that Xintong Song has joined the Flink PMC!
> >
> > Congratulations and welcome Xintong!
> >
> > Best,
> > Dawid
> >


Re: [DISCUSS] Drop Mesos in 1.14

2021-06-22 Thread Matthias Pohl
+1 for dropping Mesos support. There was no feedback opposing the direction
from the community in the most-recent discussion [1,2] on deprecating it.

Matthias

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Deprecating-Mesos-support-td50142.html

On Wed, Jun 23, 2021 at 4:21 AM Yangze Guo  wrote:

> +1 for dropping if there is no strong demand from the community.
>
> I'm willing to help with the removal of e2e tests part.
>
> Best,
> Yangze Guo
>
> On Wed, Jun 23, 2021 at 10:09 AM Xintong Song 
> wrote:
> >
> > +1 for dropping.
> >
> > I like Seth's idea. I don't have any real Mesos experience either.
> > According to this article [1], it looks like we can deploy a standalone
> > cluster on Mesos similar to Kubernetes. However, we should only do it if
> > there's indeed a strong demand from the community for deploying a
> > latest version of Flink on Mesos.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> > [1] https://www.baeldung.com/ops/mesos-kubernetes-comparison
> >
> > On Tue, Jun 22, 2021 at 11:59 PM Israel Ekpo 
> wrote:
> >
> > > I am in favor of dropping the support for Mesos.
> > >
> > > In terms of the landscape for users leveraging Mesos for the kind of
> > > workloads Flink is used, I think it is on the decline.
> > >
> > > +1 from me
> > >
> > > On Tue, Jun 22, 2021 at 11:32 AM Seth Wiesman 
> wrote:
> > >
> > > > Sorry if this is a naive question, I don't have any real Mesos
> > > experience.
> > > > Is it possible to deploy a standalone cluster on top of Mesos in the
> same
> > > > way you can with Kubernetes? If so, and there is still Mesos demand
> from
> > > > the community, we could document that process as the recommended
> > > deployment
> > > > mode going forward.
> > > >
> > > > Seth
> > > >
> > > > On Tue, Jun 22, 2021 at 5:02 AM Arvid Heise 
> wrote:
> > > >
> > > > > +1 for dropping. Frankly speaking, I don't see it having any future
> > > (and
> > > > > D2iQ
> > > > > agrees).
> > > > >
> > > > > If there is a surprisingly huge demand, I'd try to evaluate
> plugins for
> > > > it.
> > > > >
> > > > > On Tue, Jun 22, 2021 at 11:46 AM Till Rohrmann <
> trohrm...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I'd be ok with dropping support for Mesos if it helps us to
> clear our
> > > > > > dependencies in the flink-runtime module. If we do it, then we
> should
> > > > > > probably update our documentation with a pointer to the latest
> Flink
> > > > > > version that supports Mesos in case of users strictly need Mesos.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Tue, Jun 22, 2021 at 10:29 AM Chesnay Schepler <
> > > ches...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Last week I spent some time looking into making flink-runtime
> scala
> > > > > > > free, which effectively means to move the Akka-reliant classes
> to
> > > > > > > another module, and load that module along with Akka and all of
> > > it's
> > > > > > > dependencies (including Scala) through a separate classloader.
> > > > > > >
> > > > > > > This would finally decouple the Scala versions required by the
> > > > runtime
> > > > > > > and API, and would allow us to upgrade Akka as we'd no longer
> be
> > > > > limited
> > > > > > > to Scala 2.11. It would rid the classpath of a few
> dependencies,
> > > and
> > > > > > > remove the need for scala suffixes on quite a few modules.
> > > > > > >
> > > > > > > However, our Mesos support has unfortunately a hard dependency
> on
> > > > Akka,
> > > > > > > which naturally does not play well with the goal of isolating
> Akka
> > > in
> > > > > > > it's own ClassLoader.
> > > > > > >
> > > > > > > To solve this issue I was thinking of simple dropping
> flink-mesos
> > > in
> > > > > > > 1.14 (it was deprecated in 1.13).
> > > > > > >
> > > > > > > Truth be told, I picked this option because it is the easiest
> to
> > > do.
> > > > We
> > > > > > > _could_ probably make things work somehow (likely by shipping a
> > > > second
> > > > > > > Akka version just for flink-mesos), but it doesn't seem worth
> the
> > > > > hassle
> > > > > > > and would void some of the benefits. So far we kept flink-mesos
> > > > around,
> > > > > > > despite not really developing it further, because it didn't
> hurt to
> > > > > have
> > > > > > > it in still in Flink, but this has now changed.
> > > > > > >
> > > > > > > Please tell me what you think.
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >


[jira] [Created] (FLINK-23108) KafkaITCase.testCancelingEmptyTopic fails on azure

2021-06-22 Thread Xintong Song (Jira)
Xintong Song created FLINK-23108:


 Summary: KafkaITCase.testCancelingEmptyTopic fails on azure
 Key: FLINK-23108
 URL: https://issues.apache.org/jira/browse/FLINK-23108
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.13.1
Reporter: Xintong Song
 Fix For: 1.13.2


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=19337&view=logs&j=b0097207-033c-5d9a-b48c-6d4796fbe60d&t=e8fcc430-213e-5cce-59d4-6942acf09121&l=6607

{code}
Jun 23 02:01:27 [INFO] Running 
org.apache.flink.streaming.connectors.kafka.KafkaITCase
java.lang.Exception: Topic still exists after timeout
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.tryDelete(KafkaTestEnvironmentImpl.java:206)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.deleteTestTopic(KafkaTestEnvironmentImpl.java:174)
at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.deleteTestTopic(KafkaTestBase.java:216)
at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runCancelingOnEmptyInputTest(KafkaConsumerTestBase.java:1243)
at 
org.apache.flink.streaming.connectors.kafka.KafkaITCase.testCancelingEmptyTopic(KafkaITCase.java:83)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
Jun 23 02:04:27 [ERROR] Tests run: 23, Failures: 1, Errors: 0, Skipped: 0, Time 
elapsed: 179.821 s <<< FAILURE! - in 
org.apache.flink.streaming.connectors.kafka.KafkaITCase
Jun 23 02:04:27 [ERROR] 
testCancelingEmptyTopic(org.apache.flink.streaming.connectors.kafka.KafkaITCase)
  Time elapsed: 38 s  <<< FAILURE!
Jun 23 02:04:27 java.lang.AssertionError: Delete test topic : 
cancelingOnEmptyInputTopic failed, Topic still exists after timeout
Jun 23 02:04:27 at org.junit.Assert.fail(Assert.java:88)
Jun 23 02:04:27 at 
org.apache.flink.streaming.connectors.kafka.KafkaTestEnvironmentImpl.deleteTestTopic(KafkaTestEnvironmentImpl.java:177)
Jun 23 02:04:27 at 
org.apache.flink.streaming.connectors.kafka.KafkaTestBase.deleteTestTopic(KafkaTestBase.java:216)
Jun 23 02:04:27 at 
org.apache.flink.streaming.connectors.kafka.KafkaConsumerTestBase.runCancelingOnEmptyInputTest(KafkaConsumerTestBase.java:1243)
Jun 23 02:04:27 at 
org.apache.flink.streaming.connectors.kafka.KafkaITCase.testCancelingEmptyTopic(KafkaITCase.java:83)
Jun 23 02:04:27 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jun 23 02:04:27 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jun 23 02:04:27 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jun 23 02:04:27 at java.lang.reflect.Method.invoke(Method.java:498)
Jun 23 02:04:27 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
Jun 23 02:04:27 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
Jun 23 02:04:27 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
Jun 23 02:04:27 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
Jun 23 02:04:27 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
Jun 23 02:04:27 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
Jun 23 02:04:27 at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
Jun 23 02:04:27 at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Re: [ANNOUNCE] New PMC member: Arvid Heise

2021-06-22 Thread Matthias Pohl
Congratulations, Arvid! :-)

On Thu, Jun 17, 2021 at 9:02 AM Arvid Heise  wrote:

> Thank you for your trust and support.
>
> Arvid
>
> On Thu, Jun 17, 2021 at 8:39 AM Roman Khachatryan 
> wrote:
>
> > Congratulations!
> >
> > Regards,
> > Roman
> >
> > On Thu, Jun 17, 2021 at 5:56 AM Xingbo Huang  wrote:
> > >
> > > Congratulations, Arvid!
> > >
> > > Best,
> > > Xingbo
> > >
> > > Yun Tang  于2021年6月17日周四 上午10:49写道:
> > >
> > > > Congratulations, Arvid
> > > >
> > > > Best
> > > > Yun Tang
> > > > 
> > > > From: Yun Gao 
> > > > Sent: Thursday, June 17, 2021 10:46
> > > > To: Jingsong Li ; dev 
> > > > Subject: Re: Re: [ANNOUNCE] New PMC member: Arvid Heise
> > > >
> > > > Congratulations, Arvid!
> > > >
> > > > Best,
> > > > Yun
> > > >
> > > >
> > > > --
> > > > Sender:Jingsong Li
> > > > Date:2021/06/17 10:41:29
> > > > Recipient:dev
> > > > Theme:Re: [ANNOUNCE] New PMC member: Arvid Heise
> > > >
> > > > Congratulations, Arvid!
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Thu, Jun 17, 2021 at 6:52 AM Matthias J. Sax 
> > wrote:
> > > >
> > > > > Congrats!
> > > > >
> > > > > On 6/16/21 6:06 AM, Leonard Xu wrote:
> > > > > > Congratulations, Arvid!
> > > > > >
> > > > > >
> > > > > >> 在 2021年6月16日,20:08,Till Rohrmann  写道:
> > > > > >>
> > > > > >> Congratulations, Arvid!
> > > > > >>
> > > > > >> Cheers,
> > > > > >> Till
> > > > > >>
> > > > > >> On Wed, Jun 16, 2021 at 1:47 PM JING ZHANG <
> beyond1...@gmail.com>
> > > > > wrote:
> > > > > >>
> > > > > >>> Congratulations, Arvid!
> > > > > >>>
> > > > > >>> Nicholas Jiang  于2021年6月16日周三 下午7:25写道:
> > > > > >>>
> > > > >  Congratulations, Arvid!
> > > > > 
> > > > > 
> > > > > 
> > > > >  --
> > > > >  Sent from:
> > > > > >>>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > 
> > > > > >>>
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > > >
> > > >
> >
>


CONTENTS DELETED in nabble frontend

2021-06-22 Thread Matthias Pohl
Hi everyone,
Is it only me or does anyone else have the same problem with messages being
not available anymore in the nabble frontend? I get multiple messages like
the following one for individual messages:
> CONTENTS DELETED
> The author has deleted this message.

This appears for instance in [1], where all the messages are deleted except
for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited to
the dev mailing list but also seem to appear in the user mailing list (e.g.
[2]).

Logging into nabble doesn't solve the problem. I'd assume that it's some
infrastructure issue rather than people collectively deleting their
messages. But a Google search wasn't of any help.

Matthias

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
[2]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973


[jira] [Created] (FLINK-23109) Translate "Scala API Extensions" pages into Chinese

2021-06-22 Thread Zhiliang Zhang (Jira)
Zhiliang Zhang created FLINK-23109:
--

 Summary: Translate "Scala API Extensions" pages into Chinese
 Key: FLINK-23109
 URL: https://issues.apache.org/jira/browse/FLINK-23109
 Project: Flink
  Issue Type: Improvement
  Components: chinese-translation, Documentation
Affects Versions: 1.13.0
Reporter: Zhiliang Zhang


Translate the internal page 
"https://ci.apache.org/projects/flink/flink-docs-release-1.13/zh/docs/dev/datastream/scala_api_extensions/";
 into Chinese.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: CONTENTS DELETED in nabble frontend

2021-06-22 Thread Yangze Guo
It seems the post will remain iff it is sent by a registered email. I
do not register nabble in user ML and my post is deleted in [1].

[1] 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html

Best,
Yangze Guo

On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl  wrote:
>
> Hi everyone,
> Is it only me or does anyone else have the same problem with messages being
> not available anymore in the nabble frontend? I get multiple messages like
> the following one for individual messages:
> > CONTENTS DELETED
> > The author has deleted this message.
>
> This appears for instance in [1], where all the messages are deleted except
> for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited to
> the dev mailing list but also seem to appear in the user mailing list (e.g.
> [2]).
>
> Logging into nabble doesn't solve the problem. I'd assume that it's some
> infrastructure issue rather than people collectively deleting their
> messages. But a Google search wasn't of any help.
>
> Matthias
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
> [2]
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973


[jira] [Created] (FLINK-23110) Modernize Flink UI

2021-06-22 Thread Jira
Ingo Bürk created FLINK-23110:
-

 Summary: Modernize Flink UI
 Key: FLINK-23110
 URL: https://issues.apache.org/jira/browse/FLINK-23110
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Reporter: Ingo Bürk


The current Flink UI has fallen out of a little bit into disrepair. It is using 
Angular 7 (current version as of writing is 12) which has been EOL for a while. 
With it, other dependencies (node, ng-zorro) have become out of date as well, 
which also introduces vulnerabilities in various dependencies.

This issue is an umbrella issue to collect various tasks which should be done 
in order to modernize Flink UI (again).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23111) Update Angular

2021-06-22 Thread Jira
Ingo Bürk created FLINK-23111:
-

 Summary: Update Angular
 Key: FLINK-23111
 URL: https://issues.apache.org/jira/browse/FLINK-23111
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Ingo Bürk


Angular 7 (currently used by Flink UI) is already EOL. In order to stay up to 
date with improvements to the framework, browser support, security and 
dependencies, we need to update it to the latest version.

This also involves updating various dependencies like node, ng-zorro, 
TypeScript, RxJs, …



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: CONTENTS DELETED in nabble frontend

2021-06-22 Thread Yangze Guo
Ahh. It seems nabble has updated mailing lists to regular forums this week[1].

[1] 
http://support.nabble.com/Mailing-Lists-will-be-updated-to-regular-forums-next-week-td7609458.html

Best,
Yangze Guo

On Wed, Jun 23, 2021 at 2:37 PM Yangze Guo  wrote:
>
> It seems the post will remain iff it is sent by a registered email. I
> do not register nabble in user ML and my post is deleted in [1].
>
> [1] 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/after-upgrade-flink1-12-to-flink1-13-1-flink-web-ui-s-taskmanager-detail-page-error-tt44391.html
>
> Best,
> Yangze Guo
>
> On Wed, Jun 23, 2021 at 2:16 PM Matthias Pohl  wrote:
> >
> > Hi everyone,
> > Is it only me or does anyone else have the same problem with messages being
> > not available anymore in the nabble frontend? I get multiple messages like
> > the following one for individual messages:
> > > CONTENTS DELETED
> > > The author has deleted this message.
> >
> > This appears for instance in [1], where all the messages are deleted except
> > for Till Rohrmann's, Yangze Guo's and mine. This issue is not limited to
> > the dev mailing list but also seem to appear in the user mailing list (e.g.
> > [2]).
> >
> > Logging into nabble doesn't solve the problem. I'd assume that it's some
> > infrastructure issue rather than people collectively deleting their
> > messages. But a Google search wasn't of any help.
> >
> > Matthias
> >
> > [1]
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/SURVEY-Remove-Mesos-support-td45974.html
> > [2]
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/1-1-snapshot-issues-td6971.html#a6973


[jira] [Created] (FLINK-23112) Formatting should be enforced for Flink UI

2021-06-22 Thread Jira
Ingo Bürk created FLINK-23112:
-

 Summary: Formatting should be enforced for Flink UI
 Key: FLINK-23112
 URL: https://issues.apache.org/jira/browse/FLINK-23112
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Ingo Bürk


There is some formatting infrastructure in place, however this currently 
doesn't seem to work or at least be enforced in the CI.

We should make sure that formatting works "out of the box" as much as possible, 
and definitely ensure that the CI enforces consistent formatting.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23113) Linting for Flink UI

2021-06-22 Thread Jira
Ingo Bürk created FLINK-23113:
-

 Summary: Linting for Flink UI
 Key: FLINK-23113
 URL: https://issues.apache.org/jira/browse/FLINK-23113
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Ingo Bürk


There is some linting setup (using TSLint). We should revisit the current 
configuration and improve it for better type-safety. Also, we need to make sure 
linting is enforced in the CI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23114) Address vulnerabilities in Flink UI

2021-06-22 Thread Jira
Ingo Bürk created FLINK-23114:
-

 Summary: Address vulnerabilities in Flink UI
 Key: FLINK-23114
 URL: https://issues.apache.org/jira/browse/FLINK-23114
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Reporter: Ingo Bürk


We should at least run npm audit and address any current, open vulnerabilities.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-172: Support custom transactional.id prefix in FlinkKafkaProducer

2021-06-22 Thread Stephan Ewen
The motivation and the proposal sound good to me, +1 from my side.

Would be good to have a quick opinion from someone who worked specifically
with Kafka, maybe Becket or Piotr?

Best,
Stephan


On Sat, Jun 12, 2021 at 9:50 AM Wenhao Ji  wrote:

> Hi everyone,
>
> I would like to open this discussion thread to take about the FLIP-172
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-172%3A+Support+custom+transactional.id+prefix+in+FlinkKafkaProducer
> >,
> which aims to provide a way to support specifying a custom
> transactional.id
> in the FlinkKafkaProducer class.
>
> I am looking forwards to your feedback and suggestions!
>
> Thanks,
> Wenhao
>