[jira] [Created] (FLINK-35417) JobManager and TaskManager support merging and run in a single process

2024-05-21 Thread melin (Jira)
melin created FLINK-35417:
-

 Summary: JobManager and TaskManager support merging and run in a 
single process
 Key: FLINK-35417
 URL: https://issues.apache.org/jira/browse/FLINK-35417
 Project: Flink
  Issue Type: Improvement
  Components: API / Core
Reporter: melin


Flink is widely used in data integration scenarios where parallelism is low; in 
many cases a single parallel instance can run the whole task. For high 
availability, application mode is typically used, but the large number of 
JobManager nodes costs a lot of resources. If session mode is used instead, 
stability suffers.
In application mode, the JobManager and TaskManager could be merged and run in a 
single process, achieving reliability while saving resources.
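For context, Flink already runs a JobManager and TaskManager inside a single JVM 
for local execution; below is a minimal sketch using the existing local 
environment (this ticket would essentially promote that layout to a supported 
production deployment mode):

{code}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SingleProcessExample {
    public static void main(String[] args) throws Exception {
        // createLocalEnvironment() starts the JobManager and TaskManager
        // inside the current JVM, i.e. a single-process Flink runtime.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(1);
        env.fromElements("a", "b", "c").print();
        env.execute("single-process-job");
    }
}
{code}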

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] The performance of serializerHeavyString starts regress since April 3

2024-05-21 Thread Zakelly Lan
Hi Rui and RMs of Flink 1.20,

Thanks for driving this!

Available information indicates this issue is environment- and
JDK-specific, and I also failed to reproduce it on my Mac. Thus I guess it
is caused by JIT behavior, which is unpredictable and sensitive to
changes in the codebase. Considering the historical context of this
test provided by Piotr, I vote for a "Won't fix" for this problem.

I can offer some help if anyone wants to investigate the benchmark
environment; please reach out to me. JDK version info:

> openjdk version "11.0.19" 2023-04-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.19.0.7-2) (build 11.0.19+7-LTS,
> mixed mode, sharing)

The OS version is Alibaba Cloud Linux 3.2104 LTS 64-bit[1]. The linux
kernel version is 5.10.134-15.al8.x86_64.


Best,
Zakelly

[1]
https://www.alibabacloud.com/help/en/alinux/product-overview/release-notes-for-alibaba-cloud-linux
(See: Alibaba Cloud Linux 3.2104 U8, image id:
aliyun_3_x64_20G_alibase_20230727.vhd)

On Tue, May 21, 2024 at 8:15 PM Piotr Nowojski  wrote:

> Hi,
>
> Given what you wrote, that you have investigated the issue and couldn't
> find any easy explanation, I would suggest closing this ticket as "Won't
> do" or "Can not reproduce" and ignoring the problem.
>
> In the past there have been quite a bit of cases where some benchmark
> detected a performance regression. Sometimes those can not be reproduced,
> other times (as it's the case here), some seemingly unrelated change is
> causing the regression. The same thing happened in this benchmark many
> times in the past [1], [2], [3], [4]. Generally speaking this benchmark has
> been in the spotlight a couple of times [5].
>
> Note that there have been cases where this benchmark did detect a
> performance regression :)
>
> My personal suspicion is that after that commons-io version bump,
> something poked JVM/JIT to compile the code a bit differently for string
> serialization causing this regression. We have a couple of benchmarks that
> seem to be prone to such semi intermittent issues. For example the same
> benchmark was subject to this annoying pattern [6], that I've spotted in
> quite a bit of benchmarks over the years [6]:
>
> (benchmark timeline screenshot: https://imgur.com/a/AoygmWS)
>
> Where benchmark results are very stable within a single JVM fork, but
> between two forks they can reach two different "stable" levels. Here it
> looks like there is a 50% chance of getting a stable "200 records/ms" and a
> 50% chance of "250 records/ms".
>
> A small interlude. Each of our benchmarks runs in 3 different JVM forks, with
> 10 warm-up iterations and 10 measurement iterations. Each iteration
> invokes the benchmarking method for at least one second. So by "very
> stable" results, I mean that for example after the 2nd or 3rd warm-up
> iteration, the results stabilize to within +/-1%, and stay on that level for
> the whole duration of the fork. (A JMH sketch of this setup follows the
> list below.)
>
> Given that we are repeating the same benchmark in 3 different forks, we
> can have by pure chance:
> - 3 slow forks - total average 200 records/ms
> - 2 slow forks, 1 fast fork - average 216 r/ms
> - 1 slow fork, 2 fast forks - average 233 r/ms
> - 3 fast forks - average 250 r/ms
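>
> For illustration, that fork/iteration setup corresponds roughly to the
> following JMH annotations (the class name and benchmark body here are made
> up, not the actual flink-benchmarks code):
>
> import org.openjdk.jmh.annotations.*;
> import org.openjdk.jmh.infra.Blackhole;
>
> @BenchmarkMode(Mode.Throughput)
> @Fork(3)                                // 3 separate JVM forks
> @Warmup(iterations = 10, time = 1)      // 10 warm-up iterations, >= 1 s each
> @Measurement(iterations = 10, time = 1) // 10 measured iterations, >= 1 s each
> public class SerializerBenchmarkSketch {
>     @Benchmark
>     public void serializeHeavyString(Blackhole bh) {
>         bh.consume("...");              // placeholder for the real workload
>     }
> }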
>
> So this benchmark is susceptible to entering different semi-stable
> states. As I wrote above, I guess something in the commons-io version
> bump just swayed it to a different semi-stable state :( I have never gotten
> desperate enough to actually dig into what exactly causes this kind
> of issue.
>
> Best,
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/FLINK-18684
> [2] https://issues.apache.org/jira/browse/FLINK-27133
> [3] https://issues.apache.org/jira/browse/FLINK-27165
> [4] https://issues.apache.org/jira/browse/FLINK-31745
> [5]
> https://issues.apache.org/jira/browse/FLINK-35040?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20Resolved%2C%20Closed)%20AND%20text%20~%20%22serializerHeavyString%22
> [6]
> http://flink-speed.xyz/timeline/#/?exe=1=serializerHeavyString=on=on=off=2=1000
>
> wt., 21 maj 2024 o 12:50 Rui Fan <1996fan...@gmail.com> napisał(a):
>
>> Hi devs:
>>
>> We (the release managers of Flink 1.20) want to report one performance
>> regression to the Flink dev mailing list.
>>
>> # Background:
>>
>> The performance of serializerHeavyString has regressed since April 3,
>> and we created FLINK-35040[1] to track it.
>>
>> In brief:
>> - The performance only regresses on JDK 11; Java 8 and Java 17 are
>> fine.
>> - The regression is caused by upgrading the commons-io version from 2.11.0
>> to 2.15.1
>>   - This upgrade was done in FLINK-34955[2].
>>   - The performance recovers after reverting the commons-io version
>> to 2.11.0
>>
>> You can get more details from FLINK-35040[1].
>>
>> # Problem
>>
>> We tried to generate the flame graph (wall mode) to analyze why upgrading
>> the commons-io version affects the performance. 

[jira] [Created] (FLINK-35416) Weekly CI for ElasticSearch connector failed to compile

2024-05-21 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-35416:
--

 Summary: Weekly CI for ElasticSearch connector failed to compile
 Key: FLINK-35416
 URL: https://issues.apache.org/jira/browse/FLINK-35416
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ElasticSearch
Reporter: Weijie Guo
Assignee: Weijie Guo


ElasticsearchSinkBaseITCase.java:[31,65] package 
org.apache.flink.shaded.guava30.com.google.common.collect does not exist
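
The failing import follows the pattern below; the usual fix is to align the
shaded Guava package with the flink-shaded version actually on the classpath
(the relocated version number and imported class here are assumptions, not
taken from the connector code):

{code}
// Fails to compile when built against a newer flink-shaded that no
// longer ships the guava30 relocation:
import org.apache.flink.shaded.guava30.com.google.common.collect.Lists;

// Assumed fix: reference the relocation that is actually present, e.g.:
import org.apache.flink.shaded.guava31.com.google.common.collect.Lists;
{code}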



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-451: Refactor Async sink API

2024-05-21 Thread Leonard Xu
Thanks for your reply, Ahmed.

> (2) The FLIP-451 aims to introduce a timeout configuration, but I didn’t
>> find the configuration in FLIP even I lookup some historical versions of
>> the FLIP. Did I miss some key informations?
>> 
> 
> Yes, I tried to implicitly indicate that it will be added to the existing
> AsyncSinkWriterConfiguration so as not to inflate the FLIP, but I get that it
> might be confusing. I have added the changes to the configuration classes in
> the FLIP to make it clearer.

(1) Implicitly pointing at a public API change is not enough. Could you add a 
Public Interfaces section that enumerates all public APIs you propose or 
change?
It's a standard part of the FLIP template[1]. 

(2) About the proposed public interface ResultHandler, could you 
explain or show how to use the methods #completeExceptionally and 
#retryForEntries? I didn't find a
detailed explanation or usage example code to understand them.

(3) Could you add the necessary Javadoc for all public API changes, like the new 
method AsyncSinkWriterConfiguration#setRequestTimeoutMs? The Javadoc of [2] 
is a good example.

(4) Another minor reminder: AsyncSinkBase is a @PublicEvolving interface too, 
please correct it, and please ensure that backward compatibility has been 
considered for all public interfaces the FLIP changes.


Best,
Leonard
[1]https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
[2]https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink


> 
> 
> On Tue, 21 May 2024 at 14:56, Leonard Xu  wrote:
> 
>> Thanks Ahmed for kicking off this discussion, sorry for jumping the
>> discussion late.
>> 
>> (1)I’m confused about the discuss thread name ‘FLIP-451: Refactor Async
>> sink API’  and FLIP title/vote thread name '
>> FLIP-451: Introduce timeout configuration to AsyncSink API <
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API>’,
>> they are different for me. Could you help explain the change history?
>> 
>> (2) The FLIP-451 aims to introduce a timeout configuration, but I didn’t
>> find the configuration in FLIP even I lookup some historical versions of
>> the FLIP. Did I miss some key informations?
>> 
>> (3) About the code change part, there’re some un-complete pieces in
>> AsyncSinkWriter for example `submitRequestEntries(List
>> requestEntries,);` is incorrect and `sendTime` variable I didn’t
>> find the place we define it and where we use it.
>> 
>> Sorry for jumping the discussion thread during vote phase again.
>> 
>> Best,
>> Leonard
>> 
>> 
>>> 2024年5月21日 下午3:49,Ahmed Hamdy  写道:
>>> 
>>> Hi Hong,
>>> Thanks for pointing that out, no we are not
>>> deprecating getFatalExceptionCons(). I have updated the FLIP
>>> Best Regards
>>> Ahmed Hamdy
>>> 
>>> 
>>> On Mon, 20 May 2024 at 15:40, Hong Liang  wrote:
>>> 
 Hi Ahmed,
 Thanks for putting this together! Should we still be marking
 getFatalExceptionCons() as @Deprecated in this FLIP, if we are not
 providing a replacement?
 
 Regards,
 Hong
 
 On Mon, May 13, 2024 at 7:58 PM Ahmed Hamdy 
>> wrote:
 
> Hi David,
> yes there error classification was initially left to sink implementers
>> to
> handle while we provided utilities to classify[1] and bubble up[2]
>> fatal
> exceptions to avoid retrying them.
> Additionally some sink implementations provide an option to short
>> circuit
> the failures by exposing a `failOnError` flag as in
 KinesisStreamsSink[3],
> however this FLIP scope doesn't include any changes for retry
>> mechanisms.
> 
> 1-
> 
> 
 
>> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java#L32
> 2-
> 
> 
 
>> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L533
> 3-
> 
> 
 
>> https://github.com/apache/flink-connector-aws/blob/c6e0abb65a0e51b40dd218b890a111886fbf797f/flink-connector-aws/flink-connector-aws-kinesis-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSinkWriter.java#L106
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Mon, 13 May 2024 at 16:20, David Radley 
> wrote:
> 
>> Hi,
>> I wonder if the way that the async request fails could be a retriable
 or
>> non-retriable error, so it would retry only for retriable (transient)
>> errors (like IOExceptions) . I see some talk on the internet around
>> retriable SQL errors.
>> If this was the case then we may need configuration to limit the
 number
>> of retries of retriable errors.
>>   Kind regards, David
>> 
>> 
>> From: Muhammet Orazov 
>> Date: Monday, 13 May 2024 at 

Re: [VOTE] Release flink-connector-cassandra v3.2.0, release candidate #1

2024-05-21 Thread Hang Ruan
+1 (non-binding)

- Validated checksum hash
- Verified signature
- Verified that no binaries exist in the source archive
- Built the source with Maven and JDK 8
- Verified web PR
- Checked that the jar is built with JDK 8 (see the sketch below)
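
For that last check, one option besides reading the jar manifest is to inspect
the class-file major version (52 corresponds to Java 8); a rough sketch,
assuming any class entry in the jar is representative:

import java.io.DataInputStream;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class ClassVersionCheck {
    public static void main(String[] args) throws Exception {
        try (JarFile jar = new JarFile(args[0])) {
            JarEntry entry = jar.stream()
                    .filter(e -> e.getName().endsWith(".class"))
                    .findFirst()
                    .orElseThrow(IllegalStateException::new);
            try (DataInputStream in =
                    new DataInputStream(jar.getInputStream(entry))) {
                in.readInt();           // magic number 0xCAFEBABE
                in.readUnsignedShort(); // minor version
                int major = in.readUnsignedShort();
                // 52 = Java 8, 55 = Java 11, 61 = Java 17
                System.out.println(entry.getName() + " -> major " + major);
            }
        }
    }
}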

Best,
Hang

Muhammet Orazov  于2024年5月22日周三 04:15写道:

> Hey all,
>
> Could we please get some more votes to proceed with the release?
>
> Thanks and best,
> Muhammet
>
> On 2024-04-22 13:04, Danny Cranmer wrote:
> > Hi everyone,
> >
> > Please review and vote on release candidate #1 for
> > flink-connector-cassandra v3.2.0, as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > This release supports Flink 1.18 and 1.19.
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to dist.apache.org
> > [2],
> > which are signed with the key with fingerprint 125FD8DB [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v3.2.0-rc1 [5],
> > * website pull request listing the new release [6].
> > * CI build of the tag [7].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Danny
> >
> > [1]
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353148
> > [2]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.2.0-rc1
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1722
> > [5]
> >
> https://github.com/apache/flink-connector-cassandra/releases/tag/v3.2.0-rc1
> > [6] https://github.com/apache/flink-web/pull/737
> > [7]
> >
> https://github.com/apache/flink-connector-cassandra/actions/runs/8784310241
>


[jira] [Created] (FLINK-35415) CDC Fails to create sink with Flink 1.19

2024-05-21 Thread yux (Jira)
yux created FLINK-35415:
---

 Summary: CDC Fails to create sink with Flink 1.19
 Key: FLINK-35415
 URL: https://issues.apache.org/jira/browse/FLINK-35415
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: cdc-3.1.0
Reporter: yux
 Fix For: cdc-3.2.0


Currently, Flink CDC doesn't work with Flink 1.19, failing with the following exception:

Exception in thread "main" java.lang.NoSuchMethodError: 'void 
org.apache.flink.streaming.runtime.operators.sink.CommitterOperatorFactory.(org.apache.flink.api.connector.sink2.TwoPhaseCommittingSink,
 boolean, boolean)'

The reason is that Flink CDC uses a Flink @Internal API, which was changed in 
the 1.19 update.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-cassandra v3.2.0, release candidate #1

2024-05-21 Thread Muhammet Orazov

Hey all,

Could we please get some more votes to proceed with the release?

Thanks and best,
Muhammet

On 2024-04-22 13:04, Danny Cranmer wrote:

Hi everyone,

Please review and vote on release candidate #1 for
flink-connector-cassandra v3.2.0, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This release supports Flink 1.18 and 1.19.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org 
[2],

which are signed with the key with fingerprint 125FD8DB [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.2.0-rc1 [5],
* website pull request listing the new release [6].
* CI build of the tag [7].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12353148
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-cassandra-3.2.0-rc1
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] 
https://repository.apache.org/content/repositories/orgapacheflink-1722

[5]
https://github.com/apache/flink-connector-cassandra/releases/tag/v3.2.0-rc1
[6] https://github.com/apache/flink-web/pull/737
[7]
https://github.com/apache/flink-connector-cassandra/actions/runs/8784310241


Re: [DISCUSS] FLIP-443: Interruptible watermark processing

2024-05-21 Thread Piotr Nowojski
Hi Zakelly and others,

> 1. I'd suggest also providing `isInterruptable()` in `Mail`, and the
> continuation mail will return true. The FLIP-425 will leverage this queue
> to execute some state requests, and when the cp arrives, the operator may
> call `yield()` to drain. It may happen that the continuation mail is
called
> again in `yield()`. By checking `isInterruptable()`, we can skip this mail
> and re-enqueue.

Do you have some suggestions on how `isInterruptible` should be defined?
Do we have to double the number of methods in the `MailboxExecutor` to
provide versions of the existing methods that would enqueue
"interruptible"
versions of mails? Something like:

default void execute(ThrowingRunnable<? extends Exception> command, String description) {
    execute(DEFAULT_OPTIONS, command, description);
}

default void execute(MailOptions options, ThrowingRunnable<? extends Exception> command, String description) {
    execute(options, command, description, EMPTY_ARGS);
}

default void execute(
        ThrowingRunnable<? extends Exception> command,
        String descriptionFormat,
        Object... descriptionArgs) {
    execute(DEFAULT_OPTIONS, command, descriptionFormat, descriptionArgs);
}

void execute(
        MailOptions options,
        ThrowingRunnable<? extends Exception> command,
        String descriptionFormat,
        Object... descriptionArgs);

public static class MailOptions {
    (...)

    public MailOptions() {
    }

    public MailOptions setIsInterruptible() {
        this.isInterruptible = true;
        return this;
    }
}

And usage would be like this:

mailboxExecutor.execute(
        new MailOptions().setIsInterruptible(), () -> { foo(); }, "foo");

?

Best,
Piotrek

czw., 16 maj 2024 o 11:26 Rui Fan <1996fan...@gmail.com> napisał(a):

> Hi Piotr,
>
> > we checked in the firing timers benchmark [1] and we didn't observe any
> > performance regression.
>
> Thanks for the feedback, it's good news to hear that. I didn't notice
> we already have fireProcessingTimers benchmark.
>
> If so, we can follow it after this FLIP is merged.
>
> +1 for this FLIP.
>
> Best,
> Rui
>
> On Thu, May 16, 2024 at 5:13 PM Piotr Nowojski 
> wrote:
>
> > Hi Zakelly,
> >
> > > I'm suggesting skipping the continuation mail during draining of async
> > state access.
> >
> > I see. That makes sense to me now. I will later update the FLIP.
> >
> > > the code path will become more complex after this FLIP
> > due to the addition of shouldInterrupt() checks, right?
> >
> > Yes, that's correct.
> >
> > > If so, it's better to add a benchmark to check whether the job
> > > performance regresses when one job has a lot of timers.
> > > If the performance regresses too much, we need to re-consider it.
> > > Of course, I hope the performance is fine.
> >
> > I had the same concerns when initially David Moravek proposed this
> > solution,
> > but we checked in the firing timers benchmark [1] and we didn't observe
> any
> > performance regression.
> >
> > Best,
> > Piotrek
> >
> > [1] http://flink-speed.xyz/timeline/?ben=fireProcessingTimers=3
> >
> >
> >
> > wt., 7 maj 2024 o 09:47 Rui Fan <1996fan...@gmail.com> napisał(a):
> >
> > > Hi Piotr,
> > >
> > > Overall this FLIP is fine for me. I have a minor concern:
> > > IIUC, the code path will become more complex after this FLIP
> > > due to the addition of shouldInterrupt() checks, right?
> > >
> > > If so, it's better to add a benchmark to check whether the job
> > > performance regresses when one job has a lot of timers.
> > > If the performance regresses too much, we need to re-consider it.
> > > Of course, I hope the performance is fine.
> > >
> > > Best,
> > > Rui
> > >
> > > On Mon, May 6, 2024 at 6:30 PM Zakelly Lan 
> > wrote:
> > >
> > > > Hi Piotr,
> > > >
> > > > I'm saying the scenario where things happen in the following order:
> > > > 1. advance watermark and process timers.
> > > > 2. the cp arrives and interrupts the timer processing, after this the
> > > > continuation mail is in the mailbox queue.
> > > > 3. `snapshotState` is called, where the async state access responses
> > will
> > > > be drained by calling `tryYield()` [1]. —— What if the continuation
> > mail
> > > is
> > > > triggered by `tryYield()`?
> > > >
> > > > I'm suggesting skipping the continuation mail during draining of
> async
> > > > state access.
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://github.com/apache/flink/blob/1904b215e36e4fd48e48ece7ffdf2f1470653130/flink-runtime/src/main/java/org/apache/flink/runtime/asyncprocessing/AsyncExecutionController.java#L305
> > > >
> > > > Best,
> > > > Zakelly
> > > >
> > > >
> > > > On Mon, May 6, 2024 at 6:00 PM Piotr Nowojski 
> > > > wrote:
> > > >
> > > > > Hi Zakelly,
> > > > >
> > > > > Can you elaborate a bit more on what you have in mind? How marking
> > > mails
> > > > as
> > > > > interruptible helps with something? If an incoming async state
> access
> > > > > response comes, it could just request to 

Re: [DISCUSS] FLIP-451: Refactor Async sink API

2024-05-21 Thread Ahmed Hamdy
Hi Leonard,
Thanks for joining the discussion.

> (1)I’m confused about the discuss thread name ‘FLIP-451: Refactor Async
> sink API’  and FLIP title/vote thread name '
> FLIP-451: Introduce timeout configuration to AsyncSink API <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API>’,
> they are different for me. Could you help explain the change history?
>

The discussion started with a wider scope: refactoring the API for handling
results and exceptions as well as adding a timeout configuration. However, the
scope was reduced after the feedback above to focus on the timeout
configuration, since it is the most urgent part and is not tightly coupled with
the remaining suggestions.

(2) The FLIP-451 aims to introduce a timeout configuration, but I didn’t
> find the configuration in FLIP even I lookup some historical versions of
> the FLIP. Did I miss some key informations?
>

Yes, I tried to implicitly indicate that it will be added to the existing
AsyncSinkWriterConfiguration so as not to inflate the FLIP, but I get that it
might be confusing. I have added the changes to the configuration classes in
the FLIP to make it clearer.

(3) About the code change part, there’re some un-complete pieces in
> AsyncSinkWriter for example `submitRequestEntries(List
> requestEntries,);` is incorrect and `sendTime` variable I didn’t
> find the place we define it and where we use it.
>

Thanks for catching that; I fixed the incomplete methods. This should clarify
how the new method is going to be integrated with the rest of the writer.
Regarding the sendTime variable, I have replaced it with requestTimestamp;
this part of the code is otherwise unchanged, to ensure the
existing completeRequest method stays the same.


Best Regards
Ahmed Hamdy


On Tue, 21 May 2024 at 14:56, Leonard Xu  wrote:

> Thanks Ahmed for kicking off this discussion, sorry for jumping the
> discussion late.
>
> (1)I’m confused about the discuss thread name ‘FLIP-451: Refactor Async
> sink API’  and FLIP title/vote thread name '
> FLIP-451: Introduce timeout configuration to AsyncSink API <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API>’,
> they are different for me. Could you help explain the change history?
>
> (2) The FLIP-451 aims to introduce a timeout configuration, but I didn’t
> find the configuration in FLIP even I lookup some historical versions of
> the FLIP. Did I miss some key informations?
>
> (3) About the code change part, there’re some un-complete pieces in
> AsyncSinkWriter for example `submitRequestEntries(List
> requestEntries,);` is incorrect and `sendTime` variable I didn’t
> find the place we define it and where we use it.
>
> Sorry for jumping the discussion thread during vote phase again.
>
> Best,
> Leonard
>
>
> > 2024年5月21日 下午3:49,Ahmed Hamdy  写道:
> >
> > Hi Hong,
> > Thanks for pointing that out, no we are not
> > deprecating getFatalExceptionCons(). I have updated the FLIP
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 20 May 2024 at 15:40, Hong Liang  wrote:
> >
> >> Hi Ahmed,
> >> Thanks for putting this together! Should we still be marking
> >> getFatalExceptionCons() as @Deprecated in this FLIP, if we are not
> >> providing a replacement?
> >>
> >> Regards,
> >> Hong
> >>
> >> On Mon, May 13, 2024 at 7:58 PM Ahmed Hamdy 
> wrote:
> >>
> >>> Hi David,
> >>> yes there error classification was initially left to sink implementers
> to
> >>> handle while we provided utilities to classify[1] and bubble up[2]
> fatal
> >>> exceptions to avoid retrying them.
> >>> Additionally some sink implementations provide an option to short
> circuit
> >>> the failures by exposing a `failOnError` flag as in
> >> KinesisStreamsSink[3],
> >>> however this FLIP scope doesn't include any changes for retry
> mechanisms.
> >>>
> >>> 1-
> >>>
> >>>
> >>
> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java#L32
> >>> 2-
> >>>
> >>>
> >>
> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L533
> >>> 3-
> >>>
> >>>
> >>
> https://github.com/apache/flink-connector-aws/blob/c6e0abb65a0e51b40dd218b890a111886fbf797f/flink-connector-aws/flink-connector-aws-kinesis-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSinkWriter.java#L106
> >>>
> >>> Best Regards
> >>> Ahmed Hamdy
> >>>
> >>>
> >>> On Mon, 13 May 2024 at 16:20, David Radley 
> >>> wrote:
> >>>
>  Hi,
>  I wonder if the way that the async request fails could be a retriable
> >> or
>  non-retriable error, so it would retry only for retriable (transient)
>  errors (like IOExceptions) . I see some talk on the internet around
>  retriable SQL errors.
>  If 

[jira] [Created] (FLINK-35414) Cancel jobs through rest api for last-state upgrades

2024-05-21 Thread Gyula Fora (Jira)
Gyula Fora created FLINK-35414:
--

 Summary: Cancel jobs through rest api for last-state upgrades
 Key: FLINK-35414
 URL: https://issues.apache.org/jira/browse/FLINK-35414
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Gyula Fora
Assignee: Gyula Fora


The Kubernetes operator currently always deletes the JM deployment directly 
during last-state upgrades instead of attempting any type of graceful shutdown.

We could improve the last-state upgrade logic to cancel the job in cases where 
the JM is healthy, and then simply extract the last checkpoint info through the 
REST API as we already do for terminal job states (see the sketch below).

This would allow the last-state upgrade mode to work even for session jobs, and 
it may even eliminate a few corner cases that can result from the current 
forceful upgrade mechanism. 
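
For illustration, the cancel-then-read-checkpoint flow could look roughly like 
this with Flink's REST client (simplified; the cluster id, error handling, and 
the operator's own abstractions are omitted or assumed):

{code}
import org.apache.flink.api.common.JobID;
import org.apache.flink.client.program.rest.RestClusterClient;
import org.apache.flink.configuration.Configuration;

public class GracefulLastStateUpgradeSketch {

    static void cancelAndRecordLastCheckpoint(Configuration conf, JobID jobId)
            throws Exception {
        try (RestClusterClient<String> client =
                new RestClusterClient<>(conf, "cluster-id")) {
            // 1. Gracefully cancel instead of deleting the JM deployment.
            client.cancel(jobId).get();
            // 2. Read the latest checkpoint via the REST API
            //    (GET /jobs/<jobid>/checkpoints), as is already done for
            //    terminal job states, and record it as the last-state.
        }
    }
}
{code}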



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Add a JDBC Sink Plugin to Flink-CDC-Pipeline

2024-05-21 Thread Leonard Xu
Thanks Jerry for kicking off this thread. The idea makes sense to me: a JDBC 
sink is a real user need, and the Flink CDC project should support it soon.

Could you share your design doc (FLIP) first[1]? Then we can continue the 
design discussion.

Please feel free to ping me if you have any concerns about FLIP process or 
Flink CDC design part.

Best,
Leonard
[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template 


> 2024年5月15日 下午3:06,Jerry  写道:
> 
> Hi all
> My name is ZhengjunZhou, a user and developer of FlinkCDC. In my recent
> projects, I realized that we could enhance the capabilities of
> Flink-CDC-Pipeline by introducing a JDBC Sink plugin, enabling FlinkCDC to
> directly output change data capture (CDC) to various JDBC-supported
> database systems.
> 
> Currently, while FlinkCDC offers support for a wide range of data sources,
> there is no direct solution for sinks, especially for common relational
> databases. I believe that adding a JDBC Sink plugin will significantly
> boost its applicability in data integration scenarios.
> 
> Specifically, this plugin would allow users to configure database
> connections and stream data directly to SQL databases via the standard JDBC
> interface. This could be used for data migration tasks as well as real-time
> data synchronization.
> 
> To further discuss this proposal and gather feedback from the community, I
> have prepared a preliminary design draft and hope to discuss it in detail
> in the upcoming community meeting. Please consider the potential value of
> this feature and provide your insights and guidance.
> 
> Thank you for your time and consideration. I look forward to your active
> feedback and further discussion.
> 
> [1] https://github.com/apache/flink-connector-jdbc



Re: [DISCUSS] FLIP-XXX: Improve JDBC connector extensibility for Table API

2024-05-21 Thread Keith Lee
Hi Lorenzo

I have a couple of questions:

1. Can the FLIP include at least one implementation of JDBC based connector
using proposed changes? Implementing a connector and solving challenges
that arise using the proposed change will give good insight.
2. The example seems to restrict the connector to a managed database service
provider at `exampledb.com`. Generally, a database can be self-hosted
(e.g. on localhost) or managed by a cloud service provider. If we use the
example configuration pattern, we are actually limiting the connector usage
to the managed exampledb service, and IMO we should not do that. Can you provide
other example configurations (actual examples for MySQL/Postgres etc. would
be great) that may illustrate the usefulness of the proposal better?


Best regards
Keith Lee


On Tue, May 21, 2024 at 2:01 PM Leonard Xu  wrote:

> Thanks Lorenzo for kicking off this discussion.
>
> +1 for the motivation, and I left some comments as following:
>
> (1) Please add API annotation for all Proposed public interfaces
>
> (2)
> JdbcConnectionOptionsParser/JdbcReadOptionsParser/JdbcExecutionOptionsParser
> offer two methods validate and parse, it’s a little stranger to me as your
> POC code call them at the same time, could we finish validate action in
> parse method internal? And thus a Parser interface offers a parse method
> makes sense to me. It’s better introduce a Validator to support validation
> If you want to do some connection validations during job compile phase.
>
> (3) Above methods return InternalJdbcConnectionOptions with fixed members,
> if the example db requires extra connection options like acessKey, acessId
> and etc, we need to change InternalJdbcConnectionOptions as well, how we
> show our extensibility?
>
> Best,
> Leonard
>
>
> > 2024年5月15日 下午10:17,Ahmed Hamdy  写道:
> >
> > Hi Lorenzo,
> > This seems like a very useful addition.
> > +1 (non-binding) from my side. I echo Jeyhun's question about backward
> > compatibility as it is not mentioned in the FLIP.
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Wed, 15 May 2024 at 08:12, 
> wrote:
> >
> >> Hello Muhammet and Jeyhun!
> >> Thanks for your comments!
> >>
> >> @Jeyhun:
> >>
> >>> Could you please elaborate more on how the new approach will be
> backwards
> >> compatible?
> >>
> >> In the FLIP I provide how the current Factories in JDBC would be changed
> >> with this refactor, do you mean something different? Can you be more
> >> specific with your request?
> >> On May 14, 2024 at 12:32 +0200, Jeyhun Karimov ,
> >> wrote:
> >>> Hi Lorenzo,
> >>>
> >>> Thanks for driving this FLIP. +1 for it.
> >>>
> >>> Could you please elaborate more on how the new approach will be
> backwards
> >>> compatible?
> >>>
> >>> Regards,
> >>> Jeyhun
> >>>
> >>> On Tue, May 14, 2024 at 10:00 AM Muhammet Orazov
> >>>  wrote:
> >>>
>  Hey Lorenzo,
> 
>  Thanks for driving this FLIP! +1
> 
>  It will improve the user experience of using JDBC based
>  connectors and help developers to build with different drivers.
> 
>  Best,
>  Muhammet
> 
>  On 2024-05-13 10:20, lorenzo.affe...@ververica.com.INVALID wrote:
> >> Hello dev!
> >>
> >> I want to share a draft of my FLIP to refactor the JDBC connector
> >> to
> >> improve its extensibility [1].
> >> The goal is to allow implementers to write new connectors on top
> >> of the
> >> JDBC one for Table API with clean and maintainable code.
> >>
> >> Any feedback from the community is more and welcome.
> >>
> >> [1]
> >>
> 
> >>
> https://docs.google.com/document/d/1kl_AikMlqPUI-LNiPBraAFVZDRg1LF4bn6uiNtR4dlY/edit?usp=sharing
> 
> >>
>
>


Re: [DISCUSSION] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-21 Thread Lincoln Lee
Hi Jane,

Thanks for the updates!

Just one small comment on the options in IncrementalAggregateRule
& RelNodeBlock: should we also change the API level from Experimental
to PublicEvolving? (A small sketch of what I mean follows.)
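
For clarity, the change would just be to the stability annotation on the
option fields, along these lines (the option, key, and description here are
illustrative, written from memory rather than copied from the actual source):

import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

public final class IncrementalAggOptionsSketch {
    // Previously annotated @Experimental; @PublicEvolving commits to
    // compatibility guarantees across minor releases.
    @PublicEvolving
    public static final ConfigOption<Boolean> TABLE_OPTIMIZER_INCREMENTAL_AGG_ENABLED =
            ConfigOptions.key("table.optimizer.incremental-agg-enabled")
                    .booleanType()
                    .defaultValue(false)
                    .withDescription("Enables incremental aggregate optimization.");

    private IncrementalAggOptionsSketch() {}
}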


Best,
Lincoln Lee


Jane Chan  于2024年5月21日周二 16:41写道:

> Hi all,
>
> Thanks for your valuable feedback!
>
> To @Xuannan
>
> For options to be moved to another module/package, I think we have to
> > mark the old option deprecated in 1.20 for it to be removed in 2.0,
> > according to the API compatibility guarantees[1]. We can introduce the
> > new option in 1.20 with the same option key in the intended class.
>
>
> Good point, fixed.
>
> To @Lincoln and @Benchao
>
> Thanks for sharing the insights into the historical context of which I was
> unaware. I've reorganized the sheet.
>
> 3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on TVF
> > window, so it's recommended to keep it untouched for now and follow up in
> > FLINK-29692
>
>
> How to tackle the configuration depends on whether the legacy window
> aggregate is removed in 2.0, and I've updated the FLIP to defer this part to
> FLINK-29692.
>
> Please let me know if that answers your questions or if you have other
> comments.
>
> Best,
> Jane
>
>
> On Mon, May 20, 2024 at 1:52 PM Ron Liu  wrote:
>
> > Hi, Lincoln
> >
> > >  2. Regarding the options in HashAggCodeGenerator, since this new
> feature
> > has gone
> > through a couple of release cycles and could be considered for
> > PublicEvolving now,
> > cc @Ron Liu   WDYT?
> >
> > Thanks for cc'ing me,  +1 for public these options now.
> >
> > Best,
> > Ron
> >
> > Benchao Li  于2024年5月20日周一 13:08写道:
> >
> > > I agree with Lincoln about the experimental features.
> > >
> > > Some of these configurations do not even have proper implementation,
> > > take 'table.exec.range-sort.enabled' as an example, there was a
> > > discussion[1] about it before.
> > >
> > > [1] https://lists.apache.org/thread/q5h3obx36pf9po28r0jzmwnmvtyjmwdr
> > >
> > > Lincoln Lee  于2024年5月20日周一 12:01写道:
> > > >
> > > > Hi Jane,
> > > >
> > > > Thanks for the proposal!
> > > >
> > > > +1 for the changes except for these annotated as experimental ones.
> > > >
> > > > For the options annotated as experimental,
> > > >
> > > > +1 for the moving of IncrementalAggregateRule & RelNodeBlock.
> > > >
> > > > For the rest of the options, there are some suggestions:
> > > >
> > > > 1. for the batch related parameters, it's recommended to either
> delete
> > > > them (leaving the necessary defaults value in place) or leave them as
> > > they
> > > > are. Including:
> > > > FlinkRelMdRowCount
> > > > FlinkRexUtil
> > > > BatchPhysicalSortRule
> > > > JoinDeriveNullFilterRule
> > > > BatchPhysicalJoinRuleBase
> > > > BatchPhysicalSortMergeJoinRule
> > > >
> > > > What I understand about the history of these options is that they
> were
> > > once
> > > > used for fine
> > > > tuning for tpc testing, and the current flink planner no longer
> relies
> > on
> > > > these internal
> > > > options when testing tpc[1]. In addition, these options are too
> obscure
> > > for
> > > > SQL users,
> > > > and some of them are actually magic numbers.
> > > >
> > > > 2. Regarding the options in HashAggCodeGenerator, since this new
> > feature
> > > > has gone
> > > > through a couple of release cycles and could be considered for
> > > > PublicEvolving now,
> > > > cc @Ron Liu   WDYT?
> > > >
> > > > 3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on
> > TVF
> > > > window, so
> > > > it's recommended to keep it untouched for now and follow up in
> > > > FLINK-29692[2]. cc @Xuyang 
> > > >
> > > > [1]
> > > >
> > >
> >
> https://github.com/ververica/flink-sql-benchmark/blob/master/tools/common/flink-conf.yaml
> > > > [2] https://issues.apache.org/jira/browse/FLINK-29692
> > > >
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Yubin Li  于2024年5月17日周五 10:49写道:
> > > >
> > > > > Hi Jane,
> > > > >
> > > > > Thank Jane for driving this proposal !
> > > > >
> > > > > This makes sense for users, +1 for that.
> > > > >
> > > > > Best,
> > > > > Yubin
> > > > >
> > > > > On Thu, May 16, 2024 at 3:17 PM Jark Wu  wrote:
> > > > > >
> > > > > > Hi Jane,
> > > > > >
> > > > > > Thanks for the proposal. +1 from my side.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > On Thu, 16 May 2024 at 10:28, Xuannan Su 
> > > wrote:
> > > > > >
> > > > > > > Hi Jane,
> > > > > > >
> > > > > > > Thanks for driving this effort! And +1 for the proposed
> changes.
> > > > > > >
> > > > > > > I have one comment on the migration plan.
> > > > > > >
> > > > > > > For options to be moved to another module/package, I think we
> > have
> > > to
> > > > > > > mark the old option deprecated in 1.20 for it to be removed in
> > 2.0,
> > > > > > > according to the API compatibility guarantees[1]. We can
> > introduce
> > > the
> > > > > > > new option in 1.20 with the same option key in the intended
> > class.
> > > > > > > 

Re: [DISCUSS] FLIP-451: Refactor Async sink API

2024-05-21 Thread Leonard Xu
Thanks Ahmed for kicking off this discussion, sorry for jumping the discussion 
late.

(1) I'm confused: the discussion thread name 'FLIP-451: Refactor Async sink 
API' and the FLIP title/vote thread name 'FLIP-451: Introduce timeout 
configuration to AsyncSink API 
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API>'
are different. Could you help explain the change history?

(2) FLIP-451 aims to introduce a timeout configuration, but I didn't find 
the configuration in the FLIP, even after looking up some historical versions 
of it. Did I miss some key information?

(3) About the code change part, there are some incomplete pieces in 
AsyncSinkWriter. For example, `submitRequestEntries(List 
requestEntries,);` is incorrect, and for the `sendTime` variable I didn't 
find the place we define it or where we use it.

Sorry for jumping into the discussion thread during the vote phase again.

Best,
Leonard


> 2024年5月21日 下午3:49,Ahmed Hamdy  写道:
> 
> Hi Hong,
> Thanks for pointing that out, no we are not
> deprecating getFatalExceptionCons(). I have updated the FLIP
> Best Regards
> Ahmed Hamdy
> 
> 
> On Mon, 20 May 2024 at 15:40, Hong Liang  wrote:
> 
>> Hi Ahmed,
>> Thanks for putting this together! Should we still be marking
>> getFatalExceptionCons() as @Deprecated in this FLIP, if we are not
>> providing a replacement?
>> 
>> Regards,
>> Hong
>> 
>> On Mon, May 13, 2024 at 7:58 PM Ahmed Hamdy  wrote:
>> 
>>> Hi David,
>>> yes there error classification was initially left to sink implementers to
>>> handle while we provided utilities to classify[1] and bubble up[2] fatal
>>> exceptions to avoid retrying them.
>>> Additionally some sink implementations provide an option to short circuit
>>> the failures by exposing a `failOnError` flag as in
>> KinesisStreamsSink[3],
>>> however this FLIP scope doesn't include any changes for retry mechanisms.
>>> 
>>> 1-
>>> 
>>> 
>> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java#L32
>>> 2-
>>> 
>>> 
>> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L533
>>> 3-
>>> 
>>> 
>> https://github.com/apache/flink-connector-aws/blob/c6e0abb65a0e51b40dd218b890a111886fbf797f/flink-connector-aws/flink-connector-aws-kinesis-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSinkWriter.java#L106
>>> 
>>> Best Regards
>>> Ahmed Hamdy
>>> 
>>> 
>>> On Mon, 13 May 2024 at 16:20, David Radley 
>>> wrote:
>>> 
 Hi,
 I wonder if the way that the async request fails could be a retriable
>> or
 non-retriable error, so it would retry only for retriable (transient)
 errors (like IOExceptions) . I see some talk on the internet around
 retriable SQL errors.
 If this was the case then we may need configuration to limit the
>> number
 of retries of retriable errors.
Kind regards, David
 
 
 From: Muhammet Orazov 
 Date: Monday, 13 May 2024 at 10:30
 To: dev@flink.apache.org 
 Subject: [EXTERNAL] Re: [DISCUSS] FLIP-451: Refactor Async sink API
 Great, thanks for clarifying!
 
 Best,
 Muhammet
 
 
 On 2024-05-06 13:40, Ahmed Hamdy wrote:
> Hi Muhammet,
> Thanks for the feedback.
> 
>> Could you please add more here why it is harder? Would the
>> `completeExceptionally`
>> method be related to it? Maybe you can add usage example for it
>> also.
>> 
> 
> this is mainly due to the current implementation of fatal exception
> failures which depends on base `getFatalExceptionConsumer` method
>> that
> is
> decoupled from the actual called method `submitRequestEntries`, Since
> this
> is now not the primary concern of the FLIP, I have removed it from
>> the
> motivation so that the scope is defined around introducing the
>> timeout
> configuration.
> 
>> Should we add a list of possible connectors that this FLIP would
>> improve?
> 
> Good call, I have added under migration plan.
> 
> Best Regards
> Ahmed Hamdy
> 
> 
> On Mon, 6 May 2024 at 08:49, Muhammet Orazov 
> wrote:
> 
>> Hey Ahmed,
>> 
>> Thanks for the FLIP! +1 (non-binding)
>> 
>>> Additionally the current interface for passing fatal exceptions
>> and
>>> retrying records relies on java consumers which makes it harder to
>>> understand.
>> 
>> Could you please add more here why it is harder? Would the
>> `completeExceptionally`
>> method be related to it? Maybe you can add usage example for it
>> also.
>> 
>>> we should proceed by adding support in all supporting connector
>>> repos.
>> 
>> Should we add 

Re: [DISCUSS] Flink upgrade compatibility page not updated for Flink 1.19

2024-05-21 Thread Lincoln Lee
Thanks Aleksandr for fixing this missing change!

It was my oversight during the last release. I had checked offline with
@Hangxiang Yu
before rc1 to confirm the 1.19 compatibility changes, and he confirmed
that there were none (so your PR should be correct), but in the end the
update didn't make it into the subsequent RCs.

I checked the Flink release wiki page[1] again, and this update is
mentioned in item 8 of the checklist in the "Creating Release Branches"
section, but it's not listed in the Jira[2] (I've added a note to it).

Also cc'ing the 1.20 RM @weijie guo: please remember to
include this documentation
update in the next release.

[1]
https://cwiki.apache.org/confluence/display/FLINK/Creating+a+Flink+Release
[2] https://issues.apache.org/jira/browse/FLINK-34282

Best,
Lincoln Lee


Aleksandr Pilipenko  于2024年5月21日周二 18:28写道:

> Hi all,
>
> The current version of the documentation is missing savepoint compatibility
> data for Flink 1.19 [1].
> I have created a ticket [2] and PRs to address this, but wanted to clarify
> whether there are any changes that make savepoint compatibility different
> from previous releases. I did not find any changes related to compatibility
> in the release notes [3].
>
> Additionally, is an update to this page included in the minor version
> release process?
>
> 1:
>
> https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/ops/upgrading/#compatibility-table
> 2: https://issues.apache.org/jira/browse/FLINK-35383
> 3:
>
> https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.19/#checkpoints
>
> Thanks,
> Aleksandr
>


[jira] [Created] (FLINK-35413) VertexFinishedStateCheckerTest causes exit 239

2024-05-21 Thread Ryan Skraba (Jira)
Ryan Skraba created FLINK-35413:
---

 Summary: VertexFinishedStateCheckerTest causes exit 239
 Key: FLINK-35413
 URL: https://issues.apache.org/jira/browse/FLINK-35413
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.20.0
Reporter: Ryan Skraba


1.20 test_cron_azure core 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=59676=logs=77a9d8e1-d610-59b3-fc2a-4766541e0e33=125e07e7-8de0-5c6c-a541-a567415af3ef=9429

{code}
May 21 01:31:42 01:31:42.160 [ERROR] 
org.apache.flink.runtime.checkpoint.VertexFinishedStateCheckerTest
May 21 01:31:42 01:31:42.160 [ERROR] 
org.apache.maven.surefire.booter.SurefireBooterForkException: 
ExecutionException The forked VM terminated without properly saying goodbye. VM 
crash or System.exit called?
May 21 01:31:42 01:31:42.160 [ERROR] Command was /bin/sh -c cd 
'/__w/1/s/flink-runtime' && '/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java' 
'-XX:+UseG1GC' '-Xms256m' '-XX:+IgnoreUnrecognizedVMOptions' 
'--add-opens=java.base/java.util=ALL-UNNAMED' 
'--add-opens=java.base/java.lang=ALL-UNNAMED' 
'--add-opens=java.base/java.net=ALL-UNNAMED' 
'--add-opens=java.base/java.io=ALL-UNNAMED' 
'--add-opens=java.base/java.util.concurrent=ALL-UNNAMED' '-Xmx768m' '-jar' 
'/__w/1/s/flink-runtime/target/surefire/surefirebooter-20240521011847857_99.jar'
 '/__w/1/s/flink-runtime/target/surefire' '2024-05-21T01-15-09_325-jvmRun1' 
'surefire-20240521011847857_97tmp' 'surefire_29-20240521011847857_98tmp'
May 21 01:31:42 01:31:42.160 [ERROR] Error occurred in starting fork, check 
output in log
May 21 01:31:42 01:31:42.160 [ERROR] Process Exit Code: 239
May 21 01:31:42 01:31:42.160 [ERROR] Crashed tests:
May 21 01:31:42 01:31:42.160 [ERROR] 
org.apache.flink.runtime.checkpoint.VertexFinishedStateCheckerTest
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:456)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:358)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:296)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:250)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1240)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1089)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:905)
May 21 01:31:42 01:31:42.160 [ERROR]at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:137)
{code}

In the build artifact {{mvn-1.log}} the following FATAL error is found:

{code}
01:19:08,584 [ pool-9-thread-1] ERROR 
org.apache.flink.util.FatalExitExceptionHandler  [] - FATAL: Thread 
'pool-9-thread-1' produced an uncaught exception. Stopping the process...
java.util.concurrent.CompletionException: 
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@5ead9062 
rejected from 
java.util.concurrent.ScheduledThreadPoolExecutor@4d0e55ac[Shutting down, pool 
size = 1, active threads = 1, queued tasks = 1, completed tasks = 194]
at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
 ~[?:1.8.0_292]
at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
 ~[?:1.8.0_292]
at 
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:838) 
~[?:1.8.0_292]
at 
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:811)
 ~[?:1.8.0_292]
at 
java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:851)
 ~[?:1.8.0_292]
at 
java.util.concurrent.CompletableFuture.handleAsync(CompletableFuture.java:2178) 
~[?:1.8.0_292]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.DefaultSlotStatusSyncer.allocateSlot(DefaultSlotStatusSyncer.java:138)
 ~[classes/:?]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager.allocateSlotsAccordingTo(FineGrainedSlotManager.java:722)
 ~[classes/:?]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager.checkResourceRequirements(FineGrainedSlotManager.java:645)
 ~[classes/:?]
at 
org.apache.flink.runtime.resourcemanager.slotmanager.FineGrainedSlotManager.lambda$null$12(FineGrainedSlotManager.java:603)
 ~[classes/:?]
at 

Re: [DISCUSS] FLIP-XXX: Improve JDBC connector extensibility for Table API

2024-05-21 Thread Leonard Xu
Thanks Lorenzo for kicking off this discussion.

+1 for the motivation, and I left some comments as following:

(1) Please add API annotation for all Proposed public interfaces

(2) 
JdbcConnectionOptionsParser/JdbcReadOptionsParser/JdbcExecutionOptionsParser 
offer two methods, validate and parse. This is a little strange to me, as your 
POC code calls them at the same time; could we perform the validation inside 
the parse method? A Parser interface that offers a single parse method makes 
sense to me (see the sketch below). It's better to introduce a separate 
Validator if you want to do some connection validations during the job compile 
phase.

(3) The above methods return InternalJdbcConnectionOptions with fixed members. 
If the example db requires extra connection options like accessKey, accessId, 
etc., we need to change InternalJdbcConnectionOptions as well; how do we show 
our extensibility?
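
To make (2) concrete, here is a minimal sketch of what I mean, with validation 
folded into parse (all names are illustrative, not the FLIP's actual API):

import java.util.Map;

/** A parser that validates internally; callers only invoke parse(). */
public interface OptionsParser<T> {

    /**
     * Parses user-provided options into a typed options object,
     * throwing if validation fails.
     */
    T parse(Map<String, String> options) throws IllegalArgumentException;
}

/** Separate hook for compile-time connection checks, if needed. */
interface ConnectionValidator {
    void validate(Map<String, String> options) throws IllegalArgumentException;
}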

Best,
Leonard


> 2024年5月15日 下午10:17,Ahmed Hamdy  写道:
> 
> Hi Lorenzo,
> This seems like a very useful addition.
> +1 (non-binding) from my side. I echo Jeyhun's question about backward
> compatibility as it is not mentioned in the FLIP.
> Best Regards
> Ahmed Hamdy
> 
> 
> On Wed, 15 May 2024 at 08:12,  wrote:
> 
>> Hello Muhammet and Jeyhun!
>> Thanks for your comments!
>> 
>> @Jeyhun:
>> 
>>> Could you please elaborate more on how the new approach will be backwards
>> compatible?
>> 
>> In the FLIP I provide how the current Factories in JDBC would be changed
>> with this refactor, do you mean something different? Can you be more
>> specific with your request?
>> On May 14, 2024 at 12:32 +0200, Jeyhun Karimov ,
>> wrote:
>>> Hi Lorenzo,
>>> 
>>> Thanks for driving this FLIP. +1 for it.
>>> 
>>> Could you please elaborate more on how the new approach will be backwards
>>> compatible?
>>> 
>>> Regards,
>>> Jeyhun
>>> 
>>> On Tue, May 14, 2024 at 10:00 AM Muhammet Orazov
>>>  wrote:
>>> 
 Hey Lorenzo,
 
 Thanks for driving this FLIP! +1
 
 It will improve the user experience of using JDBC based
 connectors and help developers to build with different drivers.
 
 Best,
 Muhammet
 
 On 2024-05-13 10:20, lorenzo.affe...@ververica.com.INVALID wrote:
>> Hello dev!
>> 
>> I want to share a draft of my FLIP to refactor the JDBC connector
>> to
>> improve its extensibility [1].
>> The goal is to allow implementers to write new connectors on top
>> of the
>> JDBC one for Table API with clean and maintainable code.
>> 
>> Any feedback from the community is more and welcome.
>> 
>> [1]
>> 
 
>> https://docs.google.com/document/d/1kl_AikMlqPUI-LNiPBraAFVZDRg1LF4bn6uiNtR4dlY/edit?usp=sharing
 
>> 



Re: [VOTE] FLIP-449: Reorganization of flink-connector-jdbc

2024-05-21 Thread Leonard Xu
+1(binding),  thanks Joao Boto for driving this FLIP.

Best,
Leonard

> 2024年5月17日 下午4:34,Ahmed Hamdy  写道:
> 
> Hi all,
> +1 (non-binding)
> Best Regards
> Ahmed Hamdy
> 
> 
> On Fri, 17 May 2024 at 02:13, Jiabao Sun  wrote:
> 
>> Thanks for driving this proposal!
>> 
>> +1 (binding)
>> 
>> Best,
>> Jiabao
>> 
>> 
>> On 2024/05/10 22:18:04 Jeyhun Karimov wrote:
>>> Thanks for driving this!
>>> 
>>> +1 (non-binding)
>>> 
>>> Regards,
>>> Jeyhun
>>> 
>>> On Fri, May 10, 2024 at 12:50 PM Muhammet Orazov
>>>  wrote:
>>> 
 Thanks João for your efforts and driving this!
 
 +1 (non-binding)
 
 Best,
 Muhammet
 
 On 2024-05-09 12:01, Joao Boto wrote:
> Hi everyone,
> 
> Thanks for all the feedback, I'd like to start a vote on the
>> FLIP-449:
> Reorganization of flink-connector-jdbc [1].
> The discussion thread is here [2].
> 
> The vote will be open for at least 72 hours unless there is an
> objection or
> insufficient votes.
> 
> [1]
> 
 
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-449%3A+Reorganization+of+flink-connector-jdbc
> [2] https://lists.apache.org/thread/jc1yvvo35xwqzlxl5mj77qw3hq6f5sgr
> 
> Best
> Joao Boto
 
>>> 
>> 



[DISCUSS] The performance of serializerHeavyString starts regress since April 3

2024-05-21 Thread Rui Fan
Hi devs:

We (the release managers of Flink 1.20) want to report one performance
regression to the Flink dev mailing list.

# Background:

The performance of serializerHeavyString has regressed since April 3,
and we created FLINK-35040[1] to track it.

In brief:
- The performance only regresses on JDK 11; Java 8 and Java 17 are
fine.
- The regression is caused by upgrading the commons-io version from 2.11.0
to 2.15.1
  - This upgrade was done in FLINK-34955[2].
  - The performance recovers after reverting the commons-io version
to 2.11.0

You can get more details from FLINK-35040[1].

# Problem

We tried to generate flame graphs (wall mode) to analyze why upgrading
the commons-io version affects the performance. These flame graphs can
be found in FLINK-35040[1]. (Many thanks to Zakelly for generating them
on the benchmark server.)

Unfortunately, we cannot find any place where code from the commons-io
dependency is called. We also analyzed whether any other dependencies
changed when upgrading the commons-io version. The result is no; all
other dependencies are exactly the same.

# Request

After the above analysis, we cannot explain why the performance of
serializerHeavyString regresses on JDK 11.

We look forward to hearing valuable suggestions from the Flink community.
Thanks, everyone, in advance.

Note:
1. I cannot reproduce the regression on my Mac with JDK 11, and we suspect
   this regression may be caused by the benchmark environment.
2. We will accept this regression if the issue cannot be solved.

[1] https://issues.apache.org/jira/browse/FLINK-35040
[2] https://issues.apache.org/jira/browse/FLINK-34955

Best,
Weijie, Ufuk, Robert and Rui


[DISCUSS] Flink upgrade compatibility page not updated for Flink 1.19

2024-05-21 Thread Aleksandr Pilipenko
Hi all,

The current version of the documentation is missing savepoint compatibility
data for Flink 1.19 [1].
I have created a ticket [2] and PRs to address this, but wanted to clarify
whether there are any changes that make savepoint compatibility different
from previous releases. I did not find any changes related to compatibility
in the release notes [3].

Additionally, is an update to this page included in the minor version
release process?

1:
https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/ops/upgrading/#compatibility-table
2: https://issues.apache.org/jira/browse/FLINK-35383
3:
https://nightlies.apache.org/flink/flink-docs-master/release-notes/flink-1.19/#checkpoints

Thanks,
Aleksandr


[jira] [Created] (FLINK-35412) Batch execution of async state request callback

2024-05-21 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-35412:
---

 Summary: Batch execution of async state request callback
 Key: FLINK-35412
 URL: https://issues.apache.org/jira/browse/FLINK-35412
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends, Runtime / Task
Reporter: Zakelly Lan


There is one mail for each callback when an async state result returns. One 
possible optimization is to encapsulate multiple callbacks into one mail (see 
the sketch below).
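
A rough sketch of the idea (all names are hypothetical, not the actual runtime 
classes):

{code}
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

/** Coalesces many state-request callbacks into a single mailbox mail. */
class CallbackBatcher {
    private final ConcurrentLinkedQueue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean mailScheduled = new AtomicBoolean(false);
    private final Executor mailboxExecutor; // stands in for the task mailbox

    CallbackBatcher(Executor mailboxExecutor) {
        this.mailboxExecutor = mailboxExecutor;
    }

    /** Called when an async state result returns. */
    void onStateResult(Runnable callback) {
        pending.add(callback);
        // Enqueue at most one mail; it drains every callback queued so far.
        if (mailScheduled.compareAndSet(false, true)) {
            mailboxExecutor.execute(this::drain);
        }
    }

    private void drain() {
        mailScheduled.set(false);
        Runnable callback;
        while ((callback = pending.poll()) != null) {
            callback.run();
        }
    }
}
{code}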



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35411) Optimize wait logic in draining of async state requests

2024-05-21 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-35411:
---

 Summary: Optimize wait logic in draining of async state requests
 Key: FLINK-35411
 URL: https://issues.apache.org/jira/browse/FLINK-35411
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / State Backends, Runtime / Task
Reporter: Zakelly Lan


Currently, during draining of async state requests, the task thread performs 
{{Thread.sleep}} to avoid CPU overhead when polling mails. This can be 
optimized with wait & notify (see the sketch below).
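
A minimal sketch of the change (names are hypothetical; the real code lives in 
the mailbox/drain loop):

{code}
import java.util.concurrent.ConcurrentLinkedQueue;

/** Sketch: a drain loop that waits/notifies instead of sleep-polling. */
class DrainLoopSketch {
    private final Object lock = new Object();
    private final ConcurrentLinkedQueue<Runnable> mails = new ConcurrentLinkedQueue<>();
    private volatile int inFlight;

    /** Task thread: drain all outstanding async state requests. */
    void drain() throws InterruptedException {
        while (inFlight > 0) {
            Runnable mail;
            while ((mail = mails.poll()) != null) {
                mail.run();
            }
            synchronized (lock) {
                // Re-check under the lock to avoid a lost notification,
                // then block instead of Thread.sleep()-ing in a loop.
                if (inFlight > 0 && mails.isEmpty()) {
                    lock.wait();
                }
            }
        }
    }

    /** Callback thread: a state request completed. */
    void onRequestDone(Runnable callbackMail) {
        mails.add(callbackMail);
        inFlight--; // illustrative; real code would use atomics
        synchronized (lock) {
            lock.notifyAll();
        }
    }
}
{code}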



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35410) Avoid sync waiting in coordinator thread of ForSt executor

2024-05-21 Thread Zakelly Lan (Jira)
Zakelly Lan created FLINK-35410:
---

 Summary: Avoid sync waiting in coordinator thread of ForSt executor
 Key: FLINK-35410
 URL: https://issues.apache.org/jira/browse/FLINK-35410
 Project: Flink
  Issue Type: Sub-task
Reporter: Zakelly Lan
Assignee: Zakelly Lan
 Fix For: 2.0.0


Currently, the coordinator thread of the ForSt executor waits synchronously
for the state access result, which can be optimized.
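
An illustrative sketch of the non-blocking alternative (all names here are
made up, not the actual ForSt executor code): instead of blocking on the
result future, the coordinator attaches a continuation and moves on to the
next request.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class NonBlockingCoordinator {

    private final ExecutorService ioWorkers = Executors.newFixedThreadPool(4);

    // Blocking shape (to be avoided): byte[] v = submit(key).get();
    // Non-blocking shape: attach a callback and return immediately.
    public void dispatch(byte[] key, Consumer<byte[]> onResult) {
        CompletableFuture
                .supplyAsync(() -> readState(key), ioWorkers) // state access off-thread
                .thenAccept(onResult);                        // no sync wait here
    }

    private byte[] readState(byte[] key) {
        return new byte[0]; // placeholder for the actual ForSt read
    }
}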



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35409) Request more splits if all splits are filtered from addSplits method

2024-05-21 Thread Xiao Huang (Jira)
Xiao Huang created FLINK-35409:
--

 Summary: Request more splits if all splits are filtered from 
addSplits method
 Key: FLINK-35409
 URL: https://issues.apache.org/jira/browse/FLINK-35409
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Xiao Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-451: Introduce timeout configuration to AsyncSink

2024-05-21 Thread Hong Liang
+1 (binding)

Thanks Ahmed

On Tue, May 14, 2024 at 11:51 AM David Radley 
wrote:

> Thanks for the clarification Ahmed
>
> +1 (non-binding)
>
> From: Ahmed Hamdy 
> Date: Monday, 13 May 2024 at 19:58
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [VOTE] FLIP-451: Introduce timeout configuration
> to AsyncSink
> Thanks David,
> I have replied to your question in the discussion thread.
> Best Regards
> Ahmed Hamdy
>
>
> On Mon, 13 May 2024 at 16:21, David Radley 
> wrote:
>
> > Hi,
> > I raised a question on the discussion thread, around retriable errors, as
> > a possible alternative,
> >   Kind regards, David.
> >
> >
> > From: Aleksandr Pilipenko 
> > Date: Monday, 13 May 2024 at 16:07
> > To: dev@flink.apache.org 
> > Subject: [EXTERNAL] Re: [VOTE] FLIP-451: Introduce timeout configuration
> > to AsyncSink
> > Thanks for driving this!
> >
> > +1 (non-binding)
> >
> > Thanks,
> > Aleksandr
> >
> > On Mon, 13 May 2024 at 14:08, 
> > wrote:
> >
> > > Thanks Ahmed!
> > >
> > > +1 non binding
> > > On May 13, 2024 at 12:40 +0200, Jeyhun Karimov ,
> > > wrote:
> > > > Thanks for driving this Ahmed.
> > > >
> > > > +1 (non-binding)
> > > >
> > > > Regards,
> > > > Jeyhun
> > > >
> > > > On Mon, May 13, 2024 at 12:37 PM Muhammet Orazov
> > > >  wrote:
> > > >
> > > > > Thanks Ahmed, +1 (non-binding)
> > > > >
> > > > > Best,
> > > > > Muhammet
> > > > >
> > > > > On 2024-05-13 09:50, Ahmed Hamdy wrote:
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Thanks for the feedback on the discussion thread[1], I would
> like
> > > to
> > > > > > > start
> > > > > > > a vote on FLIP-451[2]: Introduce timeout configuration to
> > AsyncSink
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours unless there is an
> > > > > > > objection or
> > > > > > > insufficient votes.
> > > > > > >
> > > > > > > 1-
> > https://lists.apache.org/thread/ft7wcw7kyftvww25n5fm4l925tlgdfg0
> > > > > > > 2-
> > > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Introduce+timeout+configuration+to+AsyncSink+API
> > > > > > > Best Regards
> > > > > > > Ahmed Hamdy
> > > > >
> > >


[jira] [Created] (FLINK-35408) Add 30 min tolerance value when validating the time-zone setting

2024-05-21 Thread Xiao Huang (Jira)
Xiao Huang created FLINK-35408:
--

 Summary: Add 30 min tolerance value when validating the time-zone 
setting
 Key: FLINK-35408
 URL: https://issues.apache.org/jira/browse/FLINK-35408
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: Xiao Huang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Proposing an LTS Release for the 1.x Line

2024-05-21 Thread Alexander Fedulov
Hi everyone,

let's finalize this discussion. As Martijn suggested, I summarized this
thread into a FLIP [1]. Please take a look and let me know if there’s
anything important that I might have missed.

Best,
Alex

[1] https://cwiki.apache.org/confluence/x/BApeEg


On Tue, 23 Jan 2024 at 03:30, Rui Fan <1996fan...@gmail.com> wrote:

> Thanks Martijn for the feedback!
>
> Sounds reasonable to me! And I don't have a strong opinion on allowing
> backporting of new features to 1.x.
>
> Best,
> Rui
>
> On Mon, Jan 22, 2024 at 8:56 PM Martijn Visser 
> wrote:
>
> > Hi Rui,
> >
> > I don't think that we should allow backporting of new features from
> > the first minor version of 2.x to 1.x. If a user doesn't yet want to
> > upgrade to 2.0, I think that's fine since we'll have a LTS for 1.x. If
> > a newer feature becomes available in 2.x that's interesting for the
> > user, the user at that point can decide if they want to do the
> > migration. It's always a case-by-case tradeoff of effort vs benefits,
> > and I think with a LTS version that has bug fixes only we provide the
> > users with assurance that existing bugs can get fixed, and that they
> > can decide for themselves when they want to migrate to a newer version
> > with better/newer features.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Thu, Jan 11, 2024 at 3:50 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > Thanks everyone for discussing this topic!
> > >
> > > My question is could we make a trade-off between Flink users
> > > and Flink maintainers?
> > >
> > > 1. From the perspective of a Flink maintainer
> > >
> > > I strongly agree with Martijn's point of view, such as:
> > >
> > > - Allowing backporting of new features to Flink 1.x will result in
> users
> > > delaying the upgrade.
> > > - New features will also introduce new bugs, meaning that maintainers
> > will
> > > have to spend time on two release versions.
> > >
> > > Considering the simplicity of maintenance, not backporting
> > > new features to Flink 1.x is fine.
> > >
> > > 2. From the perspective of a flink user
> > >
> > > In the first version Flink 2.x, flink will remove a lot of
> > > deprecated api, and introduce some features.
> > >
> > > It's a new major version, and major version changes are much
> > > greater than minor or patch version changes. Big changes
> > > may introduce more bugs, so I guess that a large number
> > > of Flink users will not use the first version of 2.x in the
> > > production environment. Maybe they will wait for the second
> > > minor version of 2.x.
> > >
> > > So, I was wondering whether we could allow backporting new features
> > > from the first minor version of 2.x to 1.x.
> > >
> > > That means we would allow backporting new features of 2.0.0 to 1.21.
> > > 1.21.x would be similar to 2.0.x: their features are the same, but
> > > 2.0.x removes deprecated APIs. After 2.0.0 is released,
> > > all new features in 2.1.x and above are only available in 2.x.
> > >
> > > Looking forward to your opinions~
> > >
> > > Best,
> > > Rui
> > >
> > > On Wed, Jan 10, 2024 at 9:39 PM Martijn Visser <
> martijnvis...@apache.org
> > >
> > > wrote:
> > >
> > > > Hi Alex,
> > > >
> > > > I saw that I missed replying to this topic. I do think that Xintong
> > > > touched on an important topic when he mentioned that we should define
> > > > what an LTS version means. From my point of view, I would state that
> > > > an LTS version for Apache Flink means that bug fixes only will be
> made
> > > > available for a longer period of time. I think that, combined with
> > > > what you called option 1 (a clear end-of-life date) is the best
> > > > option.
> > > >
> > > > Flink 2.0 will give us primarily the ability to remove a lot of
> > > > deprecated APIs, especially with Flink's deprecation strategy. I
> > > > expect that the majority of users will have an easy migration path
> > > > from a Flink 1.x to a Flink 2.0, if you're currently not using a
> > > > deprecated API and are a Java user.
> > > >
> > > > Allowing backporting of new features to Flink 1.x will result in
> users
> > > > delaying the upgrade, but it doesn't make the upgrade any easier when
> > > > they must upgrade. New features will also introduce new bugs, meaning
> > > > that maintainers will have to spend time on two release versions. As
> > > > the codebases diverge more and more, this will just become
> > > > increasingly more complex.
> > > >
> > > > With that being said, I do think that it makes sense to also
> formalize
> > > > the result of this discussion in a FLIP. That's just easier to point
> > > > users towards at a later stage.
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > >
> > > > On Mon, Dec 4, 2023 at 9:55 PM Alexander Fedulov
> > > >  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > As we progress with the 1.19 release, which might potentially
> > (although
> > > > not
> > > > > likely) be the last in the 1.x line, I'd like to revive our
> > discussion on
> > > > > the
> > > > > LTS support 

Re: [DISCUSSION] FLIP-457: Improve Table/SQL Configuration for Flink 2.0

2024-05-21 Thread Jane Chan
Hi all,

Thanks for your valuable feedback!

To @Xuannan

For options to be moved to another module/package, I think we have to
> mark the old option deprecated in 1.20 for it to be removed in 2.0,
> according to the API compatibility guarantees[1]. We can introduce the
> new option in 1.20 with the same option key in the intended class.


Good point, fixed.
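
For illustration, a minimal sketch of what the 1.20 state could look like.
The option name and class below are hypothetical; only the ConfigOptions
builder (including withDeprecatedKeys, useful when a key is renamed rather
than just moved) is Flink's existing API.

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// Hypothetical new home for an option that moves for 2.0. Re-declaring it
// here with the same key keeps existing user configurations working.
public class NewHomeOptions {

    public static final ConfigOption<Boolean> SOME_FEATURE_ENABLED =
            ConfigOptions.key("table.exec.some-feature.enabled")
                    .booleanType()
                    .defaultValue(false)
                    // Only needed if the key itself were renamed as well:
                    .withDeprecatedKeys("table.optimizer.some-feature.enabled")
                    .withDescription("Illustrative option, not a real Flink option.");
}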

To @Lincoln and @Benchao

Thanks for sharing the insights into the historical context of which I was
unaware. I've reorganized the sheet.

3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on TVF
> window, so it's recommended to keep it untouched for now and follow up in
> FLINK-29692


How to handle this configuration depends on whether the legacy window
aggregate is removed in 2.0, so I've updated the FLIP to leave this part to
FLINK-29692.

Please let me know if that answers your questions or if you have other
comments.

Best,
Jane


On Mon, May 20, 2024 at 1:52 PM Ron Liu  wrote:

> Hi, Lincoln
>
> > 2. Regarding the options in HashAggCodeGenerator, since this new feature
> > has gone through a couple of release cycles and could be considered for
> > PublicEvolving now, cc @Ron Liu  WDYT?
>
> Thanks for cc'ing me. +1 for making these options public now.
>
> Best,
> Ron
>
Benchao Li  wrote on Mon, May 20, 2024 at 13:08:
>
> > I agree with Lincoln about the experimental features.
> >
> > Some of these configurations do not even have a proper implementation.
> > Take 'table.exec.range-sort.enabled' as an example: there was a
> > discussion[1] about it before.
> >
> > [1] https://lists.apache.org/thread/q5h3obx36pf9po28r0jzmwnmvtyjmwdr
> >
> > Lincoln Lee  wrote on Mon, May 20, 2024 at 12:01:
> > >
> > > Hi Jane,
> > >
> > > Thanks for the proposal!
> > >
> > > +1 for the changes except for those annotated as experimental ones.
> > >
> > > For the options annotated as experimental,
> > >
> > > +1 for the moving of IncrementalAggregateRule & RelNodeBlock.
> > >
> > > For the rest of the options, there are some suggestions:
> > >
> > > 1. For the batch-related parameters, it's recommended to either delete
> > > them (leaving the necessary default values in place) or leave them as
> > > they are. Including:
> > > FlinkRelMdRowCount
> > > FlinkRexUtil
> > > BatchPhysicalSortRule
> > > JoinDeriveNullFilterRule
> > > BatchPhysicalJoinRuleBase
> > > BatchPhysicalSortMergeJoinRule
> > >
> > > What I understand about the history of these options is that they were
> > > once used for fine-tuning in TPC testing, and the current Flink planner
> > > no longer relies on these internal options when testing TPC[1]. In
> > > addition, these options are too obscure for SQL users, and some of them
> > > are actually magic numbers.
> > >
> > > 2. Regarding the options in HashAggCodeGenerator, since this new
> feature
> > > has gone
> > > through a couple of release cycles and could be considered for
> > > PublicEvolving now,
> > > cc @Ron Liu   WDYT?
> > >
> > > 3. Regarding WindowEmitStrategy, IIUC it is currently unsupported on
> TVF
> > > window, so
> > > it's recommended to keep it untouched for now and follow up in
> > > FLINK-29692[2]. cc @Xuyang 
> > >
> > > [1]
> > >
> >
> https://github.com/ververica/flink-sql-benchmark/blob/master/tools/common/flink-conf.yaml
> > > [2] https://issues.apache.org/jira/browse/FLINK-29692
> > >
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
Yubin Li  wrote on Fri, May 17, 2024 at 10:49:
> > >
> > > > Hi Jane,
> > > >
> > > > Thank Jane for driving this proposal !
> > > >
> > > > This makes sense for users, +1 for that.
> > > >
> > > > Best,
> > > > Yubin
> > > >
> > > > On Thu, May 16, 2024 at 3:17 PM Jark Wu  wrote:
> > > > >
> > > > > Hi Jane,
> > > > >
> > > > > Thanks for the proposal. +1 from my side.
> > > > >
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Thu, 16 May 2024 at 10:28, Xuannan Su 
> > wrote:
> > > > >
> > > > > > Hi Jane,
> > > > > >
> > > > > > Thanks for driving this effort! And +1 for the proposed changes.
> > > > > >
> > > > > > I have one comment on the migration plan.
> > > > > >
> > > > > > For options to be moved to another module/package, I think we
> have
> > to
> > > > > > mark the old option deprecated in 1.20 for it to be removed in
> 2.0,
> > > > > > according to the API compatibility guarantees[1]. We can
> introduce
> > the
> > > > > > new option in 1.20 with the same option key in the intended
> class.
> > > > > > WDYT?
> > > > > >
> > > > > > Best,
> > > > > > Xuannan
> > > > > >
> > > > > > [1]
> > > > > >
> > > >
> >
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, May 15, 2024 at 6:20 PM Jane Chan  >
> > > > wrote:
> > > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I'd like to start a discussion on FLIP-457: Improve Table/SQL
> > > > > > Configuration
> > > > > > > for Flink 2.0 [1]. This FLIP revisited all Table/SQL
> > configurations
> > > > to
> 

Re: [DISCUSS] FLIP-451: Refactor Async sink API

2024-05-21 Thread Ahmed Hamdy
Hi Hong,
Thanks for pointing that out; no, we are not deprecating
getFatalExceptionCons(). I have updated the FLIP.
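
For context, the consumer-based pattern that stays looks roughly like the
sketch below. This is a simplified, assumed shape rather than the exact
AsyncSinkWriter code; only the getFatalExceptionCons() accessor name comes
from the FLIP discussion.

import java.util.function.Consumer;

abstract class SketchAsyncWriter<RequestEntryT> {

    private final Consumer<Exception> fatalExceptionCons =
            e -> { throw new RuntimeException("Fatal error in async sink", e); };

    /** Stays non-deprecated: implementers keep using it to bubble up errors. */
    protected Consumer<Exception> getFatalExceptionCons() {
        return fatalExceptionCons;
    }

    protected void onRequestFailed(Exception error, boolean isFatal) {
        if (isFatal) {
            getFatalExceptionCons().accept(error); // fail the job, never retry
        }
        // Non-fatal entries would be re-queued for retry in the real writer.
    }
}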
Best Regards
Ahmed Hamdy


On Mon, 20 May 2024 at 15:40, Hong Liang  wrote:

> Hi Ahmed,
> Thanks for putting this together! Should we still be marking
> getFatalExceptionCons() as @Deprecated in this FLIP, if we are not
> providing a replacement?
>
> Regards,
> Hong
>
> On Mon, May 13, 2024 at 7:58 PM Ahmed Hamdy  wrote:
>
> > Hi David,
> > Yes, error classification was initially left to sink implementers to
> > handle, while we provided utilities to classify[1] and bubble up[2] fatal
> > exceptions to avoid retrying them.
> > Additionally, some sink implementations provide an option to short-circuit
> > failures by exposing a `failOnError` flag, as in KinesisStreamsSink[3];
> > however, this FLIP's scope doesn't include any changes to retry mechanisms.
> >
> > 1-
> >
> >
> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/throwable/FatalExceptionClassifier.java#L32
> > 2-
> >
> >
> https://github.com/apache/flink/blob/015867803ff0c128b1c67064c41f37ca0731ed86/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java#L533
> > 3-
> >
> >
> https://github.com/apache/flink-connector-aws/blob/c6e0abb65a0e51b40dd218b890a111886fbf797f/flink-connector-aws/flink-connector-aws-kinesis-streams/src/main/java/org/apache/flink/connector/kinesis/sink/KinesisStreamsSinkWriter.java#L106
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 13 May 2024 at 16:20, David Radley 
> > wrote:
> >
> > > Hi,
> > > I wonder if the way that the async request fails could be a retriable
> > > or non-retriable error, so it would retry only for retriable (transient)
> > > errors (like IOExceptions). I see some talk on the internet around
> > > retriable SQL errors.
> > > If this were the case then we may need configuration to limit the
> > > number of retries of retriable errors.
> > > Kind regards, David
> > >
> > >
> > > From: Muhammet Orazov 
> > > Date: Monday, 13 May 2024 at 10:30
> > > To: dev@flink.apache.org 
> > > Subject: [EXTERNAL] Re: [DISCUSS] FLIP-451: Refactor Async sink API
> > > Great, thanks for clarifying!
> > >
> > > Best,
> > > Muhammet
> > >
> > >
> > > On 2024-05-06 13:40, Ahmed Hamdy wrote:
> > > > Hi Muhammet,
> > > > Thanks for the feedback.
> > > >
> > > >> Could you please add more here why it is harder? Would the
> > > >> `completeExceptionally`
> > > >> method be related to it? Maybe you can add usage example for it
> also.
> > > >>
> > > >
> > > > This is mainly due to the current implementation of fatal exception
> > > > failures, which depends on the base `getFatalExceptionConsumer` method
> > > > that is decoupled from the actually called method `submitRequestEntries`.
> > > > Since this is now not the primary concern of the FLIP, I have removed it
> > > > from the motivation so that the scope is defined around introducing the
> > > > timeout configuration.
> > > >
> > > >> Should we add a list of possible connectors that this FLIP would
> > > >> improve?
> > > >
> > > > Good call, I have added it under the migration plan.
> > > >
> > > > Best Regards
> > > > Ahmed Hamdy
> > > >
> > > >
> > > > On Mon, 6 May 2024 at 08:49, Muhammet Orazov 
> > > > wrote:
> > > >
> > > >> Hey Ahmed,
> > > >>
> > > >> Thanks for the FLIP! +1 (non-binding)
> > > >>
> > > >> > Additionally the current interface for passing fatal exceptions
> and
> > > >> > retrying records relies on java consumers which makes it harder to
> > > >> > understand.
> > > >>
> > > >> Could you please add more here why it is harder? Would the
> > > >> `completeExceptionally`
> > > >> method be related to it? Maybe you can add usage example for it
> also.
> > > >>
> > > >> > we should proceed by adding support in all supporting connector
> > repos.
> > > >>
> > > >> Should we add list of possible connectors that this FLIP would
> > > >> improve?
> > > >>
> > > >> Best,
> > > >> Muhammet
> > > >>
> > > >>
> > > >> On 2024-04-29 14:08, Ahmed Hamdy wrote:
> > > >> > Hi all,
> > > >> > I would like to start a discussion on FLIP-451[1]
> > > >> > The proposal comes on encountering a couple of issues while
> working
> > > >> > with
> > > >> > implementers for Async Sink.
> > > >> > The FLIP mainly proposes a new API similar to AsyncFunction and
> > > >> > ResultFuture as well as introducing timeout handling for AsyncSink
> > > >> > requests.
> > > >> > The FLIP targets 1.20 with backward compatible changes and we
> should
> > > >> > proceed by adding support in all supporting connector repos.
> > > >> >
> > > >> > 1-
> > > >> >
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-451%3A+Refactor+Async+Sink+API
> > > >> > Best Regards
> > > >> > Ahmed Hamdy
> > > >>

[jira] [Created] (FLINK-35407) ensure orderly transactions during the full consumption stage of MySQL CDC

2024-05-21 Thread xiaotouming (Jira)
xiaotouming created FLINK-35407:
---

 Summary:  ensure orderly transactions during the full consumption 
stage of MySQL CDC
 Key: FLINK-35407
 URL: https://issues.apache.org/jira/browse/FLINK-35407
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Affects Versions: cdc-3.1.0
Reporter: xiaotouming


Is there a way to ensure orderly transactions during the full consumption
stage of MySQL CDC? During the snapshot consumption stage, new additions,
deletions, and modifications do not cause any changes to the original
snapshot data.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35406) Use inner serializer when casting RAW type to BINARY or STRING in cast rules

2024-05-21 Thread Zhenghua Gao (Jira)
Zhenghua Gao created FLINK-35406:


 Summary: Use inner serializer when casting RAW type to BINARY or 
STRING in cast rules
 Key: FLINK-35406
 URL: https://issues.apache.org/jira/browse/FLINK-35406
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.19.0
Reporter: Zhenghua Gao


The generated code in RawToStringCastRule and RawToBinaryCastRule uses
BinaryRawValueData::toBytes and BinaryRawValueData::toObject to convert
RawValueData (to a Java object or byte array); these conversions should use
the inner serializer instead of RawValueDataSerializer.
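
A hedged sketch of the intended direction (the helper class is made up; the
RawValueData#toBytes/#toObject signatures are Flink's real table API): pass
the RAW type's element serializer rather than the wrapping
RawValueDataSerializer.

import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.table.data.RawValueData;

public final class RawCastSketch {

    /** RAW -> BINARY: serialize with the element's inner serializer. */
    static <T> byte[] rawToBinary(RawValueData<T> raw, TypeSerializer<T> innerSerializer) {
        return raw.toBytes(innerSerializer);
    }

    /** RAW -> STRING: materialize the object with the inner serializer first. */
    static <T> String rawToString(RawValueData<T> raw, TypeSerializer<T> innerSerializer) {
        return String.valueOf(raw.toObject(innerSerializer));
    }
}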



--
This message was sent by Atlassian Jira
(v8.20.10#820010)