[jira] [Created] (FLINK-34417) Add JobID to logging MDC

2024-02-08 Thread Roman Khachatryan (Jira)
Roman Khachatryan created FLINK-34417:
-

 Summary: Add JobID to logging MDC
 Key: FLINK-34417
 URL: https://issues.apache.org/jira/browse/FLINK-34417
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing, Runtime / Coordination, Runtime 
/ Task
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan
 Fix For: 1.19.0


Adding the JobID to the logging MDC allows applying structured logging 
and analyzing Flink logs more efficiently.
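For illustration, a minimal sketch of what this enables (the MDC key name below is an assumption, not necessarily the one Flink will use):

{code:java}
// Minimal sketch: putting the JobID into the SLF4J MDC so every log line
// emitted by the thread can be correlated per job. The key "flink-job-id"
// is illustrative, not necessarily Flink's actual key.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class JobIdMdcSketch {
    private static final Logger LOG = LoggerFactory.getLogger(JobIdMdcSketch.class);

    static void runWithJobIdInMdc(String jobId, Runnable action) {
        MDC.put("flink-job-id", jobId);
        try {
            action.run(); // any LOG.info(...) in here carries the job ID
        } finally {
            MDC.remove("flink-job-id");
        }
    }
}
{code}

With a log layout pattern containing %X{flink-job-id}, each line then carries the job ID as a structured field.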



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] Support the Ozone Filesystem

2024-02-08 Thread Ferenc Csaky
Hello devs,

I would like to start a discussion regarding Apache Ozone FS support. The
Jira [1] has been stale for quite a while, but supporting it with some
limitations could be done with minimal effort.

Ozone does not have a truncate() implementation, so it falls into the same
category as Hadoop < 2.7 [2]: on the DataStream API, it requires using
OnCheckpointRollingPolicy when checkpointing is enabled, to make sure the
FileSink will not use truncate().
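For illustration, a minimal DataStream sketch under that constraint (the "ofs" path is hypothetical):

{code:java}
// Minimal sketch, assuming an "ofs://" path is resolvable: rolling on every
// checkpoint finalizes in-progress files at checkpoint time, so the FileSink
// never needs truncate() during recovery.
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.rollingpolicies.OnCheckpointRollingPolicy;

public class OzoneSinkSketch {
    static FileSink<String> buildSink() {
        return FileSink
                .forRowFormat(
                        new Path("ofs://ozone-service/volume/bucket/out"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .withRollingPolicy(OnCheckpointRollingPolicy.build())
                .build();
    }
}
{code}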

The Table API is a bit trickier, because the checkpointing policy cannot be
configured explicitly (why?); it behaves differently depending on the write
mode [3]. Bulk mode is covered, but for row format, auto-compaction has to
be set.
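For illustration, a sketch of the row-format case with auto-compaction enabled (table name, schema, and path are made up):

{code:java}
// Minimal sketch (hypothetical table and path): for row formats, enabling
// 'auto-compaction' on the filesystem connector is the setting referred to
// above; bulk mode needs no extra setting.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class OzoneTableSinkSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.executeSql(
                "CREATE TABLE ofs_sink (id INT, msg STRING) WITH ("
                        + " 'connector' = 'filesystem',"
                        + " 'path' = 'ofs://ozone-service/volume/bucket/out',"
                        + " 'format' = 'csv',"
                        + " 'auto-compaction' = 'true')");
    }
}
{code}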

Even with the mentioned limitations, I think it would be worth adding support
for OFS; it would require one small change to enable "ofs" [4] and
documenting the limitations.

WDYT?

Regards,
Ferenc

[1] https://issues.apache.org/jira/browse/FLINK-28231
[2] 
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/datastream/filesystem/#general
[3] 
https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/FileSystemTableSink.java#L226
[4] 
https://github.com/apache/flink/blob/a33a0576364ac3d9c0c038c74362f1faac8d47b8/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableWriter.java#L62

[jira] [Created] (FLINK-34416) "Local recovery and sticky scheduling end-to-end test" still doesn't work with AdaptiveScheduler

2024-02-08 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34416:
-

 Summary: "Local recovery and sticky scheduling end-to-end test" 
still doesn't work with AdaptiveScheduler
 Key: FLINK-34416
 URL: https://issues.apache.org/jira/browse/FLINK-34416
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Coordination
Affects Versions: 1.18.1, 1.19.0, 1.20.0
Reporter: Matthias Pohl


We tried to enable all {{AdaptiveScheduler}}-related tests in FLINK-34409 
because it appeared that all referenced Jira issues were resolved. That's not 
the case for the {{"Local recovery and sticky scheduling end-to-end test"}} 
tests, though.

With the {{AdaptiveScheduler}} enabled, we run into an issue where the test 
runs forever due to a {{NullPointerException}} continuously triggering 
failures:
{code}
Feb 07 19:02:59 2024-02-07 19:02:21,706 INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Flat Map -> 
Sink: Unnamed (3/4) 
(54075d3d22edb729e5f396726f777860_20ba6b65f97481d5570070de90e4e791_2_16292) 
switched from INITIALIZING to FAILED on localhost:40893-09ff7>
Feb 07 19:02:59 java.lang.NullPointerException: Expected to find info here.
Feb 07 19:02:59 at 
org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:76) 
~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.tests.StickyAllocationAndLocalRecoveryTestJob$StateCreatingFlatMap.initializeState(StickyAllocationAndLocalRecoveryTestJob.java:340)
 ~[?:?]
Feb 07 19:02:59 at 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:187)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:169)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:134)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:285)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreStateAndGates(StreamTask.java:799)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$restoreInternal$3(StreamTask.java:753)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:753)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:712)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
 ~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) 
~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:751) 
~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at 
org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) 
~[flink-dist-1.20-SNAPSHOT.jar:1.20-SNAPSHOT]
Feb 07 19:02:59 at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_402]
{code}

This error is caused by a {{Preconditions}} check in 
[StickyAllocationAndLocalRecoveryTestJob:340|https://github.com/apache/flink/blob/0f3470db83c1fddba9ac9a7299b1e61baab4ff12/flink-end-to-end-tests/flink-local-recovery-and-allocation-test/src/main/java/org/apache/flink/streaming/tests/StickyAllocationAndLocalRecoveryTestJob.java#L340].
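For reference, the failing pattern looks like the following sketch (names are illustrative, not the actual code at that line):

{code:java}
// Illustrative sketch only: Preconditions.checkNotNull throws a
// NullPointerException with the supplied message, which matches the
// "Expected to find info here." seen in the stack trace above.
import org.apache.flink.util.Preconditions;

public class PreconditionSketch {
    static <T> T requireInfo(T info) {
        return Preconditions.checkNotNull(info, "Expected to find info here.");
    }
}
{code}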



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34415) Move away from Kafka-Zookeeper based tests in favor of Kafka-KRaft

2024-02-08 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-34415:
--

 Summary: Move away from Kafka-Zookeeper based tests in favor of 
Kafka-KRaft
 Key: FLINK-34415
 URL: https://issues.apache.org/jira/browse/FLINK-34415
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Kafka
Reporter: Martijn Visser


The current Flink Kafka connector still uses Zookeeper for Kafka-based testing. 
Since Kafka 3.4, KRaft has been marked as production ready [1]. In order to 
reduce tech debt, we should remove all the dependencies on Zookeeper and only 
use KRaft for the Flink Kafka connector. 

[1] 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-833%3A+Mark+KRaft+as+Production+Ready




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Release flink-connector-mongodb v1.1.0, release candidate #2

2024-02-08 Thread Martijn Visser
+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Wed, Jan 31, 2024 at 10:41 AM Danny Cranmer  wrote:
>
> Thanks for driving this Leonard!
>
> +1 (binding)
>
> - Release notes look ok
> - Signatures/checksums of source archive are good
> - Verified there are no binaries in the source archive
> - Built sources locally successfully
> - v1.1.0-rc2 tag exists in github
> - Tag build passing on CI [1]
> - Contents of Maven dist look complete
> - Verified signatures/checksums of binary in maven dist is correct
> - Verified NOTICE files and bundled dependencies
>
> Thanks,
> Danny
>
> [1]
> https://github.com/apache/flink-connector-mongodb/actions/runs/7709467379
>
> On Wed, Jan 31, 2024 at 7:54 AM gongzhongqiang 
> wrote:
>
> > +1(non-binding)
> >
> > - Signatures and Checksums are good
> > - No binaries in the source archive
> > - Tag is present
> > - Build successful with jdk8 on ubuntu 22.04
> >
> >
> > Leonard Xu  于2024年1月30日周二 18:23写道:
> >
> > > Hey all,
> > >
> > > Please help review and vote on the release candidate #2 for the version
> > > v1.1.0 of the
> > > Apache Flink MongoDB Connector as follows:
> > >
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * The official Apache source release to be deployed to dist.apache.org
> > > [2],
> > > which are signed with the key with fingerprint
> > > 5B2F6608732389AEB67331F5B197E1F1108998AD [3],
> > > * All artifacts to be deployed to the Maven Central Repository [4],
> > > * Source code tag v1.1.0-rc2 [5],
> > > * Website pull request listing the new release [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > >
> > > Best,
> > > Leonard
> > > [1]
> > >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353483
> > > [2]
> > >
> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-mongodb-1.1.0-rc2/
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > > https://repository.apache.org/content/repositories/orgapacheflink-1705/
> > > [5] https://github.com/apache/flink-connector-mongodb/tree/v1.1.0-rc2
> > > [6] https://github.com/apache/flink-web/pull/719
> >


Re: Impact of redacting UPDATE_BEFORE fields?

2024-02-08 Thread Kevin Lam
Hey Yaroslav,

Thanks for your response! Got it, so the need for UPDATE_BEFOREs will
depend on your sinks. I just watched the talk and it makes sense when you
think of the UPDATE_BEFOREs as retractions.

In the talk, Timo discusses how removing the need for UPDATE_BEFORE is an
optimization of sorts, if your use case allows for it, since it'd enable
removing a bunch of messages that are processed by Flink.

I'm wondering about the converse: are there any situations where having
UPDATE_BEFOREs will result in improved performance? Does the planner take
advantage of them in some situations?
I don't have a specific example in mind, but I'm just trying to understand the
full implications of missing UPDATE_BEFORE messages.
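For context, a minimal sketch (hypothetical pipeline) of the "filter them out" idea mentioned in the reply below:

{code:java}
// Minimal sketch, hypothetical pipeline: dropping UPDATE_BEFORE rows from a
// changelog stream. Per the discussion below, this is only safe when every
// sink upserts by primary key; append-only sinks need the retractions.
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.types.Row;
import org.apache.flink.types.RowKind;

public class DropUpdateBeforeSketch {
    static DataStream<Row> withoutUpdateBefore(DataStream<Row> changelog) {
        return changelog.filter(row -> row.getKind() != RowKind.UPDATE_BEFORE);
    }
}
{code}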

On Wed, Feb 7, 2024 at 4:24 PM Yaroslav Tkachenko
 wrote:

> Hey Kevin,
>
> In my experience it mostly depends on the type of your sinks. If all of
> your sinks can leverage primary keys and support upsert semantics, you
> don't really need UPDATE_BEFOREs altogether (you can even filter them out).
> But if you have sinks with append-only semantics (OR if you don't have
> primary keys defined) you need UPDATE_BEFOREs to correctly support
> retractions (in case of updates and deletes).
>
> Great talk on this topic:
> https://www.youtube.com/watch?v=iRlLaY-P6iE&ab_channel=PlainSchwarz (the
> middle part is the most relevant).
>
>
> On Wed, Feb 7, 2024 at 12:13 PM Kevin Lam 
> wrote:
>
> > Hi there!
> >
> > I have a question about Changelog Stream Processing with Flink SQL and
> the
> > Flink Table API. I would like to better understand how UPDATE_BEFORE
> fields
> > are used by Flink.
> >
> > Our team uses Debezium to extract Change Data Capture events from MySQL
> > databases. We currently redact the `before` fields in the envelope [0] so
> > that redacted PII doesn't sit in our Kafka topics in the `before` field
> of
> > UPDATE events.
> >
> > As a result if we were to consume these CDC streams with Flink, there
> would
> > be missing UPDATE_BEFORE fields for UPDATE events. What kind of impact
> > would this have on performance and correctness, if any? Any other
> > considerations we should be aware of?
> >
> > Thanks in advance for your help!
> >
> >
> > [0]
> > https://debezium.io/documentation/reference/stable/connectors/mysql.html
> >
>


Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-02-08 Thread Zhanghao Chen
We're only concerned with parallelism tuning here (with the same Flink 
version). The plans will be compatible as long as the operator IDs stay the 
same. Currently, this only holds if we do not break/create a chain, and we want 
to make it hold when we break/create a chain as well. That's what the FLIP is 
all about.

The typical user story is that one has a job with a uniform parallelism in the 
first place. The source is chained with an expensive operator. Later on, the 
job parallelism needs to be increased, but the source's parallelism can't be, 
due to limits like the Kafka partition count. The user then configures a 
different parallelism for the source and the remaining part of the job, which 
breaks a chain and leads to state incompatibility.
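For reference, a minimal DataStream sketch (names are illustrative) of the existing mitigation discussed in this thread, pinning operator IDs with uid() so state stays mappable when chains change:

{code:java}
// Minimal sketch (illustrative job): an explicit uid() pins the operator ID,
// so breaking the source/map chain later does not change state assignment.
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UidSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements("a", "b", "c")
                .map(String::toUpperCase)
                .uid("upper-map") // stable ID used for state mapping
                .name("Upper Map")
                .print();
        env.execute("uid-sketch");
    }
}
{code}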

Best,
Zhanghao Chen

From: Chesnay Schepler 
Sent: Thursday, February 8, 2024 18:12
To: dev@flink.apache.org ; Martijn Visser 

Cc: Yu Chen 
Subject: Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for 
improved state compatibility on parallelism change

How exactly are you tuning SQL jobs without compiled plans while
ensuring that the resulting compiled plans are compatible? That's
explicitly not supported by Flink, hence why CompiledPlans exist.
If you change _anything_ the planner is free to generate a completely
different plan, where you have no guarantees that you can map the state
between one another.
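For context, a minimal sketch (table definitions are hypothetical) of the CompiledPlan workflow referred to above:

{code:java}
// Minimal sketch, hypothetical tables: compile the INSERT once, persist the
// JSON plan (which embeds operator UIDs), and later execute from the plan
// instead of re-planning the SQL.
import org.apache.flink.table.api.CompiledPlan;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.PlanReference;
import org.apache.flink.table.api.TableEnvironment;

public class CompiledPlanSketch {
    public static void main(String[] args) {
        TableEnvironment env =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        env.executeSql("CREATE TABLE source_t (x INT) WITH ('connector' = 'datagen')");
        env.executeSql("CREATE TABLE sink_t (x INT) WITH ('connector' = 'blackhole')");
        CompiledPlan plan =
                env.compilePlanSql("INSERT INTO sink_t SELECT * FROM source_t");
        plan.writeToFile("/tmp/query-plan.json");
        // on a later run/upgrade:
        env.loadPlan(PlanReference.fromFile("/tmp/query-plan.json")).execute();
    }
}
{code}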

On 08/02/2024 09:42, Martijn Visser wrote:
> Hi,
>
>> However, compiled plan is still too complicated for Flink newbies from my 
>> point of view.
> I don't think that the compiled plan was ever positioned to be a
> simple solution. If you want to have an easy approach, we have a
> declarative solution in place with SQL and/or the Table API imho.
>
> Best regards,
>
> Martijn
>
> On Thu, Feb 8, 2024 at 9:14 AM Zhanghao Chen  
> wrote:
>> Hi Piotr,
>>
>> Thanks for the comment. I agree that compiled plan is the ultimate tool for 
>> Flink SQL if one wants to make any changes to
>> query later, and this FLIP indeed is not essential in this sense. However, 
>> compiled plan is still too complicated for Flink newbies from my point of 
>> view. As I mentioned previously, our internal platform provides a visualized 
>> tool for editing the compiled plan but most users still find it complex. 
>> Therefore, the FLIP can still benefit users with better useability and the 
>> proposed changes are actually quite lightweight (just copying a new hasher 
>> with 2 lines deleted + extending the OperatorIdPair data structure) without 
>> much extra effort.
>>
>> Best,
>> Zhanghao Chen
>> 
>> From: Piotr Nowojski 
>> Sent: Thursday, February 8, 2024 14:50
>> To: Zhanghao Chen 
>> Cc: Chesnay Schepler ; dev@flink.apache.org 
>> ; Yu Chen 
>> Subject: Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation 
>> for improved state compatibility on parallelism change
>>
>> Hey
>>
>>> AFAIK, there's no way to set UIDs for a SQL job,
>> AFAIK you can't set UID manually, but  Flink SQL generates a compiled plan
>> of a query with embedded UIDs. As I understand it, using a compiled plan is
>> the preferred (only?) way for Flink SQL if one wants to make any changes to
>> query later on or support Flink's runtime upgrades, without losing the
>> state.
>>
>> If that's the case, what would be the usefulness of this FLIP? Only for
>> DataStream API for users that didn't know that they should have manually
>> configured UIDs? But they have the workaround to actually post-factum add
>> the UIDs anyway, right? So maybe indeed Chesnay is right that this FLIP is
>> not that helpful/worth the extra effort?
>>
>> Best,
>> Piotrek
>>
>> On Thu, 8 Feb 2024 at 03:55, Zhanghao Chen wrote:
>>
>>> Hi Chesnay,
>>>
>>> AFAIK, there's no way to set UIDs for a SQL job, it'll be great if you can
>>> share how you allow UID setting for SQL jobs. We've explored providing a
>>> visualized DAG editor for SQL jobs that allows UID setting on our internal
>>> platform, but most users found it too complicated to use. Another
>>> possible way is to utilize SQL hints, but that's complicated as well. From
> > > our experience, many SQL users are not familiar with Flink; what they want
> > > is an experience similar to writing normal SQL in MySQL, without
> > > involving many extra concepts like the DAG and the UID. In fact, some
>>> DataStream and PyFlink users also share the same concern.
>>>
> > > On the other hand, some performance tuning is inevitable for
> > > long-running jobs in production, and parallelism tuning is among the most
> > > common techniques. FLIP-367 [1] and FLIP-146 [2] allow users to tune the
> > > parallelism of sources and sinks, and both are well-received in the
> > > discussion thread. Users definitely don't want to lose state after a
> > > parallelism tuning, which is highly risky at present.
> > >
> > > Putting these together, I think the FLIP has a high value in production.
> > > Through offline discussion, I learnt that multiple companies have developed
> > > or are trying to develop similar hasher changes in their internal
> > > distributions, including ByteDance, Xiaohongshu, and Bilibili. It'll be
> > > great if we can improve the SQL experience for all community users as
> > > well, WDYT?

RE: FW: RE: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-08 Thread David Radley
Thanks Sergey,

It looks better now.

gpg --verify flink-connector-jdbc-3.1.2-1.18.jar.asc

gpg: assuming signed data in 'flink-connector-jdbc-3.1.2-1.18.jar'

gpg: Signature made Thu  1 Feb 10:54:45 2024 GMT

gpg: using RSA key F7529FAE24811A5C0DF3CA741596BBF0726835D8

gpg: Good signature from "Sergey Nuyanzin (CODE SIGNING KEY) 
snuyan...@apache.org" [unknown]

gpg: aka "Sergey Nuyanzin (CODE SIGNING KEY) 
snuyan...@gmail.com" [unknown]

gpg: aka "Sergey Nuyanzin 
snuyan...@gmail.com" [unknown]

gpg: WARNING: This key is not certified with a trusted signature!

gpg:  There is no indication that the signature belongs to the owner.

I assume the warning is ok,
  Kind regards, David.

From: Sergey Nuyanzin 
Date: Thursday, 8 February 2024 at 14:39
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: FW: RE: [VOTE] Release flink-connector-jdbc, release 
candidate #3
Hi David

it looks like in your case you didn't specify the jar itself, and it is
probably not in the current dir,
so it should be something like this (assuming that both the asc and jar file
are downloaded and in the current folder):
gpg --verify flink-connector-jdbc-3.1.2-1.16.jar.asc
flink-connector-jdbc-3.1.2-1.16.jar

Here is a more complete guide on how to do it for Apache projects [1]

[1] https://www.apache.org/info/verification.html#CheckingSignatures

On Thu, Feb 8, 2024 at 12:38 PM David Radley 
wrote:

> Hi,
> I was looking more at the asc files. I imported the keys and tried.
>
>
> gpg --verify flink-connector-jdbc-3.1.2-1.16.jar.asc
>
> gpg: no signed data
>
> gpg: can't hash datafile: No data
>
> This seems to be the same for all the asc files. It does not look right; am
> I doing something incorrect?
>Kind regards, David.
>
>
> From: David Radley 
> Date: Thursday, 8 February 2024 at 10:46
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] RE: [VOTE] Release flink-connector-jdbc, release
> candidate #3
> +1 (non-binding)
>
> I assume that https://github.com/apache/flink-web/pull/707 can be
> completed after the release is out.
>
> From: Martijn Visser 
> Date: Friday, 2 February 2024 at 08:38
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-jdbc, release
> candidate #3
> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven
> - Verified licenses
> - Verified web PRs
>
> On Fri, Feb 2, 2024 at 9:31 AM Yanquan Lv  wrote:
>
> > +1 (non-binding)
> >
> > - Validated checksum hash
> > - Verified signature
> > - Build the source with Maven and jdk8/11/17
> > - Check that the jar is built by jdk8
> > - Verified that no binaries exist in the source archive
> >
> > On Thu, Feb 1, 2024 at 19:50, Sergey Nuyanzin wrote:
> >
> > > Hi everyone,
> > > Please review and vote on the release candidate #3 for the version
> 3.1.2,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org
> > > [2],
> > > which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v3.1.2-rc3 [5],
> > > * website pull request listing the new release [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Release Manager
> > >
> > > [1]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> > > [2]
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1706/
> > > [5]
> > https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> > > [6] https://github.com/apache/flink-web/pull/707
> > >
> >
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


--
Best regards,
Sergey

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


[jira] [Created] (FLINK-34414) EXACTLY_ONCE guarantee doesn't work properly for Flink/Pulsar connector

2024-02-08 Thread Jira
Rafał Trójczak created FLINK-34414:
--

 Summary: EXACTLY_ONCE guarantee doesn't work properly for 
Flink/Pulsar connector 
 Key: FLINK-34414
 URL: https://issues.apache.org/jira/browse/FLINK-34414
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Pulsar
Affects Versions: 1.17.2
Reporter: Rafał Trójczak


Using the Pulsar connector for Flink (version 4.1.0-1.17) with a Flink job 
(version 1.17.2), when an exception is thrown within the job, the job gets 
restarted and starts from the last checkpoint, but the sink writes more events 
to the output than it should, even though the EXACTLY_ONCE guarantees are set 
everywhere. To be more specific, here is my job's flow:
 * a Pulsar source that reads from the input topic,
 * a simple processing function,
 * and a sink that writes to the output topic.

Here is a fragment of the source creation:
{code:java}
.setDeserializationSchema(Schema.AVRO(inClass), inClass)
.setSubscriptionName(subscription)
.enableSchemaEvolution()
.setConfig(PulsarOptions.PULSAR_ENABLE_TRANSACTION, true)
.setConfig(PulsarSourceOptions.PULSAR_ACK_RECEIPT_ENABLED, true)
.setConfig(PulsarSourceOptions.PULSAR_MAX_FETCH_RECORDS, 1)
.setConfig(PulsarSourceOptions.PULSAR_ENABLE_AUTO_ACKNOWLEDGE_MESSAGE, 
false);
{code}
Here is the fragment of the sink creation:
{code:java}
.setSerializationSchema(Schema.AVRO(outClass), outClass)
.setConfig(PulsarOptions.PULSAR_ENABLE_TRANSACTION, true)
.setConfig(PulsarSinkOptions.PULSAR_WRITE_DELIVERY_GUARANTEE, 
DeliveryGuarantee.EXACTLY_ONCE)
.setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE);
{code}
And here is the Flink environment preparation:
{code:java}
environment.setRuntimeMode(RuntimeExecutionMode.STREAMING);
environment.enableCheckpointing(CHECKPOINTING_INTERVAL, 
CheckpointingMode.EXACTLY_ONCE);
{code}
After sending 1000 events on the input topic, on the output topic I got 1048 
events.

I ran the job on my local Kubernetes cluster, using Kubernetes Flink Operator.

Here is the MRE for this problem (mind that there is an internal dependency, 
but it may be commented out together with the code that relies on it): 
[https://github.com/trojczak/flink-pulsar-connector-problem]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: FW: RE: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-08 Thread Sergey Nuyanzin
Hi David

it looks like in your case you didn't specify the jar itself, and it is
probably not in the current dir,
so it should be something like this (assuming that both the asc and jar file
are downloaded and in the current folder):
gpg --verify flink-connector-jdbc-3.1.2-1.16.jar.asc
flink-connector-jdbc-3.1.2-1.16.jar

Here is a more complete guide on how to do it for Apache projects [1]

[1] https://www.apache.org/info/verification.html#CheckingSignatures

On Thu, Feb 8, 2024 at 12:38 PM David Radley 
wrote:

> Hi,
> I was looking more at the asc files. I imported the keys and tried.
>
>
> gpg --verify flink-connector-jdbc-3.1.2-1.16.jar.asc
>
> gpg: no signed data
>
> gpg: can't hash datafile: No data
>
> This seems to be the same for all the asc files. It does not look right; am
> I doing something incorrect?
>Kind regards, David.
>
>
> From: David Radley 
> Date: Thursday, 8 February 2024 at 10:46
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] RE: [VOTE] Release flink-connector-jdbc, release
> candidate #3
> +1 (non-binding)
>
> I assume that https://github.com/apache/flink-web/pull/707 can be
> completed after the release is out.
>
> From: Martijn Visser 
> Date: Friday, 2 February 2024 at 08:38
> To: dev@flink.apache.org 
> Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-jdbc, release
> candidate #3
> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven
> - Verified licenses
> - Verified web PRs
>
> On Fri, Feb 2, 2024 at 9:31 AM Yanquan Lv  wrote:
>
> > +1 (non-binding)
> >
> > - Validated checksum hash
> > - Verified signature
> > - Build the source with Maven and jdk8/11/17
> > - Check that the jar is built by jdk8
> > - Verified that no binaries exist in the source archive
> >
> > On Thu, Feb 1, 2024 at 19:50, Sergey Nuyanzin wrote:
> >
> > > Hi everyone,
> > > Please review and vote on the release candidate #3 for the version
> 3.1.2,
> > > as follows:
> > > [ ] +1, Approve the release
> > > [ ] -1, Do not approve the release (please provide specific comments)
> > >
> > > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> > >
> > > The complete staging area is available for your review, which includes:
> > > * JIRA release notes [1],
> > > * the official Apache source release to be deployed to dist.apache.org
> > > [2],
> > > which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> > > * all artifacts to be deployed to the Maven Central Repository [4],
> > > * source code tag v3.1.2-rc3 [5],
> > > * website pull request listing the new release [6].
> > >
> > > The vote will be open for at least 72 hours. It is adopted by majority
> > > approval, with at least 3 PMC affirmative votes.
> > >
> > > Thanks,
> > > Release Manager
> > >
> > > [1]
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> > > [2]
> > >
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > [4]
> > >
> https://repository.apache.org/content/repositories/orgapacheflink-1706/
> > > [5]
> > https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> > > [6] https://github.com/apache/flink-web/pull/707
> > >
> >
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


-- 
Best regards,
Sergey


FW: RE: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-08 Thread David Radley
Hi,
I was looking more at the asc files. I imported the keys and tried.


gpg --verify flink-connector-jdbc-3.1.2-1.16.jar.asc

gpg: no signed data

gpg: can't hash datafile: No data

This seems to be the same for all the asc files. It does not look right; am I 
doing something incorrect?
   Kind regards, David.


From: David Radley 
Date: Thursday, 8 February 2024 at 10:46
To: dev@flink.apache.org 
Subject: [EXTERNAL] RE: [VOTE] Release flink-connector-jdbc, release candidate 
#3
+1 (non-binding)

I assume that https://github.com/apache/flink-web/pull/707 can be completed 
after the release is out.

From: Martijn Visser 
Date: Friday, 2 February 2024 at 08:38
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-jdbc, release candidate 
#3
+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Fri, Feb 2, 2024 at 9:31 AM Yanquan Lv  wrote:

> +1 (non-binding)
>
> - Validated checksum hash
> - Verified signature
> - Build the source with Maven and jdk8/11/17
> - Check that the jar is built by jdk8
> - Verified that no binaries exist in the source archive
>
> On Thu, Feb 1, 2024 at 19:50, Sergey Nuyanzin wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #3 for the version 3.1.2,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to dist.apache.org
> > [2],
> > which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v3.1.2-rc3 [5],
> > * website pull request listing the new release [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Release Manager
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> > [2]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1706/
> > [5]
> https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> > [6] https://github.com/apache/flink-web/pull/707
> >
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU



RE: Flink jdbc connector rc3 for flink 1.18

2024-02-08 Thread David Radley
Hi Sergey,
Yes that makes sense, thanks,
Kind regards, David.

From: Sergey Nuyanzin 
Date: Wednesday, 7 February 2024 at 11:41
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: Flink jdbc connector rc3 for flink 1.18
Hi David,

Thanks for testing.

Yes the jars are built from the same sources and same git tag apart from
the Flink version.

as it was mentioned in jdbc connector RC thread [1]

>The complete staging area is available for your review, which includes:
>* all artifacts to be deployed to the Maven Central Repository [2]

which contains jars for three Flink versions (1.16.x, 1.17.x, 1.18.x)

Please let me know whether this answers your question or not.

[1] https://lists.apache.org/thread/rlk5kp2vxgkmbxmq4wnco885q5vv9rtp
[2] https://repository.apache.org/content/repositories/orgapacheflink-1706/

On Wed, Feb 7, 2024 at 12:18 PM David Radley 
wrote:

> Hi,
> I had a question on the new Flink JDBC connector release. I notice for the
> last release we have
>
> https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc/3.1.1-1.16
>
> https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc/3.1.1-1.17
> It seems that the 2 jars above are identical apart from the manifest.
>
> I assume for the new Flink JDBC connector, there will be 1.16, 1.17 and
> 1.18 versions in Maven Central; each level will be compiled against Flink
> 1.16.0, 1.17.0 and 1.18.0 (or would it be 1.16.3, 1.17.2 and 1.18.1?)
> respectively. Either way the 3 jar files should be the same (apart from the
> manifest names) as the dependencies on core Flink are forward compatible.
>
> We are looking to use a JDBC connector that works with Flink 1.18 and
> fixes the lookup join filter issue. So we are planning to build the latest
> 3.1 branch code against 1.18.0, unless the connector is released very soon,
> in which case we would pick that up.
>
>Kind regards, David.
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


--
Best regards,
Sergey

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


RE: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-08 Thread David Radley
+1 (non-binding)

I assume that https://github.com/apache/flink-web/pull/707 can be completed 
after the release is out.

From: Martijn Visser 
Date: Friday, 2 February 2024 at 08:38
To: dev@flink.apache.org 
Subject: [EXTERNAL] Re: [VOTE] Release flink-connector-jdbc, release candidate 
#3
+1 (binding)

- Validated hashes
- Verified signature
- Verified that no binaries exist in the source archive
- Build the source with Maven
- Verified licenses
- Verified web PRs

On Fri, Feb 2, 2024 at 9:31 AM Yanquan Lv  wrote:

> +1 (non-binding)
>
> - Validated checksum hash
> - Verified signature
> - Build the source with Maven and jdk8/11/17
> - Check that the jar is built by jdk8
> - Verified that no binaries exist in the source archive
>
> On Thu, Feb 1, 2024 at 19:50, Sergey Nuyanzin wrote:
>
> > Hi everyone,
> > Please review and vote on the release candidate #3 for the version 3.1.2,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to dist.apache.org
> > [2],
> > which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v3.1.2-rc3 [5],
> > * website pull request listing the new release [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Release Manager
> >
> > [1]
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> > [2]
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1706/
> > [5]
> https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> > [6] https://github.com/apache/flink-web/pull/707
> >
>

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-02-08 Thread Chesnay Schepler
How exactly are you tuning SQL jobs without compiled plans while 
ensuring that the resulting compiled plans are compatible? That's 
explicitly not supported by Flink, hence why CompiledPlans exist.
If you change _anything_ the planner is free to generate a completely 
different plan, where you have no guarantees that you can map the state 
between one another.


On 08/02/2024 09:42, Martijn Visser wrote:

Hi,


However, compiled plan is still too complicated for Flink newbies from my point 
of view.

I don't think that the compiled plan was ever positioned to be a
simple solution. If you want to have an easy approach, we have a
declarative solution in place with SQL and/or the Table API imho.

Best regards,

Martijn

On Thu, Feb 8, 2024 at 9:14 AM Zhanghao Chen  wrote:

Hi Piotr,

Thanks for the comment. I agree that compiled plan is the ultimate tool for 
Flink SQL if one wants to make any changes to
query later, and this FLIP indeed is not essential in this sense. However, 
compiled plan is still too complicated for Flink newbies from my point of view. 
As I mentioned previously, our internal platform provides a visualized tool for 
editing the compiled plan but most users still find it complex. Therefore, the 
FLIP can still benefit users with better usability and the proposed changes 
are actually quite lightweight (just copying a new hasher with 2 lines deleted 
+ extending the OperatorIdPair data structure) without much extra effort.

Best,
Zhanghao Chen

From: Piotr Nowojski 
Sent: Thursday, February 8, 2024 14:50
To: Zhanghao Chen 
Cc: Chesnay Schepler ; dev@flink.apache.org 
; Yu Chen 
Subject: Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for 
improved state compatibility on parallelism change

Hey


AFAIK, there's no way to set UIDs for a SQL job,

AFAIK you can't set UID manually, but  Flink SQL generates a compiled plan
of a query with embedded UIDs. As I understand it, using a compiled plan is
the preferred (only?) way for Flink SQL if one wants to make any changes to
query later on or support Flink's runtime upgrades, without losing the
state.

If that's the case, what would be the usefulness of this FLIP? Only for
DataStream API for users that didn't know that they should have manually
configured UIDs? But they have the workaround to actually post-factum add
the UIDs anyway, right? So maybe indeed Chesnay is right that this FLIP is
not that helpful/worth the extra effort?

Best,
Piotrek

czw., 8 lut 2024 o 03:55 Zhanghao Chen 
napisał(a):


Hi Chesnay,

AFAIK, there's no way to set UIDs for a SQL job, it'll be great if you can
share how you allow UID setting for SQL jobs. We've explored providing a
visualized DAG editor for SQL jobs that allows UID setting on our internal
platform, but most users found it too complicated to use. Another
possible way is to utilize SQL hints, but that's complicated as well. From
our experience, many SQL users are not familiar with Flink; what they want
is an experience similar to writing normal SQL in MySQL, without
involving many extra concepts like the DAG and the UID. In fact, some
DataStream and PyFlink users also share the same concern.

On the other hand, some performance tuning is inevitable for
long-running jobs in production, and parallelism tuning is among the most
common techniques. FLIP-367 [1] and FLIP-146 [2] allow users to tune the
parallelism of sources and sinks, and both are well-received in the
discussion thread. Users definitely don't want to lose state after a
parallelism tuning, which is highly risky at present.

Putting these together, I think the FLIP has a high value in production.
Through offline discussion, I learnt that multiple companies have developed
or are trying to develop similar hasher changes in their internal
distributions, including ByteDance, Xiaohongshu, and Bilibili. It'll be
great if we can improve the SQL experience for all community users as well,
WDYT?

Best,
Zhanghao Chen
--
*From:* Chesnay Schepler 
*Sent:* Thursday, February 8, 2024 2:01
*To:* dev@flink.apache.org ; Zhanghao Chen <
zhanghao.c...@outlook.com>; Piotr Nowojski ; Yu
Chen 
*Subject:* Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID
generation for improved state compatibility on parallelism change

The FLIP is a bit weird to be honest. It only applies in cases where
users haven't set uids, but that goes against best-practices and as far
as I'm told SQL also sets UIDs everywhere.

I'm wondering if this is really worth the effort.

On 07/02/2024 10:23, Zhanghao Chen wrote:

After offline discussion with @Yu Chen, I've updated the FLIP [1] to include a design
that allows for compatible hasher upgrade by adding StreamGraphHasherV2 to
the legacy hasher list, which is actually a revival of the idea from
FLINK-5290 [2] when StreamGraphHasherV2 was introduced in Flink 1.2. We're
targeting to make V3 the default hasher in Flink 1.20 given that
state-compatibility is no longer an issue.

[jira] [Created] (FLINK-34413) Drop support for HBase v1

2024-02-08 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-34413:
--

 Summary: Drop support for HBase v1
 Key: FLINK-34413
 URL: https://issues.apache.org/jira/browse/FLINK-34413
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / HBase
Reporter: Martijn Visser


As discussed in 
https://lists.apache.org/thread/6663052dmfnqm8wvqoxx9k8jwcshg1zq 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Drop support for HBase v1

2024-02-08 Thread Martijn Visser
Hi all,

I will open a ticket to drop support for HBase v1. If there are no objections 
brought forward next week, we'll move forward with dropping support for HBase 
v1.

Best regards,

Martijn

On 2024/02/01 02:31:00 jialiang tan wrote:
> Hi Martijn, Ferenc
> Thanks all for driving this. As Ferenc said, HBase 1.x is dead, so on the
> way forward it should be safe to drop it. Same view as mine. So +1 for this.
> 
> Best!
> tanjialiang
> 
> 
>  Replied Message 
> From Ferenc Csaky 
> Date 1/30/2024 22:14
> To  
> Subject Re: [DISCUSS] Drop support for HBase v1
> Hi Martijn,
> 
> thanks for starting the discussion. Let me link the older discussion
> regarding the same topic [1]. My opinion did not change, so +1.
> 
> BR,
> Ferenc
> 
> [1] https://lists.apache.org/thread/x7l2gj8g93r4v6x6953cyt6jrs8c4r1b
> 
> 
> 
> 
> On Monday, January 29th, 2024 at 09:37, Martijn Visser <
> martijnvis...@apache.org> wrote:
> 
> 
> 
> Hi all,
> 
> While working on adding support for Flink 1.19 for HBase, we've run into a
> dependency convergence issue because HBase v1 relies on a really old
> version of Guava.
> 
> HBase v2 has been made available since May 2018, and there have been no new
> releases of HBase v1 since August 2022.
> 
> I would like to propose that the Flink HBase connector drops support for
> HBase v1, and will only continue HBase v2 in the future. I don't think this
> requires a full FLIP and vote, but I do want to start a discussion thread
> for this.
> 
> Best regards,
> 
> Martijn
> 


Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-02-08 Thread Martijn Visser
Hi,

> However, compiled plan is still too complicated for Flink newbies from my 
> point of view.

I don't think that the compiled plan was ever positioned to be a
simple solution. If you want to have an easy approach, we have a
declarative solution in place with SQL and/or the Table API imho.

Best regards,

Martijn

On Thu, Feb 8, 2024 at 9:14 AM Zhanghao Chen  wrote:
>
> Hi Piotr,
>
> Thanks for the comment. I agree that compiled plan is the ultimate tool for 
> Flink SQL if one wants to make any changes to
> query later, and this FLIP indeed is not essential in this sense. However, 
> compiled plan is still too complicated for Flink newbies from my point of 
> view. As I mentioned previously, our internal platform provides a visualized 
> tool for editing the compiled plan but most users still find it complex. 
> Therefore, the FLIP can still benefit users with better useability and the 
> proposed changes are actually quite lightweight (just copying a new hasher 
> with 2 lines deleted + extending the OperatorIdPair data structure) without 
> much extra effort.
>
> Best,
> Zhanghao Chen
> 
> From: Piotr Nowojski 
> Sent: Thursday, February 8, 2024 14:50
> To: Zhanghao Chen 
> Cc: Chesnay Schepler ; dev@flink.apache.org 
> ; Yu Chen 
> Subject: Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for 
> improved state compatibility on parallelism change
>
> Hey
>
> > AFAIK, there's no way to set UIDs for a SQL job,
>
> AFAIK you can't set UID manually, but  Flink SQL generates a compiled plan
> of a query with embedded UIDs. As I understand it, using a compiled plan is
> the preferred (only?) way for Flink SQL if one wants to make any changes to
> query later on or support Flink's runtime upgrades, without losing the
> state.
>
> If that's the case, what would be the usefulness of this FLIP? Only for
> DataStream API for users that didn't know that they should have manually
> configured UIDs? But they have the workaround to actually post-factum add
> the UIDs anyway, right? So maybe indeed Chesnay is right that this FLIP is
> not that helpful/worth the extra effort?
>
> Best,
> Piotrek
>
> On Thu, 8 Feb 2024 at 03:55, Zhanghao Chen wrote:
>
> > Hi Chesnay,
> >
> > AFAIK, there's no way to set UIDs for a SQL job, it'll be great if you can
> > share how you allow UID setting for SQL jobs. We've explored providing a
> > visualized DAG editor for SQL jobs that allows UID setting on our internal
> > platform, but most users found it too complicated to use. Another
> > possible way is to utilize SQL hints, but that's complicated as well. From
> > our experience, many SQL users are not familiar with Flink; what they want
> > is an experience similar to writing normal SQL in MySQL, without
> > involving many extra concepts like the DAG and the UID. In fact, some
> > DataStream and PyFlink users also share the same concern.
> >
> > On the other hand, some performance tuning is inevitable for
> > long-running jobs in production, and parallelism tuning is among the most
> > common techniques. FLIP-367 [1] and FLIP-146 [2] allow users to tune the
> > parallelism of sources and sinks, and both are well-received in the
> > discussion thread. Users definitely don't want to lose state after a
> > parallelism tuning, which is highly risky at present.
> >
> > Putting these together, I think the FLIP has a high value in production.
> > Through offline discussion, I learnt that multiple companies have developed
> > or are trying to develop similar hasher changes in their internal
> > distributions, including ByteDance, Xiaohongshu, and Bilibili. It'll be
> > great if we can improve the SQL experience for all community users as well, WDYT?
> >
> > Best,
> > Zhanghao Chen
> > --
> > *From:* Chesnay Schepler 
> > *Sent:* Thursday, February 8, 2024 2:01
> > *To:* dev@flink.apache.org ; Zhanghao Chen <
> > zhanghao.c...@outlook.com>; Piotr Nowojski ; Yu
> > Chen 
> > *Subject:* Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID
> > generation for improved state compatibility on parallelism change
> >
> > The FLIP is a bit weird to be honest. It only applies in cases where
> > users haven't set uids, but that goes against best-practices and as far
> > as I'm told SQL also sets UIDs everywhere.
> >
> > I'm wondering if this is really worth the effort.
> >
> > On 07/02/2024 10:23, Zhanghao Chen wrote:
> > > After offline discussion with @Yu Chen, I've updated the FLIP [1] to include a design
> > that allows for compatible hasher upgrade by adding StreamGraphHasherV2 to
> > the legacy hasher list, which is actually a revival of the idea from
> > FLINK-5290 [2] when StreamGraphHasherV2 was introduced in Flink 1.2. We're
> > targeting to make V3 the default hasher in Flink 1.20 given that
> > state-compatibility is no longer an issue. Take a review when you have a
> > chance, and I'd like to especially thank @Yu Chen<
> > 

[jira] [Created] (FLINK-34412) ResultPartitionDeploymentDescriptorTest fails due to fatal error (239 exit code)

2024-02-08 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34412:
-

 Summary: ResultPartitionDeploymentDescriptorTest fails due to 
fatal error (239 exit code)
 Key: FLINK-34412
 URL: https://issues.apache.org/jira/browse/FLINK-34412
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.17.2
Reporter: Matthias Pohl


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57388&view=logs&j=77a9d8e1-d610-59b3-fc2a-4766541e0e33&t=125e07e7-8de0-5c6c-a541-a567415af3ef&l=8323

{code}
Feb 08 04:56:31 [ERROR] 
org.apache.flink.runtime.deployment.ResultPartitionDeploymentDescriptorTest
Feb 08 04:56:31 [ERROR] 
org.apache.maven.surefire.booter.SurefireBooterForkException: 
ExecutionException The forked VM terminated without properly saying goodbye. VM 
crash or System.exit called?
Feb 08 04:56:31 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-runtime && 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -XX:+UseG1GC -Xms256m -Xmx768m 
-jar 
/__w/1/s/flink-runtime/target/surefire/surefirebooter6684124987290515696.jar 
/__w/1/s/flink-runtime/target/surefire 2024-02-08T04-45-49_396-jvmRun4 
surefire6142105262662423760tmp surefire_245661504424247139476tmp
Feb 08 04:56:31 [ERROR] Error occurred in starting fork, check output in log
Feb 08 04:56:31 [ERROR] Process Exit Code: 239
Feb 08 04:56:31 [ERROR] Crashed tests:
Feb 08 04:56:31 [ERROR] 
org.apache.flink.runtime.deployment.ResultPartitionDeploymentDescriptorTest
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:532)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkOnceMultiple(ForkStarter.java:405)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:321)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:266)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1314)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:1159)
Feb 08 04:56:31 [ERROR] at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:932)
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34411) "Wordcount on Docker test (custom fs plugin)" timed out with some strange issue while setting the test up

2024-02-08 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34411:
-

 Summary: "Wordcount on Docker test (custom fs plugin)" timed out 
with some strange issue while setting the test up
 Key: FLINK-34411
 URL: https://issues.apache.org/jira/browse/FLINK-34411
 Project: Flink
  Issue Type: Bug
  Components: Test Infrastructure
Affects Versions: 1.19.0, 1.20.0
Reporter: Matthias Pohl


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=57380&view=logs&j=bea52777-eaf8-5663-8482-18fbc3630e81&t=43ba8ce7-ebbf-57cd-9163-444305d74117&l=5802

{code}
Feb 07 15:22:39 
==
Feb 07 15:22:39 Running 'Wordcount on Docker test (custom fs plugin)'
Feb 07 15:22:39 
==
Feb 07 15:22:39 TEST_DATA_DIR: 
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-39516987853
Feb 07 15:22:40 Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT
Feb 07 15:22:40 Flink dist directory: 
/home/vsts/work/1/s/flink-dist/target/flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT
Feb 07 15:22:41 Docker version 24.0.7, build afdd53b
Feb 07 15:22:44 docker-compose version 1.29.2, build 5becea4c
Feb 07 15:22:44 Starting fileserver for Flink distribution
Feb 07 15:22:44 ~/work/1/s/flink-dist/target/flink-1.19-SNAPSHOT-bin ~/work/1/s
Feb 07 15:23:07 ~/work/1/s
Feb 07 15:23:07 
~/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-39516987853 
~/work/1/s
Feb 07 15:23:07 Preparing Dockeriles
Feb 07 15:23:07 Executing command: git clone 
https://github.com/apache/flink-docker.git --branch dev-1.19 --single-branch
Cloning into 'flink-docker'...
/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/common_docker.sh: line 
65: ./add-custom.sh: No such file or directory
Feb 07 15:23:07 Building images
ERROR: unable to prepare context: path "dev/test_docker_embedded_job-ubuntu" 
not found
Feb 07 15:23:09 ~/work/1/s
Feb 07 15:23:09 Command: build_image test_docker_embedded_job failed. 
Retrying...
Feb 07 15:23:14 Starting fileserver for Flink distribution
Feb 07 15:23:14 ~/work/1/s/flink-dist/target/flink-1.19-SNAPSHOT-bin ~/work/1/s
Feb 07 15:23:36 ~/work/1/s
Feb 07 15:23:36 
~/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-39516987853 
~/work/1/s
Feb 07 15:23:36 Preparing Dockeriles
Feb 07 15:23:36 Executing command: git clone 
https://github.com/apache/flink-docker.git --branch dev-1.19 --single-branch
fatal: destination path 'flink-docker' already exists and is not an empty 
directory.
Feb 07 15:23:36 Retry 1/5 exited 128, retrying in 1 seconds...
Traceback (most recent call last):
  File "/home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/python3_fileserver.py", line 26, in <module>
httpd = socketserver.TCPServer(("", ), handler)
  File "/usr/lib/python3.8/socketserver.py", line 452, in __init__
self.server_bind()
  File "/usr/lib/python3.8/socketserver.py", line 466, in server_bind
self.socket.bind(self.server_address)
OSError: [Errno 98] Address already in use
[...]
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for improved state compatibility on parallelism change

2024-02-08 Thread Zhanghao Chen
Hi Piotr,

Thanks for the comment. I agree that compiled plan is the ultimate tool for 
Flink SQL if one wants to make any changes to
query later, and this FLIP indeed is not essential in this sense. However, 
compiled plan is still too complicated for Flink newbies from my point of view. 
As I mentioned previously, our internal platform provides a visualized tool for 
editing the compiled plan but most users still find it complex. Therefore, the 
FLIP can still benefit users with better usability and the proposed changes 
are actually quite lightweight (just copying a new hasher with 2 lines deleted 
+ extending the OperatorIdPair data structure) without much extra effort.

Best,
Zhanghao Chen

From: Piotr Nowojski 
Sent: Thursday, February 8, 2024 14:50
To: Zhanghao Chen 
Cc: Chesnay Schepler ; dev@flink.apache.org 
; Yu Chen 
Subject: Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID generation for 
improved state compatibility on parallelism change

Hey

> AFAIK, there's no way to set UIDs for a SQL job,

AFAIK you can't set UID manually, but  Flink SQL generates a compiled plan
of a query with embedded UIDs. As I understand it, using a compiled plan is
the preferred (only?) way for Flink SQL if one wants to make any changes to
query later on or support Flink's runtime upgrades, without losing the
state.

If that's the case, what would be the usefulness of this FLIP? Only for
DataStream API for users that didn't know that they should have manually
configured UIDs? But they have the workaround to actually post-factum add
the UIDs anyway, right? So maybe indeed Chesnay is right that this FLIP is
not that helpful/worth the extra effort?

Best,
Piotrek

On Thu, 8 Feb 2024 at 03:55, Zhanghao Chen wrote:

> Hi Chesnay,
>
> AFAIK, there's no way to set UIDs for a SQL job, it'll be great if you can
> share how you allow UID setting for SQL jobs. We've explored providing a
> visualized DAG editor for SQL jobs that allows UID setting on our internal
> platform, but most users found it too complicated to use. Another
> possible way is to utilize SQL hints, but that's complicated as well. From
> our experience, many SQL users are not familiar with Flink; what they want
> is an experience similar to writing normal SQL in MySQL, without
> involving many extra concepts like the DAG and the UID. In fact, some
> DataStream and PyFlink users also share the same concern.
>
> On the other hand, some performance tuning is inevitable for
> long-running jobs in production, and parallelism tuning is among the most
> common techniques. FLIP-367 [1] and FLIP-146 [2] allow users to tune the
> parallelism of sources and sinks, and both are well-received in the
> discussion thread. Users definitely don't want to lose state after a
> parallelism tuning, which is highly risky at present.
>
> Putting these together, I think the FLIP has a high value in production.
> Through offline discussion, I learnt that multiple companies have developed
> or are trying to develop similar hasher changes in their internal
> distributions, including ByteDance, Xiaohongshu, and Bilibili. It'll be
> great if we can improve the SQL experience for all community users as well, WDYT?
>
> Best,
> Zhanghao Chen
> --
> *From:* Chesnay Schepler 
> *Sent:* Thursday, February 8, 2024 2:01
> *To:* dev@flink.apache.org ; Zhanghao Chen <
> zhanghao.c...@outlook.com>; Piotr Nowojski ; Yu
> Chen 
> *Subject:* Re: [DISCUSS] FLIP 411: Chaining-agnostic Operator ID
> generation for improved state compatibility on parallelism change
>
> The FLIP is a bit weird to be honest. It only applies in cases where
> users haven't set uids, but that goes against best-practices and as far
> as I'm told SQL also sets UIDs everywhere.
>
> I'm wondering if this is really worth the effort.
>
> On 07/02/2024 10:23, Zhanghao Chen wrote:
> > After offline discussion with @Yu Chen, I've updated the FLIP [1] to include a design
> that allows for compatible hasher upgrade by adding StreamGraphHasherV2 to
> the legacy hasher list, which is actually a revival of the idea from
> FLINK-5290 [2] when StreamGraphHasherV2 was introduced in Flink 1.2. We're
> targeting to make V3 the default hasher in Flink 1.20 given that
> state-compatibility is no longer an issue. Take a review when you have a
> chance, and I'd like to especially thank @Yu Chen<
> mailto:yuchen.e...@gmail.com > for the thorough
> offline discussion and code debugging help to make this possible.
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-411%3A+Chaining-agnostic+Operator+ID+generation+for+improved+state+compatibility+on+parallelism+change
> > [2] https://issues.apache.org/jira/browse/FLINK-5290
> >
> > Best,
> > Zhanghao Chen
> > 
> > From: Zhanghao Chen 
> > Sent: Friday, January 12, 2024 10:46
> > To: Piotr Nowojski ; Yu Chen <
> yuchen.e...@gmail.com>
> > Cc: