[jira] [Created] (FLINK-33720) Building Flink took suspiciously long in e2e1 stage with Hadoop 3.1.3 enabled

2023-11-30 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-33720:
-

 Summary: Building Flink took suspiciously long in e2e1 stage with 
Hadoop 3.1.3 enabled
 Key: FLINK-33720
 URL: https://issues.apache.org/jira/browse/FLINK-33720
 Project: Flink
  Issue Type: Bug
  Components: Build System / CI
Affects Versions: 1.17.2
Reporter: Matthias Pohl


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55085&view=logs&j=bbb1e2a2-a43c-55c8-fb48-5cfe7a8a0ca6&t=ba24ad14-6ea3-5ee3-c4ec-9e7cd2c9e754]

In this build, the stage for building Flink hit the time limit of 6 hours and
was cancelled. No logs were provided, but I suspect an infrastructure problem
(and, therefore, not one falling into the class of issues we can resolve). I
created this issue for documentation purposes anyway.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33719) Cleanup the usage of deprecated StreamTableEnvironment#toRetractStream

2023-11-30 Thread Jane Chan (Jira)
Jane Chan created FLINK-33719:
-

 Summary: Cleanup the usage of deprecated 
StreamTableEnvironment#toRetractStream
 Key: FLINK-33719
 URL: https://issues.apache.org/jira/browse/FLINK-33719
 Project: Flink
  Issue Type: Sub-task
Reporter: Jane Chan








[jira] [Created] (FLINK-33718) Cleanup the usage of deprecated StreamTableEnvironment#toAppendStream

2023-11-30 Thread Jane Chan (Jira)
Jane Chan created FLINK-33718:
-

 Summary: Cleanup the usage of deprecated 
StreamTableEnvironment#toAppendStream
 Key: FLINK-33718
 URL: https://issues.apache.org/jira/browse/FLINK-33718
 Project: Flink
  Issue Type: Sub-task
Reporter: Jane Chan








[jira] [Created] (FLINK-33717) Cleanup the usage of deprecated StreamTableEnvironment#fromDataStream(DataStream, Expression...)

2023-11-30 Thread Jane Chan (Jira)
Jane Chan created FLINK-33717:
-

 Summary: Cleanup the usage of deprecated 
StreamTableEnvironment#fromDataStream(DataStream, Expression...)
 Key: FLINK-33717
 URL: https://issues.apache.org/jira/browse/FLINK-33717
 Project: Flink
  Issue Type: Sub-task
Reporter: Jane Chan








[jira] [Created] (FLINK-33716) Cleanup the usage of deprecated StreamTableEnvironment#createTemporaryView(String, DataStream, Expression...)

2023-11-30 Thread Jane Chan (Jira)
Jane Chan created FLINK-33716:
-

 Summary: Cleanup the usage of deprecated 
StreamTableEnvironment#createTemporaryView(String, DataStream, Expression...)
 Key: FLINK-33716
 URL: https://issues.apache.org/jira/browse/FLINK-33716
 Project: Flink
  Issue Type: Sub-task
Reporter: Jane Chan








[jira] [Created] (FLINK-33715) Enhance history server to archive multiple histories per jobid

2023-11-30 Thread dongwoo.kim (Jira)
dongwoo.kim created FLINK-33715:
---

 Summary: Enhance history server to archive multiple histories per 
jobid
 Key: FLINK-33715
 URL: https://issues.apache.org/jira/browse/FLINK-33715
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: dongwoo.kim


Hello Flink team,

I'd like to propose an improvement to how the JobManager archives job
histories and how the Flink history server fetches them.
Currently, only one job history per job ID can be archived and fetched.
When a Flink job tries to archive its history more than once, a
'FileAlreadyExistsException' is usually thrown.
This makes sense in most cases, since a job typically gets a new ID when it
is restarted from the latest checkpoint/savepoint.

However, there's a specific situation where this behavior can be problematic:

1) When we upgrade a job using the savepoint mode, the job's first history gets 
successfully archived.
2) If the same job later fails due to an error, its history isn't archived 
again because there's already a record with the same job ID.

This can be an issue because the most valuable information – why the job failed 
– gets lost.

To solve this simply, I suggest including currentTimeMillis in the history
filename along with the job ID ({jobid}-{currentTimeMillis}).
On the fetching side, the history server would parse the job ID before the "-"
delimiter and fetch all the histories for that job ID.
For the UI, we can keep the current display, or perhaps add an extra hierarchy
level for each job ID, since each job ID can now have multiple histories.
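The proposed naming and parsing scheme could be sketched as follows. This is an illustrative sketch only; the class and method names are hypothetical, not actual Flink history server code.

```java
// Illustrative sketch of the proposed archive naming scheme
// ({jobid}-{currentTimeMillis}); names are hypothetical, not Flink code.
public class ArchiveNaming {

    // Build the archive file name by appending the archive time to the job id.
    static String archiveName(String jobId, long timestampMillis) {
        return jobId + "-" + timestampMillis;
    }

    // Recover the job id by taking everything before the last "-" delimiter
    // (Flink job ids are hex strings, so they contain no "-" themselves),
    // which lets the fetching side group all archives of one job id.
    static String parseJobId(String archiveName) {
        return archiveName.substring(0, archiveName.lastIndexOf('-'));
    }

    public static void main(String[] args) {
        String name = archiveName("92542d1280187bd464274368a5f86977", 1701302400000L);
        System.out.println(name);             // 92542d1280187bd464274368a5f86977-1701302400000
        System.out.println(parseJobId(name)); // 92542d1280187bd464274368a5f86977
    }
}
```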

If we can reach an agreement, I'll be glad to take on the implementation.
Thanks in advance.





[DISCUSS] FLIP-380: Support Full Partition Processing On Non-keyed DataStream

2023-11-30 Thread Wencong Liu
Hi devs,

I'm excited to propose a new FLIP [1] aimed at enhancing the DataStream API
to support full window processing on non-keyed streams. This feature addresses
the current limitation where non-keyed DataStreams cannot accumulate records
per subtask for collective processing at the end of input.

Key proposals include:


1. Introduction of PartitionWindowedStream allowing non-keyed DataStreams to
be transformed for full window processing per subtask.

2. Addition of four new APIs - mapPartition, sortPartition, aggregate, and
reduce - to enable powerful operations on PartitionWindowedStream.

This initiative seeks to fill the gap left by the deprecation of the DataSet
API, marrying its partition processing strengths with the dynamic capabilities
of the DataStream API.

Looking forward to your feedback on this FLIP.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-380%3A+Support+Full+Partition+Processing+On+Non-keyed+DataStream

Best regards,
Wencong Liu

[jira] [Created] (FLINK-33714) Update documentation about the usage of RuntimeContext#getExecutionConfig

2023-11-30 Thread Junrui Li (Jira)
Junrui Li created FLINK-33714:
-

 Summary: Update documentation about the usage of 
RuntimeContext#getExecutionConfig
 Key: FLINK-33714
 URL: https://issues.apache.org/jira/browse/FLINK-33714
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Reporter: Junrui Li








[jira] [Created] (FLINK-33713) Deprecate RuntimeContext#getExecutionConfig

2023-11-30 Thread Junrui Li (Jira)
Junrui Li created FLINK-33713:
-

 Summary: Deprecate RuntimeContext#getExecutionConfig
 Key: FLINK-33713
 URL: https://issues.apache.org/jira/browse/FLINK-33713
 Project: Flink
  Issue Type: Sub-task
  Components: API / Core
Reporter: Junrui Li


Deprecate RuntimeContext#getExecutionConfig





[jira] [Created] (FLINK-33712) FLIP-391: Deprecate RuntimeContext#getExecutionConfig

2023-11-30 Thread Junrui Li (Jira)
Junrui Li created FLINK-33712:
-

 Summary: FLIP-391: Deprecate RuntimeContext#getExecutionConfig
 Key: FLINK-33712
 URL: https://issues.apache.org/jira/browse/FLINK-33712
 Project: Flink
  Issue Type: Technical Debt
  Components: API / Core
Reporter: Junrui Li


Deprecate RuntimeContext#getExecutionConfig and introduce alternative
getter methods that allow users to access specific information without exposing
unnecessary runtime details. For more details, see
[FLIP-391|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465937]





[RESULT][VOTE] FLIP-391: Deprecate RuntimeContext#getExecutionConfig

2023-11-30 Thread Junrui Lee
The voting time of FLIP-391: Deprecate RuntimeContext#getExecutionConfig[1]
has passed. I'm closing the vote now.

There were 5 +1 votes, 4 of which are binding:

Rui Fan (binding)
Weijie Guo (binding)
Jing Ge (binding)
Zhu Zhu (binding)
Zhanghao Chen (non-binding)


There were no -1 votes.

Thus FLIP-391 has been accepted.

Thanks everyone for joining the discussion and giving feedback!

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465937


Best,
Junrui


Re: [DISCUSS] FLIP-395: Migration to GitHub Actions

2023-11-30 Thread Yangze Guo
Thanks for the efforts, @Matthias. +1 to start a trial on Github
Actions and migrate the CI if we can prove its computation capacity
and stability.

I share Xintong's concern that we do not explicitly state the effect
of this trial on the contribution procedure. I think you
can elaborate more on this in the migration plan section. Here is my
thought about it:
I prefer to enable the CI workflow based on GitHub Actions for each PR
because it helps us understand its stability and performance under
certain pressures. However, I am not inclined to make "passing the CI
via GitHub Actions" a necessity in the code contribution process. Instead, we
can encourage contributors to report unstable cases under a specific
ticket umbrella when they encounter them.

Best,
Yangze Guo

On Thu, Nov 30, 2023 at 12:10 AM Matthias Pohl
 wrote:
>
> With regards to Alex' concerns on hardware disparity: I did a bit more
> digging on that one. I added my findings in a hardware section to FLIP-396
> [1]. It appears that the hardware is more or less the same between the
> different hosts. Apache INFRA's runners have more disk space (1TB in
> comparison to 14GB), though.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-396%3A+Trial+during+Flink+1.19+Cycle+to+test+migrating+to+GitHub+Actions#FLIP396:TrialduringFlink1.19CycletotestmigratingtoGitHubActions-HardwareSpecifications
>
> On Wed, Nov 29, 2023 at 4:01 PM Matthias Pohl 
> wrote:
>
> > Thanks for your feedback Alex. I responded to your comments below:
> >
> > This is mentioned in the "Limitations of GitHub Actions in the past"
> >> section of the FLIP. Does this also apply to the Apache INFRA setup or
> >> can we expect contributors' runs executed there too?
> >
> >
> > Workflow runs on Flink forks (independent of PRs that would merge to
> > Apache Flink's core repo) will be executed with runners provided by GitHub
> > with their own limitations. Secrets are not set in these runs (similar to
> > what we have right now with PR runs).
> >
> > If we allow the PR CI to run on Apache INFRA-hosted ephemeral runners, we
> > might have the same freedom because of their ephemeral nature (the VMs are
> > discarded afterwards).
> >
> > We only have to start thinking about self-hosted customized runners if we
> > decide/need to have dedicated VMs for Flink's CI (similar to what we have
> > right now with Azure CI and Alibaba's VMs). This might happen if the
> > waiting times for acquiring a runner are too long. In that case, we might
> > give a certain group of people (e.g. committers) or certain types of events
> > (for PRs,  nightly builds, PR merges) the ability to use the self-hosted
> > runners.
> >
> > As you mentioned in the FLIP, there are some timeout-related test
> >> discrepancies between different setups. Similar discrepancies could
> >> manifest themselves between the Github runners and the Apache INFRA
> >> runners. It would be great if we could have a uniform setup, where if
> >> tests pass in the individual CI, they also pass in the main runner and vice
> >> versa.
> >
> >
> > I agree. So far, what we've seen is that the timeout instability is coming
> > from too optimistic timeout configurations in some tests (they eventually
> > also fail in Azure CI; but the GitHub-provided runners seem to be more
> > sensitive in this regard). Fixing the tests if such a flakiness is observed
> > should bring us to a stage where the test behavior is matching between
> > different runners.
> >
> > We had a similar issue in the Azure CI setup: Certain tests were more
> > stable on the Alibaba machines than on Azure VMs. That is why we introduced
> > a dedicated stage for Azure CI VMs as part of the nightly runs (see
> > FLINK-18370 [1]). We could do the same for GitHub Actions if necessary.
> >
> > Currently we have such memory limits-related issues in individual vs main
> >> Azure CI pipelines.
> >
> >
> > I'm not sure I understand what you mean by memory limit-related issues.
> > The GitHub-provided runners do not seem to run into memory-related issues.
> > We have to see whether this also applies to Apache INFRA-provided runners.
> > My hope is that they have even better hardware than what GitHub offers. But
> > GitHub-provided runners seem to be a good fallback to rely on (see the
> > workflows I shared in my previous response to Xintong's message).
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-18370
> >
> > On Wed, Nov 29, 2023 at 3:17 PM Matthias Pohl 
> > wrote:
> >
> >> Thanks for your comments, Xintong. See my answers below.
> >>
> >>
> >>> I think it would be helpful if we can at the end migrate the CI to an
> >>> ASF-managed Github Action, as long as it provides us a similar
> >>> computation capacity and stability.
> >>
> >>
> >> The current test runs in my Flink fork (using the GitHub-provided
> >> runners) suggest that even with using generic GitHub runners we get decent
> >> performance and stability. In this way I'm confident that we wouldn't lose
> >

Re: [VOTE] FLIP-364: Improve the restart-strategy

2023-11-30 Thread Zhu Zhu
+1 (binding)

Thanks,
Zhu

Zhanghao Chen wrote on Thursday, Nov 30, 2023 at 23:31:

> +1 (non-binding)
>
> Best,
> Zhanghao Chen
> 
> From: Rui Fan <1996fan...@gmail.com>
> Sent: Monday, November 13, 2023 11:01
> To: dev 
> Subject: [VOTE] FLIP-364: Improve the restart-strategy
>
> Hi everyone,
>
> Thank you to everyone for the feedback on FLIP-364: Improve the
> restart-strategy[1]
> which has been discussed in this thread [2].
>
> I would like to start a vote for it. The vote will be open for at least 72
> hours unless there is an objection or not enough votes.
>
> [1] https://cwiki.apache.org/confluence/x/uJqzDw
> [2] https://lists.apache.org/thread/5cgrft73kgkzkgjozf9zfk0w2oj7rjym
>
> Best,
> Rui
>


Re: [DISCUSS] Contribute Flink Doris Connector to the Flink community

2023-11-30 Thread wudi
Thank you everyone. But I encountered a problem when creating the FLIP: I have
no permission to create pages in the Flink Improvement Proposals [1] space. I
may need a PMC member to help me add permissions. My Jira account is Di Wu and
my email is d...@apache.org. Thanks.

[1]
https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals



--

Brs,

di.wu


> On Nov 27, 2023 at 1:22 PM, Jing Ge wrote:
> 
> That sounds great! +1
> 
> Best regards
> Jing
> 
> On Mon, Nov 27, 2023 at 3:38 AM Leonard Xu  wrote:
> 
>> Thanks wudi for kicking off the discussion,
>> 
>> +1 for the idea from my side.
>> 
>> A FLIP like Yun posted is required if no other objections.
>> 
>> Best,
>> Leonard
>> 
>>> On Nov 26, 2023 at 6:22 PM, wudi <676366...@qq.com.INVALID> wrote:
>>> 
>>> Hi all,
>>> 
>>> At present, Flink Connector and Flink's repository have been
>> decoupled[1].
>>> At the same time, the Flink-Doris-Connector [3] has been maintained in the
>>> Apache Doris [2] community.
>>> I think the Flink Doris Connector can be migrated to the Flink community
>>> because it is part of Flink Connectors and can also expand the ecosystem
>>> of Flink Connectors.
>>> 
>>> I volunteer to move this forward if I can.
>>> 
>>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development
>>> [2] https://doris.apache.org/
>>> [3] https://github.com/apache/doris-flink-connector
>>> 
>>> --
>>> 
>>> Brs,
>>> di.wu
>> 
>> 



[jira] [Created] (FLINK-33711) Fix number of fields of the taxi event / Fix broken link

2023-11-30 Thread Leona Yoda (Jira)
Leona Yoda created FLINK-33711:
--

 Summary: Fix number of fields of the taxi event / Fix broken link
 Key: FLINK-33711
 URL: https://issues.apache.org/jira/browse/FLINK-33711
 Project: Flink
  Issue Type: Improvement
  Components: Documentation / Training / Exercises
Reporter: Leona Yoda


* Fix number of fields of the taxi event from 11 to 10.
 ** According to https://issues.apache.org/jira/browse/FLINK-23926, the number 
of fields decreased from 11 to 10.
 * Fix broken link
 ** Fix "use taxi data streams" link. On Chinese document, it seems to be added 
additional ancher for this issue.





Doc about cleaning savePoints and checkPoints

2023-11-30 Thread Rodrigo Meneses
Hi Flink Community,

I'm searching for docs about how the cleaning of checkpoints and savepoints
actually work.

I'm particularly interested in the case where the user has the `NATIVE` format
(incremental savepoints). Somehow, when using the NATIVE format, the number of
savepoints kept does not match the savepoint parameters, like:
```
  ["kubernetes.operator.savepoint.history.max.age"] = "7d"
  ["kubernetes.operator.savepoint.history.max.count"] = "14"
```

Also, I would like to understand better when the checkpoints are cleaned.
According to
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/checkpoints/
the checkpoints are cleaned when a program is cancelled. What happens if a
user suspends and then restores the job? Or when a user upgrades the job?
Are the checkpoints also cleaned in this situation?

Thanks so much for your time
-Rodrigo


[jira] [Created] (FLINK-33710) Autoscaler redeploys pipeline for a NOOP parallelism change

2023-11-30 Thread Maximilian Michels (Jira)
Maximilian Michels created FLINK-33710:
--

 Summary: Autoscaler redeploys pipeline for a NOOP parallelism 
change
 Key: FLINK-33710
 URL: https://issues.apache.org/jira/browse/FLINK-33710
 Project: Flink
  Issue Type: Bug
  Components: Autoscaler, Kubernetes Operator
Affects Versions: kubernetes-operator-1.7.0, kubernetes-operator-1.6.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels
 Fix For: kubernetes-operator-1.8.0


The operator supports two modes to apply autoscaler changes:

# Use the internal Flink config {{pipeline.jobvertex-parallelism-overrides}} 
# Make use of Flink's Rescale API 

For (1), a string has to be generated for the Flink config with the actual 
overrides. This string has to be deterministic for a given map. But it is not.

Consider the following observed log:

{noformat}
  >>> Event  | Info| SPECCHANGED | SCALE change(s) detected (Diff: 
FlinkDeploymentSpec[flinkConfiguration.pipeline.jobvertex-parallelism-overrides 
: 
92542d1280187bd464274368a5f86977:3,9f979ed859083299d29f281832cb5be0:1,84881d7bda0dc3d44026e37403420039:1,1652184ffd0522859c7840a24936847c:1
 -> 
9f979ed859083299d29f281832cb5be0:1,84881d7bda0dc3d44026e37403420039:1,92542d1280187bd464274368a5f86977:3,1652184ffd0522859c7840a24936847c:1]),
 starting reconciliation. 
{noformat}

The overrides are identical, but the order is different, which triggers a
redeploy. This does not seem to happen often, but deterministic string
generation (e.g. sorting by key) is required to prevent such NOOP updates.
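The sort-by-key fix suggested above could be sketched as follows. This is an illustrative sketch, not the operator's actual implementation; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative sketch of rendering pipeline.jobvertex-parallelism-overrides
// deterministically: sorting by vertex id makes the rendered string identical
// for equal maps, so a reordered-but-equal map no longer looks like a spec
// change. (Hypothetical names, not the Kubernetes operator's actual code.)
public class OverridesRenderer {

    static String render(Map<String, Integer> overrides) {
        // TreeMap iterates keys in sorted order, fixing the output order.
        return new TreeMap<>(overrides).entrySet().stream()
                .map(e -> e.getKey() + ":" + e.getValue())
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        Map<String, Integer> a = Map.of("92542d12", 3, "9f979ed8", 1);
        Map<String, Integer> b = Map.of("9f979ed8", 1, "92542d12", 3);
        // Equal maps render identically regardless of iteration order.
        System.out.println(render(a).equals(render(b))); // true
    }
}
```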





Re: [RESULT][VOTE] Apache Flink Kafka Connectors v3.0.2, RC #1

2023-11-30 Thread Tzu-Li (Gordon) Tai
I'm happy to announce that we have unanimously approved this release.

There are 6 approving votes, 3 of which are binding:
* Gordon Tai (binding)
* Rui Fan
* Jing Ge
* Leonard Xu (binding)
* Martijn Visser (binding)
* Sergey Nuyanzin

There are no disapproving votes.

Thanks everyone! I'll now release the artifacts, and separately announce
once everything is ready.

Best,
Gordon


On Thu, Nov 30, 2023 at 3:05 AM Sergey Nuyanzin  wrote:

> +1(non-binding)
>
> - Downloaded all the resources
> - Verified signatures
> - Validated hashsums
> - Built from source code
> - Checked Github release tag
> - Reviewed the web PR
>
> On Mon, Nov 27, 2023 at 9:05 AM Martijn Visser 
> wrote:
>
> > +1 (binding)
> >
> > - Validated hashes
> > - Verified signature
> > - Verified that no binaries exist in the source archive
> > - Build the source with Maven
> > - Verified licenses
> > - Verified web PRs
> >
> > On Mon, Nov 27, 2023 at 4:17 AM Leonard Xu  wrote:
> > >
> > > +1 (binding)
> > >
> > > - checked the flink-connector-base dependency scope has been changed to
> > provided
> > > - built from source code succeeded
> > > - verified signatures
> > > - verified hashsums
> > > - checked the contents contains jar and pom files in apache repo
> > > - checked Github release tag
> > > - checked release notes
> > > - reviewed the web PR
> > >
> > > Best,
> > > Leonard
> > >
> > >
> > > > On Nov 26, 2023 at 4:40 PM, Jing Ge wrote:
> > > >
> > > > +1 (non-binding)
> > > >
> > > > - verified signature and hash
> > > > - checked repo
> > > > - checked tag, BTW, the tag link at [5] should be
> > > >
> > https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.2-rc1
> > > > - verified source archives do not contains any binaries
> > > > - build source maven 3.8.6 and jdk11
> > > > - verified web PR
> > > >
> > > > Best regards,
> > > > Jing
> > > >
> > > > On Sat, Nov 25, 2023 at 6:44 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> > > >
> > > >> +1 (non-binding)
> > > >>
> > > >> - Validated checksum hash
> > > >> - Verified signature
> > > >> - Verified that no binaries exist in the source archive
> > > >> - Build the source with Maven and jdk8
> > > >> - Verified licenses
> > > >> - Verified web PRs
> > > >>
> > > >> Best,
> > > >> Rui
> > > >>
> > > >> On Sat, Nov 25, 2023 at 2:05 AM Tzu-Li (Gordon) Tai <
> > tzuli...@apache.org>
> > > >> wrote:
> > > >>
> > > >>> +1 (binding)
> > > >>>
> > > >>> - Verified signature and hashes
> > > >>> - Verified mvn dependency:tree for a typical user job jar [1]. When
> > using
> > > >>> Flink 1.18.0, flink-connector-base is no longer getting bundled,
> and
> > all
> > > >>> Flink dependencies resolve as 1.18.0 / provided.
> > > >>> - Submitting user job jar to local Flink 1.18.0 cluster works and
> job
> > > >> runs
> > > >>>
> > > >>> note: If running in the IDE, the flink-connector-base dependency is
> > > >>> explicitly required when using KafkaSource. Otherwise, if
> submitting
> > an
> > > >>> uber jar, the flink-connector-base dependency should not be bundled
> > as
> > > >>> it'll be provided by the Flink distribution and will already be on
> > the
> > > >>> classpath.
> > > >>>
> > > >>> [1] mvn dependency:tree output
> > > >>> ```
> > > >>> [INFO] com.tzulitai:testing-kafka:jar:1.0-SNAPSHOT
> > > >>> [INFO] +- org.apache.flink:flink-streaming-java:jar:1.18.0:provided
> > > >>> [INFO] |  +- org.apache.flink:flink-core:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-annotations:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-metrics-core:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > org.apache.flink:flink-shaded-asm-9:jar:9.5-17.0:provided
> > > >>> [INFO] |  |  +-
> > > >>> org.apache.flink:flink-shaded-jackson:jar:2.14.2-17.0:provided
> > > >>> [INFO] |  |  +-
> org.apache.commons:commons-lang3:jar:3.12.0:provided
> > > >>> [INFO] |  |  +- org.apache.commons:commons-text:jar:1.10.0:provided
> > > >>> [INFO] |  |  +- com.esotericsoftware.kryo:kryo:jar:2.24.0:provided
> > > >>> [INFO] |  |  |  +-
> > com.esotericsoftware.minlog:minlog:jar:1.2:provided
> > > >>> [INFO] |  |  |  \- org.objenesis:objenesis:jar:2.1:provided
> > > >>> [INFO] |  |  +-
> > > >> commons-collections:commons-collections:jar:3.2.2:provided
> > > >>> [INFO] |  |  \-
> org.apache.commons:commons-compress:jar:1.21:provided
> > > >>> [INFO] |  +-
> > org.apache.flink:flink-file-sink-common:jar:1.18.0:provided
> > > >>> [INFO] |  +- org.apache.flink:flink-runtime:jar:1.18.0:provided
> > > >>> [INFO] |  |  +- org.apache.flink:flink-rpc-core:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > > >> org.apache.flink:flink-rpc-akka-loader:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> > > >>>
> > org.apache.flink:flink-queryable-state-client-java:jar:1.18.0:provided
> > > >>> [INFO] |  |  +-
> org.apache.flink:flink-hadoop-fs:jar:1.18.0:provided
> > > >>> [INFO] |  |  +- commons-io:commons-io:jar:2.11.0:provided
> > > >>> [INFO] |  |  +-
> > > >>> org.apache.flink:flink-shade

Re: [DISCUSS] Release flink-connector-parent v1.01

2023-11-30 Thread Etienne Chauchot
Thanks Sergey for your vote. Indeed, I have listed only the PRs merged
since the last release, but there are two open PRs that could be worth
reviewing/merging before the release.


https://github.com/apache/flink-connector-shared-utils/pull/25

https://github.com/apache/flink-connector-shared-utils/pull/20

Best

Etienne


On 30/11/2023 at 11:12, Sergey Nuyanzin wrote:

thanks for volunteering Etienne

+1 for releasing.
However, there is one more PR to enable custom JVM flags for connectors,
similar to how it is done for modules in the Flink main repo.
It will simplify Java 17 support a bit.

could we have this as well in the coming release?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot
wrote:


Hi all,

I would like to discuss making a v1.0.1 release of flink-connector-parent.

Since last release, there were only 2 changes:

-https://github.com/apache/flink-connector-shared-utils/pull/19
(spotless addition)

-https://github.com/apache/flink-connector-shared-utils/pull/26
(surefire configuration)

The new release would bring the ability to skip some tests in the
connectors, among other things the archunit tests. It is
important for connectors to be able to skip archunit tests when testing against
a Flink version that changes the archunit rules, leading to a change of
the violation store. As there is only one violation store and a
connector needs to be tested against the last two minor Flink versions, only
the version the connector was built against should run the archunit
tests and have them reflected in the violation store.


I volunteer to make the release. As it would be my first ASF release, I
might require the guidance of one of the PMC members.


Best

Etienne






Re: [VOTE] FLIP-364: Improve the restart-strategy

2023-11-30 Thread Zhanghao Chen
+1 (non-binding)

Best,
Zhanghao Chen

From: Rui Fan <1996fan...@gmail.com>
Sent: Monday, November 13, 2023 11:01
To: dev 
Subject: [VOTE] FLIP-364: Improve the restart-strategy

Hi everyone,

Thank you to everyone for the feedback on FLIP-364: Improve the
restart-strategy[1]
which has been discussed in this thread [2].

I would like to start a vote for it. The vote will be open for at least 72
hours unless there is an objection or not enough votes.

[1] https://cwiki.apache.org/confluence/x/uJqzDw
[2] https://lists.apache.org/thread/5cgrft73kgkzkgjozf9zfk0w2oj7rjym

Best,
Rui


Re: [VOTE] Release flink-shaded 18.0, release candidate #1

2023-11-30 Thread Jing Ge
+1 (non-binding)

- validate checksum
- validate hash
- checked the release notes
- verified that no binaries exist in the source archive
- build the source with Maven 3.8.6 and jdk11
- checked repo
- checked tag
- verified web PR

Best regards,
Jing

On Thu, Nov 30, 2023 at 11:39 AM Sergey Nuyanzin 
wrote:

> +1 (non-binding)
>
> - Downloaded all the resources
> - Validated checksum hash
> - Build the source with Maven and jdk8
> - Build Flink master with new flink-shaded and check that all the tests are
> passing
>
> one minor thing that I noticed during releasing: CI uses Maven 3.8.6,
> while for the release profile there is an enforcer plugin that
> checks that the Maven version is less than 3.3.
> I created a Jira issue [1] for that.
> I made the release with Maven 3.2.5 (I suppose the previous release was
> also done with 3.2.5 because of the same issue)
>
> [1] https://issues.apache.org/jira/browse/FLINK-33703
>
> On Wed, Nov 29, 2023 at 11:41 AM Matthias Pohl 
> wrote:
>
> > +1 (binding)
> >
> > * Downloaded all resources
> > * Extracts sources and compilation on these sources
> > * Diff of git tag checkout with downloaded sources
> > * Verifies SHA512 checksums & GPG certification
> > * Checks that all POMs have the right expected version
> > * Generated diffs to compare pom file changes with NOTICE files: Nothing
> > suspicious found except for a minor (non-blocking) typo [1]
> >
> > Thanks for driving this effort, Sergey. :)
> >
> > [1] https://github.com/apache/flink-shaded/pull/126/files#r1409080162
> >
> > On Wed, Nov 29, 2023 at 10:25 AM Rui Fan <1996fan...@gmail.com> wrote:
> >
> >> Sorry, it's non-binding.
> >>
> >> On Wed, Nov 29, 2023 at 5:19 PM Rui Fan <1996fan...@gmail.com> wrote:
> >>
> >> > Thanks Matthias for the clarification!
> >> >
> >> > After I import the latest KEYS, it works fine.
> >> >
> >> > +1(binding)
> >> >
> >> > - Validated checksum hash
> >> > - Verified signature
> >> > - Verified that no binaries exist in the source archive
> >> > - Build the source with Maven and jdk8
> >> > - Verified licenses
> >> > - Verified web PRs, and left a comment
> >> >
> >> > Best,
> >> > Rui
> >> >
> >> > On Wed, Nov 29, 2023 at 5:05 PM Matthias Pohl
> >> >  wrote:
> >> >
> >> >> The key is the last key in the KEYS file. It's just having a different
> >> >> format with spaces being added (due to different gpg versions?):
> >> >> F752 9FAE 2481 1A5C 0DF3  CA74 1596 BBF0 7268 35D8
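The two renderings of the fingerprint differ only in whitespace, so a simple normalization makes them comparable. A minimal illustration (hypothetical helper name, not gpg code):

```java
// Quick illustration: the spaced gpg fingerprint rendering and the compact
// one differ only in whitespace, so stripping whitespace makes them equal.
// (Helper name is hypothetical; this is not gpg code.)
public class FingerprintCompare {

    static String normalize(String fingerprint) {
        // Remove all whitespace runs, including the double space gpg
        // inserts at the midpoint of a fingerprint.
        return fingerprint.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String spaced  = "F752 9FAE 2481 1A5C 0DF3  CA74 1596 BBF0 7268 35D8";
        String compact = "F7529FAE24811A5C0DF3CA741596BBF0726835D8";
        System.out.println(normalize(spaced).equals(compact)); // true
    }
}
```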
> >> >>
> >> >> On Wed, Nov 29, 2023 at 9:41 AM Rui Fan <1996fan...@gmail.com>
> wrote:
> >> >>
> >> >> > Hey Sergey,
> >> >> >
> >> >> > Thank you for driving this release.
> >> >> >
> >> >> > I try to check this signature, the whole key is
> >> >> > F7529FAE24811A5C0DF3CA741596BBF0726835D8,
> >> >> > it matches your 1596BBF0726835D8, but I cannot
> >> >> > find it from the Flink KEYS[1].
> >> >> >
> >> >> > Please correct me if my operation is wrong, thanks~
> >> >> >
> >> >> > [1] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> >> >
> >> >> > Best,
> >> >> > Rui
> >> >> >
> >> >> >
> >> >> > On Wed, Nov 29, 2023 at 6:09 AM Sergey Nuyanzin <
> snuyan...@gmail.com
> >> >
> >> >> > wrote:
> >> >> >
> >> >> > > Hi everyone,
> >> >> > > Please review and vote on the release candidate #1 for the
> version
> >> >> 18.0,
> >> >> > as
> >> >> > > follows:
> >> >> > > [ ] +1, Approve the release
> >> >> > > [ ] -1, Do not approve the release (please provide specific
> >> comments)
> >> >> > >
> >> >> > >
> >> >> > > The complete staging area is available for your review, which
> >> >> includes:
> >> >> > > * JIRA release notes [1],
> >> >> > > * the official Apache source release to be deployed to
> >> >> dist.apache.org
> >> >> > > [2],
> >> >> > > which are signed with the key with fingerprint 1596BBF0726835D8
> >> [3],
> >> >> > > * all artifacts to be deployed to the Maven Central Repository
> [4],
> >> >> > > * source code tag "release-18.0-rc1" [5],
> >> >> > > * website pull request listing the new release [6].
> >> >> > >
> >> >> > > The vote will be open for at least 72 hours. It is adopted by
> >> majority
> >> >> > > approval, with at least 3 PMC affirmative votes.
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Sergey
> >> >> > >
> >> >> > > [1]
> >> https://issues.apache.org/jira/projects/FLINK/versions/12353081
> >> >> > > [2]
> >> >> https://dist.apache.org/repos/dist/dev/flink/flink-shaded-18.0-rc1
> >> >> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >> >> > > [4]
> >> >> > >
> >> >>
> >> https://repository.apache.org/content/repositories/orgapacheflink-1676/
> >> >> > > [5]
> >> >> https://github.com/apache/flink-shaded/releases/tag/release-18.0-rc1
> >> >> > > [6] https://github.com/apache/flink-web/pull/701
> >> >> > >
> >> >> >
> >> >>
> >> >
> >>
> >
>
> --
> Best regards,
> Sergey
>


Re: [ANNOUNCE] Apache Flink 1.16.3 released

2023-11-30 Thread Maximilian Michels
Thank you Rui for driving this!

On Thu, Nov 30, 2023 at 3:01 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> The Apache Flink community is very happy to announce the release of
> Apache Flink 1.16.3, which is the
> third bugfix release for the Apache Flink 1.16 series.
>
>
>
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data
> streaming applications.
>
>
>
> The release is available for download at:
>
> https://flink.apache.org/downloads.html
>
>
>
> Please check out the release blog post for an overview of the
> improvements for this bugfix release:
>
> https://flink.apache.org/2023/11/29/apache-flink-1.16.3-release-announcement/
>
>
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353259
>
>
>
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
>
>
>
> Feel free to reach out to the release managers (or respond to this
> thread) with feedback on the release process. Our goal is to
> constantly improve the release process. Feedback on what could be
> improved or things that didn't go so well is appreciated.
>
>
>
> Regards,
>
> Release Manager


Re: [VOTE] FLIP-364: Improve the restart-strategy

2023-11-30 Thread Maximilian Michels
+1 (binding)

-Max

On Thu, Nov 30, 2023 at 9:15 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> +1(binding)
>
> Best,
> Rui
>
> On Mon, Nov 13, 2023 at 11:01 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > Thank you to everyone for the feedback on FLIP-364: Improve the
> > restart-strategy[1]
> > which has been discussed in this thread [2].
> >
> > I would like to start a vote for it. The vote will be open for at least 72
> > hours unless there is an objection or not enough votes.
> >
> > [1] https://cwiki.apache.org/confluence/x/uJqzDw
> > [2] https://lists.apache.org/thread/5cgrft73kgkzkgjozf9zfk0w2oj7rjym
> >
> > Best,
> > Rui
> >


[jira] [Created] (FLINK-33709) Report CheckpointStats as Spans

2023-11-30 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-33709:
--

 Summary: Report CheckpointStats as Spans
 Key: FLINK-33709
 URL: https://issues.apache.org/jira/browse/FLINK-33709
 Project: Flink
  Issue Type: Sub-task
Reporter: Piotr Nowojski






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33708) Add Span and TraceReporter concepts

2023-11-30 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-33708:
--

 Summary: Add Span and TraceReporter concepts
 Key: FLINK-33708
 URL: https://issues.apache.org/jira/browse/FLINK-33708
 Project: Flink
  Issue Type: Sub-task
Reporter: Piotr Nowojski








Re: [DISCUSS] Resolve diamond inheritance of Sink.createWriter

2023-11-30 Thread Becket Qin
Hi folks,

Sorry for replying late on the thread.

For this particular FLIP, I see two solutions:

Option 1:
1. On top of the current status, rename
*org.apache.flink.api.connector.sink2.InitContext* to *CommonInitContext*
(should probably be package private).
2. Change the name *WriterInitContext* back to *InitContext*, and revert
the deprecation. We can change the parameter name to writerContext if we
want to.
Admittedly, this does not have fully symmetric naming of the InitContexts -
we will have CommonInitContext / InitContext / CommitterInitContext instead
of InitContext / WriterInitContext / CommitterInitContext. However, the
naming seems clear without much confusion. Personally, I can live with
that, treating the class InitContext as a non-ideal legacy class name
without much material harm.
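For illustration, the shape Option 1 describes could be sketched as below. This is a hypothetical, heavily simplified sketch: the member method is a placeholder, not the real Flink API.

```java
// Hypothetical sketch of the Option 1 naming; members are placeholders only.
public class InitContextSketch {

    // Common parent (package private in the actual proposal).
    interface CommonInitContext {
        int getSubtaskId();
    }

    // Keeps its legacy name instead of becoming "WriterInitContext".
    interface InitContext extends CommonInitContext {}

    interface CommitterInitContext extends CommonInitContext {}

    public static void main(String[] args) {
        // Both contexts share the common parent, so generic code can accept either.
        InitContext writerContext = () -> 3;
        CommitterInitContext committerContext = () -> 7;
        if (writerContext.getSubtaskId() != 3) throw new AssertionError();
        if (committerContext.getSubtaskId() != 7) throw new AssertionError();
        System.out.println("ok");
    }
}
```

This keeps `InitContext` as the writer-side name, at the cost of the naming asymmetry discussed above.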

Option 2:
Theoretically speaking, if we really want to reach the perfect state while
being backwards compatible, we can create a brand new set of Sink
interfaces and deprecate the old ones. But I feel this is overkill here.

The solution to this particular issue aside, the evolvability of the
current interface hierarchy seems a more fundamental issue and worries me
more. I haven't completely thought it through, but there are two noticeable
differences in the interface design principles between Source and Sink.
1. Source uses decorative interfaces. For example, we have a
SupportsFilterPushdown interface, instead of a subclass of
FilterableSource. This seems to provide better flexibility.
2. Source tends to have a more coarse-grained interface. For example,
SourceReader always has the methods of snapshotState(),
notifyCheckpointComplete(). Even if they may not always be required, we do
not separate them into different interfaces.
My hunch is that if we follow a similar approach as Source, the evolvability
might be better. If we want to do this, we'd better do it before 2.0.
What do you think?
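The decorative-interface style referenced in point 1 can be sketched as follows. This is a simplified, hypothetical illustration; the real Flink interfaces differ in names and signatures.

```java
// Hypothetical sketch of the decorative-interface pattern: a capability is
// added via a mix-in interface instead of a FilterableSource subclass.
public class DecorativeInterfaceSketch {

    interface Source {
        String name();
    }

    // Optional capability, mixed in by sources that support it.
    interface SupportsFilterPushdown {
        void applyFilter(String predicate);
    }

    static class MySource implements Source, SupportsFilterPushdown {
        String pushedFilter = "";

        public String name() { return "my-source"; }

        public void applyFilter(String predicate) { pushedFilter = predicate; }
    }

    public static void main(String[] args) {
        Source source = new MySource();
        // Framework code probes for the optional capability at runtime.
        if (source instanceof SupportsFilterPushdown) {
            ((SupportsFilterPushdown) source).applyFilter("a > 1");
        }
        System.out.println(((MySource) source).pushedFilter);
    }
}
```

The flexibility comes from capabilities being composable: a source opts into any subset of them without needing a combinatorial subclass hierarchy.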

Process wise,
- I agree that if there is a change to the passed FLIP during
implementation, it should be brought back to the mailing list.
- There might be value for the connector nightly build to depend on the
latest snapshot of the same Flink major version. It helps catch
unexpected breaking changes sooner.
- I'll update the website to reflect the latest API stability policy.
Apologies for the confusion caused by the stale doc.

Thanks,

Jiangjie (Becket) Qin



On Wed, Nov 29, 2023 at 10:55 PM Márton Balassi 
wrote:

> Thanks, Martijn and Peter.
>
> In terms of the concrete issue:
>
>- I am following up with the author of FLIP-321 [1] (Becket) to update
>the docs [2] to reflect the right state.
>- I see two reasonable approaches in terms of proceeding with the
>specific changeset:
>
>
>1. We allow the exception from FLIP-321 for this change and let the
>   PublicEvolving API change happen between Flink 1.18 and 1.19, which
> is
>   consistent with current state of the relevant documentation. [2]
> We commit
>   to helping the connector repos make the necessary (one liner)
> changes.
>   2. We revert back to the original implementation plan as explicitly
>   voted on in FLIP-371 [3]. That has no API breaking changes.
> However we end
>   up with an inconsistently named API with duplicated internal
> methods. Peter
>   has also discovered additional bad patterns during his work in
> FLIP-372
>   [3], the total of these changes could be handled in a separate FLIP
> that
>   would do multiple PublicEvolving breaking changes to clean up the
> API.
>
> In terms of the general issues:
>
>- I agree that if a PR review of an accepted FLIP newly introduces a
>breaking API change, that warrants an update to the mailing list
> discussion
>and possibly even a new vote.
>- I agree with the general sentiment of FLIP-321 to provide stronger API
>guarantees with the minor note that if we have changes in mind we should
>prioritize them now such that they can be validated by Flink 2.0.
>- I agree that ideally the connector repos should build against the
>latest release and not the master branch.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-321%3A+Introduce+an+API+deprecation+process
> [2]
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/upgrading/#api-compatibility-guarantees
> [3]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-371%3A+Provide+initialization+context+for+Committer+creation+in+TwoPhaseCommittingSink
> [4]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Allow+TwoPhaseCommittingSink+WithPreCommitTopology+to+alter+the+type+of+the+Committable
>
> Best,
> Marton
>
> On Mon, Nov 27, 2023 at 7:23 PM Péter Váry 
> wrote:
>
> > I think we should try to separate the discussion in a few different
> topics:
> >
> >- Concrete issue
> >   - How to solve this problem in 1.19 and wrt the affected
> createWriter
> >   interface
> >   - Update the documentation

[ANNOUNCE] Apache flink-connector-aws 4.2.0 released

2023-11-30 Thread Danny Cranmer
The Apache Flink community is very happy to announce the release of Apache
flink-connector-aws 4.2.0. This release supports Flink 1.17 and 1.18.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353011

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Danny


Re: [DISCUSS] FLIP-389: Annotate SingleThreadFetcherManager and FutureCompletingBlockingQueue as PublicEvolving

2023-11-30 Thread Hongshun Wang
Hi all,
Any additional questions or concerns regarding FLIP-389 [1]? Looking
forward to hearing from you.

Thanks,
Hongshun Wang

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465498

On Wed, Nov 22, 2023 at 3:44 PM Hongshun Wang 
wrote:

> Hi Becket,
>
> Thanks a lot, I have no problems anymore. And I have made further
> modifications to FLIP-389[1].
> In summary, this flip has 2 goals:
>
>- Annotate SingleThreadFetcherManager as PublicEvolving.
>- Shield FutureCompletingBlockingQueue from users and limit all
>operations on FutureCompletingBlockingQueue in SplitFetcherManager.
>
> All the changes are listed below:
>
>- Mark the constructors of SourceReaderBase and
>SingleThreadMultiplexSourceReaderBase as @Deprecated and provide a new
>constructor without FutureCompletingBlockingQueue.
>- Mark SplitFetcherManager and SingleThreadFetcherManager as
>`@PublicEvolving`, mark the constructors of SplitFetcherManager and
>SingleThreadFetcherManager as @Deprecated and provide a new constructor
>without FutureCompletingBlockingQueue.
>- SplitFetcherManager provides wrapper methods for
>FutureCompletingBlockingQueue to replace its usage in SourceReaderBase.
>- Mark SplitFetcher and SplitFetcherTask as PublicEvolving.
>
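A rough sketch of what the listed constructor and wrapper-method changes could look like. This is hypothetical simplified code, not the actual Flink classes; it only illustrates the idea of making the queue an internal detail.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: the element queue becomes an internal detail of
// SplitFetcherManager, exposed only through wrapper methods.
public class FetcherManagerSketch {

    static class SplitFetcherManager {
        // Created internally; never supplied by users anymore.
        private final Deque<String> elementQueue = new ArrayDeque<>();

        @Deprecated
        SplitFetcherManager(Deque<String> userSuppliedQueue) {
            // legacy path, kept only for compatibility
        }

        SplitFetcherManager() {
            // new constructor: no queue parameter
        }

        // Wrapper methods replace direct queue access in SourceReaderBase.
        void put(String element) { elementQueue.add(element); }

        String poll() { return elementQueue.poll(); }
    }

    public static void main(String[] args) {
        SplitFetcherManager manager = new SplitFetcherManager();
        manager.put("record-1");
        System.out.println(manager.poll());
    }
}
```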
>
> Any additional questions regarding this FLIP? Looking forward to hearing
> from you.
>
> Thanks,
> Hongshun Wang
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465498
>
>
>
>
> On Wed, Nov 22, 2023 at 10:15 AM Becket Qin  wrote:
>
>> Hi Hongshun,
>>
>> The constructor of the SplitFetcher is already package private. So it can
>> only be accessed from the classes in the package
>> org.apache.flink.connector.base.source.reader.fetcher. And apparently,
>> user
>> classes should not be in this package. Therefore, even if we mark the
>> SplitFetcher class as PublicEvolving, the constructor is not available to
>> the users. Only the public and protected methods are considered public API
>> in this case. Private / package private methods and fields are still
>> internal.
>>
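The visibility argument above can be illustrated with a minimal sketch. This is hypothetical code: in the real layout the two classes live in the same Flink package rather than the same file, but the effect on callers is the same.

```java
// Hypothetical sketch: a public class whose constructor is package private,
// so only same-package code (the manager) can instantiate it, while its
// public methods stay user-facing API.
public class VisibilitySketch {

    static class SplitFetcher {
        SplitFetcher() {}                        // package private: hidden from users

        public boolean isIdle() { return true; } // public method: part of the API
    }

    static class SplitFetcherManager {
        // Same package: allowed to call the hidden constructor.
        SplitFetcher createSplitFetcher() { return new SplitFetcher(); }
    }

    public static void main(String[] args) {
        SplitFetcher fetcher = new SplitFetcherManager().createSplitFetcher();
        System.out.println(fetcher.isIdle());
    }
}
```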
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Wed, Nov 22, 2023 at 9:46 AM Hongshun Wang 
>> wrote:
>>
>> > Hi Becket,
>> >
>> > If SplitFetcherManager becomes PublicEvolving, that also means
>> SplitFetcher
>> > > needs to be PublicEvolving, because it is returned by the protected
>> > method
>> > > SplitFetcherManager.createSplitFetcher().
>> >
>> >
>> >
>> > > it looks like there is no need to expose the constructor of
>> SplitFetcher
>> > > to the end users. Having an interface of SplitFetcher is also fine,
>> but
>> > > might not be necessary in this case.
>> >
>> >
>> >
>> > I don't know how to make SplitFetcher PublicEvolving without
>> exposing
>> > the constructor of SplitFetcher to the end users.
>> >
>> > Thanks,
>> > Hongshun Wang
>> >
>> > On Tue, Nov 21, 2023 at 7:23 PM Becket Qin 
>> wrote:
>> >
>> > > Hi Hongshun,
>> > >
>> > > Do we need to expose the constructor of SplitFetcher to the users?
>> > Ideally,
>> > > users should always get a new fetcher instance by calling
>> > > SplitFetcherManager.createSplitFetcher(). Or, they can get an existing
>> > > SplitFetcher by looking up in the SplitFetcherManager.fetchers map. I
>> > think
>> > > this makes sense because a SplitFetcher should always belong to a
>> > > SplitFetcherManager. Therefore, it should be created via a
>> > > SplitFetcherManager as well. So, it looks like there is no need to
>> expose
>> > > the constructor of SplitFetcher to the end users.
>> > >
>> > > Having an interface of SplitFetcher is also fine, but might not be
>> > > necessary in this case.
>> > >
>> > > Thanks,
>> > >
>> > > Jiangjie (Becket) Qin
>> > >
>> > > On Tue, Nov 21, 2023 at 10:36 AM Hongshun Wang <
>> loserwang1...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Becket,
>> > > >
>> > > > > Additionally, SplitFetcherTask requires
>> FutureCompletingBlockingQueue
>> > > as
>> > > > a constructor parameter, which is not allowed  now.
>> > > > Sorry, it was my writing mistake. What I meant is that
>> *SplitFetcher*
>> > > > requires FutureCompletingBlockingQueue as a constructor parameter.
>> > > > SplitFetcher
>> > > > is a class rather than an interface. Therefore, I want to change
>> > > > SplitFetcher to a public interface and move its implementation
>> > > > details to an implementing
>> > > > subclass.
>> > > >
>> > > > Thanks,
>> > > > Hongshun Wang
>> > > >
>> > > > On Fri, Nov 17, 2023 at 6:21 PM Becket Qin 
>> > wrote:
>> > > >
>> > > > > Hi Hongshun,
>> > > > >
>> > > > > SplitFetcher.enqueueTask() returns void, right? SplitFetcherTask
>> is
>> > > > already
>> > > > > an interface, and we need to make that as a PublicEvolving API as
>> > well.
>> > > > >
>> > > > > So overall, a source developer can potentially do a few things in
>> the
>> > > > > SplitFetcherManager.
>> > > > > 1. for customized logic including split-to-fetcher assignment,
>> > > threading
>> > > > 

Re: [ANNOUNCE] Experimental Java 21 support now available on master

2023-11-30 Thread Yun Tang
Hi Sergey,

I checked the CI [1], which was executed with Java 21, and noticed that the
StatefulJobSnapshotMigrationITCase-related tests have passed, which confirms what
I guessed before: most checkpoints/savepoints should be restored successfully.

I think we should introduce such snapshot migration tests, which restore
snapshots containing Scala code. I also created a ticket focused on Java 17 [2].


[1] 
https://dev.azure.com/snuyanzin/flink/_build/results?buildId=2620&view=logs&j=0a15d512-44ac-5ba5-97ab-13a5d066c22c&t=9a028d19-6c4b-5a4e-d378-03fca149d0b1
[2] https://issues.apache.org/jira/browse/FLINK-33707


Best
Yun Tang

From: Sergey Nuyanzin 
Sent: Thursday, November 30, 2023 14:41
To: dev@flink.apache.org 
Subject: Re: [ANNOUNCE] Experimental Java 21 support now available on master

Thanks Yun Tang

One question to clarify: since the scala version was also bumped for java
17, shouldn't there be a similar task for java 17?

On Thu, Nov 30, 2023 at 3:43 AM Yun Tang  wrote:

> Hi Sergey,
>
> You can leverage all tests extending SnapshotMigrationTestBase[1] to
> verify the logic. I believe all binary _metadata existing in the resources
> folder[2] were built by JDK8.
>
> I also created a ticket, FLINK-33699 [3], to track this.
>
> [1]
> https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/checkpointing/utils/SnapshotMigrationTestBase.java
> [2]
> https://github.com/apache/flink/tree/master/flink-tests/src/test/resources
> [3] https://issues.apache.org/jira/browse/FLINK-33699
>
> Best
> Yun Tang
> 
> From: Sergey Nuyanzin 
> Sent: Wednesday, November 29, 2023 22:56
> To: dev@flink.apache.org 
> Subject: Re: [ANNOUNCE] Experimental Java 21 support now available on
> master
>
> thanks for the response
>
>
> >I feel doubt about the conclusion that "don't try to load a savepoint from
> a Java 8/11/17 build due to bumping to scala-2.12.18", since the
> snapshotted state (operator/keyed state-backend),  and most key/value
> serializer snapshots are generated by pure-java code.
> >The only left part is that the developer uses scala UDF or scala types for
> key/value types. However, since all user-facing scala APIs have been
> deprecated, I don't think we have so many cases. Maybe we can give
> descriptions without such strong suggestions.
>
> That is the area where I feel I lack the knowledge to answer this
> precisely.
> My assumption was that statement about Java 21 regarding this should be
similar to Java 17, which is almost the same [1]
> Sorry for the inaccuracy
> Based on your statements I agree that the conclusion could be more relaxed.
>
> I'm curious whether there are some tests or anything which could clarify
> this?
>
> [1] https://lists.apache.org/thread/mz0m6wqjmqy8htob3w4469pjbg9305do
>
> On Wed, Nov 29, 2023 at 12:25 PM Yun Tang  wrote:
>
> > Thanks Sergey for the great work.
> >
> > I feel doubt about the conclusion that "don't try to load a savepoint
> from
> > a Java 8/11/17 build due to bumping to scala-2.12.18", since the
> > snapshotted state (operator/keyed state-backend),  and most key/value
> > serializer snapshots are generated by pure-java code. The only left part
> is
> > that the developer uses scala UDF or scala types for key/value types.
> > However, since all user-facing scala APIs have been deprecated [1], I
> don't
> > think we have so many cases. Maybe we can give descriptions without such
> > strong suggestions.
> >
> > Please correct me if I am wrong.
> >
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29740
> >
> > Best
> > Yun Tang
> >
> > 
> > From: Rui Fan <1996fan...@gmail.com>
> > Sent: Wednesday, November 29, 2023 16:43
> > To: dev@flink.apache.org 
> > Subject: Re: [ANNOUNCE] Experimental Java 21 support now available on
> > master
> >
> > Thanks Sergey for the great work!
> >
> > Best,
> > Rui
> >
> > On Wed, Nov 29, 2023 at 4:42 PM Leonard Xu  wrote:
> >
> > > Cool !
> > >
> > > Thanks Sergey for the great effort and all involved.
> > >
> > >
> > > Best,
> > > Leonard
> > >
> > > > 2023年11月29日 下午4:31,Swapnal Varma  写道:
> > > >
> > > > Congratulations Sergey, and everyone involved!
> > > >
> > > > Excited to work with and on this!
> > > >
> > > > Best,
> > > > Swapnal
> > > >
> > > >
> > > > On Wed, 29 Nov 2023, 13:58 Sergey Nuyanzin, 
> > wrote:
> > > >
> > > >> The master branch now builds and runs with Java 21 out-of-the-box.
> > > >>
> > > >> Notes:
> > > >> - a nightly cron build was set up.
> > > >> - In Java 21 builds, Scala is being bumped to 2.12.18
> > > >> which causes incompatibilities within Flink;
> > > >> i.e. don't try to load a savepoint from a Java 8/11/17 build
> > > >> - All the tests that are being skipped on Java 11/17
> > > >> are also skipped on Java 21.
> > > >>
> > > >> Huge shout-out to everyone participating
> > > >> in review of my Java 21 related PRs
> > > >>
> > > >> If you run into any issues, please rep

[jira] [Created] (FLINK-33707) Verify the snapshot migration on Java17

2023-11-30 Thread Yun Tang (Jira)
Yun Tang created FLINK-33707:


 Summary: Verify the snapshot migration on Java17
 Key: FLINK-33707
 URL: https://issues.apache.org/jira/browse/FLINK-33707
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Checkpointing
Reporter: Yun Tang


This task is like FLINK-33699. I think we could introduce a
StatefulJobSnapshotMigrationITCase-like test to restore snapshots containing 
scala code.





Re: Python connector question

2023-11-30 Thread Dian Fu
Hi Peter,

Thanks a lot for taking care of this. Appreciate it!

Re flink-sql-client-1.18.0.jar: The original motivation of including
it in the PyFlink package is that we want to make sure that Python
users get most things they need after pip installing PyFlink. So most
artifacts of a normal Flink distribution are included in the PyFlink
package. Actually there are several jars under the opt directory of a
Flink distribution, only flink-sql-client-1.18.0.jar is chosen because
it seems the most useful one among them for users. The reason
flink-sql-client-1.18.0.jar is not added to classpath is because this
JAR is usually used by the sql-client.sh which is located under the
bin directory. I'm fine to remove it (since this is not a must-to-have
dependency for Python users) if we think it's not necessary for most
users (users could manually download it if needed). In this case, we
may also need to remove some shell scripts under the bin directory,
e.g. sql-client.sh, etc since it may not work any more.

Re including connector jars under opt directory: Big +1 to this if
this is feasible. One problem I can think of is that currently the
connector repositories are externalized and depend on the Flink
repository and so when releasing Flink & PyFlink, the connectors are
still not released, and so it may not be possible to include them in the
PyFlink package.

Re marking lint-python.sh and install_command.sh as public, I'm +1 to
Gabor's point. The original scripts are not designed to be used by
external projects and so I guess we may need to refactor them a bit to
make them more general. @Experimental sounds reasonable to me.

Regards,
Dian

On Thu, Nov 30, 2023 at 12:27 AM Gabor Somogyi
 wrote:
>
> Hi Peter,
>
> Thanks for picking this up! Please find my thoughts inline.
>
> BR,
> G
>
>
> On Mon, Nov 27, 2023 at 4:05 PM Péter Váry 
> wrote:
>
> > Hi Team,
> >
> > During some mostly unrelated work, we came to realize that during the
> > externalization of the connectors, the Python modules and the related tests
> > were not moved to the respective connector repositories.
> > We created the jiras [1] to create the infra, and to move the python code
> > and the tests to the respective repos.
> >
> > When working on this code, I have found several oddities, which I would
> > like to hear the community's opinion on:
>
> - Does anyone know what the
> > site-packages/pyflink/opt/flink-sql-client-1.18.0.jar is supposed to do? We
> > specifically put it there [2], but then we ignore it when we check the
> > directory of jars [3]. If there is no objection, I would remove it.
> >
> +1 on removing that unless there are objections. As I see it, the jar is not
> included in the classpath, so it is not used.
>
> > - I would like to use the `opt` directory to contain optional jars created
> > by the connectors, like flink-sql-connector-kafka-3.1.x.jar.
> >
> +1 on that. The other options would be:
>  * plugins -> Kafka connector is not used as plugin w/ separate classloader
> so this solution is not preferred
>  * lib -> here we store our core jars which are part of the main Flink repo so
> this solution is not preferred
> All in all I'm fine to use opt.
>
> > Also, the lint-python.sh [4] and install_command.sh [5] scripts provide the base
> > of the testing infra. Would there be any objections to marking these as
> > public APIs for the connectors?
>
> I fully agree that we should avoid code duplication and re-use the
> existing code parts, but making them an API
> is not what I can imagine. The reason behind it is simple: that would just
> block development in the mid/long term in the Flink main repo.
>  * lint-python.sh is far from a stable version from my understanding + that
> way Flink will contain the superset of maybe 10+ connector needs
>  * install_command.sh contains generic and Flink specific parts. When we
> extract the Flink specific parts we can declare it logically API
> What I can imagine is to make the mentioned scripts more configurable and
> make them @Experimental.
>
> When a development team of a connector is not happy to use it as-is then a
> deep-copy and/or rewrite is still an option.
>
> > Thanks,
> > Peter
> >
> > [1] - https://issues.apache.org/jira/browse/FLINK-33528
> > [2] -
> >
> > https://github.com/apache/flink/blob/2da9a9639216b8c48850ee714065f090a80dcd65/flink-python/apache-flink-libraries/setup.py#L129-L130
> > [3] -
> >
> > https://github.com/apache/flink/blob/2da9a9639216b8c48850ee714065f090a80dcd65/flink-python/pyflink/pyflink_gateway_server.py#L183-L190
> > [4] -
> > https://github.com/apache/flink/blob/master/flink-python/dev/lint-python.sh
> > [5] -
> >
> > https://github.com/apache/flink/blob/master/flink-python/dev/install_command.sh
> >


[jira] [Created] (FLINK-33706) Build_wheels_on_macos fails on AZP

2023-11-30 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-33706:
---

 Summary: Build_wheels_on_macos fails on AZP
 Key: FLINK-33706
 URL: https://issues.apache.org/jira/browse/FLINK-33706
 Project: Flink
  Issue Type: Bug
  Components: API / Python, Test Infrastructure
Affects Versions: 1.19.0
Reporter: Sergey Nuyanzin


This build 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=55044&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=427
fails on AZP as 
{noformat}
   note: This error originates from a subprocess, and is likely not a 
problem with pip.
ERROR: Failed cleaning build dir for crcmod
Building wheel for dill (setup.py): started
Building wheel for dill (setup.py): finished with status 'done'
Created wheel for dill: filename=dill-0.0.0-py3-none-any.whl size=899 
sha256=39d0b4b66ce11f42313482f4ad825029e861fd6dab87a743a95d75a44a1fedd6
Stored in directory: 
/Users/runner/Library/Caches/pip/wheels/07/35/78/e9004fa30578734db7f10e7a211605f3f0778d2bdde38a239d
Building wheel for hdfs (setup.py): started
Building wheel for hdfs (setup.py): finished with status 'done'
Created wheel for hdfs: filename=UNKNOWN-0.0.0-py3-none-any.whl 
size=928 sha256=cb3fd7d8c71b52bbc27cfb75842f9d4d9c6f3b847f3f4fe50323c945a0e38ccc
Stored in directory: 
/Users/runner/Library/Caches/pip/wheels/68/dd/29/c1a590238f9ebbe4f7ee9b3583f5185d0b9577e23f05c990eb
WARNING: Built wheel for hdfs is invalid: Wheel has unexpected file 
name: expected 'hdfs', got 'UNKNOWN'
Building wheel for pymongo (pyproject.toml): started
Building wheel for pymongo (pyproject.toml): finished with status 'done'
Created wheel for pymongo: 
filename=pymongo-4.6.1-cp38-cp38-macosx_10_9_x86_64.whl size=478012 
sha256=5dfc6fdb6a8a399f8f9da44e28bae19be244b15c8000cd3b2d7d6ff513cc6277
Stored in directory: 
/Users/runner/Library/Caches/pip/wheels/54/d8/0e/2a61e90bb3872d903b15eb3c94cb70f438fb8792a28fee7bb1
Building wheel for docopt (setup.py): started
Building wheel for docopt (setup.py): finished with status 'done'
Created wheel for docopt: filename=UNKNOWN-0.0.0-py3-none-any.whl 
size=920 sha256=612c56cd1a6344b8def6c4d3c3c1c8bb10e1f2b0d978fee0fc8b9281026e8288
Stored in directory: 
/Users/runner/Library/Caches/pip/wheels/56/ea/58/ead137b087d9e326852a851351d1debf4ada529b6ac0ec4e8c
WARNING: Built wheel for docopt is invalid: Wheel has unexpected file 
name: expected 'docopt', got 'UNKNOWN'
  Successfully built dill pymongo
  Failed to build fastavro crcmod hdfs docopt
  ERROR: Could not build wheels for fastavro, which is required to install 
pyproject.toml-based projects
{noformat}





Re: [VOTE] Apache Flink Kafka Connectors v3.0.2, RC #1

2023-11-30 Thread Sergey Nuyanzin
+1(non-binding)

- Downloaded all the resources
- Verified signatures
- Validated hashsums
- Built from source code
- Checked Github release tag
- Reviewed the web PR

On Mon, Nov 27, 2023 at 9:05 AM Martijn Visser 
wrote:

> +1 (binding)
>
> - Validated hashes
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Build the source with Maven
> - Verified licenses
> - Verified web PRs
>
> On Mon, Nov 27, 2023 at 4:17 AM Leonard Xu  wrote:
> >
> > +1 (binding)
> >
> > - checked the flink-connector-base dependency scope has been changed to
> provided
> > - built from source code succeeded
> > - verified signatures
> > - verified hashsums
> > - checked the contents contains jar and pom files in apache repo
> > - checked Github release tag
> > - checked release notes
> > - reviewed the web PR
> >
> > Best,
> > Leonard
> >
> >
> > > 2023年11月26日 下午4:40,Jing Ge  写道:
> > >
> > > +1 (non-binding)
> > >
> > > - verified signature and hash
> > > - checked repo
> > > - checked tag, BTW, the tag link at [5] should be
> > >
> https://github.com/apache/flink-connector-kafka/releases/tag/v3.0.2-rc1
> > > - verified source archives do not contains any binaries
> > > - build source maven 3.8.6 and jdk11
> > > - verified web PR
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Sat, Nov 25, 2023 at 6:44 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > >> +1 (non-binding)
> > >>
> > >> - Validated checksum hash
> > >> - Verified signature
> > >> - Verified that no binaries exist in the source archive
> > >> - Build the source with Maven and jdk8
> > >> - Verified licenses
> > >> - Verified web PRs
> > >>
> > >> Best,
> > >> Rui
> > >>
> > >> On Sat, Nov 25, 2023 at 2:05 AM Tzu-Li (Gordon) Tai <
> tzuli...@apache.org>
> > >> wrote:
> > >>
> > >>> +1 (binding)
> > >>>
> > >>> - Verified signature and hashes
> > >>> - Verified mvn dependency:tree for a typical user job jar [1]. When
> using
> > >>> Flink 1.18.0, flink-connector-base is no longer getting bundled, and
> all
> > >>> Flink dependencies resolve as 1.18.0 / provided.
> > >>> - Submitting user job jar to local Flink 1.18.0 cluster works and job
> > >> runs
> > >>>
> > >>> note: If running in the IDE, the flink-connector-base dependency is
> > >>> explicitly required when using KafkaSource. Otherwise, if submitting
> an
> > >>> uber jar, the flink-connector-base dependency should not be bundled
> as
> > >>> it'll be provided by the Flink distribution and will already be on
> the
> > >>> classpath.
> > >>>
> > >>> [1] mvn dependency:tree output
> > >>> ```
> > >>> [INFO] com.tzulitai:testing-kafka:jar:1.0-SNAPSHOT
> > >>> [INFO] +- org.apache.flink:flink-streaming-java:jar:1.18.0:provided
> > >>> [INFO] |  +- org.apache.flink:flink-core:jar:1.18.0:provided
> > >>> [INFO] |  |  +-
> org.apache.flink:flink-annotations:jar:1.18.0:provided
> > >>> [INFO] |  |  +-
> org.apache.flink:flink-metrics-core:jar:1.18.0:provided
> > >>> [INFO] |  |  +-
> org.apache.flink:flink-shaded-asm-9:jar:9.5-17.0:provided
> > >>> [INFO] |  |  +-
> > >>> org.apache.flink:flink-shaded-jackson:jar:2.14.2-17.0:provided
> > >>> [INFO] |  |  +- org.apache.commons:commons-lang3:jar:3.12.0:provided
> > >>> [INFO] |  |  +- org.apache.commons:commons-text:jar:1.10.0:provided
> > >>> [INFO] |  |  +- com.esotericsoftware.kryo:kryo:jar:2.24.0:provided
> > >>> [INFO] |  |  |  +-
> com.esotericsoftware.minlog:minlog:jar:1.2:provided
> > >>> [INFO] |  |  |  \- org.objenesis:objenesis:jar:2.1:provided
> > >>> [INFO] |  |  +-
> > >> commons-collections:commons-collections:jar:3.2.2:provided
> > >>> [INFO] |  |  \- org.apache.commons:commons-compress:jar:1.21:provided
> > >>> [INFO] |  +-
> org.apache.flink:flink-file-sink-common:jar:1.18.0:provided
> > >>> [INFO] |  +- org.apache.flink:flink-runtime:jar:1.18.0:provided
> > >>> [INFO] |  |  +- org.apache.flink:flink-rpc-core:jar:1.18.0:provided
> > >>> [INFO] |  |  +-
> > >> org.apache.flink:flink-rpc-akka-loader:jar:1.18.0:provided
> > >>> [INFO] |  |  +-
> > >>>
> org.apache.flink:flink-queryable-state-client-java:jar:1.18.0:provided
> > >>> [INFO] |  |  +- org.apache.flink:flink-hadoop-fs:jar:1.18.0:provided
> > >>> [INFO] |  |  +- commons-io:commons-io:jar:2.11.0:provided
> > >>> [INFO] |  |  +-
> > >>> org.apache.flink:flink-shaded-netty:jar:4.1.91.Final-17.0:provided
> > >>> [INFO] |  |  +-
> > >>> org.apache.flink:flink-shaded-zookeeper-3:jar:3.7.1-17.0:provided
> > >>> [INFO] |  |  +- org.javassist:javassist:jar:3.24.0-GA:provided
> > >>> [INFO] |  |  +- org.xerial.snappy:snappy-java:jar:1.1.10.4:runtime
> > >>> [INFO] |  |  \- org.lz4:lz4-java:jar:1.8.0:runtime
> > >>> [INFO] |  +- org.apache.flink:flink-java:jar:1.18.0:provided
> > >>> [INFO] |  |  \- com.twitter:chill-java:jar:0.7.6:provided
> > >>> [INFO] |  +-
> > >> org.apache.flink:flink-shaded-guava:jar:31.1-jre-17.0:provided
> > >>> [INFO] |  +- org.apache.commons:commons-math3:jar:3.6.1:provided
> > >>> [INFO] |  +- org.slf4j:slf4j-api:jar:1.7.36:runtime
> > >>> 

Re: [VOTE] Release flink-shaded 18.0, release candidate #1

2023-11-30 Thread Sergey Nuyanzin
+1 (non-binding)

- Downloaded all the resources
- Validated checksum hash
- Build the source with Maven and jdk8
- Build Flink master with new flink-shaded and check that all the tests are
passing

one minor thing that I noticed during releasing: CI uses Maven 3.8.6,
while for the release profile there is an enforcer plugin that
checks that the Maven version is less than 3.3.
I created a Jira issue [1] for that.
I made the release with Maven 3.2.5 (I suppose the previous release was
also done with 3.2.5 because of the same issue).

[1] https://issues.apache.org/jira/browse/FLINK-33703

On Wed, Nov 29, 2023 at 11:41 AM Matthias Pohl 
wrote:

> +1 (binding)
>
> * Downloaded all resources
> * Extracted the sources and compiled them
> * Diffed the git tag checkout against the downloaded sources
> * Verified SHA512 checksums & GPG signatures
> * Checked that all POMs have the expected version
> * Generated diffs to compare pom file changes with NOTICE files: Nothing
> suspicious found except for a minor (non-blocking) typo [1]
>
> Thanks for driving this effort, Sergey. :)
>
> [1] https://github.com/apache/flink-shaded/pull/126/files#r1409080162
>
> On Wed, Nov 29, 2023 at 10:25 AM Rui Fan <1996fan...@gmail.com> wrote:
>
>> Sorry, it's non-binding.
>>
>> On Wed, Nov 29, 2023 at 5:19 PM Rui Fan <1996fan...@gmail.com> wrote:
>>
>> > Thanks Matthias for the clarification!
>> >
>> > After I import the latest KEYS, it works fine.
>> >
>> > +1(binding)
>> >
>> > - Validated checksum hash
>> > - Verified signature
>> > - Verified that no binaries exist in the source archive
>> > - Build the source with Maven and jdk8
>> > - Verified licenses
>> > - Verified web PRs, and left a comment
>> >
>> > Best,
>> > Rui
>> >
>> > On Wed, Nov 29, 2023 at 5:05 PM Matthias Pohl
>> >  wrote:
>> >
>> >> The key is the last key in the KEYS file. It's just having a different
>> >> format with spaces being added (due to different gpg versions?): F752
>> 9FAE
>> >> 2481 1A5C 0DF3  CA74 1596 BBF0 7268 35D8
>> >>
>> >> On Wed, Nov 29, 2023 at 9:41 AM Rui Fan <1996fan...@gmail.com> wrote:
>> >>
>> >> > Hey Sergey,
>> >> >
>> >> > Thank you for driving this release.
>> >> >
>> >> > I try to check this signature, the whole key is
>> >> > F7529FAE24811A5C0DF3CA741596BBF0726835D8,
>> >> > it matches your 1596BBF0726835D8, but I cannot
>> >> > find it from the Flink KEYS[1].
>> >> >
>> >> > Please correct me if my operation is wrong, thanks~
>> >> >
>> >> > [1] https://dist.apache.org/repos/dist/release/flink/KEYS
>> >> >
>> >> > Best,
>> >> > Rui
>> >> >
>> >> >
>> >> > On Wed, Nov 29, 2023 at 6:09 AM Sergey Nuyanzin > >
>> >> > wrote:
>> >> >
>> >> > > Hi everyone,
>> >> > > Please review and vote on the release candidate #1 for the version
>> >> 18.0,
>> >> > as
>> >> > > follows:
>> >> > > [ ] +1, Approve the release
>> >> > > [ ] -1, Do not approve the release (please provide specific
>> comments)
>> >> > >
>> >> > >
>> >> > > The complete staging area is available for your review, which
>> >> includes:
>> >> > > * JIRA release notes [1],
>> >> > > * the official Apache source release to be deployed to
>> >> dist.apache.org
>> >> > > [2],
>> >> > > which are signed with the key with fingerprint 1596BBF0726835D8
>> [3],
>> >> > > * all artifacts to be deployed to the Maven Central Repository [4],
>> >> > > * source code tag "release-18.0-rc1" [5],
>> >> > > * website pull request listing the new release [6].
>> >> > >
>> >> > > The vote will be open for at least 72 hours. It is adopted by
>> majority
>> >> > > approval, with at least 3 PMC affirmative votes.
>> >> > >
>> >> > > Thanks,
>> >> > > Sergey
>> >> > >
>> >> > > [1]
>> https://issues.apache.org/jira/projects/FLINK/versions/12353081
>> >> > > [2]
>> >> https://dist.apache.org/repos/dist/dev/flink/flink-shaded-18.0-rc1
>> >> > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>> >> > > [4]
>> >> > >
>> >>
>> https://repository.apache.org/content/repositories/orgapacheflink-1676/
>> >> > > [5]
>> >> https://github.com/apache/flink-shaded/releases/tag/release-18.0-rc1
>> >> > > [6] https://github.com/apache/flink-web/pull/701
>> >> > >
>> >> >
>> >>
>> >
>>
>

-- 
Best regards,
Sergey
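The manual checks the voters above describe (checksum, fingerprint, signature) can be sketched as shell commands. This is only an illustration: the artifact name below is a stand-in, the gpg/KEYS steps are shown commented because they need the real release files, and the fingerprint value is the one quoted in the thread (note the grouping spaces some gpg versions add, which Rui's comparison tripped over).

```shell
set -e

# 1. Checksum: recompute the SHA512 and compare against the published
#    .sha512 file (the file here is an illustrative stand-in, not the release).
printf 'demo' > flink-shaded-18.0-src.tgz
sha512sum flink-shaded-18.0-src.tgz > flink-shaded-18.0-src.tgz.sha512
sha512sum -c flink-shaded-18.0-src.tgz.sha512   # prints "flink-shaded-18.0-src.tgz: OK"

# 2. Fingerprint: different gpg versions print the same fingerprint with or
#    without grouping spaces; strip the spaces before comparing.
spaced='F752 9FAE 2481 1A5C 0DF3  CA74 1596 BBF0 7268 35D8'
normalized=$(printf '%s' "$spaced" | tr -d ' ')
echo "$normalized"   # F7529FAE24811A5C0DF3CA741596BBF0726835D8

# 3. Signature (reference only; requires the real KEYS and .asc files):
# curl -s https://dist.apache.org/repos/dist/release/flink/KEYS | gpg --import
# gpg --verify flink-shaded-18.0-src.tgz.asc flink-shaded-18.0-src.tgz
```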


Re: [DISCUSS] Release flink-connector-parent v1.0.1

2023-11-30 Thread Sergey Nuyanzin
Thanks for volunteering, Etienne.

+1 for releasing.
However, there is one more PR to enable custom JVM flags for connectors,
similar to how it is done for modules in the Flink main repo.
It will simplify Java 17 support a bit.

Could we have this in the coming release as well?



On Wed, Nov 29, 2023 at 11:40 AM Etienne Chauchot 
wrote:

> Hi all,
>
> I would like to discuss making a v1.0.1 release of flink-connector-parent.
>
> Since last release, there were only 2 changes:
>
> - https://github.com/apache/flink-connector-shared-utils/pull/19
> (spotless addition)
>
> - https://github.com/apache/flink-connector-shared-utils/pull/26
> (surefire configuration)
>
> The new release would bring the ability to skip some tests in the
> connectors, among them the archunit tests. It is important for connectors
> to be able to skip archunit tests when testing against a version of Flink
> that changes the archunit rules, because that would change the violation
> store. As there is only one violation store and a connector needs to be
> tested against the last 2 minor Flink versions, only the Flink version the
> connector was built against should run the archunit tests and have them
> reflected in the violation store.
>
>
> I volunteer to make the release. As it would be my first ASF release, I
> might require the guidance of one of the PMC members.
>
>
> Best
>
> Etienne
>
>
>
>
>

-- 
Best regards,
Sergey
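The skip logic described above can be illustrated generically. The exact property the parent pom exposes is not quoted in the thread, so the surefire test-name exclusion below is purely an assumption of how such a skip might look, not the actual flink-connector-parent configuration:

```shell
# Hypothetical sketch: run archunit tests only when the Flink version under
# test matches the version the connector was built against; otherwise exclude
# them via a standard surefire test-name pattern. All names here are
# illustrative assumptions.
FLINK_BUILT_AGAINST="1.18.0"
FLINK_UNDER_TEST="1.17.2"
if [ "$FLINK_UNDER_TEST" = "$FLINK_BUILT_AGAINST" ]; then
  MVN_ARGS=""                               # run everything, incl. archunit
else
  MVN_ARGS="-Dtest=!*ArchitectureTest*"     # skip archunit-style tests
fi
echo "mvn verify $MVN_ARGS"
```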


[jira] [Created] (FLINK-33705) Upgrade flink-shaded to 18.0

2023-11-30 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-33705:
---

 Summary: Upgrade flink-shaded to 18.0
 Key: FLINK-33705
 URL: https://issues.apache.org/jira/browse/FLINK-33705
 Project: Flink
  Issue Type: Technical Debt
  Components: BuildSystem / Shaded
Affects Versions: 1.19.0
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


Currently flink-shaded 18.0 is in the process of being released.

Once it is released, it would make sense to upgrade the dependency in Flink.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33704) Update GCS filesystems to latest available versions

2023-11-30 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-33704:
--

 Summary: Update GCS filesystems to latest available versions
 Key: FLINK-33704
 URL: https://issues.apache.org/jira/browse/FLINK-33704
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / FileSystem, FileSystems
Reporter: Martijn Visser
Assignee: Martijn Visser


Update GS SDK from 2.15.0 to 2.29.1 and GS Hadoop Connector from 2.2.15 to 
2.2.18



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33703) Use maven 3.8.6 for releasing of flink-shaded

2023-11-30 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-33703:
---

 Summary: Use maven 3.8.6 for releasing of flink-shaded
 Key: FLINK-33703
 URL: https://issues.apache.org/jira/browse/FLINK-33703
 Project: Flink
  Issue Type: Technical Debt
  Components: BuildSystem / Shaded
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


Currently there is a maven-enforcer-plugin configuration (for the release
profile only):
{noformat}
<requireMavenVersion>
  <version>(,3.3)</version>
</requireMavenVersion>
{noformat}
which seems to be outdated, since CI uses 3.8.6.

We should keep them in sync and use 3.8.6 for both.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-383: Support Job Recovery for Batch Jobs

2023-11-30 Thread Lijie Wang
Hi Guowei,

Thanks for your feedback.

>> As far as I know, there are multiple job managers on standby in some
scenarios. In this case, is your design still effective?
I think it's still effective. There will only be one leader. After becoming
the leader, the startup process of the JobMaster is the same as when only one
jobmanager restarts, so I think the current process should also be applicable
to the multi-jobmanager situation. We will also do some tests to cover this
case.

>> How do you rule out that there might still be some states in the memory
of the original operator coordinator?
The current restore process is the same as for streaming jobs restoring from
a checkpoint after failover (it calls the same methods), which is widely used
in production, so I think there is no problem.

>> Additionally, using NO_CHECKPOINT seems a bit odd. Why not use a normal
checkpoint ID greater than 0 and record it in the event store?
We use -1 (NO_CHECKPOINT) to distinguish it from a normal checkpoint; -1
indicates that this is a snapshot for the no-checkpoint/batch scenarios.

Besides, considering that currently some operator coordinators may not
support taking snapshots in the no-checkpoint/batch scenarios (or don't
support passing -1 as a checkpoint id), we think it is better to let the
developer explicitly specify whether it supports snapshots in the batch
scenario. Therefore, we intend to introduce the "SupportsBatchSnapshot"
interface for the split enumerator and the "supportsBatchSnapshot" method
for the operator coordinator. You can find more details in the FLIP's
"Introduce SupportsBatchSnapshot interface" and "JobEvent" sections.

Looking forward to your further feedback.

Best,
Lijie

Guowei Ma  于2023年11月19日周日 10:47写道:

> Hi,
>
>
> This is a very good proposal, as far as I know, it can solve some very
> critical production operations in certain scenarios. I have two minor
> issues:
>
> As far as I know, there are multiple job managers on standby in some
> scenarios. In this case, is your design still effective? I'm unsure if you
> have conducted any tests. For instance, standby job managers might take
> over these failed jobs more quickly.
> Regarding the part about the operator coordinator, how can you ensure that
> the checkpoint mechanism can restore the state of the operator coordinator:
> For example:
> How do you rule out that there might still be some states in the memory of
> the original operator coordinator? After all, the implementation was done
> under the assumption of scenarios where the job manager doesn't fail.
> Additionally, using NO_CHECKPOINT seems a bit odd. Why not use a normal
> checkpoint ID greater than 0 and record it in the event store?
> If the issues raised in point 2 cannot be resolved in the short term, would
> it be possible to consider not supporting failover with a source job
> manager?
>
> Best,
> Guowei
>
>
> On Thu, Nov 2, 2023 at 6:01 PM Lijie Wang 
> wrote:
>
> > Hi devs,
> >
> > Zhu Zhu and I would like to start a discussion about FLIP-383: Support
> Job
> > Recovery for Batch Jobs[1]
> >
> > Currently, when Flink’s job manager crashes or gets killed, possibly due
> to
> > unexpected errors or planned nodes decommission, it will cause the
> > following two situations:
> > 1. Failed, if the job does not enable HA.
> > 2. Restart, if the job enable HA. If it’s a streaming job, the job will
> be
> > resumed from the last successful checkpoint. If it’s a batch job, it has
> to
> > run from beginning, all previous progress will be lost.
> >
> > In view of this, we think the JM crash may cause great regression for
> batch
> > jobs, especially long running batch jobs. This FLIP is mainly to solve
> this
> > problem so that batch jobs can recover most job progress after JM
> crashes.
> > In this FLIP, our goal is to let most finished tasks not need to be
> re-run.
> >
> > You can find more details in the FLIP-383[1]. Looking forward to your
> > feedback.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-383%3A+Support+Job+Recovery+for+Batch+Jobs
> >
> > Best,
> > Lijie
> >
>


Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-11-30 Thread Leonard Xu
+1(binding)

Btw, @Etienne, IIRC, your vote should be a binding one.


Best,
Leonard

> 2023年11月30日 下午5:03,Etienne Chauchot  写道:
> 
> +1 (non-binding)
> 
> Etienne
> 
> Le 30/11/2023 à 09:13, Rui Fan a écrit :
>> +1(binding)
>> 
>> Best,
>> Rui
>> 
>> On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang  wrote:
>> 
>>> +1 (binding)
>>> 
>>> Best,
>>> Lijie
>>> 
>>> Zhu Zhu  于2023年11月30日周四 13:13写道:
>>> 
 +1
 
 Thanks,
 Zhu
 
 Xia Sun  于2023年11月30日周四 11:41写道:
 
> Hi everyone,
> 
> I'd like to start a vote on FLIP-379: Dynamic source parallelism
 inference
> for batch jobs[1] which has been discussed in this thread [2].
> 
> The vote will be open for at least 72 hours unless there is an
>>> objection
 or
> not enough votes.
> 
> 
> [1]
> 
> 
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
> [2]https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8
> 
> 
> Best Regards,
> Xia



[RESULT][VOTE] Release flink-connector-aws, v4.2.0 release candidate #1

2023-11-30 Thread Danny Cranmer
I'm happy to announce that we have unanimously approved this release.

There are 7 approving votes, 3 of which are binding:
* Mystic Lama
* Danny Cranmer (binding)
* Jiabao Sun
* Martijn Visser (binding)
* Ahmed Hamdy
* Leonard Xu (binding)
* Samrat Deb

There are no disapproving votes.

Thanks everyone!
Danny


Re: [VOTE] Release flink-connector-aws, v4.2.0 release candidate #1

2023-11-30 Thread Danny Cranmer
Thanks everyone for the support. This vote is now closed, I will
announce the results in a separate email.

Danny.

On Thu, Nov 30, 2023 at 5:20 AM Samrat Deb  wrote:

> Hi danny ,
>
> +1 (non binding)
>
> Release notes look good
> - Signatures and checksums verified
> - pom versions checked
> - Source build and tests verified
>
> Bests,
> Samrat
>
> On Wed, 29 Nov 2023 at 12:11 PM, Leonard Xu  wrote:
>
> > Thanks Danny for driving this, sorry for late verification.
> >
> > +1 (binding)
> >
> > - built from source code succeeded
> > - verified signatures
> > - verified hashsums
> > - checked that the contents contain jar and pom files in the apache repo
> > - checked Github release tag
> > - reviewed the web PR with minor comment
> >
> > Best,
> > Leonard
> >
> >
> >
> > > 2023年11月27日 上午1:13,Ahmed Hamdy  写道:
> > >
> > > Hi Danny
> > > +1 (non-binding)
> > >
> > >
> > > - Verified signatures and checksums
> > > - verified no binaries exists in archive
> > > - built source
> > > - reviewed web PR
> > > - Ran E2E example with Kinesis, Firehose & DynamoDB datastream
> > connectors.
> > >
> > > Best Regards
> > > Ahmed Hamdy
> > >
> > >
> > > On Fri, 24 Nov 2023 at 08:38, Martijn Visser  >
> > > wrote:
> > >
> > >> Hi Danny,
> > >>
> > >> Thanks for driving this.
> > >>
> > >> +1 (binding)
> > >>
> > >> - Validated hashes
> > >> - Verified signature
> > >> - Verified that no binaries exist in the source archive
> > >> - Build the source with Maven
> > >> - Verified licenses
> > >> - Verified web PRs
> > >>
> > >> On Mon, Nov 20, 2023 at 12:29 PM Jiabao Sun
> > >>  wrote:
> > >>>
> > >>> Thanks Danny for driving the release,
> > >>>
> > >>> +1 (non-binding)
> > >>>
> > >>> - built from source code succeeded
> > >>> - verified signatures
> > >>> - verified hashsums
> > >>> - checked release notes
> > >>>
> > >>> Best,
> > >>> Jiabao
> > >>>
> > >>>
> >  2023年11月20日 19:11,Danny Cranmer  写道:
> > 
> >  Hello all,
> > 
> >  +1 (binding).
> > 
> >  - Release notes look good
> >  - Signatures and checksums match
> >  - There are no binaries in the source archive
> >  - pom versions are correct
> >  - Tag is present in Github
> >  - CI passes against FLink 1.17 and 1.18
> >  - Source build and tests pass
> > 
> >  Thanks,
> >  Danny
> > 
> >  On Wed, Nov 1, 2023 at 1:15 AM mystic lama  >
> > >> wrote:
> > 
> > > +1 (non-binding)
> > >
> > > - validated shasum
> > > - verified build
> > >  - Java 8   - build good and all test cases pass
> > >  - Java 11 - build good and all test cases pass
> > >
> > > Observations: got test failures with Java 17, something to look for
> > in
> > > future
> > >
> > > On Tue, 31 Oct 2023 at 08:42, Danny Cranmer <
> dannycran...@apache.org
> > >
> > > wrote:
> > >
> > >> Hi everyone,
> > >>
> > >> Please review and vote on release candidate #1 for the version
> > >> 4.2.0, as
> > >> follows:
> > >> [ ] +1, Approve the release
> > >> [ ] -1, Do not approve the release (please provide specific
> > comments)
> > >>
> > >> The complete staging area is available for your review, which
> > >> includes:
> > >> * JIRA release notes [1],
> > >> * the official Apache source release to be deployed to
> > >> dist.apache.org
> > >> [2],
> > >> which are signed with the key with fingerprint
> > >> 0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
> > >> * all artifacts to be deployed to the Maven Central Repository
> [4],
> > >> * source code tag v4.2.0-rc1 [5],
> > >> * website pull request listing the new release [6].
> > >> * A link to the CI run on the release tag [7]
> > >>
> > >> The vote will be open for at least 72 hours. It is adopted by
> > >> majority
> > >> approval, with at least 3 PMC affirmative votes.
> > >>
> > >> Thanks,
> > >> Danny
> > >>
> > >> [1]
> > >>
> > >>
> > >
> > >>
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353011
> > >> [2]
> > >>
> > >
> > >>
> >
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-aws-4.2.0-rc1
> > >> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > >> [4]
> > >>
> > >>
> https://repository.apache.org/content/repositories/orgapacheflink-1665/
> > >> [5]
> > >
> > https://github.com/apache/flink-connector-aws/releases/tag/v4.2.0-rc1
> > >> [6] https://github.com/apache/flink-web/pull/693
> > >> [7]
> > >
> > https://github.com/apache/flink-connector-aws/actions/runs/6707962074
> > >>
> > >
> > >>>
> > >>
> >
> >
>


Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-11-30 Thread Etienne Chauchot

+1 (non-binding)

Etienne

Le 30/11/2023 à 09:13, Rui Fan a écrit :

+1(binding)

Best,
Rui

On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang  wrote:


+1 (binding)

Best,
Lijie

Zhu Zhu  于2023年11月30日周四 13:13写道:


+1

Thanks,
Zhu

Xia Sun  于2023年11月30日周四 11:41写道:


Hi everyone,

I'd like to start a vote on FLIP-379: Dynamic source parallelism

inference

for batch jobs[1] which has been discussed in this thread [2].

The vote will be open for at least 72 hours unless there is an

objection

or

not enough votes.


[1]



https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs

[2]https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8


Best Regards,
Xia


[jira] [Created] (FLINK-33702) Add IncrementalDelayRetryStrategy in AsyncRetryStrategies

2023-11-30 Thread xiangyu feng (Jira)
xiangyu feng created FLINK-33702:


 Summary: Add IncrementalDelayRetryStrategy in AsyncRetryStrategies 
 Key: FLINK-33702
 URL: https://issues.apache.org/jira/browse/FLINK-33702
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: xiangyu feng


AsyncRetryStrategies currently supports NoRetryStrategy, FixedDelayRetryStrategy
and ExponentialBackoffDelayRetryStrategy. In certain scenarios, we also need an
IncrementalDelayRetryStrategy to reduce the retry count and perform the action
more promptly.

IncrementalDelayRetryStrategy increases the retry delay by a fixed increment on
each attempt.
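As a sketch of the proposed behaviour (the formula below is inferred from the description above, not taken from Flink code), the delay grows linearly with the attempt number, sitting between a fixed delay and exponential backoff:

```shell
# Inferred sketch, not the actual Flink implementation: an incremental-delay
# strategy adds a fixed increment to the backoff on every retry attempt.
initial_ms=100
increment_ms=50
for attempt in 1 2 3 4; do
  delay=$(( initial_ms + (attempt - 1) * increment_ms ))
  echo "attempt ${attempt}: wait ${delay} ms"   # 100, 150, 200, 250 ms
done
```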



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33701) restart.time-tracking doc is wrong

2023-11-30 Thread Rui Fan (Jira)
Rui Fan created FLINK-33701:
---

 Summary: restart.time-tracking doc is wrong
 Key: FLINK-33701
 URL: https://issues.apache.org/jira/browse/FLINK-33701
 Project: Flink
  Issue Type: Technical Debt
  Components: Autoscaler
Affects Versions: kubernetes-operator-1.8.0
Reporter: Rui Fan
Assignee: Rui Fan
 Fix For: kubernetes-operator-1.8.0
 Attachments: image-2023-11-30-16-27-06-149.png

The {{restart.time-tracking.limit}} acts as the upper bound instead of
{{restart.time}}.

!image-2023-11-30-16-27-06-149.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-364: Improve the restart-strategy

2023-11-30 Thread Rui Fan
+1(binding)

Best,
Rui

On Mon, Nov 13, 2023 at 11:01 AM Rui Fan <1996fan...@gmail.com> wrote:

> Hi everyone,
>
> Thank you to everyone for the feedback on FLIP-364: Improve the
> restart-strategy[1]
> which has been discussed in this thread [2].
>
> I would like to start a vote for it. The vote will be open for at least 72
> hours unless there is an objection or not enough votes.
>
> [1] https://cwiki.apache.org/confluence/x/uJqzDw
> [2] https://lists.apache.org/thread/5cgrft73kgkzkgjozf9zfk0w2oj7rjym
>
> Best,
> Rui
>


Re: [VOTE] FLIP-379: Dynamic source parallelism inference for batch jobs

2023-11-30 Thread Rui Fan
+1(binding)

Best,
Rui

On Thu, Nov 30, 2023 at 3:56 PM Lijie Wang  wrote:

> +1 (binding)
>
> Best,
> Lijie
>
> Zhu Zhu  于2023年11月30日周四 13:13写道:
>
> > +1
> >
> > Thanks,
> > Zhu
> >
> > Xia Sun  于2023年11月30日周四 11:41写道:
> >
> > > Hi everyone,
> > >
> > > I'd like to start a vote on FLIP-379: Dynamic source parallelism
> > inference
> > > for batch jobs[1] which has been discussed in this thread [2].
> > >
> > > The vote will be open for at least 72 hours unless there is an
> objection
> > or
> > > not enough votes.
> > >
> > >
> > > [1]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs
> > > [2] https://lists.apache.org/thread/ocftkqy5d2x4n58wzprgm5qqrzzkbmb8
> > >
> > >
> > > Best Regards,
> > > Xia
> > >
> >
>