Re: [DISCUSS] Releasing Flink ML 2.2.0

2023-04-03 Thread Dong Lin
Thanks everyone for the comments!

We will go ahead and release Flink ML 2.2.0. Please see here for the
release plan.

Best Regards,
Dong


On Fri, Mar 31, 2023 at 6:50 PM Yu Li  wrote:

> +1. Great to know the (exciting) progress and thanks for the efforts!
>
> Best Regards,
> Yu
>
>
> On Fri, 31 Mar 2023 at 14:39, Fan Hong  wrote:
>
>> Hi Dong and Zhipeng,
>>
>> Thanks for starting the discussion. Glad to see a new release of Flink ML.
>>
>> Cheers!
>>
>> On Fri, Mar 31, 2023 at 2:34 PM Zhipeng Zhang 
>> wrote:
>>
>> > Hi Dong,
>> >
>> > Thanks for starting the discussion. +1 for the Flink ML 2.2.0 release.
>> >
>>
>


[jira] [Created] (FLINK-31718) pulsar connector v3.0 branch's CI is not working properly

2023-04-03 Thread Weijie Guo (Jira)
Weijie Guo created FLINK-31718:
--

 Summary: pulsar connector v3.0 branch's CI is not working properly
 Key: FLINK-31718
 URL: https://issues.apache.org/jira/browse/FLINK-31718
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Pulsar
Affects Versions: pulsar-3.0.1
Reporter: Weijie Guo
Assignee: Weijie Guo


After FLINK-30963, we no longer manually set {{flink_url}}, but it is still 
required by the pulsar connector's own {{ci.yml}}, which causes CI to fail to 
run. The root cause is that the v3.0 branch does not use the shared {{ci.yml}} 
from {{flink-connector-shared-utils}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31717) Unit tests running with local kube config

2023-04-03 Thread Matyas Orhidi (Jira)
Matyas Orhidi created FLINK-31717:
-

 Summary: Unit tests running with local kube config
 Key: FLINK-31717
 URL: https://issues.apache.org/jira/browse/FLINK-31717
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator
Reporter: Matyas Orhidi


Some unit tests are using the local kube environment. This can be dangerous 
when it points to sensitive clusters, e.g. in prod.

{{2023-04-03 12:32:53,956 i.f.k.c.Config [DEBUG] Found for 
Kubernetes config at: [/Users//.kube/config].
}}
A misconfigured kube config environment revealed the issue:

{{[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.012 
s <<< FAILURE! - in org.apache.flink.kubernetes.operator.FlinkOperatorTest
[ERROR] 
org.apache.flink.kubernetes.operator.FlinkOperatorTest.testConfigurationPassedToJOSDK
  Time elapsed: 0.008 s  <<< ERROR!
java.lang.NullPointerException
at 
org.apache.flink.kubernetes.operator.FlinkOperatorTest.testConfigurationPassedToJOSDK(FlinkOperatorTest.java:63)

[ERROR] 
org.apache.flink.kubernetes.operator.FlinkOperatorTest.testLeaderElectionConfig 
 Time elapsed: 0.004 s  <<< ERROR!
java.lang.NullPointerException
at 
org.apache.flink.kubernetes.operator.FlinkOperatorTest.testLeaderElectionConfig(FlinkOperatorTest.java:108)
}}

move ~/.kube/config
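
A hedged sketch (not the actual operator test code) of one way to keep such 
tests from picking up the local kube config: disable the fabric8 client's 
kube-config and service-account auto-configuration for the test class. The 
{{kubernetes.auth.*}} property names are an assumption based on the fabric8 
Config class.

{code:java}
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;

// Base class for tests that must never talk to a real cluster.
abstract class NoLocalKubeConfigTestBase {

    @BeforeAll
    static void disableLocalKubeConfig() {
        // Stop the fabric8 Config auto-configuration from reading ~/.kube/config
        // or in-cluster service account credentials (assumed property names).
        System.setProperty("kubernetes.auth.tryKubeConfig", "false");
        System.setProperty("kubernetes.auth.tryServiceAccount", "false");
    }

    @AfterAll
    static void restoreProperties() {
        System.clearProperty("kubernetes.auth.tryKubeConfig");
        System.clearProperty("kubernetes.auth.tryServiceAccount");
    }
}
{code}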



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


RE: [VOTE] Release flink-connector-aws, 4.1.0 for Flink 1.17

2023-04-03 Thread Elphas Tori
+1 (non-binding)

+ verified hashes and signatures
+ checked local build of website pull request and approved

On 2023/04/03 16:19:00 Danny Cranmer wrote:
> Hi everyone,
> Please review and vote on the binaries for flink-connector-aws version
> 4.1.0-1.17, as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
> 
> The v4.1.0 source release has already been approved [1], this vote is to
> distribute the binaries for Flink 1.17 support.
> 
> The complete staging area is available for your review, which includes:
> * all artifacts to be deployed to the Maven Central Repository [2], which
> are signed with the key with fingerprint
> 0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
> * website pull request listing the new release [4].
> 
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
> 
> Thanks,
> Danny
> 
> [1] https://lists.apache.org/thread/7q3ysg9jz5cjwdgdmgckbnqhxh44ncmv
> [2] https://repository.apache.org/content/repositories/orgapacheflink-1602/
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4] https://github.com/apache/flink-web/pull/628
> 


[jira] [Created] (FLINK-31716) Event UID field is missing the first time that an event is consumed

2023-04-03 Thread Rodrigo Meneses (Jira)
Rodrigo Meneses created FLINK-31716:
---

 Summary: Event UID field is missing the first time that an event 
is consumed
 Key: FLINK-31716
 URL: https://issues.apache.org/jira/browse/FLINK-31716
 Project: Flink
  Issue Type: Bug
  Components: Kubernetes Operator
Reporter: Rodrigo Meneses


In `EventUtils.createOrUpdateEvent` we use a `Consumer` instance to 
`accept` the underlying event that is being created or updated.

The first time an event is created, we call 
`client.resource(event).createOrReplace()` but discard the return value of 
that method and instead return the `event` we just built locally, which has an 
empty UID field.
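
A minimal sketch of the suggested fix (assuming the core v1 Event model and 
the fabric8 client API, not the actual operator code): return the resource 
handed back by the API server, since it carries the server-assigned metadata, 
including the UID.

{code:java}
import io.fabric8.kubernetes.api.model.Event;
import io.fabric8.kubernetes.client.KubernetesClient;

import java.util.function.Consumer;

public final class EventUidSketch {

    public static Event createOrUpdate(
            KubernetesClient client, Event event, Consumer<Event> consumer) {
        // createOrReplace() returns the stored resource with metadata.uid populated.
        Event stored = client.resource(event).createOrReplace();
        consumer.accept(stored);
        return stored;
    }
}
{code}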



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


TaskManager OutOfMemoryError. Metaspace out of memory with python udf

2023-04-03 Thread tom yang
Hi Flink community,

I am running a session cluster with 1gb of JVM metaspace. Each time I
submit and cancel a Flink job with a Python UDF, I notice that the
metaspace gradually increases until it eventually kills the task
manager with an out-of-memory error.

To reproduce this error locally I installed Flink 1.16.1 and PyFlink
1.16.1 with Python 3.9, using the word count example here:
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/python/table_api_tutorial/

I would submit this python job to my local cluster via
./flink-1.16.1/bin/flink run -pyexec /opt/homebrew/bin/python3.9
--python wordcount.py

and then cancel the running job. Over time I can see from the Flink UI
that the metaspace is gradually increasing, until the task manager crashes
with the following exception:

2023-03-29 10:17:19,270 ERROR
org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - Fatal
error occurred while executing the TaskManager. Shutting it down...
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory
error has occurred. This can mean two things: either the job requires
a larger size of JVM metaspace to load classes or there is a class
loading leak. In the first case
'taskmanager.memory.jvm-metaspace.size' configuration option should be
increased. If the error persists (usually in cluster after several job
(re-)submissions) then there is probably a class loading leak in user
code or some of its dependencies which has to be investigated and
fixed. The task executor has to be shutdown...
at java.lang.ClassLoader.defineClass1(Native Method) ~[?:?]
at java.lang.ClassLoader.defineClass(ClassLoader.java:1017) ~[?:?]
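
As a stop-gap I can raise the limit named in the error in flink-conf.yaml
(the value below is only an example), but with a leak that only delays the
crash:

taskmanager.memory.jvm-metaspace.size: 512m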

I noticed that a similar issue was mentioned in
https://issues.apache.org/jira/browse/FLINK-15338 due to a leaky class
loader but was fixed in version 1.10.


Has anyone else encountered similar issues?

Thanks,
Tom


[jira] [Created] (FLINK-31715) Warning - 'An illegal reflective access operation has occurred'

2023-04-03 Thread Feroze Daud (Jira)
Feroze Daud created FLINK-31715:
---

 Summary: Warning - 'An illegal reflective access operation has 
occurred'
 Key: FLINK-31715
 URL: https://issues.apache.org/jira/browse/FLINK-31715
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.14.4
Reporter: Feroze Daud


I am seeing the following warnings when my app starts up.

 
{noformat}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner 
(file:/Users/ferozed/.gradle/caches/modules-2/files-2.1/org.apache.flink/flink-core/1.14.4/1c397865a94743deb286c658384fae954a381df/flink-core-1.14.4.jar)
 to field java.lang.String.value
WARNING: Please consider reporting this to the maintainers of 
org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal 
reflective access operations
WARNING: All illegal access operations will be denied in a future release
09:51:49.626 [main] INFO  
com.zillow.clickstream.preprocess.utils.SchemaRegistryHelper  - Schema id = 9636
An illegal reflective access operation has occurredIllegal reflective access by 
org.apache.flink.api.java.ClosureCleaner 
(file:/Users/ferozed/.gradle/caches/modules-2/files-2.1/org.apache.flink/flink-core/1.14.4/1c397865a94743deb286c658384fae954a381df/flink-core-1.14.4.jar)
 to field java.lang.String.valuePlease consider reporting this to the 
maintainers of org.apache.flink.api.java.ClosureCleanerUse 
--illegal-access=warn to enable warnings of further illegal reflective access 
operationsAll illegal access operations will be denied in a future release
 {noformat}
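
A possible workaround (an assumption on my side, not verified against this 
job) is to explicitly open the {{java.lang}} package to unnamed modules on 
JDK 9+, which typically silences this warning:

{noformat}
--add-opens java.base/java.lang=ALL-UNNAMED
{noformat}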



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Status of Statefun Project

2023-04-03 Thread Galen Warren
Thanks for bringing this up.

I'm currently using Statefun, and I've made a few small code contributions
over time. All of my PRs have been merged into master and most have been
released, but a few haven't been part of a release yet. Most recently, I
helped upgrade Statefun to be compatible with Flink 1.15.2, which was
merged last October but hasn't been released. (And, of course, there have
been more Flink releases since then.)

IMO, the main thing driving the need for ongoing Statefun releases -- even
in the absence of any new feature development -- is that there is typically
a bit of work to do to make Statefun compatible with each new Flink
release. This usually involves updating dependency versions and sometimes
some simple code changes, a common example being adapting to changes in
Flink config parameters that have changed from, say, delimited strings to
arrays.

I'd be happy to continue to make the necessary changes to Statefun to be
compatible with each new Flink release, but I don't have the committer
rights that would allow me to release the code.





On Mon, Apr 3, 2023 at 5:02 AM Martijn Visser 
wrote:

> Hi everyone,
>
> I want to open a discussion on the status of the Statefun Project [1] in
> Apache Flink. As you might have noticed, there hasn't been much development
> over the past months in the Statefun repository [2]. There is currently a
> lack of active contributors and committers who are able to help with the
> maintenance of the project.
>
> In order to improve the situation, we need to solve the lack of committers
> and the lack of contributors.
>
> On the lack of committers:
>
> 1. Ideally, there are some of the current Flink committers who have the
> bandwidth and can help with reviewing PRs and merging them.
> 2. If that's not an option, it could be a consideration that current
> committers only approve and review PRs, that are approved by those who are
> willing to contribute to Statefun and if the CI passes
>
> On the lack of contributors:
>
> 3. Next to having this discussion on the Dev and User mailing list, we can
> also create a blog with a call for new contributors on the Flink project
> website, send out some tweets on the Flink / Statefun twitter accounts,
> post messages on Slack etc. In that message, we would inform how those that
> are interested in contributing can start and where they could reach out for
> more information.
>
> There's also option 4. where a group of interested people would split
> Statefun from the Flink project and make it a separate top level project
> under the Apache Flink umbrella (similar as recently has happened with
> Flink Table Store, which has become Apache Paimon).
>
> If we see no improvements in the coming period, we should consider
> sunsetting Statefun and communicate that clearly to the users.
>
> I'm looking forward to your thoughts.
>
> Best regards,
>
> Martijn
>
> [1] https://nightlies.apache.org/flink/flink-statefun-docs-master/
> [2] https://github.com/apache/flink-statefun
>


[VOTE] Release flink-connector-aws, 4.1.0 for Flink 1.17

2023-04-03 Thread Danny Cranmer
Hi everyone,
Please review and vote on the binaries for flink-connector-aws version
4.1.0-1.17, as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The v4.1.0 source release has already been approved [1], this vote is to
distribute the binaries for Flink 1.17 support.

The complete staging area is available for your review, which includes:
* all artifacts to be deployed to the Maven Central Repository [2], which
are signed with the key with fingerprint
0F79F2AFB2351BC29678544591F9C1EC125FD8DB [3],
* website pull request listing the new release [4].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Danny

[1] https://lists.apache.org/thread/7q3ysg9jz5cjwdgdmgckbnqhxh44ncmv
[2] https://repository.apache.org/content/repositories/orgapacheflink-1602/
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://github.com/apache/flink-web/pull/628


Re: [DISCUSS] Releasing connectors for Flink 1.17

2023-04-03 Thread Danny Cranmer
Hi everyone,

I started the release process for flink-connector-aws and
flink-connector-pulsar. Unfortunately the Pulsar connector does not build
against Flink 1.17 at the v3.0.0 tag, errors below. The nightly is passing
since the code has been updated to support 1.17.
The PulsarFetcherManagerBase [1] was using a @VisibleForTesting constructor
of SplitFetcherManager which has been modified in Flink 1.17 [2]. This
means we will need a new source release for flink-connector-pulsar to
support Flink 1.17.

For now I will follow up with the release of flink-connector-aws.

Thanks,
Danny

[1]
https://github.com/apache/flink-connector-pulsar/blob/v3.0.0/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/reader/fetcher/PulsarFetcherManagerBase.java#L61
[2]
https://github.com/apache/flink/pull/20485/files#diff-2b83c936a0cbf7c013c8cf9316966efefe72494e0a658d1cefb5ce1af13b7db8

On Mon, Apr 3, 2023 at 12:01 PM Hang Ruan  wrote:

> Hi Denny,
>
> I could help with JDBC connectors. Thanks~
>
> Best,
> Hang
>
> Danny Cranmer  于2023年4月3日周一 17:57写道:
>
> > Hi everyone,
> >
> > I have created Jiras to investigate each failing build. Thanks Andriy, I
> > have assigned the Elasticsearch/Opensearch ones to you. Let me know if
> you
> > do not have capacity.
> >
> > - Elasticsearch: https://issues.apache.org/jira/browse/FLINK-31696
> > - Opensearch: https://issues.apache.org/jira/browse/FLINK-31697
> > - RabbitMQ: https://issues.apache.org/jira/browse/FLINK-31701
> > - JDBC: https://issues.apache.org/jira/browse/FLINK-31699
> > - Cassandra: https://issues.apache.org/jira/browse/FLINK-31698
> > - MongoDB: https://issues.apache.org/jira/browse/FLINK-31700
> >
> > Thanks,
> > Danny
> >
> >
> > On Fri, Mar 31, 2023 at 5:59 PM Andrey Redko  wrote:
> >
> > > Hi Denny,
> > >
> > > I could give you a hand with Elasticsearch / Opensearch connectors,
> > please
> > > let me know if help is needed. Thank you.
> > >
> > > Best Regards,
> > > Andriy Redko
> > >
> > > On Fri, Mar 31, 2023, 9:39 AM Danny Cranmer 
> > > wrote:
> > >
> > > > Apologies for the typo.
> Elasticsearch/Opensearch/Cassandra/JDBC/MongoDB
> > > are
> > > > build __failures__.
> > > >
> > > > On Fri, Mar 31, 2023 at 2:36 PM Danny Cranmer <
> dannycran...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > With the release of 1.17.0 [1], we now need to add support for our
> > > > > externalized connectors. This release is lightweight compared to
> > normal
> > > > > since we are ideally not updating any source code; we only need to
> > > > rebuild
> > > > > and distribute the binary to Maven. Therefore I am proposing that
> we
> > > > bundle
> > > > > all the connectors into a single VOTE, and release them in one go.
> I
> > am
> > > > > considering connectors that are released and passing CI for 1.17,
> > which
> > > > > unfortunately only includes:
> > > > > - flink-connector-aws [2] at version 4.1.0-1.17 (build success [3])
> > > > > - flink-connector-pulsar [4] at version 3.0.0-1.17 (build success
> > [5])
> > > > >
> > > > > The following connectors are not passing builds against Flink
> 1.17.0
> > > and
> > > > > potentially need a new source release, therefore are excluded and
> > will
> > > > need
> > > > > dedicated releases:
> > > > > - flink-connector-elasticsearch [6] at version 3.0.0-1.17 (build
> > > success
> > > > > [7])
> > > > > - flink-connector-opensearch [8] at version 1.0.0-1.17 (build
> success
> > > > [9])
> > > > > - flink-connector-cassandra [10] at version 3.0.0-1.17 (build
> success
> > > > [11])
> > > > > - flink-connector-jdbc [12] at version 3.0.0-1.17 (build success
> > [13])
> > > > > - flink-connector-mongodb [14] at version 1.0.0-1.17 (build success
> > > [15])
> > > > > - flink-connector-rabbitmq [16] at version 3.0.0-1.17 (build
> failure
> > > > [17])
> > > > >
> > > > > The following connectors are not yet released and are therefore
> > > excluded:
> > > > > - flink-connector-kafka [18]
> > > > >
> > > > > I volunteer myself as the release manager for flink-connector-aws
> and
> > > > > flink-connector-pulsar. I am happy to pick up others too, but will
> > move
> > > > to
> > > > > different threads.
> > > > >
> > > > > Thanks,
> > > > > Danny
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/
> > > > > [2] https://github.com/apache/flink-connector-aws
> > > > > [3]
> > > >
> https://github.com/apache/flink-connector-aws/actions/runs/4570151032
> > > > > [4] https://github.com/apache/flink-connector-pulsar
> > > > > [5]
> > > > >
> > >
> https://github.com/apache/flink-connector-pulsar/actions/runs/4570233207
> > > > > [6] https://github.com/apache/flink-connector-elasticsearch
> > > > > [7]
> > > > >
> > > >
> > >
> >
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/4575392625
> > > > > [8] https://github.com/apache/flink-connector-opensearch
> > > > > [9]
> > > > >
> > 

Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-03 Thread Jane Chan
Hi Timo,

Thanks for your valuable feedback. Let me explain the design details.

> However, actually fine-grained state TTL should already be possible
today. I don't fully understand where your proposed StateMetadata is
located? Would this be a value of @ExecNodeMetadata, StreamExecNode, or
TwoInputStreamOperator?

Currently, all ExecNodes that support JSON SerDe are annotated with
@ExecNodeMetadata. This annotation interface has a key called
consumedOptions, which persists all configuration that affects the
topology. For ExecNodes that translate to OneInputStreamOperator, adding
"table.exec.state.ttl" to consumedOptions is enough to achieve the goal of
configuring TTL with fine granularity. However, this is not a generalized
solution for ExecNodes that translate to TwoInputStreamOperator or
MultipleInputStreamOperator. Because we may need to set different TTLs for
the left / right (or k-th) input stream, but we do not want to introduce
configurations like "table.exec.left-state.ttl" or
"table.exec.right-state.ttl" or "table.exec.kth-input-state.ttl".

The proposed StateMetadata will be a member variable of the ExecNodes that
translate to stateful operators, similar to inputProperties (which is
shared by all ExecNodes, though).
I'd like to illustrate this in the following snippet of code for
StreamExecJoin.

@ExecNodeMetadata(
  name = "stream-exec-join",
  version = 1,
  producedTransformations = StreamExecJoin.JOIN_TRANSFORMATION,
  minPlanVersion = FlinkVersion.v1_15,
  minStateVersion = FlinkVersion.v1_15)

+@ExecNodeMetadata(
+   name = "stream-exec-join",
+   version = 2,
+   producedTransformations = StreamExecJoin.JOIN_TRANSFORMATION,
+   minPlanVersion = FlinkVersion.v1_18,
+   minStateVersion = FlinkVersion.v1_15)
public class StreamExecJoin extends ExecNodeBase<RowData>
  implements StreamExecNode<RowData>, SingleTransformationTranslator<RowData> {


+   @Nullable
+   @JsonProperty(FIELD_NAME_STATE)
+   private final List<StateMetadata> stateMetadataList;

  public StreamExecJoin(
  ReadableConfig tableConfig,
  JoinSpec joinSpec,
      List<int[]> leftUpsertKeys,
      List<int[]> rightUpsertKeys,
  InputProperty leftInputProperty,
  InputProperty rightInputProperty,
  RowType outputType,
  String description) {
  this(
  ExecNodeContext.newNodeId(),
  ExecNodeContext.newContext(StreamExecJoin.class),
  ExecNodeContext.newPersistedConfig(StreamExecJoin.class,
tableConfig),
  joinSpec,
  leftUpsertKeys,
  rightUpsertKeys,
  Lists.newArrayList(leftInputProperty, rightInputProperty),
  outputType,
  description,
+  StateMetadata.multiInputDefaultMeta(tableConfig,
LEFT_STATE_NAME, RIGHT_STATE_NAME));
  }

  @JsonCreator
  public StreamExecJoin(
  @JsonProperty(FIELD_NAME_ID) int id,
  @JsonProperty(FIELD_NAME_TYPE) ExecNodeContext context,
  @JsonProperty(FIELD_NAME_CONFIGURATION) ReadableConfig
persistedConfig,
  @JsonProperty(FIELD_NAME_JOIN_SPEC) JoinSpec joinSpec,
      @JsonProperty(FIELD_NAME_LEFT_UPSERT_KEYS) List<int[]> leftUpsertKeys,
      @JsonProperty(FIELD_NAME_RIGHT_UPSERT_KEYS) List<int[]> rightUpsertKeys,
      @JsonProperty(FIELD_NAME_INPUT_PROPERTIES) List<InputProperty> inputProperties,
  @JsonProperty(FIELD_NAME_OUTPUT_TYPE) RowType outputType,
  @JsonProperty(FIELD_NAME_DESCRIPTION) String description,

+  @Nullable
+  @JsonProperty(FIELD_NAME_STATE) List<StateMetadata> stateMetadataList) {
  super(id, context, persistedConfig, inputProperties, outputType,
description);
  checkArgument(inputProperties.size() == 2);
  this.joinSpec = checkNotNull(joinSpec);
  this.leftUpsertKeys = leftUpsertKeys;
  this.rightUpsertKeys = rightUpsertKeys;
+  this.stateMetadataList = stateMetadataList;
  }
  @Override
  @SuppressWarnings("unchecked")
  protected Transformation<RowData> translateToPlanInternal(
  PlannerBase planner, ExecNodeConfig config) {
  final ExecEdge leftInputEdge = …;
  final ExecEdge rightInputEdge = …;
  . . .

   // for backward compatibility
  long leftStateRetentionTime =
  isNullOrEmpty(stateMetadataList)
  ? config.getStateRetentionTime()
  : stateMetadataList.get(0).getStateTtl();
  long rightStateRetentionTime =
  isNullOrEmpty(stateMetadataList)
  ? leftStateRetentionTime
  : stateMetadataList.get(1).getStateTtl();

  AbstractStreamingJoinOperator operator;
  FlinkJoinType joinType = joinSpec.getJoinType();
  if (joinType == FlinkJoinType.ANTI || joinType == FlinkJoinType.SEMI)
{
  operator =
  new StreamingSemiAntiJoinOperator(
  joinType == FlinkJoinType.ANTI,
  leftTypeInfo,
  rightTypeInfo,
   

Re: [DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-03 Thread Yanfei Lei
Hi Zakelly,

Thanks for driving this. This proposal brings the file merging of
different types of state under a unified framework. I think it has the
added benefit of lightening the load on the JM. As FLINK-26590 [1]
described, triggered checkpoints can be delayed by discarding shared state
when the JM manages a large number of files. After this FLIP, the JM only
needs to manage some folders, which greatly reduces its burden.

In Section 4.1, two merging granularities (per subtask and per task
manager) are proposed. The shared state is managed at per-subtask
granularity, but for the changelog state backend, the DSTL files are
shared between checkpoints and are currently merged in batches at the
task manager level. When merging with the SEGMENTED_WITHIN_CP_BOUNDARY
mode, I'm concerned about the performance degradation of that merging,
hence I wonder if the merge granularities are configurable? Further,
from a user perspective, three new options are introduced in this
FLIP; do they have recommended defaults?


[1] https://issues.apache.org/jira/browse/FLINK-26590

Best,
Yanfei

Zakelly Lan  于2023年4月3日周一 18:36写道:

>
> Hi everyone,
>
> I would like to open a discussion on providing a unified file merging
> mechanism for checkpoints[1].
>
> Currently, many files are uploaded to the DFS during checkpoints,
> leading to the 'file flood' problem when running
> intensive workloads in a cluster.  To tackle this problem, various
> solutions have been proposed for different types
> of state files. Although these methods are similar, they lack a
> systematic view and approach. We believe that it is
> better to consider this problem as a whole and introduce a unified
> framework to address the file flood problem for
> all types of state files. A POC has been implemented based on current
> FLIP design, and the test results are promising.
>
>
> Looking forward to your comments or feedback.
>
> Best regards,
> Zakelly
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints


Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-03 Thread Jing Ge
Hi Jane,

Thanks for your clarification. +1 for using this feature for advanced use
cases.

Best regards,
Jing

On Mon, Apr 3, 2023 at 4:43 PM Jane Chan  wrote:

> Hi Yisha,
>
> Thanks for your detailed explanation. Here are my thoughts.
>
> > I'm interested in how to design a graphical interface to help users to
> maintain their custom fine-grained configuration between their job
> versions.
>
> Regarding the graphical IDE, this was mentioned to address the concern of
> editing less human-readable compiled JSON files. As far as I know, Flink
> previously provided a visualizer tool (
> https://flink.apache.org/visualizer/,
> but now retired) that takes a JSON representation of the job execution plan
> and visualizes it as a graph with complete annotations of execution
> strategies. It could have been extended to support the stateful operator's
> TTL edition and export to the new JSON text if it was still online.
>
> Meanwhile, I'm unsure whether this usability issue is indispensable to the
> community. I understand that many usability features, such as version
> control of SQL text and graphical SQL editor, are outside the scope of
> community features. Therefore, If everyone feels that visual editing of
> compiled plans is an essential part of the community feature, then I
> suggest that we should discuss a possible restart of this function in
> another FLIP while also enhancing it, such as allowing users to edit
> operators with TTL or other configs, and then exporting as a new JSON file.
>
> By "between job versions", I assume it refers to the query evolution. I
> think this is beyond the scope of this FLIP since query and schema changes
> may result in entirely different jobs, while this FLIP aims to achieve the
> operator-level TTL configuration per query.
>
> Best regards,
> Jane
>
> On Thu, Mar 30, 2023 at 10:03 PM 周伊莎 
> wrote:
>
> > Hi Jane,
> >
> > Thanks for your detailed response.
> >
> > You mentioned that there are 10k+ SQL jobs in your production
> > > environment, but only ~100 jobs' migration involves plan editing. Is
> 10k+
> > > the number of total jobs, or the number of jobs that use stateful
> > > computation and need state migration?
> > >
> >
> > 10k is the number of SQL jobs that enable periodic checkpoint. And
> > surely if users change their sql which result in changes of the plan,
> they
> > need to do state migration.
> >
> > - You mentioned that "A truth that can not be ignored is that users
> > > usually tend to give up editing TTL(or operator ID in our case) instead
> > of
> > > migrating this configuration between their versions of one given job."
> So
> > > what would users prefer to do if they're reluctant to edit the operator
> > > ID? Would they submit the same SQL as a new job with a higher version
> to
> > > re-accumulating the state from the earliest offset?
> >
> >
> > You're exactly right. People will tend to re-accumulate the state from a
> > given offset by changing the namespace of their checkpoint.
> > Namespace is an internal concept and restarting the sql job in a new
> > namespace can be simply understood as submitting a new job.
> >
> > Back to your suggestions, I noticed that FLIP-190 [3] proposed the
> > > following syntax to perform plan migration
> >
> >
> > The 'plan migration'  I said in my last reply may be inaccurate.  It's
> more
> > like 'query evolution'. In other word, if a user submitted a sql job
> with a
> > configured compiled plan, and then
> > he changes the sql,  the compiled plan changes too, how to move the
> > configuration in the old plan to the new plan.
> > IIUC, FLIP-190 aims to solve issues in flink version upgrades and leave
> out
> > the 'query evolution' which is a fundamental change to the query. E.g.
> > adding a filter condition, a different aggregation.
> > And I'm really looking forward to a solution for query evolution.
> >
> > And I'm also curious about how to use the hint
> > > approach to cover cases like
> > >
> > > - configuring TTL for operators like ChangelogNormalize,
> > > SinkUpsertMaterializer, etc., these operators are derived by the
> planner
> > > implicitly
> > > - cope with two/multiple input stream operator's state TTL, like join,
> > > and other operations like row_number, rank, correlate, etc.
> >
> >
> >  Actually, in our company , we make operators in the query block where
> the
> > hint locates all affected by that hint. For example,
> >
> > INSERT INTO sink
> > > SELECT /*+ STATE_TTL('1D') */
> > >id,
> > >name,
> > >num
> > > FROM (
> > >SELECT
> > >*,
> > >ROW_NUMBER() OVER (PARTITION BY id ORDER BY num DESC) as row_num
> > >FROM (
> > >SELECT
> > >*
> > >FROM (
> > >SELECT
> > >id,
> > >name,
> > >max(num) as num
> > >FROM source1
> > >GROUP BY
> > >id, name, TUMBLE(proc, INTERVAL '1' MINUTE)
> > >)
> > >GROUP BY
> > >  

[jira] [Created] (FLINK-31714) Conjars.org has died

2023-04-03 Thread Niels Basjes (Jira)
Niels Basjes created FLINK-31714:


 Summary: Conjars.org has died
 Key: FLINK-31714
 URL: https://issues.apache.org/jira/browse/FLINK-31714
 Project: Flink
  Issue Type: Bug
Reporter: Niels Basjes


Recently conjars.org has died.
The effect is that it is now *impossible* to build Flink on a clean machine.
Chris Wensel has set up a read-only mirror: https://conjars.wensel.net/




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-03 Thread Jane Chan
Hi Yisha,

Thanks for your detailed explanation. Here are my thoughts.

> I'm interested in how to design a graphical interface to help users to
maintain their custom fine-grained configuration between their job versions.

Regarding the graphical IDE, this was mentioned to address the concern of
editing less human-readable compiled JSON files. As far as I know, Flink
previously provided a visualizer tool (https://flink.apache.org/visualizer/,
but now retired) that takes a JSON representation of the job execution plan
and visualizes it as a graph with complete annotations of execution
strategies. It could have been extended to support the stateful operator's
TTL edition and export to the new JSON text if it was still online.

Meanwhile, I'm unsure whether this usability issue is indispensable to the
community. I understand that many usability features, such as version
control of SQL text and graphical SQL editor, are outside the scope of
community features. Therefore, If everyone feels that visual editing of
compiled plans is an essential part of the community feature, then I
suggest that we should discuss a possible restart of this function in
another FLIP while also enhancing it, such as allowing users to edit
operators with TTL or other configs, and then exporting as a new JSON file.

By "between job versions", I assume it refers to the query evolution. I
think this is beyond the scope of this FLIP since query and schema changes
may result in entirely different jobs, while this FLIP aims to achieve the
operator-level TTL configuration per query.

Best regards,
Jane

On Thu, Mar 30, 2023 at 10:03 PM 周伊莎 
wrote:

> Hi Jane,
>
> Thanks for your detailed response.
>
> You mentioned that there are 10k+ SQL jobs in your production
> > environment, but only ~100 jobs' migration involves plan editing. Is 10k+
> > the number of total jobs, or the number of jobs that use stateful
> > computation and need state migration?
> >
>
> 10k is the number of SQL jobs that enable periodic checkpoint. And
> surely if users change their sql which result in changes of the plan, they
> need to do state migration.
>
> - You mentioned that "A truth that can not be ignored is that users
> > usually tend to give up editing TTL(or operator ID in our case) instead
> of
> > migrating this configuration between their versions of one given job." So
> > what would users prefer to do if they're reluctant to edit the operator
> > ID? Would they submit the same SQL as a new job with a higher version to
> > re-accumulating the state from the earliest offset?
>
>
> You're exactly right. People will tend to re-accumulate the state from a
> given offset by changing the namespace of their checkpoint.
> Namespace is an internal concept and restarting the sql job in a new
> namespace can be simply understood as submitting a new job.
>
> Back to your suggestions, I noticed that FLIP-190 [3] proposed the
> > following syntax to perform plan migration
>
>
> The 'plan migration'  I said in my last reply may be inaccurate.  It's more
> like 'query evolution'. In other word, if a user submitted a sql job with a
> configured compiled plan, and then
> he changes the sql,  the compiled plan changes too, how to move the
> configuration in the old plan to the new plan.
> IIUC, FLIP-190 aims to solve issues in flink version upgrades and leave out
> the 'query evolution' which is a fundamental change to the query. E.g.
> adding a filter condition, a different aggregation.
> And I'm really looking forward to a solution for query evolution.
>
> And I'm also curious about how to use the hint
> > approach to cover cases like
> >
> > - configuring TTL for operators like ChangelogNormalize,
> > SinkUpsertMaterializer, etc., these operators are derived by the planner
> > implicitly
> > - cope with two/multiple input stream operator's state TTL, like join,
> > and other operations like row_number, rank, correlate, etc.
>
>
>  Actually, in our company , we make operators in the query block where the
> hint locates all affected by that hint. For example,
>
> INSERT INTO sink
> > SELECT /*+ STATE_TTL('1D') */
> >id,
> >name,
> >num
> > FROM (
> >SELECT
> >*,
> >ROW_NUMBER() OVER (PARTITION BY id ORDER BY num DESC) as row_num
> >FROM (
> >SELECT
> >*
> >FROM (
> >SELECT
> >id,
> >name,
> >max(num) as num
> >FROM source1
> >GROUP BY
> >id, name, TUMBLE(proc, INTERVAL '1' MINUTE)
> >)
> >GROUP BY
> >id, name, num
> >)
> > )
> > WHERE row_num = 1
> >
>
> In the SQL above, the state TTL of Rank and Agg will be all configured as 1
> day.  If users want to set different TTL for Rank and Agg, they can just
> make these two queries located in two different query blocks.
> It looks quite rough but straightforward enough.  For each side of join
> operator, one of my users proposed a 

[jira] [Created] (FLINK-31713) k8s operator should gather job version metrics

2023-04-03 Thread Jira
Márton Balassi created FLINK-31713:
--

 Summary: k8s operator should gather job version metrics
 Key: FLINK-31713
 URL: https://issues.apache.org/jira/browse/FLINK-31713
 Project: Flink
  Issue Type: New Feature
  Components: Kubernetes Operator, Runtime / Metrics
Affects Versions: kubernetes-operator-1.5.0
Reporter: Márton Balassi


Similarly to FLINK-31303, we should expose the number of times each Flink 
version is used in applications, on a per-namespace basis. Covering 
FlinkDeployments is sufficient imho (no need to try to dig into session jobs), 
as the main purpose is to gain visibility into the distribution of versions in 
use and to be able to nudge users along to upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-03 Thread Timo Walther

Hi Jane,

thanks for proposing this FLIP. More state insights and fine-grained 
state TTL are a frequently requested feature for Flink SQL. Eventually, 
we need to address this.


I agree with the previous responses that doing this with a hint might 
cause more confusion than it actually helps. We should use hints only if 
they can be placed close to an operation (e.g. JOIN or a table), and only 
where a global flag for the entire query via SET is not sufficient.


In general, I support the current direction of the FLIP and continuing 
the vision of FLIP-190. However, actually fine-grained state TTL should 
already be possible today. Maybe this is untested yet, but we largely 
reworked how configuration works within the planner in Flink 1.15.


As you quickly mentioned in the FLIP, ExecNodeConfig[1] already combines 
configuration coming from TableConfig with per-ExecNode config. 
Actually, state TTL from JSON plan should already have higher precedence 
than TableConfig.


It would be great to extend the meta-information of ExecNodes with state 
insights. I don't fully understand where your proposed StateMetadata is 
located? Would this be a value of @ExecNodeMetadata, StreamExecNode, or 
TwoInputStreamOperator?


I think it should be a combination of ExecNodeMetadata with rough 
estimates (declaration) or StreamExecNode. But should not bubble into 
TwoInputStreamOperator.


Regards,
Timo

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/nodes/exec/ExecNodeConfig.java


On 03.04.23 09:15, godfrey he wrote:

Hi Jane,

Thanks for driving this FLIP.

I think the compiled plan solution and the hint solution do not
conflict; the two can exist at the same time.
The compiled plan solution can address the needs of advanced users and
platform users, where the state TTL of all stateful operators can be
defined by the user, while the hint solution can address some specific
simple scenarios and is very user-friendly, convenient, and unambiguous
to use.

Some stateful operators are not compiled from SQL directly, such as
ChangelogNormalize and SinkUpsertMaterializer mentioned above. I notice
the example given by Yisha has a hint propagation problem which does not
conform to the current design. The rough idea of the hint solution should
be simple (only the common operators are supported) and easy to
understand (no hint propagation).

If the hint solution is supported, a compiled plan generated from a query
with state TTL hints can also be further modified for the state TTL parts.

So, I prefer the hint solution to be discussed in a separate FLIP. I
think that FLIP may need a lot of discussion.

Best,
Godfrey

周伊莎  于2023年3月30日周四 22:04写道:


Hi Jane,

Thanks for your detailed response.

You mentioned that there are 10k+ SQL jobs in your production

environment, but only ~100 jobs' migration involves plan editing. Is 10k+
the number of total jobs, or the number of jobs that use stateful
computation and need state migration?



10k is the number of SQL jobs that enable periodic checkpoint. And
surely if users change their sql which result in changes of the plan, they
need to do state migration.

- You mentioned that "A truth that can not be ignored is that users

usually tend to give up editing TTL(or operator ID in our case) instead of
migrating this configuration between their versions of one given job." So
what would users prefer to do if they're reluctant to edit the operator
ID? Would they submit the same SQL as a new job with a higher version to
re-accumulating the state from the earliest offset?



You're exactly right. People will tend to re-accumulate the state from a
given offset by changing the namespace of their checkpoint.
Namespace is an internal concept and restarting the sql job in a new
namespace can be simply understood as submitting a new job.

Back to your suggestions, I noticed that FLIP-190 [3] proposed the

following syntax to perform plan migration



The 'plan migration'  I said in my last reply may be inaccurate.  It's more
like 'query evolution'. In other word, if a user submitted a sql job with a
configured compiled plan, and then
he changes the sql,  the compiled plan changes too, how to move the
configuration in the old plan to the new plan.
IIUC, FLIP-190 aims to solve issues in flink version upgrades and leave out
the 'query evolution' which is a fundamental change to the query. E.g.
adding a filter condition, a different aggregation.
And I'm really looking forward to a solution for query evolution.

And I'm also curious about how to use the hint

approach to cover cases like

- configuring TTL for operators like ChangelogNormalize,
SinkUpsertMaterializer, etc., these operators are derived by the planner
implicitly
- cope with two/multiple input stream operator's state TTL, like join,
and other operations like row_number, rank, correlate, etc.



  Actually, in our company , we make operators in the query 

[ANNOUNCE] Starting with Flink 1.18 Release Sync

2023-04-03 Thread Qingsheng Ren
Hi everyone,

As a fresh start of the Flink release 1.18, I'm happy to share with you
that the first release sync meeting of 1.18 will happen tomorrow on
Tuesday, April 4th at 10am (UTC+2) / 4pm (UTC+8). Welcome and feel free to
join us and share your ideas about the new release cycle!

Details of joining the release sync can be found in the 1.18 release wiki
page [1].

All contributors are invited to update the same wiki page [1] and include
features targeting the 1.18 release.

Looking forward to seeing you all in the meeting!

[1] https://cwiki.apache.org/confluence/display/FLINK/1.18+Release

Best regards,
Jing, Konstantin, Sergey and Qingsheng


[jira] [Created] (FLINK-31712) Allow skipping of archunit tests for nightly connector builds

2023-04-03 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-31712:
--

 Summary: Allow skipping of archunit tests for nightly connector 
builds
 Key: FLINK-31712
 URL: https://issues.apache.org/jira/browse/FLINK-31712
 Project: Flink
  Issue Type: Improvement
  Components: Build System, Connectors / Common
Reporter: Martijn Visser
Assignee: Martijn Visser






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31711) OpenAPI spec omits complete-statement request body

2023-04-03 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-31711:


 Summary: OpenAPI spec omits complete-statement request body
 Key: FLINK-31711
 URL: https://issues.apache.org/jira/browse/FLINK-31711
 Project: Flink
  Issue Type: Bug
  Components: Documentation, Runtime / REST
Affects Versions: 1.17.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.18.0, 1.17.1


The OpenAPI generator omits request bodies for GET requests because having 
them is usually a bad idea.

Still, the generator shouldn't omit this on its own.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31710) Remove and rely on the curator dependency that's provided by flink

2023-04-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-31710:
-

 Summary: Remove  and rely on the curator 
dependency that's provided by flink
 Key: FLINK-31710
 URL: https://issues.apache.org/jira/browse/FLINK-31710
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Affects Versions: 1.16.1, 1.17.0, 1.18.0
Reporter: Matthias Pohl


Currently, we're relying on a dedicated curator dependency in tests to use the 
{{TestingZooKeeperServer}} (see [Flink's parent 
pom|https://github.com/apache/flink/blob/97cff0768d05e4a7d0217ddc92fd9ea3c7fae2c2/pom.xml#L143]).
 Besides that, we're using {{flink-shaded}} to provide the zookeeper and 
curator dependency that is used in Flink's production code.

The flaw of that approach is that we have to maintain two curator versions. 
This Jira issue is about investigating whether we could just remove the curator 
test dependency and rely on the {{flink-shaded}} curator sources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31709) JobResultStore and ExecutionGraphInfoStore could be merged

2023-04-03 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-31709:
-

 Summary: JobResultStore and ExecutionGraphInfoStore could be merged
 Key: FLINK-31709
 URL: https://issues.apache.org/jira/browse/FLINK-31709
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Coordination
Reporter: Matthias Pohl


This is an initial proposal for an improvement in the coordination layer:

The {{JobResultStore}} (JRS) was introduced as part of 
[FLIP-194|https://cwiki.apache.org/confluence/display/FLINK/FLIP-194%3A+introduce+the+jobresultstore].
 For now, it only stores the JobResult. Through the JRS, jobs can be marked as 
finished even when the JobManager fails and the information from the 
{{ExecutionGraphInfoStore}} is lost (see FLINK-11813).

While implementing {{FLIP-194}}, it became apparent that we have some 
redundancy between the JRS and the {{ExecutionGraphInfoStore}}. Both components 
store some meta information of a finished job. The {{ExecutionGraphInfoStore}} 
is used to make information about the finished job available in user-facing 
APIs (REST, web-UI). The JRS is used to expose the job's state to the cleanup 
logic and stores limited data.

This proposal is about merging the two and making the 
{{ArchivedExecutionGraph}} information available even after a JobManager is 
restarted. That way, completed jobs can be still listed in the job overview 
after a Flink cluster restart. Additionally, we could provide the last 
checkpoint information. The JRS would be a way to access this information even 
after the Flink cluster is shut down. The latter feature would be also a way to 
improve the Flink Kubernetes Operator's latest-state handling.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31708) RuntimeException/KryoException thrown when deserializing an empty protobuf record

2023-04-03 Thread shen (Jira)
shen created FLINK-31708:


 Summary: RuntimeException/KryoException thrown when deserializing 
an empty protobuf record
 Key: FLINK-31708
 URL: https://issues.apache.org/jira/browse/FLINK-31708
 Project: Flink
  Issue Type: Bug
  Components: API / Type Serialization System
Affects Versions: 1.17.0, 1.10.0
Reporter: shen


h1. Problem description

I am using a protobuf-generated class in a Flink job. When the application 
runs in production, the job throws the following exception:
{code:java}
java.lang.RuntimeException: Could not create class com.MYClass < generated 
by protobuf
at 
com.twitter.chill.protobuf.ProtobufSerializer.read(ProtobufSerializer.java:76)
at 
com.twitter.chill.protobuf.ProtobufSerializer.read(ProtobufSerializer.java:40)
at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813)
at 
org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:346)
at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:205)
at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:46)
at 
org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
at 
org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:141)
at 
org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:121)
at 
org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:185)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:319)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:494)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:478)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.esotericsoftware.kryo.KryoException: java.io.EOFException: No 
more bytes left.
at 
org.apache.flink.api.java.typeutils.runtime.NoFetchingInput.readBytes(NoFetchingInput.java:127)
at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:332)
at 
com.twitter.chill.protobuf.ProtobufSerializer.read(ProtobufSerializer.java:73)
... 16 common frames omitted
 {code}
h1. How to reproduce

I think this is similar to another issue: FLINK-29347.

Following is an example to reproduce the problem:
{code:java}
package com.test;

import com.test.ProtobufGeneratedClass;

import com.google.protobuf.Message;
import com.twitter.chill.protobuf.ProtobufSerializer;
import lombok.extern.slf4j.Slf4j;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.java.utils.MultipleParameterTool;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import 
org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;
import 
org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

import java.util.Random;
@Slf4j
public class app {
  public static final OutputTag OUTPUT_TAG_1 =
  new OutputTag("output-tag-1") {
  };

  public static final OutputTag OUTPUT_TAG_2 =
  new OutputTag("output-tag-2") {
  };

  public static final OutputTag OUTPUT_TAG_3 =
  new OutputTag("output-tag-3") {
  };

  public static class MySourceFunction extends 
RichParallelSourceFunction {
Random rnd = new Random();
private final String name;

private boolean running = true;

private MySourceFunction(String name) {
  this.name = name;
}

@Override
public void run(SourceContext sourceContext) throws 
Exception {
  final int index = 

[jira] [Created] (FLINK-31707) Constant string cannot be used as input arguments of Pandas UDAF

2023-04-03 Thread Dian Fu (Jira)
Dian Fu created FLINK-31707:
---

 Summary: Constant string cannot be used as input arguments of 
Pandas UDAF
 Key: FLINK-31707
 URL: https://issues.apache.org/jira/browse/FLINK-31707
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Reporter: Dian Fu
Assignee: Dian Fu


It throws an exception like the following when constant strings are used in a 
Pandas UDAF:
{code}
E   raise ValueError("field_type %s is not supported." % 
field_type)
E   ValueError: field_type type_name: CHAR
E   char_info {
E length: 3
E   }
Eis not supported.
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31706) The default source parallelism should be the same as execution's default parallelism under adaptive batch scheduler

2023-04-03 Thread Yun Tang (Jira)
Yun Tang created FLINK-31706:


 Summary: The default source parallelism should be the same as 
execution's default parallelism under adaptive batch scheduler
 Key: FLINK-31706
 URL: https://issues.apache.org/jira/browse/FLINK-31706
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Yun Tang
 Fix For: 1.18.0


Currently, sources need to set 
{{execution.batch.adaptive.auto-parallelism.default-source-parallelism}} in 
adaptive batch scheduler mode; otherwise, the source parallelism defaults to 
1. A better solution might be to fall back to the execution's default 
parallelism if the user has not configured it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31705) Remove Conjars

2023-04-03 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-31705:
--

 Summary: Remove Conjars
 Key: FLINK-31705
 URL: https://issues.apache.org/jira/browse/FLINK-31705
 Project: Flink
  Issue Type: Technical Debt
  Components: Build System
Reporter: Martijn Visser
Assignee: Martijn Visser


With Conjars no longer being available (only https://conjars.wensel.net/ is 
there), we should remove all the notices to Conjars in Flink. We've already 
removed the need for Conjars because we've excluded Pentaho as part of 
FLINK-27640, which eliminates having any dependency that relies on Conjars. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Releasing connectors for Flink 1.17

2023-04-03 Thread Hang Ruan
Hi Denny,

I could help with JDBC connectors. Thanks~

Best,
Hang

Danny Cranmer  于2023年4月3日周一 17:57写道:

> Hi everyone,
>
> I have created Jiras to investigate each failing build. Thanks Andriy, I
> have assigned the Elasticsearch/Opensearch ones to you. Let me know if you
> do not have capacity.
>
> - Elasticsearch: https://issues.apache.org/jira/browse/FLINK-31696
> - Opensearch: https://issues.apache.org/jira/browse/FLINK-31697
> - RabbitMQ: https://issues.apache.org/jira/browse/FLINK-31701
> - JDBC: https://issues.apache.org/jira/browse/FLINK-31699
> - Cassandra: https://issues.apache.org/jira/browse/FLINK-31698
> - MongoDB: https://issues.apache.org/jira/browse/FLINK-31700
>
> Thanks,
> Danny
>
>
> On Fri, Mar 31, 2023 at 5:59 PM Andrey Redko  wrote:
>
> > Hi Denny,
> >
> > I could give you a hand with Elasticsearch / Opensearch connectors,
> please
> > let me know if help is needed. Thank you.
> >
> > Best Regards,
> > Andriy Redko
> >
> > On Fri, Mar 31, 2023, 9:39 AM Danny Cranmer 
> > wrote:
> >
> > > Apologies for the typo. Elasticsearch/Opensearch/Cassandra/JDBC/MongoDB
> > are
> > > build __failures__.
> > >
> > > On Fri, Mar 31, 2023 at 2:36 PM Danny Cranmer  >
> > > wrote:
> > >
> > > > Hello all,
> > > >
> > > > With the release of 1.17.0 [1], we now need to add support for our
> > > > externalized connectors. This release is lightweight compared to
> normal
> > > > since we are ideally not updating any source code; we only need to
> > > rebuild
> > > > and distribute the binary to Maven. Therefore I am proposing that we
> > > bundle
> > > > all the connectors into a single VOTE, and release them in one go. I
> am
> > > > considering connectors that are released and passing CI for 1.17,
> which
> > > > unfortunately only includes:
> > > > - flink-connector-aws [2] at version 4.1.0-1.17 (build success [3])
> > > > - flink-connector-pulsar [4] at version 3.0.0-1.17 (build success
> [5])
> > > >
> > > > The following connectors are not passing builds against Flink 1.17.0
> > and
> > > > potentially need a new source release, therefore are excluded and
> will
> > > need
> > > > dedicated releases:
> > > > - flink-connector-elasticsearch [6] at version 3.0.0-1.17 (build
> > success
> > > > [7])
> > > > - flink-connector-opensearch [8] at version 1.0.0-1.17 (build success
> > > [9])
> > > > - flink-connector-cassandra [10] at version 3.0.0-1.17 (build success
> > > [11])
> > > > - flink-connector-jdbc [12] at version 3.0.0-1.17 (build success
> [13])
> > > > - flink-connector-mongodb [14] at version 1.0.0-1.17 (build success
> > [15])
> > > > - flink-connector-rabbitmq [16] at version 3.0.0-1.17 (build failure
> > > [17])
> > > >
> > > > The following connectors are not yet released and are therefore
> > excluded:
> > > > - flink-connector-kafka [18]
> > > >
> > > > I volunteer myself as the release manager for flink-connector-aws and
> > > > flink-connector-pulsar. I am happy to pick up others too, but will
> move
> > > to
> > > > different threads.
> > > >
> > > > Thanks,
> > > > Danny
> > > >
> > > > [1]
> > > >
> > >
> >
> https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/
> > > > [2] https://github.com/apache/flink-connector-aws
> > > > [3]
> > > https://github.com/apache/flink-connector-aws/actions/runs/4570151032
> > > > [4] https://github.com/apache/flink-connector-pulsar
> > > > [5]
> > > >
> > https://github.com/apache/flink-connector-pulsar/actions/runs/4570233207
> > > > [6] https://github.com/apache/flink-connector-elasticsearch
> > > > [7]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/4575392625
> > > > [8] https://github.com/apache/flink-connector-opensearch
> > > > [9]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-opensearch/actions/runs/4575383878
> > > > [10] https://github.com/apache/flink-connector-cassandra
> > > > [11]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-cassandra/actions/runs/4575364252
> > > > [12] https://github.com/apache/flink-connector-jdbc
> > > > [13]
> > > >
> https://github.com/apache/flink-connector-jdbc/actions/runs/4575361507
> > > > [14] https://github.com/apache/flink-connector-mongodb
> > > > [15]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-mongodb/actions/runs/4575230622
> > > > [16] https://github.com/apache/flink-connector-rabbitmq
> > > > [17]
> > > >
> > >
> >
> https://github.com/apache/flink-connector-rabbitmq/actions/runs/4575370483
> > > > [18] https://github.com/apache/flink-connector-kafka
> > > >
> > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-31704) Pulsar docs should be pulled from dedicated branch

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31704:
-

 Summary: Pulsar docs should be pulled from dedicated branch
 Key: FLINK-31704
 URL: https://issues.apache.org/jira/browse/FLINK-31704
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Pulsar, Documentation
Reporter: Danny Cranmer






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31703) Update Flink docs for AWS v4.1.0

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31703:
-

 Summary: Update Flink docs for AWS v4.1.0
 Key: FLINK-31703
 URL: https://issues.apache.org/jira/browse/FLINK-31703
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Opensearch, Documentation
Reporter: Danny Cranmer
Assignee: Danny Cranmer
 Fix For: 1.18.0, 1.17.1


Update Flink docs for 1.16 to pull in the opensearch docs from 
flink-connector-opensearch repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] FLIP-306: Unified File Merging Mechanism for Checkpoints

2023-04-03 Thread Zakelly Lan
Hi everyone,

I would like to open a discussion on providing a unified file merging
mechanism for checkpoints[1].

Currently, many files are uploaded to the DFS during checkpoints,
leading to the 'file flood' problem when running
intensive workloads in a cluster.  To tackle this problem, various
solutions have been proposed for different types
of state files. Although these methods are similar, they lack a
systematic view and approach. We believe that it is
better to consider this problem as a whole and introduce a unified
framework to address the file flood problem for
all types of state files. A POC has been implemented based on the current
FLIP design, and the test results are promising.


Looking forward to your comments or feedback.

Best regards,
Zakelly

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints


[jira] [Created] (FLINK-31702) Integrate Opensearch connector docs into Flink docs v1.17/master

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31702:
-

 Summary: Integrate Opensearch connector docs into Flink docs 
v1.17/master
 Key: FLINK-31702
 URL: https://issues.apache.org/jira/browse/FLINK-31702
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / Opensearch, Documentation
Reporter: Danny Cranmer
Assignee: Danny Cranmer
 Fix For: 1.16.1


Update Flink docs for 1.16 to pull in the opensearch docs from 
flink-connector-opensearch repo



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] Releasing connectors for Flink 1.17

2023-04-03 Thread Danny Cranmer
Hi everyone,

I have created Jiras to investigate each failing build. Thanks Andriy, I
have assigned the Elasticsearch/Opensearch ones to you. Let me know if you
do not have capacity.

- Elasticsearch: https://issues.apache.org/jira/browse/FLINK-31696
- Opensearch: https://issues.apache.org/jira/browse/FLINK-31697
- RabbitMQ: https://issues.apache.org/jira/browse/FLINK-31701
- JDBC: https://issues.apache.org/jira/browse/FLINK-31699
- Cassandra: https://issues.apache.org/jira/browse/FLINK-31698
- MongoDB: https://issues.apache.org/jira/browse/FLINK-31700

Thanks,
Danny


On Fri, Mar 31, 2023 at 5:59 PM Andrey Redko  wrote:

> Hi Danny,
>
> I could give you a hand with Elasticsearch / Opensearch connectors, please
> let me know if help is needed. Thank you.
>
> Best Regards,
> Andriy Redko
>
> On Fri, Mar 31, 2023, 9:39 AM Danny Cranmer 
> wrote:
>
> > Apologies for the typo. Elasticsearch/Opensearch/Cassandra/JDBC/MongoDB
> are
> > build __failures__.
> >
> > On Fri, Mar 31, 2023 at 2:36 PM Danny Cranmer 
> > wrote:
> >
> > > Hello all,
> > >
> > > With the release of 1.17.0 [1], we now need to add support for our
> > > externalized connectors. This release is lightweight compared to normal
> > > since we are ideally not updating any source code; we only need to
> > rebuild
> > > and distribute the binary to Maven. Therefore I am proposing that we
> > bundle
> > > all the connectors into a single VOTE, and release them in one go. I am
> > > considering connectors that are released and passing CI for 1.17, which
> > > unfortunately only includes:
> > > - flink-connector-aws [2] at version 4.1.0-1.17 (build success [3])
> > > - flink-connector-pulsar [4] at version 3.0.0-1.17 (build success [5])
> > >
> > > The following connectors are not passing builds against Flink 1.17.0
> and
> > > potentially need a new source release, therefore are excluded and will
> > need
> > > dedicated releases:
> > > - flink-connector-elasticsearch [6] at version 3.0.0-1.17 (build
> success
> > > [7])
> > > - flink-connector-opensearch [8] at version 1.0.0-1.17 (build success
> > [9])
> > > - flink-connector-cassandra [10] at version 3.0.0-1.17 (build success
> > [11])
> > > - flink-connector-jdbc [12] at version 3.0.0-1.17 (build success [13])
> > > - flink-connector-mongodb [14] at version 1.0.0-1.17 (build success
> [15])
> > > - flink-connector-rabbitmq [16] at version 3.0.0-1.17 (build failure
> > [17])
> > >
> > > The following connectors are not yet released and are therefore
> excluded:
> > > - flink-connector-kafka [18]
> > >
> > > I volunteer myself as the release manager for flink-connector-aws and
> > > flink-connector-pulsar. I am happy to pick up others too, but will move
> > to
> > > different threads.
> > >
> > > Thanks,
> > > Danny
> > >
> > > [1]
> > >
> >
> https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/
> > > [2] https://github.com/apache/flink-connector-aws
> > > [3]
> > https://github.com/apache/flink-connector-aws/actions/runs/4570151032
> > > [4] https://github.com/apache/flink-connector-pulsar
> > > [5]
> > >
> https://github.com/apache/flink-connector-pulsar/actions/runs/4570233207
> > > [6] https://github.com/apache/flink-connector-elasticsearch
> > > [7]
> > >
> >
> https://github.com/apache/flink-connector-elasticsearch/actions/runs/4575392625
> > > [8] https://github.com/apache/flink-connector-opensearch
> > > [9]
> > >
> >
> https://github.com/apache/flink-connector-opensearch/actions/runs/4575383878
> > > [10] https://github.com/apache/flink-connector-cassandra
> > > [11]
> > >
> >
> https://github.com/apache/flink-connector-cassandra/actions/runs/4575364252
> > > [12] https://github.com/apache/flink-connector-jdbc
> > > [13]
> > > https://github.com/apache/flink-connector-jdbc/actions/runs/4575361507
> > > [14] https://github.com/apache/flink-connector-mongodb
> > > [15]
> > >
> >
> https://github.com/apache/flink-connector-mongodb/actions/runs/4575230622
> > > [16] https://github.com/apache/flink-connector-rabbitmq
> > > [17]
> > >
> >
> https://github.com/apache/flink-connector-rabbitmq/actions/runs/4575370483
> > > [18] https://github.com/apache/flink-connector-kafka
> > >
> > >
> > >
> >
>


[jira] [Created] (FLINK-31701) RabbitMQ nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31701:
-

 Summary: RabbitMQ nightly CI failure
 Key: FLINK-31701
 URL: https://issues.apache.org/jira/browse/FLINK-31701
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / MongoDB
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
https://github.com/apache/flink-connector-mongodb/actions/runs/4585933750

 
{code:java}
Error:  
/home/runner/work/flink-connector-mongodb/flink-connector-mongodb/flink-connector-mongodb/src/main/java/org/apache/flink/connector/mongodb/sink/writer/context/DefaultMongoSinkContext.java:[33,8]
 org.apache.flink.connector.mongodb.sink.writer.context.DefaultMongoSinkContext 
is not abstract and does not override abstract method getAttemptNumber() in 
org.apache.flink.api.connector.sink2.Sink.InitContext{code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31700) MongoDB nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31700:
-

 Summary: MongoDB nightly CI failure
 Key: FLINK-31700
 URL: https://issues.apache.org/jira/browse/FLINK-31700
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / JDBC
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
[https://github.com/apache/flink-connector-jdbc/actions/runs/4585903259]

 
{code:java}
Error:  Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on 
project flink-connector-jdbc: Execution default-test of goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test failed: 
org.junit.platform.commons.JUnitException: TestEngine with ID 'archunit' failed 
to discover tests: 
com.tngtech.archunit.lang.syntax.elements.MethodsThat.areAnnotatedWith(Ljava/lang/Class;)Ljava/lang/Object;
 -> [Help 1]{code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31699) JDBC nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31699:
-

 Summary: JDBC nightly CI failure
 Key: FLINK-31699
 URL: https://issues.apache.org/jira/browse/FLINK-31699
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Cassandra
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
[https://github.com/apache/flink-connector-cassandra/actions/runs/4585936901]

 
{code:java}
Error:  Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on 
project flink-connector-cassandra_2.12: Execution default-test of goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test failed: 
org.junit.platform.commons.JUnitException: TestEngine with ID 'junit-jupiter' 
failed to discover tests: org/junit/jupiter/api/io/CleanupMode: 
org.junit.jupiter.api.io.CleanupMode -> [Help 1]{code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31698) Cassandra nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31698:
-

 Summary: Cassandra nightly CI failure
 Key: FLINK-31698
 URL: https://issues.apache.org/jira/browse/FLINK-31698
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / Opensearch
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
[https://github.com/apache/flink-connector-opensearch/actions/runs/4585851921]

 

 
{code:java}
Error: Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on 
project flink-connector-opensearch: Execution default-test of goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test failed: 
org.junit.platform.commons.JUnitException: TestEngine with ID 'archunit' failed 
to discover tests: 'java.lang.Object 
com.tngtech.archunit.lang.syntax.elements.MethodsThat.areAnnotatedWith(java.lang.Class)'
 -> [Help 1]{code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31697) OpenSearch nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31697:
-

 Summary: OpenSearch nightly CI failure
 Key: FLINK-31697
 URL: https://issues.apache.org/jira/browse/FLINK-31697
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / ElasticSearch
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
[https://github.com/apache/flink-connector-elasticsearch/actions/runs/4585918498/jobs/8098357503]

 
{code:java}
Error:  Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on 
project flink-connector-elasticsearch-base: Execution default-test of goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test failed: 
org.junit.platform.commons.JUnitException: TestEngine with ID 'archunit' failed 
to discover tests: 'java.lang.Object 
com.tngtech.archunit.lang.syntax.elements.MethodsThat.areAnnotatedWith(java.lang.Class)'
 -> [Help 1] {code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31696) ElasticSearch nightly CI failure

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31696:
-

 Summary: ElasticSearch nightly CI failure
 Key: FLINK-31696
 URL: https://issues.apache.org/jira/browse/FLINK-31696
 Project: Flink
  Issue Type: Technical Debt
  Components: Connectors / ElasticSearch
Reporter: Danny Cranmer


Investigate and fix the nightly CI failure. Example 
[https://github.com/apache/flink-connector-elasticsearch/actions/runs/4585918498/jobs/8098357503]

 
{code:java}
Error:  Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test (default-test) on 
project flink-connector-elasticsearch-base: Execution default-test of goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M5:test failed: 
org.junit.platform.commons.JUnitException: TestEngine with ID 'archunit' failed 
to discover tests: 'java.lang.Object 
com.tngtech.archunit.lang.syntax.elements.MethodsThat.areAnnotatedWith(java.lang.Class)'
 -> [Help 1] {code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] FLIP-287: Extend Sink#InitContext to expose ExecutionConfig and JobID

2023-04-03 Thread Zhu Zhu
Hi João,

Thanks for creating this FLIP!
I'm overall +1 for it to unblock the migration of sinks to SinkV2.

Yet I think it's better to let the `ReadableExecutionConfig` extend
`ExecutionConfig`, because otherwise we have to introduce a new method
`TypeInformation#createSerializer(ReadableExecutionConfig)`. The new
method may require every `TypeInformation` to implement it, including
Flink built-in ones and custom ones, otherwise exceptions will happen.
That goal, however, is pretty hard to achieve.
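
Purely as an illustration of that point (placeholder classes, not the FLIP's actual interfaces): if the read-only view is a subclass, anything that already accepts an ExecutionConfig, such as the existing createSerializer(ExecutionConfig), keeps working without a new overload per TypeInformation.

// Sketch only: placeholder types, not Flink's real ExecutionConfig/TypeInformation.
class ExecutionConfig {
    private boolean objectReuse;
    public boolean isObjectReuseEnabled() { return objectReuse; }
    public void enableObjectReuse() { objectReuse = true; }
}

// A read-only view that still *is* an ExecutionConfig, so it can be passed to any
// existing method that takes an ExecutionConfig.
class ReadableExecutionConfig extends ExecutionConfig {
    private final ExecutionConfig delegate;
    ReadableExecutionConfig(ExecutionConfig delegate) { this.delegate = delegate; }
    @Override public boolean isObjectReuseEnabled() { return delegate.isObjectReuseEnabled(); }
    @Override public void enableObjectReuse() {
        throw new UnsupportedOperationException("read-only view exposed via InitContext");
    }
}

abstract class TypeInformation<T> {
    // The existing signature keeps working; no new overload is required.
    public abstract Object createSerializer(ExecutionConfig config);
}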

Thanks,
Zhu

João Boto  wrote on Tue, Feb 28, 2023 at 23:34:
>
> I have updated the FLIP with the two options that we have discussed.
>
> Option 1: Expose ExecutionConfig directly on InitContext
> this have a minimal impact as we only have to expose the new methods
>
> Option 2: Expose ReadableExecutionConfig on InitContext
> with this option we have more impact as we need to add a new method to
> TypeInformation and change all implementations (there are currently 72
> implementations)
>
> Waiting for feedback or concerns about the two options


Re: [ANNOUNCE] Apache flink-connector-mongodb v1.0.0 released

2023-04-03 Thread Jiabao Sun
Good to know & Congrats!

Many thanks to Danny, Chesnay, Martijn and Ross!

Best,
Jiabao

> On Apr 3, 2023, at 3:56 PM, Danny Cranmer  wrote:
> 
> The Apache Flink community is very happy to announce the release of Apache
> flink-connector-mongodb v1.0.0
> 
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352386
> 
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
> 
> Regards,
> Danny



[jira] [Created] (FLINK-31695) Calling `bin/flink stop` when flink-shaded-hadoop-2-uber-2.8.3-10.0.jar is in lib directory throws NoSuchMethodError

2023-04-03 Thread Caizhi Weng (Jira)
Caizhi Weng created FLINK-31695:
---

 Summary: Calling `bin/flink stop` when 
flink-shaded-hadoop-2-uber-2.8.3-10.0.jar is in lib directory throws 
NoSuchMethodError
 Key: FLINK-31695
 URL: https://issues.apache.org/jira/browse/FLINK-31695
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.15.4, 1.16.1, 1.17.0
Reporter: Caizhi Weng


To reproduce this bug, follow these steps:

1. Download 
[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar|https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop-2-uber/2.8.3-10.0]
 and put it in the {{lib}} directory.
2. Run {{bin/flink stop}}

The exception stack is

{code}
java.lang.NoSuchMethodError: 
org.apache.commons.cli.CommandLine.hasOption(Lorg/apache/commons/cli/Option;)Z
at org.apache.flink.client.cli.StopOptions.<init>(StopOptions.java:53)
at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:539)
at 
org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1102)
at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1165)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
at 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1165)
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] Status of Statefun Project

2023-04-03 Thread Martijn Visser
Hi everyone,

I want to open a discussion on the status of the Statefun Project [1] in
Apache Flink. As you might have noticed, there hasn't been much development
over the past months in the Statefun repository [2]. There is currently a
lack of active contributors and committers who are able to help with the
maintenance of the project.

In order to improve the situation, we need to solve the lack of committers
and the lack of contributors.

On the lack of committers:

1. Ideally, there are some of the current Flink committers who have the
bandwidth and can help with reviewing PRs and merging them.
2. If that's not an option, it could be a consideration that current
committers only approve and review PRs, that are approved by those who are
willing to contribute to Statefun and if the CI passes

On the lack of contributors:

3. Next to having this discussion on the Dev and User mailing list, we can
also create a blog with a call for new contributors on the Flink project
website, send out some tweets on the Flink / Statefun twitter accounts,
post messages on Slack etc. In that message, we would inform how those that
are interested in contributing can start and where they could reach out for
more information.

There's also option 4. where a group of interested people would split
Statefun from the Flink project and make it a separate top level project
under the Apache Flink umbrella (similar as recently has happened with
Flink Table Store, which has become Apache Paimon).

If we see no improvements in the coming period, we should consider
sunsetting Statefun and communicate that clearly to the users.

I'm looking forward to your thoughts.

Best regards,

Martijn

[1] https://nightlies.apache.org/flink/flink-statefun-docs-master/
[2] https://github.com/apache/flink-statefun


Re: [blog article] Howto create a batch source with the new Source framework

2023-04-03 Thread Etienne Chauchot

Hi all,

I just published the last article of the series about creating a batch 
source with the new Source framework. This one is about testing the source.


Can you tell me if you think it would make sense to publish both 
articles to the official Flink blog, as they could serve as detailed 
documentation?


Thanks

Etienne

[1] 
https://echauchot.blogspot.com/2023/04/flink-howto-test-batch-source-with-new.html


On 03/04/2023 at 03:37, yuxia wrote:

Thanks Etienne for the detailed explanation.

Best regards,
Yuxia

- Original Message -
From: "Etienne Chauchot" 
To: "dev" 
Sent: Friday, March 31, 2023, 9:08:36 PM
Subject: Re: [blog article] Howto create a batch source with the new Source framework

Hi Yuxia,

Thanks for your feedback.

Comments inline


On 31/03/2023 at 04:21, yuxia wrote:

Hi, Etienne.

Thanks to Etienne for sharing this article. I really like it and learned a lot 
from it.

=> Glad it was useful, that was precisely the point :)

I'd like to raise some questions about implementing a batch source. Devs are welcome 
to share insights on them.

The first question is how to generate splits:
As the article mentioned:
"Whenever possible, it is preferable to generate the splits lazily, meaning that 
each time a reader asks the enumerator for a split, the enumerator generates one on 
demand and assigns it to the reader."
I think it may not hold for all cases. In some cases, generating a split may be 
time-consuming, so it may be better to generate a batch of splits on demand to 
amortize the expense.
But that raises another question: how many splits should be generated in a 
batch? Too many may well cause OOM, too few may not make good use of batch 
split generation.
To solve it, I think maybe we can provide a configuration option to let users 
configure how many splits should be generated in a batch.
What's your opinion on this? Have you ever encountered this problem in your 
implementation?

=> I agree, lazy splits are not the only way. I've mentioned in the
article that batch generation is another option in case of a high split
generation cost, thanks for the suggestion. During the implementation I
didn't have this problem, as generating a split was not costly; the only
costly processing was the splits preparation. It was run asynchronously
and only once, then each split generation was straightforward. That
being said, during development, I had OOM risks in the size and number
of splits. For the number of splits, lazy generation solved it, as no
list of splits was stored in the enumerator apart from the splits to
reassign. For the size of a split, I used a user-provided max split memory
size, similar to what you suggest here. In the batch generation case, we
could allow the user to set a max memory size for the batch: a number-of-splits
limit looks more dangerous to me if we don't know the size of
a split, but if we are talking about storing the split objects and not
their content then that is ok. IMO, memory size is clearer for the
user as it is linked to the memory of a task manager.
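
To make the lazy, on-demand idea concrete, here is a minimal self-contained sketch (not the actual Cassandra connector code; the split shape, sizes and names are made up): the enumerator stores nothing but the splits to reassign and creates a new split only when a reader asks for one.

import java.util.ArrayDeque;
import java.util.Optional;
import java.util.Queue;

// Purely illustrative enumerator-side logic; names and split shape are made up.
class LazySplitEnumeratorSketch {

    static final class Split {
        final long start; // inclusive
        final long end;   // exclusive
        Split(long start, long end) { this.start = start; this.end = end; }
    }

    // Only splits handed back by failed readers are kept in memory.
    private final Queue<Split> splitsToReassign = new ArrayDeque<>();
    private final long totalSize;  // size of the whole input, e.g. estimated rows or tokens
    private final long splitSize;  // derived e.g. from a user-provided max split memory size
    private long nextStart;        // next position that has not been assigned yet

    LazySplitEnumeratorSketch(long totalSize, long splitSize) {
        this.totalSize = totalSize;
        this.splitSize = splitSize;
    }

    // Called when a reader requests work: splits are created one at a time, on demand.
    Optional<Split> nextSplit() {
        Split pending = splitsToReassign.poll();
        if (pending != null) {
            return Optional.of(pending); // reassign splits of failed readers first
        }
        if (nextStart >= totalSize) {
            return Optional.empty();     // input exhausted: the reader can be marked finished
        }
        long end = Math.min(totalSize, nextStart + splitSize);
        Split split = new Split(nextStart, end);
        nextStart = end;
        return Optional.of(split);
    }

    // Splits returned by a failed reader are re-queued instead of being regenerated.
    void addSplitBack(Split split) {
        splitsToReassign.add(split);
    }
}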



The second question is how to assign splits:
What's your split assign stratgy?

=> the naïve one: a reader asks for a split, the enumerator receives the
request, generates a split and assigns it to the demanding reader.

In Flink, we provide `LocalityAwareSplitAssigner` to make use of locality when 
assigning splits to readers.

=> well, it is only of interest when the data-backend cluster nodes can be
co-located with Flink task managers, right? That would rarely be the
case, as the clusters seem to be separated most of the time to use the
maximum available CPU (at least for CPU-bound workloads), no?

But it may not be perfect for the failover case

=> Agree: it would require a costly shuffle to keep the co-location after
restoration, and I think this cost would not be balanced by the gain from
co-locality (mainly avoiding network use).

for which we intend to introduce another split assignment strategy[1].
But I do think it should be configurable, to enable advanced users to decide
which assignment strategy to use.

=> when you say the "user" I guess you mean user of the source not user
of the dev framework (implementor of the source). I think that it should
be configurable indeed as the user is the one knowing the repartition of
the partitions of the backend data.
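
To illustrate what a configurable strategy could look like (a hypothetical interface, not Flink's actual API), the enumerator could delegate each assignment decision to a pluggable assigner; a demand-driven assigner and a locality-aware one would then just be two implementations selected through the source configuration.

import java.util.Collection;
import java.util.Optional;

// Purely illustrative: a pluggable assignment strategy the enumerator could delegate to.
interface SplitAssignerSketch<SplitT> {

    // Pick the next split for the requesting reader; the hostname can be used for locality.
    Optional<SplitT> getNext(int subtaskId, String requesterHostname);

    // Splits from a failed reader are handed back for later reassignment.
    void addSplitsBack(Collection<SplitT> splits);
}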

Best

Etienne



Welcome other devs to share opinion.

[1]: https://issues.apache.org/jira/browse/FLINK-31065





Also as for split assigner .


Best regards,
Yuxia

- Original Message -
From: "Etienne Chauchot" 
To: "dev" 
Cc: "Chesnay Schepler" 
Sent: Thursday, March 30, 2023, 10:36:39 PM
Subject: [blog article] Howto create a batch source with the new Source framework

Hi all,

After creating the Cassandra source connector (thanks Chesnay for the
review!), I wrote a blog article about how to create a batch source with
the new Source framework [1]. It gives field feedback on how to
implement the different components.

I felt it could be useful to people interested in contributing or
migrating connectors.

=> 

[jira] [Created] (FLINK-31694) Bump ua-parser-js from 0.7.31 to 0.7.33

2023-04-03 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-31694:
--

 Summary: Bump ua-parser-js from 0.7.31 to 0.7.33
 Key: FLINK-31694
 URL: https://issues.apache.org/jira/browse/FLINK-31694
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Web Frontend
Reporter: Martijn Visser
Assignee: Martijn Visser


Dependabot PR: https://github.com/apache/flink/pull/21767



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-31693) Bump http-cache-semantics from 4.1.0 to 4.1.1 in

2023-04-03 Thread Martijn Visser (Jira)
Martijn Visser created FLINK-31693:
--

 Summary: Bump http-cache-semantics from 4.1.0 to 4.1.1 in
 Key: FLINK-31693
 URL: https://issues.apache.org/jira/browse/FLINK-31693
 Project: Flink
  Issue Type: Technical Debt
  Components: Runtime / Web Frontend
Reporter: Martijn Visser
Assignee: Martijn Visser






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] FLINK-31553

2023-04-03 Thread Marios Trivyzas
Hello to all,

I'd like to discuss an issue with flink-connector-jdbc:
https://issues.apache.org/jira/browse/FLINK-31553.
Currently, the Dialect/Catalog/TypeMappers, etc. are chosen automatically
based on the JDBC connection string prefix.
For example, if `jdbc:postgresql://` is used, then the
PostgresDialect/Catalog, etc. is chosen, which can be problematic
for databases like CrateDB or CockroachDB, which fully support the
PostgreSQL JDBC driver (and therefore use the
`jdbc:postgresql://...` connection strings) but internally don't
support the same database/schema structure along with the
PostgreSQL-specific pg_* system tables (for example pg_cursors, pg_constraints, etc.),
and may need a different type mapping.

I propose to add a new parameter to the `JdbcCatalog`, which will allow
users to select the Dialect explicitly, overriding the
automatic URL-based decision.

I'd propose to just add another method to `JdbcDialectFactory` which
resolves the dialect before the `acceptsURL` method,
which remains the fallback. The method would be `useDialect(String
dialectName)` and is consulted first; if the string is null/empty
or no matching Dialect is found, `acceptsURL` is called as a
fallback. I propose that the `dialectName` is something simple
like `CrateDB`, `MySQL`, etc.; another option would be the fully
qualified name of a dialect, which is then loaded by the class loader.
With the latter, users can create their own Dialect which will be loaded
and used, without the need to merge it into the upstream
flink-connector-jdbc repo.
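
A rough sketch of the proposed resolution order (illustrative names only, not the actual JdbcDialectFactory API): the explicitly configured dialect name is consulted first, and the URL prefix stays as the fallback.

import java.util.List;
import java.util.Optional;

// Purely illustrative: not the actual JdbcDialectFactory API.
class DialectResolutionSketch {

    interface DialectFactory {
        String dialectName();            // e.g. "CrateDB", "MySQL"
        boolean acceptsURL(String url);  // existing URL-prefix based matching
    }

    // An explicitly configured dialect name wins; the URL prefix remains the fallback.
    static Optional<DialectFactory> resolve(
            List<DialectFactory> factories, String url, String dialectName) {
        if (dialectName != null && !dialectName.isEmpty()) {
            Optional<DialectFactory> byName =
                    factories.stream()
                            .filter(f -> f.dialectName().equalsIgnoreCase(dialectName))
                            .findFirst();
            if (byName.isPresent()) {
                return byName;
            }
        }
        return factories.stream().filter(f -> f.acceptsURL(url)).findFirst();
    }
}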

Please let me know what you think of this proposal, and share your
suggestions.

Thank you!

-- 
Marios


[jira] [Created] (FLINK-31692) Integrate MongoDB connector docs into Flink website

2023-04-03 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-31692:
-

 Summary: Integrate MongoDB connector docs into Flink website
 Key: FLINK-31692
 URL: https://issues.apache.org/jira/browse/FLINK-31692
 Project: Flink
  Issue Type: Improvement
  Components: Connectors / MongoDB
Reporter: Danny Cranmer
 Fix For: 1.18.0


Update Flink docs build to pull the MongoDB connector docs [1]

 

 

[1] https://github.com/apache/flink-connector-mongodb



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[ANNOUNCE] Apache flink-connector-aws v4.1.0 released

2023-04-03 Thread Danny Cranmer
The Apache Flink community is very happy to announce the release of Apache
flink-connector-aws v4.1.0

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352646

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Danny


[ANNOUNCE] Apache flink-connector-mongodb v1.0.0 released

2023-04-03 Thread Danny Cranmer
The Apache Flink community is very happy to announce the release of Apache
flink-connector-mongodb v1.0.0

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352386

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Danny


Re: [DISCUSS] FLIP 295: Support persistence of Catalog configuration and asynchronous registration

2023-04-03 Thread Feng Jin
Hi everyone, Thank you all for your interest in this DISCUSS.

@Shammon
> How to handle a catalog with the same name that exists for both of them?

I believe this is a crucial point. Based on my current personal
understanding, the Map<String, Catalog> catalogs will serve as a cache
for instantiated catalogs and have the highest priority.

There are three methods that can affect the Map<String, Catalog> catalogs:

1. registerCatalog(String catalogName, Catalog catalog)

This method puts the catalog instance into the Map<String, Catalog> catalogs.

2. unregisterCatalog(String catalogName)

This method removes the catalog instance corresponding to catalogName
from the Map<String, Catalog> catalogs.

3. getCatalog(String catalogName)

This method first retrieves the corresponding catalog instance from
the Map<String, Catalog> catalogs. If the catalog does not exist, it
retrieves the corresponding configuration from the CatalogStore,
initializes it, and puts the initialized Catalog instance into the
Map<String, Catalog> catalogs.
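
A minimal sketch of this lookup order (illustrative code only; the FLIP's actual interfaces may differ): the map of instantiated catalogs is consulted first, and only on a miss is the configuration read from the CatalogStore and the catalog initialized lazily.

import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Sketch only: placeholder CatalogManager showing cache-first lookup with lazy initialization.
class CatalogManagerSketch<CATALOG> {

    private final Map<String, CATALOG> catalogs = new HashMap<>();                 // instantiated catalogs
    private final Map<String, Map<String, String>> catalogStore = new HashMap<>(); // persisted properties
    private final BiFunction<String, Map<String, String>, CATALOG> factory;        // stands in for the catalog factory

    CatalogManagerSketch(BiFunction<String, Map<String, String>, CATALOG> factory) {
        this.factory = factory;
    }

    CATALOG getCatalog(String name) {
        CATALOG cached = catalogs.get(name);
        if (cached != null) {
            return cached; // an already-instantiated catalog always wins
        }
        Map<String, String> properties = catalogStore.get(name);
        if (properties == null) {
            throw new IllegalArgumentException("Catalog " + name + " does not exist");
        }
        // Lazy initialization; a failure here surfaces when a statement references the catalog.
        CATALOG catalog = factory.apply(name, properties);
        catalogs.put(name, catalog);
        return catalog;
    }
}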

The following two methods only modify the configuration in the CatalogStore:

1. addCatalog(String catalogName, Map<String, String> properties)

This method saves the properties to the catalogStore and checks
whether a catalog with the same name already exists.

2. removeCatalog(String catalogName)
This method removes the configuration for the specified
catalogName from the catalogStore.

The following are possible conflict scenarios:

1. When the corresponding catalogName already exists in the
CatalogStore but not in the Map<String, Catalog>, the
registerCatalog(String catalogName, Catalog catalog) method can
succeed and the instance is directly saved to the Map<String, Catalog> catalogs.

2. When the corresponding catalogName already exists in both the
CatalogStore and the Map<String, Catalog>, the registerCatalog(String
catalogName, Catalog catalog) method will fail.

3. When the corresponding catalogName already exists in the
Map<String, Catalog>, the addCatalog(String catalogName, Map<String, String> properties) method can directly save the properties to the
catalogStore, but the getCatalog(String catalogName) method will not
use the new properties for initialization because the corresponding
catalog instance already exists in catalogs and will be prioritized.
Therefore, the unregisterCatalog(String catalogName) method must be
used first to remove the instance corresponding to the original
catalogName.



> I think it will confuse users that `registerCatalog(String 
> catalogName, Catalog catalog)` in the `Map<String, Catalog> catalogs` and 
> `registerCatalog(String catalogName, Map<String, String> properties)

This could potentially lead to confusion. I suggest changing the
method name, perhaps to addCatalog(String catalogName, Map<String, String> properties), as previously mentioned.


@Hang
> add `registerCatalog(String catalogName,Catalog catalog,
boolean lazyInit)` method

Since a catalog is already an instance, adding the "lazyInit"
parameter to the registerCatalog(String catalogName, Catalog catalog)
method may not necessarily result in lazy initialization.

> Do we need to think about encryption

I think encryption is necessary, but perhaps the encryption logic
should be implemented in the CatalogStore when the data is actually
saved. For instance, the FileCatalogStore could encrypt the data when
it is saved to a file, while the MemoryCatalogStore does not require
encryption.

>  Do we really need the `MemoryCatalogStore`?

I think it is necessary to have a MemoryCatalogStore as the default
implementation, which saves Catalog configurations in memory.
Otherwise, if we want to implement asynchronous loading of Catalogs, we
would need to introduce an additional cache.


@Xianxun

> What if asynchronous registration failed?

This is also a critical concern.  When executing DDL, DQL, or DML
statements that reference a specific Catalog, the Catalog instance
will be initialized if it has not been initialized yet. If the
initialization process fails, these statements will not be executed
successfully.

> nt. The Map<String, Catalog> catalogs and CatalogStore catalogstore in the 
> CatalogManager all have the same catalog name, but correspond to different 
> catalog instances.

This issue can be resolved by referring to the previous responses. The
key principle is that the Catalog that has been most recently used
will be given priority for subsequent use.

On Sun, Apr 2, 2023 at 10:58 PM Xianxun Ye  wrote:
>
> Hi Feng,
>
> Thanks for driving this Flip, I do believe this Flip could be helpful for 
> users. The firm I work for also manages a lot of catalogs, and the submission 
> of tasks becomes slow because of loading a number of catalogs. We obtain the 
> catalogs in the user's Flink SQL through regular expression to avoid loading 
> all catalogs to improve the speed of task submission. After reading flip-295, 
> I have some questions:
>
> 1. When creating a catalog with CREATE CATALOG, the asynchronous registration 
> method is used by default. What if asynchronous registration failed? And how 
> to implement asynchronous registration?
>
> 2. I also have the same question with Shammon’s first comment. The 
> Map catalogs and CatalogStore catalogstore 

Re: [External] [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

2023-04-03 Thread godfrey he
Hi Jane,

Thanks for driving this FLIP.

I think the compiled plan solution and the hint solution do not
conflict; the two can exist at the same time.
The compiled plan solution can address the needs of advanced users and
platform users, where the state TTL of all stateful operators can be
defined by the user. The hint solution, on the other hand, can address some
specific simple scenarios and is very user-friendly, convenient,
and unambiguous to use.

Some stateful operators are not compiled from SQL directly, such as
ChangelogNormalize and
SinkUpsertMaterializer mentioned above. I notice that the example given by Yisha
has a hint-propagation problem, which does not conform to the current design.
The rough idea of the hint solution is to keep it simple (only the
common operators are supported)
and easy to understand (no hint propagation).

If the hint solution is supported, a compiled plan generated from a
query with state TTL hints
can also be further modified for the state TTL parts.

So, I prefer the hint solution to be discussed in a separate FLIP. I
think that FLIP may
need a lot of discussion.

Best,
Godfrey

周伊莎  wrote on Thu, Mar 30, 2023 at 22:04:
>
> Hi Jane,
>
> Thanks for your detailed response.
>
> You mentioned that there are 10k+ SQL jobs in your production
> > environment, but only ~100 jobs' migration involves plan editing. Is 10k+
> > the number of total jobs, or the number of jobs that use stateful
> > computation and need state migration?
> >
>
> 10k is the number of SQL jobs that enable periodic checkpoints. And
> surely, if users change their SQL in a way that results in changes to the plan, they
> need to do state migration.
>
> - You mentioned that "A truth that can not be ignored is that users
> > usually tend to give up editing TTL(or operator ID in our case) instead of
> > migrating this configuration between their versions of one given job." So
> > what would users prefer to do if they're reluctant to edit the operator
> > ID? Would they submit the same SQL as a new job with a higher version to
> > re-accumulating the state from the earliest offset?
>
>
> You're exactly right. People will tend to re-accumulate the state from a
> given offset by changing the namespace of their checkpoint.
> Namespace is an internal concept and restarting the sql job in a new
> namespace can be simply understood as submitting a new job.
>
> Back to your suggestions, I noticed that FLIP-190 [3] proposed the
> > following syntax to perform plan migration
>
>
> The 'plan migration' I said in my last reply may be inaccurate. It's more
> like 'query evolution'. In other words, if a user submitted a SQL job with a
> configured compiled plan, and then
> changes the SQL, the compiled plan changes too; the question is how to move the
> configuration from the old plan to the new plan.
> IIUC, FLIP-190 aims to solve issues in Flink version upgrades and leaves out
> 'query evolution', which is a fundamental change to the query, e.g.
> adding a filter condition or a different aggregation.
> And I'm really looking forward to a solution for query evolution.
>
> And I'm also curious about how to use the hint
> > approach to cover cases like
> >
> > - configuring TTL for operators like ChangelogNormalize,
> > SinkUpsertMaterializer, etc., these operators are derived by the planner
> > implicitly
> > - cope with two/multiple input stream operator's state TTL, like join,
> > and other operations like row_number, rank, correlate, etc.
>
>
>  Actually, in our company, all operators in the query block where the
> hint is located are affected by that hint. For example,
>
> INSERT INTO sink
> > SELECT /*+ STATE_TTL('1D') */
> >id,
> >name,
> >num
> > FROM (
> >SELECT
> >*,
> >ROW_NUMBER() OVER (PARTITION BY id ORDER BY num DESC) as row_num
> >FROM (
> >SELECT
> >*
> >FROM (
> >SELECT
> >id,
> >name,
> >max(num) as num
> >FROM source1
> >GROUP BY
> >id, name, TUMBLE(proc, INTERVAL '1' MINUTE)
> >)
> >GROUP BY
> >id, name, num
> >)
> > )
> > WHERE row_num = 1
> >
>
> In the SQL above, the state TTL of Rank and Agg will both be configured as 1
> day. If users want to set different TTLs for Rank and Agg, they can just
> place these two queries in two different query blocks.
> It looks quite rough but straightforward enough. For each side of the join
> operator, one of my users proposed a syntax like the one below:
>
> > /*+ 
> > JOIN_TTL('tables'='left_table,right_table','left_ttl'='10','right_ttl'='1')
> >  */
> >
> > We haven't accepted this proposal now, maybe we could find some better
> design for this kind of case. Just for your information.
>
> I think if we want to utilize hints to support fine-grained configuration,
> we can open a new FLIP to discuss it.
> BTW, personally, I'm interested in how to design a graphical interface to
> help users to maintain their custom 

Re: [VOTE] Release flink-connector-kafka, release candidate #1

2023-04-03 Thread Konstantin Knauf
+1. Thanks, Gordon!

Am Mo., 3. Apr. 2023 um 06:37 Uhr schrieb Tzu-Li (Gordon) Tai <
tzuli...@apache.org>:

> Hi Martijn,
>
> Since this RC vote was opened, we had three critical bug fixes that were
> merged for the Kafka connector:
>
>- https://issues.apache.org/jira/browse/FLINK-31363
>- https://issues.apache.org/jira/browse/FLINK-31305
>- https://issues.apache.org/jira/browse/FLINK-31620
>
> Given the severity of these issues (all of them are violations of
> exactly-once semantics), and the fact that they are currently not included
> yet in any released version, do you think it makes sense to cancel this RC
> in favor of a new one that includes these?
> Since this RC vote has been stale for quite some time already, it doesn't
> seem like we're throwing away too much effort that has already been done if
> we start a new RC with these critical fixes included.
>
> What do you think?
>
> Thanks,
> Gordon
>
> On Thu, Feb 9, 2023 at 3:26 PM Tzu-Li (Gordon) Tai 
> wrote:
>
> > +1 (binding)
> >
> > - Verified legals (license headers and root LICENSE / NOTICE file).
> AFAICT
> > no dependencies require explicit acknowledgement in the NOTICE files.
> > - No binaries in staging area
> > - Built source with tests
> > - Verified signatures and hashes
> > - Web PR changes LGTM
> >
> > Thanks Martijn!
> >
> > Cheers,
> > Gordon
> >
> > On Mon, Feb 6, 2023 at 6:12 PM Mason Chen 
> wrote:
> >
> >> That makes sense, thanks for the clarification!
> >>
> >> Best,
> >> Mason
> >>
> >> On Wed, Feb 1, 2023 at 7:16 AM Martijn Visser  >
> >> wrote:
> >>
> >> > Hi Mason,
> >> >
> >> > Thanks, [4] is indeed a copy-paste error and you've made the right
> >> > assumption that
> >> >
> >> >
> >>
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> >> > is the correct maven central link.
> >> >
> >> > I think we should use FLINK-30052 to move the Kafka connector code
> from
> >> the
> >> > 1.17 release also over the Kafka connector repo (especially since
> >> there's
> >> > now a v3.0 branch for the Kafka connector, so it can be merged in
> main).
> >> > When those commits have been merged, we can make a next Kafka
> connector
> >> > release (which is equivalent to the 1.17 release, which can only be
> done
> >> > when 1.17 is done because of the split level watermark alignment) and
> >> then
> >> > FLINK-30859 can be finished.
> >> >
> >> > Best regards,
> >> >
> >> > Martijn
> >> >
> >> > Op wo 1 feb. 2023 om 09:16 schreef Mason Chen  >:
> >> >
> >> > > +1 (non-binding)
> >> > >
> >> > > * Verified hashes and signatures
> >> > > * Verified no binaries
> >> > > * Verified LICENSE and NOTICE files
> >> > > * Verified poms point to 3.0.0-1.16
> >> > > * Reviewed web PR
> >> > > * Built from source
> >> > > * Verified git tag
> >> > >
> >> > > I think [4] your is a copy-paste error and I did all the
> verification
> >> > > assuming that
> >> > >
> >> > >
> >> >
> >>
> https://repository.apache.org/content/repositories/orgapacheflink-1582/org/apache/flink/
> >> > > is the correct maven central link.
> >> > >
> >> > > Regarding the release notes, should we close
> >> > > https://issues.apache.org/jira/browse/FLINK-30052 and link it
> there?
> >> > I've
> >> > > created https://issues.apache.org/jira/browse/FLINK-30859 to remove
> >> the
> >> > > existing code from the master branch.
> >> > >
> >> > > Best,
> >> > > Mason
> >> > >
> >> > > On Tue, Jan 31, 2023 at 6:23 AM Martijn Visser <
> >> martijnvis...@apache.org
> >> > >
> >> > > wrote:
> >> > >
> >> > > > Hi everyone,
> >> > > > Please review and vote on the release candidate #1 for
> >> > > > flink-connector-kafka version 3.0.0, as follows:
> >> > > > [ ] +1, Approve the release
> >> > > > [ ] -1, Do not approve the release (please provide specific
> >> comments)
> >> > > >
> >> > > > Note: this is the same code as the Kafka connector for the Flink
> >> 1.16
> >> > > > release.
> >> > > >
> >> > > > The complete staging area is available for your review, which
> >> includes:
> >> > > > * JIRA release notes [1],
> >> > > > * the official Apache source release to be deployed to
> >> dist.apache.org
> >> > > > [2],
> >> > > > which are signed with the key with fingerprint
> >> > > > A5F3BCE4CBE993573EC5966A65321B8382B219AF [3],
> >> > > > * all artifacts to be deployed to the Maven Central Repository
> [4],
> >> > > > * source code tag v3.0.0-rc1 [5],
> >> > > > * website pull request listing the new release [6].
> >> > > >
> >> > > > The vote will be open for at least 72 hours. It is adopted by
> >> majority
> >> > > > approval, with at least 3 PMC affirmative votes.
> >> > > >
> >> > > > Thanks,
> >> > > > Release Manager
> >> > > >
> >> > > > [1]
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352577
> >> > > > [2]
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-kafka-3.0.0-rc1
> >> > > > [3]