Re: [2.0] Help needed for release 2.0 work items

2024-01-03 Thread Zakelly Lan
Thanks for the information Xintong! I'll consider working on this if I have
time later this month.


Best,
Zakelly

On Wed, Jan 3, 2024 at 11:54 AM Xintong Song  wrote:

> >
> > I would like to ask if there is any plan to review and refactor the CLI
> in
> > Flink 2.0.
> >
>
> Not that I'm aware of.
>
>
> I have the impression of seeing discussions about not using CLI options and
> using only "-Dconfig.key", but I cannot find it now. I personally think
> that is a good direction to go, and should also solve the problem mentioned
> in your example. My biggest concern is whether there are contributors with
> enough capacity to work on it.
>
>
> Feel free to pick it up if you'd like to.
>
>
> Best,
>
> Xintong
>
>
>
> On Wed, Jan 3, 2024 at 11:37 AM Zakelly Lan  wrote:
>
> > Hi Xintong,
> >
> > Thanks for driving this.
> >
> > I would like to ask if there is any plan to review and refactor the CLI
> in
> > Flink 2.0. I recently found that the CLI commands and parameters are
> > confusing in some ways (e.g.
> > https://github.com/apache/flink/pull/23253#discussion_r1405707256). It
> > would be beneficial to offer a more intuitive and straightforward CLI
> > command to enhance usability.
> >
> >
> > Best,
> > Zakelly
> >
> > On Wed, Jan 3, 2024 at 10:45 AM Xintong Song 
> > wrote:
> >
> > > Thanks a lot for offering the help, Rui. The plan sounds good to me.
> I'll
> > > put your name and the milestones into the 2.0 wiki page.
> > >
> > > Best,
> > >
> > > Xintong
> > >
> > >
> > >
> > > On Wed, Jan 3, 2024 at 10:38 AM Rui Fan <1996fan...@gmail.com> wrote:
> > >
> > > > Thanks Xintong for promoting the progress of Flink 2.0.
> > > >
> > > > If no one minds, I'd like to pick this one: Use Java’s Duration
> instead
> > > of
> > > > Flink’s Time.
> > > > Could I assign FLINK-14068[1] to me?
> > > >
> > > > My expected progress is:
> > > > - Mark org.apache.flink.api.common.time.Time and
> > > >  org.apache.flink.streaming.api.windowing.time.Time
> > > >  as @Deprecated in 1.19 (Must do in 1.19)
> > > > - Refactor all usages of them to Java's Duration(Nice do in 1.19,
> must
> > do
> > > > in 1.20)
> > > > - Remove them in 2.0
> > > >
> > > > Is this plan reasonable?
> > > >
> > > > [1] https://issues.apache.org/jira/browse/FLINK-14068
> > > >
> > > > Best,
> > > > Rui
> > > >
> > > > On Wed, Jan 3, 2024 at 9:18 AM Xintong Song 
> > > wrote:
> > > >
> > > >> Hi devs,
> > > >>
> > > >> The release managers have been tracking the progress of release 2.0
> > work
> > > >> items. Unfortunately, some of the items are not in good progress,
> and
> > > >> either don't have a contributor or the original contributor no
> longer
> > > has
> > > >> capacity to work on them. We have already tried reaching out to some
> > > >> developers, but unfortunately don't find many people with capacity.
> > > >>
> > > >> Therefore, we are looking for developers who want to pick them up.
> > > >>
> > > >> Helps are needed on:
> > > >>
> > > >>- Introduce dedicated MetricsScope
> > > >>- Rework MetricGroup scope APIs
> > > >>- Remove MetricGroup methods accepting an int as a name
> > > >>- Remove brackets around variables
> > > >>- Drop MetricReporter#open
> > > >>- Gauge should only take subclasses of Number, rather than
> > > >> everything
> > > >>- Add MetricGroup#getLogicalScope
> > > >>- User Java’s Duration instead of Flink’s Time
> > > >>- Review and refactor the REST API
> > > >>- Properly handle NaN/Infinity in OpenAPI spec
> > > >>- Enforce single maxExceptions query parameter
> > > >>- Drop Yarn specific get rest endpoints
> > > >>- Review and refactor the metrics implementation
> > > >>- Attach semantics to Gauges; refactor Counter / Meter to be
> Gauges
> > > >> with
> > > >>syntactic sugar on top
> > > >>- Restructure
> > > >>
> > > >>
> > > >> Please note that:
> > > >>
> > > >>- For some of the items, the milestones are already given, and
> > there
> > > >>might be some actions that need to be performed by Flink 1.19.
> > Please
> > > >> be
> > > >>aware that we are only 3.5 weeks from the 1.19 feature freeze.
> > > >>- There are also items which don't have any plans / milestones
> yet.
> > > For
> > > >>such items, we may want to quickly look into them to find out if
> > > >> there's
> > > >>anything that needs to be done in 1.19.
> > > >>- See more details on the 2.0 wiki page [1]
> > > >>
> > > >>
> > > >> If these items do not make Flink 1.19, we can discuss later what to
> do
> > > >> with
> > > >> them, either postpone release 2.0 or exclude them from this major
> > > release.
> > > >> But for now, let's first see what we can do by 1.19.
> > > >>
> > > >> Best,
> > > >>
> > > >> Xintong
> > > >>
> > > >>
> > > >> [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > > >>
> > > >
> > >
> >
>


[jira] [Created] (FLINK-33987) Flink Table API Support file extention or suffix

2024-01-03 Thread haojie (Jira)
haojie created FLINK-33987:
--

 Summary: Flink Table API Support file extention or suffix
 Key: FLINK-33987
 URL: https://issues.apache.org/jira/browse/FLINK-33987
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / API
Reporter: haojie


Problem:

when sink file by filesystem table API there is no file extension according to 
format, or no way to costomize file suffix.

https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-files/src/main/java/org/apache/flink/connector/file/table/FileSystemTableSink.java#L238

New feature: give user option to costomize file suffix, or set default file 
extension by format



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33986) Introduce SupportsBatchSnapshot for shuffle master

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33986:
-

 Summary: Introduce SupportsBatchSnapshot for shuffle master
 Key: FLINK-33986
 URL: https://issues.apache.org/jira/browse/FLINK-33986
 Project: Flink
  Issue Type: Sub-task
Reporter: Junrui Li






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33985) Extend ShuffleMaster to fetch all partition

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33985:
-

 Summary: Extend ShuffleMaster to fetch all partition
 Key: FLINK-33985
 URL: https://issues.apache.org/jira/browse/FLINK-33985
 Project: Flink
  Issue Type: Sub-task
Reporter: Junrui Li






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33984) Introduce SupportsBatchSnapshot for operator coordinator

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33984:
-

 Summary: Introduce SupportsBatchSnapshot for operator coordinator
 Key: FLINK-33984
 URL: https://issues.apache.org/jira/browse/FLINK-33984
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Junrui Li






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33983) Introduce JobEvent and JobEventStore for Job Recovery

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33983:
-

 Summary: Introduce JobEvent and JobEventStore for Job Recovery
 Key: FLINK-33983
 URL: https://issues.apache.org/jira/browse/FLINK-33983
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Coordination
Reporter: Junrui Li






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33982) Introduce new config options for Job Recovery.

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33982:
-

 Summary: Introduce new config options for Job Recovery.
 Key: FLINK-33982
 URL: https://issues.apache.org/jira/browse/FLINK-33982
 Project: Flink
  Issue Type: Sub-task
  Components: API / Core
Reporter: Junrui Li






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33981) File Descriptor References Not Released After Job Execution in MiniCluster Mode

2024-01-03 Thread Feng Jiajie (Jira)
Feng Jiajie created FLINK-33981:
---

 Summary: File Descriptor References Not Released After Job 
Execution in MiniCluster Mode
 Key: FLINK-33981
 URL: https://issues.apache.org/jira/browse/FLINK-33981
 Project: Flink
  Issue Type: Bug
  Components: Runtime / State Backends
Affects Versions: 1.18.0
Reporter: Feng Jiajie


When using MiniCluster mode, file descriptors like 
{{/tmp/minicluster_6e62ab17ab058e0d5df56b97e9ccef71/tm_0/localState}} are not 
released after a Job completes. Executing multiple Jobs in the same JVM might 
result in leftover file descriptors, potentially leading to problems.

After executing the reproducing code provided below (after entering the sleep), 
running lsof -p 18162 reveals:
{code:java}
...
java18162 sa_cluster   30r   DIR  253,1 01311962 
/tmp/minicluster_79186ba3856bb18fd21c20d9adbaa3da/tm_0/localState (deleted)
java18162 sa_cluster   31r   DIR  253,1 01311962 
/tmp/minicluster_79186ba3856bb18fd21c20d9adbaa3da/tm_0/localState (deleted)
java18162 sa_cluster   32r   DIR  253,1 01310787 
/tmp/minicluster_f3ac056ffa1c47010698fc02c5e6d4cf/tm_0/localState (deleted)
java18162 sa_cluster   33r   DIR  253,1 01310787 
/tmp/minicluster_f3ac056ffa1c47010698fc02c5e6d4cf/tm_0/localState (deleted)
java18162 sa_cluster   34r   DIR  253,1 01311960 
/tmp/minicluster_ef6177c97a2c9096758d0a8f132f9f02/tm_0/localState (deleted)
java18162 sa_cluster   35r   DIR  253,1 01311960 
/tmp/minicluster_ef6177c97a2c9096758d0a8f132f9f02/tm_0/localState (deleted)
java18162 sa_cluster   36r   DIR  253,1 01311974 
/tmp/minicluster_661a2967ea5c377265fa9f41c768db21/tm_0/localState (deleted)
java18162 sa_cluster   37r   DIR  253,1 01311974 
/tmp/minicluster_661a2967ea5c377265fa9f41c768db21/tm_0/localState (deleted)
java18162 sa_cluster   38r   DIR  253,1 01311979 
/tmp/minicluster_8413f476b77245203ed1ef759eb0d2de/tm_0/localState (deleted)
...
{code}
The code used for reproduction is as follows:

 
{code:java}
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.execution.JobClient;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;
import org.apache.flink.streaming.api.graph.StreamGraph;

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/**
 * javac -cp 'lib/*' TestReleaseFd.java
 * java -Xmx600m -cp '.:lib/*' TestReleaseFd
 */
public class TestReleaseFd {

  public static void main(String[] args) throws Exception {
for (int i = 0; i < 10; ++i) {
  int round = i;
  Thread thread = new Thread(() -> {
try {
  Configuration configuration = new Configuration();
  final StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment(configuration);
  env.setParallelism(1);

  DataStreamSource longDataStreamSource = env.fromSequence(1, 
10);
  longDataStreamSource.addSink(new DiscardingSink<>());

  StreamGraph streamGraph = env.getStreamGraph();
  streamGraph.setJobName("test-" + System.nanoTime());
  JobClient jobClient = env.executeAsync(streamGraph);

  CompletableFuture 
jobExecutionResultCompletableFuture = jobClient.getJobExecutionResult();
  JobExecutionResult jobExecutionResult = null;
  while (jobExecutionResult == null) {
try {
  jobExecutionResult = jobExecutionResultCompletableFuture.get(20, 
TimeUnit.SECONDS);
} catch (TimeoutException timeoutException) {
  // ignore
}
  }
  System.out.println("finished round: " + round);
  env.close();
} catch (Exception e) {
  throw new RuntimeException(e);
}
  });

  thread.setDaemon(true);
  thread.start();
  thread.join();

  System.out.println("done ... " + i);
}

// === lsof -p 18162
Thread.sleep(500_000_000);
  }
}
 {code}
 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-397: Add config options for administrator JVM options

2024-01-03 Thread xiangyu feng
+1 (non-binding)

Regards,
Xiangyu Feng

Rui Fan <1996fan...@gmail.com> 于2024年1月4日周四 13:03写道:

> +1 (binding)
>
> Best,
> Rui
>
> On Thu, Jan 4, 2024 at 11:45 AM Benchao Li  wrote:
>
> > +1 (binding)
> >
> > Zhanghao Chen  于2024年1月4日周四 10:30写道:
> > >
> > > Hi everyone,
> > >
> > > Thanks for all the feedbacks on FLIP-397 [1], which proposes to add a
> > set of default JVM options for administrator use that prepends the
> user-set
> > extra JVM options for easier platform-wide JVM pre-tuning. It has been
> > discussed in [2].
> > >
> > > I'd like to start a vote. The vote will be open for at least 72 hours
> > (until January 8th 12:00 GMT) unless there is an objection or
> insufficient
> > votes.
> > >
> > > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-397%3A+Add+config+options+for+administrator+JVM+options
> > > [2] https://lists.apache.org/thread/cflonyrfd1ftmyrpztzj3ywckbq41jzg
> > >
> > > Best,
> > > Zhanghao Chen
> >
> >
> >
> > --
> >
> > Best,
> > Benchao Li
> >
>


Re: [VOTE] FLIP-397: Add config options for administrator JVM options

2024-01-03 Thread Rui Fan
+1 (binding)

Best,
Rui

On Thu, Jan 4, 2024 at 11:45 AM Benchao Li  wrote:

> +1 (binding)
>
> Zhanghao Chen  于2024年1月4日周四 10:30写道:
> >
> > Hi everyone,
> >
> > Thanks for all the feedbacks on FLIP-397 [1], which proposes to add a
> set of default JVM options for administrator use that prepends the user-set
> extra JVM options for easier platform-wide JVM pre-tuning. It has been
> discussed in [2].
> >
> > I'd like to start a vote. The vote will be open for at least 72 hours
> (until January 8th 12:00 GMT) unless there is an objection or insufficient
> votes.
> >
> > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-397%3A+Add+config+options+for+administrator+JVM+options
> > [2] https://lists.apache.org/thread/cflonyrfd1ftmyrpztzj3ywckbq41jzg
> >
> > Best,
> > Zhanghao Chen
>
>
>
> --
>
> Best,
> Benchao Li
>


Re: [DISCUSS] FLIP-402: Extend ZooKeeper Curator configurations

2024-01-03 Thread Zhanghao Chen
Thanks for driving this. I'm not familiar with the listed advanced Curator 
configs, but the previous added switch for disabling ensemble tracking [1] 
saved us when deploying Flink in a cloud env where ZK can only be accessible 
via URLs. That being said, +1 for the overall idea, these configs may help 
users in certain scenarios sooner or later.

[1] https://issues.apache.org/jira/browse/FLINK-31780

Best,
Zhanghao Chen

From: Alex Nitavsky 
Sent: Thursday, December 14, 2023 21:20
To: dev@flink.apache.org 
Subject: [DISCUSS] FLIP-402: Extend ZooKeeper Curator configurations

Hi all,

I would like to start a discussion thread for: *FLIP-402: Extend ZooKeeper
Curator configurations *[1]

* Problem statement *
Currently Flink misses several Apache Curator configurations, which could
be useful for Flink deployment with ZooKeeper as HA provider.

* Proposed solution *
We have inspected all possible options for Apache Curator and proposed
those which could be valuable for Flink users:

- high-availability.zookeeper.client.authorization [2]
- high-availability.zookeeper.client.maxCloseWaitMs [3]
- high-availability.zookeeper.client.simulatedSessionExpirationPercent [4]

The proposed way is to reflect those properties into Flink configuration
options for Apache ZooKeeper.

Looking forward to your feedback and suggestions.

Kind regards
Oleksandr

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations
[2]
https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#authorization(java.lang.String,byte%5B%5D)
[3]
https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#maxCloseWaitMs(int)
[4]
https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#simulatedSessionExpirationPercent(int)


[jira] [Created] (FLINK-33980) Reorganize job configuration

2024-01-03 Thread Junrui Li (Jira)
Junrui Li created FLINK-33980:
-

 Summary: Reorganize job configuration
 Key: FLINK-33980
 URL: https://issues.apache.org/jira/browse/FLINK-33980
 Project: Flink
  Issue Type: Technical Debt
  Components: API / Core
Reporter: Junrui Li


Currently, job configuration in FLINK is spread out across different 
components, including StreamExecutionEnvironment, CheckpointConfig, and 
ExecutionConfig. This distribution leads to inconsistencies among the 
configurations stored within these components. Furthermore, the methods used to 
configure these components vary; some rely on complex Java objects, while 
others use ConfigOption, which is a key-value configuration approach. This 
variation complicates the effective management of job configurations. 
Additionally, passing complex Java objects (e.g., StateBackend and 
CheckpointStorage) between the environment, StreamGraph, and JobGraph adds 
complexity to development.

With the completion of FLIP-381, it is now time to standardize and unify job 
configuration in FLINK. The goals of this JIRA are as follows:
 # Migrate configuration from non-ConfigOption objects to use ConfigOption.
 # Adopt a single Configuration object to house all configurations.
 # Create complex Java objects, such as RestartBackoffTimeStrategyFactory, 
CheckpointStorage, and StateBackend, directly from the configuration on the JM 
side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-397: Add config options for administrator JVM options

2024-01-03 Thread Benchao Li
+1 (binding)

Zhanghao Chen  于2024年1月4日周四 10:30写道:
>
> Hi everyone,
>
> Thanks for all the feedbacks on FLIP-397 [1], which proposes to add a set of 
> default JVM options for administrator use that prepends the user-set extra 
> JVM options for easier platform-wide JVM pre-tuning. It has been discussed in 
> [2].
>
> I'd like to start a vote. The vote will be open for at least 72 hours (until 
> January 8th 12:00 GMT) unless there is an objection or insufficient votes.
>
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-397%3A+Add+config+options+for+administrator+JVM+options
> [2] https://lists.apache.org/thread/cflonyrfd1ftmyrpztzj3ywckbq41jzg
>
> Best,
> Zhanghao Chen



-- 

Best,
Benchao Li


[VOTE] FLIP-397: Add config options for administrator JVM options

2024-01-03 Thread Zhanghao Chen
Hi everyone,

Thanks for all the feedbacks on FLIP-397 [1], which proposes to add a set of 
default JVM options for administrator use that prepends the user-set extra JVM 
options for easier platform-wide JVM pre-tuning. It has been discussed in [2].

I'd like to start a vote. The vote will be open for at least 72 hours (until 
January 8th 12:00 GMT) unless there is an objection or insufficient votes.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-397%3A+Add+config+options+for+administrator+JVM+options
[2] https://lists.apache.org/thread/cflonyrfd1ftmyrpztzj3ywckbq41jzg

Best,
Zhanghao Chen


Re: FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread Jane Chan
Congratulations, Alex!

Best,
Jane

On Thu, Jan 4, 2024 at 10:03 AM Junrui Lee  wrote:

> Congratulations, Alex!
>
> Best,
> Junrui
>
> weijie guo  于2024年1月4日周四 09:57写道:
>
> > Congratulations, Alex!
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > Steven Wu  于2024年1月4日周四 02:07写道:
> >
> > > Congra, Alex! Well deserved!
> > >
> > > On Wed, Jan 3, 2024 at 2:31 AM David Radley 
> > > wrote:
> > >
> > > > Sorry for my typo.
> > > >
> > > > Many congratulations Alex!
> > > >
> > > > From: David Radley 
> > > > Date: Wednesday, 3 January 2024 at 10:23
> > > > To: David Anderson 
> > > > Cc: dev@flink.apache.org 
> > > > Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer -
> > Alexander
> > > > Fedulov
> > > > Many Congratulations David .
> > > >
> > > > From: Maximilian Michels 
> > > > Date: Tuesday, 2 January 2024 at 12:16
> > > > To: dev 
> > > > Cc: Alexander Fedulov 
> > > > Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> > > > Fedulov
> > > > Happy New Year everyone,
> > > >
> > > > I'd like to start the year off by announcing Alexander Fedulov as a
> > > > new Flink committer.
> > > >
> > > > Alex has been active in the Flink community since 2019. He has
> > > > contributed more than 100 commits to Flink, its Kubernetes operator,
> > > > and various connectors [1][2].
> > > >
> > > > Especially noteworthy are his contributions on deprecating and
> > > > migrating the old Source API functions and test harnesses, the
> > > > enhancement to flame graphs, the dynamic rescale time computation in
> > > > Flink Autoscaling, as well as all the small enhancements Alex has
> > > > contributed which make a huge difference.
> > > >
> > > > Beyond code contributions, Alex has been an active community member
> > > > with his activity on the mailing lists [3][4], as well as various
> > > > talks and blog posts about Apache Flink [5][6].
> > > >
> > > > Congratulations Alex! The Flink community is proud to have you.
> > > >
> > > > Best,
> > > > The Flink PMC
> > > >
> > > > [1]
> > > >
> > https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> > > > [2]
> > > >
> > >
> >
> https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> > > > [3]
> > https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> > > > [4]
> > https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> > > > [5]
> > > >
> > >
> >
> https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> > > > [6]
> > > >
> > >
> >
> https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
> > > >
> > > > Unless otherwise stated above:
> > > >
> > > > IBM United Kingdom Limited
> > > > Registered in England and Wales with number 741598
> > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
> 3AU
> > > >
> > >
> >
>


Re: [DISCUSS][FLINK-31830] Align the Nullability Handling of ROW between SQL and TableAPI

2024-01-03 Thread Jane Chan
Hi Timo,

Thanks for your valuable feedback.

How about we work together on this topic and create a FLIP for this? We
> need more examples in a unified document. Currently, the proposal is split
> across multiple Flink and Calcite JIRA issues and a ML discussion.


That sounds like a great idea. A FLIP would provide a more precise and
better-organized document, and let's fix it together.

Towards several points you mentioned, here are my cents

RelDataType is similarly just a type declaration. Please correct me if
> I'm wrong but RelDataType itself also allows `ROW NOT
> NULL`.
>

In the context of RelDataType, `ROW NOT NULL` is a
legitimate type specification. However, I presume that the type you intend
to refer to is `ROW`.

Theoretically speaking, the answer is no and yes.
**NO** It would not be able to create a type like `RecordType(INTEGER NOT
NULL f0by using Calcite fluent API[1]. If the record nullability is true,
then the inner field's nullability is set to true implicitly.
**YES** It's conceptually viable to create a type like `RecordType(INTEGER
NOT NULL f0)` by directly calling the constructor of RelRecordType.
Nevertheless, the JavaDoc constructor summary[2] emphasizes that
ctor#(StructKind, List, boolean) should only be called
from a factory method.

The following code snippet demonstrates the difference at the API level.

@Test
void testRelRecordType() {
  // create an inner type INT NOT NULL
  RelDataType innerFieldType =
  new BasicSqlType(RelDataTypeSystem.DEFAULT, SqlTypeName.INTEGER);
  RelDataTypeField f0 = new RelDataTypeFieldImpl("f0", 0, innerFieldType);

  // Directly call RelRecordType ctor to specify the record level
nullability

// However, in practice, users are not recommended to do so
  RelDataType r1 =
  new RelRecordType(StructKind.FULLY_QUALIFIED,
Collections.singletonList(f0), true);
  // This will print r1: RecordType(INTEGER NOT NULL f0)
  System.out.println("r1: " + r1.getFullTypeString());

  // Use Calcite fluent API
  RelDataTypeFactory calciteTypeFactory = new
SqlTypeFactoryImpl(RelDataTypeSystem.DEFAULT);

  RelDataType r2 =
  new RelDataTypeFactory.Builder(calciteTypeFactory)
  .add(f0)
  .nullable(false) // field nullability will be overridden
by record nullability
  .nullableRecord(true)
  .build();
  // This will print r2: RecordType(INTEGER f0)
  System.out.println("r2: " + r2.getFullTypeString());

  // NOTE: replace type factory with flinkTypeFactory also get
RecordType(INTEGER f0)
  FlinkTypeFactory flinkTypeFactory =
  ((PlannerBase) (((TableEnvironmentImpl)
tableEnv).getPlanner())).getTypeFactory();
}


It's the factory or optimizer that performs necessary changes.
> - It's up to the framework (i.e. planner or Table API) to decide what to
> do with these declarations.
>

You're correct; theoretically, the responsibility for type declaration
resides with the optimizer and framework. However, Flink allows users to
create types like `ROW` through the public API, like
`DataTypes.ROW(DataTypes.FIELD("f0", DataTypes.INT.notNull())).nullable()`.
In contrast, Calcite restricts such actions(suppose they follow the best
practice and use fluent API). Perhaps we can take a reference from
Calcite's RelDataTypeFactory.Builder to align the behavior of table API to
SQL.


> MyPojo can be nullable, but i cannot. This is the reason why we decided
> to introduce the current behavior. Complex structs are usually generated
> from Table API or from the catalog (e.g. when mapping to schema registry
> or some other external system). It could lead to other downstream
> inconsistencies if we change the method above.
>

Correct me if I'm mistaken, but if `MyPojo myPojo = null`, we cannot
conclude that `myPojo.i` is not null because an NPE will be thrown. And we
can only safely say `myPojo.i` should not be null only if `myPojo != null`
is true. Similarly, in SQL, null or NULL is a special marker used to
indicate that a data value does not exist in the database. If the record
itself is absent(nullable), then declaring the inner field as not
null(present) may not make much sense.

[1]
https://github.com/apache/calcite/blob/8d9b27f1ace7f975407920cb88806715b1f0ef82/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeFactory.java#L615
[2]
https://calcite.apache.org/javadocAggregate/org/apache/calcite/rel/type/RelRecordType.html#%3Cinit%3E(org.apache.calcite.rel.type.StructKind,java.util.List,boolean)

Best,
Jane


On Tue, Jan 2, 2024 at 6:26 PM Timo Walther  wrote:

> Hi Jane,
>
> thanks for the heavy investigation and extensive summaries. I'm sorry
> that I ignored this discussion for too long but would like to help in
> shaping a sustainable long-term solution.
>
> I fear that changing:
> - RowType#copy()
> - RowType's constructor
> - FieldsDataType#nullable()
> will not solve all transitive issues.
>
> We should approach the problem from a different perspective. In my point
> of view:

Re: FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread Junrui Lee
Congratulations, Alex!

Best,
Junrui

weijie guo  于2024年1月4日周四 09:57写道:

> Congratulations, Alex!
>
> Best regards,
>
> Weijie
>
>
> Steven Wu  于2024年1月4日周四 02:07写道:
>
> > Congra, Alex! Well deserved!
> >
> > On Wed, Jan 3, 2024 at 2:31 AM David Radley 
> > wrote:
> >
> > > Sorry for my typo.
> > >
> > > Many congratulations Alex!
> > >
> > > From: David Radley 
> > > Date: Wednesday, 3 January 2024 at 10:23
> > > To: David Anderson 
> > > Cc: dev@flink.apache.org 
> > > Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer -
> Alexander
> > > Fedulov
> > > Many Congratulations David .
> > >
> > > From: Maximilian Michels 
> > > Date: Tuesday, 2 January 2024 at 12:16
> > > To: dev 
> > > Cc: Alexander Fedulov 
> > > Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> > > Fedulov
> > > Happy New Year everyone,
> > >
> > > I'd like to start the year off by announcing Alexander Fedulov as a
> > > new Flink committer.
> > >
> > > Alex has been active in the Flink community since 2019. He has
> > > contributed more than 100 commits to Flink, its Kubernetes operator,
> > > and various connectors [1][2].
> > >
> > > Especially noteworthy are his contributions on deprecating and
> > > migrating the old Source API functions and test harnesses, the
> > > enhancement to flame graphs, the dynamic rescale time computation in
> > > Flink Autoscaling, as well as all the small enhancements Alex has
> > > contributed which make a huge difference.
> > >
> > > Beyond code contributions, Alex has been an active community member
> > > with his activity on the mailing lists [3][4], as well as various
> > > talks and blog posts about Apache Flink [5][6].
> > >
> > > Congratulations Alex! The Flink community is proud to have you.
> > >
> > > Best,
> > > The Flink PMC
> > >
> > > [1]
> > >
> https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> > > [2]
> > >
> >
> https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> > > [3]
> https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> > > [4]
> https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> > > [5]
> > >
> >
> https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> > > [6]
> > >
> >
> https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
> > >
> > > Unless otherwise stated above:
> > >
> > > IBM United Kingdom Limited
> > > Registered in England and Wales with number 741598
> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
> > >
> >
>


Re: FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread weijie guo
Congratulations, Alex!

Best regards,

Weijie


Steven Wu  于2024年1月4日周四 02:07写道:

> Congra, Alex! Well deserved!
>
> On Wed, Jan 3, 2024 at 2:31 AM David Radley 
> wrote:
>
> > Sorry for my typo.
> >
> > Many congratulations Alex!
> >
> > From: David Radley 
> > Date: Wednesday, 3 January 2024 at 10:23
> > To: David Anderson 
> > Cc: dev@flink.apache.org 
> > Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> > Fedulov
> > Many Congratulations David .
> >
> > From: Maximilian Michels 
> > Date: Tuesday, 2 January 2024 at 12:16
> > To: dev 
> > Cc: Alexander Fedulov 
> > Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> > Fedulov
> > Happy New Year everyone,
> >
> > I'd like to start the year off by announcing Alexander Fedulov as a
> > new Flink committer.
> >
> > Alex has been active in the Flink community since 2019. He has
> > contributed more than 100 commits to Flink, its Kubernetes operator,
> > and various connectors [1][2].
> >
> > Especially noteworthy are his contributions on deprecating and
> > migrating the old Source API functions and test harnesses, the
> > enhancement to flame graphs, the dynamic rescale time computation in
> > Flink Autoscaling, as well as all the small enhancements Alex has
> > contributed which make a huge difference.
> >
> > Beyond code contributions, Alex has been an active community member
> > with his activity on the mailing lists [3][4], as well as various
> > talks and blog posts about Apache Flink [5][6].
> >
> > Congratulations Alex! The Flink community is proud to have you.
> >
> > Best,
> > The Flink PMC
> >
> > [1]
> > https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> > [2]
> >
> https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> > [3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> > [4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> > [5]
> >
> https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> > [6]
> >
> https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
> >
> > Unless otherwise stated above:
> >
> > IBM United Kingdom Limited
> > Registered in England and Wales with number 741598
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
> >
>


[jira] [Created] (FLINK-33979) Implement restore tests for TableSink node

2024-01-03 Thread Bonnie Varghese (Jira)
Bonnie Varghese created FLINK-33979:
---

 Summary: Implement restore tests for TableSink node
 Key: FLINK-33979
 URL: https://issues.apache.org/jira/browse/FLINK-33979
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: Bonnie Varghese
Assignee: Bonnie Varghese






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33978) FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2024-01-03 Thread Alan Sheinberg (Jira)
Alan Sheinberg created FLINK-33978:
--

 Summary: FLIP-400: AsyncScalarFunction for asynchronous scalar 
function support
 Key: FLINK-33978
 URL: https://issues.apache.org/jira/browse/FLINK-33978
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / Planner, Table SQL / Runtime
Reporter: Alan Sheinberg


https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[RESULT][VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2024-01-03 Thread Alan Sheinberg
The vote for FLIP-400: AsyncScalarFunction for asynchronous scalar function
support [1] has concluded (discussion thread [2]). The vote will be closed
[3].

There were 7 approving votes, 4 binding and 3 non-binding, and there were
no disapprovals:

- Martijn Visser  (binding)
- Lincoln Lee (binding)
- Timo Walther  (binding)
- Piotr Nowojski (binding)
- Natea Eshetu Beshada (non-binding)
- Jim Hughes (non-binding)
- Yuepeng Pan (non-binding)

Therefore, FLIP-400 was accepted. Thanks for all of the feedback and
discussion!

-Alan

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support

[2] https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2
[3] https://lists.apache.org/thread/dt4tnwk8hcfj0sp3l3qwloqlljog8xm7


Re: [VOTE] FLIP-400: AsyncScalarFunction for asynchronous scalar function support

2024-01-03 Thread Alan Sheinberg
Thank you everyone for participating in the vote. I'm closing this vote now
and will announce the results in a separate thread.

-Alan

On Tue, Jan 2, 2024 at 8:40 AM Piotr Nowojski  wrote:

> +1 (binding)
>
> Best,
> Piotrek
>
> czw., 28 gru 2023 o 09:19 Timo Walther  napisał(a):
>
> > +1 (binding)
> >
> > Cheers,
> > Timo
> >
> > > Am 28.12.2023 um 03:13 schrieb Yuepeng Pan :
> > >
> > > +1 (non-binding).
> > >
> > > Best,
> > > Yuepeng Pan.
> > >
> > >
> > >
> > >
> > > At 2023-12-28 09:19:37, "Lincoln Lee"  wrote:
> > >> +1 (binding)
> > >>
> > >> Best,
> > >> Lincoln Lee
> > >>
> > >>
> > >> Martijn Visser  于2023年12月27日周三 23:16写道:
> > >>
> > >>> +1 (binding)
> > >>>
> > >>> On Fri, Dec 22, 2023 at 1:44 AM Jim Hughes
> > 
> > >>> wrote:
> > 
> >  Hi Alan,
> > 
> >  +1 (non binding)
> > 
> >  Cheers,
> > 
> >  Jim
> > 
> >  On Wed, Dec 20, 2023 at 2:41 PM Alan Sheinberg
> >   wrote:
> > 
> > > Hi everyone,
> > >
> > > I'd like to start a vote on FLIP-400 [1]. It covers introducing a
> new
> > >>> UDF
> > > type, AsyncScalarFunction for completing invocations
> asynchronously.
> > >>> It
> > > has been discussed in this thread [2].
> > >
> > > I would like to start a vote.  The vote will be open for at least
> 72
> > >>> hours
> > > (until December 28th 18:00 GMT) unless there is an objection or
> > > insufficient votes.
> > >
> > > [1]
> > >
> > >
> > >>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-400%3A+AsyncScalarFunction+for+asynchronous+scalar+function+support
> > > [2]
> https://lists.apache.org/thread/q3st6t1w05grd7bthzfjtr4r54fv4tm2
> > >
> > > Thanks,
> > > Alan
> > >
> > >>>
> >
> >
>


Re: FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread Steven Wu
Congra, Alex! Well deserved!

On Wed, Jan 3, 2024 at 2:31 AM David Radley  wrote:

> Sorry for my typo.
>
> Many congratulations Alex!
>
> From: David Radley 
> Date: Wednesday, 3 January 2024 at 10:23
> To: David Anderson 
> Cc: dev@flink.apache.org 
> Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> Fedulov
> Many Congratulations David .
>
> From: Maximilian Michels 
> Date: Tuesday, 2 January 2024 at 12:16
> To: dev 
> Cc: Alexander Fedulov 
> Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander
> Fedulov
> Happy New Year everyone,
>
> I'd like to start the year off by announcing Alexander Fedulov as a
> new Flink committer.
>
> Alex has been active in the Flink community since 2019. He has
> contributed more than 100 commits to Flink, its Kubernetes operator,
> and various connectors [1][2].
>
> Especially noteworthy are his contributions on deprecating and
> migrating the old Source API functions and test harnesses, the
> enhancement to flame graphs, the dynamic rescale time computation in
> Flink Autoscaling, as well as all the small enhancements Alex has
> contributed which make a huge difference.
>
> Beyond code contributions, Alex has been an active community member
> with his activity on the mailing lists [3][4], as well as various
> talks and blog posts about Apache Flink [5][6].
>
> Congratulations Alex! The Flink community is proud to have you.
>
> Best,
> The Flink PMC
>
> [1]
> https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
> [2]
> https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
> [3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
> [4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
> [5]
> https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
> [6]
> https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


Re: [DISCUSS] FLIP-402: Extend ZooKeeper Curator configurations

2024-01-03 Thread Martijn Visser
Hi Oleksandr,

You're right, I was misreading the POM files because Curator is coming
in via Flink-Shaded and not directly. So the POM files only contain
references directly to Curator as test dependencies.
Then I'll defer to others with experience on this topic to chime in.

Cheers,

Martijn

On Wed, Jan 3, 2024 at 4:36 PM Alex Nitavsky  wrote:
>
> Hello Martjin,
>
> Thank you for reviewing the document.
>
> I'm under the impression that Flink utilizes Apache Curator to handle 
> high-availability leader elections for Job Managers with Zookeeper Services 
> [0].
> In the most recent master version, it seems that Curator is instantiated at 
> [1] ZooKeeperUtils, and the purpose of invoking this code for non-testing 
> purposes is [2] and [3]. As these parameters are passed through the Builder 
> pattern, I find myself unsure about how to convey them through the 
> flink-conf.yaml file.
>
> Your insights and guidance on this matter would be greatly appreciated.
>
> Kind regards
> Oleksandr
>
> [0] 
> https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/ha/zookeeper_ha/
> [1] 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/util/ZooKeeperUtils.java#L177
> [2] 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServicesUtils.java#L59
> [3] 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServicesUtils.java#L86
>
> On Thu, Dec 28, 2023 at 1:07 PM Martijn Visser  
> wrote:
>>
>> Hi Oleksandr,
>>
>> The FLIP talks about Curator, but outside of flink test utils, the
>> usage of Curator is only for test purposes. I don't think there's
>> anything preventing you right now from providing these additional
>> parameters as values in the flink-conf.yaml ?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Thu, Dec 14, 2023 at 2:21 PM Alex Nitavsky  wrote:
>> >
>> > Hi all,
>> >
>> > I would like to start a discussion thread for: *FLIP-402: Extend ZooKeeper
>> > Curator configurations *[1]
>> >
>> > * Problem statement *
>> > Currently Flink misses several Apache Curator configurations, which could
>> > be useful for Flink deployment with ZooKeeper as HA provider.
>> >
>> > * Proposed solution *
>> > We have inspected all possible options for Apache Curator and proposed
>> > those which could be valuable for Flink users:
>> >
>> > - high-availability.zookeeper.client.authorization [2]
>> > - high-availability.zookeeper.client.maxCloseWaitMs [3]
>> > - high-availability.zookeeper.client.simulatedSessionExpirationPercent [4]
>> >
>> > The proposed way is to reflect those properties into Flink configuration
>> > options for Apache ZooKeeper.
>> >
>> > Looking forward to your feedback and suggestions.
>> >
>> > Kind regards
>> > Oleksandr
>> >
>> > [1]
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations
>> > [2]
>> > https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#authorization(java.lang.String,byte%5B%5D)
>> > [3]
>> > https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#maxCloseWaitMs(int)
>> > [4]
>> > https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#simulatedSessionExpirationPercent(int)


Re: [DISCUSS] FLIP-402: Extend ZooKeeper Curator configurations

2024-01-03 Thread Alex Nitavsky
Hello Martjin,

Thank you for reviewing the document.

I'm under the impression that Flink utilizes Apache Curator to handle
high-availability leader elections for Job Managers with Zookeeper Services
[0].
In the most recent master version, it seems that Curator is instantiated at
[1] ZooKeeperUtils, and the purpose of invoking this code for non-testing
purposes is [2] and [3]. As these parameters are passed through the Builder
pattern, I find myself unsure about how to convey them through the
flink-conf.yaml file.

Your insights and guidance on this matter would be greatly appreciated.

Kind regards
Oleksandr

[0]
https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/deployment/ha/zookeeper_ha/
[1]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/util/ZooKeeperUtils.java#L177
[2]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServicesUtils.java#L59
[3]
https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/highavailability/HighAvailabilityServicesUtils.java#L86

On Thu, Dec 28, 2023 at 1:07 PM Martijn Visser 
wrote:

> Hi Oleksandr,
>
> The FLIP talks about Curator, but outside of flink test utils, the
> usage of Curator is only for test purposes. I don't think there's
> anything preventing you right now from providing these additional
> parameters as values in the flink-conf.yaml ?
>
> Best regards,
>
> Martijn
>
> On Thu, Dec 14, 2023 at 2:21 PM Alex Nitavsky 
> wrote:
> >
> > Hi all,
> >
> > I would like to start a discussion thread for: *FLIP-402: Extend
> ZooKeeper
> > Curator configurations *[1]
> >
> > * Problem statement *
> > Currently Flink misses several Apache Curator configurations, which could
> > be useful for Flink deployment with ZooKeeper as HA provider.
> >
> > * Proposed solution *
> > We have inspected all possible options for Apache Curator and proposed
> > those which could be valuable for Flink users:
> >
> > - high-availability.zookeeper.client.authorization [2]
> > - high-availability.zookeeper.client.maxCloseWaitMs [3]
> > - high-availability.zookeeper.client.simulatedSessionExpirationPercent
> [4]
> >
> > The proposed way is to reflect those properties into Flink configuration
> > options for Apache ZooKeeper.
> >
> > Looking forward to your feedback and suggestions.
> >
> > Kind regards
> > Oleksandr
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations
> > [2]
> >
> https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#authorization(java.lang.String,byte%5B%5D)
> > [3]
> >
> https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#maxCloseWaitMs(int)
> > [4]
> >
> https://curator.apache.org/apidocs/org/apache/curator/framework/CuratorFrameworkFactory.Builder.html#simulatedSessionExpirationPercent(int)
>


[jira] [Created] (FLINK-33977) Adaptive scheduler may not minimize the number of TMs during downscaling

2024-01-03 Thread Zhanghao Chen (Jira)
Zhanghao Chen created FLINK-33977:
-

 Summary: Adaptive scheduler may not minimize the number of TMs 
during downscaling
 Key: FLINK-33977
 URL: https://issues.apache.org/jira/browse/FLINK-33977
 Project: Flink
  Issue Type: Improvement
Affects Versions: 1.18.0
Reporter: Zhanghao Chen


Adaptive Scheduler uses SlotAssigner to assign free slots to slot sharing 
groups. Currently, there're two implementations of SlotAssigner available: the 
DefaultSlotAssigner that treats all slots and slot sharing groups equally and 
the {color:#172b4d}StateLocalitySlotAssigner{color} that assigns slots based on 
the number of local key groups to utilize local state recovery. The scheduler 
will use the DefaultSlotAssigner when no key group assignment info is available 
and use the StateLocalitySlotAssigner otherwise.
 
However, none of the SlotAssigner targets at minimizing the number of TMs, 
which may produce suboptimal slot assignment under the Application Mode. For 
example, when a job with 8 slot sharing groups and 2 TMs (each 4 slots) is 
downscaled through the FLIP-291 API to have 4 slot sharing groups instead, the 
cluster may still have 2 TMs, one with 1 free slot, and the other with 3 free 
slots. For end-users, this implies an ineffective downscaling as the total 
cluster resources are not reduced.
 
We should take minimizing number of TMs into consideration as well. A possible 
approach is to enhance the {color:#172b4d}StateLocalitySlotAssigner: when the 
number of free slots exceeds need, sort all the TMs by a score summing from the 
allocation scores of all slots on it, remove slots from the excessive TMs with 
the lowest score and proceed the remaining slot assignment.{color}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33976) AdaptiveScheduler cooldown period is taken from a wrong configuration

2024-01-03 Thread Jira
David Morávek created FLINK-33976:
-

 Summary: AdaptiveScheduler cooldown period is taken from a wrong 
configuration
 Key: FLINK-33976
 URL: https://issues.apache.org/jira/browse/FLINK-33976
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Configuration, Runtime / Coordination
Reporter: David Morávek


The new JobManager options introduced in FLINK-21883: 
`scaling-interval.\{min,max}` of AdaptiveScheduler are resolved from the 
per-Job configuration instead of JobManager's configuration, which is not 
correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33975) Tests for the new Sink V2 transformations

2024-01-03 Thread Peter Vary (Jira)
Peter Vary created FLINK-33975:
--

 Summary: Tests for the new Sink V2 transformations
 Key: FLINK-33975
 URL: https://issues.apache.org/jira/browse/FLINK-33975
 Project: Flink
  Issue Type: Sub-task
Reporter: Peter Vary


Create new tests for the SinkV2 api transformations, and migrate some of the 
tests to use the new API. Some of the old test should be kept using the old API 
to make sure that the backward compatibility is tested until the deprecation.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33974) Implement the Sink transformation depending on the new SinkV2 interfaces

2024-01-03 Thread Peter Vary (Jira)
Peter Vary created FLINK-33974:
--

 Summary: Implement the Sink transformation depending on the new 
SinkV2 interfaces
 Key: FLINK-33974
 URL: https://issues.apache.org/jira/browse/FLINK-33974
 Project: Flink
  Issue Type: Sub-task
Reporter: Peter Vary


Implement the changes to the Sink transformation which should depend only on 
the new API interfaces. The tests should remain the same, to ensure backward 
compatibility.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33973) Add new interfaces for SinkV2 to synchronize the API with the SourceV2 API

2024-01-03 Thread Peter Vary (Jira)
Peter Vary created FLINK-33973:
--

 Summary: Add new interfaces for SinkV2 to synchronize the API with 
the SourceV2 API
 Key: FLINK-33973
 URL: https://issues.apache.org/jira/browse/FLINK-33973
 Project: Flink
  Issue Type: Sub-task
Reporter: Peter Vary


Create the new interfaces, set inheritance and deprecation to finalize the 
interface.
After this change the new interafaces will exits, but they will not be 
functional.

The existing interfaces, and test should be working without issue, to verify 
that adding the API will be backward compatible.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33972) Enhance and synchronize Sink API to match the Source API

2024-01-03 Thread Peter Vary (Jira)
Peter Vary created FLINK-33972:
--

 Summary: Enhance and synchronize Sink API to match the Source API
 Key: FLINK-33972
 URL: https://issues.apache.org/jira/browse/FLINK-33972
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream
Reporter: Peter Vary


Umbrella jira for the implementation of 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [RESULT][VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2024-01-03 Thread Martijn Visser
Thank you for driving this!

On Wed, Jan 3, 2024 at 1:02 PM Péter Váry  wrote:
>
> Hi All,
>
> `FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the
> type of the Committable` [1], which has been renamed to `FLIP-372: Enhance
> and synchronize Sink API to match the Source API` has been accepted and
> voted through this thread [2].
>
> The proposal received 6 binding approval votes and there is no disapproval:
> - Márton Balassi (binding)
> - Gyula Fora (binding)
> - Gabor Somogyi (binding)
> - Yanquan Lv (non-binding)
> - Jiabao Sun (non-binding)
> - Hang Ruan (non-binding)
> - Leonard Xu (binding)
> - Martijn Visser (binding)
> - Tzu-Li (Gordon) Tai (binding)
>
> Thanks to all involved!
>
> BR,
> G
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> [2] https://lists.apache.org/thread/zjkss3j9d291ldvynspotzjlsl3tmdkl


[VOTE] FLIP-389: Annotate SingleThreadFetcherManager as PublicEvolving

2024-01-03 Thread Hongshun Wang
Dear Flink Developers,

Thank you for providing feedback on FLIP-389: Annotate
SingleThreadFetcherManager as PublicEvolving[1] on the discussion
thread[2]. The goal of the FLIP is as follows:

   - To expose the SplitFetcherManager / SingleThreadFetcheManager as
   Public, allowing connector developers to easily create their own threading
   models in the SourceReaderBase by implementing addSplits(), removeSplits(),
   maybeShutdownFinishedFetchers() and other functions.
   - To hide the element queue from the connector developers and simplify
   the SourceReaderBase to consist of only SplitFetcherManager and
   RecordEmitter as major components.


Any additional questions regarding this FLIP? Looking forward to hearing
from you.


Thanks,
Hongshun Wang


[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465498

[2] https://lists.apache.org/thread/b8f509878ptwl3kmmgg95tv8sb1j5987


[RESULT][VOTE] FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the type of the Committable

2024-01-03 Thread Péter Váry
Hi All,

`FLIP-372: Allow TwoPhaseCommittingSink WithPreCommitTopology to alter the
type of the Committable` [1], which has been renamed to `FLIP-372: Enhance
and synchronize Sink API to match the Source API` has been accepted and
voted through this thread [2].

The proposal received 6 binding approval votes and there is no disapproval:
- Márton Balassi (binding)
- Gyula Fora (binding)
- Gabor Somogyi (binding)
- Yanquan Lv (non-binding)
- Jiabao Sun (non-binding)
- Hang Ruan (non-binding)
- Leonard Xu (binding)
- Martijn Visser (binding)
- Tzu-Li (Gordon) Tai (binding)

Thanks to all involved!

BR,
G

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
[2] https://lists.apache.org/thread/zjkss3j9d291ldvynspotzjlsl3tmdkl


FW: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread David Radley
Sorry for my typo.

Many congratulations Alex!

From: David Radley 
Date: Wednesday, 3 January 2024 at 10:23
To: David Anderson 
Cc: dev@flink.apache.org 
Subject: Re: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander 
Fedulov
Many Congratulations David .

From: Maximilian Michels 
Date: Tuesday, 2 January 2024 at 12:16
To: dev 
Cc: Alexander Fedulov 
Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov
Happy New Year everyone,

I'd like to start the year off by announcing Alexander Fedulov as a
new Flink committer.

Alex has been active in the Flink community since 2019. He has
contributed more than 100 commits to Flink, its Kubernetes operator,
and various connectors [1][2].

Especially noteworthy are his contributions on deprecating and
migrating the old Source API functions and test harnesses, the
enhancement to flame graphs, the dynamic rescale time computation in
Flink Autoscaling, as well as all the small enhancements Alex has
contributed which make a huge difference.

Beyond code contributions, Alex has been an active community member
with his activity on the mailing lists [3][4], as well as various
talks and blog posts about Apache Flink [5][6].

Congratulations Alex! The Flink community is proud to have you.

Best,
The Flink PMC

[1] https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
[2] 
https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
[3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
[4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
[5] 
https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
[6] 
https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov

2024-01-03 Thread David Radley
Many Congratulations David .

From: Maximilian Michels 
Date: Tuesday, 2 January 2024 at 12:16
To: dev 
Cc: Alexander Fedulov 
Subject: [EXTERNAL] [ANNOUNCE] New Apache Flink Committer - Alexander Fedulov
Happy New Year everyone,

I'd like to start the year off by announcing Alexander Fedulov as a
new Flink committer.

Alex has been active in the Flink community since 2019. He has
contributed more than 100 commits to Flink, its Kubernetes operator,
and various connectors [1][2].

Especially noteworthy are his contributions on deprecating and
migrating the old Source API functions and test harnesses, the
enhancement to flame graphs, the dynamic rescale time computation in
Flink Autoscaling, as well as all the small enhancements Alex has
contributed which make a huge difference.

Beyond code contributions, Alex has been an active community member
with his activity on the mailing lists [3][4], as well as various
talks and blog posts about Apache Flink [5][6].

Congratulations Alex! The Flink community is proud to have you.

Best,
The Flink PMC

[1] https://github.com/search?type=commits&q=author%3Aafedulov+org%3Aapache
[2] 
https://issues.apache.org/jira/browse/FLINK-28229?jql=status%20in%20(Resolved%2C%20Closed)%20AND%20assignee%20in%20(afedulov)%20ORDER%20BY%20resolved%20DESC%2C%20created%20DESC
[3] https://lists.apache.org/list?dev@flink.apache.org:lte=100M:Fedulov
[4] https://lists.apache.org/list?u...@flink.apache.org:lte=100M:Fedulov
[5] 
https://flink.apache.org/2020/01/15/advanced-flink-application-patterns-vol.1-case-study-of-a-fraud-detection-system/
[6] 
https://www.ververica.com/blog/presenting-our-streaming-concepts-introduction-to-flink-video-series

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [Discuss] FLIP-407: Improve Flink Client performance in interactive scenarios

2024-01-03 Thread Zhanghao Chen
Thanks for driving this effort on improving the interactive use experience of 
Flink. The proposal overall looks good to me.

Best,
Zhanghao Chen

From: xiangyu feng 
Sent: Tuesday, December 26, 2023 16:51
To: dev@flink.apache.org 
Subject: [Discuss] FLIP-407: Improve Flink Client performance in interactive 
scenarios

Hi devs,

I'm opening this thread to discuss FLIP-407: Improve Flink Client
performance in interactive scenarios. The POC test results and design doc
can be found at: FLIP-407

.

Currently, Flink Client is mainly designed for one time interaction with
the Flink Cluster. All the resources(http connections, threads, ha
services) and instances(ClusterDescriptor, ClusterClient, RestClient) are
created and recycled for each interaction. This works well when users do
not need to interact frequently with Flink Cluster and also saves resource
usage since resources are recycled immediately after each usage.

However, in OLAP or StreamingWarehouse scenarios, users might submit
interactive jobs to a dedicated Flink Session Cluster very often. In this
case, we find that for short queries that can finish in less than 1s in
Flink Cluster will still have E2E latency greater than 2s. Hence, we
propose this FLIP to improve the Flink Client performance in this scenario.
This could also improve the user experience when using session debug mode.

The major change in this FLIP is that there will be a new introduced option
*'execution.interactive-client'*. When this option is enabled, Flink
Client will reuse all the necessary resources to improve interactive
performance, including: HA Services, HTTP connections, threads and all
kinds of instances related to a long-running Flink Cluster. The default
value of this option will be false, then Flink Client will behave as before.

Also, this FLIP proposed a configurable RetryStrategy when fetching results
from client-side to Flink Cluster. In interactive scenarios, this can save
more than 15% of TM CPU usage without performance degradation.

Looking forward to your feedback, thanks.

Best regards,
Xiangyu