[jira] [Created] (FLINK-35701) SqlServer: when the primary key type is uniqueidentifier, the scan.incremental.snapshot.chunk.size parameter does not take effect during chunk splitting

2024-06-25 Thread yangxiao (Jira)
yangxiao created FLINK-35701:


 Summary: SqlServer: when the primary key type is uniqueidentifier, the 
scan.incremental.snapshot.chunk.size parameter does not take effect during 
chunk splitting
 Key: FLINK-35701
 URL: https://issues.apache.org/jira/browse/FLINK-35701
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: yangxiao


1. The source table in the SQL Server database contains roughly 1,000,000 inventory 
data records, and scan.incremental.snapshot.chunk.size is left at its default value of 8096.
2. The table is split into only one chunk, although it should be split into about 124 
chunks (1,000,000 / 8096 ≈ 123.5).
 
Problem reproduction:
1. Create a test table in the SQL Server and import data.
 
BEGIN TRANSACTION
USE [testdb];
DROP TABLE [dbo].[testtable];
CREATE TABLE [dbo].[testtable] (
  [TestId] varchar(64),
  [CustomerId] varchar(64),
  [Id] uniqueidentifier NOT NULL,
PRIMARY KEY CLUSTERED ([Id])
);
ALTER TABLE [dbo].[testtable] SET (LOCK_ESCALATION = TABLE);
COMMIT
 
 
declare @Id int;
set @Id = 1;
while @Id <= 100
begin
    insert into testtable values (NEWID(), NEWID(), NEWID());
    set @Id = @Id + 1;
end;
 
2. Use the Flink CDC SQL Server connector to capture the data.
CREATE TABLE testtable (
  TestId STRING,
  CustomerId STRING,
  Id STRING,
  PRIMARY KEY (Id) NOT ENFORCED
) WITH (
  'connector' = 'sqlserver-cdc',
  'hostname' = '',
  'port' = '1433',
  'username' = '',
  'password' = '',
  'database-name' = 'testdb',
  'table-name' = 'dbo.testtable'
);
 
3. Log output
2024-06-26 10:04:43,377 | INFO  | [SourceCoordinator-Source: testtable[1]] | 
Use unevenly-sized chunks for table cdm.dbo.CustomerVehicle, the chunk size is 
8096 | 
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.splitUnevenlySizedChunks(SqlServerChunkSplitter.java:268)
2024-06-26 10:04:43,385 | INFO  | [SourceCoordinator-Source: testtable[1]] | 
Split table cdm.dbo.CustomerVehicle into 1 chunks, time cost: 144ms. | 
com.ververica.cdc.connectors.sqlserver.source.dialect.SqlServerChunkSplitter.generateSplits(SqlServerChunkSplitter.java:117)
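
For reference, here is a minimal, self-contained sketch (not the connector's actual 
splitting code) of how many chunks a key-based splitter is expected to produce for a 
table of this size. Note that Java's UUID ordering differs from SQL Server's 
uniqueidentifier ordering, which is exactly the kind of subtlety the splitter has to 
handle for this key type.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ExpectedChunkCount {

    // Split a sorted key set into chunks of at most chunkSize keys.
    static <K extends Comparable<K>> List<List<K>> split(List<K> sortedKeys, int chunkSize) {
        List<List<K>> chunks = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += chunkSize) {
            chunks.add(sortedKeys.subList(i, Math.min(i + chunkSize, sortedKeys.size())));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<UUID> keys =
                IntStream.range(0, 1_000_000)
                        .mapToObj(i -> UUID.randomUUID())
                        .sorted()
                        .collect(Collectors.toList());
        // 1,000,000 / 8096 ≈ 123.5 -> 124 chunks expected, not 1.
        System.out.println(split(keys, 8096).size());
    }
}
{code}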



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[SUMMARY] Flink 1.20 Release Sync 06/25/2024

2024-06-25 Thread weijie guo
Dear devs,


This is the second meeting after feature freeze of Flink 1.20.


I'd like to share the information synced in the meeting.


-*Cut branch*


 The release-1.20 branch was cut on Tuesday, and we also updated the
version of the master branch to 2.0-SNAPSHOT (confirmed with all RMs of
2.0).


 For PRs that should be included in 1.20.0, please make sure:

* The PR is merged into both the master and release-1.20 branches.

* The JIRA ticket is closed with the correct fix-version (1.20.0).


-*Cross-team release testing*


 Release testing has already started. In the meantime, there are still several
FLIPs for which it needs to be confirmed whether cross-team testing is required [1].
The RMs have created related tickets covering all the features listed on the 1.20
wiki page [2] as well as other FLIPs that were actually completed.


 Contributors are also encouraged to create tickets if there are other features
that need to be cross-team tested (just create a new testing ticket with the
title 'Release Testing: Verify ...', without the 'Instructions' keyword).


Progress of release testing (we plan to finish all of it by the end of next week):

   - 13 FLIPs/features in total: 9 confirmed, 4 waiting for a response

   - testing instructions ready: 9 (1 assigned)


-*Blockers*


 Congrats, no known blocker for now.


-*Release notes [Highlights]*


For new features and behavior changes that are still missing a 'Release Note',
please help fill out that field in JIRA (click the Edit button and scroll to the
middle of the page). This is important for users and will be part of the
release announcement.


-*Sync meeting[3]*


 The next meeting is on 07/02/2024 at 10am (UTC+2) / 4pm (UTC+8); please
feel free to join us.


Lastly, we encourage attendees to fill in the topics to be discussed at
the bottom of the 1.20 wiki page [2] a day in advance, to make it easier for
everyone to understand the background of the topics. Thanks!


[1] https://issues.apache.org/jira/browse/FLINK-35602

[2] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release

[3] https://meet.google.com/mtj-huez-apu


Best,

Robert, Rui, Ufuk, Weijie


[jira] [Created] (FLINK-35700) Loosen CDC pipeline options validation

2024-06-25 Thread yux (Jira)
yux created FLINK-35700:
---

 Summary: Loosen CDC pipeline options validation
 Key: FLINK-35700
 URL: https://issues.apache.org/jira/browse/FLINK-35700
 Project: Flink
  Issue Type: Bug
  Components: Flink CDC
Reporter: yux


FLINK-35121 added pipeline configuration validation that rejects any unknown 
options. This turns out to be too strict, since it makes it impossible to create 
customized configuration extensions. Also, Flink does not reject unknown entries 
in flink-conf / config.yaml; it just silently ignores them. It might be better for 
CDC to follow that behavior (a rough sketch of such lenient validation follows below).
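
A minimal, purely illustrative sketch of what such "loose" validation could look like; 
the names below (LooseOptionsValidator, knownKeys, userConfig) are made up for 
illustration and are not Flink CDC APIs:

{code:java}
import java.util.Map;
import java.util.Set;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class LooseOptionsValidator {
    private static final Logger LOG = LoggerFactory.getLogger(LooseOptionsValidator.class);

    // Warn about unknown keys instead of failing, mirroring how Flink
    // silently tolerates unrecognized entries in config.yaml.
    public static void validate(Map<String, String> userConfig, Set<String> knownKeys) {
        for (String key : userConfig.keySet()) {
            if (!knownKeys.contains(key)) {
                LOG.warn("Ignoring unrecognized pipeline option '{}'.", key);
            }
        }
    }
}
{code}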



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread Jark Wu
I also think this should not block new feature development.
Having "nice-to-have" and "must-to-have" tags on the FLIPs is a good idea.

For the downstream projects, I think we need to release a 2.0 preview version one
or two months before the formal release. This can leave some time for the
downstream projects to integrate and provide feedback, so we can fix the problems
(e.g. unexpected breaking changes, Java versions) before 2.0.

Best,
Jark

On Wed, 26 Jun 2024 at 09:39, Xintong Song  wrote:

> I also don't think we should block new feature development until 2.0. From
> my understanding, the new major release is no different from the regular
> minor releases for new features.
>
> I think tracking new features, either as nice-to-have items or in a
> separate list, is necessary. It helps us understand what's going on in the
> release cycle, and what to announce and promote. Maybe we should start a
> discussion on updating the 2.0 item list, to 1) collect new items that are
> proposed / initiated after the original list being created and 2) to remove
> some items that are no longer suitable. I'll discuss this with the other
> release managers first.
>
> For the connectors and operators, I think it depends on whether they depend
> on any deprecated APIs or internal implementations of Flink. Ideally,
> all @Public APIs and @PublicEvolving APIs that we plan to change / remove
> should have been deprecated in 1.19 and 1.20 respectively. That means if
> the connectors and operators only use non-deprecated @Public
> and @PublicEvolving APIs in 1.20, hopefully there should not be any
> problems upgrading to 2.0.
>
> Best,
>
> Xintong
>
>
>
> On Wed, Jun 26, 2024 at 5:20 AM Becket Qin  wrote:
>
> > Thanks for the question, Matthias.
> >
> > My two cents, I don't think we are blocking new feature development. My
> > understanding is that the community will just prioritize removing
> > deprecated APIs in the 2.0 dev cycle. Because of that, it is possible
> that
> > some new feature development may slow down a little bit since some
> > contributors may be working on the must-have features for 2.0. But policy
> > wise, I don't see a reason to block the new feature development for the
> 2.0
> > release feature plan[1].
> >
> > Process wise, I like your idea of adding the new features as nice-to-have
> > in the 2.0 feature list.
> >
> > Re: David,
> > Given it is a major version bump. It is possible that some of the
> > downstream projects (e.g. connectors, Paimon, etc) will have to see if a
> > major version bump is also needed there. And it is probably going to be
> > decisions made on a per-project basis.
> > Regarding the Java version specifically, this probably worth a separate
> > discussion. According to a recent report[2] on the state of Java, it
> might
> > be a little early to drop support for Java 11. We can discuss this
> > separately.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > [2]
> >
> >
> https://newrelic.com/sites/default/files/2024-04/new-relic-state-of-the-java-ecosystem-report-2024-04-30.pdf
> >
> > On Tue, Jun 25, 2024 at 4:58 AM David Radley 
> > wrote:
> >
> > > Hi,
> > > I think this is a great question. I am not sure if this has been
> covered
> > > elsewhere, but it would be good to be clear how this effects the
> > connectors
> > > and operator repos, with potentially v1 and v2 oriented new featuresI
> > > suspect this will be a connector by connector investigation. I am
> > thinking
> > > connectors with Hadoop eco-system dependencies (e.g. Paimon) which may
> > not
> > > work nicely with Java 17,
> > >
> > >  Kind regards, David.
> > >
> > >
> > > From: Matthias Pohl 
> > > Date: Tuesday, 25 June 2024 at 09:57
> > > To: dev@flink.apache.org 
> > > Cc: Xintong Song , martijnvis...@apache.org <
> > > martijnvis...@apache.org>, imj...@gmail.com ,
> > > becket@gmail.com 
> > > Subject: [EXTERNAL] [2.0] How to handle on-going feature development in
> > > Flink 2.0?
> > > Hi 2.0 release managers,
> > > With the 1.20 release branch being cut [1], master is now referring to
> > > 2.0-SNAPSHOT. I remember that, initially, the community had the idea of
> > > keeping the 2.0 release as small as possible focusing on API changes
> [2].
> > >
> > > What does this mean for new features? I guess blocking them until 2.0
> is
> > > released is not a good option. Shall we treat new features as
> > > "nice-to-have" items as documented in the 2.0 release overview [3] and
> > > merge them to master like it was done for minor releases in the past?
> Do
> > > you want to add a separate section in the 2.0 release overview [3] to
> > list
> > > these new features (e.g. FLIP-461 [4]) separately? That might help to
> > > manage planned 2.0 deprecations/API removal and new features
> separately.
> > Or
> > > do you have a different process in mind?
> > >
> > > Apologies if this was already discussed somewhere. I didn't 

Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread Xintong Song
I also don't think we should block new feature development until 2.0. From
my understanding, the new major release is no different from the regular
minor releases for new features.

I think tracking new features, either as nice-to-have items or in a
separate list, is necessary. It helps us understand what's going on in the
release cycle, and what to announce and promote. Maybe we should start a
discussion on updating the 2.0 item list, to 1) collect new items that were
proposed / initiated after the original list was created and 2) remove
some items that are no longer suitable. I'll discuss this with the other
release managers first.

For the connectors and operators, I think it depends on whether they depend
on any deprecated APIs or internal implementations of Flink. Ideally,
all @Public APIs and @PublicEvolving APIs that we plan to change / remove
should have been deprecated in 1.19 and 1.20 respectively. That means if
the connectors and operators only use non-deprecated @Public
and @PublicEvolving APIs in 1.20, hopefully there should not be any
problems upgrading to 2.0.

Best,

Xintong



On Wed, Jun 26, 2024 at 5:20 AM Becket Qin  wrote:

> Thanks for the question, Matthias.
>
> My two cents, I don't think we are blocking new feature development. My
> understanding is that the community will just prioritize removing
> deprecated APIs in the 2.0 dev cycle. Because of that, it is possible that
> some new feature development may slow down a little bit since some
> contributors may be working on the must-have features for 2.0. But policy
> wise, I don't see a reason to block the new feature development for the 2.0
> release feature plan[1].
>
> Process wise, I like your idea of adding the new features as nice-to-have
> in the 2.0 feature list.
>
> Re: David,
> Given it is a major version bump. It is possible that some of the
> downstream projects (e.g. connectors, Paimon, etc) will have to see if a
> major version bump is also needed there. And it is probably going to be
> decisions made on a per-project basis.
> Regarding the Java version specifically, this probably worth a separate
> discussion. According to a recent report[2] on the state of Java, it might
> be a little early to drop support for Java 11. We can discuss this
> separately.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> [2]
>
> https://newrelic.com/sites/default/files/2024-04/new-relic-state-of-the-java-ecosystem-report-2024-04-30.pdf
>
> On Tue, Jun 25, 2024 at 4:58 AM David Radley 
> wrote:
>
> > Hi,
> > I think this is a great question. I am not sure if this has been covered
> > elsewhere, but it would be good to be clear how this effects the
> connectors
> > and operator repos, with potentially v1 and v2 oriented new featuresI
> > suspect this will be a connector by connector investigation. I am
> thinking
> > connectors with Hadoop eco-system dependencies (e.g. Paimon) which may
> not
> > work nicely with Java 17,
> >
> >  Kind regards, David.
> >
> >
> > From: Matthias Pohl 
> > Date: Tuesday, 25 June 2024 at 09:57
> > To: dev@flink.apache.org 
> > Cc: Xintong Song , martijnvis...@apache.org <
> > martijnvis...@apache.org>, imj...@gmail.com ,
> > becket@gmail.com 
> > Subject: [EXTERNAL] [2.0] How to handle on-going feature development in
> > Flink 2.0?
> > Hi 2.0 release managers,
> > With the 1.20 release branch being cut [1], master is now referring to
> > 2.0-SNAPSHOT. I remember that, initially, the community had the idea of
> > keeping the 2.0 release as small as possible focusing on API changes [2].
> >
> > What does this mean for new features? I guess blocking them until 2.0 is
> > released is not a good option. Shall we treat new features as
> > "nice-to-have" items as documented in the 2.0 release overview [3] and
> > merge them to master like it was done for minor releases in the past? Do
> > you want to add a separate section in the 2.0 release overview [3] to
> list
> > these new features (e.g. FLIP-461 [4]) separately? That might help to
> > manage planned 2.0 deprecations/API removal and new features separately.
> Or
> > do you have a different process in mind?
> >
> > Apologies if this was already discussed somewhere. I didn't manage to
> find
> > anything related to this topic.
> >
> > Best,
> > Matthias
> >
> > [1] https://lists.apache.org/thread/mwnfd7o10xo6ynx0n640pw9v2opbkm8l
> > [2] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> > [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> > [4]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler
> >
> > Unless otherwise stated above:
> >
> > IBM United Kingdom Limited
> > Registered in England and Wales with number 741598
> > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
> >
>


Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread Becket Qin
Thanks for the question, Matthias.

My two cents, I don't think we are blocking new feature development. My
understanding is that the community will just prioritize removing
deprecated APIs in the 2.0 dev cycle. Because of that, it is possible that
some new feature development may slow down a little bit since some
contributors may be working on the must-have features for 2.0. But policy
wise, I don't see a reason to block the new feature development for the 2.0
release feature plan[1].

Process wise, I like your idea of adding the new features as nice-to-have
in the 2.0 feature list.

Re: David,
Given it is a major version bump, it is possible that some of the
downstream projects (e.g. connectors, Paimon, etc.) will have to see if a
major version bump is also needed there. And these are probably going to be
decisions made on a per-project basis.
Regarding the Java version specifically, this is probably worth a separate
discussion. According to a recent report [2] on the state of Java, it might
be a little early to drop support for Java 11. We can discuss this
separately.

Thanks,

Jiangjie (Becket) Qin

[1] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
[2]
https://newrelic.com/sites/default/files/2024-04/new-relic-state-of-the-java-ecosystem-report-2024-04-30.pdf

On Tue, Jun 25, 2024 at 4:58 AM David Radley 
wrote:

> Hi,
> I think this is a great question. I am not sure if this has been covered
> elsewhere, but it would be good to be clear how this effects the connectors
> and operator repos, with potentially v1 and v2 oriented new featuresI
> suspect this will be a connector by connector investigation. I am thinking
> connectors with Hadoop eco-system dependencies (e.g. Paimon) which may not
> work nicely with Java 17,
>
>  Kind regards, David.
>
>
> From: Matthias Pohl 
> Date: Tuesday, 25 June 2024 at 09:57
> To: dev@flink.apache.org 
> Cc: Xintong Song , martijnvis...@apache.org <
> martijnvis...@apache.org>, imj...@gmail.com ,
> becket@gmail.com 
> Subject: [EXTERNAL] [2.0] How to handle on-going feature development in
> Flink 2.0?
> Hi 2.0 release managers,
> With the 1.20 release branch being cut [1], master is now referring to
> 2.0-SNAPSHOT. I remember that, initially, the community had the idea of
> keeping the 2.0 release as small as possible focusing on API changes [2].
>
> What does this mean for new features? I guess blocking them until 2.0 is
> released is not a good option. Shall we treat new features as
> "nice-to-have" items as documented in the 2.0 release overview [3] and
> merge them to master like it was done for minor releases in the past? Do
> you want to add a separate section in the 2.0 release overview [3] to list
> these new features (e.g. FLIP-461 [4]) separately? That might help to
> manage planned 2.0 deprecations/API removal and new features separately. Or
> do you have a different process in mind?
>
> Apologies if this was already discussed somewhere. I didn't manage to find
> anything related to this topic.
>
> Best,
> Matthias
>
> [1] https://lists.apache.org/thread/mwnfd7o10xo6ynx0n640pw9v2opbkm8l
> [2] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
> [3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
> [4]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler
>
> Unless otherwise stated above:
>
> IBM United Kingdom Limited
> Registered in England and Wales with number 741598
> Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU
>


Re: Flink-connector-jdbc v3.2.0 support flink 1.17.x?

2024-06-25 Thread Muhammet Orazov

Hello Jerry,

The JDBC connector version v3.2.0 is built and released [1]
for Flink versions 1.18+.

I'd suggest using the v3.1.x versions with Flink 1.17, or
ideally upgrading the Flink version.

Best,
Muhammet

[1]: 
https://mvnrepository.com/artifact/org.apache.flink/flink-connector-jdbc


On 2024-06-24 10:23, Jerry wrote:

Can Flink-connector-jdbc v3.2.0 support flink 1.17.x?


Re: [DISCUSS] FLIP-456: Introduce DESCRIBE FUNCTION

2024-06-25 Thread Natea Eshetu Beshada
Oh no haha, yes thanks for pointing that out Jim! Of course I make a typo
in an un-editable format like email :)

On Tue, Jun 25, 2024 at 11:20 AM Jim Hughes 
wrote:

> Hi Natea,
>
> Looks good.  As a note, in the title of this email, a number got switched!
> FLIP-456 is about compiled plans for batch operators. :)
>
> The link below is correct.
>
> Cheers,
>
> Jim
>
> On Tue, Jun 25, 2024 at 1:29 PM Natea Eshetu Beshada
>  wrote:
>
> > Hello all,
> >
> > I would like to kickstart the discussion of FLIP-465: Introduce DESCRIBE
> > FUNCTION [1].
> >
> > The proposal is to add SQL syntax that would allow users to describe the
> > metadata of a given function.
> >
> > I look forward to hearing feedback from the community.
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-465%3A+Introduce+DESCRIBE+FUNCTION
> >
> > Thanks,
> > Natea
> >
>


Re: [DISCUSS] FLIP-456: Introduce DESCRIBE FUNCTION

2024-06-25 Thread Jim Hughes
Hi Natea,

Looks good.  As a note, in the title of this email, a number got switched!
FLIP-456 is about compiled plans for batch operators. :)

The link below is correct.

Cheers,

Jim

On Tue, Jun 25, 2024 at 1:29 PM Natea Eshetu Beshada
 wrote:

> Hello all,
>
> I would like to kickstart the discussion of FLIP-465: Introduce DESCRIBE
> FUNCTION [1].
>
> The proposal is to add SQL syntax that would allow users to describe the
> metadata of a given function.
>
> I look forward to hearing feedback from the community.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-465%3A+Introduce+DESCRIBE+FUNCTION
>
> Thanks,
> Natea
>


[DISCUSS] FLIP-456: Introduce DESCRIBE FUNCTION

2024-06-25 Thread Natea Eshetu Beshada
Hello all,

I would like to kickstart the discussion of FLIP-465: Introduce DESCRIBE
FUNCTION [1].

The proposal is to add SQL syntax that would allow users to describe the
metadata of a given function.
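
For illustration, a hedged sketch of how the proposed statement could be used from
the Table API once implemented; the exact syntax and output columns are defined in
the FLIP, and the UDF class below is hypothetical:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DescribeFunctionExample {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        // Hypothetical UDF class, only for illustration.
        tEnv.executeSql(
                "CREATE TEMPORARY FUNCTION my_upper AS 'com.example.MyUpperFunction'");
        // Proposed statement from the FLIP (not available in current releases).
        tEnv.executeSql("DESCRIBE FUNCTION my_upper").print();
    }
}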

I look forward to hearing feedback from the community.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-465%3A+Introduce+DESCRIBE+FUNCTION

Thanks,
Natea


[jira] [Created] (FLINK-35699) The flink-kubernetes artifact shades Jackson 2.15.3 from fabric8

2024-06-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-35699:


 Summary: The flink-kubernetes artifact shades Jackson 2.15.3 from 
fabric8
 Key: FLINK-35699
 URL: https://issues.apache.org/jira/browse/FLINK-35699
 Project: Flink
  Issue Type: Bug
  Components: Deployment / Kubernetes
Affects Versions: 1.19.1
Reporter: Ferenc Csaky
 Fix For: 1.20.0, 1.19.2


The {{flink-kubernetes}} artifact shades Jackson classes coming through 
fabric8, but since Jackson 2.15, Jackson is a [multi-release 
JAR|https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.15#jar-changes], 
which requires some additional relocations (the Java 9+ classes under 
{{META-INF/versions/}} also need to be relocated) for correct shading.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-444: Native file copy support

2024-06-25 Thread Piotr Nowojski
Oops, I must have forgotten to update the FLIP as we discussed. I will fix
it tomorrow and the vote period will be extended.

Best,
Piotrek

On Tue, 25 Jun 2024 at 13:56, Zakelly Lan  wrote:

> Hi Piotrek,
>
> I don't see any statement about removing or renaming the
> `DuplicatingFileSystem` in the FLIP, shall we do that as mentioned in the
> discussion thread?
>
>
> Best,
> Zakelly
>
> On Tue, Jun 25, 2024 at 4:58 PM Piotr Nowojski 
> wrote:
>
> > Hi all,
> >
> > I would like to start a vote for the FLIP-444 [1]. The discussion thread
> is
> > here [2].
> >
> > The vote will be open for at least 72 hours.
> >
> > Best,
> > Piotrek
> >
> > [1] https://cwiki.apache.org/confluence/x/rAn9EQ
> > [2] https://lists.apache.org/thread/lkwmyjt2bnmvgx4qpp82rldwmtd4516c
> >
>


[jira] [Created] (FLINK-35698) Parquet connector fails to load ROW after save

2024-06-25 Thread Andrey Gaskov (Jira)
Andrey Gaskov created FLINK-35698:
-

 Summary: Parquet connector fails to load ROW 
after save
 Key: FLINK-35698
 URL: https://issues.apache.org/jira/browse/FLINK-35698
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Table 
SQL / API
Affects Versions: 1.19.0, 1.18.0, 1.17.0
Reporter: Andrey Gaskov


The bug could be reproduced by the following test added to 
ParquetFileSystemITCase.java:

 
{code:java}
@TestTemplate
void testRowColumnType() throws IOException, ExecutionException, 
InterruptedException {
String path = 
TempDirUtils.newFolder(super.fileTempFolder()).toURI().getPath();
super.tableEnv()
.executeSql(
"create table t_in("
+ "grp ROW"
+ ") with ("
+ "'connector' = 'datagen',"
+ "'number-of-rows' = '10'"
+ ")");
super.tableEnv()
.executeSql(
"create table t_out("
+ "grp ROW"
+ ") with ("
+ "'connector' = 'filesystem',"
+ "'path' = '"
+ path
+ "',"
+ "'format' = 'parquet'"
+ ")");
super.tableEnv().executeSql("insert into t_out select * from t_in").await();
List rows =
CollectionUtil.iteratorToList(
super.tableEnv().executeSql("select * from t_out limit 
10").collect());
assertThat(rows).hasSize(10);
} {code}
It fails with this root exception after hanging for 40 seconds:

 

 
{code:java}
Caused by: java.lang.ClassCastException: 
org.apache.flink.table.data.columnar.vector.heap.HeapIntVector cannot be cast 
to org.apache.flink.table.data.columnar.vector.DecimalColumnVector
    at 
org.apache.flink.table.data.columnar.vector.VectorizedColumnBatch.getDecimal(VectorizedColumnBatch.java:118)
    at 
org.apache.flink.table.data.columnar.ColumnarRowData.getDecimal(ColumnarRowData.java:128)
    at 
org.apache.flink.table.data.RowData.lambda$createFieldGetter$89bd9445$1(RowData.java:233)
    at 
org.apache.flink.table.data.RowData.lambda$createFieldGetter$25774257$1(RowData.java:296)
    at 
org.apache.flink.table.runtime.typeutils.RowDataSerializer.toBinaryRow(RowDataSerializer.java:207)
    at 
org.apache.flink.table.data.writer.AbstractBinaryWriter.writeRow(AbstractBinaryWriter.java:147)
    at 
org.apache.flink.table.data.writer.BinaryRowWriter.writeRow(BinaryRowWriter.java:27)
    at 
org.apache.flink.table.data.writer.BinaryWriter.write(BinaryWriter.java:155)
    at 
org.apache.flink.table.runtime.typeutils.RowDataSerializer.toBinaryRow(RowDataSerializer.java:204)
    at 
org.apache.flink.table.runtime.typeutils.RowDataSerializer.serialize(RowDataSerializer.java:103)
    at 
org.apache.flink.table.runtime.typeutils.RowDataSerializer.serialize(RowDataSerializer.java:48)
    at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:173)
    at 
org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:44)
    at 
org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)
    at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.serializeRecord(RecordWriter.java:152)
    at 
org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:108)
    at 
org.apache.flink.runtime.io.network.api.writer.ChannelSelectorRecordWriter.emit(ChannelSelectorRecordWriter.java:55)
    at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:140)
    at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.collectAndCheckIfChained(RecordWriterOutput.java:120)
    at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:101)
    at 
org.apache.flink.streaming.runtime.io.RecordWriterOutput.collect(RecordWriterOutput.java:53)
    at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:60)
    at 
org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:32)
    at 
org.apache.flink.table.runtime.operators.sort.LimitOperator.processElement(LimitOperator.java:47)
    at 
org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:109)
    at 
org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:78)
    at 
org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:40)
    at 

[jira] [Created] (FLINK-35697) Release Testing: Verify FLIP-451 Introduce timeout configuration to AsyncSink

2024-06-25 Thread Ahmed Hamdy (Jira)
Ahmed Hamdy created FLINK-35697:
---

 Summary: Release Testing: Verify FLIP-451 Introduce timeout 
configuration to AsyncSink
 Key: FLINK-35697
 URL: https://issues.apache.org/jira/browse/FLINK-35697
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / Common
Reporter: Ahmed Hamdy
 Fix For: 1.20.0


h2. Description

In FLIP-451 we added a timeout configuration to {{AsyncSinkWriter}}, with a default 
value of 10 minutes and failOnTimeout defaulting to false.
We need to test the new feature at different levels:
- Functional Testing
- Performance Testing
- Regression Testing

h2. Common Utils

The introduced feature affects the abstract {{AsyncSinkWriter}} class, so we need 
to use a concrete sink implementation for our tests. Any implementation where we can 
track delivery of elements is acceptable; an example is:
{code}
class DiscardingElementWriter extends AsyncSinkWriter {
SeparateThreadExecutor executor =
new SeparateThreadExecutor(r -> new Thread(r, 
"DiscardingElementWriter"));

public DiscardingElementWriter(
Sink.InitContext context,
AsyncSinkWriterConfiguration configuration,
Collection> bufferedRequestStates) 
{
super(
(element, context1) -> element.toString(),
context,
configuration,
bufferedRequestStates);
}

@Override
protected long getSizeInBytes(String requestEntry) {
return requestEntry.length();
}

@Override
protected void submitRequestEntries(
List requestEntries, ResultHandler 
resultHandler) {
executor.execute(
() -> {
long delayMillis = new Random().nextInt(5000);
try {
Thread.sleep(delayMillis);
} catch (InterruptedException ignored) {
}
for (String entry : requestEntries) {
LOG.info("Discarding {} after {} ms", entry, 
delayMillis);
}

resultHandler.complete();
});
}
}
{code}

We will also need a simple Flink job that writes data using the sink:

{code}
final StreamExecutionEnvironment env = StreamExecutionEnvironment
.getExecutionEnvironment();
env.setParallelism(1);
env.fromSequence(0, 100)
.map(Object::toString)
.sinkTo(new DiscardingTestAsyncSink<>());
{code}

We can use minimal values for the batch size and in-flight requests to increase the 
number of requests that are subject to the timeout:

{code}
public class DiscardingTestAsyncSink extends AsyncSinkBase {
private static final Logger LOG = 
LoggerFactory.getLogger(DiscardingTestAsyncSink.class);

public DiscardingTestAsyncSink(long requestTimeoutMS, boolean 
failOnTimeout) {
super(
(element, context) -> element.toString(),
1, // maxBatchSize
1, // maxInflightRequests
10, // maxBufferedRequests
1000L, // maxBatchSizeInBytes
100, // maxTimeInBufferMS
500L, // maxRecordSizeInBytes
requestTimeoutMS,
failOnTimeout);
}

@Override
public SinkWriter createWriter(WriterInitContext context) throws 
IOException {
return new DiscardingElementWriter(
new InitContextWrapper(context),
AsyncSinkWriterConfiguration.builder()
.setMaxBatchSize(this.getMaxBatchSize())
.setMaxBatchSizeInBytes(this.getMaxBatchSizeInBytes())
.setMaxInFlightRequests(this.getMaxInFlightRequests())
.setMaxBufferedRequests(this.getMaxBufferedRequests())
.setMaxTimeInBufferMS(this.getMaxTimeInBufferMS())
.setMaxRecordSizeInBytes(this.getMaxRecordSizeInBytes())
.setFailOnTimeout(this.getFailOnTimeout())
.setRequestTimeoutMS(this.getRequestTimeoutMS())
.build(),
Collections.emptyList());
}

@Override
public StatefulSinkWriter> restoreWriter(
WriterInitContext context, Collection> 
recoveredState)
throws IOException {
return new DiscardingElementWriter(
new InitContextWrapper(context),
AsyncSinkWriterConfiguration.builder()
.setMaxBatchSize(this.getMaxBatchSize())
.setMaxBatchSizeInBytes(this.getMaxBatchSizeInBytes())
.setMaxInFlightRequests(this.getMaxInFlightRequests())

[jira] [Created] (FLINK-35696) JSON_VALUE/QUERY functions incorrectly map floating numbers

2024-06-25 Thread Dawid Wysakowicz (Jira)
Dawid Wysakowicz created FLINK-35696:


 Summary: JSON_VALUE/QUERY functions incorrectly map floating 
numbers
 Key: FLINK-35696
 URL: https://issues.apache.org/jira/browse/FLINK-35696
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Affects Versions: 1.19.1
Reporter: Dawid Wysakowicz
 Fix For: 1.20.0


{code}
SELECT JSON_VALUE('{"bigNumber":123456789.987654321}', '$.bigNumber')
{code}

produces

{code}
1.2345678998765433E8
{code}

which cannot be mapped back; the value gets rounded.

The reason is that we use {{double}} for floating-point numbers in {{SqlJsonUtils}}. We 
should rather configure {{jackson}} to use {{BigDecimal}}. In order to do that we 
need to properly shade {{jayway}} though.

I suggest we:

1. Add {{com.jayway.jsonpath:json-path}} to {{flink-shaded}}.
2. Use the shaded library and pass a configured mapper which maps floats to 
{{BigDecimal}} (a sketch of such a configuration is given below).
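
A minimal, standalone sketch of the intended configuration, using the (unshaded) 
json-path and Jackson directly rather than Flink's {{SqlJsonUtils}}:

{code:java}
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.jayway.jsonpath.Configuration;
import com.jayway.jsonpath.JsonPath;
import com.jayway.jsonpath.spi.json.JacksonJsonProvider;
import com.jayway.jsonpath.spi.mapper.JacksonMappingProvider;

public class JsonPathBigDecimalExample {
    public static void main(String[] args) {
        // Make Jackson deserialize floating-point numbers as BigDecimal instead of double.
        ObjectMapper mapper =
                new ObjectMapper().enable(DeserializationFeature.USE_BIG_DECIMAL_FOR_FLOATS);
        Configuration conf =
                Configuration.builder()
                        .jsonProvider(new JacksonJsonProvider(mapper))
                        .mappingProvider(new JacksonMappingProvider(mapper))
                        .build();
        Object value =
                JsonPath.using(conf)
                        .parse("{\"bigNumber\":123456789.987654321}")
                        .read("$.bigNumber");
        // Prints 123456789.987654321 without the precision loss shown above.
        System.out.println(value);
    }
}
{code}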



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-444: Native file copy support

2024-06-25 Thread Zakelly Lan
Hi Piotrek,

I don't see any statement about removing or renaming the
`DuplicatingFileSystem` in the FLIP, shall we do that as mentioned in the
discussion thread?


Best,
Zakelly

On Tue, Jun 25, 2024 at 4:58 PM Piotr Nowojski  wrote:

> Hi all,
>
> I would like to start a vote for the FLIP-444 [1]. The discussion thread is
> here [2].
>
> The vote will be open for at least 72 hours.
>
> Best,
> Piotrek
>
> [1] https://cwiki.apache.org/confluence/x/rAn9EQ
> [2] https://lists.apache.org/thread/lkwmyjt2bnmvgx4qpp82rldwmtd4516c
>


Re: [2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread David Radley
Hi,
I think this is a great question. I am not sure if this has been covered 
elsewhere, but it would be good to be clear how this affects the connector and 
operator repos, with potentially v1- and v2-oriented new features. I suspect this 
will be a connector-by-connector investigation. I am thinking of connectors with 
Hadoop ecosystem dependencies (e.g. Paimon) which may not work nicely with 
Java 17.

 Kind regards, David.


From: Matthias Pohl 
Date: Tuesday, 25 June 2024 at 09:57
To: dev@flink.apache.org 
Cc: Xintong Song , martijnvis...@apache.org 
, imj...@gmail.com , 
becket@gmail.com 
Subject: [EXTERNAL] [2.0] How to handle on-going feature development in Flink 
2.0?
Hi 2.0 release managers,
With the 1.20 release branch being cut [1], master is now referring to
2.0-SNAPSHOT. I remember that, initially, the community had the idea of
keeping the 2.0 release as small as possible focusing on API changes [2].

What does this mean for new features? I guess blocking them until 2.0 is
released is not a good option. Shall we treat new features as
"nice-to-have" items as documented in the 2.0 release overview [3] and
merge them to master like it was done for minor releases in the past? Do
you want to add a separate section in the 2.0 release overview [3] to list
these new features (e.g. FLIP-461 [4]) separately? That might help to
manage planned 2.0 deprecations/API removal and new features separately. Or
do you have a different process in mind?

Apologies if this was already discussed somewhere. I didn't manage to find
anything related to this topic.

Best,
Matthias

[1] https://lists.apache.org/thread/mwnfd7o10xo6ynx0n640pw9v2opbkm8l
[2] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
[3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
[4]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


[jira] [Created] (FLINK-35695) Release Testing: Verify FLINK-32315: Support local file upload in K8s mode

2024-06-25 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-35695:


 Summary: Release Testing: Verify FLINK-32315: Support local file 
upload in K8s mode
 Key: FLINK-35695
 URL: https://issues.apache.org/jira/browse/FLINK-35695
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Ferenc Csaky
 Fix For: 1.20.0


Follow-up to the test for https://issues.apache.org/jira/browse/FLINK-35533

In Flink 1.20, we proposed integrating Flink's Hybrid Shuffle with Apache 
Celeborn through a pluggable remote tier interface. To verify this feature, you 
should follow these two main steps.

1. Implement Celeborn tier.
 * Implement a new tier factory and tier for Celeborn, including the 
TierFactory/TierMasterAgent/TierProducerAgent/TierConsumerAgent APIs.
 * The implementations should support granular data management at the Segment 
level for both client and server sides.

2. Use the implemented tier to shuffle data.
 * Compile Flink and Celeborn.
 * Deploy Celeborn service
 ** Deploy a new Celeborn service with the new compiled packages. You can 
reference the doc ([https://celeborn.apache.org/docs/latest/]) to deploy the 
cluster.
 * Add the compiled flink plugin jar (celeborn-client-flink-xxx.jar) to Flink 
classpath.
 * Configure the options to enable the feature.
 ** Set the option 
taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class to the 
new Celeborn tier factory class. In addition to this option, the following options 
should also be added.

{code:java}
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_FULL 
celeborn.master.endpoints: 
celeborn.client.shuffle.partition.type: MAP{code}
 * Run some test examples (e.g., WordCount) to verify the feature.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-463: Schema Definition in CREATE TABLE AS Statement

2024-06-25 Thread Martijn Visser
+1 (binding)

On Sun, Jun 23, 2024 at 9:07 PM Ferenc Csaky 
wrote:

> +1 (non-binding)
>
> Best,
> Ferenc
>
>
>
>
> On Sunday, June 23rd, 2024 at 05:13, Yanquan Lv 
> wrote:
>
> >
> >
> > Thnaks Sergio, +1 (non-binding)
> >
> > gongzhongqiang gongzhongqi...@apache.org wrote on Sun, 23 Jun 2024 at 10:06:
> >
> > > +1 (non-binding)
> > >
> > > Best,
> > > Zhongqiang Gong
> > >
> > > Sergio Pena ser...@confluent.io.invalid wrote on Fri, 21 Jun 2024 at 22:18:
> > >
> > > > Hi everyone,
> > > >
> > > > Thanks for all the feedback about FLIP-463: Schema Definition in
> CREATE
> > > > TABLE AS Statement [1]. The discussion thread is here [2].
> > > >
> > > > I'd like to start a vote for it. The vote will be open for at least
> 72
> > > > hours unless there is an objection or insufficient votes. The FLIP
> will
> > > > be
> > > > considered accepted if 3 binding votes (from active committers
> according
> > > > to
> > > > the Flink bylaws [3]) are gathered by the community.
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-463%3A+Schema+Definition+in+CREATE+TABLE+AS+Statement
> > > > [2] https://lists.apache.org/thread/1ryxxyyg3h9v4rbosc80zryvjk6c8k83
> > > > [3]
> > > > https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws#FlinkBylaws-Approvals
> > >
> > > > Thanks,
> > > > Sergio Peña
>


[jira] [Created] (FLINK-35694) Flink-end-to-end-test kubernetes fails due to hostname --ip-address command

2024-06-25 Thread David Kornel (Jira)
David Kornel created FLINK-35694:


 Summary: Flink-end-to-end-test kubernetes fails due to hostname 
--ip-address command
 Key: FLINK-35694
 URL: https://issues.apache.org/jira/browse/FLINK-35694
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.20.0
 Environment: centos9
Reporter: David Kornel


I'm trying to run e2e tests on my Linux machine (CentOS 9) and I see that they fail 
due to the wrong return value of `hostname --ip-address`, which returns the IPv6 
loopback address first, causing an issue while building the containers.

 
build_image test_kubernetes_embedded_job fe80...(omitted) Retrying...
 

The solution is to replace `hostname --ip-address` with `hostname -I` in the 
get_host_machine_address method in common_kubernetes.sh.

 

I can open PR.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] FLIP-444: Native file copy support

2024-06-25 Thread Piotr Nowojski
Hi all,

I would like to start a vote for the FLIP-444 [1]. The discussion thread is
here [2].

The vote will be open for at least 72 hours.

Best,
Piotrek

[1] https://cwiki.apache.org/confluence/x/rAn9EQ
[2] https://lists.apache.org/thread/lkwmyjt2bnmvgx4qpp82rldwmtd4516c


[2.0] How to handle on-going feature development in Flink 2.0?

2024-06-25 Thread Matthias Pohl
Hi 2.0 release managers,
With the 1.20 release branch being cut [1], master is now referring to
2.0-SNAPSHOT. I remember that, initially, the community had the idea of
keeping the 2.0 release as small as possible focusing on API changes [2].

What does this mean for new features? I guess blocking them until 2.0 is
released is not a good option. Shall we treat new features as
"nice-to-have" items as documented in the 2.0 release overview [3] and
merge them to master like it was done for minor releases in the past? Do
you want to add a separate section in the 2.0 release overview [3] to list
these new features (e.g. FLIP-461 [4]) separately? That might help to
manage planned 2.0 deprecations/API removal and new features separately. Or
do you have a different process in mind?

Apologies if this was already discussed somewhere. I didn't manage to find
anything related to this topic.

Best,
Matthias

[1] https://lists.apache.org/thread/mwnfd7o10xo6ynx0n640pw9v2opbkm8l
[2] https://lists.apache.org/thread/b8w5cx0qqbwzzklyn5xxf54vw9ymys1c
[3] https://cwiki.apache.org/confluence/display/FLINK/2.0+Release
[4]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler


[jira] [Created] (FLINK-35693) Change variable specificOffsetFile, specificOffsetPos and startupTimestampMillis to Offset in StartupOptions

2024-06-25 Thread Hongshun Wang (Jira)
Hongshun Wang created FLINK-35693:
-

 Summary: Change variable specificOffsetFile, specificOffsetPos and 
startupTimestampMillis to Offset in StartupOptions
 Key: FLINK-35693
 URL: https://issues.apache.org/jira/browse/FLINK-35693
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: cdc-3.1.1
Reporter: Hongshun Wang
 Fix For: cdc-3.2.0


Currently, StartupOptions uses specificOffsetFile, specificOffsetPos and 
startupTimestampMillis to describe a specific offset or timestamp. However, this is 
only suitable for MySQL, not for Postgres's LSN or Oracle's RedoLogOffset.
{code:java}
public final class StartupOptions implements Serializable {
private static final long serialVersionUID = 1L;

public final StartupMode startupMode;
public final String specificOffsetFile;
public final Integer specificOffsetPos;
public final Long startupTimestampMillis;
} {code}
Now that we already have the Offset abstraction to represent the position in the log, 
why not use Offset in StartupOptions, roughly as sketched below?
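
A rough, illustrative sketch of the suggested shape (not actual Flink CDC code; 
StartupMode and Offset are the existing types referenced above):

{code:java}
import java.io.Serializable;

public final class StartupOptions implements Serializable {
    private static final long serialVersionUID = 1L;

    public final StartupMode startupMode;
    // Binlog file/pos for MySQL, LSN for Postgres, RedoLogOffset for Oracle, ...
    public final Offset specificOffset;
    public final Long startupTimestampMillis;

    public StartupOptions(
            StartupMode startupMode, Offset specificOffset, Long startupTimestampMillis) {
        this.startupMode = startupMode;
        this.specificOffset = specificOffset;
        this.startupTimestampMillis = startupTimestampMillis;
    }
}
{code}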



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Apache Flink Kubernetes Operator Release 1.9.0, release candidate #1

2024-06-25 Thread Gyula Fóra
+1 (binding)

Verified:
 - Sources/signatures
 - Install 1.9.0 from helm chart
 - Stateful example job basic interactions
 - Operator upgrade from 1.8.0 -> 1.9.0 with running flinkdeployments
 - Flink-web PR looks good

Cheers,
Gyula


On Wed, Jun 19, 2024 at 12:09 PM Gyula Fóra  wrote:

> Hi,
>
> I have updated the KEYs file and extended the expiration date so that
> should not be an issue. Thanks for pointing that out.
>
> Gyula
>
> On Wed, 19 Jun 2024 at 12:07, Rui Fan <1996fan...@gmail.com> wrote:
>
>> Thanks Gyula and Mate for driving this release!
>>
>> +1 (binding)
>>
>> Except the key is expired, and leaving a couple of comments to the
>> flink-web PR,
>> the rest of them are fine.
>>
>> - Downloaded artifacts from dist ( svn co https://dist.apache
>> .org/repos/dist/dev/flink/flink-kubernetes-operator-1.9.0-rc1/ )
>> - Verified SHA512 checksums : ( for i in *.tgz; do echo $i; sha512sum
>> --check $i.sha512; done )
>> - Verified GPG signatures : ( for i in *.tgz; do echo $i; gpg --verify
>> $i.asc $i; done)
>> - Build the source with java-11 and java-17 ( mvn -T 20 clean install
>> -DskipTests )
>> - Verified the license header during build the source
>> - Verified that chart and appVersion matches the target release (less the
>> index.yaml and Chart.yaml )
>> - Download Autoscaler standalone: wget https://repository.apache
>> .org/content/repositories/orgapacheflink-1740/org/apache/flink/flink
>> -autoscaler-standalone/1.9.0/flink-autoscaler-standalone-1.9.0.jar
>> - Ran Autoscaler standalone locally, it works well with rescale api.
>>
>> Best,
>> Rui
>>
>> On Wed, Jun 19, 2024 at 1:50 AM Mate Czagany  wrote:
>>
>> > Hi,
>> >
>> > +1 (non-binding)
>> >
>> > Note: Using the Apache Flink KEYS file [1] to verify the signatures your
>> > key seems to be expired, so that file should be updated as well.
>> >
>> > - Verified checksums and signatures
>> > - Built source distribution
>> > - Verified all pom.xml versions are the same
>> > - Verified install from RC repo
>> > - Verified Chart.yaml and values.yaml contents
>> > - Submitted basic example with 1.17 and 1.19 Flink versions in native
>> and
>> > standalone mode
>> > - Tested operator HA, added new watched namespace dynamically
>> > - Checked operator logs
>> >
>> > Regards,
>> > Mate
>> >
>> > [1] https://dist.apache.org/repos/dist/release/flink/KEYS
>> >
> > > Gyula Fóra  wrote (on Tue, 18 Jun 2024 at 8:14):
>> >
>> > > Hi Everyone,
>> > >
>> > > Please review and vote on the release candidate #1 for the version
>> 1.9.0
>> > of
>> > > Apache Flink Kubernetes Operator,
>> > > as follows:
>> > > [ ] +1, Approve the release
>> > > [ ] -1, Do not approve the release (please provide specific comments)
>> > >
>> > > **Release Overview**
>> > >
>> > > As an overview, the release consists of the following:
>> > > a) Kubernetes Operator canonical source distribution (including the
>> > > Dockerfile), to be deployed to the release repository at
>> dist.apache.org
>> > > b) Kubernetes Operator Helm Chart to be deployed to the release
>> > repository
>> > > at dist.apache.org
>> > > c) Maven artifacts to be deployed to the Maven Central Repository
>> > > d) Docker image to be pushed to dockerhub
>> > >
>> > > **Staging Areas to Review**
>> > >
>> > > The staging areas containing the above mentioned artifacts are as
>> > follows,
>> > > for your review:
>> > > * All artifacts for a,b) can be found in the corresponding dev
>> repository
>> > > at dist.apache.org [1]
>> > > * All artifacts for c) can be found at the Apache Nexus Repository [2]
>> > > * The docker image for d) is staged on github [3]
>> > >
>> > > All artifacts are signed with the key 21F06303B87DAFF1 [4]
>> > >
>> > > Other links for your review:
>> > > * JIRA release notes [5]
>> > > * source code tag "release-1.9.0-rc1" [6]
>> > > * PR to update the website Downloads page to
>> > > include Kubernetes Operator links [7]
>> > >
>> > > **Vote Duration**
>> > >
>> > > The voting time will run for at least 72 hours.
>> > > It is adopted by majority approval, with at least 3 PMC affirmative
>> > votes.
>> > >
>> > > **Note on Verification**
>> > >
>> > > You can follow the basic verification guide here[8].
>> > > Note that you don't need to verify everything yourself, but please
>> make
>> > > note of what you have tested together with your +- vote.
>> > >
>> > > Cheers!
>> > > Gyula Fora
>> > >
>> > > [1]
>> > >
>> > >
>> >
>> https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.9.0-rc1/
>> > > [2]
>> > >
>> https://repository.apache.org/content/repositories/orgapacheflink-1740/
>> > > [3]  ghcr.io/apache/flink-kubernetes-operator:17129ff
>> > > [4] https://dist.apache.org/repos/dist/release/flink/KEYS
>> > > [5]
>> > >
>> > >
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12354417
>> > > [6]
>> > >
>> >
>> https://github.com/apache/flink-kubernetes-operator/tree/release-1.9.0-rc1
>> > > [7] 

[jira] [Created] (FLINK-35692) Remove hugo_extended package introduced by a wrong operation

2024-06-25 Thread dalongliu (Jira)
dalongliu created FLINK-35692:
-

 Summary: Remove hugo_extended package introduced by a wrong operation
 Key: FLINK-35692
 URL: https://issues.apache.org/jira/browse/FLINK-35692
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.20.0
Reporter: dalongliu
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35691) Fix the bugs found in release testing of Materialized Table

2024-06-25 Thread dalongliu (Jira)
dalongliu created FLINK-35691:
-

 Summary: Fix the bugs found in release testing of Materialized 
Table 
 Key: FLINK-35691
 URL: https://issues.apache.org/jira/browse/FLINK-35691
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.20.0
Reporter: dalongliu
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35690) Release Testing: Verify FLIP-459: Support Flink hybrid shuffle integration with Apache Celeborn

2024-06-25 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-35690:
-

 Summary: Release Testing: Verify FLIP-459: Support Flink hybrid 
shuffle integration with Apache Celeborn
 Key: FLINK-35690
 URL: https://issues.apache.org/jira/browse/FLINK-35690
 Project: Flink
  Issue Type: Sub-task
Reporter: Yuxin Tan
 Fix For: 1.20.0


Follow-up to the test for https://issues.apache.org/jira/browse/FLINK-35533

In Flink 1.20, we proposed integrating Flink's Hybrid Shuffle with Apache 
Celeborn through a pluggable remote tier interface. To verify this feature, you 
should follow these two main steps.

1. Implement Celeborn tier.
 * Implement a new tier factory and tier for Celeborn, including the 
TierFactory/TierMasterAgent/TierProducerAgent/TierConsumerAgent APIs.
 * The implementations should support granular data management at the Segment 
level for both client and server sides.

2. Use the implemented tier to shuffle data.
 * Compile Flink and Celeborn.
 * Deploy Celeborn service
 ** Deploy a new Celeborn service with the new compiled packages. You can 
reference the doc (https://celeborn.apache.org/docs/latest/) to deploy the 
cluster. 
 * Add the compiled Flink plugin jar (celeborn-client-flink-xxx.jar) to the Flink 
classpath.
 * Configure the options to enable the feature.
 ** Set the option 
taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class to the 
new Celeborn tier factory class. In addition to this option, the following options 
should also be added.

 
{code:java}
execution.batch-shuffle-mode: ALL_EXCHANGES_HYBRID_FULL 
celeborn.master.endpoints: 
celeborn.client.shuffle.partition.type: MAP{code}
 * Run some test examples (e.g., WordCount) to verify the feature; a minimal job is sketched below.
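
A minimal, purely illustrative batch job that produces a keyed shuffle and can serve 
as such a test (it is not part of the Celeborn integration itself):

{code:java}
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ShuffleSmokeTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Batch mode, so execution.batch-shuffle-mode applies to the keyBy exchange.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);
        env.fromSequence(0, 1_000_000)
                .map(i -> Tuple2.of("key-" + (i % 100), 1L))
                .returns(Types.TUPLE(Types.STRING, Types.LONG))
                .keyBy(t -> t.f0)
                .sum(1)
                .print();
        env.execute("hybrid-shuffle-smoke-test");
    }
}
{code}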

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)