Re: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-01 Thread Jiabao Sun
Thanks Sergey for driving this.

+1 (non-binding)

- Release notes look good
- Tag is present in GitHub
- Validated checksum hash
- Verified signature
- Built the source with Maven using JDK 8, 11, and 17
- Checked that the jar was built by JDK 8
- Reviewed web PR

Best,
Jiabao

On 2024/02/02 06:58:35 Hang Ruan wrote:
> +1 (non-binding)
> 
> - Validated checksum hash
> - Verified signature
> - Verified that no binaries exist in the source archive
> - Built the source with Maven and JDK 8
> - Checked that the jar is built by JDK 8
> 
> Best,
> Hang
> 
> Sergey Nuyanzin wrote on Thu, Feb 1, 2024 at 19:50:
> 
> > Hi everyone,
> > Please review and vote on the release candidate #3 for the version 3.1.2,
> > as follows:
> > [ ] +1, Approve the release
> > [ ] -1, Do not approve the release (please provide specific comments)
> >
> > This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
> >
> > The complete staging area is available for your review, which includes:
> > * JIRA release notes [1],
> > * the official Apache source release to be deployed to dist.apache.org
> > [2],
> > which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> > * all artifacts to be deployed to the Maven Central Repository [4],
> > * source code tag v3.1.2-rc3 [5],
> > * website pull request listing the new release [6].
> >
> > The vote will be open for at least 72 hours. It is adopted by majority
> > approval, with at least 3 PMC affirmative votes.
> >
> > Thanks,
> > Release Manager
> >
> > [1]
> >
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> > [2]
> > https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> > [4]
> > https://repository.apache.org/content/repositories/orgapacheflink-1706/
> > [5] https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> > [6] https://github.com/apache/flink-web/pull/707
> >
> 

Re: [VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-01 Thread Hang Ruan
+1 (non-binding)

- Validated checksum hash
- Verified signature
- Verified that no binaries exist in the source archive
- Built the source with Maven and JDK 8
- Checked that the jar is built by JDK 8

Best,
Hang

Sergey Nuyanzin wrote on Thu, Feb 1, 2024 at 19:50:

> Hi everyone,
> Please review and vote on the release candidate #3 for the version 3.1.2,
> as follows:
> [ ] +1, Approve the release
> [ ] -1, Do not approve the release (please provide specific comments)
>
> This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.
>
> The complete staging area is available for your review, which includes:
> * JIRA release notes [1],
> * the official Apache source release to be deployed to dist.apache.org
> [2],
> which are signed with the key with fingerprint 1596BBF0726835D8 [3],
> * all artifacts to be deployed to the Maven Central Repository [4],
> * source code tag v3.1.2-rc3 [5],
> * website pull request listing the new release [6].
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
> Thanks,
> Release Manager
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
> [2]
> https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> [4]
> https://repository.apache.org/content/repositories/orgapacheflink-1706/
> [5] https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
> [6] https://github.com/apache/flink-web/pull/707
>


[jira] [Created] (FLINK-34338) An exception is thrown when some named params change order when using window tvf

2024-02-01 Thread xuyang (Jira)
xuyang created FLINK-34338:
--

 Summary: An exception is thrown when some named params change 
order when using window tvf
 Key: FLINK-34338
 URL: https://issues.apache.org/jira/browse/FLINK-34338
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / API
Affects Versions: 1.15.0
Reporter: xuyang
 Fix For: 1.19.0


This bug can be reproduced with the following SQL in `WindowTableFunctionTest`:

 
{code:java}
@Test
def test(): Unit = {
  val sql =
"""
  |SELECT *
  |FROM TABLE(TUMBLE(
  |   DATA => TABLE MyTable,
  |   SIZE => INTERVAL '15' MINUTE,
  |   TIMECOL => DESCRIPTOR(rowtime)
  |   ))
  |""".stripMargin
  util.verifyRelPlan(sql)
}{code}
In FLIP-145 and the user docs we can find that `the DATA param must be the first`, but 
it seems that the order of the other named params can't be changed either.
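
For contrast, a hedged sketch of the documented order (TIMECOL before SIZE), which parses fine; the environment setup and the registered MyTable with a rowtime attribute are assumptions:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Assumed: MyTable is already registered and has a rowtime attribute.
TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

// Documented FLIP-145 named-param order: DATA, TIMECOL, SIZE.
tableEnv.executeSql(
    "SELECT * FROM TABLE(TUMBLE("
        + " DATA => TABLE MyTable,"
        + " TIMECOL => DESCRIPTOR(rowtime),"
        + " SIZE => INTERVAL '15' MINUTE))");
{code}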

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34337) Sink.InitContextWrapper should implement metadataConsumer method

2024-02-01 Thread Jiabao Sun (Jira)
Jiabao Sun created FLINK-34337:
--

 Summary: Sink.InitContextWrapper should implement metadataConsumer 
method
 Key: FLINK-34337
 URL: https://issues.apache.org/jira/browse/FLINK-34337
 Project: Flink
  Issue Type: Bug
  Components: API / Core
Affects Versions: 1.19.0
Reporter: Jiabao Sun
 Fix For: 1.19.0


Sink.InitContextWrapper should implement the metadataConsumer method.

If metadataConsumer is not implemented, the metadataConsumer behavior of the 
wrapped WriterInitContext is lost.
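
A self-contained sketch of the missing delegation; the interfaces below are simplified stand-ins mirroring the default metadataConsumer on Sink.InitContext, not the real Flink classes:
{code:java}
import java.util.Optional;
import java.util.function.Consumer;

// Stand-in for the context interface with a defaulted metadataConsumer.
interface Context {
    default <MetaT> Optional<Consumer<MetaT>> metadataConsumer() {
        return Optional.empty();
    }
}

// Stand-in for InitContextWrapper: without the override below, the wrapper
// falls back to the default Optional.empty() and the wrapped context's
// consumer is silently dropped.
class ContextWrapper implements Context {
    private final Context wrapped;

    ContextWrapper(Context wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public <MetaT> Optional<Consumer<MetaT>> metadataConsumer() {
        return wrapped.metadataConsumer();
    }
}
{code}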



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34336) AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may hang sometimes

2024-02-01 Thread Rui Fan (Jira)
Rui Fan created FLINK-34336:
---

 Summary: 
AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may 
hang sometimes
 Key: FLINK-34336
 URL: https://issues.apache.org/jira/browse/FLINK-34336
 Project: Flink
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.19.0
Reporter: Rui Fan
Assignee: Rui Fan
 Fix For: 1.19.0


AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState may 
hang in waitForRunningTasks(restClusterClient, jobID, parallelism2);
h2. Reason:

The job has 2 tasks (vertices). After calling updateJobResourceRequirements, the 
source parallelism is unchanged (it stays at parallelism), while the FlatMapper+Sink 
is changed from parallelism to parallelism2.

So the expected number of running tasks is parallelism + parallelism2 instead of 
parallelism2.
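
Under that reasoning, the wait presumably needs to be (helper and variable names as they appear in the test):
{code:java}
// Wait for the old source subtasks plus the rescaled FlatMapper+Sink subtasks.
waitForRunningTasks(restClusterClient, jobID, parallelism + parallelism2);
{code}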

 
h2. Why does it pass for now?

Flink 1.19 supports scaling cooldown, with a default cooldown time of 30s. This 
means the Flink job rescales 30 seconds after updateJobResourceRequirements is 
called.

 

So the tasks are still running with the old parallelism when we call 
waitForRunningTasks(restClusterClient, jobID, parallelism2);.

IIUC, this cannot be guaranteed, and it's unexpected.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34335) Query hints in RexSubQuery could not be printed

2024-02-01 Thread xuyang (Jira)
xuyang created FLINK-34335:
--

 Summary: Query hints in RexSubQuery could not be printed
 Key: FLINK-34335
 URL: https://issues.apache.org/jira/browse/FLINK-34335
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Reporter: xuyang
 Fix For: 1.19.0


That is because `RelTreeWriterImpl` does not handle `RexSubQuery`, and 
`RexSubQuery` uses `RelOptUtil.toString(rel)` to print itself instead of adding 
extra information such as query hints.
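
A hypothetical illustration (table names and the BROADCAST join hint are assumptions): the hint inside the IN subquery takes effect, but the explained plan does not show it, because the RexSubQuery prints itself via RelOptUtil.toString(rel):
{code:java}
// Assumed: t1, t2, t3 are registered tables on tableEnv.
String plan = tableEnv.explainSql(
    "SELECT * FROM t1 WHERE t1.a IN ("
        + " SELECT /*+ BROADCAST(t2) */ t2.a"
        + " FROM t2 JOIN t3 ON t2.id = t3.id)");
// The BROADCAST hint is missing from the printed subquery plan.
System.out.println(plan);
{code}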



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] FLIP-421: Support Custom Conversion for LogicalTypes

2024-02-01 Thread Becket Qin
Hi folks,

I'd like to kick off the discussion of FLIP-421[1].

The motivation is to support custom conversion between Flink SQL internal
data classes and external classes. The typical scenarios for these
conversions are:
1. in sources / sinks,
2. conversion between Table and DataStream.

I think this FLIP will help improve the user experience in
format development (it makes the implementation of FLIP-358 [2] much
easier), and it also makes the Table-DataStream conversion more usable.
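
For context, a minimal sketch of today's Table <-> DataStream conversion that
this FLIP would make customizable (this is the existing API, not the FLIP-421
one; env and inputStream are assumptions):

    StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
    Table table = tEnv.fromDataStream(inputStream);   // DataStream -> Table
    DataStream<Row> rows = tEnv.toDataStream(table);  // Table -> DataStream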

Comments are welcome!

Thanks,

Jiangjie (Becket) Qin

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-421%3A+Support+Custom+Conversion+for+LogicalTypes
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-358%3A+flink-avro+enhancement+and+cleanup


Security fixes for Flink 1.18 (flink-shaded)

2024-02-01 Thread Hong Liang
Hi all,

Recently, we detected some active CVEs in the flink-shaded-guava and
flink-shaded-zookeeper packages used in Flink 1.18. Since Flink 1.18 is
still supported for security fixes, we should consider fixing this.
However, since the vulnerable packages come from flink-shaded, I wanted
to check whether the community has thoughts on releasing a patch
version of flink-shaded.

Problem:
Flink 1.18 uses guava 31.1-jre from flink-shaded-guava 17.0, which is
affected by CVE-2023-2976 (HIGH) [1] and CVE-2020-8908 (LOW) [2]. Flink
1.18 also uses zookeeper 3.7.1, which is affected by CVE-2023-44981
(CRITICAL) [3].

To fix, I can think of two options:
Option 1:
Upgrade Flink 1.18 to use flink.shaded.version 18.0. This is easiest, as we
can backport the change from Flink 1.19 directly (after the performance
regression is addressed) [4]. However, flink.shaded.version 18.0 also brings
upgrades to jackson, asm, and netty.

Option 2:
Release flink.shaded.version 17.1, with just a bump in zookeeper and guava
versions. Then, upgrade Flink 1.18 to use this new flink.shaded.version
17.1. This is harder, but keeps the changes contained and minimal.

Given that the version bump is in flink-shaded, whose dependencies are
relocated to keep library usage contained within the Flink runtime itself, I
am inclined to go with Option 1, even though the change is slightly larger
than just the security fixes.

Do people have any objections?


Regards,
Hong

[1] https://nvd.nist.gov/vuln/detail/CVE-2023-2976
[2] https://nvd.nist.gov/vuln/detail/CVE-2020-8908
[3] https://nvd.nist.gov/vuln/detail/CVE-2023-44981
[4] https://issues.apache.org/jira/browse/FLINK-33705


Re: [VOTE] Release flink-connector-parent, release candidate #1

2024-02-01 Thread Chesnay Schepler

- checked source/maven pom contents

Please file a ticket to exclude tools/release from the source release.

+1 (binding)

On 29/01/2024 15:59, Maximilian Michels wrote:

- Inspected the source for licenses and corresponding headers
- Checksums and signature OK

+1 (binding)

On Tue, Jan 23, 2024 at 4:08 PM Etienne Chauchot  wrote:

Hi everyone,

Please review and vote on the release candidate #1 for the version
1.1.0, as follows:

[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org
[2], which are signed with the key with fingerprint
D1A76BA19D6294DD0033F6843A019F0B8DD163EA [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v1.1.0-rc1 [5],
* website pull request listing the new release [6]

* confluence wiki: connector parent upgrade to version 1.1.0 that will
be validated after the artifact is released (there is no PR mechanism on
the wiki) [7]

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,

Etienne

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353442
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-parent-1.1.0-rc1
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1698/
[5]
https://github.com/apache/flink-connector-shared-utils/releases/tag/v1.1.0-rc1
[6] https://github.com/apache/flink-web/pull/717

[7]
https://cwiki.apache.org/confluence/display/FLINK/Externalized+Connector+development




[jira] [Created] (FLINK-34334) Add sub-task level RocksDB file count metric

2024-02-01 Thread Jufang He (Jira)
Jufang He created FLINK-34334:
-

 Summary: Add sub-task level RocksDB file count metric
 Key: FLINK-34334
 URL: https://issues.apache.org/jira/browse/FLINK-34334
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / State Backends
Affects Versions: 1.18.0
Reporter: Jufang He
 Attachments: img_v3_027i_7ed0b8ba-3f12-48eb-aab3-cc368ac47cdg.jpg

In our production environment, we encountered task deployment failures. The root 
cause was that too many SST files in a single sub-task led to too much task 
deployment information (OperatorSubtaskState), which then caused Akka request 
timeouts during the task deploy phase. Therefore, I want to add a sub-task level 
RocksDB file count metric, making it convenient to catch performance problems 
caused by too many SST files in time.

RocksDB already provides a JNI API for this 
(https://javadoc.io/doc/org.rocksdb/rocksdbjni/6.20.3/org/rocksdb/RocksDB.html#getColumnFamilyMetaData()); 
we can easily retrieve the file count and report it via the metrics reporter.
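
A hedged sketch of what the reporting could look like (db, columnFamilyHandle, and metricGroup are assumptions; getColumnFamilyMetaData is the RocksJava call linked above):
{code:java}
import org.apache.flink.metrics.Gauge;
import org.rocksdb.ColumnFamilyMetaData;
import org.rocksdb.LevelMetaData;

// Count live SST files across all levels of one column family and expose the
// value as a gauge, recomputed on each metrics report.
metricGroup.gauge("numSstFiles", (Gauge<Long>) () -> {
    ColumnFamilyMetaData meta = db.getColumnFamilyMetaData(columnFamilyHandle);
    long files = 0;
    for (LevelMetaData level : meta.levels()) {
        files += level.files().size();
    }
    return files;
});
{code}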



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[VOTE] Release flink-connector-jdbc, release candidate #3

2024-02-01 Thread Sergey Nuyanzin
Hi everyone,
Please review and vote on the release candidate #3 for the version 3.1.2,
as follows:
[ ] +1, Approve the release
[ ] -1, Do not approve the release (please provide specific comments)

This version is compatible with Flink 1.16.x, 1.17.x and 1.18.x.

The complete staging area is available for your review, which includes:
* JIRA release notes [1],
* the official Apache source release to be deployed to dist.apache.org [2],
which are signed with the key with fingerprint 1596BBF0726835D8 [3],
* all artifacts to be deployed to the Maven Central Repository [4],
* source code tag v3.1.2-rc3 [5],
* website pull request listing the new release [6].

The vote will be open for at least 72 hours. It is adopted by majority
approval, with at least 3 PMC affirmative votes.

Thanks,
Release Manager

[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354088
[2]
https://dist.apache.org/repos/dist/dev/flink/flink-connector-jdbc-3.1.2-rc3
[3] https://dist.apache.org/repos/dist/release/flink/KEYS
[4] https://repository.apache.org/content/repositories/orgapacheflink-1706/
[5] https://github.com/apache/flink-connector-jdbc/releases/tag/v3.1.2-rc3
[6] https://github.com/apache/flink-web/pull/707


[ANNOUNCE] Apache flink-connector-opensearch 1.1.0 released

2024-02-01 Thread Danny Cranmer
The Apache Flink community is very happy to announce the release of Apache
flink-connector-opensearch 1.1.0. This release supports Apache Flink 1.17
and 1.18.

Apache Flink® is an open-source stream processing framework for
distributed, high-performing, always-available, and accurate data streaming
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353141

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Danny


[RESULT] [VOTE] flink-connector-opensearch v1.1.0, release candidate #1

2024-02-01 Thread Danny Cranmer
I'm happy to announce that we have unanimously approved this release.

There are 8 approving votes, 3 of which are binding:
* Andriy Redko
* Ahmed Hamdy
* Ryan Skraba
* Sergey Nuyanzin
* Martijn Visser (binding)
* Danny Cranmer (binding)
* Jiabao Sun
* Leonard Xu (binding)

There are no disapproving votes.

Thanks everyone!


Re: [VOTE] Release flink-connector-opensearch v1.1.0, release candidate #1

2024-02-01 Thread Danny Cranmer
Thanks all, this vote is now closed, I will announce the results on a
separate thread.


On Tue, Jan 30, 2024 at 11:33 AM Leonard Xu  wrote:

> Sorry for the late verification, +1 (binding)
>
> - built from source code succeeded
> - verified signatures
> - verified hashsums
> - checked the contents contains jar and pom files in apache repo
> - checked Github release tag
> - checked release notes
> - reviewed the web PR
>
>
> Best,
> Leonard
>
> > On Jan 13, 2024 at 4:41 PM, Jiabao Sun wrote:
> >
> > +1 (non-binding)
> >
> > - Validated hashes
> > - Verified signature
> > - Verified tags
> > - Verified no binaries in the source archive
> > - Reviewed the web PR and found some conflicts that need to be resolved
> >
> > Best,
> > Jiabao
> >
> >
> >> On Jan 12, 2024 at 23:58, Danny Cranmer wrote:
> >>
> >> Apologies I jumped the gun on this one. We only have 2 binding votes.
> >> Reopening the thread.
> >>
> >> On Fri, Jan 12, 2024 at 3:43 PM Danny Cranmer wrote:
> >>
> >>> Thanks all, this vote is now closed, I will announce the results on a
> >>> separate thread.
> >>>
> >>> Thanks,
> >>> Danny
> >>>
> >>> On Fri, Jan 12, 2024 at 3:43 PM Danny Cranmer wrote:
> >>>
>  +1 (binding)
> 
>  - Verified signatures and checksums
>  - Reviewed release notes
>  - Verified no binaries in the source archive
>  - Source builds using Maven
>  - Reviewed NOTICE files (I suppose the copyright needs to be 2024 now!)
> 
>  Thanks,
>  Danny
> 
> > On Fri, Jan 12, 2024 at 12:56 PM Martijn Visser <martijnvis...@apache.org> wrote:
> 
> > One non-blocking nit: the version for flink.version in the main POM is
> > set to 1.17.1. I think this should be 1.17.0 (since that's the lowest
> > possible Flink version that's supported).
> >
> > +1 (binding)
> >
> > - Validated hashes
> > - Verified signature
> > - Verified that no binaries exist in the source archive
> > - Built the source with Maven
> > - Verified licenses
> > - Verified web PRs
> >
> > On Mon, Jan 1, 2024 at 11:57 AM Danny Cranmer <dannycran...@apache.org> wrote:
> >>
> >> Hey,
> >>
> >> Gordon, apologies for the delay. Yes this is the correct understanding,
> >> all connectors follow a similar pattern.
> >>
> >> Would appreciate some PMC eyes on this release.
> >>
> >> Thanks,
> >> Danny
> >>
> >> On Thu, 23 Nov 2023, 23:28 Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
> >>
> >>> Hi Danny,
> >>>
> >>> Thanks for starting a RC for this.
> >>>
> >>> From the looks of the staged POMs for 1.1.0-1.18, the flink versions for
> >>> Flink dependencies still point to 1.17.1.
> >>>
> >>> My understanding is that this is fine, as those provided scope
> >>> dependencies (e.g. flink-streaming-java) will have their versions
> >>> overwritten by the user POM if they do intend to compile their jobs
> >>> against Flink 1.18.x.
> >>> Can you clarify if this is the correct understanding of how we intend
> >>> the externalized connector artifacts to be published? Related discussion
> >>> on [1].
> >>>
> >>> Thanks,
> >>> Gordon
> >>>
> >>> [1] https://lists.apache.org/thread/x1pyrrrq7o1wv1lcdovhzpo4qhd4tvb4
> >>>
> >>> On Thu, Nov 23, 2023 at 3:14 PM Sergey Nuyanzin <snuyan...@gmail.com> wrote:
> >>>
>  +1 (non-binding)
> 
>  - downloaded artifacts
>  - built from source
>  - verified checksums and signatures
>  - reviewed web pr
> 
> 
>  On Mon, Nov 6, 2023 at 5:31 PM Ryan Skraba wrote:
> 
> > Hello! +1 (non-binding) Thanks for the release!
> >
> > I've validated the source for the RC1:
> > * flink-connector-opensearch-1.1.0-src.tgz at r64995
> > * The sha512 checksum is OK.
> > * The source file is signed correctly.
> > * The signature 0F79F2AFB2351BC29678544591F9C1EC125FD8DB is found in the
> > KEYS file, and on https://keyserver.ubuntu.com/
> > * The source file is consistent with the GitHub tag v1.1.0-rc1, which
> > corresponds to commit 0f659cc65131c9ff7c8c35eb91f5189e80414ea1
> > - The files explicitly excluded by create_pristine_sources (such as
> > .gitignore and the submodule tools/releasing/shared) are not present.
> > * Has a LICENSE file and a NOTICE file
> > * Does not contain any compiled binaries.
> >
> > * The sources can be compiled and unit tests pass with flink.version
> > 1.17.1 and flink.version 1.18.0
> >
> > * Nexus has three staged artifact ids for 1.1.0-1.17 and
> > 1.1.0-1.18
> > - 

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-02-01 Thread Kartoglu, Emre
Hi Rui,

Thanks for the useful feedback and caring about the user experience. 
I will update the FLIP based on one comment. I consider this a minor update.

Please find my detailed responses below. 

"numRecordsInPerSecond sounds make sense to me, and I think
it's necessary to mention it in the FLIP wiki. It will let other developers
to easily understand. WDYT?"

I feel like this might be touching implementation details. No objections 
though; I will update the FLIP with this as one of the ways in which we can 
achieve the proposal.


"After I detailed read the FLIP and Average_absolute_deviation, we know
0% is the best, 100% is worst."

Correct.


"I guess it is difficult for users who have not read the documentation to
know the meaning of 50%. We hope that the designed Data skew will
be easy for users to understand without reading or learning a series
of backgrounds."

I think I understand where you're coming from. My thought is that the user 
won't have to know exactly how the skew percentage/score is calculated, but 
the score will act as a warning sign. Upon seeing a skew score of 80% for an 
operator, as a user I will go and click on the operator and see that many of 
my subtasks are not receiving any data at all. So it acts as a metric that 
draws the user's attention to the skewed operator so they can fix the issue.


"For example, as you mentioned before, flink has a metric:
numRecordsInPerSecond.
I believe users know what numRecordsInPerSecond means even if they
didn't read any documentation."

The FLIP suggests that we will provide an explanation of the data skew score
under the proposed Data Skew tab. I would like the exact wording to be left 
to the code review process, to prevent it from blocking the implementation 
work/progress. This will be a user-friendly explanation, with an option for 
the curious user to see the exact formula.
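
As an aside, a hedged sketch of the underlying statistic only (the exact 
normalization to a 0%-100% score is the FLIP's, not this snippet's):

    // Average absolute deviation over per-subtask record counts.
    static double avgAbsoluteDeviation(long[] recordsPerSubtask) {
        double mean = java.util.Arrays.stream(recordsPerSubtask).average().orElse(0);
        return java.util.Arrays.stream(recordsPerSubtask)
                .mapToDouble(r -> Math.abs(r - mean))
                .average()
                .orElse(0);
    }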


Kind regards,
Emre


On 01/02/2024, 03:26, "Rui Fan" <1996fan...@gmail.com> wrote:


> I was thinking about using the existing numRecordsInPerSecond metric


numRecordsInPerSecond sounds reasonable to me, and I think
it's necessary to mention it in the FLIP wiki. It will let other developers
understand easily. WDYT?


BTW, that's why I asked whether the data skew score means total
received records.


> this would always give you a score higher than 1, with no way to cap the
score.


Yeah, you are right. max/mean is not a score; it's the data skew multiple.
And I guess max/mean is easier to understand than
Average_absolute_deviation.


> I'm more used to working with percentages. The problem with the max/mean
metric is I wouldn't immediately know whether a score of 300 is bad, for
instance.
> Whereas if users saw above 50%, as suggested in the FLIP, they would
consider taking action. I'm tempted to push back on this suggestion. Happy
to discuss further; there is a chance I'm not seeing the downside of the
proposed percentage-based metric yet. Please let me know.


After reading the FLIP and Average_absolute_deviation in detail, we know
0% is the best and 100% is the worst.


I guess it is difficult for users who have not read the documentation to
know the meaning of 50%. We hope that the designed data skew score will
be easy for users to understand without reading or learning a lot of
background.


For example, as you mentioned before, flink has a metric:
numRecordsInPerSecond.
I believe users know what numRecordsInPerSecond means even if they
didn't read any documentation.


Of course, I'm open to it. I may have missed something. I'd like to hear
more feedback from the community.


Best,
Rui


On Thu, Feb 1, 2024 at 4:13 AM Kartoglu, Emre <kar...@amazon.co.uk.invalid>
wrote:


> Hi Rui,
>
> " and provide the total and current score in the detailed tab. I didn't
> see the detailed design in the FLIP, would you mind
> improve the design doc? Thanks".
>
> It will essentially be a basic list view similar to the "Checkpoints" tab.
> I only briefly mentioned this in the FLIP because it will be a basic list
> view.
> No problem though, I will update the FLIP.
>
>
> Please find my responses below quotations.
>
> " 1. About the current skew score, I still don't understand how to get
> the list_of_number_of_records_received_by_each_subtask for
> each subtask.
>
> the list_of_number_of_records_received_by_each_subtask of subtask1
> is
>
> total received records of subtask 1 from beginning to now -
> total received records of subtask 1 from beginning to (now - 1min), right?"
>
> Yes, essentially correct. I was thinking about using the existing
> numRecordsInPerSecond metric (see
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/);
> this would give us per second 

[jira] [Created] (FLINK-34333) Fix FLINK-34007 LeaderElector bug in 1.18

2024-02-01 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34333:
-

 Summary: Fix FLINK-34007 LeaderElector bug in 1.18
 Key: FLINK-34333
 URL: https://issues.apache.org/jira/browse/FLINK-34333
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.1
Reporter: Matthias Pohl


FLINK-34007 revealed a bug in the k8s client v6.6.2, which we have been using 
since Flink 1.18. The issue was fixed with FLINK-34007 for Flink 1.19, which 
required an update of the k8s client to v6.9.0.

This Jira issue is about finding a solution in Flink 1.18 for the very same 
problem FLINK-34007 covered. It's a dedicated Jira issue because we want to 
unblock the release of 1.19 by resolving FLINK-34007.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34332) Investigate the permissions

2024-02-01 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34332:
-

 Summary: Investigate the permissions
 Key: FLINK-34332
 URL: https://issues.apache.org/jira/browse/FLINK-34332
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.18.1, 1.19.0
Reporter: Matthias Pohl


We're currently using {{read-all}} for our workflows. We might want to limit 
the scope and document why certain reads are needed (see [GHA 
docs|https://docs.github.com/en/actions/using-jobs/assigning-permissions-to-jobs]).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34331) Enable Apache INFRA runners for nightly builds

2024-02-01 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34331:
-

 Summary: Enable Apache INFRA runners for nightly builds
 Key: FLINK-34331
 URL: https://issues.apache.org/jira/browse/FLINK-34331
 Project: Flink
  Issue Type: Sub-task
  Components: Build System / CI
Affects Versions: 1.18.1, 1.19.0
Reporter: Matthias Pohl


The nightly CI is currently still using the GitHub-hosted runners. We want to 
switch to Apache INFRA runners or ephemeral runners.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34330) Specify code owners for .github/workflows folder

2024-02-01 Thread Matthias Pohl (Jira)
Matthias Pohl created FLINK-34330:
-

 Summary: Specify code owners for .github/workflows folder
 Key: FLINK-34330
 URL: https://issues.apache.org/jira/browse/FLINK-34330
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.18.1, 1.19.0
Reporter: Matthias Pohl


Currently, the workflow files can be modified by any committer. We have to 
discuss whether we want to limit access to the PMC (or a subset of it) here. 
That might be a means to protect self-hosted runners.

See the [codeowner 
documentation|https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners]
 for further details.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)