[jira] [Created] (FLINK-35664) Flink CDC pipeline transform supports mask function

2024-06-20 Thread melin (Jira)
melin created FLINK-35664:
-

 Summary: Flink CDC pipeline transform supports mask function
 Key: FLINK-35664
 URL: https://issues.apache.org/jira/browse/FLINK-35664
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Reporter: melin
 Fix For: cdc-3.2.0


Flink CDC pipeline transform supports mask function
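No details are given yet. For illustration only, a hypothetical sketch of how such a function might appear in a pipeline transform (the MASK function itself is what is being requested and does not exist; the transform/projection block follows the Flink CDC 3.1 YAML syntax):
{code}
transform:
  - source-table: app_db.users
    # MASK(...) is hypothetical; it would redact the column value in the projection
    projection: id, name, MASK(phone) AS phone
    description: mask a PII column before it reaches the sink
{code}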



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35663) Release Testing: FLIP-436: Introduce Catalog-related Syntax

2024-06-20 Thread Yubin Li (Jira)
Yubin Li created FLINK-35663:


 Summary: Release Testing: FLIP-436: Introduce Catalog-related 
Syntax
 Key: FLINK-35663
 URL: https://issues.apache.org/jira/browse/FLINK-35663
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / API
Affects Versions: 1.20.0
Reporter: Yubin Li


This describes how to verify FLINK-34914 FLIP-436: Introduce Catalog-related 
Syntax.

The verification steps are as follows.
h3. 1. Start the sql client.
{code}
bin/sql-client.sh
{code}
h3. 2. Execute the following DDL statements.
{code:java}
create catalog c1 comment 'comment for ''c1''' with 
('type'='generic_in_memory', 'default-database'='db1');

create catalog if not exists c1 comment 'new' with 
('type'='generic_in_memory'); 

create catalog if not exists c2 with ('type'='generic_in_memory'); 

create catalog c2 with ('type'='generic_in_memory', 'default-database'='db2'); 
{code}
Verify that only the last statement throws an exception, with a message such as 
`Catalog c2 already exists.`
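For scripted verification, a minimal sketch of the same check through the Table API (the class name and assertion style are illustrative, not part of the test plan):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CatalogDdlCheck {
    public static void main(String[] args) {
        TableEnvironment env = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        env.executeSql("create catalog c1 comment 'comment for ''c1''' with ('type'='generic_in_memory', 'default-database'='db1')");
        env.executeSql("create catalog if not exists c1 comment 'new' with ('type'='generic_in_memory')");
        env.executeSql("create catalog if not exists c2 with ('type'='generic_in_memory')");
        try {
            // duplicate name without IF NOT EXISTS: the only statement expected to fail
            env.executeSql("create catalog c2 with ('type'='generic_in_memory', 'default-database'='db2')");
        } catch (Exception e) {
            System.out.println("Expected failure: " + e.getMessage()); // e.g. "Catalog c2 already exists."
        }
    }
}
{code}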
h3. 3. Execute the following statements.
{code:java}
show catalogs;

show create catalog c1;

describe catalog c1;

desc catalog extended c1;

show create catalog c2;

describe catalog c2;

desc catalog extended c2; {code}
Verify that the outputs are the same as the given results.

!image-2024-06-16-09-30-16-983.png|width=671,height=788!
h3. 4. Execute the following DDL statements.
{code:java}
alter catalog c1 reset ('default-database');

alter catalog c1 comment '';

alter catalog c2 set ('default-database'='db2');

alter catalog c2 reset ('type');

alter catalog c2 reset ();

alter catalog c2 comment 'hello catalog ''c2''';{code}
Verify that the fourth statement throws an exception with a message such as 
`ALTER CATALOG RESET does not support changing 'type'`.

Verify that the fifth statement throws an exception with a message such as 
`ALTER CATALOG RESET does not support empty key`.
h3. 5. Execute the following statements.
{code:java}
show create catalog c1;

describe catalog c1;

desc catalog extended c1;

show create catalog c2;

describe catalog c2;

desc catalog extended c2;  {code}
Verify that the outputs are the same as the given results.

!image-2024-06-16-09-47-27-956.png|width=679,height=680!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Flink 1.20 feature freeze

2024-06-20 Thread Ferenc Csaky
Hi Xintong, Weijie,

Thank you both for your answers!

Regarding the CLI deprecation process, I agree that it is tricky
business, hence I opened this discussion. The proposal makes sense
IMO; I will open a discussion thread.

With the present timeline, I think it is reasonable to hold it
back from 1.20, as this is more of a nice-to-have. I just wanted to
make sure what is possible and what kind of policies we need to
apply when we remove a CLI action.

Best,
Ferenc


On Thursday, 20 June 2024 at 04:05, weijie guo  
wrote:

> 
> 
> +1 for Xintong's point.
> 
> Hi Ferenc,
> 
> After discussing this with all the other RM's, we decided not to merge the
> PR you mentioned into 1.20.
> The main reason is that FLIP-464 was voted in three days ago, which is
> already after the feature freeze date. It doesn't make sense to us to put it
> in the 1.20 release cycle.
> 
> 
> Best regards,
> 
> Weijie
> 
> 
> Xintong Song tonysong...@gmail.com wrote on Wednesday, 19 June 2024 at 17:25:
> 
> > Hi Ferenc,
> > 
> > About the deprecation process, removing a @Public API requires the API to
> > be deprecated for at least 2 minor releases AND the removal should be in a
> > major version bump. That means you cannot remove a @Public API in 2.x when
> > x is not 0. The tricky part is, run-application as a command-line interface
> > does not have
> > any @Public/@PublicEvolving/@Experimental/@Deprecated annotations (nor
> > CliFrontend, the class it is defined in). Command-line interfaces are
> > definitely public APIs IMO, but there's no explicit deprecation process for
> > them, which also means we never committed that command-line interfaces will
> > stay compatible.
> > 
> > My suggestion would be to start a separate discussion thread on whether and
> > how to add explicit compatibility guarantees for command-line interfaces.
> > Without that, "legally" we can change command-line interfaces anytime,
> > though I'd be negative about doing so.
> > 
> > As for the PR, I'd be in favor of not merging it for 1.20: the feature
> > freeze date passed half a week ago, the PR is still not reviewed, and the
> > necessity of making it into 1.20 is unclear in the absence of an explicit
> > process.
> > 
> > Best,
> > 
> > Xintong
> > 
> > On Wed, Jun 19, 2024 at 1:33 AM Ferenc Csaky ferenc.cs...@pm.me.invalid
> > wrote:
> > 
> > > Hi Robert, Rui, Ufuk, and Weijie,
> > > 
> > > I would like to raise the PR about merging `flink run` and
> > > `flink run-application` functionality [1] to get considered as
> > > part of the 1.20 release.
> > > 
> > > The reason IMO is that the `run-application` CLI command should be
> > > removed in the same release in which Per-Job mode gets removed. AFAIK
> > > when we deprecate a public API, it has to stay for 2 minor
> > > releases to give time for users to adapt. According to that, if
> > > `run-application` is deprecated in Flink 2.0, it can get removed
> > > in Flink 2.3. Currently the drop of per-job mode is blocked [2]
> > > and probably it will not be resolved for a while, but I could
> > > imagine it would be possible in 2.1 or 2.2.
> > > 
> > > The change itself is rather small and concise, and Marton Balassi
> > > volunteered to review it ASAP.
> > > 
> > > Pls. correct me if I am wrong about the deprecation process.
> > > 
> > > Looking forward to your opinion!
> > > 
> > > Thanks,
> > > Ferenc
> > > 
> > > [1] https://issues.apache.org/jira/browse/FLINK-35625
> > > [2] https://issues.apache.org/jira/browse/FLINK-26000
> > > 
> > > On Tuesday, 18 June 2024 at 11:27, weijie guo wrote:
> > > 
> > > > Hi Zakelly,
> > > > 
> > > > Thank you for informing us!
> > > > 
> > > > After discussion, all RMs agreed that this was an important fix that
> > > > should
> > > > be merged into 1.20.
> > > > 
> > > > So feel free to merge it.
> > > > 
> > > > Best regards,
> > > > 
> > > > Weijie
> > > > 
> > > > Zakelly Lan zakelly@gmail.com wrote on Saturday, 15 June 2024 at 16:29:
> > > > 
> > > > > Hi Robert, Rui, Ufuk and Weijie,
> > > > > 
> > > > > Thanks for the update!
> > > > > 
> > > > > FYI: This PR [1] fixes & cleans up the left-over checkpoint
> > > > > directories for file-merging on TM exit, and the second commit fixes
> > > > > the wrong state handle usage. We encountered several unexpected CI
> > > > > failures, so we missed the feature freeze time. It is better to have
> > > > > this PR in 1.20, so I will merge it if you agree. Thanks.
> > > > > 
> > > > > [1] https://github.com/apache/flink/pull/24933
> > > > > 
> > > > > Best,
> > > > > Zakelly
> > > > > 
> > > > > On Sat, Jun 15, 2024 at 6:00 AM weijie guo guoweijieres...@gmail.com
> > > > > wrote:
> > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > The feature freeze of 1.20 has started now. That means that no new
> > > > > > features or improvements should now be merged into the master
> > > > > > branch unless you ask the release managers first, w

[jira] [Created] (FLINK-35662) Use maven batch mode in k8s-operator CI

2024-06-20 Thread Ferenc Csaky (Jira)
Ferenc Csaky created FLINK-35662:


 Summary: Use maven batch mode in k8s-operator CI
 Key: FLINK-35662
 URL: https://issues.apache.org/jira/browse/FLINK-35662
 Project: Flink
  Issue Type: Improvement
  Components: Kubernetes Operator
Reporter: Ferenc Csaky
 Fix For: kubernetes-operator-1.10.0


Currently, the GitHub workflows do not use batch mode in the k8s-operator repo, 
so there are a lot of lines in the log like this:
{code}
Progress (1): 4.1/14 kB
Progress (1): 8.2/14 kB
Progress (1): 12/14 kB 
Progress (1): 14 kB
{code}
To produce logs that are easier to navigate, all {{mvn}} calls should use the 
batch-mode option {{-B}}.
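For example, a typical call would then look like this (the goals are illustrative):
{code}
# batch mode disables the interactive progress output shown above
mvn -B clean verify
# on Maven 3.6.1+, --no-transfer-progress / -ntp also suppresses transfer logs
{code}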



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


FW: Non Nullable fields within objects

2024-06-20 Thread David Radley
Hi,
I am looking to get the Avro format to support non-nullable fields within 
objects or arrays. It works with Confluent Avro. I notice 
https://issues.apache.org/jira/browse/CALCITE-4085, where it looks like Calcite 
was changed to allow this capability to work with Flink. 
https://github.com/apache/flink/pull/24916/files#diff-c2ed59d02b6bec354790442fd97c7694676eaa4199425353d7ef6cde1304c2e0
 might affect this also.

I have a table definition of

CREATE TABLE source_7 (
  `order_id` STRING,
  `order_time` STRING,
  `buyer` ROW<
    `first_name` STRING,
    `last_name` STRING NOT NULL,
    `title` STRING NOT NULL
  >
) WITH (
  'connector' = 'kafka',
  'topic' = 'vivekavro',
  'properties.bootstrap.servers' = 'localhost:9092',
  'value.format' = 'avro',
  'value.fields-include' = 'ALL',
  'scan.startup.mode' = 'earliest-offset'
);

And an event of shape:

{
  "order_id": "12345",
  "order_time": "1234",
  "buyer": {
    "first_name": "hvcwc",
    "last_name": "hvcwc2",
    "title": "hvcwc3"
  }
}


When I issue a SELECT, it fails to deserialize. Internally, the writer schema has

"name" : "last_name",  "type" : [ "null", "string" ],

"name" : "title",  "type" : [ "null", "string" ],

So it has lost the non-nullability. I would have expected "type" : "string" for
last_name and title.



I have done a little digging. It appears that the issue is in the
createSqlTableConverter.

In the debugger I see:

Buyer has a nullable field followed by 2 non-nullable fields.

The FieldsDataType are all nullable. This looks like it has lost the
nullability hint.

LogicalType does not have the concept of nullable.

fromLogicalTypeToDataType creates a DataType from a LogicalType and results in
the fields being set as nullable.

This looks like it could be the cause of the behaviour we are seeing, or am I
missing something?
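As a way to poke at this, here is a minimal, self-contained sketch (assuming flink-avro's AvroSchemaConverter; it probes the logical-type-to-Avro conversion, which may not be the exact code path above) that checks whether NOT NULL row fields survive:

import java.util.Arrays;
import org.apache.avro.Schema;
import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter;
import org.apache.flink.table.types.logical.RowType;
import org.apache.flink.table.types.logical.VarCharType;

public class NullabilityCheck {
    public static void main(String[] args) {
        // buyer ROW<first_name STRING, last_name STRING NOT NULL, title STRING NOT NULL>
        RowType buyer = new RowType(Arrays.asList(
                new RowType.RowField("first_name", new VarCharType(true, VarCharType.MAX_LENGTH)),
                new RowType.RowField("last_name", new VarCharType(false, VarCharType.MAX_LENGTH)),
                new RowType.RowField("title", new VarCharType(false, VarCharType.MAX_LENGTH))));
        Schema avro = AvroSchemaConverter.convertToSchema(buyer);
        // If nullability is preserved, last_name/title should be a plain "string",
        // not a ["null", "string"] union.
        System.out.println(avro.toString(true));
    }
}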



WDYT?



Kind regards, David.

Unless otherwise stated above:

IBM United Kingdom Limited
Registered in England and Wales with number 741598
Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6 3AU


Re: [DISCUSS] Support to group rows by column ordinals

2024-06-20 Thread Sergey Nuyanzin
Hey Jeyhun,

Thanks for starting the discussion.
Could you please elaborate more on the use cases of this feature?

The one that I see in FLINK-34366 [1] is to simplify referencing
aliases in SELECT from GROUP BY (and potentially ORDER BY and HAVING).
I wonder whether there are other use cases where
support for addressing by ordinals is required.

I'm asking since SqlConformance in Calcite, and as a result
FlinkSqlConformance in Flink, gives the ability to reference aliases
from SELECT by enabling it, e.g. [2], where the javadoc says:
>   * Whether to allow aliases from the {@code SELECT} clause to be used as
>   * column names in the {@code GROUP BY} clause.

Given the fact that was already mentioned by Timo in the PR:
> In some DBMSs it is common to write GROUP BY 1 or ORDER BY 1 for global
> aggregation/sorting.
IMHO, referencing by ordinals might be error-prone if someone adds
more columns to SELECT and forgets to update the ordinals, as the
example below illustrates.
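A small hypothetical example (table and column names are made up; this
only illustrates the hazard under discussion, not current Flink behavior):

-- ordinal style: groups by the 1st SELECT item, i.e. region
SELECT region, SUM(amount) FROM orders GROUP BY 1;
-- after someone prepends a column, the same ordinal silently
-- groups by order_id instead
SELECT order_id, region, SUM(amount) FROM orders GROUP BY 1;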

Would it make sense to consider enabling references by aliases as
another option here? Or did I miss anything?

[1] https://issues.apache.org/jira/browse/FLINK-34366
[2] 
https://github.com/apache/calcite/blob/c0a53f6b17daaca9d057e70d7fae0a0e9c2cd02a/core/src/main/java/org/apache/calcite/sql/validate/SqlConformance.java#L92-L103

On Thu, Jun 20, 2024 at 12:12 PM Muhammet Orazov
 wrote:
>
> Hey Jeyhun,
>
> Thanks for bringing it up! +-1 from my side.
>
> Personally, I find this feature confusing, it feels always natural to
> use
> column names. SQL power users will ask for it, I have seen it used in
> automated complex queries also.
>
> But it seems counterintuitive to enable flag for this feature. Enabling
> it, should not disable grouping/ordering by the column names?
>
> Best,
> Muhammet
>
>
> On 2024-06-17 20:30, Jeyhun Karimov wrote:
> > Hi devs,
> >
> > I am moving our discussion on the PR thread [1] to the dev mailing list
> > to
> > close the loop on the related issue [2]. The end goal of the PR is to
> > support grouping/ordering by via column ordinals. The target
> > implementation
> > (currently there is no flag) should support a flag, so that a user can
> > also
> > use the old behavior as suggested by @Timo.
> >
> > Some vendors such as Postgres [3], SQLite [4], MySQL/MariaDB [5],
> > Oracle
> > [6], Spark [7], and BigQuery[8] support group/order by clauses with
> > column
> > ordinals.
> >
> > Obviously, supporting this clause might lead to less readable and
> > maintainable SQL code. This might also cause a bit of complications
> > both on
> > the codebase and on the user-experience side. On the other hand, we
> > already
> > see that numerous vendors support this feature out of the box, because
> > there was/is a need for this feature.
> >
> > That is why, I would like to discuss and hear your opinions about
> > introducing/abandoning this feature.
> >
> > Regards,
> > Jeyhun
> >
> > [1] https://github.com/apache/flink/pull/24270
> > [2] https://issues.apache.org/jira/browse/FLINK-34366
> > [3] https://www.postgresql.org/docs/6.5/sql-select.htm
> > [4] https://www.sqlite.org/lang_select.html
> > [5] https://www.db-fiddle.com/f/uTrfRrNs4uXLr4Q9j2piCk/1
> > [6]
> > https://oracle-base.com/articles/23/group-by-and-having-clause-using-column-alias-or-column-position-23
> > [7]
> > https://github.com/apache/spark/commit/90613df652d45e121ab2b3a5bbb3b63cb15d297a
> > [8]
> > https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#group_by_col_ordinals



-- 
Best regards,
Sergey


[jira] [Created] (FLINK-35661) MiniBatchGroupAggFunction can silently drop records under certain conditions

2024-06-20 Thread Ivan Burmistrov (Jira)
Ivan Burmistrov created FLINK-35661:
---

 Summary: MiniBatchGroupAggFunction can silently drop records under 
certain conditions
 Key: FLINK-35661
 URL: https://issues.apache.org/jira/browse/FLINK-35661
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Runtime
Reporter: Ivan Burmistrov
 Attachments: image-2024-06-20-10-46-51-347.png, 
image-2024-06-20-10-50-20-867.png, image-2024-06-20-11-05-53-253.png

h2. The story / Symptoms

One day we slightly changed our Flink job, which uses Flink SQL, by adding a 
couple of UDF-based aggregations (what these aggregations do is not important), 
and surprisingly the job started working incorrectly: it produced wrong results 
for some aggregation keys, or did not produce some keys at all.

The symptoms were really weird. For instance, the read/write access rate to 
accState (the internal state used by Table SQL for group-by aggregations) 
dropped sharply. The screenshot compares the read rate to this state with the 
same chart one day earlier; they should behave the same, yet there is a big 
difference after the change. The write rate showed a similar picture.

!image-2024-06-20-10-50-20-867.png|width=770,height=338!

Another interesting observation was that the GroupAggregate operator (the one 
from Table SQL responsible for group-by aggregation) behaved strangely: the 
number of "records out" was disproportionately smaller than the number of 
"records in". By itself this doesn't mean anything, but combined with our other 
observations about the job producing wrong results, it seemed suspicious.

!image-2024-06-20-11-05-53-253.png|width=767,height=316!
h2. Digging deeper

After reverting the change, things got back to normal, and we concluded that 
adding the new UDF-based aggregations had caused the issue. Then we realized 
that we had accidentally forgotten to implement the *merge* method in our UDF, 
which caused the planner to fall back to ONE_PHASE aggregation instead of 
TWO_PHASE. After fixing the mistake and implementing *merge*, _things got back 
to normal._

Moreover, we realized that the UDF actually has nothing to do with the issue 
(except for causing the ONE_PHASE fallback). So we reverted all the changes and 
tested the job in ONE_PHASE mode: *the issue occurred in that mode as well.*

So, summarizing: *when the job has mini-batch enabled, ONE_PHASE aggregation 
works incorrectly.*
h2. The bug

It was clear that the issue had something to do with 
[MiniBatchGroupAggFunction|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/aggregate/MiniBatchGroupAggFunction.java#L57]
 because this is what distinguishes ONE_PHASE from TWO_PHASE mode.

 

After reading the code, we found this interesting fragment:
{code:java}
    @Override
    public void finishBundle(Map<RowData, List<RowData>> buffer, Collector<RowData> out)
            throws Exception {
        for (Map.Entry<RowData, List<RowData>> entry : buffer.entrySet()) {
            RowData currentKey = entry.getKey();
            List<RowData> inputRows = entry.getValue();
            boolean firstRow = false;
            // step 1: get the accumulator for the current key
            // set current key to access state under the key
            ctx.setCurrentKey(currentKey);
            RowData acc = accState.value();
            if (acc == null) {
                // Don't create a new accumulator for a retraction message. This
                // might happen if the retraction message is the first message for the
                // key or after a state clean up.
                Iterator<RowData> inputIter = inputRows.iterator();
                while (inputIter.hasNext()) {
                    RowData current = inputIter.next();
                    if (isRetractMsg(current)) {
                        inputIter.remove(); // remove all the beginning retraction messages
                    } else {
                        break;
                    }
                }
                if (inputRows.isEmpty()) {
                    return; //  <--- this is bad
                }
                acc = function.createAccumulators();
                firstRow = true;
            }
            // ...
        }
    }
{code}

In this code we iterate over the whole bundle key by key, and at some point do 
this:

{code:java}
if (inputRows.isEmpty()) {
    return;
}
{code}

Obviously, what was meant here is continue (i.e., finish with the current key 
and move on to the next), not a full stop.
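A minimal sketch of the apparent fix, just replacing the early return with a continue over the key loop (no other changes assumed):
{code:java}
if (inputRows.isEmpty()) {
    // all buffered messages for this key were leading retractions and no
    // accumulator exists: skip this key instead of aborting the whole bundle
    continue;
}
{code}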
This line is reached when the bundle contains a key that has only retraction 
messages - in this case the code below would result in inputRows being empty:

 
{code:java}
while (inputIter.hasNext()) {
    RowData current = inputIter.next();
    if (isRetractMsg(current)) {
        inputIter.remove(); // remove all the beginning retraction messages
    } else {
        break;
    }
}
{code}

Re: [DISCUSS] Support to group rows by column ordinals

2024-06-20 Thread Muhammet Orazov

Hey Jeyhun,

Thanks for bringing it up! +-1 from my side.

Personally, I find this feature confusing; it always feels more natural to
use column names. SQL power users will ask for it, though; I have seen it
used in automated complex queries as well.

But it seems counterintuitive to enable a flag for this feature. Enabling
it should not disable grouping/ordering by the column names, should it?

Best,
Muhammet


On 2024-06-17 20:30, Jeyhun Karimov wrote:

Hi devs,

I am moving our discussion on the PR thread [1] to the dev mailing list
to close the loop on the related issue [2]. The end goal of the PR is to
support grouping/ordering by via column ordinals. The target
implementation (currently there is no flag) should support a flag, so
that a user can also use the old behavior as suggested by @Timo.

Some vendors such as Postgres [3], SQLite [4], MySQL/MariaDB [5],
Oracle [6], Spark [7], and BigQuery [8] support group/order by clauses
with column ordinals.

Obviously, supporting this clause might lead to less readable and
maintainable SQL code. This might also cause a bit of complications
both on the codebase and on the user-experience side. On the other
hand, we already see that numerous vendors support this feature out of
the box, because there was/is a need for this feature.

That is why, I would like to discuss and hear your opinions about
introducing/abandoning this feature.

Regards,
Jeyhun

[1] https://github.com/apache/flink/pull/24270
[2] https://issues.apache.org/jira/browse/FLINK-34366
[3] https://www.postgresql.org/docs/6.5/sql-select.htm
[4] https://www.sqlite.org/lang_select.html
[5] https://www.db-fiddle.com/f/uTrfRrNs4uXLr4Q9j2piCk/1
[6] https://oracle-base.com/articles/23/group-by-and-having-clause-using-column-alias-or-column-position-23
[7] https://github.com/apache/spark/commit/90613df652d45e121ab2b3a5bbb3b63cb15d297a
[8] https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#group_by_col_ordinals


Re: [ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-20 Thread Muhammet Orazov

Great, thanks Qingsheng for your efforts!

Best,
Muhammet

On 2024-06-18 15:50, Qingsheng Ren wrote:
The Apache Flink community is very happy to announce the release of
Apache Flink CDC 3.1.1.

Apache Flink CDC is a distributed data integration tool for real time
data and batch data, bringing the simplicity and elegance of data
integration via YAML to describe the data movement and transformation
in a data pipeline.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/2024/06/18/apache-flink-cdc-3.1.1-release-announcement/

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink CDC can be found at:
https://search.maven.org/search?q=g:org.apache.flink%20cdc

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354763

We would like to thank all contributors of the Apache Flink community
who made this release possible!

Regards,
Qingsheng Ren


[jira] [Created] (FLINK-35660) Implement parent-child ordering for DDB Streams source

2024-06-20 Thread Aleksandr Pilipenko (Jira)
Aleksandr Pilipenko created FLINK-35660:
---

 Summary: Implement parent-child ordering for DDB Streams source
 Key: FLINK-35660
 URL: https://issues.apache.org/jira/browse/FLINK-35660
 Project: Flink
  Issue Type: Sub-task
  Components: Connectors / DynamoDB
Reporter: Aleksandr Pilipenko


Implement support for parent-child ordering in DDB Streams source



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-20 Thread Matthias Pohl
Thanks for participating everyone. I finalized this vote and posted the
results in a separate thread [1].

Best,
Matthias

[1] https://lists.apache.org/thread/1yb46n9rpvl6htdz57kw2whym4qnjptq

On Tue, Jun 18, 2024 at 8:47 PM Thomas Weise  wrote:

> +1 (binding)
>
>
> On Tue, Jun 18, 2024 at 11:38 AM Gabor Somogyi 
> wrote:
>
> > +1 (binding)
> >
> > G
> >
> >
> > On Mon, Jun 17, 2024 at 10:24 AM Matthias Pohl 
> wrote:
> >
> > > Hi everyone,
> > > the discussion in [1] about FLIP-461 [2] is kind of concluded. I am
> > > starting a vote on this one here.
> > >
> > > The vote will be open for at least 72 hours (i.e. until June 20, 2024;
> > > 8:30am UTC) unless there are any objections. The FLIP will be
> considered
> > > accepted if 3 binding votes (from active committers according to the
> > Flink
> > > bylaws [3]) are gathered by the community.
> > >
> > > Best,
> > > Matthias
> > >
> > > [1] https://lists.apache.org/thread/nnkonmsv8xlk0go2sgtwnphkhrr5oc3y
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler
> > > [3]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Bylaws#FlinkBylaws-Approvals
> > >
> >
>


[RESULT][VOTE] FLIP-461: Synchronize rescaling with checkpoint creation to minimize reprocessing for the AdaptiveScheduler

2024-06-20 Thread Matthias Pohl
Hi everyone,
the vote [1] for FLIP-461 [2] is over. The number of required binding votes
(3) was reached (total: 10, binding: 7, non-binding: 3). No objections were
raised.

- David Morávek (binding)
- Rui Fan (binding)
- Zakelly Lan (binding)
- Gyula Fóra (binding)
- Weijie Guo (binding)
- Gabor Somogyi (binding)
- Thomas Weise (binding)

- Zhanghao Chen (non-binding)
- Zdenek Tison (non-binding)
- Ferenc Csaky (non-binding)

I will go ahead and prepare the PRs for FLINK-35549 [3].

Best,
Matthias

[1] https://lists.apache.org/thread/d6cdlq1h9h6w51wb2jsvsw70s42n76qh
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-461%3A+Synchronize+rescaling+with+checkpoint+creation+to+minimize+reprocessing+for+the+AdaptiveScheduler
[3] https://issues.apache.org/jira/browse/FLINK-35549


[jira] [Created] (FLINK-35659) Fix TestFileSystem Connector error setting execution mode

2024-06-20 Thread dalongliu (Jira)
dalongliu created FLINK-35659:
-

 Summary: Fix TestFileSystem Connector error setting execution mode 
 Key: FLINK-35659
 URL: https://issues.apache.org/jira/browse/FLINK-35659
 Project: Flink
  Issue Type: Sub-task
  Components: Tests
Affects Versions: 1.20.0
Reporter: dalongliu
 Fix For: 1.20.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35658) Hybrid shuffle external tier can not work with UnknownInputChannel

2024-06-20 Thread Yuxin Tan (Jira)
Yuxin Tan created FLINK-35658:
-

 Summary: Hybrid shuffle external tier can not work with 
UnknownInputChannel
 Key: FLINK-35658
 URL: https://issues.apache.org/jira/browse/FLINK-35658
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.20.0
Reporter: Yuxin Tan
Assignee: Yuxin Tan


Currently, hybrid shuffle cannot work with UnknownInputChannel; we should 
fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35657) Flink UI not support show float metric value

2024-06-20 Thread Can Luo (Jira)
Can Luo created FLINK-35657:
---

 Summary: Flink UI not support show float metric value
 Key: FLINK-35657
 URL: https://issues.apache.org/jira/browse/FLINK-35657
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Reporter: Can Luo


The Flink UI shows float metric values as int/long. For example, 
`outPoolUsage`/`inPoolUsage` always display as 0 or 1 in the UI, which is of 
little help for task fine-tuning.

(attached screenshot: the Web UI rendering outPoolUsage/inPoolUsage as 0/1 integers)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35656) Hive Source has issues setting max parallelism in dynamic inference mode

2024-06-20 Thread xingbe (Jira)
xingbe created FLINK-35656:
--

 Summary: Hive Source has issues setting max parallelism in dynamic 
inference mode
 Key: FLINK-35656
 URL: https://issues.apache.org/jira/browse/FLINK-35656
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Affects Versions: 1.20.0
Reporter: xingbe
 Fix For: 1.20.0


In the dynamic parallelism inference mode of Hive Source, when 
`table.exec.hive.infer-source-parallelism.max` is not configured, it does not 
use `execution.batch.adaptive.auto-parallelism.default-source-parallelism` as 
the upper bound for parallelism inference, which is inconsistent with the 
behavior described in the documentation.
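For illustration, a sketch of the documented expectation (the parallelism value is made up): with the Hive-specific cap unset, the adaptive batch option should act as the upper bound for the inferred parallelism.
{code}
# table.exec.hive.infer-source-parallelism.max is not set
# documented expectation: the dynamically inferred source parallelism is capped by
execution.batch.adaptive.auto-parallelism.default-source-parallelism: 128
{code}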



--
This message was sent by Atlassian Jira
(v8.20.10#820010)