[GitHub] [flink] TanYuxin-tyx commented on pull request #22443: [FLINK-31878][connectors] Fix the wrong name of PauseOrResumeSplitsTask#toString

2023-04-21 Thread via GitHub


TanYuxin-tyx commented on PR #22443:
URL: https://github.com/apache/flink/pull/22443#issuecomment-1518498947

   @reswqa @MartijnVisser @liuyongvs Thanks for reviewing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-web] tzulitai closed pull request #632: Add Flink Kafka Connector v3.0.0

2023-04-21 Thread via GitHub


tzulitai closed pull request #632: Add Flink Kafka Connector v3.0.0
URL: https://github.com/apache/flink-web/pull/632





[GitHub] [flink-web] tzulitai closed pull request #606: Flink Kafka Connector 3.0.0

2023-04-21 Thread via GitHub


tzulitai closed pull request #606: Flink Kafka Connector 3.0.0
URL: https://github.com/apache/flink-web/pull/606





[GitHub] [flink] VinayLondhe14 commented on pull request #18730: [FLINK-25495] The deployment of application-mode support client attach

2023-04-21 Thread via GitHub


VinayLondhe14 commented on PR #18730:
URL: https://github.com/apache/flink/pull/18730#issuecomment-1518395452

   Is there a particular reason this PR has not been merged?





[GitHub] [flink] flinkbot commented on pull request #22451: FLINK-31880: Fixed broken date test. It was failing when run in CST time zone.

2023-04-21 Thread via GitHub


flinkbot commented on PR #22451:
URL: https://github.com/apache/flink/pull/22451#issuecomment-1518370390

   
   ## CI report:
   
   * 639c2246b369dfa275e193207fd2cb7e74bb7870 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-31880) Bad Test in OrcColumnarRowSplitReaderTest

2023-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31880:
---
Labels: pull-request-available  (was: )

> Bad Test in OrcColumnarRowSplitReaderTest
> -
>
> Key: FLINK-31880
> URL: https://issues.apache.org/jira/browse/FLINK-31880
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ORC, Formats (JSON, Avro, Parquet, ORC, 
> SequenceFile)
>Reporter: Kurt Ostfeld
>Priority: Minor
>  Labels: pull-request-available
>
> This is a development issue with what looks like a buggy unit test.
>  
> I tried to build Flink with a clean copy of the repository and I get:
>  
> ```
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR] OrcColumnarRowSplitReaderTest.testReadFileWithTypes:365
> expected: "1969-12-31"
> but was: "1970-01-01"
> [INFO]
> [ERROR] Tests run: 26, Failures: 1, Errors: 0, Skipped: 0
> ```
>  
> I see the test is testing Date data types with `new Date(562423)`, which is 9 
> minutes and 22 seconds after the epoch (1970-01-01 UTC). When I run that on my 
> laptop in the CST time zone, I get `Wed Dec 31 18:09:22 CST 1969`.
>  
> I have a simple pull request ready that fixes this issue by using the Java 8 
> LocalDate API instead, which avoids time zones entirely.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] kurtostfeld opened a new pull request, #22451: FLINK-31880: Fixed broken date test. It was failing when run in CST time zone.

2023-04-21 Thread via GitHub


kurtostfeld opened a new pull request, #22451:
URL: https://github.com/apache/flink/pull/22451

   ## What is the purpose of the change
   
   Fix the broken unit test so that unit tests pass.
   
   ## Brief change log
   
   Very simple self-explanatory single file fix.
   
   ## Verifying this change
   
   verified.
   
   ## Documentation
   
   None required.





[jira] [Created] (FLINK-31880) Bad Test in OrcColumnarRowSplitReaderTest

2023-04-21 Thread Kurt Ostfeld (Jira)
Kurt Ostfeld created FLINK-31880:


 Summary: Bad Test in OrcColumnarRowSplitReaderTest
 Key: FLINK-31880
 URL: https://issues.apache.org/jira/browse/FLINK-31880
 Project: Flink
  Issue Type: Bug
  Components: Connectors / ORC, Formats (JSON, Avro, Parquet, ORC, 
SequenceFile)
Reporter: Kurt Ostfeld


This is a development issue with what looks like a buggy unit test.
 
I tried to build Flink with a clean copy of the repository and I get:
 
```
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] OrcColumnarRowSplitReaderTest.testReadFileWithTypes:365
expected: "1969-12-31"
but was: "1970-01-01"
[INFO]
[ERROR] Tests run: 26, Failures: 1, Errors: 0, Skipped: 0
```
 
I see the test is testing Date data types with `new Date(562423)`, which is 9 
minutes and 22 seconds after the epoch (1970-01-01 UTC). When I run that on my 
laptop in the CST time zone, I get `Wed Dec 31 18:09:22 CST 1969`.
 
I have a simple pull request ready that fixes this issue by using the Java 8 
LocalDate API instead, which avoids time zones entirely.
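A minimal, hand-written sketch (not the actual Flink test code) of why a `java.util.Date` constructed near the epoch renders as a different calendar date depending on the JVM's default time zone, while `java.time.LocalDate` does not:

```java
import java.time.LocalDate;
import java.util.Date;
import java.util.TimeZone;

public class EpochDateSketch {
    public static void main(String[] args) {
        // 562423 ms after the epoch, i.e. 1970-01-01T00:09:22.423Z.
        Date date = new Date(562423);

        // Rendering depends on the default time zone: 1970-01-01 in UTC,
        // but Wed Dec 31 18:09:22 CST 1969 in US Central time.
        TimeZone.setDefault(TimeZone.getTimeZone("UTC"));
        System.out.println(date);
        TimeZone.setDefault(TimeZone.getTimeZone("America/Chicago"));
        System.out.println(date);

        // LocalDate carries no time zone, so a fixed day is stable on any machine.
        LocalDate day = LocalDate.of(1969, 12, 31);
        System.out.println(day); // always 1969-12-31
    }
}
```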





[jira] [Commented] (FLINK-31828) List field in a POJO data stream results in table program compilation failure

2023-04-21 Thread Vladimir Matveev (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715160#comment-17715160
 ] 

Vladimir Matveev commented on FLINK-31828:
--

Hi [~aitozi], thank you for the quick response! Yes, I had the EOF exception 
when trying to print the table with RAW columns. Great to know that it will be 
fixed :)

> List field in a POJO data stream results in table program compilation failure
> -
>
> Key: FLINK-31828
> URL: https://issues.apache.org/jira/browse/FLINK-31828
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.1
> Environment: Java 11
> Flink 1.16.1
>Reporter: Vladimir Matveev
>Priority: Major
> Attachments: MainPojo.java, generated-code.txt, stacktrace.txt
>
>
> Suppose I have a POJO class like this:
> {code:java}
> public class Example {
> private String key;
> private List> values;
> // getters, setters, equals+hashCode omitted
> }
> {code}
> When a DataStream with this class is converted to a table, and some 
> operations are performed on it, it results in an exception which explicitly 
> says that I should file a ticket:
> {noformat}
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
> {noformat}
> Please find the example Java code and the full stack trace attached.
> From the exception and generated code it seems that Flink is upset with the 
> list field being treated as an array - but I cannot have an array type there 
> in the real code.
> Also note that if I _don't_ specify the schema explicitly, it then maps the 
> {{values}} field to a `RAW('java.util.List', '...')` type, which also does 
> not work correctly and fails the job in case of even simplest operations like 
> printing.
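For reference, a minimal, self-contained sketch of the reported pattern. The list's element type and the pipeline wiring are assumptions for illustration only (the original type parameter is not shown above); only the overall shape mirrors the attached MainPojo.java.

```java
import java.util.List;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class ListFieldRepro {

    /** Minimal POJO mirroring the reported Example class. */
    public static class Example {
        private String key;
        private List<String> values; // element type assumed for this sketch

        public String getKey() { return key; }
        public void setKey(String key) { this.key = key; }
        public List<String> getValues() { return values; }
        public void setValues(List<String> values) { this.values = values; }
    }

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Without an explicit schema, the List field is reportedly mapped to
        // RAW('java.util.List', '...'), and even simple operations such as
        // printing the table fail.
        Table t = tEnv.fromDataStream(env.fromElements(new Example()));
        t.execute().print();
    }
}
```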





[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1174080585


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   > We would need to get a reference to the owner of 
`DefaultLeaderElectionService`, wouldn't we?
   
   Ok, I was not thinking straight when writing the previous comment. We have 
the `LeaderContender`'s reference. 🤦 I'm gonna give it some thought.






[jira] [Comment Edited] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Shuiqiang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715115#comment-17715115
 ] 

Shuiqiang Chen edited comment on FLINK-31848 at 4/21/23 6:32 PM:
-

[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that left.resultTerm in left.code is always 
assigned a boolean value and will never cause a syntax error when used as 
`if (!${left.resultTerm})`. Therefore I agree with [~zju_zsx] that the logic can 
be safely simplified. That said, with `if (!${left.nullTerm} && 
!${left.resultTerm})` it is a little more intuitive that ${left.nullTerm} 
indicates whether the left result is UNKNOWN in three-valued logic.


was (Author: csq):
[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling `if (!${left.resultTerm})`. Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified. While with `if (!${left.nullTerm} && 
!${left.resultTerm})`, it would be a little bit more intuitive that 
${left.nullTerm} indicates whether left result is UNKNOWN or not in three value 
logic.

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf requires that testid is not null, but the `testid is not null and 
> testid > 0` filter does not take effect.
>  
> In ScalarOperatorGens.generateAnd:
> !image-2023-04-19-14-54-46-458.png!
> If left.nullTerm is true, the right code will still be executed.
> It seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with
> {code:java}
> if (!${left.resultTerm}){code}
> ?
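For readers following along, here is a hand-written sketch (not actual ScalarOperatorGens output) of the three-valued AND pattern being debated. `leftNull`/`leftResult` stand in for ${left.nullTerm}/${left.resultTerm}, and the supplier stands in for the generated right-operand code (e.g. the my_udf call); it shows why the right side still runs when the left operand is UNKNOWN.

```java
import java.util.function.Supplier;

public class ThreeValuedAndSketch {

    /** A null return value models SQL UNKNOWN. */
    static Boolean threeValuedAnd(
            boolean leftNull, boolean leftResult, Supplier<Boolean> rightOperand) {
        if (!leftNull && !leftResult) {
            // Left is definitely FALSE: short-circuit, the right operand never runs.
            return Boolean.FALSE;
        }
        // Left is TRUE or UNKNOWN: the right operand code is executed here, which is
        // why a UDF on the right can still be invoked although "testid is not null"
        // evaluated to UNKNOWN.
        Boolean right = rightOperand.get();
        if (right == null || (leftNull && right)) {
            return null; // UNKNOWN
        }
        return leftNull ? Boolean.FALSE : right;
    }

    public static void main(String[] args) {
        // Left operand is UNKNOWN (e.g. testid IS NULL): the right side still runs.
        Boolean result =
                threeValuedAnd(
                        true,
                        false,
                        () -> {
                            System.out.println("right operand (my_udf) evaluated");
                            return true;
                        });
        System.out.println("result = " + result); // prints null, i.e. UNKNOWN
    }
}
```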





[jira] [Comment Edited] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Shuiqiang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715115#comment-17715115
 ] 

Shuiqiang Chen edited comment on FLINK-31848 at 4/21/23 6:31 PM:
-

[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling `if (!${left.resultTerm})`. Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified. While with `if (!${left.nullTerm} && 
!${left.resultTerm})`, it would be a little bit more intuitive that 
${left.nullTerm} indicates whether left result is UNKNOWN or not in three value 
logic.


was (Author: csq):
[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling _if (!${left.resultTerm})_ . Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified. While with _if (!${left.nullTerm} && 
!${left.resultTerm})_ in current code base, it would be a little bit more 
intuitive that ${left.nullTerm} indicating whether left result is UNKNOWN or 
not.

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf requires that testid is not null, but the `testid is not null and 
> testid > 0` filter does not take effect.
>  
> In ScalarOperatorGens.generateAnd:
> !image-2023-04-19-14-54-46-458.png!
> If left.nullTerm is true, the right code will still be executed.
> It seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with
> {code:java}
> if (!${left.resultTerm}){code}
> ?





[jira] [Comment Edited] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Shuiqiang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715115#comment-17715115
 ] 

Shuiqiang Chen edited comment on FLINK-31848 at 4/21/23 6:29 PM:
-

[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling _if (!${left.resultTerm})_ . Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified. While with _if (!${left.nullTerm} && 
!${left.resultTerm})_ in current code base, it would be a little bit more 
intuitive that ${left.nullTerm} indicating whether left result is UNKNOWN or 
not.


was (Author: csq):
[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling if (!${left.resultTerm}) . Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified.

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf requires that testid is not null, but the `testid is not null and 
> testid > 0` filter does not take effect.
>  
> In ScalarOperatorGens.generateAnd:
> !image-2023-04-19-14-54-46-458.png!
> If left.nullTerm is true, the right code will still be executed.
> It seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with
> {code:java}
> if (!${left.resultTerm}){code}
> ?





[jira] [Comment Edited] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Shuiqiang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715115#comment-17715115
 ] 

Shuiqiang Chen edited comment on FLINK-31848 at 4/21/23 6:18 PM:
-

[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After running several 
test cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned with a boolean value and will never cause a syntax error 
by calling if (!${left.resultTerm}) . Therefore I agree with [~zju_zsx] that 
the logic can be safely simplified.


was (Author: csq):
[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After ran several test 
cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned a boolean value and will never cause a syntax error by 
calling if (!${left.resultTerm}) . Therefore I agree with [~zju_zsx] that the 
logic can be safely simplified. 

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf requires that testid is not null, but the `testid is not null and 
> testid > 0` filter does not take effect.
>  
> In ScalarOperatorGens.generateAnd:
> !image-2023-04-19-14-54-46-458.png!
> If left.nullTerm is true, the right code will still be executed.
> It seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with
> {code:java}
> if (!${left.resultTerm}){code}
> ?





[jira] [Created] (FLINK-31879) org.apache.avro.util.Utf8 cannot be serialized with avro when used in state

2023-04-21 Thread Feroze Daud (Jira)
Feroze Daud created FLINK-31879:
---

 Summary: org.apache.avro.util.Utf8 cannot be serialized with avro 
when used in state 
 Key: FLINK-31879
 URL: https://issues.apache.org/jira/browse/FLINK-31879
 Project: Flink
  Issue Type: Bug
  Components: API / Type Serialization System
Reporter: Feroze Daud


Scenario:

Write a Flink app that reads Avro messages from a Kafka topic.

The Avro POJOs are generated with the _org.apache.avro.util.Utf8_ type instead of 
_java.lang.String_.

When this happens, Flink logs an error message as follows:
{noformat}
Class class org.apache.avro.util.Utf8 cannot be used as a POJO type because not 
all fields are valid POJO fields, and must be processed as GenericType. Please 
read the Flink documentation on "Data Types & Serialization" for details of the 
effect on performance. {noformat}
 

This is problematic because `Utf8` is designed to be a fast serialization type 
for Avro. But since it does not inherit from SpecificRecordBase, it appears to 
be handled by the Kryo serializer.
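A small, hedged check (assuming flink-core and Avro are on the classpath) of how Flink's type extraction classifies Utf8; it ends up as a GenericType, which is what pushes it onto the Kryo path, matching the warning quoted above:

```java
import org.apache.avro.util.Utf8;
import org.apache.flink.api.common.typeinfo.TypeInformation;

public class Utf8TypeCheckSketch {
    public static void main(String[] args) {
        // Prints a GenericType for org.apache.avro.util.Utf8, i.e. not a POJO type;
        // GenericTypes are serialized with Kryo by default.
        TypeInformation<Utf8> typeInfo = TypeInformation.of(Utf8.class);
        System.out.println(typeInfo);
    }
}
```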





[jira] [Commented] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Shuiqiang Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715115#comment-17715115
 ] 

Shuiqiang Chen commented on FLINK-31848:


[~martijnvisser][~zju_zsx][~jark] Thanks for your reply. After ran several test 
cases based on release-1.15 (which I previously analyzed) and the latest 
1.18-SNAPSHOT branch, it seems that the left.resultTerm in the left.code has 
always been assigned a boolean value and will never cause a syntax error by 
calling if (!${left.resultTerm}) . Therefore I agree with [~zju_zsx] that the 
logic can be safely simplified. 

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf requires that testid is not null, but the `testid is not null and 
> testid > 0` filter does not take effect.
>  
> In ScalarOperatorGens.generateAnd:
> !image-2023-04-19-14-54-46-458.png!
> If left.nullTerm is true, the right code will still be executed.
> It seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with
> {code:java}
> if (!${left.resultTerm}){code}
> ?





[GitHub] [flink] reswqa commented on a diff in pull request #22424: [FLINK-31842][runtime] Calculate the utilization of the task executor only when it is using

2023-04-21 Thread via GitHub


reswqa commented on code in PR #22424:
URL: https://github.com/apache/flink/pull/22424#discussion_r1174029242


##
flink-runtime/src/test/java/org/apache/flink/runtime/clusterframework/types/LocationPreferenceSlotSelectionStrategyTest.java:
##
@@ -134,39 +139,57 @@ private 
Matcher withSlotInfo(SlotInf
 resourceProfile, 
Collections.singletonList(nonLocalTm));
 Optional match = 
runMatching(slotProfile);
 
-Assert.assertTrue(match.isPresent());
-final SlotSelectionStrategy.SlotInfoAndLocality slotInfoAndLocality = 
match.get();
-assertThat(candidates, 
hasItem(withSlotInfo(slotInfoAndLocality.getSlotInfo(;
-assertThat(slotInfoAndLocality, hasLocality(Locality.NON_LOCAL));
+assertThat(match)
+.hasValueSatisfying(
+slotInfoAndLocality -> {
+assertThat(slotInfoAndLocality.getLocality())
+.isEqualTo(Locality.NON_LOCAL);
+assertThat(candidates)
+.anySatisfy(
+slotInfoAndResources ->
+
assertThat(slotInfoAndResources.getSlotInfo())
+.isEqualTo(
+
slotInfoAndLocality
+
.getSlotInfo()));
+});
 }
 
 @Test
-public void matchPreferredLocation() {
+void matchPreferredLocation() {
 
 SlotProfile slotProfile =
 SlotProfileTestingUtils.preferredLocality(
 biggerResourceProfile, 
Collections.singletonList(tml2));
 Optional match = 
runMatching(slotProfile);
 
-Assert.assertEquals(slotInfo2, match.get().getSlotInfo());
+assertThat(match)
+.hasValueSatisfying(
+slotInfoAndLocality ->
+
assertThat(slotInfoAndLocality.getSlotInfo()).isEqualTo(slotInfo2));

Review Comment:
   The same logic appears in these two test classes many times; I'd prefer to 
extract a dedicated method to handle this assertion.
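A hedged sketch of the helper being suggested (the method name is assumed), written as a fragment intended to live inside the same test class, so the existing AssertJ and Flink test imports are assumed:

```java
// Hypothetical helper extracted from the repeated assertion pattern above.
private static void assertMatchHasSlotInfo(
        Optional<SlotSelectionStrategy.SlotInfoAndLocality> match, SlotInfo expectedSlotInfo) {
    assertThat(match)
            .hasValueSatisfying(
                    slotInfoAndLocality ->
                            assertThat(slotInfoAndLocality.getSlotInfo())
                                    .isEqualTo(expectedSlotInfo));
}
```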



##
flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/SlotInfoWithUtilization.java:
##
@@ -18,26 +18,37 @@
 
 package org.apache.flink.runtime.jobmaster.slotpool;
 
+import org.apache.flink.annotation.VisibleForTesting;
 import org.apache.flink.runtime.clusterframework.types.AllocationID;
+import org.apache.flink.runtime.clusterframework.types.ResourceID;
 import org.apache.flink.runtime.clusterframework.types.ResourceProfile;
 import org.apache.flink.runtime.jobmaster.SlotInfo;
 import org.apache.flink.runtime.taskmanager.TaskManagerLocation;
 
+import java.util.function.Function;
+
 /**
  * Container for {@link SlotInfo} and the task executors utilization 
(freeSlots /
  * totalOfferedSlots).
  */
 public final class SlotInfoWithUtilization implements SlotInfo {
 private final SlotInfo slotInfoDelegate;
-private final double taskExecutorUtilization;
+private final Function taskExecutorUtilizationLookup;
 
-private SlotInfoWithUtilization(SlotInfo slotInfo, double 
taskExecutorUtilization) {
+private SlotInfoWithUtilization(
+SlotInfo slotInfo, Function 
taskExecutorUtilizationLookup) {
 this.slotInfoDelegate = slotInfo;
-this.taskExecutorUtilization = taskExecutorUtilization;
+this.taskExecutorUtilizationLookup = taskExecutorUtilizationLookup;
 }
 
+@VisibleForTesting
 double getTaskExecutorUtilization() {
-return taskExecutorUtilization;
+return taskExecutorUtilizationLookup.apply(
+slotInfoDelegate.getTaskManagerLocation().getResourceID());
+}

Review Comment:
   It seems that this method has still not been removed. Have you just 
forgotten it, or do you have other concerns? 🤔



##
flink-runtime/src/test/java/org/apache/flink/runtime/clusterframework/types/LocationPreferenceSlotSelectionStrategyTest.java:
##
@@ -63,67 +54,81 @@ public void testPhysicalSlotResourceProfileRespected() {
 Collections.emptySet());
 
 Optional match = 
runMatching(slotProfile);
-Assert.assertTrue(
-match.get()
-.getSlotInfo()
-.getResourceProfile()
-
.isMatching(slotProfile.getPhysicalSlotResourceProfile()));
+assertThat(match)
+.hasValueSatisfying(
+slotInfoAndLocality ->
+assertThat(
+slotInfoAndLocality
+.getSlotInfo()
+   

[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Description: 
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism}} from [elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

 

This means for any sink in the Flink pipeline, we cannot set a max parallelism.

  was:
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

 

This means for any sink in the Flink pipeline, we cannot set a max parallelism.


> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism}} from [elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.
>  
> This means for any sink in the Flink pipeline, we cannot set a max 
> parallelism.
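A hedged workaround sketch until this is addressed: the sink's underlying Transformation does expose setMaxParallelism. Caveat: to my knowledge the getTransformation() accessor used below is annotated @Internal, so treat this as a stopgap under that assumption, not a stable API.

```java
import org.apache.flink.streaming.api.datastream.DataStreamSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SinkMaxParallelismSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStreamSink<Integer> sink = env.fromElements(1, 2, 3).print();

        // Exposed today on DataStreamSink:
        sink.setParallelism(2);

        // Not exposed on DataStreamSink, but available on the underlying
        // Transformation (accessor assumed to be @Internal).
        sink.getTransformation().setMaxParallelism(8);

        env.execute("sink-max-parallelism-sketch");
    }
}
```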





[jira] [Commented] (FLINK-31828) List field in a POJO data stream results in table program compilation failure

2023-04-21 Thread Aitozi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715111#comment-17715111
 ] 

Aitozi commented on FLINK-31828:


Hi [~netvl] Thanks for this detailed bug report.

I have reproduced your problem. I also spent some time digging into how to use 
the RAW type to declare the List type in your case. I found that there's a bug 
in the cast rule (it uses the wrong serializer), so it fails with the EOF 
exception you mentioned (hopefully it's the same error you saw).

I will prepare a PR to fix this bug.

> List field in a POJO data stream results in table program compilation failure
> -
>
> Key: FLINK-31828
> URL: https://issues.apache.org/jira/browse/FLINK-31828
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.16.1
> Environment: Java 11
> Flink 1.16.1
>Reporter: Vladimir Matveev
>Priority: Major
> Attachments: MainPojo.java, generated-code.txt, stacktrace.txt
>
>
> Suppose I have a POJO class like this:
> {code:java}
> public class Example {
> private String key;
> private List> values;
> // getters, setters, equals+hashCode omitted
> }
> {code}
> When a DataStream with this class is converted to a table, and some 
> operations are performed on it, it results in an exception which explicitly 
> says that I should file a ticket:
> {noformat}
> Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
> cannot be compiled. This is a bug. Please file an issue.
> {noformat}
> Please find the example Java code and the full stack trace attached.
> From the exception and generated code it seems that Flink is upset with the 
> list field being treated as an array - but I cannot have an array type there 
> in the real code.
> Also note that if I _don't_ specify the schema explicitly, it then maps the 
> {{values}} field to a `RAW('java.util.List', '...')` type, which also does 
> not work correctly and fails the job in case of even simplest operations like 
> printing.





[jira] [Updated] (FLINK-31878) Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher

2023-04-21 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-31878:
---
Priority: Minor  (was: Major)

> Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher 
> 
>
> Key: FLINK-31878
> URL: https://issues.apache.org/jira/browse/FLINK-31878
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> The class name reported by PauseOrResumeSplitsTask#toString is not right. Users 
> will be very confused when calling the toString method of the class, so we 
> should fix it.





[jira] [Closed] (FLINK-31878) Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher

2023-04-21 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo closed FLINK-31878.
--
Fix Version/s: 1.18.0
   Resolution: Fixed

master(1.18) via 0104427dc9e38e898ba3865b499cc515004041c9.

> Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher 
> 
>
> Key: FLINK-31878
> URL: https://issues.apache.org/jira/browse/FLINK-31878
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> The class name reported by PauseOrResumeSplitsTask#toString is not right. Users 
> will be very confused when calling the toString method of the class, so we 
> should fix it.





[GitHub] [flink] reswqa merged pull request #22443: [FLINK-31878][connectors] Fix the wrong name of PauseOrResumeSplitsTask#toString

2023-04-21 Thread via GitHub


reswqa merged PR #22443:
URL: https://github.com/apache/flink/pull/22443





[jira] [Closed] (FLINK-31398) Don't wrap with TemporaryClassLoaderContext in OperationExecutor

2023-04-21 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo closed FLINK-31398.
--
Fix Version/s: 1.17.1
   Resolution: Fixed

> Don't wrap with TemporaryClassLoaderContext in OperationExecutor
> 
>
> Key: FLINK-31398
> URL: https://issues.apache.org/jira/browse/FLINK-31398
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive, Table SQL / Client
>Reporter: luoyuxia
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0, 1.17.1
>
>
> Currently, the method OperationExecutor#executeStatement in the sql client 
> wraps statement execution in a TemporaryClassLoaderContext around `
> sessionContext.getSessionState().resourceManager.getUserClassLoader()`. 
> Actually, it's not necessary. What's worse, 
> it will cause the exception 'Trying to access closed classloader. Please 
> check if you store xxx' after quitting the sql client. 
> The reason is that `ShutdownHookManager` registers a hook that runs on JVM 
> shutdown. In that hook it 
> creates a `Configuration`, which then accesses 
> `Thread.currentThread().getContextClassLoader()`; this is the 
> FlinkUserClassLoader, which has already been closed. So 
> it then causes the `Trying to access closed classloader` exception.
>  
>  
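To make the failure mode easier to follow, here is a simplified, self-contained sketch (no Flink or Hadoop classes involved; names are illustrative only) of a shutdown hook observing a context classloader that was closed before the hook ran:

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ClosedContextClassLoaderSketch {
    public static void main(String[] args) throws Exception {
        // Stand-in for the user classloader that the sql client closes on quit.
        URLClassLoader userClassLoader =
                new URLClassLoader(new URL[0],
                        ClosedContextClassLoaderSketch.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(userClassLoader);

        // The hook thread inherits the context classloader at construction time.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            ClassLoader cl = Thread.currentThread().getContextClassLoader();
            // In the reported scenario this is the already-closed FlinkUserClassLoader,
            // which is what produces the "Trying to access closed classloader" error.
            System.out.println("shutdown hook sees context classloader: " + cl);
        }));

        userClassLoader.close(); // closed before the JVM shutdown hook fires
    }
}
```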





[jira] [Comment Edited] (FLINK-31398) Don't wrap with TemporaryClassLoaderContext in OperationExecutor

2023-04-21 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714809#comment-17714809
 ] 

Weijie Guo edited comment on FLINK-31398 at 4/21/23 5:17 PM:
-

master(1.18) via 921267bcd06c586d4fad28bbf37b4532593c9e3c.
release-1.17 via d40d4dd26ac544307583477b6930e7af50330935.


was (Author: weijie guo):
master(1.18) via 921267bcd06c586d4fad28bbf37b4532593c9e3c.

> Don't wrap with TemporaryClassLoaderContext in OperationExecutor
> 
>
> Key: FLINK-31398
> URL: https://issues.apache.org/jira/browse/FLINK-31398
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Hive, Table SQL / Client
>Reporter: luoyuxia
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Currently, the method OperationExecutor#executeStatement in the sql client 
> wraps statement execution in a TemporaryClassLoaderContext around `
> sessionContext.getSessionState().resourceManager.getUserClassLoader()`. 
> Actually, it's not necessary. What's worse, 
> it will cause the exception 'Trying to access closed classloader. Please 
> check if you store xxx' after quitting the sql client. 
> The reason is that `ShutdownHookManager` registers a hook that runs on JVM 
> shutdown. In that hook it 
> creates a `Configuration`, which then accesses 
> `Thread.currentThread().getContextClassLoader()`; this is the 
> FlinkUserClassLoader, which has already been closed. So 
> it then causes the `Trying to access closed classloader` exception.
>  
>  





[GitHub] [flink] reswqa merged pull request #22444: [BP-1.17][FLINK-31398][sql-gateway] OperationExecutor is no longer set context classloader when executing statement.

2023-04-21 Thread via GitHub


reswqa merged PR #22444:
URL: https://github.com/apache/flink/pull/22444





[jira] [Commented] (FLINK-29692) Support early/late fires for Windowing TVFs

2023-04-21 Thread Chalres Tan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715100#comment-17715100
 ] 

Chalres Tan commented on FLINK-29692:
-

re [~jark]: Thanks for the response. The example query you provided would yield 
the correct results, but my understanding is that with a window TVF, the state 
associated with a window is cleared once the window expires. With the example 
you provided, is it true that the state will just accumulate over time? Also, 
the query-based approach breaks down if the use case is adjusted slightly, for 
example if the 1-hour tumbling windows were 2-hour windows instead, or if we 
used hopping windows of length 1 hour sliding every 30 minutes rather than 
tumbling windows.

> Support early/late fires for Windowing TVFs
> ---
>
> Key: FLINK-29692
> URL: https://issues.apache.org/jira/browse/FLINK-29692
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Affects Versions: 1.15.3
>Reporter: Canope Nerda
>Priority: Major
>
> I have cases where I need to 1) output data as soon as possible and 2) handle 
> late-arriving data to achieve eventual correctness. In the logic, I need to 
> do window deduplication, which is based on Windowing TVFs, and according to 
> the source code, early/late fires are not yet supported in Windowing TVFs.
> Actually 1) contradicts 2). Without early/late fires, we have to 
> compromise: either live with fresh but incorrect data, or tolerate excess 
> latency for correctness.





[jira] [Comment Edited] (FLINK-24998) SIGSEGV in Kryo / C2 CompilerThread on Java 17

2023-04-21 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715086#comment-17715086
 ] 

Chesnay Schepler edited comment on FLINK-24998 at 4/21/23 4:42 PM:
---

[~schoeneu] That is a good find. I _think_ I used JDK 17.0.1 during my tests, 
which _may_ fit the ticket?

I will look into this again next week.


was (Author: zentol):
[~schoeneu] That is a good find. I _think_ I used JDK 17.0.1 during my tests, 
which would fit the ticket?

I will look into this again next week.

> SIGSEGV in Kryo / C2 CompilerThread on Java 17
> --
>
> Key: FLINK-24998
> URL: https://issues.apache.org/jira/browse/FLINK-24998
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Type Serialization System
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Blocker
> Attachments: -__w-1-s-flink-tests-target-hs_err_pid470059.log
>
>
> While running our tests on CI with Java 17 they failed infrequently with a 
> SIGSEGV error.
> All occurrences were related to Kryo and happened in the C2 CompilerThread.
> {code:java}
> Current thread (0x7f1394165c00):  JavaThread "C2 CompilerThread0" daemon 
> [_thread_in_native, id=470077, stack(0x7f1374361000,0x7f1374462000)]
> Current CompileTask:
> C2:  14251 6333   4   com.esotericsoftware.kryo.io.Input::readString 
> (127 bytes)
> {code}
> The full error is attached to the ticket. -I can also provide the core dump 
> if needed.-





[jira] [Commented] (FLINK-24998) SIGSEGV in Kryo / C2 CompilerThread on Java 17

2023-04-21 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715086#comment-17715086
 ] 

Chesnay Schepler commented on FLINK-24998:
--

[~schoeneu] That is a good find. I _think_ I used JDK 17.0.1 during my tests, 
which would fit the ticket?

I will look into this again next week.

> SIGSEGV in Kryo / C2 CompilerThread on Java 17
> --
>
> Key: FLINK-24998
> URL: https://issues.apache.org/jira/browse/FLINK-24998
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Type Serialization System
>Reporter: Chesnay Schepler
>Priority: Blocker
> Attachments: -__w-1-s-flink-tests-target-hs_err_pid470059.log
>
>
> While running our tests on CI with Java 17 they failed infrequently with a 
> SIGSEGV error.
> All occurrences were related to Kryo and happened in the C2 CompilerThread.
> {code:java}
> Current thread (0x7f1394165c00):  JavaThread "C2 CompilerThread0" daemon 
> [_thread_in_native, id=470077, stack(0x7f1374361000,0x7f1374462000)]
> Current CompileTask:
> C2:  14251 6333   4   com.esotericsoftware.kryo.io.Input::readString 
> (127 bytes)
> {code}
> The full error is attached to the ticket. -I can also provide the core dump 
> if needed.-





[jira] [Assigned] (FLINK-24998) SIGSEGV in Kryo / C2 CompilerThread on Java 17

2023-04-21 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler reassigned FLINK-24998:


Assignee: Chesnay Schepler

> SIGSEGV in Kryo / C2 CompilerThread on Java 17
> --
>
> Key: FLINK-24998
> URL: https://issues.apache.org/jira/browse/FLINK-24998
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Type Serialization System
>Reporter: Chesnay Schepler
>Assignee: Chesnay Schepler
>Priority: Blocker
> Attachments: -__w-1-s-flink-tests-target-hs_err_pid470059.log
>
>
> While running our tests on CI with Java 17 they failed infrequently with a 
> SIGSEGV error.
> All occurrences were related to Kryo and happened in the C2 CompilerThread.
> {code:java}
> Current thread (0x7f1394165c00):  JavaThread "C2 CompilerThread0" daemon 
> [_thread_in_native, id=470077, stack(0x7f1374361000,0x7f1374462000)]
> Current CompileTask:
> C2:  14251 6333   4   com.esotericsoftware.kryo.io.Input::readString 
> (127 bytes)
> {code}
> The full error is attached to the ticket. -I can also provide the core dump 
> if needed.-





[jira] [Commented] (FLINK-31827) Incorrect estimation of the target data rate of a vertex when only a subset of its upstream vertex's output is consumed

2023-04-21 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715078#comment-17715078
 ] 

Gyula Fora commented on FLINK-31827:


The first fix on the operator side has been merged for this: 
ff4bbd2a7bbed4ba0b1443d53731c883a230b6d4
This covers most cases without additional Flink metrics. Will follow up with the 
optional Flink metrics.

> Incorrect estimation of the target data rate of a vertex when only a subset 
> of its upstream vertex's output is consumed
> ---
>
> Key: FLINK-31827
> URL: https://issues.apache.org/jira/browse/FLINK-31827
> Project: Flink
>  Issue Type: Bug
>  Components: Autoscaler
>Reporter: Zhanghao Chen
>Assignee: Gyula Fora
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-17-23-37-35-280.png
>
>
> Currently, the target data rate of a vertex = SUM(target data rate * 
> input/output ratio) over all of its upstream vertices. This assumes that all 
> output records of an upstream vertex are consumed by the downstream vertex. 
> However, this does not always hold. Consider the following job plan generated 
> by a Flink SQL job. The middle vertex contains multiple chained Calc(select 
> xx) operators, each connecting to a separate downstream sink task. As a 
> result, each sink task only consumes a sub-portion of the middle vertex's 
> output.
> To fix it, we need operator-level edge info to infer the upstream-downstream 
> relationship, as well as operator-level output metrics. The metrics part is 
> easy, but AFAIK there's no way to get the operator-level edge info from the 
> Flink REST API yet.
> !image-2023-04-17-23-37-35-280.png!
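A toy illustration (plain arithmetic, not autoscaler code) of the overestimation described above when a downstream vertex only consumes part of the upstream output; the numbers are made up for the example:

```java
public class TargetRateSketch {
    public static void main(String[] args) {
        double upstreamTargetRate = 1000; // records/s the middle vertex must process
        double inputOutputRatio = 2.0;    // the middle vertex emits 2 records per input

        // Current estimate: each sink is assumed to consume ALL upstream output.
        double estimatedSinkRate = upstreamTargetRate * inputOutputRatio; // 2000 rec/s

        // If the vertex fans out to, say, 4 sinks via chained Calc operators and each
        // sink only sees a quarter of the output, the real target is much lower.
        double actualSinkRate = estimatedSinkRate / 4; // 500 rec/s

        System.out.printf("estimated=%.0f actual=%.0f%n", estimatedSinkRate, actualSinkRate);
    }
}
```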





[GitHub] [flink-kubernetes-operator] gyfora merged pull request #571: [FLINK-31827] Compute output ratio per edge

2023-04-21 Thread via GitHub


gyfora merged PR #571:
URL: https://github.com/apache/flink-kubernetes-operator/pull/571





[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173943821


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   How would we determine that we're running in the main thread? We would need 
to get a reference to the owner of `DefaultLeaderElectionService`, wouldn't we? 
That way we would be able to check whether the call is run in the main thread 
of the owner. But that would mean closer coupling between the owner and the 
`DefaultLeaderElectionService`. Or do you have something else in mind that I'm 
missing? :thinking: 
   
   My understanding is that we have the following requirements:
   * `onGrantLeadership`, `onRevokeLeadership` and `onLeaderInformationChanged` 
should be called in a single thread to ensure sequential execution.
   * the methods shouldn't be called in the main thread of the 
`LeaderElectionService`'s owner. The k8s and ZK implementations do this 
implicitly through their event thread handling. The 
`DefaultMultipleComponentLeaderElectionService` doesn't do that (anymore). But 
that's an implementation detail of the service. That's why I thought that it's 
also the `DefaultMultipleComponentLeaderElectionService`'s responsibility to 
call the event processing functions asynchronously.
   
   WDYT?
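For context, a generic sketch (not the actual DefaultMultipleComponentLeaderElectionService code; class and method names are illustrative) of the pattern under discussion: funnelling grant/revoke events through a single-threaded executor so they run sequentially and off the caller's thread.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class LeadershipEventDispatcherSketch {
    // One thread only, so events are processed strictly in submission order.
    private final ExecutorService leadershipOperationExecutor =
            Executors.newSingleThreadExecutor();

    public CompletableFuture<Void> onGrantLeadershipAsync(UUID newLeaderSessionId) {
        return CompletableFuture.runAsync(
                () -> System.out.println("grant " + newLeaderSessionId),
                leadershipOperationExecutor);
    }

    public CompletableFuture<Void> onRevokeLeadershipAsync() {
        return CompletableFuture.runAsync(
                () -> System.out.println("revoke"), leadershipOperationExecutor);
    }
}
```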






[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173929303


##
flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/DefaultLeaderElectionServiceTest.java:
##
@@ -130,22 +137,101 @@ void testLeaderInformationChangedAndShouldBeCorrected() 
throws Exception {
 }
 
 @Test
-void testHasLeadership() throws Exception {
+void testHasLeadershipWithLeadershipButNoGrantEventProcessed() throws 
Exception {
 new Context() {
 {
-runTest(
-() -> {
-testingLeaderElectionDriver.isLeader();
-final UUID currentLeaderSessionId =
-leaderElectionService.getLeaderSessionID();
-assertThat(currentLeaderSessionId).isNotNull();
-
assertThat(leaderElectionService.hasLeadership(currentLeaderSessionId))
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
+.isFalse();
+
assertThat(leaderElectionService.hasLeadership(UUID.randomUUID()))
+.isFalse();
+});
+}
+};
+}
+
+@Test
+void testHasLeadershipWithLeadershipAndGrantEventProcessed() throws 
Exception {
+new Context() {
+{
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+executorService.trigger();
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
 .isTrue();
 
assertThat(leaderElectionService.hasLeadership(UUID.randomUUID()))
 .isFalse();
+});
+}
+};
+}
+
+@Test
+void testHasLeadershipWithLeadershipLostButNoRevokeEventProcessed() throws 
Exception {
+new Context() {
+{
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+executorService.trigger();
+testingLeaderElectionDriver.notLeader();
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
+.isFalse();

Review Comment:
   I was puzzled about that one as well at the start. But it makes sense: The 
HA backend (i.e. the driver) holds the ground truth of leadership. If the HA 
backend indicates the leadership loss, no operation should be executed that 
relies on `hasLeadership(UUID)` because another leader could have picked up in 
the meantime already. The `onRevokeLeadership` call that follows is used for 
cleaning up the component's leadership state only. Does that make sense?
   
   I could add a description to that assert so that others don't wonder about 
the same thing when reading the test code.






[jira] [Updated] (FLINK-31866) Autoscaler metric trimming reduces the number of metric observations on recovery

2023-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31866:
---
Labels: pull-request-available  (was: )

> Autoscaler metric trimming reduces the number of metric observations on 
> recovery
> 
>
> Key: FLINK-31866
> URL: https://issues.apache.org/jira/browse/FLINK-31866
> Project: Flink
>  Issue Type: Bug
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.5.0
>
>
> The autoscaler uses a ConfigMap to store past metric observations which is 
> used to re-initialize the autoscaler state in case of failures or upgrades.
> Whenever trimming of the ConfigMap occurs, we need to make sure we also 
> update the timestamp for the start of the metric collection, so any removed 
> observations can be compensated for by collecting new ones. If we don't do 
> this, the metric window will effectively shrink due to removing observations.
> This can lead to triggering scaling decisions when the operator gets 
> redeployed due to the removed items.





[GitHub] [flink-kubernetes-operator] mxm opened a new pull request, #573: [FLINK-31866] Start metric window with timestamp of first observation

2023-04-21 Thread via GitHub


mxm opened a new pull request, #573:
URL: https://github.com/apache/flink-kubernetes-operator/pull/573

   The autoscaler uses a ConfigMap to store past metric observations which is 
used to re-initialize the autoscaler state in case of failures or upgrades.
   
   Whenever trimming of the ConfigMap occurs, we need to make sure we also 
update the timestamp for the start of the metric collection, so any removed 
observations can be compensated for by collecting new ones. If we don't do 
this, the metric window will effectively shrink due to removing observations.
   
   This can lead to triggering scaling decisions when the operator gets 
redeployed due to the removed items.
   
   The solution we are opting here is to treat the first metric observation 
timestamp as the start of the metric collection.
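   A rough sketch of that idea (hypothetical class and method names, not the 
operator's actual code): after trimming, the metric window start is re-anchored 
at the first surviving observation, so the effective window cannot silently 
shrink.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.TreeMap;

// Hypothetical, simplified metric history keyed by observation timestamp.
class MetricHistorySketch {
    private final TreeMap<Instant, Double> observations = new TreeMap<>();

    void add(Instant timestamp, double value) {
        observations.put(timestamp, value);
    }

    /** Drops observations outside the window and returns the new collection start. */
    Instant trimAndGetWindowStart(Instant now, Duration metricWindow) {
        // Trim everything older than the configured metric window.
        observations.headMap(now.minus(metricWindow)).clear();
        // Treat the first remaining observation as the start of metric collection,
        // so removed observations are compensated for by collecting new ones.
        return observations.isEmpty() ? now : observations.firstKey();
    }
}
```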
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] fredia commented on a diff in pull request #22257: [FLINK-31593][tests] Upgraded migration test data

2023-04-21 Thread via GitHub


fredia commented on code in PR #22257:
URL: https://github.com/apache/flink/pull/22257#discussion_r1173259129


##
flink-tests/src/test/resources/new-stateful-broadcast-udf-migration-itcase-flink1.17-rocksdb-checkpoint/0193581b-bd07-4b9d-b5d3-a6c986164fa4:
##
@@ -0,0 +1,362 @@
+# This is a RocksDB option file.
+#
+# For detailed file format spec, please refer to the example file
+# in examples/rocksdb_option_file_example.ini
+#

Review Comment:
   @XComp Yes, a RocksDB config file is usually generated when creating 
checkpoints.
   RocksDB checkpoint/savepoint-native support was added in 
https://issues.apache.org/jira/browse/FLINK-26146, but some RocksDB 
configuration does not take effect, so for some tests no RocksDB config file is 
generated, see https://issues.apache.org/jira/browse/FLINK-26176.
   
   The following are some existing checkpoint/native-savepoint files related to 
rocksdb. We can see that there were rocksdb config files in the past.
   
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.15-rocksdb-checkpoint  
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [x] new-stateful-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [x] [new-stateful-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
](https://github.com/apache/flink/blob/master/flink-tests/src/test/resources/new-stateful-udf-migration-itcase-flink1.16-rocksdb-checkpoint/0e40bd9a-a3b3-4d71-9d32-dfaacba11244)
   - [ ] stateful-scala-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [ ] stateful-scala-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [ ] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [x] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [x] type-serializer-snapshot-migration-itcase-flink1.16-rocksdb-checkpoint 
   
   
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.15-rocksdb-savepoint-native 
   - [x] 
new-stateful-broadcast-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] new-stateful-udf-migration-itcase-flink1.15-rocksdb-savepoint-native  
   - [x] new-stateful-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] stateful-scala-udf-migration-itcase-flink1.15-rocksdb-savepoint-native 
   - [x] stateful-scala-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.15-rocksdb-savepoint-native
  
   - [x] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.16-rocksdb-savepoint-native
 
   - [x] 
type-serializer-snapshot-migration-itcase-flink1.16-rocksdb-savepoint-native 



##
flink-tests/src/test/resources/new-stateful-broadcast-udf-migration-itcase-flink1.17-rocksdb-checkpoint/0193581b-bd07-4b9d-b5d3-a6c986164fa4:
##
@@ -0,0 +1,362 @@
+# This is a RocksDB option file.
+#
+# For detailed file format spec, please refer to the example file
+# in examples/rocksdb_option_file_example.ini
+#

Review Comment:
   @XComp Yes, a RocksDB config file is usually generated when creating 
checkpoints.
   RocksDB checkpoint/savepoint-native support was added in 
https://issues.apache.org/jira/browse/FLINK-26146, but some RocksDB 
configuration does not take effect, so for some tests no RocksDB config file is 
generated, see https://issues.apache.org/jira/browse/FLINK-26176.
   
   The following are some existing checkpoint/native-savepoint files related to 
rocksdb. We can see that there were rocksdb config files in the past.
   
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.15-rocksdb-checkpoint  
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [ ] new-stateful-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [x] [new-stateful-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
](https://github.com/apache/flink/blob/master/flink-tests/src/test/resources/new-stateful-udf-migration-itcase-flink1.16-rocksdb-checkpoint/0e40bd9a-a3b3-4d71-9d32-dfaacba11244)
   - [ ] stateful-scala-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [ ] stateful-scala-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [ ] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.15-rocksdb-checkpoint 
   - [x] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.16-rocksdb-checkpoint 
   - [x] type-serializer-snapshot-migration-itcase-flink1.16-rocksdb-checkpoint 
   
   
   - [ ] 
new-stateful-broadcast-udf-migration-itcase-flink1.15-rocksdb-savepoint-native 
   - [x] 
new-stateful-broadcast-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] new-stateful-udf-migration-itcase-flink1.15-rocksdb-savepoint-native  
   - [x] new-stateful-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] stateful-scala-udf-migration-itcase-flink1.15-rocksdb-savepoint-native 
   - [x] stateful-scala-udf-migration-itcase-flink1.16-rocksdb-savepoint-native 
   - [ ] 
stateful-scala-with-broadcast-udf-migration-itcase-flink1.15-rocksdb-savepoint-native

[GitHub] [flink] liuyongvs commented on pull request #22450: [DOC]: Remove duplicate sentence in docker deployment documentation

2023-04-21 Thread via GitHub


liuyongvs commented on PR #22450:
URL: https://github.com/apache/flink/pull/22450#issuecomment-1517949017

   LGTM +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31866) Autoscaler metric trimming reduces the number of metric observations on recovery

2023-04-21 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-31866:
---
Summary: Autoscaler metric trimming reduces the number of metric 
observations on recovery  (was: Autoscaler metric trimming reduces the numbet 
of metric observations on recovery)

> Autoscaler metric trimming reduces the number of metric observations on 
> recovery
> 
>
> Key: FLINK-31866
> URL: https://issues.apache.org/jira/browse/FLINK-31866
> Project: Flink
>  Issue Type: Bug
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: kubernetes-operator-1.5.0
>
>
> The autoscaler uses a ConfigMap to store past metric observations which is 
> used to re-initialize the autoscaler state in case of failures or upgrades.
> Whenever trimming of the ConfigMap occurs, we need to make sure we also 
> update the timestamp for the start of the metric collection, so any removed 
> observations can be compensated for by collecting new ones. If we don't do 
> this, the metric window will effectively shrink due to removing observations.
> This can lead to triggering scaling decisions when the operator gets 
> redeployed due to the removed items.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24998) SIGSEGV in Kryo / C2 CompilerThread on Java 17

2023-04-21 Thread Urs Schoenenberger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715038#comment-17715038
 ] 

Urs Schoenenberger commented on FLINK-24998:


[~chesnay] is this issue still reproducible with newer builds of Java 17?

I'm asking because it looks suspiciously like JDK bug 
[https://bugs.openjdk.org/browse/JDK-8277529] . The comments in there say "It 
turned out that we are updating the control input of a data node directly to an 
ArrayCopy node when splitting a load through a region [...] This is illegal 
[...]" which sounds like it might apply to kryo's Input::readString.

This makes me a little hopeful that current builds may have resolved this, 
thereby decoupling the need to update Kryo from the Java 17 enabling efforts?

> SIGSEGV in Kryo / C2 CompilerThread on Java 17
> --
>
> Key: FLINK-24998
> URL: https://issues.apache.org/jira/browse/FLINK-24998
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Type Serialization System
>Reporter: Chesnay Schepler
>Priority: Blocker
> Attachments: -__w-1-s-flink-tests-target-hs_err_pid470059.log
>
>
> While running our tests on CI with Java 17 they failed infrequently with a 
> SIGSEGV error.
> All occurrences were related to Kryo and happened in the C2 CompilerThread.
> {code:java}
> Current thread (0x7f1394165c00):  JavaThread "C2 CompilerThread0" daemon 
> [_thread_in_native, id=470077, stack(0x7f1374361000,0x7f1374462000)]
> Current CompileTask:
> C2:  14251 6333   4   com.esotericsoftware.kryo.io.Input::readString 
> (127 bytes)
> {code}
> The full error is attached to the ticket. -I can also provide the core dump 
> if needed.-



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31866) Autoscaler metric trimming reduces the numbet of metric observations on recovery

2023-04-21 Thread Maximilian Michels (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated FLINK-31866:
---
Summary: Autoscaler metric trimming reduces the numbet of metric 
observations on recovery  (was: Autoscaler metric trimming reduces the numbet 
of metric observations)

> Autoscaler metric trimming reduces the numbet of metric observations on 
> recovery
> 
>
> Key: FLINK-31866
> URL: https://issues.apache.org/jira/browse/FLINK-31866
> Project: Flink
>  Issue Type: Bug
>  Components: Autoscaler, Kubernetes Operator
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: kubernetes-operator-1.5.0
>
>
> The autoscaler uses a ConfigMap to store past metric observations which is 
> used to re-initialize the autoscaler state in case of failures or upgrades.
> Whenever trimming of the ConfigMap occurs, we need to make sure we also 
> update the timestamp for the start of the metric collection, so any removed 
> observations can be compensated for by collecting new ones. If we don't do 
> this, the metric window will effectively shrink due to removing observations.
> This can lead to triggering scaling decisions when the operator gets 
> redeployed due to the removed items.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Xiao updated FLINK-31873:
--
Description: 
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.

 

This means for any sink in the Flink pipeline, we cannot set a max parallelism.

  was:
When turning on Flink reactive mode, it is suggested to convert all 
{{setParallelism}} calls to {{setMaxParallelism from the }}[elastic scaling 
docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].

With the current implementation of the {{DataStreamSink}} class, only the 
{{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
 function of the 
{{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
 class is exposed - {{Transformation}} also has the 
{{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
 function which is not exposed.


> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism}}, per the [elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.
>  
> This means for any sink in the Flink pipeline, we cannot set a max 
> parallelism.
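A simplified sketch of the requested change (hypothetical, stripped-down 
stand-ins for the real classes, which live in flink-core and 
flink-streaming-java): DataStreamSink would forward setMaxParallelism to its 
Transformation the same way it already forwards setParallelism.

{code:java}
// Hypothetical stand-in for org.apache.flink.api.dag.Transformation.
class TransformationSketch {
    private int parallelism = -1;
    private int maxParallelism = -1;

    void setParallelism(int parallelism) { this.parallelism = parallelism; }
    void setMaxParallelism(int maxParallelism) { this.maxParallelism = maxParallelism; }
}

// Hypothetical stand-in for DataStreamSink.
class DataStreamSinkSketch<T> {
    private final TransformationSketch transformation = new TransformationSketch();

    // Already exposed today: delegate parallelism to the underlying transformation.
    DataStreamSinkSketch<T> setParallelism(int parallelism) {
        transformation.setParallelism(parallelism);
        return this;
    }

    // The addition this ticket asks for: expose the max-parallelism knob as well,
    // so reactive-mode users can cap the sink parallelism.
    DataStreamSinkSketch<T> setMaxParallelism(int maxParallelism) {
        transformation.setMaxParallelism(maxParallelism);
        return this;
    }

    public static void main(String[] args) {
        new DataStreamSinkSketch<String>().setMaxParallelism(4);
    }
}
{code}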



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zentol commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


zentol commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173821559


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   Why can't it be an implementation detail of the 
`DefaultLeaderElectionService` _now_?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31873) Add setMaxParallelism to the DataStreamSink Class

2023-04-21 Thread Eric Xiao (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715028#comment-17715028
 ] 

Eric Xiao commented on FLINK-31873:
---

Thanks [~martijnvisser] and [~luoyuxia], I will start by opening a thread on 
the dev mailing list before making a FLIP :).

> Add setMaxParallelism to the DataStreamSink Class
> -
>
> Key: FLINK-31873
> URL: https://issues.apache.org/jira/browse/FLINK-31873
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream
>Reporter: Eric Xiao
>Priority: Major
> Attachments: Screenshot 2023-04-20 at 4.33.14 PM.png
>
>
> When turning on Flink reactive mode, it is suggested to convert all 
> {{setParallelism}} calls to {{setMaxParallelism}}, per the [elastic scaling 
> docs|https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/elastic_scaling/#configuration].
> With the current implementation of the {{DataStreamSink}} class, only the 
> {{[setParallelism|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/datastream/DataStreamSink.java#L172-L181]}}
>  function of the 
> {{[Transformation|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L248-L285]}}
>  class is exposed - {{Transformation}} also has the 
> {{[setMaxParallelism|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/dag/Transformation.java#L277-L285]}}
>  function which is not exposed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zentol commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


zentol commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173807106


##
flink-runtime/src/test/java/org/apache/flink/runtime/leaderelection/DefaultLeaderElectionServiceTest.java:
##
@@ -130,22 +137,101 @@ void testLeaderInformationChangedAndShouldBeCorrected() 
throws Exception {
 }
 
 @Test
-void testHasLeadership() throws Exception {
+void testHasLeadershipWithLeadershipButNoGrantEventProcessed() throws 
Exception {
 new Context() {
 {
-runTest(
-() -> {
-testingLeaderElectionDriver.isLeader();
-final UUID currentLeaderSessionId =
-leaderElectionService.getLeaderSessionID();
-assertThat(currentLeaderSessionId).isNotNull();
-
assertThat(leaderElectionService.hasLeadership(currentLeaderSessionId))
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
+.isFalse();
+
assertThat(leaderElectionService.hasLeadership(UUID.randomUUID()))
+.isFalse();
+});
+}
+};
+}
+
+@Test
+void testHasLeadershipWithLeadershipAndGrantEventProcessed() throws 
Exception {
+new Context() {
+{
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+executorService.trigger();
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
 .isTrue();
 
assertThat(leaderElectionService.hasLeadership(UUID.randomUUID()))
 .isFalse();
+});
+}
+};
+}
+
+@Test
+void testHasLeadershipWithLeadershipLostButNoRevokeEventProcessed() throws 
Exception {
+new Context() {
+{
+runTestWithManuallyTriggeredEvents(
+executorService -> {
+final UUID expectedSessionID = UUID.randomUUID();
+
+
testingLeaderElectionDriver.isLeader(expectedSessionID);
+executorService.trigger();
+testingLeaderElectionDriver.notLeader();
+
+
assertThat(leaderElectionService.hasLeadership(expectedSessionID))
+.isFalse();

Review Comment:
   This seems strange, shouldn't it be `true`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173806764


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   Considering that, adding deprecation annotations here would make sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173805207


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   In the long run, we could remove the async calls again from the interface. 
The driver implementations only need the synchronous methods because their 
event handling lives in a separate thread already. The asynchronous execution 
would become an implementation detail of `DefaultLeaderElectionService`
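   A rough illustration of that direction (illustrative names, not the actual 
Flink classes): the driver keeps calling a synchronous handler, and the service 
performs the call asynchronously on an executor it owns.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, simplified service: grant events are serialized on a single-threaded
// executor owned by the service, so asynchrony stays an implementation detail.
class LeaderElectionServiceSketch {
    private final ExecutorService leadershipOperationExecutor =
            Executors.newSingleThreadExecutor();

    CompletableFuture<Void> onGrantLeadershipAsync(UUID leaderSessionId) {
        return CompletableFuture.runAsync(
                () -> processGrantEvent(leaderSessionId), leadershipOperationExecutor);
    }

    private void processGrantEvent(UUID leaderSessionId) {
        // Grant-event processing (confirming leadership, publishing leader info, ...)
        // happens here, on the executor thread.
    }
}
```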



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] XComp commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


XComp commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173805207


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   In the long run, we could remove the async calls again from the interface. 
The driver implementations need the synchronous methods only because their 
event handling lives in a separate thread already. The asynchronous execution 
would become an implementation detail of `DefaultLeaderElectionService`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zentol commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


zentol commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173801508


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   What's the long-term plan here; will we keep both variants, with everything 
having to support both? Or will we drop the sync variants once we remove legacy 
stuff?
   Can we enforce somehow that only a specific variant is called for a given 
implementation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zentol commented on a diff in pull request #22422: [FLINK-31838][runtime] Moves thread handling for leader event listener calls from DefaultMultipleComponentLeaderElectionService into

2023-04-21 Thread via GitHub


zentol commented on code in PR #22422:
URL: https://github.com/apache/flink/pull/22422#discussion_r1173801508


##
flink-runtime/src/main/java/org/apache/flink/runtime/leaderelection/LeaderElectionEventHandler.java:
##
@@ -39,12 +40,29 @@ public interface LeaderElectionEventHandler {
  */
 void onGrantLeadership(UUID newLeaderSessionId);
 
+/**
+ * Called by specific {@link LeaderElectionDriver} when the leadership is 
granted.
+ *
+ * This method will trigger the grant event processing in a separate 
thread.
+ *
+ * @param newLeaderSessionId the valid leader session id
+ */
+CompletableFuture onGrantLeadershipAsync(UUID newLeaderSessionId);

Review Comment:
   What's the long-term plan here; will we keep both variants, with everything 
having to support both? Or will we drop the sync variants once we remove legacy 
stuff?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] snuyanzin commented on a diff in pull request #22257: [FLINK-31593][tests] Upgraded migration test data

2023-04-21 Thread via GitHub


snuyanzin commented on code in PR #22257:
URL: https://github.com/apache/flink/pull/22257#discussion_r117377


##
flink-tests/src/test/resources/new-stateful-broadcast-udf-migration-itcase-flink1.17-rocksdb-checkpoint/0193581b-bd07-4b9d-b5d3-a6c986164fa4:
##
@@ -0,0 +1,362 @@
+# This is a RocksDB option file.
+#
+# For detailed file format spec, please refer to the example file
+# in examples/rocksdb_option_file_example.ini
+#

Review Comment:
   thanks for clarification @fredia 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] XComp commented on a diff in pull request #22257: [FLINK-31593][tests] Upgraded migration test data

2023-04-21 Thread via GitHub


XComp commented on code in PR #22257:
URL: https://github.com/apache/flink/pull/22257#discussion_r1173766452


##
flink-tests/src/test/resources/new-stateful-broadcast-udf-migration-itcase-flink1.17-rocksdb-checkpoint/0193581b-bd07-4b9d-b5d3-a6c986164fa4:
##
@@ -0,0 +1,362 @@
+# This is a RocksDB option file.
+#
+# For detailed file format spec, please refer to the example file
+# in examples/rocksdb_option_file_example.ini
+#

Review Comment:
   Thanks for clarification. @snuyanzin can we finalize the PR in that case?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-30651) Move utility methods to CatalogTest and remove CatalogTestUtils class

2023-04-21 Thread Samrat Deb (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samrat Deb resolved FLINK-30651.

Resolution: Invalid

> Move utility methods to CatalogTest and remove CatalogTestUtils class 
> --
>
> Key: FLINK-30651
> URL: https://issues.apache.org/jira/browse/FLINK-30651
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Table SQL / API, Tests
>Reporter: Samrat Deb
>Assignee: Samrat Deb
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> [CatalogTestUtils|https://github.com/apache/flink/blame/master/flink-table/flink-table-common/src/test/java/org/apache/flink/table/catalog/CatalogTestUtil.java#L43]
>  class contains static utility functions. These functions/methods can be 
> moved to the CatalogTest class to make the code flow easier to understand.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] flinkbot commented on pull request #22450: [DOC]: Remove duplicate sentence in docker deployment documentation

2023-04-21 Thread via GitHub


flinkbot commented on PR #22450:
URL: https://github.com/apache/flink/pull/22450#issuecomment-1517826614

   
   ## CI report:
   
   * e7a3e7051dd0a041fc4e5ab2e68238c183b59a34 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] yangjf2019 commented on pull request #567: [FLINK-31815] Fixing the container vulnerability by upgrade the SnakeYaml Maven dependency

2023-04-21 Thread via GitHub


yangjf2019 commented on PR #567:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/567#issuecomment-1517823027

   I found that the exception is generated when I commit 
`e2e-tests/data/flinkdep-cr.yaml` and run `default/flink-example-statemachine`. 
Do we need to upgrade the Flink version first?
   
   
https://github.com/apache/flink-kubernetes-operator/blob/main/e2e-tests/data/flinkdep-cr.yaml#L50
   
https://github.com/apache/flink-kubernetes-operator/blob/main/e2e-tests/data/multi-sessionjob.yaml#L131-L145
   
https://github.com/apache/flink-kubernetes-operator/blob/main/e2e-tests/data/sessionjob-cr.yaml#L79
   
   
   
https://github.com/apache/flink-kubernetes-operator/actions/runs/4750576429/jobs/8439813076
   https://user-images.githubusercontent.com/54518670/233643419-140478bc-b92d-4d2d-b7e8-9c3225423de6.png
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #22449: [FLINK-31752] Fix SourceOperator numRecordsOut duplicate bug

2023-04-21 Thread via GitHub


flinkbot commented on PR #22449:
URL: https://github.com/apache/flink/pull/22449#issuecomment-1517819883

   
   ## CI report:
   
   * f7e156ab4e74d487e3047bc8ac86182c69d82d15 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Resolved] (FLINK-31723) DispatcherTest#testCancellationDuringInitialization is unstable

2023-04-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-31723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek resolved FLINK-31723.
---
Fix Version/s: 1.18.0
   Resolution: Fixed

> DispatcherTest#testCancellationDuringInitialization is unstable
> ---
>
> Key: FLINK-31723
> URL: https://issues.apache.org/jira/browse/FLINK-31723
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47889&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=6761
> {noformat}
> Apr 04 02:26:26 [ERROR] 
> org.apache.flink.runtime.dispatcher.DispatcherTest.testCancellationDuringInitialization
>   Time elapsed: 0.033 s  <<< FAILURE!
> Apr 04 02:26:26 java.lang.AssertionError: 
> Apr 04 02:26:26 
> Apr 04 02:26:26 Expected: is 
> Apr 04 02:26:26  but: was 
> Apr 04 02:26:26   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> Apr 04 02:26:26   at org.junit.Assert.assertThat(Assert.java:964)
> Apr 04 02:26:26   at org.junit.Assert.assertThat(Assert.java:930)
> Apr 04 02:26:26   at 
> org.apache.flink.runtime.dispatcher.DispatcherTest.testCancellationDuringInitialization(DispatcherTest.java:389)
> [...]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] charlesdong1991 opened a new pull request, #22450: [DOC]: Remove duplicate sentence in docker deployment documentation

2023-04-21 Thread via GitHub


charlesdong1991 opened a new pull request, #22450:
URL: https://github.com/apache/flink/pull/22450

   Remove seemingly duplicate sentence in docker deployment page


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31723) DispatcherTest#testCancellationDuringInitialization is unstable

2023-04-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-31723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714997#comment-17714997
 ] 

David Morávek commented on FLINK-31723:
---

master: 378d3ca0d4b487b1ebc9354e9ebe8952cc3a9d11 
52bf14b0ba949e048c78862be2ed8ebfb58c780e

> DispatcherTest#testCancellationDuringInitialization is unstable
> ---
>
> Key: FLINK-31723
> URL: https://issues.apache.org/jira/browse/FLINK-31723
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47889&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=6761
> {noformat}
> Apr 04 02:26:26 [ERROR] 
> org.apache.flink.runtime.dispatcher.DispatcherTest.testCancellationDuringInitialization
>   Time elapsed: 0.033 s  <<< FAILURE!
> Apr 04 02:26:26 java.lang.AssertionError: 
> Apr 04 02:26:26 
> Apr 04 02:26:26 Expected: is 
> Apr 04 02:26:26  but: was 
> Apr 04 02:26:26   at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
> Apr 04 02:26:26   at org.junit.Assert.assertThat(Assert.java:964)
> Apr 04 02:26:26   at org.junit.Assert.assertThat(Assert.java:930)
> Apr 04 02:26:26   at 
> org.apache.flink.runtime.dispatcher.DispatcherTest.testCancellationDuringInitialization(DispatcherTest.java:389)
> [...]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dmvk merged pull request #22435: [FLINK-31723] Fix DispatcherTest#testCancellationDuringInitialization to not make assumptions about an underlying scheduler implementation.

2023-04-21 Thread via GitHub


dmvk merged PR #22435:
URL: https://github.com/apache/flink/pull/22435


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31752) SourceOperatorStreamTask increments numRecordsOut twice

2023-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31752:
---
Labels: pull-request-available  (was: )

> SourceOperatorStreamTask increments numRecordsOut twice
> ---
>
> Key: FLINK-31752
> URL: https://issues.apache.org/jira/browse/FLINK-31752
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Metrics
>Affects Versions: 1.17.0
>Reporter: Weihua Hu
>Assignee: Yunfeng Zhou
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2023-04-07-15-51-44-304.png
>
>
> The counter of numRecordsOut was introduced to ChainingOutput to reduce the 
> function call stack depth in 
> https://issues.apache.org/jira/browse/FLINK-30536
> But SourceOperatorStreamTask.AsyncDataOutputToOutput increments the counter 
> of numRecordsOut too. This results in the source operator's numRecordsOut 
> being doubled.
> We should delete the numRecordsOut.inc in 
> SourceOperatorStreamTask.AsyncDataOutputToOutput.
> [~xtsong][~lindong] Could you please take a look at this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] yunfengzhou-hub opened a new pull request, #22449: [FLINK-31752] Fix SourceOperator numRecordsOut duplicate bug

2023-04-21 Thread via GitHub


yunfengzhou-hub opened a new pull request, #22449:
URL: https://github.com/apache/flink/pull/22449

   ## What is the purpose of the change
   
   This pull request fixes a bug introduced in #21579: the numRecordsOut metric 
is incremented twice, once in `SourceOperatorStreamTask` and once in 
`ChainingOutput`.
   
   
   ## Brief change log
   
   - Removes the numRecordsOut increment in `SourceOperatorStreamTask`.
   - Adds the integration test class `SourceMetricsITCase` to verify the 
correctness of this metric on SourceOperator. This test class is modeled on 
`SinkMetricsITCase`.
   
   
   ## Verifying this change
   
   The correctness of the changes in this pull request is covered by the newly 
introduced test class `SourceMetricsITCase`.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): yes
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable
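   For context, a self-contained toy example (hypothetical classes, not Flink's 
actual ones) of why incrementing the same counter at two points on the emit 
path doubles numRecordsOut for every record:

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified illustration of the double-counting bug fixed by this PR.
class NumRecordsOutSketch {
    static final AtomicLong numRecordsOut = new AtomicLong();

    // Corresponds conceptually to the task-level output wrapper (increment removed here).
    static void emitFromSourceOutput(String record) {
        numRecordsOut.incrementAndGet();
        emitThroughChainingOutput(record);
    }

    // Corresponds conceptually to the chained output (increment added by FLINK-30536).
    static void emitThroughChainingOutput(String record) {
        numRecordsOut.incrementAndGet();
    }

    public static void main(String[] args) {
        emitFromSourceOutput("a");
        System.out.println(numRecordsOut.get()); // prints 2 for a single emitted record
    }
}
```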
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31848) And Operator has side effect when operands have udf

2023-04-21 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714995#comment-17714995
 ] 

Jark Wu commented on FLINK-31848:
-

[~csq] do you have a simple case to reproduce the wrong result (and show the 
result)? And did you test it on the latest version? 

> And Operator has side effect when operands have udf
> ---
>
> Key: FLINK-31848
> URL: https://issues.apache.org/jira/browse/FLINK-31848
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.13.2
>Reporter: zju_zsx
>Priority: Major
> Attachments: image-2023-04-19-14-54-46-458.png
>
>
>  
> {code:java}
> CREATE TABLE kafka_source (
>    `content` varchar,
>    `testid` bigint,
>    `extra` int
>    );
> CREATE TABLE console_sink (
>    `content` varchar,
>    `testid` bigint
>  )
>   with (
>     'connector' = 'print'
> );
> insert into console_sink
> select 
>    content,testid+1
> from kafka_source where testid is not null and testid > 0 and my_udf(testid) 
> != 0; {code}
> my_udf has a constraint that testid must not be null, but the filter testid is 
> not null and testid > 0 does not take effect.
>  
> In ScalarOperatorGens.generateAnd
> !image-2023-04-19-14-54-46-458.png!
> if left.nullTerm is true, the right-hand code will still be executed.
> it seems that
> {code:java}
> if (!${left.nullTerm} && !${left.resultTerm}) {code}
> can be safely replaced with 
> {code:java}
> if (!${left.resultTerm}){code}
> ? 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31869) test_multi_sessionjob.sh gets stuck very frequently

2023-04-21 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-31869.
--
Fix Version/s: kubernetes-operator-1.5.0
   Resolution: Fixed

Merged to main 1f54ffa484c359c4b81a409f27092dbfba82157b

There are still frequent test failures but at least the CI doesn't get stuck for 
hours.

> test_multi_sessionjob.sh gets stuck very frequently
> ---
>
> Key: FLINK-31869
> URL: https://issues.apache.org/jira/browse/FLINK-31869
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.5.0
>
>
> The test_multi_sessionjob.sh gets stuck almost all the time on recent builds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31874) Support truncate table statement in batch mode

2023-04-21 Thread Jark Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jark Wu updated FLINK-31874:

Fix Version/s: 1.18.0

> Support truncate table statement in batch mode
> --
>
> Key: FLINK-31874
> URL: https://issues.apache.org/jira/browse/FLINK-31874
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / API
>Reporter: luoyuxia
>Priority: Major
> Fix For: 1.18.0
>
>
> Described in [FLIP-302: Support TRUNCATE TABLE statement in batch 
> mode|https://cwiki.apache.org/confluence/display/FLINK/FLIP-302%3A+Support+TRUNCATE+TABLE+statement+in+batch+mode]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29692) Support early/late fires for Windowing TVFs

2023-04-21 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714985#comment-17714985
 ] 

Jark Wu commented on FLINK-29692:
-

Hi [~charles-tan], thank you for sharing your use case. I'm just curious whether 
it is possible to support your use case by using a Group Aggregate instead of a 
Window Aggregate? For example: 

{code}
SELECT user, COUNT(*) as cnt
FROM withdrawal
GROUP BY 
   user, 
   DATE_FORMAT(withdrawal_timestamp, 'yyyy-MM-dd HH:00') -- trim into hour
HAVING cnt >= 3
{code}

IIUC, this can also achieve the goal of being "notified if a withdrawal from a 
bank account happens 3 times in an hour" ASAP. And you may get better 
performance from the tuning guide [1]. 


[1]: https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/tuning/

> Support early/late fires for Windowing TVFs
> ---
>
> Key: FLINK-29692
> URL: https://issues.apache.org/jira/browse/FLINK-29692
> Project: Flink
>  Issue Type: New Feature
>  Components: Table SQL / Planner
>Affects Versions: 1.15.3
>Reporter: Canope Nerda
>Priority: Major
>
> I have cases where I need to 1) output data as soon as possible and 2) handle 
> late-arriving data to achieve eventual correctness. In the logic, I need to 
> do window deduplication, which is based on Windowing TVFs, and according to the 
> source code, early/late fires are not supported yet in Windowing TVFs.
> Actually 1) contradicts 2). Without early/late fires, we had to 
> compromise: either live with fresh incorrect data or tolerate excess latency 
> for correctness.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31869) test_multi_sessionjob.sh gets stuck very frequently

2023-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31869:
---
Labels: pull-request-available  (was: )

> test_multi_sessionjob.sh gets stuck very frequently
> ---
>
> Key: FLINK-31869
> URL: https://issues.apache.org/jira/browse/FLINK-31869
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Blocker
>  Labels: pull-request-available
>
> The test_multi_sessionjob.sh gets stuck almost all the time on recent builds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora merged pull request #572: [FLINK-31869] Fix e2e cleanup getting stuck

2023-04-21 Thread via GitHub


gyfora merged PR #572:
URL: https://github.com/apache/flink-kubernetes-operator/pull/572


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-31869) test_multi_sessionjob.sh gets stuck very frequently

2023-04-21 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-31869:
--

Assignee: Gyula Fora

> test_multi_sessionjob.sh gets stuck very frequently
> ---
>
> Key: FLINK-31869
> URL: https://issues.apache.org/jira/browse/FLINK-31869
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Blocker
>
> The test_multi_sessionjob.sh gets stuck almost all the time on recent builds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-30609) Add ephemeral storage to CRD

2023-04-21 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-30609.
--
Resolution: Fixed

merged to main bad872d04e324fde2b6396e18bae5a37d804f59b

> Add ephemeral storage to CRD
> 
>
> Key: FLINK-30609
> URL: https://issues.apache.org/jira/browse/FLINK-30609
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Matyas Orhidi
>Assignee: Zhenqiu Huang
>Priority: Major
>  Labels: pull-request-available, starter
> Fix For: kubernetes-operator-1.5.0
>
>
> We should consider adding ephemeral storage to the existing [resource 
> specification 
> |https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/reference/#resource]in
>  CRD, next to {{cpu}} and {{memory}}
> https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#setting-requests-and-limits-for-local-ephemeral-storage



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink-kubernetes-operator] gyfora merged pull request #561: [FLINK-30609] Add ephemeral storage to CRD

2023-04-21 Thread via GitHub


gyfora merged PR #561:
URL: https://github.com/apache/flink-kubernetes-operator/pull/561


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-30852) TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError but doesn't fail

2023-04-21 Thread Anton Kalashnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov updated FLINK-30852:
--
Fix Version/s: 1.17.1

> TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError 
> but doesn't fail
> ---
>
> Key: FLINK-30852
> URL: https://issues.apache.org/jira/browse/FLINK-30852
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0, 1.17.1
>
>
> While investigating FLINK-30844, I noticed that {{TaskTest.testCleanup}} 
> reports an AssertionError in the logs but doesn't fail:
> {code}
> 00:59:01,886 [main] ERROR 
> org.apache.flink.runtime.taskmanager.Task[] - Error while 
> canceling task Test Task (1/1)#0.
> java.lang.AssertionError: This should not be called
> at org.junit.Assert.fail(Assert.java:89) ~[junit-4.13.2.jar:4.13.2]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest$TestInvokableCorrect.cancel(TaskTest.java:1304)
>  ~[test-classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.Task.cancelInvokable(Task.java:1529) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:796) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) 
> ~[classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest.testCleanupWhenSwitchToInitializationFails(TaskTest.java:184)
>  ~[test-classes/:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_292]
> [...]
> {code}
> [~akalashnikov] is this expected?
> The affected build is 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45440&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-30852) TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError but doesn't fail

2023-04-21 Thread Anton Kalashnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Kalashnikov resolved FLINK-30852.
---
Resolution: Fixed

> TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError 
> but doesn't fail
> ---
>
> Key: FLINK-30852
> URL: https://issues.apache.org/jira/browse/FLINK-30852
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0, 1.17.1
>
>
> While investigating FLINK-30844, I noticed that {{TaskTest.testCleanup}} 
> reports an AssertionError in the logs but doesn't fail:
> {code}
> 00:59:01,886 [main] ERROR 
> org.apache.flink.runtime.taskmanager.Task[] - Error while 
> canceling task Test Task (1/1)#0.
> java.lang.AssertionError: This should not be called
> at org.junit.Assert.fail(Assert.java:89) ~[junit-4.13.2.jar:4.13.2]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest$TestInvokableCorrect.cancel(TaskTest.java:1304)
>  ~[test-classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.Task.cancelInvokable(Task.java:1529) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:796) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) 
> ~[classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest.testCleanupWhenSwitchToInitializationFails(TaskTest.java:184)
>  ~[test-classes/:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_292]
> [...]
> {code}
> [~akalashnikov] is this expected?
> The affected build is 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45440&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-30852) TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError but doesn't fail

2023-04-21 Thread Anton Kalashnikov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714961#comment-17714961
 ] 

Anton Kalashnikov commented on FLINK-30852:
---

merged to release-1.17: aa47c3f862414511e92637d9816f6908c86b4cf6

> TaskTest.testCleanupWhenSwitchToInitializationFails reports AssertionError 
> but doesn't fail
> ---
>
> Key: FLINK-30852
> URL: https://issues.apache.org/jira/browse/FLINK-30852
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Task
>Affects Versions: 1.17.0
>Reporter: Matthias Pohl
>Assignee: Anton Kalashnikov
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0
>
>
> While investigating FLINK-30844, I noticed that {{TaskTest.testCleanup}} 
> reports an AssertionError in the logs but doesn't fail:
> {code}
> 00:59:01,886 [main] ERROR 
> org.apache.flink.runtime.taskmanager.Task[] - Error while 
> canceling task Test Task (1/1)#0.
> java.lang.AssertionError: This should not be called
> at org.junit.Assert.fail(Assert.java:89) ~[junit-4.13.2.jar:4.13.2]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest$TestInvokableCorrect.cancel(TaskTest.java:1304)
>  ~[test-classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.Task.cancelInvokable(Task.java:1529) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:796) 
> ~[classes/:?]
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562) 
> ~[classes/:?]
> at 
> org.apache.flink.runtime.taskmanager.TaskTest.testCleanupWhenSwitchToInitializationFails(TaskTest.java:184)
>  ~[test-classes/:?]
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_292]
> [...]
> {code}
> [~akalashnikov] is this expected?
> The affected build is 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45440&view=logs&j=0da23115-68bb-5dcd-192c-bd4c8adebde1&t=24c3384f-1bcb-57b3-224f-51bf973bbee8



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] akalash merged pull request #22436: [FLINK-30852][runtime] Checking task cancelation explicitly rather th…

2023-04-21 Thread via GitHub


akalash merged PR #22436:
URL: https://github.com/apache/flink/pull/22436


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-28386) Trigger an immediate checkpoint after all sources finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714956#comment-17714956
 ] 

Piotr Nowojski commented on FLINK-28386:


[~zlzhang0122] A checkpoint could be used just to make the side effects visible 
(committing results in two-phase-commit operators/sinks). On the other hand, 
why would a savepoint make any sense? There is no point in recovering from such 
a snapshot anyway.

About the ticket: taking unaligned checkpoints into account, I think a better 
condition would be to trigger a checkpoint once all tasks are finished. With 
unaligned checkpoints, downstream tasks can still be processing in-flight data 
while upstream sources are finished, so triggering a checkpoint on finished 
sources wouldn't achieve the desired goal of stopping the job faster.

> Trigger an immediate checkpoint after all sources finished
> --
>
> Key: FLINK-28386
> URL: https://issues.apache.org/jira/browse/FLINK-28386
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing
>Reporter: Yun Gao
>Priority: Major
>
> Currently, a bounded job in streaming mode will by default wait for one more 
> checkpoint to commit the last piece of data. If the checkpoint period is 
> long, the waiting time might also be long. To optimize this situation, we 
> could eagerly trigger a checkpoint after all sources are finished.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-31846) Support cancel final checkpoint when all tasks are finished

2023-04-21 Thread Piotr Nowojski (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Nowojski closed FLINK-31846.
--
Resolution: Duplicate

I will close this ticket, as I think it just duplicates FLINK-28386; feel free 
to re-open if I missed something. Otherwise, let's maybe move the discussion to 
the other ticket.

> Support cancel final checkpoint when all tasks are finished
> ---
>
> Key: FLINK-31846
> URL: https://issues.apache.org/jira/browse/FLINK-31846
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.2
>Reporter: Fan Hong
>Priority: Major
>
> As stated in [1], all tasks will wait for the final checkpoint before 
> exiting. It also mentioned this mechanism will prolong the execution time.
> So, can we provide configurations to make tasks NOT wait for the final 
> checkpoint?
>  
>  [1]: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#waiting-for-the-final-checkpoint-before-task-exit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31846) Support cancel final checkpoint when all tasks are finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714946#comment-17714946
 ] 

Piotr Nowojski edited comment on FLINK-31846 at 4/21/23 11:33 AM:
--

{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it's a valid feature request. 
There is even a ticket for that: https://issues.apache.org/jira/browse/FLINK-28386.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only things you are 
sacrificing are:
* not being able to checkpoint the job if some sources have already finished 
while others are still working
* exactly-once semantics (like committing transactions to Kafka). But you could 
still use the Kafka sink with at-least-once semantics.
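
A minimal sketch of the "disable checkpoints with finished tasks" workaround, 
assuming the documented config key 
execution.checkpointing.checkpoints-after-tasks-finish.enabled (please check the 
exact option name against the Flink version in use); the manual-trigger 
workaround goes through the REST endpoint linked above:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class NoFinalCheckpointSketch {
    public static void main(String[] args) throws Exception {
        // Assumed config key: finished tasks no longer wait for one more checkpoint.
        Configuration conf = new Configuration();
        conf.setString("execution.checkpointing.checkpoints-after-tasks-finish.enabled", "false");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.enableCheckpointing(60_000L); // periodic checkpoints still run while the job is active

        // ... define the bounded streaming pipeline and call env.execute() here ...
    }
}
{code}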


was (Author: pnowojski):
{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it's a valid feature request. 
There is even a ticket for that: https://issues.apache.org/jira/browse/FLINK-26113.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only things you are 
sacrificing are:
* not being able to checkpoint the job if some sources have already finished 
while others are still working
* exactly-once semantics (like committing transactions to Kafka). But you could 
still use the Kafka sink with at-least-once semantics.

> Support cancel final checkpoint when all tasks are finished
> ---
>
> Key: FLINK-31846
> URL: https://issues.apache.org/jira/browse/FLINK-31846
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.2
>Reporter: Fan Hong
>Priority: Major
>
> As stated in [1], all tasks will wait for the final checkpoint before 
> exiting. It also mentioned this mechanism will prolong the execution time.
> So, can we provide configurations to make tasks NOT wait for the final 
> checkpoint?
>  
>  [1]: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#waiting-for-the-final-checkpoint-before-task-exit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-31846) Support cancel final checkpoint when all tasks are finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714946#comment-17714946
 ] 

Piotr Nowojski edited comment on FLINK-31846 at 4/21/23 11:32 AM:
--

{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it's a valid feature request. 
There is even a ticket for that: https://issues.apache.org/jira/browse/FLINK-26113.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only things you are 
sacrificing are:
* not being able to checkpoint the job if some sources have already finished 
while others are still working
* exactly-once semantics (like committing transactions to Kafka). But you could 
still use the Kafka sink with at-least-once semantics.


was (Author: pnowojski):
{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it would be a valid feature 
request.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only things you are 
sacrificing are:
* not being able to checkpoint the job if some sources have already finished 
while others are still working
* exactly-once semantics (like committing transactions to Kafka). But you could 
still use the Kafka sink with at-least-once semantics.

> Support cancel final checkpoint when all tasks are finished
> ---
>
> Key: FLINK-31846
> URL: https://issues.apache.org/jira/browse/FLINK-31846
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.2
>Reporter: Fan Hong
>Priority: Major
>
> As stated in [1], all tasks will wait for the final checkpoint before 
> exiting. It also mentioned this mechanism will prolong the execution time.
> So, can we provide configurations to make tasks NOT wait for the final 
> checkpoint?
>  
>  [1]: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#waiting-for-the-final-checkpoint-before-task-exit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] Aitozi commented on pull request #22378: [FLINK-31344][planner] Support to update nested columns in update sta…

2023-04-21 Thread via GitHub


Aitozi commented on PR #22378:
URL: https://github.com/apache/flink/pull/22378#issuecomment-1517684780

   ping @luoyuxia 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] 1996fanrui commented on pull request #22392: [FLINK-31588][checkpoint] Correct the unaligned checkpoint type after aligned barrier timeout to unaligned barrier on PipelinedSubpartitio

2023-04-21 Thread via GitHub


1996fanrui commented on PR #22392:
URL: https://github.com/apache/flink/pull/22392#issuecomment-1517672547

   > Thanks for the fix!
   > 
   > Can you add a small unit test? Apart from that, LGTM. Feel free to merge after 
adding the unit test and with a green Azure build :)
   
   Thanks for the quick feedback, updated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-31846) Support cancel final checkpoint when all tasks are finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714946#comment-17714946
 ] 

Piotr Nowojski edited comment on FLINK-31846 at 4/21/23 11:08 AM:
--

{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it would be a valid feature 
request.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only things you are 
sacrificing are:
* not being able to checkpoint the job if some sources have already finished 
while others are still working
* exactly-once semantics (like committing transactions to Kafka). But you could 
still use the Kafka sink with at-least-once semantics.


was (Author: pnowojski):
{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it would be a valid feature 
request.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only thing you are 
sacrificing is exactly-once semantics (like committing transactions to Kafka). 
But you could still use the Kafka sink with at-least-once semantics.

> Support cancel final checkpoint when all tasks are finished
> ---
>
> Key: FLINK-31846
> URL: https://issues.apache.org/jira/browse/FLINK-31846
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.2
>Reporter: Fan Hong
>Priority: Major
>
> As stated in [1], all tasks will wait for the final checkpoint before 
> exiting. It also mentioned this mechanism will prolong the execution time.
> So, can we provide configurations to make tasks NOT wait for the final 
> checkpoint?
>  
>  [1]: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#waiting-for-the-final-checkpoint-before-task-exit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-31846) Support cancel final checkpoint when all tasks are finished

2023-04-21 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714946#comment-17714946
 ] 

Piotr Nowojski commented on FLINK-31846:


{quote}
If the final checkpoint cannot be cancelled, can we bring it forward to start 
upon completion of all tasks?
{quote}
I'm afraid it's not possible at the moment, but it would be a valid feature 
request.

I'm not sure how helpful this is, but you can also manually trigger checkpoints:
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/rest_api/#jobs-jobid-checkpoints-1

Alternatively, you can also disable checkpoints with finished tasks. If your 
sources are finishing more or less all at once, the only thing you are 
sacrificing is exactly-once semantics (like committing transactions to Kafka). 
But you could still use the Kafka sink with at-least-once semantics.

> Support cancel final checkpoint when all tasks are finished
> ---
>
> Key: FLINK-31846
> URL: https://issues.apache.org/jira/browse/FLINK-31846
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Checkpointing
>Affects Versions: 1.15.2
>Reporter: Fan Hong
>Priority: Major
>
> As stated in [1], all tasks will wait for the final checkpoint before 
> exiting. It also mentioned this mechanism will prolong the execution time.
> So, can we provide configurations to make tasks NOT wait for the final 
> checkpoint?
>  
>  [1]: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/fault-tolerance/checkpointing/#waiting-for-the-final-checkpoint-before-task-exit



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dmvk commented on pull request #22441: [hotfix]Use wrapper classes like other parameters

2023-04-21 Thread via GitHub


dmvk commented on PR #22441:
URL: https://github.com/apache/flink/pull/22441#issuecomment-1517640808

   What's the motivation behind the PR? Is this just a cosmetic thing, or is 
there something broken? If cosmetic, I'd expect this to be the other way 
around, using the unboxed types that prevent nullability issues.
   
   If something is broken, we should first have an issue covering it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #22448: [FLINK-31386][network] Fix the potential deadlock issue of blocking shuffle

2023-04-21 Thread via GitHub


flinkbot commented on PR #22448:
URL: https://github.com/apache/flink/pull/22448#issuecomment-1517637993

   
   ## CI report:
   
   * f71e3afff45ea49d2ecd060476dc77b90afe3255 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] wsry commented on pull request #22448: [FLINK-31386][network] Fix the potential deadlock issue of blocking shuffle

2023-04-21 Thread via GitHub


wsry commented on PR #22448:
URL: https://github.com/apache/flink/pull/22448#issuecomment-1517633083

   This is a cherry-picked PR; will merge after the tests pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-31803) UpdateJobResourceRequirementsRecoveryITCase.testRescaledJobGraphsWillBeRecoveredCorrectly(Path) is unstable on azure

2023-04-21 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-31803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714937#comment-17714937
 ] 

David Morávek commented on FLINK-31803:
---

master: c2ab806a3624471bb36f87ba98d51f672b7894fe

> UpdateJobResourceRequirementsRecoveryITCase.testRescaledJobGraphsWillBeRecoveredCorrectly(Path)
>  is unstable on azure
> 
>
> Key: FLINK-31803
> URL: https://issues.apache.org/jira/browse/FLINK-31803
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available, test-stability
>
> {noformat}
> Apr 07 01:28:23 java.util.concurrent.CompletionException: 
> Apr 07 01:28:23 org.apache.flink.runtime.rest.util.RestClientException: 
> [org.apache.flink.runtime.rest.NotFoundException: Job 
> d3538259fba86dfc0bd9bd5680076836 not found
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:260)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47996&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=7713



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-31803) UpdateJobResourceRequirementsRecoveryITCase.testRescaledJobGraphsWillBeRecoveredCorrectly(Path) is unstable on azure

2023-04-21 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-31803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Morávek resolved FLINK-31803.
---
Fix Version/s: 1.18.0
   Resolution: Fixed

> UpdateJobResourceRequirementsRecoveryITCase.testRescaledJobGraphsWillBeRecoveredCorrectly(Path)
>  is unstable on azure
> 
>
> Key: FLINK-31803
> URL: https://issues.apache.org/jira/browse/FLINK-31803
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination, Tests
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: David Morávek
>Priority: Critical
>  Labels: pull-request-available, test-stability
> Fix For: 1.18.0
>
>
> {noformat}
> Apr 07 01:28:23 java.util.concurrent.CompletionException: 
> Apr 07 01:28:23 org.apache.flink.runtime.rest.util.RestClientException: 
> [org.apache.flink.runtime.rest.NotFoundException: Job 
> d3538259fba86dfc0bd9bd5680076836 not found
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.lambda$handleRequest$1(AbstractExecutionGraphHandler.java:99)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.lambda$getExecutionGraphInternal$0(DefaultExecutionGraphCache.java:109)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:260)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Apr 07 01:28:23   at 
> java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
> Apr 07 01:28:23   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1275)
> Apr 07 01:28:23   at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=47996&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=7713



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] dmvk merged pull request #22408: [FLINK-31803] Harden UpdateJobResourceRequirementsRecoveryITCase.

2023-04-21 Thread via GitHub


dmvk merged PR #22408:
URL: https://github.com/apache/flink/pull/22408


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] reswqa commented on pull request #22432: [FLINK-18808][runtime] Include side outputs in numRecordsOut metric

2023-04-21 Thread via GitHub


reswqa commented on PR #22432:
URL: https://github.com/apache/flink/pull/22432#issuecomment-1517630686

   Hi @pnowojski, would you mind taking a look at this in your free time, as you 
reviewed the original [PR](https://github.com/apache/flink/pull/13109)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] wsry opened a new pull request, #22448: [FLINK-31386][network] Fix the potential deadlock issue of blocking shuffle

2023-04-21 Thread via GitHub


wsry opened a new pull request, #22448:
URL: https://github.com/apache/flink/pull/22448

   ## What is the purpose of the change
   
   Currently, the SortMergeResultPartition may allocate more network buffers 
than the guaranteed size of the LocalBufferPool. As a result, some result 
partitions may need to wait for other result partitions to release the 
over-allocated network buffers before they can continue. However, the result 
partitions that have allocated more than the guaranteed buffers rely on the 
processing of input data to trigger data spilling and buffer recycling. The 
input data in turn relies on batch reading buffers used by the 
SortMergeResultPartitionReadScheduler, which may already be taken by those 
blocked result partitions that are waiting for buffers. Then a deadlock 
occurs. This patch fixes the deadlock issue by reserving the guaranteed 
buffers during initialization.
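
   A minimal, self-contained sketch of the reservation idea (plain Java, not the 
actual Flink buffer-pool classes; `ReservingPartition` and the `Semaphore`-based 
pool are made up for illustration). Reserving the guaranteed share up front 
removes the circular wait described above:

```java
import java.util.concurrent.Semaphore;

/**
 * Illustration only: a partition that acquires its guaranteed buffers from the
 * shared pool at construction time can never be starved later by partitions that
 * over-allocated from the same pool.
 */
final class ReservingPartition {
    private final Semaphore sharedPool; // stands in for the network buffer pool

    ReservingPartition(Semaphore sharedPool, int guaranteedBuffers) throws InterruptedException {
        this.sharedPool = sharedPool;
        // Reserve the guaranteed share eagerly, before any data is processed.
        sharedPool.acquire(guaranteedBuffers);
    }

    /** Buffers beyond the guaranteed share are best-effort and never block. */
    boolean tryRequestExtraBuffer() {
        return sharedPool.tryAcquire();
    }

    void releaseBuffers(int count) {
        sharedPool.release(count);
    }
}
```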
   
   ## Brief change log
   
 - Reserve the guaranteed buffers during initialization for 
SortMergeResultPartition.
   
   ## Verifying this change
   
   This change added tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool

2023-04-21 Thread via GitHub


flinkbot commented on PR #22447:
URL: https://github.com/apache/flink/pull/22447#issuecomment-1517625986

   
   ## CI report:
   
   * 62fc6fa527aaf68ee2e2d582a56107ad52fd8714 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31764) Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool

2023-04-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-31764:
---
Labels: pull-request-available  (was: )

> Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool
> --
>
> Key: FLINK-31764
> URL: https://issues.apache.org/jira/browse/FLINK-31764
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Weijie Guo
>Assignee: Weijie Guo
>Priority: Major
>  Labels: pull-request-available
>
> After FLINK-31763, we no longer need the dedicated field 
> {{numberOfRequestedOverdraftMemorySegments}} to record how many overdraft 
> buffers have been requested, since we regard all buffers exceeding the 
> {{currentPoolSize}} as overdraft.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] reswqa opened a new pull request, #22447: [FLINK-31764][runtime] Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool

2023-04-21 Thread via GitHub


reswqa opened a new pull request, #22447:
URL: https://github.com/apache/flink/pull/22447

   ## What is the purpose of the change
   
   *After [FLINK-31763](https://issues.apache.org/jira/browse/FLINK-31763), we 
no longer need the dedicated field `numberOfRequestedOverdraftMemorySegments` 
to record how many overdraft buffers have been requested, since we regard all 
buffers exceeding the `currentPoolSize` as overdraft.*
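
   A small sketch of the resulting bookkeeping (simplified, not the real 
`LocalBufferPool`; field names are illustrative): once every buffer above the 
current pool size counts as overdraft, the overdraft number can be derived on 
demand instead of being tracked in a separate field.

```java
/** Simplified illustration of deriving the overdraft count instead of storing it. */
final class BufferPoolCountersSketch {
    private int numberOfRequestedMemorySegments; // all segments currently handed out
    private int currentPoolSize;                 // guaranteed size of the pool

    int numberOfRequestedOverdraftMemorySegments() {
        // No dedicated counter that has to be kept in sync with requests and releases.
        return Math.max(0, numberOfRequestedMemorySegments - currentPoolSize);
    }
}
```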
   
   
   ## Brief change log
   
 - *Introduce getNumberOfRequestedMemorySegments and rename the old one to 
a more appropriate name.*
 - *Get rid of numberOfRequestedOverdraftMemorySegments in LocalBufferPool.*
   
   
   ## Verifying this change
   
   
   This change is already covered by existing tests in `LocalBufferPoolTest`.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): per-buffer
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-31878) Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher

2023-04-21 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-31878:
--
Fix Version/s: (was: 1.18.0)

> Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher 
> 
>
> Key: FLINK-31878
> URL: https://issues.apache.org/jira/browse/FLINK-31878
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
>
> The class name reported by PauseOrResumeSplitsTask#toString is not right. 
> Users will be very confused when calling the toString method of this class, 
> so we should fix it.
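
A hypothetical sketch of the kind of fix involved (illustration only, not the 
actual connector-base class): deriving the name from the class itself avoids a 
copy-pasted, wrong hand-written name in toString.

{code:java}
/** Illustration only; the real PauseOrResumeSplitsTask lives in the connector-base fetcher. */
final class PauseOrResumeSplitsTaskSketch {
    @Override
    public String toString() {
        // getClass().getSimpleName() cannot drift out of sync the way a hard-coded string can.
        return getClass().getSimpleName() + "{}";
    }
}
{code}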



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-31878) Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher

2023-04-21 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-31878:
--
Affects Version/s: 1.18.0

> Fix the wrong name of PauseOrResumeSplitsTask#toString in connector fetcher 
> 
>
> Key: FLINK-31878
> URL: https://issues.apache.org/jira/browse/FLINK-31878
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Common
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> The class name reported by PauseOrResumeSplitsTask#toString is not right. 
> Users will be very confused when calling the toString method of this class, 
> so we should fix it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] TanYuxin-tyx commented on a diff in pull request #22395: [FLINK-31799][docs] Python connector download link should refer to the url defined in externalized repository

2023-04-21 Thread via GitHub


TanYuxin-tyx commented on code in PR #22395:
URL: https://github.com/apache/flink/pull/22395#discussion_r1173578262


##
docs/layouts/shortcodes/py_connector_download_link.html:
##
@@ -0,0 +1,62 @@
+{{/*
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+*/}}{{/*
+Generates an XML snippet for the externalized connector python download table.
+*/}}
+{{ $name := .Get 0 }}
+{{ $connector_version := .Get 1 }}
+{{ $connector := index .Site.Data $name }}
+{{ $flink_version := .Site.Params.VersionTitle }}
+{{ $full_version := printf "%s-%s" $connector_version $flink_version }}
+
+
+{{ if eq $.Site.Language.Lang "en" }}
+In order to use the {{ $connector.name }} in PyFlink jobs, the following
+dependencies are required:
+{{ else if eq $.Site.Language.Lang "zh" }}
+为了在 PyFlink 作业中使用 {{ $connector.name }} ,需要添加下列依赖:
+{{ end }}
+
+
+Version
+PyFlink JAR
+
+
+{{ range $connector.variants }}
+
+{{- .maven -}}
+{{ if $.Site.Params.IsStable }}
+{{ if eq .sql_url nil}}
+There is no sql jar available yet.
+{{ else }}
+Download
+{{ end }}
+{{ else }}
+Only available for stable releases.
+{{ end }}
+
+{{ end }}
+
+
+{{ if eq .Site.Language.Lang "en" }}
+See Python
 dependency management
+for more details on how to use JARs in PyFlink.
+{{ else if eq .Site.Language.Lang "zh" }}
+在 PyFlink 中如何添加 JAR 包依赖参见 Python
 依赖管理。

Review Comment:
   "参见" is a little strange, maybe "请参考". But I see that it is also like this 
in `py_download_link.html`. So fixing or not fixing is both fine, no strong 
opinions.
   
   BTW, if fixing this, then the one in the `py_download_link.html` should also 
be fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] pvary commented on pull request #22437: [FLINK-31868] Fix DefaultInputSplitAssigner javadoc for class

2023-04-21 Thread via GitHub


pvary commented on PR #22437:
URL: https://github.com/apache/flink/pull/22437#issuecomment-1517549223

   Thanks @zhuzhurk for the guidance and commit!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] MartijnVisser commented on pull request #22443: [FLINK-31878][connectors] Fix the wrong name of PauseOrResumeSplitsTask#toString

2023-04-21 Thread via GitHub


MartijnVisser commented on PR #22443:
URL: https://github.com/apache/flink/pull/22443#issuecomment-1517539242

   Wouldn't this break all connectors that have implemented this method 
already, like Kafka? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] flinkbot commented on pull request #22446: [FLINK-28060][BP 1.15][Connector/Kafka] Updated Kafka Clients to 3.2.1

2023-04-21 Thread via GitHub


flinkbot commented on PR #22446:
URL: https://github.com/apache/flink/pull/22446#issuecomment-1517536013

   
   ## CI report:
   
   * e0086f25a4e965cbbc5030c7e9517f91d1991e4a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] chachae opened a new pull request, #22446: [FLINK-28060][BP 1.15][Connector/Kafka] Updated Kafka Clients to 3.2.1

2023-04-21 Thread via GitHub


chachae opened a new pull request, #22446:
URL: https://github.com/apache/flink/pull/22446

   Cherry pick https://github.com/apache/flink/pull/19994 and 
https://github.com/apache/flink/pull/20526 to release-1.15
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


