[GitHub] [flink] snuyanzin commented on pull request #23069: [FLINK-32664][table] Fix TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown

2023-07-24 Thread via GitHub


snuyanzin commented on PR #23069:
URL: https://github.com/apache/flink/pull/23069#issuecomment-1649164736

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-connector-pulsar] syhily commented on pull request #56: [FLINK-XXXXX] Basic table factory for Pulsar connector

2023-07-24 Thread via GitHub


syhily commented on PR #56:
URL: https://github.com/apache/flink-connector-pulsar/pull/56#issuecomment-1649158056

   This is a huge PR. I may have time this weekend to glance at it.





[jira] [Commented] (FLINK-29317) Add WebSocket in Dispatcher to support olap query submission and push results in session cluster

2023-07-24 Thread Fang Yong (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746784#comment-17746784
 ] 

Fang Yong commented on FLINK-29317:
---

Hi [~dmvk], sorry for the late reply. We would currently like to promote 
improvements to Flink OLAP in the community. Are you still interested in this 
area? We want to create a FLIP and start discussions in the community 
later. Thanks.

> Add WebSocket in Dispatcher to support olap query submission and push results 
> in session cluster
> 
>
> Key: FLINK-29317
> URL: https://issues.apache.org/jira/browse/FLINK-29317
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination, Runtime / REST
> Affects Versions: 1.14.5, 1.15.3
> Reporter: Fang Yong
> Priority: Major
>
> Currently the client submits OLAP queries to the Flink session cluster via 
> the HTTP REST API and pulls the results by polling at an interval. The sink 
> task in the TaskManager creates a socket server for each query; when the 
> JobManager receives a pull request from the client, it requests the query 
> results from the socket server. The process is as follows.
> Job submission path:
> client -> http rest -> JobManager -> Sink Socket Server
> Result acquisition path:
> client <- http rest <- JobManager <- Sink Socket Server
> This leads to three problems:
> 1. Submitting jobs through HTTP REST costs some performance; for example, 
> temporary files are created for each job.
> 2. The client pulls the result data at a fixed time interval, which is a 
> fixed cost. A larger interval increases query latency, while a smaller 
> interval increases the pressure on the Dispatcher.
> 3. Each sink task initializes a socket server, which increases query 
> latency and also wastes resources.
> For the Flink OLAP scenario, we propose adding the WebSocket protocol to the 
> session cluster to support submitting jobs and returning results. The client 
> creates and manages a connection to the WebSocket server and submits OLAP 
> queries to the session cluster. The TaskManagers also create and manage 
> connections to the WebSocket server, and the sink task streams results to the 
> server. When the JobManager receives results from a sink task, it pushes the 
> result data to the client through the connection between them.
> We implemented this feature in ByteDance's internal Flink version. On 
> average, the latency of each query is reduced by about 100 ms, which is a big 
> optimization for OLAP queries.
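
The polling trade-off described in problem 2 can be made concrete with a rough 
back-of-the-envelope sketch. This is not part of the original issue. It assumes 
that a result becomes ready at a uniformly random moment within a polling 
interval, so the average extra wait is half the interval. The class and method 
names are made up for illustration and are not Flink APIs.

```java
/** Rough model of interval polling; all names here are illustrative assumptions. */
final class PollingCostSketch {

    /**
     * Average extra wait before the client sees a result, assuming the result
     * becomes ready at a uniformly random point inside a polling interval.
     */
    static double expectedExtraLatencyMs(long pollIntervalMs) {
        if (pollIntervalMs <= 0) {
            throw new IllegalArgumentException("pollIntervalMs must be positive");
        }
        return pollIntervalMs / 2.0;
    }

    /** Poll requests per second arriving at the REST endpoint for N running queries. */
    static double pollRequestsPerSecond(long pollIntervalMs, int concurrentQueries) {
        if (pollIntervalMs <= 0 || concurrentQueries < 0) {
            throw new IllegalArgumentException("invalid interval or query count");
        }
        return concurrentQueries * (1000.0 / pollIntervalMs);
    }

    public static void main(String[] args) {
        int concurrentQueries = 100; // assumed load, for illustration only
        for (long interval : new long[] {10, 50, 200}) {
            System.out.printf(
                    "interval=%dms extraLatency=%.1fms requests/s=%.1f%n",
                    interval,
                    expectedExtraLatencyMs(interval),
                    pollRequestsPerSecond(interval, concurrentQueries));
        }
    }
}
```

Under this model, shrinking the interval lowers latency linearly but raises the 
request rate inversely, which is the tension a push-based connection is meant 
to avoid.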



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32664) TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing

2023-07-24 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746776#comment-17746776
 ] 

Feifan Wang commented on FLINK-32664:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51673=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11250

> TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing
> ---
>
> Key: FLINK-32664
> URL: https://issues.apache.org/jira/browse/FLINK-32664
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / Planner
> Affects Versions: 1.18.0
> Reporter: Sergey Nuyanzin
> Assignee: Sergey Nuyanzin
> Priority: Blocker
> Labels: pull-request-available, test-stability
>
> Blocker since it's failing on every build and reproduced locally
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51661=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529





[jira] [Comment Edited] (FLINK-28046) Annotate SourceFunction as deprecated

2023-07-24 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746762#comment-17746762
 ] 

Leonard Xu edited comment on FLINK-28046 at 7/25/23 4:25 AM:
-

[~mxm] There are extra features to be done, and we still lack much of the work 
listed in this umbrella ticket that would let users migrate to the new Source 
API. The current Source API is too complex for developers, even experienced 
Flink developers[1]. [~lzljs3620320] and I have also introduced bugs in the 
Apache Paimon connector and the Flink CDC connectors because we misunderstood 
the new Source API :(.

Although I understand the motivation to deprecate the API without waiting for 
any subtasks, it still doesn't look like a correct workflow. The same concern 
was raised in the umbrella issue and the discussion emails[2][3][4]: we should 
implement these subtasks before we deprecate this interface.

If we must deprecate the API first and set these potential subtasks 
(improvements) aside for the sake of the big API deprecation goal of 2.0, I'd 
like to propose giving the green light to this ticket's workflow and marking 
SourceFunction as deprecated, but we need someone to promise that all subtasks 
will be finished in 1.19.

 
[1]https://lists.apache.org/thread/5olmnypjw2nvmsc1m2gmw1btzm9dl3ch
[2]https://issues.apache.org/jira/browse/FLINK-28045?focusedCommentId=1761=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1761
[3]https://lists.apache.org/thread/5olmnypjw2nvmsc1m2gmw1btzm9dl3ch
[4]https://lists.apache.org/thread/d6cwqw9b3105wcpdkwq7rr4s7x4ywqr9



> Annotate SourceFunction as deprecated
> -
>
> Key: FLINK-28046
> URL: https://issues.apache.org/jira/browse/FLINK-28046
> Project: Flink
> Issue Type: Sub-task
> Components: API / DataStream
> Affects Versions: 1.15.3
> Reporter: Alexander Fedulov
> Assignee: Alexander Fedulov
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.18.0
>
>






[jira] [Comment Edited] (FLINK-28046) Annotate SourceFunction as deprecated

2023-07-24 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746762#comment-17746762
 ] 

Leonard Xu edited comment on FLINK-28046 at 7/25/23 4:24 AM:
-

[~mxm] There are extra features to be done, and we still lack much of the work 
listed in this umbrella ticket that would let users migrate to the new Source 
API. The current Source API is too complex for developers, even experienced 
Flink developers[1]. [~lzljs3620320] and I have also introduced bugs in the 
Apache Paimon connector and the Flink CDC connectors because we misunderstood 
the new Source API :(.

Although I understand the motivation to deprecate the API without waiting for 
any subtasks, it still doesn't look like a correct workflow. The same concern 
was raised in the umbrella issue and the discussion emails[1][2][3]: we should 
implement these subtasks before we deprecate this interface.

If we must deprecate the API first and set these potential subtasks 
(improvements) aside for the sake of the big API deprecation goal of 2.0, I'd 
like to propose giving the green light to this ticket's workflow and marking 
SourceFunction as deprecated, but we need someone to promise that all subtasks 
will be finished in 1.19.

 

[1]https://issues.apache.org/jira/browse/FLINK-28045?focusedCommentId=1761=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1761
[2]https://lists.apache.org/thread/5olmnypjw2nvmsc1m2gmw1btzm9dl3ch
[3]https://lists.apache.org/thread/d6cwqw9b3105wcpdkwq7rr4s7x4ywqr9



> Annotate SourceFunction as deprecated
> -
>
> Key: FLINK-28046
> URL: https://issues.apache.org/jira/browse/FLINK-28046
> Project: Flink
> Issue Type: Sub-task
> Components: API / DataStream
> Affects Versions: 1.15.3
> Reporter: Alexander Fedulov
> Assignee: Alexander Fedulov
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.18.0
>
>






[jira] [Commented] (FLINK-28046) Annotate SourceFunction as deprecated

2023-07-24 Thread Leonard Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746762#comment-17746762
 ] 

Leonard Xu commented on FLINK-28046:


[~mxm] There are extra features to be done, and we still lack much of the work 
listed in this umbrella ticket that would let users migrate to the new Source 
API. The current Source API is too complex for developers, even experienced 
Flink developers[1]. [~lzljs3620320] and I have also introduced bugs in the 
Apache Paimon connector and the Flink CDC connectors because we misunderstood 
the new Source API :(.

Although I understand the motivation to deprecate the API without waiting for 
any subtasks, it still doesn't look like a correct workflow. The same concern 
was raised in the umbrella issue and the discussion emails[1][2][3]: we should 
implement these subtasks before we deprecate this interface.

If we must deprecate the API first and set these potential subtasks 
(improvements) aside for the sake of the big API deprecation goal of 2.0, I'd 
like to propose giving the green light to this ticket's workflow and marking 
SourceFunction as deprecated, but we need someone to promise that all subtasks 
will be finished in 1.19.

 

[1]https://issues.apache.org/jira/browse/FLINK-28045?focusedCommentId=1761=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-1761
[2]https://lists.apache.org/thread/5olmnypjw2nvmsc1m2gmw1btzm9dl3ch
[3]https://lists.apache.org/thread/d6cwqw9b3105wcpdkwq7rr4s7x4ywqr9

> Annotate SourceFunction as deprecated
> -
>
> Key: FLINK-28046
> URL: https://issues.apache.org/jira/browse/FLINK-28046
> Project: Flink
> Issue Type: Sub-task
> Components: API / DataStream
> Affects Versions: 1.15.3
> Reporter: Alexander Fedulov
> Assignee: Alexander Fedulov
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.18.0
>
>






[GitHub] [flink] flinkbot commented on pull request #23073: Remove parameter in WindowAssigner#getDefaultTrigger()

2023-07-24 Thread via GitHub


flinkbot commented on PR #23073:
URL: https://github.com/apache/flink/pull/23073#issuecomment-1649089416

   
   ## CI report:
   
   * 4189d5d58e5c422778e2f2c63977119c0d1e2f35 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] flinkbot commented on pull request #23072: Deprecate IOReadableWritable serialization in Path

2023-07-24 Thread via GitHub


flinkbot commented on PR #23072:
URL: https://github.com/apache/flink/pull/23072#issuecomment-1649088181

   
   ## CI report:
   
   * 6d650b72d81881de648fafc029a66c604693e6bb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] WencongLiu opened a new pull request, #23073: Remove parameter in WindowAssigner#getDefaultTrigger()

2023-07-24 Thread via GitHub


WencongLiu opened a new pull request, #23073:
URL: https://github.com/apache/flink/pull/23073

   ## What is the purpose of the change
   
   Remove parameter in WindowAssigner#getDefaultTrigger().
   
   
   ## Brief change log
   
 - Remove parameter in WindowAssigner#getDefaultTrigger()
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[GitHub] [flink] WencongLiu opened a new pull request, #23072: Deprecate IOReadableWritable serialization in Path

2023-07-24 Thread via GitHub


WencongLiu opened a new pull request, #23072:
URL: https://github.com/apache/flink/pull/23072

   ## What is the purpose of the change
   
   Deprecate IOReadableWritable serialization in Path.
   
   
   ## Brief change log
   
 - Deprecate IOReadableWritable serialization in Path
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[GitHub] [flink] WencongLiu closed pull request #23005: Test for path

2023-07-24 Thread via GitHub


WencongLiu closed pull request #23005: Test for path
URL: https://github.com/apache/flink/pull/23005





[GitHub] [flink] flinkbot commented on pull request #23071: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator.

2023-07-24 Thread via GitHub


flinkbot commented on PR #23071:
URL: https://github.com/apache/flink/pull/23071#issuecomment-1649030088

   
   ## CI report:
   
   * 412a065288c2d5ceb0b42d30616673ce851c9930 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Commented] (FLINK-25322) Support no-claim mode in changelog state backend

2023-07-24 Thread Feifan Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746750#comment-17746750
 ] 

Feifan Wang commented on FLINK-25322:
-

Thanks for the reply, [~masteryhx].
{quote}Users have to check the status of restore mode by the Flink UI or the 
REST API, right?
{quote}
Yes. If a user wants to delete the restored non-claimed checkpoint, he/she must 
check the Flink UI or REST API to confirm that the new job is state 
self-sustained. Otherwise, a new job that is not yet state self-sustained may 
fail to restart because a file is not found. But I don't think this adds 
complexity for the user, because even before this change the user has to check 
the Flink UI or REST API to determine whether the new job has completed at 
least one checkpoint. In contrast, I think checking whether the job is state 
self-sustained is more intuitive.
{quote}If users stop the job before the 'slowest materialization of all 
subtasks', this behaves like LEGACY mode, otherwise this could behave like 
NO_CLAIM mode, right?
{quote}
In fact, I am also thinking about how to deal with checkpoints retained before 
the job becomes state self-sustained. One option is that checkpoints taken 
before that point are not kept as retained checkpoints. But this will cause 
data duplication when a job using a transactional sink resumes from the 
restored checkpoint. Another option is to record in the checkpoint metadata 
which state artifacts are borrowed from the non-claimed checkpoint, so that 
when the new checkpoint is used for claim mode recovery, the state artifacts 
borrowed from the non-claimed checkpoint are not deleted. Do you have any 
thoughts on this issue, [~pnowojski]?
{quote}Of course, IIUC, If users want to use NO_CLAIM mode, they'd like to 
retain a CP to let other jobs use.
{quote}
In fact, even if a user manually redeploys the same job after updating the 
business logic code, it is desirable to be able to use the no-claim mode, 
because the no-claim mode can guarantee that the job can be rolled back when 
there is a problem with the new logic code.
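
The second option above (recording borrowed artifacts in the checkpoint 
metadata) could look roughly like the following sketch. It is only an 
illustration under assumptions: the class, its fields, and the use of plain 
strings as artifact handles are hypothetical and do not correspond to Flink's 
actual checkpoint metadata classes.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical checkpoint metadata that remembers which artifacts it does not own. */
final class BorrowedArtifactsSketch {

    // Every state artifact (e.g. a file path) that this checkpoint references.
    private final Set<String> allArtifacts;
    // The subset that still belongs to the restored, non-claimed checkpoint.
    private final Set<String> borrowedArtifacts;

    BorrowedArtifactsSketch(Set<String> allArtifacts, Set<String> borrowedArtifacts) {
        if (!allArtifacts.containsAll(borrowedArtifacts)) {
            throw new IllegalArgumentException("borrowed artifacts must be referenced artifacts");
        }
        this.allArtifacts = Collections.unmodifiableSet(new HashSet<>(allArtifacts));
        this.borrowedArtifacts = Collections.unmodifiableSet(new HashSet<>(borrowedArtifacts));
    }

    /** Artifacts that may be deleted when this checkpoint is discarded after a claim-mode restore. */
    Set<String> deletableOnDiscard() {
        Set<String> deletable = new HashSet<>(allArtifacts);
        deletable.removeAll(borrowedArtifacts); // never delete what another checkpoint owns
        return deletable;
    }

    /** In this sketch, "state self-sustained" means nothing is borrowed any more. */
    boolean isSelfSustained() {
        return borrowedArtifacts.isEmpty();
    }
}
```

With this bookkeeping, a checkpoint retained before the job is state 
self-sustained can still be restored in claim mode without removing files that 
the original non-claimed checkpoint depends on.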

> Support no-claim mode in changelog state backend
> 
>
> Key: FLINK-25322
> URL: https://issues.apache.org/jira/browse/FLINK-25322
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Checkpointing, Runtime / State Backends
> Reporter: Dawid Wysakowicz
> Assignee: Feifan Wang
> Priority: Major
> Fix For: 1.18.0
>
>






[GitHub] [flink] flinkbot commented on pull request #23070: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator.

2023-07-24 Thread via GitHub


flinkbot commented on PR #23070:
URL: https://github.com/apache/flink/pull/23070#issuecomment-1649019893

   
   ## CI report:
   
   * b476ae92a8c6f7f767f7bfdce90f93fa02aec431 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] liming30 opened a new pull request, #23071: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator.

2023-07-24 Thread via GitHub


liming30 opened a new pull request, #23071:
URL: https://github.com/apache/flink/pull/23071

   Backporting #23059  to release-1.16





[GitHub] [flink] liming30 opened a new pull request, #23070: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator.

2023-07-24 Thread via GitHub


liming30 opened a new pull request, #23070:
URL: https://github.com/apache/flink/pull/23070

   Backporting #23059  to release-1.17





[GitHub] [flink] liming30 commented on pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


liming30 commented on PR #23059:
URL: https://github.com/apache/flink/pull/23059#issuecomment-1648988060

   > @liming30 , would you mind backporting this fix to 1.16 and 1.17?
   
   @1996fanrui Sure, I will initiate a related PR for backporting later. Thanks.





[GitHub] [flink] 1996fanrui commented on pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


1996fanrui commented on PR #23059:
URL: https://github.com/apache/flink/pull/23059#issuecomment-1648976736

   @liming30 , would you mind backporting this fix to 1.16 and 1.17?





[GitHub] [flink] liming30 commented on a diff in pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


liming30 commented on code in PR #23059:
URL: https://github.com/apache/flink/pull/23059#discussion_r1272950175


##
flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/RecreateOnResetOperatorCoordinatorTest.java:
##
@@ -264,6 +264,19 @@ public void testConsecutiveResetToCheckpoint() throws Exception {
                 "Timed out when waiting for the coordinator to close.");
     }
 
+    @Test
+    public void testNotifyCheckpointAbortedSuccess() throws Exception {
+        TestingCoordinatorProvider provider = new TestingCoordinatorProvider(null);
+        MockOperatorCoordinatorContext context =
+                new MockOperatorCoordinatorContext(OPERATOR_ID, NUM_SUBTASKS);
+        RecreateOnResetOperatorCoordinator coordinator = createCoordinator(provider, context);
+        TestingOperatorCoordinator internalCoordinatorAfterReset =
+                getInternalCoordinator(coordinator);
+
+        coordinator.notifyCheckpointAborted(1L);
+        assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(1L);

Review Comment:
   @1996fanrui thanks, I have updated this part of the code.






[GitHub] [flink] 1996fanrui commented on a diff in pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


1996fanrui commented on code in PR #23059:
URL: https://github.com/apache/flink/pull/23059#discussion_r1272939403


##
flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/RecreateOnResetOperatorCoordinatorTest.java:
##
@@ -264,6 +264,19 @@ public void testConsecutiveResetToCheckpoint() throws Exception {
                 "Timed out when waiting for the coordinator to close.");
     }
 
+    @Test
+    public void testNotifyCheckpointAbortedSuccess() throws Exception {
+        TestingCoordinatorProvider provider = new TestingCoordinatorProvider(null);
+        MockOperatorCoordinatorContext context =
+                new MockOperatorCoordinatorContext(OPERATOR_ID, NUM_SUBTASKS);
+        RecreateOnResetOperatorCoordinator coordinator = createCoordinator(provider, context);
+        TestingOperatorCoordinator internalCoordinatorAfterReset =
+                getInternalCoordinator(coordinator);
+
+        coordinator.notifyCheckpointAborted(1L);
+        assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(1L);

Review Comment:
   @liming30, thanks for the update. I left a minor comment; please take a 
   look in your free time, thanks!
   
   nits:
   
   ```suggestion
   long checkpointId = 10L;
   coordinator.notifyCheckpointAborted(checkpointId);
   assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(checkpointId);
   ```






[GitHub] [flink] 1996fanrui commented on a diff in pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


1996fanrui commented on code in PR #23059:
URL: https://github.com/apache/flink/pull/23059#discussion_r1272939403


##
flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/RecreateOnResetOperatorCoordinatorTest.java:
##
@@ -264,6 +264,19 @@ public void testConsecutiveResetToCheckpoint() throws Exception {
                 "Timed out when waiting for the coordinator to close.");
     }
 
+    @Test
+    public void testNotifyCheckpointAbortedSuccess() throws Exception {
+        TestingCoordinatorProvider provider = new TestingCoordinatorProvider(null);
+        MockOperatorCoordinatorContext context =
+                new MockOperatorCoordinatorContext(OPERATOR_ID, NUM_SUBTASKS);
+        RecreateOnResetOperatorCoordinator coordinator = createCoordinator(provider, context);
+        TestingOperatorCoordinator internalCoordinatorAfterReset =
+                getInternalCoordinator(coordinator);
+
+        coordinator.notifyCheckpointAborted(1L);
+        assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(1L);

Review Comment:
   @liming30, thanks for the update. I left a minor comment.
   
   nits:
   
   ```suggestion
   long checkpointId = 10L;
   coordinator.notifyCheckpointAborted(checkpointId);
   assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(checkpointId);
   ```






[GitHub] [flink] 1996fanrui commented on a diff in pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


1996fanrui commented on code in PR #23059:
URL: https://github.com/apache/flink/pull/23059#discussion_r1272922054


##
flink-runtime/src/test/java/org/apache/flink/runtime/operators/coordination/RecreateOnResetOperatorCoordinatorTest.java:
##
@@ -264,6 +264,19 @@ public void testConsecutiveResetToCheckpoint() throws Exception {
                 "Timed out when waiting for the coordinator to close.");
     }
 
+    @Test
+    public void testNotifyCheckpointAbortedSuccess() throws Exception {
+        TestingCoordinatorProvider provider = new TestingCoordinatorProvider(null);
+        MockOperatorCoordinatorContext context =
+                new MockOperatorCoordinatorContext(OPERATOR_ID, NUM_SUBTASKS);
+        RecreateOnResetOperatorCoordinator coordinator = createCoordinator(provider, context);
+        TestingOperatorCoordinator internalCoordinatorAfterReset =
+                getInternalCoordinator(coordinator);
+
+        coordinator.notifyCheckpointAborted(1L);
+        assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isEqualTo(1L);

Review Comment:
   nits:
   
   ```suggestion
   assertThat(internalCoordinatorAfterReset.getLastCheckpointAborted()).isOne();
   ```






[jira] [Updated] (FLINK-32658) State should not be silently removed when ignore-unclaimed-state is false

2023-07-24 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-32658:

Description: 
When ignore-unclaimed-state is false and the old state is removed, Flink should 
throw an exception. This is similar to removing a stateful operator.

This case occurs not only when the user removes state, but also when the 
operator is replaced. 

For example: upgrading FlinkKafkaConsumer to KafkaSource. The job logic is 
unchanged, so the operator ID is unchanged too. The KafkaSource cannot resume 
from the state of FlinkKafkaConsumer. However, the new Flink job can start, and 
the state is silently removed in the new job. (The old state is not physically 
discarded; it is still stored in the state backend, but the new code will never 
use it.)

It also brings an additional problem. The KafkaSource will snapshot two states: 
its own new state and the union list state of FlinkKafkaConsumer. Whenever a 
job resumes from a checkpoint, the union list state is inflated. Eventually the 
state size of the Kafka offsets exceeded 200 MB.

 !screenshot-1.png! 


  was:
When ignore-unclaimed-state is false and the old state is removed, flink should 
throw exception. It's similar to removing a stateful operator.

This case occurs not only when the user removes state, but also when the 
operator is replaced. 

For example: upgrade FlinkKafkaConsumer to KafkaSource. All logical are not 
changed, so the operator id isn't changed. The KafkaSource cannot resume from 
the state of FlinkKafkaConsumer. However, flink job can start, and the state is 
silently discarded.(The old state is not physically discarded, it is still 
stored in the state backend, but the new code will never use it.)

It also brings an additional problem: the KafkaSource will snapshot 2 states, 
it includes the new state of KafkaSource, and the union list state of 
FlinkKafkaConsumer. Whenever a job resumes from checkpoint, the union List 
state is inflated. Eventually the state size of kafka offset exceeded 200MB.

 !screenshot-1.png! 



> State should not be silently removed when ignore-unclaimed-state is false
> -
>
> Key: FLINK-32658
> URL: https://issues.apache.org/jira/browse/FLINK-32658
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0, 1.17.1
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Attachments: screenshot-1.png
>
>
> When ignore-unclaimed-state is false and the old state is removed, Flink 
> should throw an exception. This is similar to removing a stateful operator.
> This case occurs not only when the user removes state, but also when the 
> operator is replaced. 
> For example: upgrading FlinkKafkaConsumer to KafkaSource. The job logic is 
> unchanged, so the operator ID is unchanged too. The KafkaSource cannot resume 
> from the state of FlinkKafkaConsumer. However, the new Flink job can start, 
> and the state is silently removed in the new job. (The old state is not 
> physically discarded; it is still stored in the state backend, but the new 
> code will never use it.)
> It also brings an additional problem. The KafkaSource will snapshot two 
> states: its own new state and the union list state of FlinkKafkaConsumer. 
> Whenever a job resumes from a checkpoint, the union list state is inflated. 
> Eventually the state size of the Kafka offsets exceeded 200 MB.
>  !screenshot-1.png! 
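
The check requested in this ticket can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Flink's restore logic: the class `UnclaimedStateCheckSketch`, its method names, and the state names `legacy-offsets` and `source-reader-state` are all hypothetical. The idea is to compare the state names found in the snapshot with the names the new operator registers, and to fail unless unclaimed state is explicitly allowed.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; names and structure are assumptions, not Flink APIs.
final class UnclaimedStateCheckSketch {

    /** Returns the restored state names that the new operator never registers. */
    static Set<String> findUnclaimed(Set<String> restored, Set<String> registered) {
        Set<String> unclaimed = new TreeSet<>(restored);
        unclaimed.removeAll(registered);
        return unclaimed;
    }

    /** Fails when restored state is left unclaimed and ignoring it is not allowed. */
    static void verify(Set<String> restored, Set<String> registered, boolean ignoreUnclaimedState) {
        Set<String> unclaimed = findUnclaimed(restored, registered);
        if (!unclaimed.isEmpty() && !ignoreUnclaimedState) {
            throw new IllegalStateException(
                    "Restored state is not claimed by the new operator: " + unclaimed);
        }
    }

    public static void main(String[] args) {
        // Assumed names: the old operator wrote one state, the new one registers another.
        Set<String> restored = Set.of("legacy-offsets");
        Set<String> registered = Set.of("source-reader-state");

        try {
            verify(restored, registered, false);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }

        // With the flag set, the mismatch is tolerated.
        verify(restored, registered, true);
        System.out.println("Accepted with ignore-unclaimed-state = true");
    }
}
```

With such a check, replacing an operator while keeping its ID would fail fast on restore instead of carrying the old union list state along in every later snapshot.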



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32658) State should not be silently removed when ignore-unclaimed-state is false

2023-07-24 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan updated FLINK-32658:

Description: 
When ignore-unclaimed-state is false and the old state is removed, Flink should 
throw an exception. This is similar to removing a stateful operator.

This case occurs not only when the user removes state, but also when the 
operator is replaced. 

For example: upgrading FlinkKafkaConsumer to KafkaSource. The job logic is 
unchanged, so the operator ID is unchanged too. The KafkaSource cannot resume 
from the state of FlinkKafkaConsumer. However, the Flink job can start, and the 
state is silently discarded. (The old state is not physically discarded; it is 
still stored in the state backend, but the new code will never use it.)

It also brings an additional problem. The KafkaSource will snapshot two states: 
its own new state and the union list state of FlinkKafkaConsumer. Whenever a 
job resumes from a checkpoint, the union list state is inflated. Eventually the 
state size of the Kafka offsets exceeded 200 MB.

 !screenshot-1.png! 


  was:
When ignore-unclaimed-state is false and the old state is removed, flink should 
throw exception.

It's similar with removing an stateful operator.

This case occurs not only when the user removes state, but also when the 
operator is replaced. 

For example: upgrade FlinkKafkaConsumer to KafkaSource. All logical are not 
changed, so the operator id isn't changed. The KafkaSource cannot resume from 
the state of FlinkKafkaConsumer. However, flink job can start, and the state is 
silently discarded.

It also brings an additional problem: the KafkaSource will snapshot 2 states, 
it includes the new state of KafkaSource, and the union list state of 
FlinkKafkaConsumer. Whenever a job resumes from checkpoint, the union List 
state is inflated. Eventually the state size of kafka offset exceeded 200MB.

 !screenshot-1.png! 



> State should not be silently removed when ignore-unclaimed-state is false
> -
>
> Key: FLINK-32658
> URL: https://issues.apache.org/jira/browse/FLINK-32658
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.18.0, 1.17.1
>Reporter: Rui Fan
>Assignee: Rui Fan
>Priority: Major
> Attachments: screenshot-1.png
>
>
> When ignore-unclaimed-state is false and the old state is removed, Flink 
> should throw an exception. This is similar to removing a stateful operator.
> This case occurs not only when the user removes state, but also when the 
> operator is replaced. 
> For example: upgrading FlinkKafkaConsumer to KafkaSource. The job logic is 
> unchanged, so the operator ID is unchanged too. The KafkaSource cannot resume 
> from the state of FlinkKafkaConsumer. However, the Flink job can start, and 
> the state is silently discarded. (The old state is not physically discarded; 
> it is still stored in the state backend, but the new code will never use it.)
> It also brings an additional problem. The KafkaSource will snapshot two 
> states: its own new state and the union list state of FlinkKafkaConsumer. 
> Whenever a job resumes from a checkpoint, the union list state is inflated. 
> Eventually the state size of the Kafka offsets exceeded 200 MB.
>  !screenshot-1.png! 







[GitHub] [flink] RanJinh commented on pull request #23000: [FLINK-32594][runtime] Use blocking ResultPartitionType if operator only outputs records on EOF

2023-07-24 Thread via GitHub


RanJinh commented on PR #23000:
URL: https://github.com/apache/flink/pull/23000#issuecomment-1648855116

   @flinkbot run azure





[jira] [Commented] (FLINK-32560) Properly deprecate all Scala APIs

2023-07-24 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746725#comment-17746725
 ] 

Xintong Song commented on FLINK-32560:
--

Thanks, [~rskraba]. Your proposal sounds good to me. The purpose of this PR is 
indeed to draw more attention to the deprecation of the Scala APIs, rather than 
to legitimize their removal in 2.0.

> Properly deprecate all Scala APIs
> -
>
> Key: FLINK-32560
> URL: https://issues.apache.org/jira/browse/FLINK-32560
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Scala
>Reporter: Xintong Song
>Assignee: Ryan Skraba
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> We agreed to drop Scala API support in FLIP-265 [1], and have tried to 
> deprecate the Scala APIs in FLINK-29740. Also, both the user documentation 
> and the roadmap [2] show that Scala API support is deprecated. However, none 
> of the APIs in `flink-streaming-scala` are annotated with `@Deprecated` at 
> the moment, and only `ExecutionEnvironment` and `package` are marked 
> `@Deprecated` in `flink-scala`.
> [1] 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-265+Deprecate+and+remove+Scala+API+support
> [2] https://flink.apache.org/roadmap/





[jira] [Updated] (FLINK-31634) FLIP-301: Hybrid Shuffle supports Remote Storage

2023-07-24 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-31634:
--
Release Note: 
This introduces a new Hybrid Shuffle mode to support remote storage. The remote 
storage path is specified by `taskmanager.network.hybrid-shuffle.remote.path`. 
The new mode has resolved existing issues in the legacy Hybrid Shuffle mode. 
First, the new mode requires less network memory. Second, the new mode can 
store shuffle data in remote storage when local disk space is insufficient, 
which can avoid insufficient-disk-space errors. The new mode is only enabled 
when `taskmanager.network.hybrid-shuffle.remote.path` is configured.

Currently, the new Hybrid Shuffle mode is in an experimental phase. In case of 
unexpected issues, the new mode can fall back to the legacy mode by setting 
`taskmanager.network.hybrid-shuffle.enable-new-mode` to `false`. Once the new 
mode is stable, the legacy mode and the fallback option will be removed.

  was:
This introduces a new Hybrid Shuffle mode to support remote storage. The remote 
storage path is specified by `taskmanager.network.hybrid-shuffle.remote.path`.  
The new Hybrid Shuffle mode has resolved existing issues in the legacy Hybrid 
Shuffle mode. Firstly, the new mode uses less required network memory. 
Secondly, the new mode can store shuffle data in remote storage when the disk 
space is not enough, which could avoid insufficient disk space errors. The new 
mode is only enabled when `taskmanager.network.hybrid-shuffle.remote.path` is 
configured.

Currently, the new Hybrid Shuffle mode is in an experimental phase. In case of 
unexpected issues, the new mode can fall back to the legacy mode by setting 
`taskmanager.network.hybrid-shuffle.enable-new-mode` to `false`. Once the new 
mode is stable, the legacy mode and the fallback option will be removed.


> FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-31634
> URL: https://issues.apache.org/jira/browse/FLINK-31634
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: Umbrella
>
> This is an umbrella ticket for 
> [FLIP-301|https://cwiki.apache.org/confluence/display/FLINK/FLIP-301%3A+Hybrid+Shuffle+supports+Remote+Storage].
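
Read literally, the release note describes a two-key decision, which can be sketched as below. This is an illustration only, not how Flink evaluates its configuration: the class `HybridShuffleModeSketch` is hypothetical, the path `s3://example-bucket/flink-shuffle` is a made-up value, and the sketch assumes that `taskmanager.network.hybrid-shuffle.enable-new-mode` defaults to `true` when unset.

```java
import java.util.Map;

// Hypothetical sketch of the decision described in the release note; not Flink code.
final class HybridShuffleModeSketch {
    static final String REMOTE_PATH = "taskmanager.network.hybrid-shuffle.remote.path";
    static final String ENABLE_NEW_MODE = "taskmanager.network.hybrid-shuffle.enable-new-mode";

    /** New mode applies only if a remote path is set and the fallback switch is not false. */
    static boolean usesNewMode(Map<String, String> conf) {
        String remotePath = conf.get(REMOTE_PATH);
        boolean remoteConfigured = remotePath != null && !remotePath.isEmpty();
        // Assumption: the switch defaults to true when the key is absent.
        boolean newModeAllowed =
                Boolean.parseBoolean(conf.getOrDefault(ENABLE_NEW_MODE, "true"));
        return remoteConfigured && newModeAllowed;
    }

    public static void main(String[] args) {
        // Example path is invented for the sketch.
        Map<String, String> withRemote = Map.of(REMOTE_PATH, "s3://example-bucket/flink-shuffle");
        Map<String, String> fallback = Map.of(
                REMOTE_PATH, "s3://example-bucket/flink-shuffle",
                ENABLE_NEW_MODE, "false");

        System.out.println("no remote path  -> new mode: " + usesNewMode(Map.of()));
        System.out.println("remote path set -> new mode: " + usesNewMode(withRemote));
        System.out.println("fallback set    -> new mode: " + usesNewMode(fallback));
    }
}
```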





[jira] [Updated] (FLINK-31634) FLIP-301: Hybrid Shuffle supports Remote Storage

2023-07-24 Thread Yuxin Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-31634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuxin Tan updated FLINK-31634:
--
Release Note: 
This introduces a new Hybrid Shuffle mode to support remote storage. The remote 
storage path is specified by `taskmanager.network.hybrid-shuffle.remote.path`. 
The new Hybrid Shuffle mode has resolved existing issues in the legacy Hybrid 
Shuffle mode. First, the new mode requires less network memory. Second, the 
new mode can store shuffle data in remote storage when local disk space is 
insufficient, which can avoid insufficient-disk-space errors. The new mode is 
only enabled when `taskmanager.network.hybrid-shuffle.remote.path` is 
configured.

Currently, the new Hybrid Shuffle mode is in an experimental phase. In case of 
unexpected issues, the new mode can fall back to the legacy mode by setting 
`taskmanager.network.hybrid-shuffle.enable-new-mode` to `false`. Once the new 
mode is stable, the legacy mode and the fallback option will be removed.

  was:
This introduces a new Hybrid Shuffle mode to support remote storage. The remote 
storage path is specified by `taskmanager.network.hybrid-shuffle.remote.path`.  
The new Hybrid Shuffle mode has resolved existing issues in the legacy mode. 
Firstly, the new mode uses less required network memory. Secondly, the new mode 
can store shuffle data in remote storage when the disk space is not enough, 
which could avoid insufficient disk space errors. The new mode is only enabled 
when `taskmanager.network.hybrid-shuffle.remote.path` is configured.

Currently, the new Hybrid Shuffle mode is in an experimental phase. In case of 
unexpected issues, the new mode can fall back to the legacy mode by setting 
`taskmanager.network.hybrid-shuffle.enable-new-mode` to `false`. Once the new 
mode is stable, the legacy mode and the fallback option will be removed.


> FLIP-301: Hybrid Shuffle supports Remote Storage
> 
>
> Key: FLINK-31634
> URL: https://issues.apache.org/jira/browse/FLINK-31634
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Network
>Affects Versions: 1.18.0
>Reporter: Yuxin Tan
>Assignee: Yuxin Tan
>Priority: Major
>  Labels: Umbrella
>
> This is an umbrella ticket for 
> [FLIP-301|https://cwiki.apache.org/confluence/display/FLINK/FLIP-301%3A+Hybrid+Shuffle+supports+Remote+Storage].





[jira] [Updated] (FLINK-32664) TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing

2023-07-24 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32664:

Labels: pull-request-available test-stability  (was: pull-request-available)

> TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing
> ---
>
> Key: FLINK-32664
> URL: https://issues.apache.org/jira/browse/FLINK-32664
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Blocker
>  Labels: pull-request-available, test-stability
>
> Blocker since it's failing on every build and reproduced locally
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51661=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529





[GitHub] [flink] snuyanzin commented on pull request #23069: [FLINK-32664][table] Fix TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown

2023-07-24 Thread via GitHub


snuyanzin commented on PR #23069:
URL: https://github.com/apache/flink/pull/23069#issuecomment-1648774263

   //cc @LadyForest , @twalthr 





[GitHub] [flink] flinkbot commented on pull request #23069: [FLINK-32664][table] Fix TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown

2023-07-24 Thread via GitHub


flinkbot commented on PR #23069:
URL: https://github.com/apache/flink/pull/23069#issuecomment-1648773182

   
   ## CI report:
   
   * 0796eea2ffa27e8c23b7533fa47033c40213ee72 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-32664) TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing

2023-07-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32664:
---
Labels: pull-request-available  (was: )

> TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing
> ---
>
> Key: FLINK-32664
> URL: https://issues.apache.org/jira/browse/FLINK-32664
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Blocker
>  Labels: pull-request-available
>
> Blocker since it's failing on every build and reproduced locally
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51661=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529





[GitHub] [flink] snuyanzin opened a new pull request, #23069: [FLINK-32664][table] Fix TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown

2023-07-24 Thread via GitHub


snuyanzin opened a new pull request, #23069:
URL: https://github.com/apache/flink/pull/23069

   ## What is the purpose of the change
   
   This PR aims to fix the CI tests, in particular 
`TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown`, which broke 
after https://issues.apache.org/jira/browse/FLINK-32657
   
https://github.com/apache/flink/commit/6a3808213d334de614e162362d1583f3e5322358 
   
   
   ## Verifying this change
   
   This change only modifies tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[jira] [Commented] (FLINK-32664) TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746706#comment-17746706
 ] 

Sergey Nuyanzin commented on FLINK-32664:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51663=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529

> TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing
> ---
>
> Key: FLINK-32664
> URL: https://issues.apache.org/jira/browse/FLINK-32664
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Blocker
>
> Blocker since it's failing on every build and reproduced locally
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51661=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529





[jira] [Created] (FLINK-32664) TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing

2023-07-24 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-32664:
---

 Summary: 
TableSourceJsonPlanTest.testReuseSourceWithoutProjectionPushDown is failing
 Key: FLINK-32664
 URL: https://issues.apache.org/jira/browse/FLINK-32664
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin
Assignee: Sergey Nuyanzin


Blocker since it's failing on every build and reproduced locally
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51661=logs=0c940707-2659-5648-cbe6-a1ad63045f0a=075c2716-8010-5565-fe08-3c4bb45824a4=11529





[jira] [Commented] (FLINK-30719) flink-runtime-web failed due to a corrupted

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-30719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746704#comment-17746704
 ] 

Sergey Nuyanzin commented on FLINK-30719:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51654=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb=11008

> flink-runtime-web failed due to a corrupted 
> 
>
> Key: FLINK-30719
> URL: https://issues.apache.org/jira/browse/FLINK-30719
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend, Test Infrastructure, Tests
>Affects Versions: 1.16.0, 1.17.0, 1.18.0
>Reporter: Matthias Pohl
>Assignee: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=44954=logs=52b61abe-a3cc-5bde-cc35-1bbe89bb7df5=54421a62-0c80-5aad-3319-094ff69180bb=12550
> The build failed due to a corrupted nodejs dependency:
> {code}
> [ERROR] The archive file 
> /__w/1/.m2/repository/com/github/eirslett/node/16.13.2/node-16.13.2-linux-x64.tar.gz
>  is corrupted and will be deleted. Please try the build again.
> {code}





[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746703#comment-17746703
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51629=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=105

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51394=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102
> fails as
> {noformat}
> 2023-07-19T00:18:36.5467620Z   Failed to build fastavro
> 2023-07-19T00:18:36.5507410Z   ERROR: Could not build wheels for 
> fastavro, which is required to install pyproject.toml-based projects
> 2023-07-19T00:18:36.5540080Z   [end of output]
> 2023-07-19T00:18:36.5568470Z   
> 2023-07-19T00:18:36.5603540Z   note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5633470Z error: subprocess-exited-with-error
> 2023-07-19T00:18:36.5669130Z 
> 2023-07-19T00:18:36.5709780Z × pip subprocess to install build dependencies 
> did not run successfully.
> 2023-07-19T00:18:36.5737700Z │ exit code: 1
> 2023-07-19T00:18:36.5764350Z ╰─> See above for output.
> 2023-07-19T00:18:36.5791010Z 
> 2023-07-19T00:18:36.5819050Z note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5847430Z ##[endgroup]
> 2023-07-19T00:18:36.5884460Z  
> ✕ 56.47s
> 2023-07-19T00:18:36.5921230Z Error: Command ['python', '-m', 'pip', 
> 'wheel', '/Users/runner/work/1/s/flink-python', 
> '--wheel-dir=/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/cibw-run-pzvjusx9/cp37-macosx_x86_64/built_wheel',
>  '--no-deps'] failed with code 1. None
> {noformat}
> This is probably the reason for the failure:
> {quote}
> is required to install pyproject.toml-based projects
> {quote}
> However, it is not clear why it started to fail only recently.
>  





[jira] [Updated] (FLINK-28879) New File Sink s3 end-to-end test failed with Output hash mismatch

2023-07-24 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-28879:

Affects Version/s: 1.17.2

> New File Sink s3 end-to-end test failed with Output hash mismatch
> -
>
> Key: FLINK-28879
> URL: https://issues.apache.org/jira/browse/FLINK-28879
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Connectors / FileSystem, Tests
>Affects Versions: 1.16.0, 1.17.2
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> 2022-08-09T00:50:02.8229585Z Aug 09 00:50:02 FAIL File Streaming Sink: Output 
> hash mismatch.  Got 6037b01ca0ffc61a95c12cb475c661a8, expected 
> 6727342fdd3aae2129e61fc8f433fb6f.
> 2022-08-09T00:50:02.8230700Z Aug 09 00:50:02 head hexdump of actual:
> 2022-08-09T00:50:02.8477319Z Aug 09 00:50:02 000   E   r   r   o   r  
>  e   x   e   c   u   t   i   n   g
> 2022-08-09T00:50:02.8478206Z Aug 09 00:50:02 010   a   w   s   c   o  
>  m   m   a   n   d   :   s   3
> 2022-08-09T00:50:02.8479475Z Aug 09 00:50:02 020   c   p   -   -   q  
>  u   i   e   t   s   3   :   /   /
> 2022-08-09T00:50:02.8480205Z Aug 09 00:50:02 030   f   l   i   n   k   -  
>  i   n   t   e   g   r   a   t   i   o
> 2022-08-09T00:50:02.8480924Z Aug 09 00:50:02 040   n   -   t   e   s   t  
>  s   /   t   e   m   p   /   t   e   s
> 2022-08-09T00:50:02.8481612Z Aug 09 00:50:02 050   t   _   f   i   l   e  
>  _   s   i   n   k   -   1   d   3   d
> 2022-08-09T00:50:02.8483048Z Aug 09 00:50:02 060   4   0   0   8   -   b  
>  0   b   f   -   4   2   6   5   -   b
> 2022-08-09T00:50:02.8483618Z Aug 09 00:50:02 070   e   0   e   -   3   b  
>  9   f   7   8   2   c   5   5   2   d
> 2022-08-09T00:50:02.8484222Z Aug 09 00:50:02 080   /   h   o   s   t  
>  d   i   r   /   /   t   e   m   p   -
> 2022-08-09T00:50:02.8484831Z Aug 09 00:50:02 090   t   e   s   t   -   d  
>  i   r   e   c   t   o   r   y   -   2
> 2022-08-09T00:50:02.8485719Z Aug 09 00:50:02 0a0   3   9   3   7   7   8  
>  2   6   8   0   /   t   e   m   p   /
> 2022-08-09T00:50:02.8486427Z Aug 09 00:50:02 0b0   t   e   s   t   _   f  
>  i   l   e   _   s   i   n   k   -   1
> 2022-08-09T00:50:02.8487134Z Aug 09 00:50:02 0c0   d   3   d   4   0   0  
>  8   -   b   0   b   f   -   4   2   6
> 2022-08-09T00:50:02.8487826Z Aug 09 00:50:02 0d0   5   -   b   e   0   e  
>  -   3   b   9   f   7   8   2   c   5
> 2022-08-09T00:50:02.8488511Z Aug 09 00:50:02 0e0   5   2   d   -   -  
>  e   x   c   l   u   d   e   '   *
> 2022-08-09T00:50:02.8489202Z Aug 09 00:50:02 0f0   '   -   -   i   n  
>  c   l   u   d   e   '   *   /   p
> 2022-08-09T00:50:02.8489891Z Aug 09 00:50:02 100   a   r   t   -   [   !  
>  /   ]   *   '   -   -   r   e   c
> 2022-08-09T00:50:02.8490385Z Aug 09 00:50:02 110   u   r   s   i   v   e  
> \n
> 2022-08-09T00:50:02.8490822Z Aug 09 00:50:02 117
> 2022-08-09T00:50:02.8502212Z Aug 09 00:50:02 Stopping job timeout watchdog 
> (with pid=141134)
> 2022-08-09T00:50:06.8430959Z rm: cannot remove 
> '/home/vsts/work/1/s/flink-dist/target/flink-1.16-SNAPSHOT-bin/flink-1.16-SNAPSHOT/lib/flink-shaded-netty-tcnative-static-*.jar':
>  No such file or directory
> 2022-08-09T00:50:06.9278248Z Aug 09 00:50:06 
> 5ccfeb22307c2a88625a38b9537acc79001d1b29094ef40fd70692ce11407502
> 2022-08-09T00:50:06.9618147Z Aug 09 00:50:06 
> 5ccfeb22307c2a88625a38b9537acc79001d1b29094ef40fd70692ce11407502
> 2022-08-09T00:50:06.9645077Z Aug 09 00:50:06 [FAIL] Test script contains 
> errors.
> 2022-08-09T00:50:06.9666227Z Aug 09 00:50:06 Checking of logs skipped.
> 2022-08-09T00:50:06.9671891Z Aug 09 00:50:06 
> 2022-08-09T00:50:06.9673050Z Aug 09 00:50:06 [FAIL] 'New File Sink s3 
> end-to-end test' failed after 3 minutes and 42 seconds! Test exited with exit 
> code 1
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39667=logs=af184cdd-c6d8-5084-0b69-7e9c67b35f7a=160c9ae5-96fd-516e-1c91-deb81f59292a=4136



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Reopened] (FLINK-28879) New File Sink s3 end-to-end test failed with Output hash mismatch

2023-07-24 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-28879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin reopened FLINK-28879:
-

> New File Sink s3 end-to-end test failed with Output hash mismatch
> -
>
> Key: FLINK-28879
> URL: https://issues.apache.org/jira/browse/FLINK-28879
> Project: Flink
>  Issue Type: Bug
>  Components: API / DataStream, Connectors / FileSystem, Tests
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: test-stability
>
> {code:java}
> 2022-08-09T00:50:02.8229585Z Aug 09 00:50:02 FAIL File Streaming Sink: Output 
> hash mismatch.  Got 6037b01ca0ffc61a95c12cb475c661a8, expected 
> 6727342fdd3aae2129e61fc8f433fb6f.
> 2022-08-09T00:50:02.8230700Z Aug 09 00:50:02 head hexdump of actual:
> 2022-08-09T00:50:02.8477319Z Aug 09 00:50:02 000   E   r   r   o   r  
>  e   x   e   c   u   t   i   n   g
> 2022-08-09T00:50:02.8478206Z Aug 09 00:50:02 010   a   w   s   c   o  
>  m   m   a   n   d   :   s   3
> 2022-08-09T00:50:02.8479475Z Aug 09 00:50:02 020   c   p   -   -   q  
>  u   i   e   t   s   3   :   /   /
> 2022-08-09T00:50:02.8480205Z Aug 09 00:50:02 030   f   l   i   n   k   -  
>  i   n   t   e   g   r   a   t   i   o
> 2022-08-09T00:50:02.8480924Z Aug 09 00:50:02 040   n   -   t   e   s   t  
>  s   /   t   e   m   p   /   t   e   s
> 2022-08-09T00:50:02.8481612Z Aug 09 00:50:02 050   t   _   f   i   l   e  
>  _   s   i   n   k   -   1   d   3   d
> 2022-08-09T00:50:02.8483048Z Aug 09 00:50:02 060   4   0   0   8   -   b  
>  0   b   f   -   4   2   6   5   -   b
> 2022-08-09T00:50:02.8483618Z Aug 09 00:50:02 070   e   0   e   -   3   b  
>  9   f   7   8   2   c   5   5   2   d
> 2022-08-09T00:50:02.8484222Z Aug 09 00:50:02 080   /   h   o   s   t  
>  d   i   r   /   /   t   e   m   p   -
> 2022-08-09T00:50:02.8484831Z Aug 09 00:50:02 090   t   e   s   t   -   d  
>  i   r   e   c   t   o   r   y   -   2
> 2022-08-09T00:50:02.8485719Z Aug 09 00:50:02 0a0   3   9   3   7   7   8  
>  2   6   8   0   /   t   e   m   p   /
> 2022-08-09T00:50:02.8486427Z Aug 09 00:50:02 0b0   t   e   s   t   _   f  
>  i   l   e   _   s   i   n   k   -   1
> 2022-08-09T00:50:02.8487134Z Aug 09 00:50:02 0c0   d   3   d   4   0   0  
>  8   -   b   0   b   f   -   4   2   6
> 2022-08-09T00:50:02.8487826Z Aug 09 00:50:02 0d0   5   -   b   e   0   e  
>  -   3   b   9   f   7   8   2   c   5
> 2022-08-09T00:50:02.8488511Z Aug 09 00:50:02 0e0   5   2   d   -   -  
>  e   x   c   l   u   d   e   '   *
> 2022-08-09T00:50:02.8489202Z Aug 09 00:50:02 0f0   '   -   -   i   n  
>  c   l   u   d   e   '   *   /   p
> 2022-08-09T00:50:02.8489891Z Aug 09 00:50:02 100   a   r   t   -   [   !  
>  /   ]   *   '   -   -   r   e   c
> 2022-08-09T00:50:02.8490385Z Aug 09 00:50:02 110   u   r   s   i   v   e  
> \n
> 2022-08-09T00:50:02.8490822Z Aug 09 00:50:02 117
> 2022-08-09T00:50:02.8502212Z Aug 09 00:50:02 Stopping job timeout watchdog 
> (with pid=141134)
> 2022-08-09T00:50:06.8430959Z rm: cannot remove 
> '/home/vsts/work/1/s/flink-dist/target/flink-1.16-SNAPSHOT-bin/flink-1.16-SNAPSHOT/lib/flink-shaded-netty-tcnative-static-*.jar':
>  No such file or directory
> 2022-08-09T00:50:06.9278248Z Aug 09 00:50:06 
> 5ccfeb22307c2a88625a38b9537acc79001d1b29094ef40fd70692ce11407502
> 2022-08-09T00:50:06.9618147Z Aug 09 00:50:06 
> 5ccfeb22307c2a88625a38b9537acc79001d1b29094ef40fd70692ce11407502
> 2022-08-09T00:50:06.9645077Z Aug 09 00:50:06 [FAIL] Test script contains 
> errors.
> 2022-08-09T00:50:06.9666227Z Aug 09 00:50:06 Checking of logs skipped.
> 2022-08-09T00:50:06.9671891Z Aug 09 00:50:06 
> 2022-08-09T00:50:06.9673050Z Aug 09 00:50:06 [FAIL] 'New File Sink s3 
> end-to-end test' failed after 3 minutes and 42 seconds! Test exited with exit 
> code 1
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=39667&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=160c9ae5-96fd-516e-1c91-deb81f59292a&l=4136
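
As a hedged illustration of the failure mode in the log above: the hexdump of the actual output starts with "Error executing aws command", which suggests the hashed bytes were an error message rather than sink output. The sketch below assumes an MD5 digest (the 32-hex-digit values are consistent with MD5, but the log does not name the algorithm). The function names and sample data are hypothetical, not taken from the Flink test scripts.

```python
import hashlib


def output_hash(data: bytes) -> str:
    """Return the hex MD5 digest of the collected output (MD5 is an assumption)."""
    return hashlib.md5(data).hexdigest()


def check_result_hash(name: str, actual: bytes, expected_hash: str) -> bool:
    """Compare the digest of `actual` with `expected_hash` and report a mismatch."""
    got = output_hash(actual)
    if got != expected_hash:
        # Show the head of the actual bytes, similar in spirit to the hexdump in the log.
        print(f"FAIL {name}: Output hash mismatch. Got {got}, expected {expected_hash}.")
        print(f"head of actual: {actual[:64]!r}")
        return False
    print(f"pass {name}")
    return True


# Hypothetical data: what the sink should have produced vs. what was captured.
expected_output = b"0\n1\n2\n"
captured_output = b"Error executing aws command: s3 cp --quiet ...\n"

expected = output_hash(expected_output)
ok_good = check_result_hash("File Streaming Sink", expected_output, expected)
ok_bad = check_result_hash("File Streaming Sink", captured_output, expected)
```

If the download step fails and its error text is captured in place of the sink files, the digest can never match. In that case the mismatch would point at the S3 copy step, not at the data written by the sink.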





[jira] [Commented] (FLINK-28879) New File Sink s3 end-to-end test failed with Output hash mismatch

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746702#comment-17746702
 ] 

Sergey Nuyanzin commented on FLINK-28879:
-

Reopened since the failure was reproduced in 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51629&view=logs&j=87489130-75dc-54e4-1f45-80c30aa367a3&t=efbee0b1-38ac-597d-6466-1ea8fc908c50&l=4192






[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746701#comment-17746701
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51628&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=120

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51394&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=102
> fails as
> {noformat}
> 2023-07-19T00:18:36.5467620Z   Failed to build fastavro
> 2023-07-19T00:18:36.5507410Z   ERROR: Could not build wheels for 
> fastavro, which is required to install pyproject.toml-based projects
> 2023-07-19T00:18:36.5540080Z   [end of output]
> 2023-07-19T00:18:36.5568470Z   
> 2023-07-19T00:18:36.5603540Z   note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5633470Z error: subprocess-exited-with-error
> 2023-07-19T00:18:36.5669130Z 
> 2023-07-19T00:18:36.5709780Z × pip subprocess to install build dependencies 
> did not run successfully.
> 2023-07-19T00:18:36.5737700Z │ exit code: 1
> 2023-07-19T00:18:36.5764350Z ╰─> See above for output.
> 2023-07-19T00:18:36.5791010Z 
> 2023-07-19T00:18:36.5819050Z note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5847430Z ##[endgroup]
> 2023-07-19T00:18:36.5884460Z  
> ✕ 56.47s
> 2023-07-19T00:18:36.5921230Z Error: Command ['python', '-m', 'pip', 
> 'wheel', '/Users/runner/work/1/s/flink-python', 
> '--wheel-dir=/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/cibw-run-pzvjusx9/cp37-macosx_x86_64/built_wheel',
>  '--no-deps'] failed with code 1. None
> {noformat}
> This is probably the reason for the failure: 
> {quote}
> is required to install pyproject.toml-based projects
> {quote}
> However, it is not clear why it started to fail only recently.
>  





[jira] [Commented] (FLINK-32663) RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746700#comment-17746700
 ] 

Sergey Nuyanzin commented on FLINK-32663:
-

Also a very similar failure of 
{{RescalingITCase.testSavepointRescalingOutPartitionedOperatorStateList}}: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51627&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=8664

> RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList fails on 
> AZP
> -
>
> Key: FLINK-32663
> URL: https://issues.apache.org/jira/browse/FLINK-32663
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51501&view=logs&j=8fd9202e-fd17-5b26-353c-ac1ff76c8f28&t=ea7cf968-e585-52cb-e0fc-f48de023a7ca&l=8665
> fails as
> {noformat}
> Jul 21 01:24:54 01:24:54.146 [ERROR] 
> RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList  Time 
> elapsed: 1.485 s  <<< FAILURE!
> Jul 21 01:24:54 java.lang.AssertionError: expected:<530> but was:<30>
> Jul 21 01:24:54   at org.junit.Assert.fail(Assert.java:89)
> Jul 21 01:24:54   at org.junit.Assert.failNotEquals(Assert.java:835)
> Jul 21 01:24:54   at org.junit.Assert.assertEquals(Assert.java:647)
> Jul 21 01:24:54   at org.junit.Assert.assertEquals(Assert.java:633)
> Jul 21 01:24:54   at 
> org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingPartitionedOperatorState(RescalingITCase.java:621)
> Jul 21 01:24:54   at 
> org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList(RescalingITCase.java:508)
> Jul 21 01:24:54   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Jul 21 01:24:54   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 21 01:24:54   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:4
> ...
> {noformat}





[jira] [Commented] (FLINK-32662) JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746699#comment-17746699
 ] 

Sergey Nuyanzin commented on FLINK-32662:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51627&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=8583

> JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP
> -
>
> Key: FLINK-32662
> URL: https://issues.apache.org/jira/browse/FLINK-32662
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51452&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=8654
> fails with NPE as
> {noformat}
> Jul 20 01:01:33 01:01:33.491 [ERROR] 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats
>   Time elapsed: 0.036 s  <<< ERROR!
> Jul 20 01:01:33 java.lang.NullPointerException
> Jul 20 01:01:33   at 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats(JobMasterTest.java:2132)
> Jul 20 01:01:33   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Jul 20 01:01:33   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 20 01:01:33   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 20 01:01:33   at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 20 01:01:33   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> ...
> {noformat}





[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746698#comment-17746698
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51627&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=105






[jira] [Comment Edited] (FLINK-32523) NotifyCheckpointAbortedITCase.testNotifyCheckpointAborted fails with timeout on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746696#comment-17746696
 ] 

Sergey Nuyanzin edited comment on FLINK-32523 at 7/24/23 11:28 PM:
---

1.16: 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51608&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8503


was (Author: sergey nuyanzin):
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51608&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8503

> NotifyCheckpointAbortedITCase.testNotifyCheckpointAborted fails with timeout 
> on AZP
> ---
>
> Key: FLINK-32523
> URL: https://issues.apache.org/jira/browse/FLINK-32523
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.16.2, 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=50795&view=logs&j=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3&t=0c010d0c-3dec-5bf1-d408-7b18988b1b2b&l=8638
>  fails with timeout
> {noformat}
> Jul 03 01:26:35 org.junit.runners.model.TestTimedOutException: test timed out 
> after 10 milliseconds
> Jul 03 01:26:35   at java.lang.Object.wait(Native Method)
> Jul 03 01:26:35   at java.lang.Object.wait(Object.java:502)
> Jul 03 01:26:35   at 
> org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:61)
> Jul 03 01:26:35   at 
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase.verifyAllOperatorsNotifyAborted(NotifyCheckpointAbortedITCase.java:198)
> Jul 03 01:26:35   at 
> org.apache.flink.test.checkpointing.NotifyCheckpointAbortedITCase.testNotifyCheckpointAborted(NotifyCheckpointAbortedITCase.java:189)
> Jul 03 01:26:35   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Jul 03 01:26:35   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 03 01:26:35   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 03 01:26:35   at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 03 01:26:35   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Jul 03 01:26:35   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Jul 03 01:26:35   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Jul 03 01:26:35   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Jul 03 01:26:35   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> Jul 03 01:26:35   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> Jul 03 01:26:35   at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> Jul 03 01:26:35   at java.lang.Thread.run(Thread.java:748)
> {noformat}





[jira] [Commented] (FLINK-32523) NotifyCheckpointAbortedITCase.testNotifyCheckpointAborted fails with timeout on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746696#comment-17746696
 ] 

Sergey Nuyanzin commented on FLINK-32523:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51608&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=8503






[jira] [Commented] (FLINK-31141) CreateTableAsITCase.testCreateTableAs fails

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746694#comment-17746694
 ] 

Sergey Nuyanzin commented on FLINK-31141:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51607&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=0f3adb59-eefa-51c6-2858-3654d9e0749d&l=16482

> CreateTableAsITCase.testCreateTableAs fails
> ---
>
> Key: FLINK-31141
> URL: https://issues.apache.org/jira/browse/FLINK-31141
> Project: Flink
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 1.17.0, 1.18.0
>Reporter: Rui Fan
>Assignee: dalongliu
>Priority: Major
>  Labels: test-stability
>
> CreateTableAsITCase.testCreateTableAs fails in 
> [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46323&view=logs&j=af184cdd-c6d8-5084-0b69-7e9c67b35f7a&t=160c9ae5-96fd-516e-1c91-deb81f59292a&l=14772]
>  
> {code:java}
> Feb 20 13:50:12 [ERROR] Failures: 
> Feb 20 13:50:12 [ERROR] CreateTableAsITCase.testCreateTableAs
> Feb 20 13:50:12 [ERROR]   Run 1: Did not get expected results before timeout, 
> actual result: 
> [{"before":null,"after":{"user_name":"Bob","order_cnt":1},"op":"c"}, 
> {"before":null,"after":{"user_name":"Alice","order_cnt":1},"op":"c"}, 
> {"before":{"user_name":"Bob","order_cnt":1},"after":null,"op":"d"}, 
> {"before":null,"after":{"user_name":"Bob","order_cnt":2},"op":"c"}]. ==> 
> expected:  but was: 
> Feb 20 13:50:12 [INFO]   Run 2: PASS
> Feb 20 13:50:12 [INFO] 
> Feb 20 13:50:12 [INFO] 
> Feb 20 13:50:12 [ERROR] Tests run: 15, Failures: 1, Errors: 0, Skipped: 0 
> {code}





[jira] [Commented] (FLINK-32036) TableEnvironmentTest.test_explain is unstable on azure ci

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746693#comment-17746693
 ] 

Sergey Nuyanzin commented on FLINK-32036:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51592&view=logs&j=3e4dd1a2-fe2f-5e5d-a581-48087e718d53&t=b4612f28-e3b5-5853-8a8b-610ae894217a&l=24506

> TableEnvironmentTest.test_explain is unstable on azure ci
> -
>
> Key: FLINK-32036
> URL: https://issues.apache.org/jira/browse/FLINK-32036
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.1
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> It failed on CI (1.17 branch so far):
> {noformat}
> May 07 01:51:35 === FAILURES 
> ===
> May 07 01:51:35 __ TableEnvironmentTest.test_explain 
> ___
> May 07 01:51:35 
> May 07 01:51:35 self = 
>  testMethod=test_explain>
> May 07 01:51:35 
> May 07 01:51:35 def test_explain(self):
> May 07 01:51:35 schema = RowType() \
> May 07 01:51:35 .add('a', DataTypes.INT()) \
> May 07 01:51:35 .add('b', DataTypes.STRING()) \
> May 07 01:51:35 .add('c', DataTypes.STRING())
> May 07 01:51:35 t_env = self.t_env
> May 07 01:51:35 t = t_env.from_elements([], schema)
> May 07 01:51:35 result = t.select(t.a + 1, t.b, t.c)
> May 07 01:51:35 
> May 07 01:51:35 >   actual = result.explain()
> May 07 01:51:35 
> May 07 01:51:35 pyflink/table/tests/test_table_environment_api.py:66
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=48766&view=logs&j=bf5e383b-9fd3-5f02-ca1c-8f788e2e76d3&t=85189c57-d8a0-5c9c-b61d-fc05cfac62cf&l=25029





[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746691#comment-17746691
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51592&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=105






[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17746690#comment-17746690
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51591&view=logs&j=f73b5736-8355-5390-ec71-4dfdec0ce6c5&t=90f7230e-bf5a-531b-8566-ad48d3e03bbb&l=120

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51394=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102
> fails as
> {noformat}
> 2023-07-19T00:18:36.5467620Z   Failed to build fastavro
> 2023-07-19T00:18:36.5507410Z   ERROR: Could not build wheels for 
> fastavro, which is required to install pyproject.toml-based projects
> 2023-07-19T00:18:36.5540080Z   [end of output]
> 2023-07-19T00:18:36.5568470Z   
> 2023-07-19T00:18:36.5603540Z   note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5633470Z error: subprocess-exited-with-error
> 2023-07-19T00:18:36.5669130Z 
> 2023-07-19T00:18:36.5709780Z × pip subprocess to install build dependencies 
> did not run successfully.
> 2023-07-19T00:18:36.5737700Z │ exit code: 1
> 2023-07-19T00:18:36.5764350Z ╰─> See above for output.
> 2023-07-19T00:18:36.5791010Z 
> 2023-07-19T00:18:36.5819050Z note: This error originates from a subprocess, 
> and is likely not a problem with pip.
> 2023-07-19T00:18:36.5847430Z ##[endgroup]
> 2023-07-19T00:18:36.5884460Z  
> ✕ 56.47s
> 2023-07-19T00:18:36.5921230Z Error: Command ['python', '-m', 'pip', 
> 'wheel', '/Users/runner/work/1/s/flink-python', 
> '--wheel-dir=/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvmgn/T/cibw-run-pzvjusx9/cp37-macosx_x86_64/built_wheel',
>  '--no-deps'] failed with code 1. None
> {noformat}
> Probably this is the reason for the failure:
> {quote}
> is required to install pyproject.toml-based projects
> {quote}
> However, it is not clear why it started to fail only recently.
>  
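
When triaging a log like the one quoted above, it can help to pull out which packages pip reported as failed before reading the rest of the output. The following is a minimal sketch, assuming the log is available as plain text and that pip prints a line of the form "Failed to build <names>" as in the excerpt; the function name `failed_packages` and the sample text are illustrative only and are not part of the Flink build scripts.

```python
import re

# Matches pip's "Failed to build <names>" line; any CI timestamp prefix is ignored.
FAILED_BUILD = re.compile(r"Failed to build (.+)")


def failed_packages(log_text):
    """Return the distinct package names reported as failed, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = FAILED_BUILD.search(line)
        if not match:
            continue
        # pip may list several names on one line, separated by spaces.
        for name in match.group(1).split():
            if name not in seen:
                seen.append(name)
    return seen


# Two lines adapted from the quoted log (the wrapped ERROR line is joined here).
sample = (
    "2023-07-19T00:18:36.5467620Z   Failed to build fastavro\n"
    "2023-07-19T00:18:36.5507410Z   ERROR: Could not build wheels for fastavro, "
    "which is required to install pyproject.toml-based projects\n"
)
print(failed_packages(sample))
```

This only names the failing package; it does not explain why the build started failing, which the issue leaves open.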





[jira] [Commented] (FLINK-29031) FlinkKinesisConsumerTest.testSourceSynchronization failed with AssertionFailedError

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746688#comment-17746688
 ] 

Sergey Nuyanzin commented on FLINK-29031:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51556=logs=fc7981dc-d266-55b0-5fff-f0d0a2294e36=1a9b228a-3e0e-598f-fc81-c321539dfdbf=37890

> FlinkKinesisConsumerTest.testSourceSynchronization failed with 
> AssertionFailedError
> ---
>
> Key: FLINK-29031
> URL: https://issues.apache.org/jira/browse/FLINK-29031
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kinesis
>Affects Versions: 1.16.0
>Reporter: Huang Xingbo
>Priority: Major
>  Labels: test-stability
> Fix For: 1.18.0
>
>
> {code:java}
> 2022-08-18T03:58:00.0197521Z Aug 18 03:58:00 [ERROR] 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumerTest.testSourceSynchronization
>   Time elapsed: 10.191 s  <<< FAILURE!
> 2022-08-18T03:58:00.0198736Z Aug 18 03:58:00 
> org.opentest4j.AssertionFailedError: 
> 2022-08-18T03:58:00.0199434Z Aug 18 03:58:00 [first record received] 
> 2022-08-18T03:58:00.0200022Z Aug 18 03:58:00 expected: 1
> 2022-08-18T03:58:00.0200577Z Aug 18 03:58:00  but was: 0
> 2022-08-18T03:58:00.0201285Z Aug 18 03:58:00  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 2022-08-18T03:58:00.0202337Z Aug 18 03:58:00  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 2022-08-18T03:58:00.0203442Z Aug 18 03:58:00  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 2022-08-18T03:58:00.0205001Z Aug 18 03:58:00  at 
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumerTest.testSourceSynchronization(FlinkKinesisConsumerTest.java:1149)
> 2022-08-18T03:58:00.0206078Z Aug 18 03:58:00  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2022-08-18T03:58:00.0206994Z Aug 18 03:58:00  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2022-08-18T03:58:00.0208019Z Aug 18 03:58:00  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2022-08-18T03:58:00.0208952Z Aug 18 03:58:00  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2022-08-18T03:58:00.0209816Z Aug 18 03:58:00  at 
> org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
> 2022-08-18T03:58:00.0211029Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:326)
> 2022-08-18T03:58:00.0212264Z Aug 18 03:58:00  at 
> org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:89)
> 2022-08-18T03:58:00.0213266Z Aug 18 03:58:00  at 
> org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:97)
> 2022-08-18T03:58:00.0214530Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:310)
> 2022-08-18T03:58:00.0216259Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:131)
> 2022-08-18T03:58:00.0217769Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.access$100(PowerMockJUnit47RunnerDelegateImpl.java:59)
> 2022-08-18T03:58:00.0219348Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner$TestExecutorStatement.evaluate(PowerMockJUnit47RunnerDelegateImpl.java:147)
> 2022-08-18T03:58:00.0220610Z Aug 18 03:58:00  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2022-08-18T03:58:00.0221543Z Aug 18 03:58:00  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2022-08-18T03:58:00.0222807Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.evaluateStatement(PowerMockJUnit47RunnerDelegateImpl.java:107)
> 2022-08-18T03:58:00.0224339Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
> 2022-08-18T03:58:00.0226110Z Aug 18 03:58:00  at 
> org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:298)
> 

[jira] [Updated] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32628:

Affects Version/s: 1.16.3

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.16.3, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746687#comment-17746687
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51556=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=117

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Commented] (FLINK-32662) JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746686#comment-17746686
 ] 

Sergey Nuyanzin commented on FLINK-32662:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51555=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8579

> JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP
> -
>
> Key: FLINK-32662
> URL: https://issues.apache.org/jira/browse/FLINK-32662
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51452=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8654
> fails with NPE as
> {noformat}
> Jul 20 01:01:33 01:01:33.491 [ERROR] 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats
>   Time elapsed: 0.036 s  <<< ERROR!
> Jul 20 01:01:33 java.lang.NullPointerException
> Jul 20 01:01:33   at 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats(JobMasterTest.java:2132)
> Jul 20 01:01:33   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> Jul 20 01:01:33   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Jul 20 01:01:33   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Jul 20 01:01:33   at java.lang.reflect.Method.invoke(Method.java:498)
> Jul 20 01:01:33   at 
> org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
> Jul 20 01:01:33   at 
> org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
> ...
> {noformat}





[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746685#comment-17746685
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51555=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746683#comment-17746683
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51502=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Commented] (FLINK-32662) JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746679#comment-17746679
 ] 

Sergey Nuyanzin commented on FLINK-32662:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51501=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8583

> JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP
> -
>
> Key: FLINK-32662
> URL: https://issues.apache.org/jira/browse/FLINK-32662
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>


[jira] [Commented] (FLINK-31320) Modify DATE_FORMAT system (built-in) function to accepts DATEs

2023-07-24 Thread James Mcguire (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-31320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746681#comment-17746681
 ] 

James Mcguire commented on FLINK-31320:
---

Looks like we have a PR for this one: https://github.com/apache/flink/pull/22353

> Modify DATE_FORMAT system (built-in) function to accepts DATEs
> --
>
> Key: FLINK-31320
> URL: https://issues.apache.org/jira/browse/FLINK-31320
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: James Mcguire
>Priority: Minor
>  Labels: pull-request-available
>
> The current {{DATE_FORMAT}} function only supports {{TIMESTAMP}}s. 
> (See 
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/#temporal-functions)
>  
> Ideally, it should be able to format {{DATE}}s as well as {{TIMESTAMP}}s.
>  
> Example usage:
> {noformat}
> Flink SQL> CREATE TABLE test_table (
> >   some_date DATE,
> >   object AS JSON_OBJECT(
> > KEY 'some_date' VALUE DATE_FORMAT(some_date, '-MM-dd')
> >   )
> > )
> > COMMENT ''
> > WITH (
> >   'connector'='datagen'
> > )
> > ;
> > 
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.calcite.sql.validate.SqlValidatorException: Cannot apply 
> 'DATE_FORMAT' to arguments of type 'DATE_FORMAT(, )'. 
> Supported form(s): 'DATE_FORMAT(, )'
> 'DATE_FORMAT(, )'{noformat}
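
The behaviour requested above can be illustrated outside Flink. The sketch below is not Flink code and says nothing about how the linked PR implements the change. It is a hypothetical Python helper (`date_format` is an assumed name) showing one way a formatter can accept both types: a plain date is promoted to a timestamp at midnight and then formatted through the same path. It uses Python's strftime patterns, which differ from the pattern syntax in the SQL example.

```python
from datetime import date, datetime, time


def date_format(value, pattern):
    """Format a date or a datetime with a strftime-style pattern.

    Hypothetical helper for illustration: a plain date is promoted to a
    timestamp at midnight so both types share one formatting path.
    """
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime):
        timestamp = value
    elif isinstance(value, date):
        timestamp = datetime.combine(value, time.min)
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return timestamp.strftime(pattern)


print(date_format(date(2023, 7, 24), "%Y-%m-%d"))
print(date_format(datetime(2023, 7, 24, 13, 5), "%Y-%m-%d %H:%M"))
```

Promoting the date rather than adding a second formatting routine keeps the two input types consistent for any pattern that only uses date fields.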





[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746680#comment-17746680
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51502=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Updated] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-32628:

Affects Version/s: 1.17.2

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0, 1.17.2
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Commented] (FLINK-32628) build_wheels_on_macos fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746677#comment-17746677
 ] 

Sergey Nuyanzin commented on FLINK-32628:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51501=logs=f73b5736-8355-5390-ec71-4dfdec0ce6c5=90f7230e-bf5a-531b-8566-ad48d3e03bbb=102

> build_wheels_on_macos fails on AZP
> --
>
> Key: FLINK-32628
> URL: https://issues.apache.org/jira/browse/FLINK-32628
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / CI
>Affects Versions: 1.18.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: pull-request-available
>


[jira] [Created] (FLINK-32663) RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-32663:
---

 Summary: 
RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList fails on 
AZP
 Key: FLINK-32663
 URL: https://issues.apache.org/jira/browse/FLINK-32663
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin


This build 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51501=logs=8fd9202e-fd17-5b26-353c-ac1ff76c8f28=ea7cf968-e585-52cb-e0fc-f48de023a7ca=8665
fails as
{noformat}
Jul 21 01:24:54 01:24:54.146 [ERROR] 
RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList  Time 
elapsed: 1.485 s  <<< FAILURE!
Jul 21 01:24:54 java.lang.AssertionError: expected:<530> but was:<30>
Jul 21 01:24:54 at org.junit.Assert.fail(Assert.java:89)
Jul 21 01:24:54 at org.junit.Assert.failNotEquals(Assert.java:835)
Jul 21 01:24:54 at org.junit.Assert.assertEquals(Assert.java:647)
Jul 21 01:24:54 at org.junit.Assert.assertEquals(Assert.java:633)
Jul 21 01:24:54 at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingPartitionedOperatorState(RescalingITCase.java:621)
Jul 21 01:24:54 at 
org.apache.flink.test.checkpointing.RescalingITCase.testSavepointRescalingInPartitionedOperatorStateList(RescalingITCase.java:508)
Jul 21 01:24:54 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jul 21 01:24:54 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 21 01:24:54 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:4
...
{noformat}





[jira] [Created] (FLINK-32662) JobMasterTest.testRetrievingCheckpointStats fails with NPE on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-32662:
---

 Summary: JobMasterTest.testRetrievingCheckpointStats fails with 
NPE on AZP
 Key: FLINK-32662
 URL: https://issues.apache.org/jira/browse/FLINK-32662
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin


This build 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51452=logs=0e7be18f-84f2-53f0-a32d-4a5e4a174679=7c1d86e3-35bd-5fd5-3b7c-30c126a78702=8654
fails with NPE as
{noformat}
Jul 20 01:01:33 01:01:33.491 [ERROR] 
org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats  
Time elapsed: 0.036 s  <<< ERROR!
Jul 20 01:01:33 java.lang.NullPointerException
Jul 20 01:01:33 at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testRetrievingCheckpointStats(JobMasterTest.java:2132)
Jul 20 01:01:33 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jul 20 01:01:33 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 20 01:01:33 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jul 20 01:01:33 at java.lang.reflect.Method.invoke(Method.java:498)
Jul 20 01:01:33 at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
Jul 20 01:01:33 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
...
{noformat}





[jira] [Commented] (FLINK-32564) Support cast from BYTES to NUMBER

2023-07-24 Thread Hanyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746675#comment-17746675
 ] 

Hanyu Zheng commented on FLINK-32564:
-

[~twalthr] ok

> Support cast from BYTES to NUMBER
> -
>
> Key: FLINK-32564
> URL: https://issues.apache.org/jira/browse/FLINK-32564
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hanyu Zheng
>Assignee: Hanyu Zheng
>Priority: Major
>  Labels: pull-request-available
>
> We are dealing with a task that requires casting from the BYTES type to 
> BIGINT. Specifically, we have a string '00T1p'. Our approach is to convert 
> this string to BYTES and then cast the result to BIGINT with the following 
> SQL query:
> {code:java}
> SELECT CAST((CAST('00T1p' as BYTES)) as BIGINT);{code}
> However, an issue arises when executing this query, likely due to an error in 
> the conversion between BYTES and BIGINT. We aim to identify and rectify this 
> issue so our query can run correctly. The tasks involved are:
>  # Investigate and identify the specific reason for the failure of conversion 
> from BYTES to BIGINT.
>  # Design and implement a solution to ensure our query can function correctly.
>  # Test this solution across all required scenarios to guarantee its 
> functionality.
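
As a point of reference for the first task above, here is a minimal Java sketch of one possible interpretation of a BYTES-to-BIGINT cast: reading the bytes as an unsigned big-endian integer. This is only an assumption for illustration, not Flink's actual or planned cast semantics, and the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flink: shows one possible
// BYTES -> BIGINT interpretation (unsigned, big-endian).
class BytesToLongSketch {

    // Interprets the bytes as an unsigned big-endian integer.
    // Throws ArithmeticException if the value does not fit into a long.
    static long bytesToLong(byte[] bytes) {
        return new BigInteger(1, bytes).longValueExact();
    }

    public static void main(String[] args) {
        // The string from the issue description, encoded as UTF-8 bytes.
        byte[] bytes = "00T1p".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytesToLong(bytes));
    }
}
```

Other interpretations (little-endian, signed, or rejecting inputs that are not exactly 8 bytes long) are equally plausible, which is part of what the investigation would need to settle.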





[jira] [Comment Edited] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Yufan Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746672#comment-17746672
 ] 

Yufan Sheng edited comment on FLINK-32645 at 7/24/23 10:56 PM:
---

Hi [~vbhasvij], can you add flink-connector-base to your pom and shade it 
into the uber-jar to see if everything is ok?

It seems that your production Flink environment is based on 1.16. Flink 
introduced a new stop policy in 1.17, so a new {{Configuration}} argument 
was added to the constructor of {{SplitFetcherManager}}. The 
NoSuchMethodError suggests that you may not be running Flink 1.17.
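
One way to confirm which variant of the class is on the runtime classpath is a reflective check, sketched below. The helper class is hypothetical; only the class and parameter type names are taken from the stack trace in this issue. The sketch does not need Flink to compile and simply reports false when the class or constructor is missing.

```java
// Hypothetical diagnostic helper: checks whether a class visible to this
// class loader declares a constructor with the given parameter types.
class ConstructorCheckSketch {

    static boolean hasConstructor(String className, String... parameterTypeNames) {
        ClassLoader loader = ConstructorCheckSketch.class.getClassLoader();
        try {
            // Load without initializing, so no static initializers run.
            Class<?> target = Class.forName(className, false, loader);
            Class<?>[] parameterTypes = new Class<?>[parameterTypeNames.length];
            for (int i = 0; i < parameterTypeNames.length; i++) {
                parameterTypes[i] = Class.forName(parameterTypeNames[i], false, loader);
            }
            target.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class, parameter type, or constructor is not available.
            return false;
        }
    }

    public static void main(String[] args) {
        // Names copied from the NoSuchMethodError reported in this issue.
        boolean present = hasConstructor(
                "org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager",
                "org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue",
                "java.util.function.Supplier",
                "org.apache.flink.configuration.Configuration");
        System.out.println("3-argument SplitFetcherManager constructor present: " + present);
    }
}
```

Running this inside the job's uber-jar classpath, rather than in a local test classpath, is what makes the result meaningful.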


was (Author: syhily):
Hi [~vbhasvij], can you add the flink-connector-base to your pom and shade it 
into the uber-jar to see if everything is ok?

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png, pom.xml
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> showing this is attached.
> We are sending messages at a low rate of 8K msg/sec, and a standalone Flink job 
> with a discarding sink is able to receive the full rate of 8K msg/sec,
> whereas the Pulsar sink was consuming only up to 2K msg/sec and the sink is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[jira] [Created] (FLINK-32661) OperationRelatedITCase.testOperationRelatedApis fails on AZP

2023-07-24 Thread Sergey Nuyanzin (Jira)
Sergey Nuyanzin created FLINK-32661:
---

 Summary: OperationRelatedITCase.testOperationRelatedApis fails on 
AZP
 Key: FLINK-32661
 URL: https://issues.apache.org/jira/browse/FLINK-32661
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Gateway
Affects Versions: 1.18.0
Reporter: Sergey Nuyanzin


This build 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51452=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=12114
fails as 
{noformat}
Jul 20 04:23:49 org.opentest4j.AssertionFailedError: 
Jul 20 04:23:49 
Jul 20 04:23:49 Expecting actual's toString() to return:
Jul 20 04:23:49   "PENDING"
Jul 20 04:23:49 but was:
Jul 20 04:23:49   "RUNNING"
Jul 20 04:23:49 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Jul 20 04:23:49 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
Jul 20 04:23:49 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Jul 20 04:23:49 at 
org.apache.flink.table.gateway.rest.OperationRelatedITCase.testOperationRelatedApis(OperationRelatedITCase.java:91)
Jul 20 04:23:49 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
Jul 20 04:23:49 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
Jul 20 04:23:49 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Jul 20 04:23:49 at java.lang.reflect.Method.invoke(Method.java:498)
Jul 20 04:23:49 at 
org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
Jul 20 04:23:49 at 
org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:21
...
{noformat}





[jira] [Commented] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Yufan Sheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746672#comment-17746672
 ] 

Yufan Sheng commented on FLINK-32645:
-

Hi [~vbhasvij], can you add the flink-connector-base to your pom and shade it 
into the uber-jar to see if everything is ok?

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png, pom.xml
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> showing this is attached.
> We are sending messages at a low rate of 8K msg/sec, and a standalone Flink job 
> with a discarding sink is able to receive the full rate of 8K msg/sec,
> whereas the Pulsar sink was consuming only up to 2K msg/sec and the sink is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[jira] [Updated] (FLINK-29634) Support periodic checkpoint triggering

2023-07-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-29634:
---
Labels: pull-request-available  (was: )

> Support periodic checkpoint triggering
> --
>
> Key: FLINK-29634
> URL: https://issues.apache.org/jira/browse/FLINK-29634
> Project: Flink
>  Issue Type: New Feature
>  Components: Kubernetes Operator
>Reporter: Thomas Weise
>Assignee: Alexander Fedulov
>Priority: Major
>  Labels: pull-request-available
>
> Similar to the support for periodic savepoints, the operator should support 
> triggering periodic checkpoints to break the incremental checkpoint chain.
> Support for external triggering will come with 1.17: 
> https://issues.apache.org/jira/browse/FLINK-27101 





[GitHub] [flink-kubernetes-operator] afedulov opened a new pull request, #637: [FLINK-29634] Support periodic checkpoint triggering

2023-07-24 Thread via GitHub


afedulov opened a new pull request, #637:
URL: https://github.com/apache/flink-kubernetes-operator/pull/637

   ## What is the purpose of the change
   
   Similar to the support for periodic savepoints, adds support for the 
operator to trigger periodic checkpoints (for instance, to break the incremental 
checkpoint chain, as described in 
[FLINK-27101](https://issues.apache.org/jira/browse/FLINK-27101)).
   
   ## Brief change log
   
 - Adds checkpoint tracking information (`CheckpointInfo`) to the CRD
 - Expands utilities previously used for savepoints to checkpoints handling 
(`SnapshotUtils`)
 - Adds reconciliation logic that checks whether the checkpoint needs to be 
triggered (either manual or periodic)
 - Migrates `SavepointsObserver` to `SnapshotsObserver` and adds respective 
checkpoints observation logic
 - Adds checkpoints fetching functionality to FlinkService
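
The triggering decision in the third item can be pictured with a small Java sketch. It is an illustration under assumptions only: the class, method, and parameter names are hypothetical and are not taken from this pull request or from the operator code base.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of a "does a checkpoint need to be triggered?" check.
class CheckpointTriggerSketch {

    // Returns true if a checkpoint should be triggered now, either because
    // a manual trigger was requested or because the periodic interval elapsed.
    static boolean shouldTrigger(
            boolean manualRequested, Instant lastTrigger, Duration interval, Instant now) {
        if (manualRequested) {
            return true;
        }
        if (interval == null || interval.isZero() || interval.isNegative()) {
            return false; // periodic triggering is treated as disabled
        }
        if (lastTrigger == null) {
            return true; // nothing was triggered yet
        }
        return !now.isBefore(lastTrigger.plus(interval));
    }

    public static void main(String[] args) {
        Instant last = Instant.parse("2023-07-24T00:00:00Z");
        Instant now = last.plus(Duration.ofMinutes(15));
        System.out.println(shouldTrigger(false, last, Duration.ofMinutes(10), now));
    }
}
```

Keeping the check a pure function of its inputs makes it easy to cover in reconciliation tests without a running cluster.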
   
   ## Verifying this change
   This change added tests and can be verified as follows:
   - Consolidates `SavepointObserverTest` into `SnapshotObserverTest` and adds a 
test that checkpoints are observed correctly
   - Adds checkpoints observation tests for session and application modes
   - Adds checkpoints triggering tests to reconciliation tests
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., are there any changes to the `CustomResourceDescriptors`: 
(**yes** / no)
 - Core observer or reconciler logic that is regularly executed: (**yes** / 
no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (**yes** / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ **not yet documented**)
   





[GitHub] [flink] hanyuzheng7 closed pull request #23068: array_except

2023-07-24 Thread via GitHub


hanyuzheng7 closed pull request #23068: array_except
URL: https://github.com/apache/flink/pull/23068





[GitHub] [flink] hanyuzheng7 opened a new pull request, #23068: array_except

2023-07-24 Thread via GitHub


hanyuzheng7 opened a new pull request, #23068:
URL: https://github.com/apache/flink/pull/23068

   array_except
   
   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follow the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[jira] [Commented] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Vijaya Bhaskar V (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746603#comment-17746603
 ] 

Vijaya Bhaskar V commented on FLINK-32645:
--

The 1.17.0 connector is crashing repeatedly with Pulsar versions 2.10.2 and 2.11. 
I am getting the call stack above. 

All my local integration tests with the Pulsar 2.10.2 Docker image are succeeding. 
My pom file is attached: [^pom.xml]

 

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png, pom.xml
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> showing this is attached.
> We are sending messages at a low rate of 8K msg/sec, and a standalone Flink job 
> with a discarding sink is able to receive the full rate of 8K msg/sec,
> whereas the Pulsar sink was consuming only up to 2K msg/sec and the sink is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[jira] [Updated] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Vijaya Bhaskar V (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijaya Bhaskar V updated FLINK-32645:
-
Attachment: pom.xml

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png, pom.xml
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> showing this is attached.
> We are sending messages at a low rate of 8K msg/sec, and a standalone Flink job 
> with a discarding sink is able to receive the full rate of 8K msg/sec,
> whereas the Pulsar sink was consuming only up to 2K msg/sec and the sink is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[GitHub] [flink] atris commented on pull request #20187: [FLINK-27473] Introduce Generic Metrics For Job States

2023-07-24 Thread via GitHub


atris commented on PR #20187:
URL: https://github.com/apache/flink/pull/20187#issuecomment-1648437676

   @zentol Somehow this PR fell off my radar. I have updated it now with your 
commit -- can we please proceed?





[jira] [Comment Edited] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Vijaya Bhaskar V (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746499#comment-17746499
 ] 

Vijaya Bhaskar V edited comment on FLINK-32645 at 7/24/23 6:36 PM:
---

We are facing upgrade issues with this connector after upgrading to Flink 
1.17.0. Any help?

 switched from INITIALIZING to FAILED with failure cause: 
java.lang.NoSuchMethodError: 'void 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.<init>(org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue,
 java.util.function.Supplier, org.apache.flink.configuration.Configuration)'
        at 
org.apache.flink.connector.pulsar.source.reader.PulsarSourceFetcherManager.<init>(PulsarSourceFetcherManager.java:71)
        at 
org.apache.flink.connector.pulsar.source.reader.PulsarSourceReader.create(PulsarSourceReader.java:298)
        at 
org.apache.flink.connector.pulsar.source.PulsarSource.createReader(PulsarSource.java:137)

The line numbers PulsarSource.java:137 and PulsarSourceReader.java:298 
do not exist in the sources I used when running my integration tests locally. 
Does this connector version really work?


was (Author: vbhasvij):
We are facing upgrade issues with this connector after upgrading to flink 
1.17.0, Any help?

 switched from INITIALIZING to FAILED with failure cause: 
java.lang.NoSuchMethodError: 'void 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.<init>(org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue,
 java.util.function.Supplier, org.apache.flink.configuration.Configuration)'
        at 
org.apache.flink.connector.pulsar.source.reader.PulsarSourceFetcherManager.<init>(PulsarSourceFetcherManager.java:71)
        at 
org.apache.flink.connector.pulsar.source.reader.PulsarSourceReader.create(PulsarSourceReader.java:298)
        at 
org.apache.flink.connector.pulsar.source.PulsarSource.createReader(PulsarSource.java:137)

 

 

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> showing this is attached.
> We are sending messages at a low rate of 8K msg/sec, and a standalone Flink job 
> with a discarding sink is able to receive the full rate of 8K msg/sec,
> whereas the Pulsar sink was consuming only up to 2K msg/sec and the sink is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[jira] [Commented] (FLINK-32565) Support cast from NUMBER to BYTES

2023-07-24 Thread Hanyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746569#comment-17746569
 ] 

Hanyu Zheng commented on FLINK-32565:
-

[~twalthr] Ok

> Support cast from NUMBER to BYTES
> -
>
> Key: FLINK-32565
> URL: https://issues.apache.org/jira/browse/FLINK-32565
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hanyu Zheng
>Assignee: Hanyu Zheng
>Priority: Major
>  Labels: pull-request-available
>
> We are undertaking a task that requires casting from the DOUBLE type to BYTES. 
> In particular, we have an INTEGER 1234. Our current approach is to convert 
> this INTEGER to BYTES using the following SQL query:
> {code:java}
> SELECT CAST(1234 as BYTES);{code}
> However, we encounter an issue when executing this query, potentially due to 
> an error in the conversion between INTEGER and BYTES. Our goal is to identify 
> and correct this issue so that our query can execute successfully. The tasks 
> involved are:
>  # Investigate and pinpoint the specific reason for the conversion failure 
> from INTEGER to BYTES.
>  # Design and implement a solution that enables our query to function 
> correctly.
>  # Test this solution across all required scenarios to ensure its robustness.
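
For orientation, here is a minimal Java sketch of one possible INTEGER-to-BYTES encoding: a fixed-width, big-endian, two's-complement representation. This is an assumption for illustration only and does not describe how Flink implements or will implement the cast; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of Flink: encodes an int as 4 big-endian bytes.
class IntToBytesSketch {

    // ByteBuffer uses big-endian byte order by default.
    static byte[] intToBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    public static void main(String[] args) {
        // 1234 is 0x000004D2; Java prints bytes as signed values.
        System.out.println(Arrays.toString(intToBytes(1234))); // prints [0, 0, 4, -46]
    }
}
```

A variable-width or string-based encoding would give different bytes for the same input, so the chosen representation needs to be stated explicitly in the solution.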





[GitHub] [flink] patricklucas commented on a diff in pull request #22987: [FLINK-32583][rest] Fix deadlock in RestClient

2023-07-24 Thread via GitHub


patricklucas commented on code in PR #22987:
URL: https://github.com/apache/flink/pull/22987#discussion_r1272567509


##
flink-runtime/src/test/java/org/apache/flink/runtime/rest/RestClientTest.java:
##
@@ -207,6 +210,40 @@ public void testRestClientClosedHandling() throws 
Exception {
 }
 }
 
+/**
+ * Tests that the futures returned by {@link RestClient} fail immediately 
if the client is
+ * already closed.
+ *
+ * See FLINK-32583
+ */
+@Test
+public void testCloseClientBeforeRequest() throws Exception {

Review Comment:
   Cool, good find, I'll try it out.
   
   Regarding the exception, I've started to feel that maybe `IOException` 
actually _isn't_ the best choice, since no true IO (e.g. via syscalls) is 
happening at the time of the failure. Even if we take an analogous example of 
trying to write to a file that's closed, the error would often come from the 
failure of the syscall to write.
   
   In this situation, the client is in fact in an illegal state (closed) for 
making requests.
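
To make the idea concrete, here is a minimal Java sketch of a client that fails requests immediately with `IllegalStateException` once it is closed. It is a stand-in written for this comment under assumptions: the class and method names are hypothetical, and it is not the actual `RestClient` code or the change in this PR.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a closable client; not Flink's RestClient.
class ClosableClientSketch implements AutoCloseable {

    private final AtomicBoolean closed = new AtomicBoolean(false);

    // Returns an already-failed future if the client is closed, so callers
    // never wait on a request that cannot be sent.
    CompletableFuture<String> sendRequest(String payload) {
        if (closed.get()) {
            CompletableFuture<String> failed = new CompletableFuture<>();
            failed.completeExceptionally(
                    new IllegalStateException("Client is already closed."));
            return failed;
        }
        // Placeholder for real I/O: complete immediately with a dummy response.
        return CompletableFuture.completedFuture("response to " + payload);
    }

    @Override
    public void close() {
        closed.set(true);
    }

    public static void main(String[] args) {
        ClosableClientSketch client = new ClosableClientSketch();
        client.close();
        System.out.println(client.sendRequest("ping").isCompletedExceptionally());
    }
}
```

The sketch only checks the flag at call time; a real client would also have to handle a close that races with an in-flight request.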






[jira] [Closed] (FLINK-32630) The log level of job failed info should change from INFO to WARN/ERROR if job failed

2023-07-24 Thread Chesnay Schepler (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chesnay Schepler closed FLINK-32630.

Resolution: Duplicate

> The log level of job failed info should change from INFO to WARN/ERROR if job 
> failed
> 
>
> Key: FLINK-32630
> URL: https://issues.apache.org/jira/browse/FLINK-32630
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Runtime / Coordination
>Affects Versions: 1.17.1
>Reporter: Matt Wang
>Assignee: Matt Wang
>Priority: Minor
>  Labels: pull-request-available
>
> When a job fails to submit or run, the following messages should be logged at 
> WARN or ERROR level; INFO will confuse users.
> {code:java}
> 2023-07-14 20:05:26,863 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job 
> flink_test_job (08eefd50) switched from state FAILING 
> to FAILED.
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=240,backoffTimeMS=2,maxFailuresPerInterval=100)
>  
> 2023-07-14 20:05:26,889 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 
> 08eefd50 reached terminal state FAILED.
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=240,backoffTimeMS=2,maxFailuresPerInterval=100)
> 2023-07-14 20:05:26,956 INFO  
> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap 
> [] - Application FAILED: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.client.deployment.application.ApplicationExecutionException: 
> Could not execute application.{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[GitHub] [flink] zentol closed pull request #23066: [FLINK-32630][runtime] The log level of job failed info should change…

2023-07-24 Thread via GitHub


zentol closed pull request #23066: [FLINK-32630][runtime] The log level of job 
failed info should change…
URL: https://github.com/apache/flink/pull/23066





[GitHub] [flink] zentol commented on pull request #23066: [FLINK-32630][runtime] The log level of job failed info should change…

2023-07-24 Thread via GitHub


zentol commented on PR #23066:
URL: https://github.com/apache/flink/pull/23066#issuecomment-1648265081

   Duplicate of #5399.





[GitHub] [flink] zentol commented on pull request #22996: [FLINK-32468][rpc] Switch from Akka to Pekko

2023-07-24 Thread via GitHub


zentol commented on PR #22996:
URL: https://github.com/apache/flink/pull/22996#issuecomment-1648261423

   > Would we break anything if we also rename pom modules?
   
   I think it'd be annoying for devs, because that implies having to rebuild 
the rpc system whenever you switch between the pre and post pekko state of the 
project.





[GitHub] [flink] XComp commented on pull request #22996: [FLINK-32468][rpc] Switch from Akka to Pekko

2023-07-24 Thread via GitHub


XComp commented on PR #22996:
URL: https://github.com/apache/flink/pull/22996#issuecomment-1648226413

   The changes look good now. I will bring this PR up in tomorrow's 1.18 release 
sync. Would we break anything if we also rename pom modules? ...now that we've 
renamed the packages and classes as well.





[GitHub] [flink] flinkbot commented on pull request #23067: [FLINK-32565][table] Support Cast From NUMBER to BYTES

2023-07-24 Thread via GitHub


flinkbot commented on PR #23067:
URL: https://github.com/apache/flink/pull/23067#issuecomment-1648218581

   
   ## CI report:
   
   * 00802457d0cf030f6267df8f8a144702aa4c1f36 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-32565) Support cast from NUMBER to BYTES

2023-07-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32565:
---
Labels: pull-request-available  (was: )

> Support cast from NUMBER to BYTES
> -
>
> Key: FLINK-32565
> URL: https://issues.apache.org/jira/browse/FLINK-32565
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Hanyu Zheng
>Assignee: Hanyu Zheng
>Priority: Major
>  Labels: pull-request-available
>
> We are undertaking a task that requires casting from the DOUBLE type to BYTES. 
> In particular, we have an INTEGER 1234. Our current approach is to convert 
> this INTEGER to BYTES using the following SQL query:
> {code:java}
> SELECT CAST(1234 as BYTES);{code}
> However, we encounter an issue when executing this query, potentially due to 
> an error in the conversion between INTEGER and BYTES. Our goal is to identify 
> and correct this issue so that our query can execute successfully. The tasks 
> involved are:
>  # Investigate and pinpoint the specific reason for the conversion failure 
> from INTEGER to BYTES.
>  # Design and implement a solution that enables our query to function 
> correctly.
>  # Test this solution across all required scenarios to ensure its robustness.





[GitHub] [flink] hanyuzheng7 opened a new pull request, #23067: [FLINK-32565][table] Support Cast From NUMBER to BYTES

2023-07-24 Thread via GitHub


hanyuzheng7 opened a new pull request, #23067:
URL: https://github.com/apache/flink/pull/23067

   ### What is the purpose of the change
   Implement `CAST` from `NUMBER` to `BINARY`/`VARBINARY`/`BYTES`
   
   ### Brief change log
   `CAST` from `NUMBER` to `BINARY`/`VARBINARY`/`BYTES` for Table API and SQL
   
   Syntax:
   `CAST(NUMBER AS BYTES)`
   
   Examples:
   ```
   Flink SQL> select cast(1 as BYTES);
   res: x'31'
   ```
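The result `x'31'` in the example above is the UTF-8 code of the character `1`, which suggests the cast goes through the number's decimal string form. That reading is an inference from this single example, not something the PR text confirms. Under that assumption, a minimal Java sketch (the class and method names are hypothetical, not Flink code) would be:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Flink code: mimics the behaviour suggested by
// the example above, where CAST(1 AS BYTES) yields x'31'.
final class NumberToBytesSketch {

    // Converts the decimal string form of the number to its UTF-8 bytes.
    static byte[] castToBytes(long value) {
        return Long.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Renders bytes in the x'..' notation used in the example output.
    static String toHexLiteral(byte[] bytes) {
        StringBuilder sb = new StringBuilder("x'");
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.append("'").toString();
    }

    public static void main(String[] args) {
        System.out.println(toHexLiteral(castToBytes(1)));    // x'31'
        System.out.println(toHexLiteral(castToBytes(1234))); // x'31323334'
    }
}
```

If the actual implementation encodes the number's binary representation instead, the output for `1` would differ from `x'31'`, so the example in the PR description is the only basis for this sketch.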
   
   ### Verifying this change
   
   - This change added tests in CastFunctionITCase.
   
   ### Does this pull request potentially affect one of the following parts:
   
   - Dependencies (does it add or upgrade a dependency): (no)
   - The public API, i.e., is any changed class annotated with 
@Public(Evolving): (yes)
   - The serializers: (no)
   - The runtime per-record code paths (performance sensitive): (no)
   - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
   - The S3 file system connector: (no)
   
   ### Documentation
   
   - Does this pull request introduce a new feature? (yes)
   - If yes, how is the feature documented? (docs)
   
   





[jira] [Comment Edited] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Vijaya Bhaskar V (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746499#comment-17746499
 ] 

Vijaya Bhaskar V edited comment on FLINK-32645 at 7/24/23 4:03 PM:
---

We are facing upgrade issues with this connector after upgrading to Flink 
1.17.0. Any help?

 switched from INITIALIZING to FAILED with failure cause:
java.lang.NoSuchMethodError: 'void org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.<init>(org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue, java.util.function.Supplier, org.apache.flink.configuration.Configuration)'
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceFetcherManager.<init>(PulsarSourceFetcherManager.java:71)
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceReader.create(PulsarSourceReader.java:298)
        at org.apache.flink.connector.pulsar.source.PulsarSource.createReader(PulsarSource.java:137)

 

 


was (Author: vbhasvij):
We are facing upgrade issues with this connector: we upgraded to Flink 1.17.0 
and still face the same issues. Any help?

 switched from INITIALIZING to FAILED with failure cause:
java.lang.NoSuchMethodError: 'void org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.<init>(org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue, java.util.function.Supplier, org.apache.flink.configuration.Configuration)'
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceFetcherManager.<init>(PulsarSourceFetcherManager.java:71)
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceReader.create(PulsarSourceReader.java:298)
        at org.apache.flink.connector.pulsar.source.PulsarSource.createReader(PulsarSource.java:137)

 

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png
>
>
> Found the following issue with the Flink Pulsar sink:
>  
> The Flink Pulsar sink is always waiting while enqueueing messages, which keeps 
> the task slot busy no matter how many free slots we provide. A screenshot 
> is attached.
> We send messages at a low rate of 8K msg/sec, and a standalone Flink job with 
> a discarding sink is able to receive the full rate of 8K msg/sec, 
> whereas the Pulsar sink consumed only up to 2K msg/sec and is 
> always busy waiting. A snapshot of the thread dump is attached.
> A snapshot of the Flink stream graph is also attached.
>  
>  
>  





[GitHub] [flink] flinkbot commented on pull request #23066: [FLINK-32630][runtime] The log level of job failed info should change…

2023-07-24 Thread via GitHub


flinkbot commented on PR #23066:
URL: https://github.com/apache/flink/pull/23066#issuecomment-1648197426

   
   ## CI report:
   
   * c5287a3723b56e3f5bc330e7f7c0c38f83542283 UNKNOWN
   
   
   





[jira] [Updated] (FLINK-32630) The log level of job failed info should change from INFO to WARN/ERROR if job failed

2023-07-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32630:
---
Labels: pull-request-available  (was: )

> The log level of job failed info should change from INFO to WARN/ERROR if job 
> failed
> 
>
> Key: FLINK-32630
> URL: https://issues.apache.org/jira/browse/FLINK-32630
> Project: Flink
>  Issue Type: Improvement
>  Components: Client / Job Submission, Runtime / Coordination
>Affects Versions: 1.17.1
>Reporter: Matt Wang
>Assignee: Matt Wang
>Priority: Minor
>  Labels: pull-request-available
>
> When a job fails to submit or run, the level of the following log messages 
> should be changed to WARN or ERROR; INFO will confuse users.
> {code:java}
> 2023-07-14 20:05:26,863 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Job 
> flink_test_job (08eefd50) switched from state FAILING 
> to FAILED.
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=240,backoffTimeMS=2,maxFailuresPerInterval=100)
>  
> 2023-07-14 20:05:26,889 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 
> 08eefd50 reached terminal state FAILED.
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> FailureRateRestartBackoffTimeStrategy(FailureRateRestartBackoffTimeStrategy(failuresIntervalMS=240,backoffTimeMS=2,maxFailuresPerInterval=100)
> 2023-07-14 20:05:26,956 INFO  
> org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap 
> [] - Application FAILED: 
> java.util.concurrent.CompletionException: 
> org.apache.flink.client.deployment.application.ApplicationExecutionException: 
> Could not execute application.{code}





[GitHub] [flink] wangzzu opened a new pull request, #23066: [FLINK-32630][runtime] The log level of job failed info should change…

2023-07-24 Thread via GitHub


wangzzu opened a new pull request, #23066:
URL: https://github.com/apache/flink/pull/23066

   … from INFO to WARN/ERROR if job failed
   
   
   
   ## What is the purpose of the change
   
   If job submission fails, the exception stack trace should be logged at the 
ERROR level; logging it at INFO will confuse users.
   
   ## Brief change log
   
   - if the application state is FAILED/CANCELED and the exception is not NULL, the 
log level should be ERROR in `ApplicationDispatcherBootstrap`;
   - if `archivedExecutionGraph.getFailureInfo() != null && 
isFailureInfoRelatedToJobTermination` is true, there is an exception to report, 
so the log level should be ERROR in `Dispatcher`;
   - when transitioning the status in `ExecutionGraph`, if the `Throwable` is not 
NULL, the log level should be ERROR;
   - when transitioning the status in `Execution`, if the newState is `FAILED` or 
the `Throwable` is not NULL, the log level should be ERROR;
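
The last rule in the change log above can be sketched as follows. This is a minimal illustration under assumed names, not the PR's code: `TransitionLogLevelSketch`, its `Level` enum, and the use of plain strings for states are all hypothetical stand-ins.

```java
// Hypothetical sketch, not the PR's code: picks a log level for a state
// transition following the rule in the change log above.
final class TransitionLogLevelSketch {

    enum Level { INFO, ERROR }

    // ERROR when the new state is FAILED or a failure cause is attached;
    // INFO otherwise.
    static Level levelFor(String newState, Throwable cause) {
        if ("FAILED".equals(newState) || cause != null) {
            return Level.ERROR;
        }
        return Level.INFO;
    }

    public static void main(String[] args) {
        System.out.println(levelFor("RUNNING", null));
        System.out.println(levelFor("FAILED", null));
        System.out.println(levelFor("CANCELING", new RuntimeException("boom")));
    }
}
```

Keeping the decision in one small function would make the rule easy to unit-test separately from the logging call itself.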
   
   ## Verifying this change
   
   - Has been tested and verified on a normal job and a failed job
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   





[GitHub] [flink] gaborgsomogyi commented on pull request #21788: [FLINK-30812][yarn] Fix uploading local files when using YARN with S3

2023-07-24 Thread via GitHub


gaborgsomogyi commented on PR #21788:
URL: https://github.com/apache/flink/pull/21788#issuecomment-1648160382

   I'm on vacation the whole week without a computer. I can start on this 
next week...





[GitHub] [flink] zentol commented on pull request #22996: [FLINK-32468][rpc] Switch from Akka to Pekko

2023-07-24 Thread via GitHub


zentol commented on PR #22996:
URL: https://github.com/apache/flink/pull/22996#issuecomment-1648154969

   > Have we thought of providing a separate module flink-rpc-pekko and 
providing that one along flink-rpc-akka in 1.18 with pekko being the default 
one? 
   
   Yes, but that implies duplicating all classes and increasing the dist size 
by ~30mb.
   
   > Or is this too much of an effort and we're comfortable enough that Pekko 
works as is?
   
   I wouldn't say I'm comfortable, but I think it's better to potentially break 
things now than doing this in a bugfix release or running with an unsupported 
Akka version.
   
   > Shall we add a deprecation flag here and plan to rename it with 2.0?
   
   That's a separate discussion I believe because it's an API breaking change.





[GitHub] [flink] JingGe merged pull request #23033: [FLINK-32634][table-api] Deprecate StreamRecordTimestamp and ExistingField

2023-07-24 Thread via GitHub


JingGe merged PR #23033:
URL: https://github.com/apache/flink/pull/23033





[jira] [Commented] (FLINK-32645) Flink pulsar sink is having poor performance

2023-07-24 Thread Vijaya Bhaskar V (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746499#comment-17746499
 ] 

Vijaya Bhaskar V commented on FLINK-32645:
--

We are facing upgrade issues with this connector: we upgraded to Flink 1.17.0 
and still face the same issues. Any help?

 switched from INITIALIZING to FAILED with failure cause:
java.lang.NoSuchMethodError: 'void org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.<init>(org.apache.flink.connector.base.source.reader.synchronization.FutureCompletingBlockingQueue, java.util.function.Supplier, org.apache.flink.configuration.Configuration)'
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceFetcherManager.<init>(PulsarSourceFetcherManager.java:71)
        at org.apache.flink.connector.pulsar.source.reader.PulsarSourceReader.create(PulsarSourceReader.java:298)
        at org.apache.flink.connector.pulsar.source.PulsarSource.createReader(PulsarSource.java:137)
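
A `NoSuchMethodError` on a constructor generally means the calling class was compiled against a different version of the target class than the one found on the runtime classpath. One way to check which constructors the loaded class actually offers is reflection. The helper below is a hypothetical diagnostic sketch, not part of Flink or the Pulsar connector; JDK classes stand in for the Flink classes named in the stack trace.

```java
import java.util.ArrayList;

// Hypothetical diagnostic helper, not part of Flink or the Pulsar connector.
final class ConstructorCheckSketch {

    // Returns true if the class declares a constructor with exactly these
    // parameter types (any visibility).
    static boolean hasConstructor(Class<?> type, Class<?>... parameterTypes) {
        try {
            type.getDeclaredConstructor(parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // ArrayList has an (int) constructor but no (String) constructor.
        System.out.println(hasConstructor(ArrayList.class, int.class));
        System.out.println(hasConstructor(ArrayList.class, String.class));
    }
}
```

Running such a check inside the failing job's classpath, with the class and parameter types from the error message, would show whether the expected constructor is present in the deployed version.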

 

> Flink pulsar sink is having poor performance
> 
>
> Key: FLINK-32645
> URL: https://issues.apache.org/jira/browse/FLINK-32645
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Pulsar
>Affects Versions: 1.16.2
> Environment: !Screenshot 2023-07-22 at 1.59.42 PM.png!!Screenshot 
> 2023-07-22 at 2.03.53 PM.png!
>  
>Reporter: Vijaya Bhaskar V
>Priority: Major
> Attachments: Screenshot 2023-07-22 at 2.03.53 PM.png, Screenshot 
> 2023-07-22 at 2.56.55 PM.png, Screenshot 2023-07-22 at 3.45.21 PM-1.png, 
> Screenshot 2023-07-22 at 3.45.21 PM.png
>
>
> Found following issue with flink pulsar sink:
>  
> Flink pulsar sink is always waiting while enqueueing the message and making 
> the task slot busy no matter how many free slots we provide. Attached the 
> screen shot of the same
> Just sending messages of less rate 8k msg/sec and stand alone flink job with 
> discarding sink is able to receive full rate if 8K msg/sec
> Where as pulsar sink was consuming only upto 2K msg/sec and the sink is 
> always busy waiting. Snapshot of thread dump attached.
> Also snap shot of flink stream graph attached
>  
>  
>  





[GitHub] [flink] flinkbot commented on pull request #23065: [FLINK-32656][table] Deprecate ManagedTable related APIs

2023-07-24 Thread via GitHub


flinkbot commented on PR #23065:
URL: https://github.com/apache/flink/pull/23065#issuecomment-1648088942

   
   ## CI report:
   
   * 73370f6a01204817092d1489b39edb051a7609e2 UNKNOWN
   
   
   





[GitHub] [flink] XComp commented on a diff in pull request #22996: [FLINK-32468][rpc] Switch from Akka to Pekko

2023-07-24 Thread via GitHub


XComp commented on code in PR #22996:
URL: https://github.com/apache/flink/pull/22996#discussion_r1272361814


##
flink-core/src/main/java/org/apache/flink/configuration/AkkaOptions.java:
##
@@ -58,26 +59,30 @@ public static boolean isForceRpcInvocationSerializationEnabled(Configuration con
 
     /** Flag whether to capture call stacks for RPC ask calls. */
     public static final ConfigOption<Boolean> CAPTURE_ASK_CALLSTACK =
-            ConfigOptions.key("akka.ask.callstack")
+            ConfigOptions.key("pekko.ask.callstack")
                     .booleanType()
                     .defaultValue(true)
+                    .withDeprecatedKeys("akka.ask.callstack")
                     .withDescription(
                             "If true, call stack for asynchronous asks are captured. That way, when an ask fails "
                                     + "(for example times out), you get a proper exception, describing to the original method call and "
                                     + "call site. Note that in case of having millions of concurrent RPC calls, this may add to the "
                                     + "memory footprint.");
 
-    /** Timeout for akka ask calls. */
+    /** Timeout for Pekko ask calls. */
     public static final ConfigOption<Duration> ASK_TIMEOUT_DURATION =
-            ConfigOptions.key("akka.ask.timeout")
+            ConfigOptions.key("pekko.ask.timeout")
                     .durationType()
                     .defaultValue(Duration.ofSeconds(10))
+                    .withDeprecatedKeys("akka.ask.timeout")
                     .withDescription(
-                            "Timeout used for all futures and blocking Akka calls. If Flink fails due to timeouts then you"
+                            "Timeout used for all futures and blocking Pekko calls. If Flink fails due to timeouts then you"
                                     + " should try to increase this value. Timeouts can be caused by slow machines or a congested network. The"
                                     + " timeout value requires a time-unit specifier (ms/s/min/h/d).");
 
-    /** @deprecated Use {@link #ASK_TIMEOUT_DURATION} */

Review Comment:
   Why did that change?
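
For readers following the hunk above: the `withDeprecatedKeys` calls are what keep the old `akka.*` keys usable after the rename to `pekko.*`. The general fallback idea can be sketched on a plain map as below. This is a hypothetical illustration only; `DeprecatedKeySketch` and `lookup` are made-up names, and the code does not use or reproduce Flink's `ConfigOption` implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a key with deprecated fallbacks; it does not use or
// reproduce Flink's ConfigOption implementation.
final class DeprecatedKeySketch {

    // Looks up the current key first, then each deprecated key in order.
    static Optional<String> lookup(
            Map<String, String> conf, String key, String... deprecatedKeys) {
        if (conf.containsKey(key)) {
            return Optional.of(conf.get(key));
        }
        for (String old : deprecatedKeys) {
            if (conf.containsKey(old)) {
                return Optional.of(conf.get(old));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("akka.ask.timeout", "30 s"); // only the old key is set
        System.out.println(lookup(conf, "pekko.ask.timeout", "akka.ask.timeout"));
    }
}
```

In this sketch the new key wins whenever both are set, which is one reasonable precedence choice; whether Flink resolves conflicts the same way is not stated in this thread.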



##
flink-core/src/main/java/org/apache/flink/configuration/AkkaOptions.java:
##
@@ -86,69 +91,78 @@ public static boolean isForceRpcInvocationSerializationEnabled(Configuration con
 
                             TimeUtils.formatWithHighestUnit(ASK_TIMEOUT_DURATION.defaultValue()))
                     .withDescription(ASK_TIMEOUT_DURATION.description());
 
-    /** The Akka tcp connection timeout. */
+    /** The Pekko tcp connection timeout. */
     public static final ConfigOption<String> TCP_TIMEOUT =
-            ConfigOptions.key("akka.tcp.timeout")
+            ConfigOptions.key("pekko.tcp.timeout")
                     .stringType()
                     .defaultValue("20 s")
+                    .withDeprecatedKeys("akka.tcp.timeout")
                     .withDescription(
                             "Timeout for all outbound connections. If you should experience problems with connecting to a"
                                     + " TaskManager due to a slow network, you should increase this value.");
 
     /** Timeout for the startup of the actor system. */
     public static final ConfigOption<String> STARTUP_TIMEOUT =
-            ConfigOptions.key("akka.startup-timeout")
+            ConfigOptions.key("pekko.startup-timeout")
                     .stringType()
                     .noDefaultValue()
+                    .withDeprecatedKeys("akka.startup-timeout")
                     .withDescription(
                             "Timeout after which the startup of a remote component is considered being failed.");
 
-    /** Override SSL support for the Akka transport. */
+    /** Override SSL support for the Pekko transport. */
     public static final ConfigOption<Boolean> SSL_ENABLED =
-            ConfigOptions.key("akka.ssl.enabled")
+            ConfigOptions.key("pekko.ssl.enabled")
                     .booleanType()
                     .defaultValue(true)
+                    .withDeprecatedKeys("akka.ssl.enabled")
                     .withDescription(
                             "Turns on SSL for Akka’s remote communication. This is applicable only when the global ssl flag"
                                     + " security.ssl.enabled is set to true.");
 
-    /** Maximum framesize of akka messages. */
+    /** Maximum framesize of Pekko messages. */
     public static final ConfigOption<String> FRAMESIZE =
-            ConfigOptions.key("akka.framesize")
+            ConfigOptions.key("pekko.framesize")
                     .stringType()
                     .defaultValue("10485760b")
+                    .withDeprecatedKeys("akka.framesize")
                     .withDescription(
                             "Maximum size of messages which are sent between the JobManager and the TaskManagers. If 

[jira] [Updated] (FLINK-32656) Deprecate ManagedTable related APIs

2023-07-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32656:
---
Labels: pull-request-available  (was: )

> Deprecate ManagedTable related APIs
> ---
>
> Key: FLINK-32656
> URL: https://issues.apache.org/jira/browse/FLINK-32656
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / API
>Affects Versions: 1.18.0
>Reporter: Jane Chan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Please refer to [FLIP-346: Deprecate ManagedTable related 
> APIs|https://cwiki.apache.org/confluence/display/FLINK/FLIP-346%3A+Deprecate+ManagedTable+related+APIs]
>  for more details.





[GitHub] [flink] LadyForest opened a new pull request, #23065: [FLINK-32656][table] Deprecate ManagedTable related APIs

2023-07-24 Thread via GitHub


LadyForest opened a new pull request, #23065:
URL: https://github.com/apache/flink/pull/23065

   ## What is the purpose of the change
   
   This PR marks all `ManagedTable` related APIs as `@Deprecated`, according to 
[FLIP-346](https://cwiki.apache.org/confluence/display/FLINK/FLIP-346%3A+Deprecate+ManagedTable+related+APIs).
   
   ## Brief change log
   Mark all related methods/classes/interfaces as deprecated.
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (**yes** / no) Mark as deprecated
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   





[GitHub] [flink] liming30 commented on pull request #23059: [FLINK-32655][runtime] Fix checkpoint aborted message being swallowed by RecreateOnResetOperatorCoordinator

2023-07-24 Thread via GitHub


liming30 commented on PR #23059:
URL: https://github.com/apache/flink/pull/23059#issuecomment-1648068653

   > Could you help add a test at `RecreateOnResetOperatorCoordinatorTest` to 
test whether the `notifyCheckpointAborted` is expected? Or improve an old unit 
test.
   
   @1996fanrui Thanks for your review. I added 
`RecreateOnResetOperatorCoordinatorTest#testNotifyCheckpointAbortedSuccess` to 
verify that `notifyCheckpointAborted` can be forwarded to the real 
`OperatorCoordinator`.
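
The behaviour such a test checks, a wrapper passing the abort notification through to the wrapped coordinator instead of swallowing it, can be sketched with stand-in types. Everything below is hypothetical and simplified: `Coordinator`, `RecordingCoordinator`, and `Wrapper` are illustrative names, not Flink's interfaces or the test added in the PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration; these are not Flink's interfaces.
final class ForwardingSketch {

    interface Coordinator {
        void notifyCheckpointAborted(long checkpointId);
    }

    // Records the aborted checkpoint ids it is told about.
    static final class RecordingCoordinator implements Coordinator {
        final List<Long> aborted = new ArrayList<>();

        @Override
        public void notifyCheckpointAborted(long checkpointId) {
            aborted.add(checkpointId);
        }
    }

    // Wrapper that must pass the notification on instead of swallowing it.
    static final class Wrapper implements Coordinator {
        private final Coordinator delegate;

        Wrapper(Coordinator delegate) {
            this.delegate = delegate;
        }

        @Override
        public void notifyCheckpointAborted(long checkpointId) {
            delegate.notifyCheckpointAborted(checkpointId);
        }
    }

    public static void main(String[] args) {
        RecordingCoordinator inner = new RecordingCoordinator();
        new Wrapper(inner).notifyCheckpointAborted(42L);
        System.out.println(inner.aborted);
    }
}
```

A test in this style asserts on what the inner coordinator recorded, so it fails if the wrapper ever drops the call.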
   




