[jira] [Created] (FLINK-36127) Support sorting watermark on flink web

2024-08-21 Thread Yu Chen (Jira)
Yu Chen created FLINK-36127:
---

 Summary: Support sorting watermark on flink web
 Key: FLINK-36127
 URL: https://issues.apache.org/jira/browse/FLINK-36127
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 2.0.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35832) IFNULL returns incorrect result in Flink SQL

2024-07-14 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865787#comment-17865787
 ] 

Yu Chen commented on FLINK-35832:
-

Hi [~yunta], would you mind pinging someone familiar with the SQL component to help 
with this problem?

> IFNULL returns incorrect result in Flink SQL
> 
>
> Key: FLINK-35832
> URL: https://issues.apache.org/jira/browse/FLINK-35832
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 2.0.0
>Reporter: Yu Chen
>Priority: Critical
>
> Run the following SQL in sql-client:
> The correct result should be '16', but we got '1' on the master.
> {code:java}
> Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
> [INFO] Execute statement succeeded.
> Flink SQL> select JSON_VALUE('{"a":16}','$.a'), 
> IFNULL(JSON_VALUE('{"a":16}','$.a'),'0');
> ++++
> | op |                         EXPR$0 |                         EXPR$1 |
> ++++
> | +I |                             16 |                              1 |
> ++++
> Received a total of 1 row (0.30 seconds){code}
>  
> With some quick debugging, I guess it may be caused by 
> [FLINK-24413|https://issues.apache.org/jira/browse/FLINK-24413] which was 
> introduced in Flink version 1.15.
>  
> I think the wrong result '1' was produced because the SQL simplification 
> procedure assumed that parameter 1 and parameter 2 of IFNULL ('0' is a CHAR(1) 
> literal) have the same type, and therefore implicitly cast '16' to CHAR(1), 
> resulting in the incorrect result.
>  
> I have tested the SQL in the following versions:
>  
> ||Flink Version||Result||
> |1.13|16,16|
> |1.17|16,1|
> |1.19|16,1|
> |master|16,1|
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35832) IFNULL returns incorrect result in Flink SQL

2024-07-14 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-35832:

Summary: IFNULL returns incorrect result in Flink SQL  (was: IFNULL returns 
error result in Flink SQL)

> IFNULL returns incorrect result in Flink SQL
> 
>
> Key: FLINK-35832
> URL: https://issues.apache.org/jira/browse/FLINK-35832
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 2.0.0
>Reporter: Yu Chen
>Priority: Critical
>
> Run the following SQL in sql-client:
> The correct result should be '16', but we got '1' on the master.
> {code:java}
> Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
> [INFO] Execute statement succeeded.
> Flink SQL> select JSON_VALUE('{"a":16}','$.a'), 
> IFNULL(JSON_VALUE('{"a":16}','$.a'),'0');
> ++++
> | op |                         EXPR$0 |                         EXPR$1 |
> ++++
> | +I |                             16 |                              1 |
> ++++
> Received a total of 1 row (0.30 seconds){code}
>  
> With some quick debugging, I guess it may be caused by 
> [FLINK-24413|https://issues.apache.org/jira/browse/FLINK-24413] which was 
> introduced in Flink version 1.15.
>  
> I think the wrong result '1' was produced because the SQL simplification 
> procedure assumed that parameter 1 and parameter 2 of IFNULL ('0' is a CHAR(1) 
> literal) have the same type, and therefore implicitly cast '16' to CHAR(1), 
> resulting in the incorrect result.
>  
> I have tested the SQL in the following versions:
>  
> ||Flink Version||Result||
> |1.13|16,16|
> |1.17|16,1|
> |1.19|16,1|
> |master|16,1|
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35832) IFNULL returns error result in Flink SQL

2024-07-14 Thread Yu Chen (Jira)
Yu Chen created FLINK-35832:
---

 Summary: IFNULL returns error result in Flink SQL
 Key: FLINK-35832
 URL: https://issues.apache.org/jira/browse/FLINK-35832
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 2.0.0
Reporter: Yu Chen


Run the following SQL in sql-client:

The correct result should be '16', but we got '1' on the master.
{code:java}
Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
[INFO] Execute statement succeeded.

Flink SQL> select JSON_VALUE('{"a":16}','$.a'), 
IFNULL(JSON_VALUE('{"a":16}','$.a'),'0');
++++
| op |                         EXPR$0 |                         EXPR$1 |
++++
| +I |                             16 |                              1 |
++++
Received a total of 1 row (0.30 seconds){code}
 

With some quick debugging, I guess it may be caused by 
[FLINK-24413|https://issues.apache.org/jira/browse/FLINK-24413] which was 
introduced in Flink version 1.15.

 

I think the wrong result '1' was produced because the SQL simplification procedure 
assumed that parameter 1 and parameter 2 of IFNULL ('0' is a CHAR(1) literal) have 
the same type, and therefore implicitly cast '16' to CHAR(1), resulting in the 
incorrect result.
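
Below is a minimal, untested sketch (Java Table API) of the reproduction, together 
with a possible workaround: explicitly casting the fallback literal to STRING so 
that both IFNULL arguments share a non-CHAR(1) type. The workaround is only an 
assumption based on the analysis above, not something verified in this ticket.
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IfNullRepro {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Affected query: '0' is inferred as CHAR(1), so '16' may be cast down
        // to CHAR(1) and truncated to '1' on the affected versions.
        tEnv.executeSql(
                "SELECT JSON_VALUE('{\"a\":16}', '$.a'), "
                        + "IFNULL(JSON_VALUE('{\"a\":16}', '$.a'), '0')")
            .print();

        // Possible workaround (assumption): widen the fallback to STRING so the
        // common type of both IFNULL arguments is STRING rather than CHAR(1).
        tEnv.executeSql(
                "SELECT IFNULL(JSON_VALUE('{\"a\":16}', '$.a'), "
                        + "CAST('0' AS STRING))")
            .print();
    }
}
{code}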

 

I have tested the SQL in the following versions:

 
||Flink Version||Result||
|1.13|16,16|
|1.17|16,1|
|1.19|16,1|
|master|16,1|

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed

2024-04-29 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842230#comment-17842230
 ] 

Yu Chen commented on FLINK-35039:
-

Hi [~wczhu] , sorry for the late response.

It does surprise me that YARN doesn't support POST. But one point confuses me: 
POST requests are already used in many places in the Flink REST interfaces, such 
as stop-with-savepoint. Are these interfaces currently not accessible through the 
YARN proxy?

Moreover, according to the principles of RESTful interfaces [1], the biggest 
difference between POST and PUT is that PUT requests are idempotent: submitting 
the same request N times leaves the result unchanged. The profiling interface, 
however, should create a new profiling instance on each request and add its 
result on the server.

Therefore, POST may be more in line with the semantics of this interface.

So I wonder whether it is appropriate to add this compatibility workaround in 
Flink. WDYT [~yunta]?


 [1] https://restfulapi.net/rest-put-vs-post/
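
To make the idempotency argument concrete, here is a small, self-contained toy 
sketch (plain Java, no Flink or YARN APIs, all names hypothetical): repeating the 
same PUT leaves the server state unchanged, while repeating the same POST keeps 
creating new profiling instances.
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: "resources" stands for a PUT target, "profilingInstances"
// for a POST target. All names are hypothetical and not part of Flink.
public class PutVsPostDemo {
    private final Map<String, String> resources = new HashMap<>();
    private final List<String> profilingInstances = new ArrayList<>();

    void put(String id, String body) {
        resources.put(id, body); // repeating the same PUT does not change the outcome
    }

    void post(String body) {
        profilingInstances.add(body); // every POST creates one more instance
    }

    public static void main(String[] args) {
        PutVsPostDemo server = new PutVsPostDemo();
        for (int i = 0; i < 3; i++) {
            server.put("config/1", "mode=ITIMER");
            server.post("profiling request, mode=ITIMER");
        }
        // 1 resource after three identical PUTs, 3 instances after three POSTs.
        System.out.println("resources: " + server.resources.size());
        System.out.println("profiling instances: " + server.profilingInstances.size());
    }
}
{code}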

!image-2024-04-30-11-12-34-734.png|width=414,height=496!

> Create Profiling JobManager/TaskManager Instance failed
> ---
>
> Key: FLINK-35039
> URL: https://issues.apache.org/jira/browse/FLINK-35039
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
> Environment: Hadoop 3.2.2
> Flink 1.19
>Reporter: ude
>Assignee: ude
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-08-10-21-31-066.png, 
> image-2024-04-08-10-21-48-417.png, image-2024-04-08-10-30-16-683.png, 
> image-2024-04-30-11-12-34-734.png, image-2024-04-30-11-14-44-335.png
>
>
> I'm testing the "async-profiler" feature in version 1.19, but when I submit a 
> job in YARN per-job mode, I get an error when I click Create Profiling 
> Instance on the Flink Web UI page.
> !image-2024-04-08-10-21-31-066.png!
> !image-2024-04-08-10-21-48-417.png!
> The error message obviously means that the YARN proxy server does not support 
> *POST* calls. I checked the code of _*WebAppProxyServlet.java*_ and found 
> that the *POST* method is indeed not supported, so I changed it to the *PUT* 
> method and the call succeeded.
> !image-2024-04-08-10-30-16-683.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed

2024-04-29 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-35039:

Attachment: image-2024-04-30-11-14-44-335.png

> Create Profiling JobManager/TaskManager Instance failed
> ---
>
> Key: FLINK-35039
> URL: https://issues.apache.org/jira/browse/FLINK-35039
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
> Environment: Hadoop 3.2.2
> Flink 1.19
>Reporter: ude
>Assignee: ude
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-08-10-21-31-066.png, 
> image-2024-04-08-10-21-48-417.png, image-2024-04-08-10-30-16-683.png, 
> image-2024-04-30-11-12-34-734.png, image-2024-04-30-11-14-44-335.png
>
>
> I'm testing the "async-profiler" feature in version 1.19, but when I submit a 
> job in YARN per-job mode, I get an error when I click Create Profiling 
> Instance on the Flink Web UI page.
> !image-2024-04-08-10-21-31-066.png!
> !image-2024-04-08-10-21-48-417.png!
> The error message obviously means that the YARN proxy server does not support 
> *POST* calls. I checked the code of _*WebAppProxyServlet.java*_ and found 
> that the *POST* method is indeed not supported, so I changed it to the *PUT* 
> method and the call succeeded.
> !image-2024-04-08-10-30-16-683.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35039) Create Profiling JobManager/TaskManager Instance failed

2024-04-29 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-35039:

Attachment: image-2024-04-30-11-12-34-734.png

> Create Profiling JobManager/TaskManager Instance failed
> ---
>
> Key: FLINK-35039
> URL: https://issues.apache.org/jira/browse/FLINK-35039
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
> Environment: Hadoop 3.2.2
> Flink 1.19
>Reporter: ude
>Assignee: ude
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-04-08-10-21-31-066.png, 
> image-2024-04-08-10-21-48-417.png, image-2024-04-08-10-30-16-683.png, 
> image-2024-04-30-11-12-34-734.png
>
>
> I'm testing the "async-profiler" feature in version 1.19, but when I submit a 
> job in YARN per-job mode, I get an error when I click Create Profiling 
> Instance on the Flink Web UI page.
> !image-2024-04-08-10-21-31-066.png!
> !image-2024-04-08-10-21-48-417.png!
> The error message obviously means that the YARN proxy server does not support 
> *POST* calls. I checked the code of _*WebAppProxyServlet.java*_ and found 
> that the *POST* method is indeed not supported, so I changed it to the *PUT* 
> method and the call succeeded.
> !image-2024-04-08-10-30-16-683.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34968) Update flink-web copyright to 2024

2024-03-30 Thread Yu Chen (Jira)
Yu Chen created FLINK-34968:
---

 Summary: Update flink-web copyright to 2024
 Key: FLINK-34968
 URL: https://issues.apache.org/jira/browse/FLINK-34968
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34622) Typo of execution_mode configuration name in Chinese document

2024-03-07 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-34622:

Description: !image-2024-03-08-14-46-34-859.png|width=794,height=380!

> Typo of execution_mode configuration name in Chinese document
> -
>
> Key: FLINK-34622
> URL: https://issues.apache.org/jira/browse/FLINK-34622
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2024-03-08-14-46-34-859.png
>
>
> !image-2024-03-08-14-46-34-859.png|width=794,height=380!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-34622) Typo of execution_mode configuration name in Chinese document

2024-03-07 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-34622:

Attachment: image-2024-03-08-14-46-34-859.png

> Typo of execution_mode configuration name in Chinese document
> -
>
> Key: FLINK-34622
> URL: https://issues.apache.org/jira/browse/FLINK-34622
> Project: Flink
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2024-03-08-14-46-34-859.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34622) Typo of execution_mode configuration name in Chinese document

2024-03-07 Thread Yu Chen (Jira)
Yu Chen created FLINK-34622:
---

 Summary: Typo of execution_mode configuration name in Chinese 
document
 Key: FLINK-34622
 URL: https://issues.apache.org/jira/browse/FLINK-34622
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler

2024-03-01 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen resolved FLINK-33325.
-
Release Note: 
Since Flink 1.19, we support profiling the JobManager/TaskManager process 
interactively with 
[async-profiler](https://github.com/async-profiler/async-profiler) via the 
Flink Web UI, which allows users to create a profiling instance with arbitrary 
intervals and event modes, e.g. ITIMER, CPU, Lock, Wall-Clock and Allocation.
Flink users can conveniently submit profiling requests and export the results via 
the Flink Web UI.

For example, 
- First, users should identify the candidate TaskManager/JobManager with a 
performance bottleneck for profiling, and switch to the corresponding 
TaskManager/JobManager page (profiler tab).
- Users can submit a profiling instance with a specified period and mode by 
simply clicking the `Create Profiling Instance` button. (A description of each 
profiling mode is shown when hovering over the corresponding mode.)
- Once the profiling instance is complete, the user can easily download the 
interactive HTML file by clicking on the link.

**More Information**
- 
[Documents](https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/ops/debugging/profiler/)
- [FLIP-375: Built-in cross-platform powerful java 
profiler](https://cwiki.apache.org/confluence/x/64lEE)
  Resolution: Resolved

> FLIP-375: Built-in cross-platform powerful java profiler
> 
>
> Key: FLINK-33325
> URL: https://issues.apache.org/jira/browse/FLINK-33325
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST, Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
> Fix For: 1.19.0
>
>
> This is an umbrella JIRA of 
> [FLIP-375|https://cwiki.apache.org/confluence/x/64lEE]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34388) Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone and native K8s application mode

2024-02-22 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819897#comment-17819897
 ] 

Yu Chen commented on FLINK-34388:
-

Hi [~ferenc-csaky]  [~lincoln.86xy]  [~yunta] , 

I have completed this release testing. I think this feature works well on 
branch `release-1.19`, and this ticket can be closed.

Here are some testing logs as a reference.
*> Test 1: Pass local:// job jar in standalone mode, and check the artifacts 
are not actually copied.*
*Passed.* Flink uses the original file and does not actually copy it.
By the way, it also works with an absolute path (without `local://`), but in that 
case the file is copied to `user.artifacts.base-dir`. I'm not sure whether that is 
expected (in my opinion, since it is a local file, we may not need to copy it; 
cc [~ferenc-csaky]).

*> Test 2: Pass multiple artifacts in standalone mode.*
*Passed.* In a StandaloneJobCluster, Flink could load jars via `http://` (copied) 
and `local://` (not copied) simultaneously.

*> Test 3: Pass a non-local job jar in native k8s mode.*
*Passed.* Tested by starting a native K8s application cluster in `minikube` with 
the following command:
{code:java}
./bin/flink run-application \
--target kubernetes-application \
-Dkubernetes.cluster-id=my-first-application-cluster \
-Dkubernetes.container.image.ref=flink:test_community_1.19_SN \
http://localhost:/data/WordCount.jar {code}


Flink throws the expected exception with a hint (set 
`user.artifacts.raw-http-enabled` to true).
{code:java}
2024-02-23 02:20:48,305 ERROR 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Could not create 
application program.
java.lang.RuntimeException: java.lang.IllegalArgumentException: Artifact 
fetching from raw HTTP endpoints are disabled. Set the 
'user.artifacts.raw-http-enabled' property to override.
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:158)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgramRetriever(KubernetesApplicationClusterEntrypoint.java:129)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.getPackagedProgram(KubernetesApplicationClusterEntrypoint.java:111)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.lambda$main$0(KubernetesApplicationClusterEntrypoint.java:85)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.main(KubernetesApplicationClusterEntrypoint.java:85)
 [flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
Caused by: java.lang.IllegalArgumentException: Artifact fetching from raw HTTP 
endpoints are disabled. Set the 'user.artifacts.raw-http-enabled' property to 
override.
at 
org.apache.flink.client.program.artifact.ArtifactFetchManager.isRawHttp(ArtifactFetchManager.java:166)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.client.program.artifact.ArtifactFetchManager.getFetcher(ArtifactFetchManager.java:142)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifact(ArtifactFetchManager.java:157)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.client.program.artifact.ArtifactFetchManager.fetchArtifacts(ArtifactFetchManager.java:124)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
at 
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint.fetchArtifacts(KubernetesApplicationClusterEntrypoint.java:156)
 ~[flink-dist-1.19-SNAPSHOT.jar:1.19-SNAPSHOT]
... 5 more {code}


After setting the parameter in `flink-conf.yaml`, the jar was copied to 
`user.artifacts.base-dir` and the job ran as expected.


*> Test 4: Pass additional remote artifacts in native k8s mode.*
*Passed.* Tested by starting a native K8s application cluster with the commands 
from `Test 3`.
In addition, we added extra jars via `flink-conf.yaml`; they were copied as 
expected and the job ran well.
{code:java}
user.artifacts.artifact-list: 
http://10.23.171.97:/data/AsyncIO.jar;http://10.23.171.97:/data/WordCount.jar;
 {code}
{code:java}
root@my-first-application-cluster-8479579f45-cdpc9:/opt/flink/artifacts/default/my-first-application-cluster#
 ls
AsyncIO.jar WindowJoin.jar WordCount.jar {code}

> Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone 
> and native K8s application mode
> ---
>
>

[jira] [Commented] (FLINK-34388) Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone and native K8s application mode

2024-02-07 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815243#comment-17815243
 ] 

Yu Chen commented on FLINK-34388:
-

I'd like to take this release testing. Could anyone help assign this to me?

> Release Testing: Verify FLINK-28915 Support artifact fetching in Standalone 
> and native K8s application mode
> ---
>
> Key: FLINK-34388
> URL: https://issues.apache.org/jira/browse/FLINK-34388
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Metrics
>Affects Versions: 1.19.0
>Reporter: Ferenc Csaky
>Assignee: Yu Chen
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.19.0
>
>
> This ticket covers testing FLINK-28915. More details and the added docs are 
> accessible on the [PR|https://github.com/apache/flink/pull/24065]
> Test 1: Pass {{local://}} job jar in standalone mode, check the artifacts are 
> not actually copied.
> Test 2: Pass multiple artifacts in standalone mode.
> Test 3: Pass a non-local job jar in native k8s mode. [1]
> Test 4: Pass additional remote artifacts in native k8s mode.
> Available config options: 
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#artifact-fetching
> [1] Custom docker image build instructions: 
> https://github.com/apache/flink-docker/tree/dev-master
> Note: The docker build instructions also contains a web server example that 
> can be used to serve HTTP artifacts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34310) Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform powerful java profiler

2024-02-06 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815079#comment-17815079
 ] 

Yu Chen commented on FLINK-34310:
-

Sorry for the late response.

Thanks [~yunta] for creating the Testing Instructions.  
{quote}When I tested other features locally, I found the Profiler page throws some 
exceptions. I'm not sure whether it's expected.
{quote}
Hi [~fanrui], so far, whether the profiler is enabled is determined by checking 
whether the interface is registered with `WebMonitorEndpoint`. Therefore, this 
behavior is by design.
But I think we can implement this check more elegantly in a later version by 
registering an interface to query the enabled status of the profiler.
 
{quote}[~Yu Chen] Could you estimate when the user doc 
(https://issues.apache.org/jira/browse/FLINK-33436) can be finished?{quote}
Hi [~lincoln.86xy], really sorry for the late response. I have been quite busy 
recently; would it be OK if I finish the documentation within the next week 
(before 02.18)?

> Release Testing Instructions: Verify FLINK-33325 Built-in cross-platform 
> powerful java profiler
> ---
>
> Key: FLINK-34310
> URL: https://issues.apache.org/jira/browse/FLINK-34310
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / REST, Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: lincoln lee
>Assignee: Yun Tang
>Priority: Blocker
> Fix For: 1.19.0
>
> Attachments: image-2024-02-06-14-09-39-874.png, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png, screenshot-4.png, screenshot-5.png
>
>
> Instructions:
> 1. For the default case, it will print the hint to tell users how to enable 
> this feature.
>  !screenshot-2.png! 
> 2. After we add {{rest.profiling.enabled: true}} to the configuration, we 
> can use this feature, and the default mode should be {{ITIMER}}
>  !screenshot-3.png! 
> 3. We cannot create another profiling while one is running
>  !screenshot-4.png! 
> 4. We can keep at most 10 profiling snapshots by default, and older ones 
> will be deleted automatically.
>  !screenshot-5.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34099) CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog is unstable on AZP

2024-01-15 Thread Yu Chen (Jira)
Yu Chen created FLINK-34099:
---

 Summary: 
CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog is unstable 
on AZP
 Key: FLINK-34099
 URL: https://issues.apache.org/jira/browse/FLINK-34099
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0
Reporter: Yu Chen


This build [Pipelines - Run 20240115.30 logs 
(azure.com)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56403&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba]
 fails as 
{code:java}
Jan 15 18:29:51 18:29:51.938 [ERROR] 
org.apache.flink.test.checkpointing.CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog
 -- Time elapsed: 2.022 s <<< FAILURE!
Jan 15 18:29:51 org.opentest4j.AssertionFailedError: 
Jan 15 18:29:51 
Jan 15 18:29:51 expected: 0
Jan 15 18:29:51  but was: 1
Jan 15 18:29:51 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Jan 15 18:29:51 at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
Jan 15 18:29:51 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Jan 15 18:29:51 at 
org.apache.flink.test.checkpointing.CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog(CheckpointIntervalDuringBacklogITCase.java:141)
Jan 15 18:29:51 at java.lang.reflect.Method.invoke(Method.java:498)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34072) Use JAVA_RUN in shell scripts

2024-01-14 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17806446#comment-17806446
 ] 

Yu Chen commented on FLINK-34072:
-

Hi [~yunta] , I'd like to take this ticket if you don't mind.

> Use JAVA_RUN in shell scripts
> -
>
> Key: FLINK-34072
> URL: https://issues.apache.org/jira/browse/FLINK-34072
> Project: Flink
>  Issue Type: Improvement
>  Components: Deployment / Scripts
>Reporter: Yun Tang
>Priority: Minor
> Fix For: 1.19.0
>
>
> We should call {{JAVA_RUN}} in all cases when we launch the {{java}} command, 
> otherwise we might not be able to run {{java}} if JAVA_HOME is not set.
> such as:
> {code:java}
> flink-1.19-SNAPSHOT-bin/flink-1.19-SNAPSHOT/bin/config.sh: line 339: > 17 : 
> syntax error: operand expected (error token is "> 17 ")
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler

2024-01-10 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805374#comment-17805374
 ] 

Yu Chen commented on FLINK-33325:
-

Hi [~Zhanghao Chen] , thanks for your attention, the feature will be available 
to everyone soon!

> FLIP-375: Built-in cross-platform powerful java profiler
> 
>
> Key: FLINK-33325
> URL: https://issues.apache.org/jira/browse/FLINK-33325
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST, Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
>
> This is an umbrella JIRA of 
> [FLIP-375|https://cwiki.apache.org/confluence/x/64lEE]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34029) Support different profiling mode on Flink WEB

2024-01-08 Thread Yu Chen (Jira)
Yu Chen created FLINK-34029:
---

 Summary: Support different profiling mode on Flink WEB
 Key: FLINK-34029
 URL: https://issues.apache.org/jira/browse/FLINK-34029
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34013) ProfilingServiceTest.testRollingDeletion is unstable on AZP

2024-01-08 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804387#comment-17804387
 ] 

Yu Chen commented on FLINK-34013:
-

Hi [~Sergey Nuyanzin], I have reproduced it on my local machine. It's a bug 
caused by triggering `stopProfiling` twice for each profiling request.

I'll create a PR to fix this. Thank you for pointing it out!

> ProfilingServiceTest.testRollingDeletion is unstable on AZP
> ---
>
> Key: FLINK-34013
> URL: https://issues.apache.org/jira/browse/FLINK-34013
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56073&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=8258
>  fails as 
> {noformat}
> Jan 06 02:09:28 org.opentest4j.AssertionFailedError: expected: <2> but was: 
> <3>
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
> Jan 06 02:09:28   at 
> org.apache.flink.runtime.util.profiler.ProfilingServiceTest.verifyRollingDeletionWorks(ProfilingServiceTest.java:167)
> Jan 06 02:09:28   at 
> org.apache.flink.runtime.util.profiler.ProfilingServiceTest.testRollingDeletion(ProfilingServiceTest.java:117)
> Jan 06 02:09:28   at java.lang.reflect.Method.invoke(Method.java:498)
> Jan 06 02:09:28   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34013) ProfilingServiceTest.testRollingDeletion is unstable on AZP

2024-01-08 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804341#comment-17804341
 ] 

Yu Chen commented on FLINK-34013:
-

Hi [~Sergey Nuyanzin], really sorry about that. Let me take a look and fix this.

> ProfilingServiceTest.testRollingDeletion is unstable on AZP
> ---
>
> Key: FLINK-34013
> URL: https://issues.apache.org/jira/browse/FLINK-34013
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.19.0
>Reporter: Sergey Nuyanzin
>Priority: Critical
>  Labels: test-stability
>
> This build 
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56073&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702&l=8258
>  fails as 
> {noformat}
> Jan 06 02:09:28 org.opentest4j.AssertionFailedError: expected: <2> but was: 
> <3>
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
> Jan 06 02:09:28   at 
> org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
> Jan 06 02:09:28   at 
> org.apache.flink.runtime.util.profiler.ProfilingServiceTest.verifyRollingDeletionWorks(ProfilingServiceTest.java:167)
> Jan 06 02:09:28   at 
> org.apache.flink.runtime.util.profiler.ProfilingServiceTest.testRollingDeletion(ProfilingServiceTest.java:117)
> Jan 06 02:09:28   at java.lang.reflect.Method.invoke(Method.java:498)
> Jan 06 02:09:28   at 
> java.util.concurrent.RecursiveAction.exec(RecursiveAction.java:189)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
> Jan 06 02:09:28   at 
> java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33435) The visualization and download capabilities of profiling history

2023-11-30 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen closed FLINK-33435.
---
Resolution: Duplicate

> The visualization and download capabilities of profiling history 
> -
>
> Key: FLINK-33435
> URL: https://issues.apache.org/jira/browse/FLINK-33435
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33435) The visualization and download capabilities of profiling history

2023-11-30 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791600#comment-17791600
 ] 

Yu Chen commented on FLINK-33435:
-

This subtask will be completed in FLINK-33433 and FLINK-33434. So I'll close 
this ticket.

> The visualization and download capabilities of profiling history 
> -
>
> Key: FLINK-33435
> URL: https://issues.apache.org/jira/browse/FLINK-33435
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33613) Python UDF Runner process leak in Process Mode

2023-11-21 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-33613:

Description: 
While working with PyFlink, we found that in Process Mode, the Python UDF 
process may leak after a failover of the job. It leads to a rising number of 
processes with their threads in the host machine, which eventually results in 
failure to create new threads.

 

You can try to reproduce it with the attached test job 
`streaming_word_count.py`.

(Note that the job will keep failing over, and you can watch the process leak 
with `ps -ef` on the TaskManager.)

 

Our test environment:
 * K8S Application Mode
 * 4 Taskmanagers with 12 slots/TM
 * Job's parallelism was set to 48 

The number of UDF processes (`pyflink.fn_execution.beam.beam_boot`) should be 
consistent with the number of slots per TM (12), but we found that there were 180 
processes on one TaskManager after several failovers.

  was:
While working with PyFlink, we found that in Process Mode, the Python UDF 
process may leak after a failover of the job. It leads to a rising number of 
processes with their threads in the host machine, which eventually results in 
failure to create new threads.

 

You can try to reproduce it with the attached test task 
`streamin_word_count.py`.

(Note that the job will continue failover, and you can watch the process leaks 
by `ps -ef` on Taskmanager.

 

Our test environment:
 * K8S Application Mode
 * 4 Taskmanagers with 12 slots/TM
 * Job's parallelism was set to 48 

The udf process `pyflink.fn_execution.beam.beam_boot` should be consistence 
with parallelism (48), but we found that there are 180 processes after several 
failovers.


> Python UDF Runner process leak in Process Mode
> --
>
> Key: FLINK-33613
> URL: https://issues.apache.org/jira/browse/FLINK-33613
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.17.0
>Reporter: Yu Chen
>Priority: Major
> Attachments: ps-ef.txt, streaming_word_count-1.py
>
>
> While working with PyFlink, we found that in Process Mode, the Python UDF 
> process may leak after a failover of the job. It leads to a rising number of 
> processes with their threads in the host machine, which eventually results in 
> failure to create new threads.
>  
> You can try to reproduce it with the attached test job 
> `streaming_word_count.py`.
> (Note that the job will keep failing over, and you can watch the process 
> leak with `ps -ef` on the TaskManager.)
>  
> Our test environment:
>  * K8S Application Mode
>  * 4 Taskmanagers with 12 slots/TM
>  * Job's parallelism was set to 48 
> The number of UDF processes (`pyflink.fn_execution.beam.beam_boot`) should be 
> consistent with the number of slots per TM (12), but we found that there were 
> 180 processes on one TaskManager after several failovers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33613) Python UDF Runner process leak in Process Mode

2023-11-21 Thread Yu Chen (Jira)
Yu Chen created FLINK-33613:
---

 Summary: Python UDF Runner process leak in Process Mode
 Key: FLINK-33613
 URL: https://issues.apache.org/jira/browse/FLINK-33613
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.17.0
Reporter: Yu Chen
 Attachments: ps-ef.txt, streaming_word_count-1.py

While working with PyFlink, we found that in Process Mode, the Python UDF 
process may leak after a failover of the job. It leads to a rising number of 
processes with their threads in the host machine, which eventually results in 
failure to create new threads.

 

You can try to reproduce it with the attached test job 
`streaming_word_count.py`.

(Note that the job will keep failing over, and you can watch the process leak 
with `ps -ef` on the TaskManager.)

 

Our test environment:
 * K8S Application Mode
 * 4 Taskmanagers with 12 slots/TM
 * Job's parallelism was set to 48 

The number of UDF processes (`pyflink.fn_execution.beam.beam_boot`) should be 
consistent with the job parallelism (48), but we found that there were 180 
processes after several failovers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33474) ShowPlan throws undefined exception In Flink Web Submit Page

2023-11-06 Thread Yu Chen (Jira)
Yu Chen created FLINK-33474:
---

 Summary: ShowPlan throws undefined exception In Flink Web Submit 
Page
 Key: FLINK-33474
 URL: https://issues.apache.org/jira/browse/FLINK-33474
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen
 Attachments: image-2023-11-07-13-53-08-216.png

The exception is shown in the figure below; meanwhile, the job plan cannot be 
displayed properly.

 

The root cause is that the dagreComponent is located in the nz-drawer and is 
only loaded when the drawer is visible, so we need to wait for the drawer to 
finish loading and then render the job plan.

!image-2023-11-07-13-53-08-216.png|width=400,height=190!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33436) Documentation on the built-in Profiler

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33436:
---

 Summary: Documentation on the built-in Profiler
 Key: FLINK-33436
 URL: https://issues.apache.org/jira/browse/FLINK-33436
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.19.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33435) The visualization and download capabilities of profiling history

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33435:
---

 Summary: The visualization and download capabilities of profiling 
history 
 Key: FLINK-33435
 URL: https://issues.apache.org/jira/browse/FLINK-33435
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33434) Support invoke async-profiler on Taskmanager through REST API

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33434:
---

 Summary: Support invoke async-profiler on Taskmanager through REST 
API
 Key: FLINK-33434
 URL: https://issues.apache.org/jira/browse/FLINK-33434
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Affects Versions: 1.19.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33433) Support invoke async-profiler on Jobmanager through REST API

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33433:
---

 Summary: Support invoke async-profiler on Jobmanager through REST 
API
 Key: FLINK-33433
 URL: https://issues.apache.org/jira/browse/FLINK-33433
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Affects Versions: 1.19.0
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler

2023-10-20 Thread Yu Chen (Jira)
Yu Chen created FLINK-33325:
---

 Summary: FLIP-375: Built-in cross-platform powerful java profiler
 Key: FLINK-33325
 URL: https://issues.apache.org/jira/browse/FLINK-33325
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST, Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen


This is an umbrella JIRA of 
[FLIP-375|https://cwiki.apache.org/confluence/x/64lEE]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-16 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775743#comment-17775743
 ] 

Yu Chen commented on FLINK-33230:
-

Hi [~JunRuiLi] ,

Sure, I'll describe the details of the implementation in a FLIP and start a 
discussion on the dev mailing list.

> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-52-38-252.png|width=750,height=263!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-11 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774333#comment-17774333
 ] 

Yu Chen commented on FLINK-33230:
-

Hi [~lsy]. 

We can store the JSON string of the StreamGraph in the ArchivedExecutionGraph 
in a similar way to the JsonPlan.
Actually, the execution graph shown in the Web UI is also extracted from the 
JsonPlan.

You can refer to the following code path:
StreamingJobGraphGenerator -> createJobGraph -> [DefaultExecutionGraphBuilder] 
-> executionGraph -> ArchivedExecutionGraph
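
As a minimal client-side illustration (a sketch, not part of the proposed change): 
`StreamExecutionEnvironment#getExecutionPlan()` already renders the StreamGraph as 
a JSON string today, which is analogous in spirit to the JsonPlan the Web UI 
consumes; persisting it in the ArchivedExecutionGraph as described above would be 
the server-side counterpart.
{code:java}
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamGraphJsonSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // A tiny pipeline, just so the StreamGraph has something to describe.
        env.fromElements(1, 2, 3)
                .map(i -> i * 2)
                .returns(Types.INT)
                .print();

        // Renders the StreamGraph of the pipeline above as a JSON string,
        // similar in spirit to the JsonPlan shown in the Web UI today.
        System.out.println(env.getExecutionPlan());
    }
}
{code}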

> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Assignee: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-52-38-252.png|width=750,height=263!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-10 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-33230:

Attachment: (was: image-2023-10-10-18-45-24-486.png)

> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-52-38-252.png|width=750,height=263!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-10 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-33230:

Description: 
Flink Web shows users the ExecutionGraph (i.e., chained operators), but in some 
cases, we would like to know the structure of the chained operators as well as 
the necessary metrics such as the inputs and outputs of data, etc.

 

Thus, we propose to show the stream graphs and some related metrics, such as 
numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).

 

!image-2023-10-10-18-52-38-252.png|width=750,height=263!

  was:
Flink Web shows users the ExecutionGraph (i.e., chained operators), but in some 
cases, we would like to know the structure of the chained operators as well as 
the necessary metrics such as the inputs and outputs of data, etc.

 

Thus, we propose to show the stream graphs and some related metrics such as 
numberRecordInand numberRecordOut on the Flink Web (As shown in the Figure).

 

!image-2023-10-10-18-45-42-991.png|width=508,height=178!


> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-45-24-486.png, 
> image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-52-38-252.png|width=750,height=263!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-10 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-33230:

Attachment: image-2023-10-10-18-52-38-252.png

> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-45-24-486.png, 
> image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-45-42-991.png|width=508,height=178!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Web UI

2023-10-10 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-33230:

   Attachment: image-2023-10-10-18-45-24-486.png
  Component/s: Runtime / Web Frontend
Affects Version/s: 1.19.0
  Description: 
Flink Web shows users the ExecutionGraph (i.e., chained operators), but in some 
cases, we would like to know the structure of the chained operators as well as 
the necessary metrics such as the inputs and outputs of data, etc.

 

Thus, we propose to show the stream graphs and some related metrics, such as 
numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).

 

!image-2023-10-10-18-45-42-991.png|width=508,height=178!
  Summary: Support Expanding ExecutionGraph to StreamGraph in Web 
UI  (was: Support Expanding ExecutionGraph to StreamGraph in Flink)

> Support Expanding ExecutionGraph to StreamGraph in Web UI
> -
>
> Key: FLINK-33230
> URL: https://issues.apache.org/jira/browse/FLINK-33230
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.19.0
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-10-10-18-45-24-486.png, 
> image-2023-10-10-18-52-38-252.png
>
>
> Flink Web shows users the ExecutionGraph (i.e., chained operators), but in 
> some cases, we would like to know the structure of the chained operators as 
> well as the necessary metrics such as the inputs and outputs of data, etc.
>  
> Thus, we propose to show the stream graphs and some related metrics, such as 
> numberRecordIn and numberRecordOut, on the Flink Web (as shown in the figure).
>  
> !image-2023-10-10-18-45-42-991.png|width=508,height=178!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Flink

2023-10-10 Thread Yu Chen (Jira)
Yu Chen created FLINK-33230:
---

 Summary: Support Expanding ExecutionGraph to StreamGraph in Flink
 Key: FLINK-33230
 URL: https://issues.apache.org/jira/browse/FLINK-33230
 Project: Flink
  Issue Type: Improvement
Reporter: Yu Chen






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32754) Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE

2023-08-04 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32754:

Description: 
We registered some metrics in the `enumerator` of the FLIP-27 source via 
`SplitEnumerator.metricGroup()`, but found that the task prints NPE logs in the JM 
when restoring, suggesting that `SplitEnumerator.metricGroup()` is null.
{*}Meanwhile, the task does not experience a failover, and checkpoints cannot 
be created successfully even after the task is in the running state{*}.

We found that the implementation class of `SplitEnumerator` is 
`LazyInitializedCoordinatorContext`; however, metricGroup() is initialized only 
after lazyInitialize() is called. By reviewing the code, we found that at the time 
of SourceCoordinator.resetToCheckpoint(), lazyInitialize() has not been called 
yet, so an NPE is thrown.

*Q: Why does this bug prevent the task from creating the Checkpoint?*
`SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
member variable `enumerator` in `SourceCoordinator` being null. Unfortunately, 
all Checkpoint-related calls in `SourceCoordinator` are called via 
`runInEventLoop()`.
In `runInEventLoop()`, if the enumerator is null, it will return directly.

*Q: Why doesn't this bug trigger a task failover?*
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, it will catch 
the exception and call `cleanAndFailJob` to try to fail the job.
However, `globalFailureHandler` is also initialized in `lazyInitialize()`, 
and `schedulerExecutor.execute` will swallow the NPE triggered by 
`globalFailureHandler.handleGlobalFailure(e)`.
Thus it appears that the task did not fail over.
!image-2023-08-04-18-28-05-897.png|width=963,height=443!

  was:
We registered some metrics in the `enumerator` of the flip-27 source via 
`SplitEnumerator.metricGroup()`, but found that the task prints NPE logs in JM 
when restoring, suggesting that `SplitEnumerator. metricGroup()` is null.
Meanwhile, the task does not experience failover, and the Checkpoints cannot be 
successfully created even after the task is in running state.

We found that the implementation class of `SplitEnumerator` is 
`LazyInitializedCoordinatorContext`, however, the metricGroup() is initialized 
after calling lazyInitialize(). By reviewing the code, we found that at the 
time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() has not been 
called yet, so NPE is thrown.

Q: Why does this bug prevent the task from creating the Checkpoint?
`SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
member variable `enumerator` in `SourceCoordinator` being null. Unfortunately, 
all Checkpoint-related calls in `SourceCoordinator` are called via 
`runInEventLoop()`.
In `runInEventLoop()`, if the enumerator is null, it will return directly.

Q: Why this bug doesn't trigger a task failover?
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, then it will catch 
the exception and call `cleanAndFailJob ` to try to fail the job.
However, `globalFailureHandler` is also initialized in `lazyInitialize()`, 
while `schedulerExecutor.execute` will ignore the NPE triggered by 
`globalFailureHandler.handleGlobalFailure(e)`.
Thus it appears that the task did not failover.
!image-2023-08-04-18-28-05-897.png|width=963,height=443!


> Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE
> --
>
> Key: FLINK-32754
> URL: https://issues.apache.org/jira/browse/FLINK-32754
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.17.0, 1.17.1
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-08-04-18-28-05-897.png
>
>
> We registered some metrics in the `enumerator` of the FLIP-27 source via 
> `SplitEnumerator.metricGroup()`, but found that the task prints NPE logs in 
> the JM when restoring, suggesting that `SplitEnumerator.metricGroup()` is null.
> {*}Meanwhile, the task does not experience failover, and the Checkpoints 
> cannot be successfully created even after the task is in running state{*}.
> We found that the implementation class of `SplitEnumerator` is 
> `LazyInitializedCoordinatorContext`, however, the metricGroup() is 
> initialized after calling lazyInitialize(). By reviewing the code, we found 
> that at the time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() 
> has not been called yet, so NPE is thrown.
> *Q: Why does this bug prevent the task from creating the Checkpoint?*
> `SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
> member variable `enumerator` in `SourceCoordinator` being null. 
> Unfortunately, a

[jira] [Updated] (FLINK-32754) Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE

2023-08-04 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32754:

Description: 
We registered some metrics in the `enumerator` of the flip-27 source via 
`SplitEnumeratorContext.metricGroup()`, but found that the task prints NPE logs in the JM 
when restoring, suggesting that `SplitEnumeratorContext.metricGroup()` is null.
Meanwhile, the task does not experience a failover, and Checkpoints cannot be 
created successfully even after the task is in the RUNNING state.

We found that the implementation class of `SplitEnumerator` is 
`LazyInitializedCoordinatorContext`, however, the metricGroup() is initialized 
after calling lazyInitialize(). By reviewing the code, we found that at the 
time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() has not been 
called yet, so NPE is thrown.

Q: Why does this bug prevent the task from creating the Checkpoint?
`SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
member variable `enumerator` in `SourceCoordinator` being null. Unfortunately, 
all Checkpoint-related calls in `SourceCoordinator` are called via 
`runInEventLoop()`.
In `runInEventLoop()`, if the enumerator is null, it will return directly.

Q: Why doesn't this bug trigger a task failover?
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, the exception is 
caught and `cleanAndFailJob` is called to try to fail the job.
However, `globalFailureHandler` is also only initialized in `lazyInitialize()`, 
and the NPE triggered by `globalFailureHandler.handleGlobalFailure(e)` is 
swallowed inside `schedulerExecutor.execute`.
As a result, the task never fails over.
Thus it appears that the task did not failover.
!image-2023-08-04-18-28-05-897.png|width=963,height=443!

  was:
We registered some metrics in the `enumerator` of the flip-27 source via 
`SplitEnumerator.metricGroup()`, but found that the task prints NPE logs in JM 
when restoring, suggesting that `SplitEnumerator. metricGroup()` is null.
Meanwhile, the task does not experience failover, and the Checkpoints cannot be 
successfully created even after the task is in running state.

We found that the implementation class of `SplitEnumerator` is 
`LazyInitializedCoordinatorContext`, however, the metricGroup() is initialized 
after calling lazyInitialize(). By reviewing the code, we found that at the 
time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() has not been 
called yet, so NPE is thrown.


Q: Why does this bug prevent the task from creating the Checkpoint?
`SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
member variable `enumerator` in `SourceCoordinator` being null. Unfortunately, 
all Checkpoint-related calls in `SourceCoordinator` are called via 
`runInEventLoop()`.
In `runInEventLoop()`, if the enumerator is null, it will return directly.

Q: Why this bug doesn't trigger a task failover?
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, then it will catch 
the exception and call `cleanAndFailJob ` to try to fail the job.
However, `globalFailureHandler` is also initialized in `lazyInitialize()`, 
while `schedulerExecutor.execute` will ignore the NPE triggered by 
`globalFailureHandler.handleGlobalFailure(e)`.
Thus it appears that the task did not failover.
!image-2023-08-04-18-28-05-897.png|width=2442,height=1123!


> Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE
> --
>
> Key: FLINK-32754
> URL: https://issues.apache.org/jira/browse/FLINK-32754
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Checkpointing
>Affects Versions: 1.17.0, 1.17.1
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2023-08-04-18-28-05-897.png
>
>
> We registered some metrics in the `enumerator` of the flip-27 source via 
> `SplitEnumerator.metricGroup()`, but found that the task prints NPE logs in 
> JM when restoring, suggesting that `SplitEnumerator. metricGroup()` is null.
> Meanwhile, the task does not experience failover, and the Checkpoints cannot 
> be successfully created even after the task is in running state.
> We found that the implementation class of `SplitEnumerator` is 
> `LazyInitializedCoordinatorContext`, however, the metricGroup() is 
> initialized after calling lazyInitialize(). By reviewing the code, we found 
> that at the time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() 
> has not been called yet, so NPE is thrown.
> Q: Why does this bug prevent the task from creating the Checkpoint?
> `SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
> member variable `enumerator` in `SourceCoordinator` being null. 
> Unfortunately, all Checkpoint-r

[jira] [Created] (FLINK-32754) Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE

2023-08-04 Thread Yu Chen (Jira)
Yu Chen created FLINK-32754:
---

 Summary: Using SplitEnumeratorContext.metricGroup() in 
restoreEnumerator causes NPE
 Key: FLINK-32754
 URL: https://issues.apache.org/jira/browse/FLINK-32754
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.17.1, 1.17.0
Reporter: Yu Chen
 Attachments: image-2023-08-04-18-28-05-897.png

We registered some metrics in the `enumerator` of the flip-27 source via 
`SplitEnumeratorContext.metricGroup()`, but found that the task prints NPE logs in the JM 
when restoring, suggesting that `SplitEnumeratorContext.metricGroup()` is null.
Meanwhile, the task does not experience a failover, and Checkpoints cannot be 
created successfully even after the task is in the RUNNING state.

We found that the implementation class of `SplitEnumerator` is 
`LazyInitializedCoordinatorContext`, however, the metricGroup() is initialized 
after calling lazyInitialize(). By reviewing the code, we found that at the 
time of SourceCoordinator.resetToCheckpoint(), lazyInitialize() has not been 
called yet, so NPE is thrown.


Q: Why does this bug prevent the task from creating the Checkpoint?
`SourceCoordinator.resetToCheckpoint()` throws an NPE which results in the 
member variable `enumerator` in `SourceCoordinator` being null. Unfortunately, 
all Checkpoint-related calls in `SourceCoordinator` are called via 
`runInEventLoop()`.
In `runInEventLoop()`, if the enumerator is null, it will return directly.

Q: Why doesn't this bug trigger a task failover?
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, the exception is 
caught and `cleanAndFailJob` is called to try to fail the job.
However, `globalFailureHandler` is also only initialized in `lazyInitialize()`, 
and the NPE triggered by `globalFailureHandler.handleGlobalFailure(e)` is 
swallowed inside `schedulerExecutor.execute`.
As a result, the task never fails over.
Thus it appears that the task did not failover.
!image-2023-08-04-18-28-05-897.png|width=2442,height=1123!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-32186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17726096#comment-17726096
 ] 

Yu Chen commented on FLINK-32186:
-

Hi [~yunta], could you help to assign this ticket to me? Thank you~ 

> Support subtask stack auto-search when redirecting from subtask backpressure 
> tab
> 
>
> Key: FLINK-32186
> URL: https://issues.apache.org/jira/browse/FLINK-32186
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.18.0
>Reporter: Yu Chen
>Priority: Not a Priority
> Attachments: image-2023-05-25-15-52-54-383.png, 
> image-2023-05-25-16-11-00-374.png
>
>
> Note that we have introduced a dump link on the backpressure page in 
> FLINK-29996(Figure 1), which helps to check what are the corresponding 
> subtask doing more easily.
> But we still have to search for the corresponding call stack of the 
> back-pressured subtask from the whole TaskManager thread dumps, it's not 
> convenient enough.
> Therefore, I would like to trigger the search for the editor automatically 
> after redirecting from the backpressure tab, which will help to scroll the 
> thread dumps to the corresponding call stack of the back-pressured subtask 
> (As shown in Figure 2).
> !image-2023-05-25-15-52-54-383.png|width=680,height=260!
> Figure 1. ThreadDump Link in Backpressure Tab
> !image-2023-05-25-16-11-00-374.png|width=680,height=353!
> Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32186:

Description: 
Note that we have introduced a thread-dump link on the backpressure page in 
FLINK-29996 (Figure 1), which makes it easier to check what the corresponding 
subtasks are doing.

However, we still have to search the whole TaskManager thread dump manually for 
the call stack of the back-pressured subtask, which is not convenient enough.

Therefore, I would like to trigger the editor's search automatically after 
redirecting from the backpressure tab, which scrolls the thread dump to the 
call stack of the back-pressured subtask (as shown in Figure 2).

!image-2023-05-25-15-52-54-383.png|width=680,height=260!
Figure 1. ThreadDump Link in Backpressure Tab

!image-2023-05-25-16-11-00-374.png|width=680,height=353!
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab

  was:
Note that we have introduced a dump link on the backpressure page in 
FLINK-29996(Figure 1), which helps to check what are the corresponding subtask 
doing more easily.

But we still have to search for the corresponding call stack of the 
back-pressured subtask from the whole TaskManager thread dumps, it's not 
convenient enough.

Therefore, I would like to trigger the search for the editor automatically 
after redirecting from the backpressure tab, which will help to scroll the 
thread dumps to the corresponding call stack of the back-pressured subtask (As 
shown in Figure 2).

!image-2023-05-25-15-52-54-383.png|width=680,height=260!
Figure 1. ThreadDump Link in Backpressure Tab

!image-2023-05-25-16-08-14-325.png|width=676,height=351! 
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab


> Support subtask stack auto-search when redirecting from subtask backpressure 
> tab
> 
>
> Key: FLINK-32186
> URL: https://issues.apache.org/jira/browse/FLINK-32186
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.18.0
>Reporter: Yu Chen
>Priority: Not a Priority
> Attachments: image-2023-05-25-15-52-54-383.png, 
> image-2023-05-25-16-11-00-374.png
>
>
> Note that we have introduced a dump link on the backpressure page in 
> FLINK-29996(Figure 1), which helps to check what are the corresponding 
> subtask doing more easily.
> But we still have to search for the corresponding call stack of the 
> back-pressured subtask from the whole TaskManager thread dumps, it's not 
> convenient enough.
> Therefore, I would like to trigger the search for the editor automatically 
> after redirecting from the backpressure tab, which will help to scroll the 
> thread dumps to the corresponding call stack of the back-pressured subtask 
> (As shown in Figure 2).
> !image-2023-05-25-15-52-54-383.png|width=680,height=260!
> Figure 1. ThreadDump Link in Backpressure Tab
> !image-2023-05-25-16-11-00-374.png|width=680,height=353!
> Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32186:

Attachment: image-2023-05-25-16-11-00-374.png

> Support subtask stack auto-search when redirecting from subtask backpressure 
> tab
> 
>
> Key: FLINK-32186
> URL: https://issues.apache.org/jira/browse/FLINK-32186
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.18.0
>Reporter: Yu Chen
>Priority: Not a Priority
> Attachments: image-2023-05-25-15-52-54-383.png, 
> image-2023-05-25-16-11-00-374.png
>
>
> Note that we have introduced a dump link on the backpressure page in 
> FLINK-29996(Figure 1), which helps to check what are the corresponding 
> subtask doing more easily.
> But we still have to search for the corresponding call stack of the 
> back-pressured subtask from the whole TaskManager thread dumps, it's not 
> convenient enough.
> Therefore, I would like to trigger the search for the editor automatically 
> after redirecting from the backpressure tab, which will help to scroll the 
> thread dumps to the corresponding call stack of the back-pressured subtask 
> (As shown in Figure 2).
> !image-2023-05-25-15-52-54-383.png|width=680,height=260!
> Figure 1. ThreadDump Link in Backpressure Tab
> !image-2023-05-25-16-08-14-325.png|width=676,height=351! 
> Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32186:

Attachment: (was: image-2023-05-25-16-08-14-325.png)

> Support subtask stack auto-search when redirecting from subtask backpressure 
> tab
> 
>
> Key: FLINK-32186
> URL: https://issues.apache.org/jira/browse/FLINK-32186
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.18.0
>Reporter: Yu Chen
>Priority: Not a Priority
> Attachments: image-2023-05-25-15-52-54-383.png, 
> image-2023-05-25-16-11-00-374.png
>
>
> Note that we have introduced a dump link on the backpressure page in 
> FLINK-29996(Figure 1), which helps to check what are the corresponding 
> subtask doing more easily.
> But we still have to search for the corresponding call stack of the 
> back-pressured subtask from the whole TaskManager thread dumps, it's not 
> convenient enough.
> Therefore, I would like to trigger the search for the editor automatically 
> after redirecting from the backpressure tab, which will help to scroll the 
> thread dumps to the corresponding call stack of the back-pressured subtask 
> (As shown in Figure 2).
> !image-2023-05-25-15-52-54-383.png|width=680,height=260!
> Figure 1. ThreadDump Link in Backpressure Tab
> !image-2023-05-25-16-11-00-374.png|width=680,height=353!
> Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-32186:

Description: 
Note that we have introduced a thread-dump link on the backpressure page in 
FLINK-29996 (Figure 1), which makes it easier to check what the corresponding 
subtasks are doing.

However, we still have to search the whole TaskManager thread dump manually for 
the call stack of the back-pressured subtask, which is not convenient enough.

Therefore, I would like to trigger the editor's search automatically after 
redirecting from the backpressure tab, which scrolls the thread dump to the 
call stack of the back-pressured subtask (as shown in Figure 2).

!image-2023-05-25-15-52-54-383.png|width=680,height=260!
Figure 1. ThreadDump Link in Backpressure Tab

!image-2023-05-25-16-08-14-325.png|width=676,height=351! 
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab

  was:
Note that we have introduced a dump link on the backpressure page in 
[FLINK-29996|https://issues.apache.org/jira/browse/FLINK-29996](Figure 1), 
which helps to check what are the corresponding subtask doing more easily.

But we still have to search for the corresponding call stack of the 
back-pressured subtask from the whole TaskManager thread dumps, it's not 
convenient enough.

Therefore, I would like to trigger the search for the editor automatically 
after redirecting from the backpressure tab, which will help to scroll the 
thread dumps to the corresponding call stack of the back-pressured subtask (As 
shown in Figure 2).   

!image-2023-05-25-15-52-54-383.png!
Figure 1. ThreadDump Link in Backpressure Tab

 !image-2023-05-25-16-08-14-325.png! 
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab


> Support subtask stack auto-search when redirecting from subtask backpressure 
> tab
> 
>
> Key: FLINK-32186
> URL: https://issues.apache.org/jira/browse/FLINK-32186
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Web Frontend
>Affects Versions: 1.18.0
>Reporter: Yu Chen
>Priority: Not a Priority
> Attachments: image-2023-05-25-15-52-54-383.png, 
> image-2023-05-25-16-08-14-325.png
>
>
> Note that we have introduced a dump link on the backpressure page in 
> FLINK-29996(Figure 1), which helps to check what are the corresponding 
> subtask doing more easily.
> But we still have to search for the corresponding call stack of the 
> back-pressured subtask from the whole TaskManager thread dumps, it's not 
> convenient enough.
> Therefore, I would like to trigger the search for the editor automatically 
> after redirecting from the backpressure tab, which will help to scroll the 
> thread dumps to the corresponding call stack of the back-pressured subtask 
> (As shown in Figure 2).
> !image-2023-05-25-15-52-54-383.png|width=680,height=260!
> Figure 1. ThreadDump Link in Backpressure Tab
> !image-2023-05-25-16-08-14-325.png|width=676,height=351! 
> Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)
Yu Chen created FLINK-32186:
---

 Summary: Support subtask stack auto-search when redirecting from 
subtask backpressure tab
 Key: FLINK-32186
 URL: https://issues.apache.org/jira/browse/FLINK-32186
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.18.0
Reporter: Yu Chen
 Attachments: image-2023-05-25-15-52-54-383.png, 
image-2023-05-25-16-08-14-325.png

Note that we have introduced a thread-dump link on the backpressure page in 
[FLINK-29996|https://issues.apache.org/jira/browse/FLINK-29996] (Figure 1), 
which makes it easier to check what the corresponding subtasks are doing.

However, we still have to search the whole TaskManager thread dump manually for 
the call stack of the back-pressured subtask, which is not convenient enough.

Therefore, I would like to trigger the editor's search automatically after 
redirecting from the backpressure tab, which scrolls the thread dump to the 
call stack of the back-pressured subtask (as shown in Figure 2).

!image-2023-05-25-15-52-54-383.png!
Figure 1. ThreadDump Link in Backpressure Tab

 !image-2023-05-25-16-08-14-325.png! 
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-29322) Expose savepoint format on Web UI

2022-11-07 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-29322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17630231#comment-17630231
 ] 

Yu Chen commented on FLINK-29322:
-

I have already implemented it; if no one objects, I can take this ticket.

> Expose savepoint format on Web UI
> -
>
> Key: FLINK-29322
> URL: https://issues.apache.org/jira/browse/FLINK-29322
> Project: Flink
>  Issue Type: New Feature
>  Components: Runtime / Web Frontend
>Reporter: Matyas Orhidi
>Assignee: Matyas Orhidi
>Priority: Major
> Fix For: 1.17.0
>
>
> Savepoint format is not exposed on the Web UI, thus users should remember how 
> they triggered it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28926) Release Testing: Verify flip-235 hybrid shuffle mode

2022-09-05 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17600256#comment-17600256
 ] 

Yu Chen commented on FLINK-28926:
-

Hi, all.

According to the docs of [Batch 
Shuffle|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle/], 
I tested the feature locally in a standalone cluster with 1 JM + 2 TMs (2 slots per TM) 
and submitted the {{WordCount}} example job to the cluster in batch mode with 
each of the following commands:
{code:sh}
./bin/flink run -Dexecution.batch-shuffle-mode=ALL_EXCHANGES_BLOCKING 
-Dparallelism.default=2 --detached examples/streaming/WordCount.jar --input 
/tmp/wordcount.txt --output /tmp/wordcount_res
./bin/flink run -Dexecution.batch-shuffle-mode=ALL_EXCHANGES_HYBRID_FULL 
-Dparallelism.default=2 --detached examples/streaming/WordCount.jar --input 
/tmp/wordcount.txt --output /tmp/wordcount_res
./bin/flink run -Dexecution.batch-shuffle-mode=ALL_EXCHANGES_HYBRID_SELECTIVE 
-Dparallelism.default=2 --detached examples/streaming/WordCount.jar --input 
/tmp/wordcount.txt --output /tmp/wordcount_res 
{code}
Note that {{/tmp/wordcount.txt}} contains {{10,000,000}} random words generated 
by {{RandomStringUtils.randomAlphabetic(10)}}.
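
For reference, a minimal sketch of how such an input file can be generated (assuming commons-lang3 is on the classpath; the path, one-word-per-line layout, and word count simply mirror the setup above):
{code:java}
import org.apache.commons.lang3.RandomStringUtils;

import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Writes 10,000,000 random 10-letter words, one per line, to /tmp/wordcount.txt.
public class WordCountInputGenerator {
    public static void main(String[] args) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(Paths.get("/tmp/wordcount.txt"))) {
            for (int i = 0; i < 10_000_000; i++) {
                writer.write(RandomStringUtils.randomAlphabetic(10));
                writer.newLine();
            }
        }
    }
}
{code}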

Through the Flink Web UI, I verified the following scenarios:
 # {{HYBRID SHUFFLE}} can utilize all four slots, while {{BLOCKING SHUFFLE}} uses 
only two slots, which is consistent with the description in the document.
 # By examining the timeline chart of the job, we can see that {{HYBRID 
SHUFFLE}} makes the upstream and downstream tasks in the {{WordCount}} job 
start simultaneously.
 # I also restarted the cluster with {{jobmanager.scheduler: AdaptiveBatch}} set 
in {{flink-conf.yaml}}. The jobs configured with {{ALL_EXCHANGES_HYBRID_FULL}} 
and {{ALL_EXCHANGES_HYBRID_SELECTIVE}} produced an error consistent with the 
documentation's description of the known limitations.

Overall, I have not found any problems with this feature; please feel free to 
contact me if any other cases need to be tested.

> Release Testing: Verify flip-235 hybrid shuffle mode
> 
>
> Key: FLINK-28926
> URL: https://issues.apache.org/jira/browse/FLINK-28926
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.16.0
>Reporter: Weijie Guo
>Assignee: Yu Chen
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> * Please refer to the release notes of FLINK-27862 for a list of changes that 
> need to be verified.
>  * Please refer to our document for more details: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle/]
>  * Hybrid shuffle has some known limitations: no support for Slot Sharing, 
> Adaptive Batch Scheduler, or Speculative Execution. Please make sure you do 
> not use these features in testing.
>  * The changes should be verified only in batch execution mode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28926) Release Testing: Verify flip-235 hybrid shuffle mode

2022-08-29 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17597496#comment-17597496
 ] 

Yu Chen commented on FLINK-28926:
-

Hi [~hxb] [~Weijie Guo], I have finished setting up the tests and the overall 
progress is about 30%. No problems have been found so far; if I find any, I 
will contact you in time. FYI.

> Release Testing: Verify flip-235 hybrid shuffle mode
> 
>
> Key: FLINK-28926
> URL: https://issues.apache.org/jira/browse/FLINK-28926
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.16.0
>Reporter: Weijie Guo
>Assignee: Yu Chen
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> * Please refer to the release notes of FLINK-27862 for a list of changes that 
> need to be verified.
>  * Please refer to our document for more details: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle/]
>  * Hybrid shuffle has some known limitations: no support for Slot Sharing, 
> Adaptive Batch Scheduler, or Speculative Execution. Please make sure you do 
> not use these features in testing.
>  * The changes should be verified only in batch execution mode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29107) Upgrade spotless version to improve spotless check efficiency

2022-08-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-29107:

Description: 
I noticed a [discussion|https://github.com/diffplug/spotless/issues/927] in the 
spotless GitHub repository suggesting that we can significantly improve the 
efficiency of spotless checks by upgrading the spotless version and enabling 
`upToDateChecking`.

I ran a simple test locally; the improvement in spotless check time after the 
upgrade is shown in the figure.
!image-2022-08-25-22-10-54-453.png!

  was:
Hi all, I noticed a 
[discussion|https://github.com/diffplug/spotless/issues/927] in the spotless 
GitHub repository that we can improve the efficiency of spotless checks 
significantly by upgrading the version of spotless and enabling the 
`upToDateChecking`.

I have made a simple test locally and the improvement of the spotless check 
after the upgrade is shown in the figure.
!image-2022-08-25-22-10-54-453.png!


> Upgrade spotless version to improve spotless check efficiency
> -
>
> Key: FLINK-29107
> URL: https://issues.apache.org/jira/browse/FLINK-29107
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.15.2
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2022-08-25-22-10-54-453.png
>
>
> I noticed a [discussion|https://github.com/diffplug/spotless/issues/927] in 
> the spotless GitHub repository that we can improve the efficiency of spotless 
> checks significantly by upgrading the version of spotless and enabling the 
> `upToDateChecking`.
> I have made a simple test locally and the improvement of the spotless check 
> after the upgrade is shown in the figure.
> !image-2022-08-25-22-10-54-453.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-29107) Upgrade spotless version to improve spotless check efficiency

2022-08-25 Thread Yu Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-29107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yu Chen updated FLINK-29107:

Summary: Upgrade spotless version to improve spotless check efficiency  
(was: Bump up spotless version to improve efficiently)

> Upgrade spotless version to improve spotless check efficiency
> -
>
> Key: FLINK-29107
> URL: https://issues.apache.org/jira/browse/FLINK-29107
> Project: Flink
>  Issue Type: Improvement
>  Components: Build System
>Affects Versions: 1.15.2
>Reporter: Yu Chen
>Priority: Major
> Attachments: image-2022-08-25-22-10-54-453.png
>
>
> Hi all, I noticed a 
> [discussion|https://github.com/diffplug/spotless/issues/927] in the spotless 
> GitHub repository that we can improve the efficiency of spotless checks 
> significantly by upgrading the version of spotless and enabling the 
> `upToDateChecking`.
> I have made a simple test locally and the improvement of the spotless check 
> after the upgrade is shown in the figure.
> !image-2022-08-25-22-10-54-453.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-29107) Bump up spotless version to improve efficiently

2022-08-25 Thread Yu Chen (Jira)
Yu Chen created FLINK-29107:
---

 Summary: Bump up spotless version to improve efficiently
 Key: FLINK-29107
 URL: https://issues.apache.org/jira/browse/FLINK-29107
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.15.2
Reporter: Yu Chen
 Attachments: image-2022-08-25-22-10-54-453.png

Hi all, I noticed a 
[discussion|https://github.com/diffplug/spotless/issues/927] in the spotless 
GitHub repository suggesting that we can significantly improve the efficiency 
of spotless checks by upgrading the spotless version and enabling 
`upToDateChecking`.

I ran a simple test locally; the improvement in spotless check time after the 
upgrade is shown in the figure.
!image-2022-08-25-22-10-54-453.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28926) Release Testing: Verify flip-235 hybrid shuffle mode

2022-08-14 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579497#comment-17579497
 ] 

Yu Chen commented on FLINK-28926:
-

Hi, I would like to take this release testing; please assign this ticket to me 
if you don't mind, thanks.

> Release Testing: Verify flip-235 hybrid shuffle mode
> 
>
> Key: FLINK-28926
> URL: https://issues.apache.org/jira/browse/FLINK-28926
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Network
>Affects Versions: 1.16.0
>Reporter: Weijie Guo
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.16.0
>
>
> * Please refer to the release notes of FLINK-27862 for a list of changes that 
> need to be verified.
>  * Please refer to our document for more details: 
> [https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/batch/batch_shuffle/]
>  * Hybrid shuffle has some known limitations: no support for Slot Sharing, 
> Adaptive Batch Scheduler, or Speculative Execution. Please make sure you do 
> not use these features in testing.
>  * The changes should be verified only in batch execution mode.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-28577) 1.15.1 web ui console report error about checkpoint size

2022-07-26 Thread Yu Chen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-28577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17571713#comment-17571713
 ] 

Yu Chen commented on FLINK-28577:
-

Well, I think I've found the root cause of this issue: it was mainly introduced 
by a mistaken modification of flink-runtime-web in 
[FLINK-25557|https://issues.apache.org/jira/browse/FLINK-25557].

I can propose a PR to resolve the problem.

cc: [~yunta] 

> 1.15.1 web ui console report error about checkpoint size
> 
>
> Key: FLINK-28577
> URL: https://issues.apache.org/jira/browse/FLINK-28577
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Web Frontend
>Affects Versions: 1.15.1
>Reporter: nobleyd
>Priority: Major
>
> 1.15.1
> 1 start-cluster
> 2 submit job: ./bin/flink run -d ./examples/streaming/TopSpeedWindowing.jar
> 3 trigger savepoint: ./bin/flink savepoint {jobId} ./sp0
> 4 open web ui for the job and switch to the checkpoint tab; nothing is shown.
> The Chrome console log shows an error:
> {{main.a7e97c2f60a2616e.js:1 ERROR TypeError: Cannot read properties of null 
> (reading 'checkpointed_size')
>     at q (253.e9e8f2b56b4981f5.js:1:607974)
>     at Sl (main.a7e97c2f60a2616e.js:1:186068)
>     at Br (main.a7e97c2f60a2616e.js:1:184696)
>     at N8 (main.a7e97c2f60a2616e.js:1:185128)
>     at Br (main.a7e97c2f60a2616e.js:1:185153)
>     at N8 (main.a7e97c2f60a2616e.js:1:185128)
>     at Br (main.a7e97c2f60a2616e.js:1:185153)
>     at N8 (main.a7e97c2f60a2616e.js:1:185128)
>     at Br (main.a7e97c2f60a2616e.js:1:185153)
>     at B8 (main.a7e97c2f60a2616e.js:1:191872)}}
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)