[jira] [Created] (FLINK-35832) IFNULL returns error result in Flink SQL

2024-07-14 Thread Yu Chen (Jira)
Yu Chen created FLINK-35832:
---

 Summary: IFNULL returns error result in Flink SQL
 Key: FLINK-35832
 URL: https://issues.apache.org/jira/browse/FLINK-35832
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 2.0.0
Reporter: Yu Chen


Run the following SQL in sql-client. The correct result should be '16', but we 
get '1' on master.
{code:java}
Flink SQL> SET 'sql-client.execution.result-mode' = 'tableau';
[INFO] Execute statement succeeded.

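Flink SQL> select JSON_VALUE('{"a":16}','$.a'), IFNULL(JSON_VALUE('{"a":16}','$.a'),'0');
+----+--------------------------------+--------------------------------+
| op |                         EXPR$0 |                         EXPR$1 |
+----+--------------------------------+--------------------------------+
| +I |                             16 |                              1 |
+----+--------------------------------+--------------------------------+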
Received a total of 1 row (0.30 seconds){code}
 

After some quick debugging, I suspect it is caused by 
[FLINK-24413|https://issues.apache.org/jira/browse/FLINK-24413], which was 
introduced in Flink 1.15.

I think the wrong result '1' is produced because the SQL simplification step 
assumes that the first and second arguments of IFNULL have the same type (the 
second argument, '0', is a CHAR literal) and therefore implicitly casts '16' to 
that CHAR type, presumably truncating it to '1'.
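
The suspected mechanism can be illustrated with a small Python model. This is 
only a sketch of the hypothesis above, not Flink planner code: the function 
names are made up for illustration, and it assumes that the literal '0' is 
typed as CHAR(1) and that a cast to CHAR(n) truncates longer strings.

```python
def cast_to_char(value: str, length: int) -> str:
    """Toy model of CAST(value AS CHAR(length)): truncate, then pad with spaces."""
    return value[:length].ljust(length)


def ifnull_expected(value, default):
    """IFNULL as users expect it: return the first argument unless it is NULL."""
    return default if value is None else value


def ifnull_with_narrowing_cast(value, default):
    """Hypothetical buggy variant: the first argument is cast to the type of
    the second one, modelled here as CHAR(len(default))."""
    if value is None:
        return default
    return cast_to_char(value, len(default))


# JSON_VALUE('{"a":16}', '$.a') yields the string '16'.
json_value_result = "16"

print(ifnull_expected(json_value_result, "0"))
print(ifnull_with_narrowing_cast(json_value_result, "0"))
```

Under these assumptions the first call keeps '16' while the second one returns 
'1', which matches the results in the version table.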

 

I have tested the SQL in the following versions:

 
||Flink Version||Result||
|1.13|16,16|
|1.17|16,1|
|1.19|16,1|
|master|16,1|

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-34968) Update flink-web copyright to 2024

2024-03-30 Thread Yu Chen (Jira)
Yu Chen created FLINK-34968:
---

 Summary: Update flink-web copyright to 2024
 Key: FLINK-34968
 URL: https://issues.apache.org/jira/browse/FLINK-34968
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Yu Chen








[jira] [Created] (FLINK-34622) Typo of execution_mode configuration name in Chinese document

2024-03-07 Thread Yu Chen (Jira)
Yu Chen created FLINK-34622:
---

 Summary: Typo of execution_mode configuration name in Chinese 
document
 Key: FLINK-34622
 URL: https://issues.apache.org/jira/browse/FLINK-34622
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Yu Chen








[jira] [Created] (FLINK-34099) CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog is unstable on AZP

2024-01-15 Thread Yu Chen (Jira)
Yu Chen created FLINK-34099:
---

 Summary: 
CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog is unstable 
on AZP
 Key: FLINK-34099
 URL: https://issues.apache.org/jira/browse/FLINK-34099
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.19.0
Reporter: Yu Chen


This build [Pipelines - Run 20240115.30 logs 
(azure.com)|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=56403=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba]
 fails with:
{code:java}
Jan 15 18:29:51 18:29:51.938 [ERROR] org.apache.flink.test.checkpointing.CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog -- Time elapsed: 2.022 s <<< FAILURE!
Jan 15 18:29:51 org.opentest4j.AssertionFailedError: 
Jan 15 18:29:51 
Jan 15 18:29:51 expected: 0
Jan 15 18:29:51  but was: 1
Jan 15 18:29:51 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
Jan 15 18:29:51 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
Jan 15 18:29:51 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Jan 15 18:29:51 at org.apache.flink.test.checkpointing.CheckpointIntervalDuringBacklogITCase.testNoCheckpointDuringBacklog(CheckpointIntervalDuringBacklogITCase.java:141)
Jan 15 18:29:51 at java.lang.reflect.Method.invoke(Method.java:498)
 {code}





[jira] [Created] (FLINK-34029) Support different profiling mode on Flink WEB

2024-01-08 Thread Yu Chen (Jira)
Yu Chen created FLINK-34029:
---

 Summary: Support different profiling mode on Flink WEB
 Key: FLINK-34029
 URL: https://issues.apache.org/jira/browse/FLINK-34029
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen








[jira] [Created] (FLINK-33613) Python UDF Runner process leak in Process Mode

2023-11-21 Thread Yu Chen (Jira)
Yu Chen created FLINK-33613:
---

 Summary: Python UDF Runner process leak in Process Mode
 Key: FLINK-33613
 URL: https://issues.apache.org/jira/browse/FLINK-33613
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.17.0
Reporter: Yu Chen
 Attachments: ps-ef.txt, streaming_word_count-1.py

While working with PyFlink, we found that in Process Mode the Python UDF 
process may leak after a failover of the job. The number of processes and 
threads on the host machine keeps rising, which eventually results in failure 
to create new threads.

 

You can try to reproduce it with the attached test task 
`streaming_word_count-1.py`.

(Note that the job will keep failing over, and you can watch the processes leak 
with `ps -ef` on the TaskManager.)

 

Our test environment:
 * K8S Application Mode
 * 4 Taskmanagers with 12 slots/TM
 * Job's parallelism was set to 48 

The number of UDF processes `pyflink.fn_execution.beam.beam_boot` should be 
consistent with the parallelism (48), but we found 180 processes after several 
failovers.
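
To track the leak over time, the `ps -ef` output can be counted with a short 
script. The following is a minimal sketch, not part of the attached files: the 
helper names are hypothetical, the sample lines are invented for illustration, 
and the expected count is whatever the parallelism on the inspected host 
implies.

```python
import subprocess

# Command-line fragment that identifies a Python UDF runner process.
UDF_MARKER = "pyflink.fn_execution.beam.beam_boot"


def count_udf_processes(ps_output: str, marker: str = UDF_MARKER) -> int:
    """Count the lines of `ps -ef` output whose command contains the marker."""
    return sum(1 for line in ps_output.splitlines() if marker in line)


def leaked_processes(ps_output: str, expected: int) -> int:
    """Number of UDF runner processes beyond the expected count (never negative)."""
    return max(0, count_udf_processes(ps_output) - expected)


def read_ps_output() -> str:
    """Run `ps -ef` on the current host and return its output as text."""
    return subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    ).stdout


# Invented sample output: three UDF runners and one unrelated process.
SAMPLE_PS_OUTPUT = """\
flink  101  1  0 10:00 ?  00:00:01 python -m pyflink.fn_execution.beam.beam_boot --id=1
flink  102  1  0 10:00 ?  00:00:01 python -m pyflink.fn_execution.beam.beam_boot --id=2
flink  103  1  0 10:05 ?  00:00:01 python -m pyflink.fn_execution.beam.beam_boot --id=3
flink  200  1  0 10:00 ?  00:01:10 java org.apache.flink.SomeTaskManagerMain
"""

print(count_udf_processes(SAMPLE_PS_OUTPUT))
print(leaked_processes(SAMPLE_PS_OUTPUT, expected=2))
```

On a TaskManager, `leaked_processes(read_ps_output(), expected=...)` can be 
called after each failover; a value that keeps growing indicates the leak.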





[jira] [Created] (FLINK-33474) ShowPlan throws undefined exception In Flink Web Submit Page

2023-11-06 Thread Yu Chen (Jira)
Yu Chen created FLINK-33474:
---

 Summary: ShowPlan throws undefined exception In Flink Web Submit 
Page
 Key: FLINK-33474
 URL: https://issues.apache.org/jira/browse/FLINK-33474
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen
 Attachments: image-2023-11-07-13-53-08-216.png

The exception is shown in the figure below; in addition, the job plan cannot be 
displayed properly.

The root cause is that the dagreComponent is located in the nz-drawer and is 
only loaded when the drawer is visible, so we need to wait for the drawer to 
finish loading before rendering the job plan.

!image-2023-11-07-13-53-08-216.png|width=400,height=190!





[jira] [Created] (FLINK-33436) Documentation on the built-in Profiler

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33436:
---

 Summary: Documentation on the built-in Profiler
 Key: FLINK-33436
 URL: https://issues.apache.org/jira/browse/FLINK-33436
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation
Affects Versions: 1.19.0
Reporter: Yu Chen








[jira] [Created] (FLINK-33435) The visualization and download capabilities of profiling history

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33435:
---

 Summary: The visualization and download capabilities of profiling 
history 
 Key: FLINK-33435
 URL: https://issues.apache.org/jira/browse/FLINK-33435
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen








[jira] [Created] (FLINK-33434) Support invoke async-profiler on Taskmanager through REST API

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33434:
---

 Summary: Support invoke async-profiler on Taskmanager through REST 
API
 Key: FLINK-33434
 URL: https://issues.apache.org/jira/browse/FLINK-33434
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Affects Versions: 1.19.0
Reporter: Yu Chen








[jira] [Created] (FLINK-33433) Support invoke async-profiler on Jobmanager through REST API

2023-11-02 Thread Yu Chen (Jira)
Yu Chen created FLINK-33433:
---

 Summary: Support invoke async-profiler on Jobmanager through REST 
API
 Key: FLINK-33433
 URL: https://issues.apache.org/jira/browse/FLINK-33433
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / REST
Affects Versions: 1.19.0
Reporter: Yu Chen








[jira] [Created] (FLINK-33325) FLIP-375: Built-in cross-platform powerful java profiler

2023-10-20 Thread Yu Chen (Jira)
Yu Chen created FLINK-33325:
---

 Summary: FLIP-375: Built-in cross-platform powerful java profiler
 Key: FLINK-33325
 URL: https://issues.apache.org/jira/browse/FLINK-33325
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / REST, Runtime / Web Frontend
Affects Versions: 1.19.0
Reporter: Yu Chen


This is the umbrella JIRA for 
[FLIP-375|https://cwiki.apache.org/confluence/x/64lEE].





[jira] [Created] (FLINK-33230) Support Expanding ExecutionGraph to StreamGraph in Flink

2023-10-10 Thread Yu Chen (Jira)
Yu Chen created FLINK-33230:
---

 Summary: Support Expanding ExecutionGraph to StreamGraph in Flink
 Key: FLINK-33230
 URL: https://issues.apache.org/jira/browse/FLINK-33230
 Project: Flink
  Issue Type: Improvement
Reporter: Yu Chen








[jira] [Created] (FLINK-32754) Using SplitEnumeratorContext.metricGroup() in restoreEnumerator causes NPE

2023-08-04 Thread Yu Chen (Jira)
Yu Chen created FLINK-32754:
---

 Summary: Using SplitEnumeratorContext.metricGroup() in 
restoreEnumerator causes NPE
 Key: FLINK-32754
 URL: https://issues.apache.org/jira/browse/FLINK-32754
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Checkpointing
Affects Versions: 1.17.1, 1.17.0
Reporter: Yu Chen
 Attachments: image-2023-08-04-18-28-05-897.png

We registered some metrics in the enumerator of a FLIP-27 source via 
`SplitEnumeratorContext.metricGroup()`, but found that the task prints NPE logs 
in the JM when restoring, suggesting that 
`SplitEnumeratorContext.metricGroup()` returns null.
Meanwhile, the task does not fail over, and checkpoints cannot be created 
successfully even after the task is in the running state.

We found that the implementation class of `SplitEnumeratorContext` is 
`LazyInitializedCoordinatorContext`, whose metric group is only initialized 
once `lazyInitialize()` has been called. By reviewing the code, we found that 
`lazyInitialize()` has not yet been called at the time of 
`SourceCoordinator.resetToCheckpoint()`, so an NPE is thrown.


Q: Why does this bug prevent the task from creating checkpoints?
`SourceCoordinator.resetToCheckpoint()` throws an NPE, which leaves the member 
variable `enumerator` in `SourceCoordinator` null. Unfortunately, all 
checkpoint-related calls in `SourceCoordinator` go through `runInEventLoop()`, 
and `runInEventLoop()` returns immediately if the enumerator is null.

Q: Why doesn't this bug trigger a task failover?
In `RecreateOnResetOperatorCoordinator.resetAndStart()`, if 
`internalCoordinator.resetToCheckpoint` throws an exception, the exception is 
caught and `cleanAndFailJob` is called to try to fail the job.
However, `globalFailureHandler` is also initialized in `lazyInitialize()`, and 
`schedulerExecutor.execute` ignores the NPE triggered by 
`globalFailureHandler.handleGlobalFailure(e)`.
Thus the task appears not to fail over.
!image-2023-08-04-18-28-05-897.png|width=2442,height=1123!
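
The chain of events described above can be reproduced in a small Python model. 
This is only a sketch of the reasoning, not Flink code: all class and method 
names are simplified stand-ins, and Python's `TypeError` on `None` plays the 
role of the Java NPE.

```python
class LazyContext:
    """Stand-in for a lazily initialized coordinator context."""

    def __init__(self):
        self._metric_group = None      # set only by lazy_initialize()
        self._failure_handler = None   # set only by lazy_initialize()

    def lazy_initialize(self):
        self._metric_group = {}
        self._failure_handler = lambda error: print("failing job:", error)

    def metric_group(self):
        return self._metric_group

    def fail_job(self, error):
        # Raises TypeError if lazy_initialize() has not been called yet.
        self._failure_handler(error)


class Coordinator:
    """Stand-in for a source coordinator that restores an enumerator."""

    def __init__(self, context):
        self.context = context
        self.enumerator = None
        self.checkpoints = []

    def reset_to_checkpoint(self):
        # Restoring the enumerator registers a metric first; this raises
        # TypeError while the metric group is still None.
        self.context.metric_group()["restored"] = 1
        self.enumerator = object()

    def run_in_event_loop(self, action):
        if self.enumerator is None:
            return False  # the call is silently dropped
        action()
        return True

    def checkpoint(self, checkpoint_id):
        return self.run_in_event_loop(
            lambda: self.checkpoints.append(checkpoint_id)
        )


def reset_and_start(coordinator):
    """A failed restore tries to fail the job, but that attempt is lost as well."""
    try:
        coordinator.reset_to_checkpoint()
    except TypeError as error:
        try:
            coordinator.context.fail_job(error)
        except TypeError:
            pass  # stands in for the executor ignoring the second failure


# Restore before lazy initialization: no enumerator, no checkpoint, no failover.
early = Coordinator(LazyContext())
reset_and_start(early)
print(early.enumerator is None, early.checkpoint(1))
```

In this model, calling `lazy_initialize()` on the context before 
`reset_and_start()` makes both the restore and the later checkpoint succeed.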





[jira] [Created] (FLINK-32186) Support subtask stack auto-search when redirecting from subtask backpressure tab

2023-05-25 Thread Yu Chen (Jira)
Yu Chen created FLINK-32186:
---

 Summary: Support subtask stack auto-search when redirecting from 
subtask backpressure tab
 Key: FLINK-32186
 URL: https://issues.apache.org/jira/browse/FLINK-32186
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.18.0
Reporter: Yu Chen
 Attachments: image-2023-05-25-15-52-54-383.png, 
image-2023-05-25-16-08-14-325.png

We introduced a dump link on the backpressure page in 
[FLINK-29996|https://issues.apache.org/jira/browse/FLINK-29996] (Figure 1), 
which makes it easier to check what the corresponding subtask is doing.

However, we still have to search the whole TaskManager thread dump for the call 
stack of the back-pressured subtask, which is not convenient enough.

Therefore, I would like to trigger the editor's search automatically after 
redirecting from the backpressure tab, so that the thread dump scrolls to the 
call stack of the back-pressured subtask (as shown in Figure 2).

!image-2023-05-25-15-52-54-383.png!
Figure 1. ThreadDump Link in Backpressure Tab

 !image-2023-05-25-16-08-14-325.png! 
Figure 2. Trigger Auto-search after Redirecting from Backpressure Tab





[jira] [Created] (FLINK-29107) Bump up spotless version to improve efficiently

2022-08-25 Thread Yu Chen (Jira)
Yu Chen created FLINK-29107:
---

 Summary: Bump up spotless version to improve efficiently
 Key: FLINK-29107
 URL: https://issues.apache.org/jira/browse/FLINK-29107
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.15.2
Reporter: Yu Chen
 Attachments: image-2022-08-25-22-10-54-453.png

Hi all, I noticed a 
[discussion|https://github.com/diffplug/spotless/issues/927] in the spotless 
GitHub repository suggesting that we can significantly improve the efficiency 
of spotless checks by upgrading spotless and enabling `upToDateChecking`.

I ran a simple test locally; the improvement in the spotless check after the 
upgrade is shown in the figure.
!image-2022-08-25-22-10-54-453.png!


