[jira] [Updated] (FLINK-35353) Translate "Profiler" page into Chinese

2024-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35353:
---
Labels: pull-request-available  (was: )

> Translate  "Profiler" page into Chinese
> ---
>
> Key: FLINK-35353
> URL: https://issues.apache.org/jira/browse/FLINK-35353
> Project: Flink
>  Issue Type: Improvement
>  Components: chinese-translation, Documentation
>Affects Versions: 1.19.0
>Reporter: Juan Zifeng
>Assignee: Juan Zifeng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.19.0
>
>
> The links are 
> https://nightlies.apache.org/flink/flink-docs-release-1.19/zh/docs/ops/debugging/profiler/





Re: [PR] [Module 1] Lab1 and Lab2 done [flink-training]

2024-05-15 Thread via GitHub


fhtrancoso commented on PR #83:
URL: https://github.com/apache/flink-training/pull/83#issuecomment-2112812806

   Wrong base branch.





Re: [PR] [Module 1] Lab1 and Lab2 done [flink-training]

2024-05-15 Thread via GitHub


fhtrancoso closed pull request #83: [Module 1] Lab1 and Lab2 done
URL: https://github.com/apache/flink-training/pull/83





Re: [PR] [BP-1.19][FLINK-35358][clients] Reintroduce recursive JAR listing in classpath load from "usrlib" [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24792:
URL: https://github.com/apache/flink/pull/24792#issuecomment-2112771239

   
   ## CI report:
   
   * 2350fd3bacf5615be7997ad8fdc427f0048e79dd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Comment Edited] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Ferenc Csaky (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846647#comment-17846647
 ] 

Ferenc Csaky edited comment on FLINK-35358 at 5/15/24 2:51 PM:
---

This change was unintentional. Restricting JARs to only be placed at the root level 
of {{usrlib}} seems an unnecessary limitation. I think your analysis is on point, 
but AFAIK, whatever we do, this can only be fixed in a Flink patch version; 
[~martijnvisser], correct me if I am wrong.

The fix is easy; I can open a PR with the fix and add unit tests to cover 
this case by EOD today, so feel free to assign this to me.


was (Author: ferenc-csaky):
This change was unintentional. Restricting JARs to only be put 1 level under 
{{usrlib}} seems an unnecessary boundary. I think your analysis is on point, 
but AFAIK whatever we do this only can be fixed in a Flink patch version, but 
[~martijnvisser] correct me if I am wrong.

The fix is easy, I can open a PR with the fix and added unit tests to cover 
this case by EOD today. So feel free to assign this to me.

> Breaking change when loading artifacts
> --
>
> Key: FLINK-35358
> URL: https://issues.apache.org/jira/browse/FLINK-35358
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, flink-docker
>Affects Versions: 1.19.0
>Reporter: Rasmus Thygesen
>Priority: Not a Priority
>  Labels: pull-request-available
>
> We have been using the following code snippet in our Dockerfiles for running 
> a Flink job in application mode
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar 
> /opt/flink/usrlib/artifacts/my-job.jar
> USER flink {code}
>  
> This has been working since at least around Flink 1.14, but the 1.19 update 
> has broken our Dockerfiles. The fix is to move the jar file one level up, so 
> the code snippet becomes
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar
> USER flink  {code}
>  
> We have not spent too much time looking into what the cause is, but we get 
> the stack trace
>  
> {code:java}
> myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load 
> the provided entrypoint class.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
>  [flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   | Caused by: 
> org.apache.flink.client.program.ProgramInvocationException: The program's 
> entry point class 'my.company.job.MyJob' was not found in the jar file.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     ... 4 more
> myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
> my.company.job.MyJob
> myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown 
> Source) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> 

Re: [PR] [FLINK-35358][clients] Reintroduce recursive JAR listing in classpath load from "usrlib" [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24791:
URL: https://github.com/apache/flink/pull/24791#issuecomment-2112758785

   
   ## CI report:
   
   * 4cd4046d28fecc8e65408c87fd17ac4929b42138 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[PR] [BP-1.19][FLINK-35358][clients] Reintroduce recursive JAR listing in classpath load from "usrlib" [flink]

2024-05-15 Thread via GitHub


ferenc-csaky opened a new pull request, #24792:
URL: https://github.com/apache/flink/pull/24792

   1.19 backport of https://github.com/apache/flink/pull/24791.





[jira] [Updated] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35358:
---
Labels: pull-request-available  (was: )

> Breaking change when loading artifacts
> --
>
> Key: FLINK-35358
> URL: https://issues.apache.org/jira/browse/FLINK-35358
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, flink-docker
>Affects Versions: 1.19.0
>Reporter: Rasmus Thygesen
>Priority: Not a Priority
>  Labels: pull-request-available
>
> We have been using the following code snippet in our Dockerfiles for running 
> a Flink job in application mode
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar 
> /opt/flink/usrlib/artifacts/my-job.jar
> USER flink {code}
>  
> This has been working since at least around Flink 1.14, but the 1.19 update 
> has broken our Dockerfiles. The fix is to move the jar file one level up, so 
> the code snippet becomes
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar
> USER flink  {code}
>  
> We have not spent too much time looking into what the cause is, but we get 
> the stack trace
>  
> {code:java}
> myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load 
> the provided entrypoint class.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
>  [flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   | Caused by: 
> org.apache.flink.client.program.ProgramInvocationException: The program's 
> entry point class 'my.company.job.MyJob' was not found in the jar file.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     ... 4 more
> myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
> my.company.job.MyJob
> myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown 
> Source) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> 

[PR] [FLINK-35358][clients] Reintroduce recursive JAR listing in classpath load from "usrlib" [flink]

2024-05-15 Thread via GitHub


ferenc-csaky opened a new pull request, #24791:
URL: https://github.com/apache/flink/pull/24791

   ## What is the purpose of the change
   
   With 
https://github.com/apache/flink/commit/e63aa12252843d0098a56f3091b28d48aff5b5af 
in Flink 1.19, an unintentional change was introduced: the 
`DefaultPackagedProgramRetriever` now only loads JARs from the root level of the 
`usrlib` directory.
   
   This change restores the previous behavior of searching directories 
recursively, and adds a test case to cover that scenario.
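
   For context, a minimal sketch (not the actual Flink code) of why `Files#walk` 
picks up JARs in nested directories such as `usrlib/artifacts/`, while `Files#list` 
only sees the direct children of `usrlib`:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class UsrlibJarScan {
    // Files.walk visits the whole directory tree, so usrlib/artifacts/my-job.jar
    // is found; Files.list would stop at the direct children of usrlib.
    static List<Path> listJarsRecursively(Path usrlib) throws IOException {
        try (Stream<Path> files = Files.walk(usrlib)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .collect(Collectors.toList());
        }
    }
}
```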
   
   
   ## Brief change log
   
 - `Files#list` -> `Files#walk` in `DefaultPackagedProgramRetriever`.
 - Add test case in `DefaultPackagedProgramRetrieverITCase`.
   
   
   ## Verifying this change
   
   Added relevant IT case.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: yes
 - The S3 file system connector: no





[PR] [Module 1] Lab1 and Lab2 done [flink-training]

2024-05-15 Thread via GitHub


fhtrancoso opened a new pull request, #83:
URL: https://github.com/apache/flink-training/pull/83

   - Implemented ride-cleaning: keep only the rides that both started and ended 
within the NYC area;
   - rides-and-fares: TaxiRide and TaxiFare records may arrive in either order, so 
we keep state for each side; whenever a pair is matched, the stored state is 
cleared (see the sketch below).
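
   A hedged sketch of the rides-and-fares pattern described above (not the submitted 
solution; it assumes the training project's TaxiRide/TaxiFare types, keyed by rideId):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction;
import org.apache.flink.util.Collector;

// TaxiRide and TaxiFare are the training project's data types (assumed on the classpath).
public class EnrichmentSketch
        extends KeyedCoProcessFunction<Long, TaxiRide, TaxiFare, Tuple2<TaxiRide, TaxiFare>> {

    private ValueState<TaxiRide> rideState;
    private ValueState<TaxiFare> fareState;

    @Override
    public void open(Configuration parameters) {
        rideState = getRuntimeContext()
                .getState(new ValueStateDescriptor<>("saved ride", TaxiRide.class));
        fareState = getRuntimeContext()
                .getState(new ValueStateDescriptor<>("saved fare", TaxiFare.class));
    }

    @Override
    public void processElement1(TaxiRide ride, Context ctx,
                                Collector<Tuple2<TaxiRide, TaxiFare>> out) throws Exception {
        TaxiFare fare = fareState.value();
        if (fare != null) {
            fareState.clear();                  // pair matched: clear the stored state
            out.collect(Tuple2.of(ride, fare));
        } else {
            rideState.update(ride);             // remember the ride until its fare arrives
        }
    }

    @Override
    public void processElement2(TaxiFare fare, Context ctx,
                                Collector<Tuple2<TaxiRide, TaxiFare>> out) throws Exception {
        TaxiRide ride = rideState.value();
        if (ride != null) {
            rideState.clear();                  // pair matched: clear the stored state
            out.collect(Tuple2.of(ride, fare));
        } else {
            fareState.update(fare);             // remember the fare until its ride arrives
        }
    }
}
```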





[jira] [Commented] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Ferenc Csaky (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846647#comment-17846647
 ] 

Ferenc Csaky commented on FLINK-35358:
--

This change was unintentional. Restricting JARs to only be placed one level under 
{{usrlib}} seems an unnecessary limitation. I think your analysis is on point, 
but AFAIK, whatever we do, this can only be fixed in a Flink patch version; 
[~martijnvisser], correct me if I am wrong.

The fix is easy; I can open a PR with the fix and add unit tests to cover 
this case by EOD today, so feel free to assign this to me.

> Breaking change when loading artifacts
> --
>
> Key: FLINK-35358
> URL: https://issues.apache.org/jira/browse/FLINK-35358
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, flink-docker
>Affects Versions: 1.19.0
>Reporter: Rasmus Thygesen
>Priority: Not a Priority
>
> We have been using the following code snippet in our Dockerfiles for running 
> a Flink job in application mode
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar 
> /opt/flink/usrlib/artifacts/my-job.jar
> USER flink {code}
>  
> This has been working since at least around Flink 1.14, but the 1.19 update 
> has broken our Dockerfiles. The fix is to move the jar file one level up, so 
> the code snippet becomes
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar
> USER flink  {code}
>  
> We have not spent too much time looking into what the cause is, but we get 
> the stack trace
>  
> {code:java}
> myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load 
> the provided entrypoint class.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
>  [flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   | Caused by: 
> org.apache.flink.client.program.ProgramInvocationException: The program's 
> entry point class 'my.company.job.MyJob' was not found in the jar file.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     ... 4 more
> myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
> my.company.job.MyJob
> myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown 
> Source) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)

Re: [PR] [FLINK-35361]Delete Flinkhistory files that failed to write to the lo… [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24790:
URL: https://github.com/apache/flink/pull/24790#issuecomment-2112623650

   
   ## CI report:
   
   * f805504f4d19cd1b3b14824aa17abfebd8084c3b UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[jira] [Updated] (FLINK-35361) Delete Flinkhistory files that failed to write to the local directory

2024-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-35361:
---
Labels: pull-request-available  (was: )

> Delete Flinkhistory files that failed to write to the local directory
> -
>
> Key: FLINK-35361
> URL: https://issues.apache.org/jira/browse/FLINK-35361
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0, 1.18.0, 1.19.0, 1.20.0
>Reporter: dengxaing
>Priority: Major
>  Labels: pull-request-available
> Attachments: image-2024-05-15-21-15-54-973.png
>
>
> I found a bug in 
> org.apache.flink.runtime.webmonitor.history.HistoryServerStaticFileServerHandler
> !image-2024-05-15-21-15-54-973.png!
>  
> When the local directory is full, the above code will create an empty or 
> incomplete file locally. At this point, the Flink History web UI page cannot 
> be opened, or it displays abnormally.
> However, after the local directory is expanded, the Flink History web UI page 
> will not return to normal, because the files will not be regenerated.





[PR] [FLINK-35361]Delete Flinkhistory files that failed to write to the lo… [flink]

2024-05-15 Thread via GitHub


d20162416 opened a new pull request, #24790:
URL: https://github.com/apache/flink/pull/24790

   ## What is the purpose of the change
   
   *This pull request adds code so that when the Flink HistoryServer fails to write 
a file locally, or writes it only partially, the local file is deleted. This avoids 
the Flink History web UI failing to open or displaying abnormally.*
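
   As an illustration of the idea only (a generic sketch, not the actual change to 
`HistoryServerStaticFileServerHandler`): when writing the local copy fails, the 
empty or partial file is removed so a later request can fetch it again instead of 
being served a half-written file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class LocalCacheWriteSketch {
    // Copy the archived file into the local cache; if the write fails
    // (e.g. the disk is full), delete whatever was written so a retry
    // does not find an empty or truncated file.
    static void copyOrCleanUp(Path source, Path target) throws IOException {
        try {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            Files.deleteIfExists(target);
            throw e;
        }
    }
}
```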
   
   
   ## Brief change log
   
 - *Delete the local file when the write failed or the file is incomplete.*
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   





[jira] [Commented] (FLINK-35289) Incorrect timestamp of stream elements collected from onTimer in batch mode

2024-05-15 Thread Jeyhun Karimov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846631#comment-17846631
 ] 

Jeyhun Karimov commented on FLINK-35289:


Hi [~ekanthi], could you please provide a concrete example? According to your 
explanation, if

 
{code:java}
ctx.timerService().registerProcessingTimeTimer(Long.MAX_VALUE); {code}
and
{code:java}
@Override
public void onTimer(long timestamp, OnTimerContext ctx, Collector<Integer> out)
        throws Exception {
    out.collect(111);
} {code}
then the collected value (111) would have the Long.MAX_VALUE timestamp.

However, when we check the implementation of TimestampedCollector#collect, we 
can see that this is not the case. The stream record will preserve its existing 
timestamp. 

 

Am I missing something?

 

Thanks!
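
As a side note, a small hedged sketch (not from the ticket) of one way to observe the 
timestamp that downstream operators actually see on the collected elements:
{code:java}
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public final class TimestampProbe {
    /** Logs the timestamp attached to each element of the given stream. */
    public static SingleOutputStreamOperator<Integer> probe(DataStream<Integer> in) {
        return in.process(
                new ProcessFunction<Integer, Integer>() {
                    @Override
                    public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                        // ctx.timestamp() is the timestamp the record carries downstream.
                        System.out.println("value " + value + " carries timestamp " + ctx.timestamp());
                        out.collect(value);
                    }
                });
    }
}
{code}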

> Incorrect timestamp of stream elements collected from onTimer in batch mode
> ---
>
> Key: FLINK-35289
> URL: https://issues.apache.org/jira/browse/FLINK-35289
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.18.1
>Reporter: Kanthi Vaidya
>Priority: Major
>
> In batch mode all registered timers will fire at the _end of time_. Given 
> this, if a user registers a timer for Long.MAX_VALUE, the timestamp assigned 
> to the elements that are collected from the onTimer context ends up being 
> Long.MAX_VALUE. Ideally this should be the time when the batch actually 
> executed the onTimer function.





[jira] [Updated] (FLINK-35361) Delete Flinkhistory files that failed to write to the local directory

2024-05-15 Thread dengxaing (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dengxaing updated FLINK-35361:
--
Affects Version/s: 1.19.0
   1.18.0
   1.17.0
   1.20.0

> Delete Flinkhistory files that failed to write to the local directory
> -
>
> Key: FLINK-35361
> URL: https://issues.apache.org/jira/browse/FLINK-35361
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.17.0, 1.18.0, 1.19.0, 1.20.0
>Reporter: dengxaing
>Priority: Major
> Attachments: image-2024-05-15-21-15-54-973.png
>
>
> I found a bug in 
> org.apache.flink.runtime.webmonitor.history.HistoryServerStaticFileServerHandler
> !image-2024-05-15-21-15-54-973.png!
>  
> When the local directory is full, the above code will create an empty or 
> incomplete file locally. At this point, the Flink History web UI page cannot 
> be opened, or it displays abnormally.
> However, after the local directory is expanded, the Flink History web UI page 
> will not return to normal, because the files will not be regenerated.





[jira] [Created] (FLINK-35361) Delete Flinkhistory files that failed to write to the local directory

2024-05-15 Thread dengxaing (Jira)
dengxaing created FLINK-35361:
-

 Summary: Delete Flinkhistory files that failed to write to the 
local directory
 Key: FLINK-35361
 URL: https://issues.apache.org/jira/browse/FLINK-35361
 Project: Flink
  Issue Type: Bug
Reporter: dengxaing
 Attachments: image-2024-05-15-21-15-54-973.png

I found a bug in 
org.apache.flink.runtime.webmonitor.history.HistoryServerStaticFileServerHandler

!image-2024-05-15-21-15-54-973.png!

 

When the local directory is full, the above code will create an empty or 
incomplete file locally. At this point, the Flink History web UI page cannot be 
opened, or it displays abnormally.

However, after the local directory is expanded, the Flink History web UI page 
will not return to normal, because the files will not be regenerated.





[jira] [Commented] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-15 Thread Shuai Xu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846626#comment-17846626
 ] 

Shuai Xu commented on FLINK-34380:
--

Hi [~rovboyko], sorry for the late reply.
Regarding the incorrect order of output records: the minibatch optimization is 
designed to guarantee final consistency, and the fix you mentioned was already 
considered when the PR was reviewed. Flink is a distributed real-time processing 
system. The order of the output can be guaranteed on a single node by using a 
LinkedHashMap, but it cannot be guaranteed when the join operator runs on 
multiple nodes. So I think it makes little sense to preserve the order here.

For the RowKind, it was also reviewed. As you mentioned, it is a common problem 
of the MiniBatch functionality. It does not influence the final result, so from a 
cost/benefit perspective this problem is tolerable.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, 
> Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, 
> Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = 
> t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> +----+-------------+-----------------+-------------+---------+
> | op |           a |               b |          a0 |      b0 |
> +----+-------------+-----------------+-------------+---------+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> +----+-------------+-----------------+-------------+---------+{code}





Re: [PR] Update postgres-cdc.md [flink-cdc]

2024-05-15 Thread via GitHub


911432 closed pull request #3321: Update postgres-cdc.md
URL: https://github.com/apache/flink-cdc/pull/3321





[jira] [Comment Edited] (FLINK-20392) Migrating bash e2e tests to Java/Docker

2024-05-15 Thread Muhammet Orazov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846609#comment-17846609
 ] 

Muhammet Orazov edited comment on FLINK-20392 at 5/15/24 12:25 PM:
---

Thanks [~lorenzo.affetti] ,

 
{quote}MiniClusterExtension:
 - PRO: the code is very clean: no need to deploy separate JAR for the Flink 
app, as that is coded in the test and the Flink cluster is registered in the 
current test environment
{quote}
I would also count this extra work as a CON. Most of the e2e tests already have a 
so-called TestProgram; with the MiniCluster approach this functionality would have 
to be extracted. I think this is extra effort; instead we could submit those test 
programs to FlinkContainers and validate the expected outcome(s).

 


was (Author: JIRAUSER303922):
Thanks [~lorenzo.affetti] ,

 
{quote}MiniClusterExtension:
 - PRO: the code is very clean: no need to deploy separate JAR for the Flink 
app, as that is coded in the test and the Flink cluster is registered in the 
current test environment
{quote}
I would see this also extra work as CON. Most of the e2e are already have so 
called TestProgram, with MiniCluster approach these functionality will have to 
extracted. I think this is extra effort, instead we could submit those test 
programs to FlinkContainers and validate the expected outcome(s).

 

> Migrating bash e2e tests to Java/Docker
> ---
>
> Key: FLINK-20392
> URL: https://issues.apache.org/jira/browse/FLINK-20392
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Test Infrastructure, Tests
>Reporter: Matthias Pohl
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> starter
>
> This Jira issue serves as an umbrella ticket for single e2e test migration 
> tasks. This should enable us to migrate all bash-based e2e tests step-by-step.
> The goal is to utilize the e2e test framework (see 
> [flink-end-to-end-tests-common|https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common]).
>  Ideally, the test should use Docker containers as much as possible to 
> disconnect the execution from the environment. A good source to achieve that 
> is [testcontainers.org|https://www.testcontainers.org/].
> The related ML discussion is [Stop adding new bash-based e2e tests to 
> Flink|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stop-adding-new-bash-based-e2e-tests-to-Flink-td46607.html].





[jira] [Commented] (FLINK-20392) Migrating bash e2e tests to Java/Docker

2024-05-15 Thread Muhammet Orazov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846609#comment-17846609
 ] 

Muhammet Orazov commented on FLINK-20392:
-

Thanks [~lorenzo.affetti] ,

 
{quote}MiniClusterExtension:
 - PRO: the code is very clean: no need to deploy separate JAR for the Flink 
app, as that is coded in the test and the Flink cluster is registered in the 
current test environment
{quote}
I would also count this extra work as a CON. Most of the e2e tests already have a 
so-called TestProgram; with the MiniCluster approach this functionality would have 
to be extracted. I think this is extra effort; instead we could submit those test 
programs to FlinkContainers and validate the expected outcome(s).
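
As a reference point, a minimal sketch of the MiniClusterExtension approach being 
compared here (assuming flink-test-utils and JUnit 5 on the test classpath), where 
the embedded cluster is registered directly in the test environment:
{code:java}
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.test.junit5.MiniClusterExtension;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

class EmbeddedClusterSmokeTest {

    // Starts an embedded Flink cluster once for the whole test class.
    @RegisterExtension
    static final MiniClusterExtension MINI_CLUSTER =
            new MiniClusterExtension(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());

    @Test
    void runsJobAgainstEmbeddedCluster() throws Exception {
        // The extension wires getExecutionEnvironment() to the embedded cluster.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3).print();
        env.execute("mini-cluster-smoke-test");
    }
}
{code}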

 

> Migrating bash e2e tests to Java/Docker
> ---
>
> Key: FLINK-20392
> URL: https://issues.apache.org/jira/browse/FLINK-20392
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Test Infrastructure, Tests
>Reporter: Matthias Pohl
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> starter
>
> This Jira issue serves as an umbrella ticket for single e2e test migration 
> tasks. This should enable us to migrate all bash-based e2e tests step-by-step.
> The goal is to utilize the e2e test framework (see 
> [flink-end-to-end-tests-common|https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common]).
>  Ideally, the test should use Docker containers as much as possible to 
> disconnect the execution from the environment. A good source to achieve that 
> is [testcontainers.org|https://www.testcontainers.org/].
> The related ML discussion is [Stop adding new bash-based e2e tests to 
> Flink|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stop-adding-new-bash-based-e2e-tests-to-Flink-td46607.html].





Re: [PR] [FLINK-32082][docs] Documentation of checkpoint file-merging [flink]

2024-05-15 Thread via GitHub


Zakelly commented on code in PR #24766:
URL: https://github.com/apache/flink/pull/24766#discussion_r1601510494


##
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after 
all operators have rea
 without waiting for periodic triggering, but the job will need to wait for 
this final checkpoint 
 to be completed.
 
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 
1.20 as an MVP ("minimum viable product") feature, 
+which allows scattered small checkpoint files to be written into a single 
file, reducing the number of file creations 
+and file deletions, helping to alleviate the pressure of file system metadata 
management and file flooding problem. 
+The unified fie merging mechanism can be enabled by setting the property 
`state.checkpoints.file-merging.enabled` to `true`.
+**Note** that enabling this mechanism may lead to space amplification, that 
is, the actual occupation on the file system
+will be larger than actual state size. 
`state.checkpoints.file-merging.max-space-amplification` 
+can be used to limit the upper bound of space amplification.
+
+This mechanism is applicable to keyed state, operator state and channel state 
in Flink. Subtask level granular merging is 
+provided for shared scope state; TaskManager-level granular merging is 
provided for private scope state. The maximum number of subtasks
+allowed to be written to a single file can be configured through the 
`state.checkpoints.file-merging.max-subtasks-per-file` option.
+
+The unified fie merging mechanism also supports file merging across 
checkpoints, which can be enabled by setting
+`state.checkpoints.file-merging.across-checkpoint-boundary` to `true`.
+
+This mechanism introduces a file pool to handle concurrent writing scenarios. 
The blocking mode can be

Review Comment:
   ```suggestion
   This mechanism introduces a file pool to handle concurrent writing 
scenarios. There are two modes... The blocking mode.. while the 
non-blocking modes.. . This can be configured via ``.
   ```
   Maybe add some description of each mode, instead of only talking about enabling the option?
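
   As an aside, a hedged sketch of how the options discussed in this section could be 
set on a job (assuming Flink 1.20; the option keys are taken verbatim from the doc 
text under review):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileMergingSetupSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Option keys as spelled in the documentation text under review.
        conf.setString("state.checkpoints.file-merging.enabled", "true");
        conf.setString("state.checkpoints.file-merging.across-checkpoint-boundary", "true");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        env.enableCheckpointing(60_000); // checkpoint every minute
        // ... build and execute the job here ...
    }
}
```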



##
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after 
all operators have rea
 without waiting for periodic triggering, but the job will need to wait for 
this final checkpoint 
 to be completed.
 
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 
1.20 as an MVP ("minimum viable product") feature, 
+which allows scattered small checkpoint files to be written into a single 
file, reducing the number of file creations 
+and file deletions, helping to alleviate the pressure of file system metadata 
management and file flooding problem. 

Review Comment:
   ```suggestion
   and file deletions, which alleviates the pressure of file system metadata 
management raised by the file flooding problem during checkpoints.
   ```



##
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after 
all operators have rea
 without waiting for periodic triggering, but the job will need to wait for 
this final checkpoint 
 to be completed.
 
+## Unify file merging mechanism for checkpoints
+
+The unified file merging mechanism for checkpointing is introduced to Flink 
1.20 as an MVP ("minimum viable product") feature, 
+which allows scattered small checkpoint files to be written into a single 
file, reducing the number of file creations 
+and file deletions, helping to alleviate the pressure of file system metadata 
management and file flooding problem. 
+The unified fie merging mechanism can be enabled by setting the property 
`state.checkpoints.file-merging.enabled` to `true`.

Review Comment:
   ```suggestion
   The mechanism can be enabled by setting 
`state.checkpoints.file-merging.enabled` to `true`.
   ```



##
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after 
all operators have rea
 without waiting for periodic triggering, but the job will need to wait for 
this final checkpoint 
 to be completed.
 
+## Unify file merging mechanism for checkpoints

Review Comment:
   How about adding `(Experimental)` to the title?



##
docs/content/docs/dev/datastream/fault-tolerance/checkpointing.md:
##
@@ -292,4 +292,25 @@ The final checkpoint would be triggered immediately after 
all operators have rea
 without waiting for periodic triggering, but the job will need to wait for 
this final checkpoint 
 to be completed.
 
+## Unify file merging mechanism for checkpoints
+
+The 

[jira] [Commented] (FLINK-33109) Watermark alignment not applied after recovery from checkpoint

2024-05-15 Thread Piotr Nowojski (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846597#comment-17846597
 ] 

Piotr Nowojski commented on FLINK-33109:


Hi!

{quote}
I guess this bug has been fixed in 1.17.2 and 1.18.
{quote}

Has this been confirmed?  [~YordanPavlov], has this issue been fixed for you?

> Watermark alignment not applied after recovery from checkpoint
> --
>
> Key: FLINK-33109
> URL: https://issues.apache.org/jira/browse/FLINK-33109
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Coordination
>Affects Versions: 1.17.1
>Reporter: Yordan Pavlov
>Priority: Major
> Attachments: WatermarkTest-1.scala, 
> image-2023-09-18-15-40-06-868.png, image-2023-09-18-15-46-16-106.png
>
>
> I am observing a problem where after recovery from a checkpoint the Kafka 
> source watermarks would start to diverge not honoring the watermark alignment 
> setting I have applied.
> I have a Kafka source which reads a topic with 32 partitions. I am applying 
> the following watermark strategy:
> {code:java}
> new EventAwareWatermarkStrategy[KeyedKafkaSourceMessage]](msg => 
> msg.value.getTimestamp)
>       .withWatermarkAlignment("alignment-sources-group", 
> time.Duration.ofMillis(sourceWatermarkAlignmentBlocks)){code}
>  
> This works great up until my job needs to recover from checkpoint. Once the 
> recovery takes place, no alignment is taking place any more. This can best be 
> illustrated by looking at the watermark metrics for various operators in the 
> image:
> !image-2023-09-18-15-40-06-868.png!
>  
> You can see how the watermarks disperse after the recovery. Trying to debug 
> the problem I noticed that before the failure there would be calls in
>  
> {code:java}
> SourceCoordinator::announceCombinedWatermark() 
> {code}
> after the recovery, no calls get there, so no value for 
> {code:java}
> watermarkAlignmentParams.getMaxAllowedWatermarkDrift(){code}
> is ever read. I can manually fix the problem If I stop the job, clear all 
> state from Zookeeper and then manually start Flink providing the last 
> checkpoint with 
> {code:java}
> '–fromSavepoint'{code}
>  flag. This would cause the SourceCoordinator to be constructed properly and 
> watermark drift to be checked. Once recovery manually watermarks would again 
> converge to the allowed drift as seen in the metrics:
> !image-2023-09-18-15-46-16-106.png!
>  
> Let me know If I can be helpful by providing any more information.
>  





Re: [PR] [FLINK-35192] support jemalloc in image [flink-kubernetes-operator]

2024-05-15 Thread via GitHub


gaborgsomogyi commented on code in PR #825:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/825#discussion_r1601464660


##
docker-entrypoint.sh:
##
@@ -22,6 +22,27 @@ args=("$@")
 
 cd /flink-kubernetes-operator || exit
 
+maybe_enable_jemalloc() {
+if [ "${DISABLE_JEMALLOC:-false}" == "false" ]; then
+JEMALLOC_PATH="/usr/lib/$(uname -m)-linux-gnu/libjemalloc.so"
+JEMALLOC_FALLBACK="/usr/lib/x86_64-linux-gnu/libjemalloc.so"

Review Comment:
   Why do we fall back and not fail fast?






Re: [PR] [FLINK-35192] support jemalloc in image [flink-kubernetes-operator]

2024-05-15 Thread via GitHub


gaborgsomogyi commented on code in PR #825:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/825#discussion_r1601440347


##
docker-entrypoint.sh:
##
@@ -22,6 +22,27 @@ args=("$@")
 
 cd /flink-kubernetes-operator || exit
 
+maybe_enable_jemalloc() {
+if [ "${DISABLE_JEMALLOC:-false}" == "false" ]; then

Review Comment:
   Any specific reason why double equals is used here and single equals everywhere else?






Re: [PR] [FLINK-24298] Add GCP PubSub Sink API Implementation, bump Flink version to 1.19.0 [flink-connector-gcp-pubsub]

2024-05-15 Thread via GitHub


vahmed-hamdy commented on code in PR #27:
URL: 
https://github.com/apache/flink-connector-gcp-pubsub/pull/27#discussion_r1601406910


##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSink.java:
##
@@ -62,6 +62,7 @@
  *
  * @param  type of PubSubSink messages to write
  */
+@Deprecated
 public class PubSubSink extends RichSinkFunction implements 
CheckpointedFunction {

Review Comment:
   We can remove this when we release a new major version `4.0`. Once this is 
merged we can follow up with another Jira to remove it.






Re: [PR] [FLINK-24298] Add GCP PubSub Sink API Implementation, bump Flink version to 1.19.0 [flink-connector-gcp-pubsub]

2024-05-15 Thread via GitHub


vahmed-hamdy commented on code in PR #27:
URL: 
https://github.com/apache/flink-connector-gcp-pubsub/pull/27#discussion_r1601405551


##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/connector/gcp/pubsub/sink/PubSubSinkV2Builder.java:
##
@@ -0,0 +1,93 @@
+package org.apache.flink.connector.gcp.pubsub.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.connector.gcp.pubsub.sink.config.GcpPublisherConfig;
+
+import java.util.Optional;
+
+import static 
org.apache.flink.connector.gcp.pubsub.common.PubSubConstants.DEFAULT_FAIL_ON_ERROR;
+import static 
org.apache.flink.connector.gcp.pubsub.common.PubSubConstants.DEFAULT_MAXIMUM_INFLIGHT_REQUESTS;
+
+/**
+ * A builder for creating a {@link PubSubSinkV2}.
+ *
+ * The builder uses the following parameters to build a {@link 
PubSubSinkV2}:
+ *
+ * 
+ *   {@link GcpPublisherConfig} for the {@link 
com.google.cloud.pubsub.v1.Publisher}
+ *   configuration.
+ *   {@link SerializationSchema} for serializing the input data.
+ *   {@code projectId} for the name of the project where the topic is 
located.
+ *   {@code topicId} for the name of the topic to send messages to.
+ *   {@code maximumInflightMessages} for the maximum number of in-flight 
messages.
+ *   {@code failOnError} for whether to fail on an error.
+ * 
+ *
+ * It can be used as follows:
+ *
+ * {@code
+ * PubSubSinkV2Builder pubSubSink = {@link 
PubSubSinkV2Builder}.builder()

Review Comment:
   Good catch!






Re: [PR] [FLINK-34379][table] Fix adding catalogtable logic [flink]

2024-05-15 Thread via GitHub


JingGe commented on code in PR #24788:
URL: https://github.com/apache/flink/pull/24788#discussion_r1601394755


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java:
##
@@ -234,20 +234,16 @@ private static boolean isSuitableFilter(RexNode 
filterCondition) {
 }
 
 private void setTables(ContextResolvedTable catalogTable) {
-if (tables.size() == 0) {
-tables.add(catalogTable);
-} else {
-boolean hasAdded = false;
-for (ContextResolvedTable thisTable : new ArrayList<>(tables)) 
{
-if (hasAdded) {
-break;
-}
-if 
(!thisTable.getIdentifier().equals(catalogTable.getIdentifier())) {
-tables.add(catalogTable);
-hasAdded = true;
-}
+boolean alreadyExists = false;
+for (ContextResolvedTable table : tables) {

Review Comment:
   Not sure how big the tables could be and how often the method will be 
called. Does it make sense to use e.g. a Map to avoid the loop?
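
   For illustration, a hedged sketch of the Set-based alternative suggested here 
(assuming the intent is to record each catalog table identifier only once; the class 
and field names are made up):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.flink.table.catalog.ContextResolvedTable;
import org.apache.flink.table.catalog.ObjectIdentifier;

class DppTableCollectorSketch {
    // Deduplicate by identifier in O(1) instead of scanning the list on every call.
    private final Set<ObjectIdentifier> seenIdentifiers = new HashSet<>();
    private final List<ContextResolvedTable> tables = new ArrayList<>();

    void addTable(ContextResolvedTable catalogTable) {
        if (seenIdentifiers.add(catalogTable.getIdentifier())) {
            tables.add(catalogTable);
        }
    }
}
```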






Re: [PR] [FLINK-34572] Support OceanBase Jdbc Catalog [flink-connector-jdbc]

2024-05-15 Thread via GitHub


whhe commented on PR #109:
URL: 
https://github.com/apache/flink-connector-jdbc/pull/109#issuecomment-2112157138

   @eskabetxe PTAL





Re: [PR] [FLINK-24298] Add GCP PubSub Sink API Implementation, bump Flink version to 1.19.0 [flink-connector-gcp-pubsub]

2024-05-15 Thread via GitHub


morazow commented on PR #27:
URL: 
https://github.com/apache/flink-connector-gcp-pubsub/pull/27#issuecomment-2112125712

   Thanks @vahmed-hamdy, great work!
   
   Added some minor suggestions, and a point about the number of retries on failure.





Re: [PR] [FLINK-24298] Add GCP PubSub Sink API Implementation, bump Flink version to 1.19.0 [flink-connector-gcp-pubsub]

2024-05-15 Thread via GitHub


morazow commented on code in PR #27:
URL: 
https://github.com/apache/flink-connector-gcp-pubsub/pull/27#discussion_r1601341788


##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/connector/gcp/pubsub/sink/PubSubSinkV2Builder.java:
##
@@ -0,0 +1,93 @@
+package org.apache.flink.connector.gcp.pubsub.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.connector.gcp.pubsub.sink.config.GcpPublisherConfig;
+
+import java.util.Optional;
+
+import static 
org.apache.flink.connector.gcp.pubsub.common.PubSubConstants.DEFAULT_FAIL_ON_ERROR;
+import static 
org.apache.flink.connector.gcp.pubsub.common.PubSubConstants.DEFAULT_MAXIMUM_INFLIGHT_REQUESTS;
+
+/**
+ * A builder for creating a {@link PubSubSinkV2}.
+ *
+ * The builder uses the following parameters to build a {@link 
PubSubSinkV2}:
+ *
+ * 
+ *   {@link GcpPublisherConfig} for the {@link 
com.google.cloud.pubsub.v1.Publisher}
+ *   configuration.
+ *   {@link SerializationSchema} for serializing the input data.
+ *   {@code projectId} for the name of the project where the topic is 
located.
+ *   {@code topicId} for the name of the topic to send messages to.
+ *   {@code maximumInflightMessages} for the maximum number of in-flight 
messages.
+ *   {@code failOnError} for whether to fail on an error.
+ * 
+ *
+ * It can be used as follows:
+ *
+ * {@code
+ * PubSubSinkV2Builder pubSubSink = {@link 
PubSubSinkV2Builder}.builder()

Review Comment:
   Should the `{@link` be removed here? Since it is already in a code block, maybe 
it doesn't render.



##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/connector/gcp/pubsub/sink/PubSubSinkV2.java:
##
@@ -0,0 +1,96 @@
+package org.apache.flink.connector.gcp.pubsub.sink;
+
+import org.apache.flink.annotation.PublicEvolving;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.api.connector.sink2.Sink;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.api.connector.sink2.WriterInitContext;
+import org.apache.flink.connector.gcp.pubsub.sink.config.GcpPublisherConfig;
+import org.apache.flink.util.Preconditions;
+
+import java.io.IOException;
+
+/**
+ * A RabbitMQ {@link Sink} to produce data into Gcp PubSub. The sink uses the 
{@link

Review Comment:
   ```suggestion
* A {@link Sink} to produce data into Gcp PubSub. The sink uses the {@link
   ```



##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSink.java:
##
@@ -62,6 +62,7 @@
  *
  * @param  type of PubSubSink messages to write
  */
+@Deprecated
 public class PubSubSink extends RichSinkFunction implements 
CheckpointedFunction {

Review Comment:
    



##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/connector/gcp/pubsub/sink/PubSubWriter.java:
##
@@ -0,0 +1,195 @@
+package org.apache.flink.connector.gcp.pubsub.sink;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.annotation.VisibleForTesting;
+import org.apache.flink.api.common.operators.MailboxExecutor;
+import org.apache.flink.api.common.serialization.SerializationSchema;
+import org.apache.flink.api.connector.sink2.SinkWriter;
+import org.apache.flink.api.connector.sink2.WriterInitContext;
+import org.apache.flink.connector.gcp.pubsub.sink.config.GcpPublisherConfig;
+import org.apache.flink.metrics.Counter;
+import org.apache.flink.util.FlinkRuntimeException;
+import org.apache.flink.util.Preconditions;
+
+import com.google.api.core.ApiFuture;
+import com.google.api.core.ApiFutureCallback;
+import com.google.api.core.ApiFutures;
+import com.google.cloud.pubsub.v1.Publisher;
+import com.google.protobuf.ByteString;
+import com.google.pubsub.v1.PubsubMessage;
+import com.google.pubsub.v1.TopicName;
+
+import java.io.IOException;
+import java.util.Optional;
+
+import static org.apache.flink.util.concurrent.Executors.directExecutor;
+
+/**
+ * A stateless {@link SinkWriter} that writes records to Pub/Sub using generic 
{@link
+ * GcpWriterClient}. The writer blocks on completion of inflight requests on 
{@code flush()} and
+ * {@code close()}. The writer also uses {@code maxInFlightRequests} and 
blocks new writes if the
+ * number of inflight requests.

Review Comment:
   Sentence is not finished? `.. exceeds that number`



##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/streaming/connectors/gcp/pubsub/PubSubSink.java:
##
@@ -62,6 +62,7 @@
  *
  * @param <IN> type of PubSubSink messages to write
  */
+@Deprecated
 public class PubSubSink<IN> extends RichSinkFunction<IN> implements 
CheckpointedFunction {

Review Comment:
   When is it good to remove these APIs?



##
flink-connector-gcp-pubsub/src/main/java/org/apache/flink/connector/gcp/pubsub/sink/PubSubWriter.java:
##
@@ -0,0 +1,195 @@
+package 

[jira] [Updated] (FLINK-35360) [Feature] Submit Flink CDC pipeline job yarn Application mode

2024-05-15 Thread Jira


 [ 
https://issues.apache.org/jira/browse/FLINK-35360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

王俊博 updated FLINK-35360:

Summary: [Feature] Submit Flink CDC pipeline job yarn Application mode  
(was: [Feature] Submit Flink CDC pipeline job to yarn Application mode)

> [Feature] Submit Flink CDC pipeline job yarn Application mode
> -
>
> Key: FLINK-35360
> URL: https://issues.apache.org/jira/browse/FLINK-35360
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.2.0
>Reporter: 王俊博
>Priority: Minor
>
> For now flink-cdc pipeline support cli yarn session mode submit.I'm willing 
> to support yarn application mode submit.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35360) [Feature] Submit Flink CDC pipeline job to yarn Application mode

2024-05-15 Thread Jira
王俊博 created FLINK-35360:
---

 Summary: [Feature] Submit Flink CDC pipeline job to yarn 
Application mode
 Key: FLINK-35360
 URL: https://issues.apache.org/jira/browse/FLINK-35360
 Project: Flink
  Issue Type: Improvement
  Components: Flink CDC
Affects Versions: cdc-3.2.0
Reporter: 王俊博


For now, the Flink CDC pipeline CLI supports submitting jobs in YARN session mode. 
I'm willing to add support for YARN application mode submission.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.

2024-05-15 Thread Jira


[ https://issues.apache.org/jira/browse/FLINK-34904 ]


王俊博 deleted comment on FLINK-34904:
-

was (Author: kwafor):
For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could assign it to me?

> [Feature] submit Flink CDC pipeline job to yarn application cluster.
> 
>
> Key: FLINK-34904
> URL: https://issues.apache.org/jira/browse/FLINK-34904
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: ZhengYu Chen
>Priority: Minor
> Fix For: cdc-3.1.0
>
>
> support flink cdc cli submit pipeline job to yarn application cluster.discuss 
> in FLINK-34853



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35359) General Improvement to Configuration for Flink 2.0

2024-05-15 Thread Xuannan Su (Jira)
Xuannan Su created FLINK-35359:
--

 Summary: General Improvement to Configuration for Flink 2.0
 Key: FLINK-35359
 URL: https://issues.apache.org/jira/browse/FLINK-35359
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Configuration
Reporter: Xuannan Su


As Flink moves toward version 2.0, we want to provide users with a better 
experience with the existing configuration. In this FLIP, we outline several 
general improvements to the current configuration:
 * Ensure all the ConfigOptions are properly annotated

 * Ensure all user-facing configurations are included in the documentation 
generation process

 * Make the existing ConfigOptions use the proper type

 * Mark all internally used ConfigOptions with the @Internal annotation

 

https://cwiki.apache.org/confluence/display/FLINK/FLIP-442%3A+General+Improvement+to+Configuration+for+Flink+2.0
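
A minimal sketch of what a properly typed and annotated option can look like (names and values below are invented for illustration, not taken from the FLIP):

{code:java}
import org.apache.flink.annotation.Internal;
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

import java.time.Duration;

public class ExampleOptions {

    // Typed as Duration instead of a raw long, so units are explicit for users.
    public static final ConfigOption<Duration> EXAMPLE_CLIENT_TIMEOUT =
            ConfigOptions.key("example.client.timeout")
                    .durationType()
                    .defaultValue(Duration.ofSeconds(60))
                    .withDescription("Timeout for example client operations.");

    // Marked @Internal because it is not meant to be user-facing.
    @Internal
    public static final ConfigOption<Integer> EXAMPLE_INTERNAL_BUFFER_SIZE =
            ConfigOptions.key("example.internal.buffer-size")
                    .intType()
                    .defaultValue(1024)
                    .withDescription("Internal buffer size, excluded from user docs.");
}
{code}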

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.

2024-05-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-34904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846571#comment-17846571
 ] 

王俊博 edited comment on FLINK-34904 at 5/15/24 9:50 AM:
--

For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could assign it to me?


was (Author: kwafor):
For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could assign it to me?

> [Feature] submit Flink CDC pipeline job to yarn application cluster.
> 
>
> Key: FLINK-34904
> URL: https://issues.apache.org/jira/browse/FLINK-34904
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: ZhengYu Chen
>Priority: Minor
> Fix For: cdc-3.1.0
>
>
> support flink cdc cli submit pipeline job to yarn application cluster.discuss 
> in FLINK-34853



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.

2024-05-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-34904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846571#comment-17846571
 ] 

王俊博 edited comment on FLINK-34904 at 5/15/24 9:49 AM:
--

For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could assign it to me?


was (Author: kwafor):
For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could someone assign it to me?

> [Feature] submit Flink CDC pipeline job to yarn application cluster.
> 
>
> Key: FLINK-34904
> URL: https://issues.apache.org/jira/browse/FLINK-34904
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: ZhengYu Chen
>Priority: Minor
> Fix For: cdc-3.1.0
>
>
> support flink cdc cli submit pipeline job to yarn application cluster.discuss 
> in FLINK-34853



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.

2024-05-15 Thread Jira


[ 
https://issues.apache.org/jira/browse/FLINK-34904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846571#comment-17846571
 ] 

王俊博 commented on FLINK-34904:
-

For now flink-cdc pipeline support cli yarn session mode submit.I'm willing to 
support yarn application mode submit, could someone assign it to me?

> [Feature] submit Flink CDC pipeline job to yarn application cluster.
> 
>
> Key: FLINK-34904
> URL: https://issues.apache.org/jira/browse/FLINK-34904
> Project: Flink
>  Issue Type: Improvement
>  Components: Flink CDC
>Affects Versions: cdc-3.1.0
>Reporter: ZhengYu Chen
>Priority: Minor
> Fix For: cdc-3.1.0
>
>
> support flink cdc cli submit pipeline job to yarn application cluster.discuss 
> in FLINK-34853



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35357) Add "kubernetes.operator.plugins.listeners" parameter description to the Operator configuration document

2024-05-15 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846569#comment-17846569
 ] 

Rui Fan commented on FLINK-35357:
-

IIUC, kubernetes.operator.plugins.listeners.<name>.class is similar to 
metrics.reporter.<name>.factory.class 
(org.apache.flink.configuration.MetricOptions#REPORTER_FACTORY_CLASS).

We can export it in flink kubernetes operator configuration page[1] as well.

 

[1]https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/operations/configuration/
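
For reference, a minimal sketch of what such a listener entry can look like in the operator configuration (listener name and class below are invented for illustration):

{code:yaml}
# Register a custom FlinkResourceListener plugin under an arbitrary listener name
kubernetes.operator.plugins.listeners.my-listener.class: org.example.MyFlinkResourceListener
# Analogous pattern used by Flink's own metric reporters:
# metrics.reporter.my-reporter.factory.class: org.example.MyReporterFactory
{code}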

> Add "kubernetes.operator.plugins.listeners" parameter description to the 
> Operator configuration document
> 
>
> Key: FLINK-35357
> URL: https://issues.apache.org/jira/browse/FLINK-35357
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Yang Zhou
>Assignee: Yang Zhou
>Priority: Minor
>
> In Flink Operator "Custom Flink Resource Listeners" in practice (doc: 
> [https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/plugins/#custom-flink-resource]
>  -listeners)
> It was found that the "Operator Configuration Reference" document did not 
> explain the "Custom Flink Resource Listeners" configuration parameters.
> So I wanted to come up with adding:
> kubernetes.operator.plugins.listeners.<listener-name>.class: <fully-qualified-class-name>
> , after all it is useful.
> I want to submit a PR to optimize the document.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-25920) Allow receiving updates of CommittableSummary

2024-05-15 Thread Alexey Novakov (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-25920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846567#comment-17846567
 ] 

Alexey Novakov commented on FLINK-25920:


It also appears in Flink 1.17

> Allow receiving updates of CommittableSummary
> -
>
> Key: FLINK-25920
> URL: https://issues.apache.org/jira/browse/FLINK-25920
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Connectors / Common
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Fabian Paul
>Priority: Major
>
> In the case of unaligned checkpoints, it might happen that the checkpoint 
> barrier overtakes the records and an empty committable summary is emitted 
> that needs to be correct at a later point when the records arrive.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35357) Add "kubernetes.operator.plugins.listeners" parameter description to the Operator configuration document

2024-05-15 Thread Rui Fan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846566#comment-17846566
 ] 

Rui Fan commented on FLINK-35357:
-

Thanks [~xinzhuxianshenger] for reporting this JIRA. I checked the operator 
code just now, and I think you are right. You are assigned; please go ahead.

cc [~gyfora] 

> Add "kubernetes.operator.plugins.listeners" parameter description to the 
> Operator configuration document
> 
>
> Key: FLINK-35357
> URL: https://issues.apache.org/jira/browse/FLINK-35357
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Yang Zhou
>Assignee: Yang Zhou
>Priority: Minor
>
> In Flink Operator "Custom Flink Resource Listeners" in practice (doc: 
> [https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/plugins/#custom-flink-resource]
>  -listeners)
> It was found that the "Operator Configuration Reference" document did not 
> explain the "Custom Flink Resource Listeners" configuration parameters.
> So I wanted to come up with adding:
> kubernetes.operator.plugins.listeners.<listener-name>.class: <fully-qualified-class-name>
> , after all it is useful.
> I want to submit a PR to optimize the document.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-35357) Add "kubernetes.operator.plugins.listeners" parameter description to the Operator configuration document

2024-05-15 Thread Rui Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Fan reassigned FLINK-35357:
---

Assignee: Yang Zhou

> Add "kubernetes.operator.plugins.listeners" parameter description to the 
> Operator configuration document
> 
>
> Key: FLINK-35357
> URL: https://issues.apache.org/jira/browse/FLINK-35357
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Yang Zhou
>Assignee: Yang Zhou
>Priority: Minor
>
> In Flink Operator "Custom Flink Resource Listeners" in practice (doc: 
> [https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/plugins/#custom-flink-resource]
>  -listeners)
> It was found that the "Operator Configuration Reference" document did not 
> explain the "Custom Flink Resource Listeners" configuration parameters.
> So I wanted to come up with adding:
> kubernetes.operator.plugins.listeners.<listener-name>.class: <fully-qualified-class-name>
> , after all it is useful.
> I want to submit a PR to optimize the document.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35153) Internal Async State Implementation and StateDescriptor for Map/List State

2024-05-15 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan resolved FLINK-35153.
-
Resolution: Fixed

> Internal Async State Implementation and StateDescriptor for Map/List State
> --
>
> Key: FLINK-35153
> URL: https://issues.apache.org/jira/browse/FLINK-35153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35153) Internal Async State Implementation and StateDescriptor for Map/List State

2024-05-15 Thread Zakelly Lan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846563#comment-17846563
 ] 

Zakelly Lan commented on FLINK-35153:
-

Merged into master via 73a7e1cc

> Internal Async State Implementation and StateDescriptor for Map/List State
> --
>
> Key: FLINK-35153
> URL: https://issues.apache.org/jira/browse/FLINK-35153
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35153][State] Internal async list/map state and corresponding state descriptor [flink]

2024-05-15 Thread via GitHub


Zakelly closed pull request #24781: [FLINK-35153][State] Internal async 
list/map state and corresponding state descriptor
URL: https://github.com/apache/flink/pull/24781


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-26821) Refactor Cassandra Sink implementation to the ASync Sink

2024-05-15 Thread Lorenzo Affetti (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-26821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846558#comment-17846558
 ] 

Lorenzo Affetti commented on FLINK-26821:
-

[~mzuehlke] are you working on this?

Otherwise I would be happy to pick it up [~martijnvisser] 

> Refactor Cassandra Sink implementation to the ASync Sink
> 
>
> Key: FLINK-26821
> URL: https://issues.apache.org/jira/browse/FLINK-26821
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Cassandra
>Reporter: Martijn Visser
>Assignee: Marco Zühlke
>Priority: Major
>
> The current Cassandra connector is using the SinkFunction. This needs to be 
> ported to the correct Flink API, which for Cassandra is most likely the ASync 
> Sink. More details about this API can be found in FLIP-171 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-35030][runtime] Introduce Epoch Manager under async execution [flink]

2024-05-15 Thread via GitHub


Zakelly commented on code in PR #24748:
URL: https://github.com/apache/flink/pull/24748#discussion_r1601239723


##
flink-runtime/src/main/java/org/apache/flink/runtime/asyncprocessing/EpochManager.java:
##
@@ -0,0 +1,207 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.runtime.asyncprocessing;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.annotation.Nullable;
+
+import java.util.LinkedList;
+
+/**
+ * Epoch manager segments inputs into distinct epochs, marked by the arrival 
of non-records(e.g.
+ * watermark, record attributes). Records are assigned to a unique epoch based 
on their arrival,
+ * records within an epoch are allowed to be parallelized, while the 
non-record of an epoch can only
+ * be executed when all records in this epoch have finished.
+ *
+ * For more details please refer to FLIP-425.
+ */
+public class EpochManager {
+private static final Logger LOG = 
LoggerFactory.getLogger(EpochManager.class);
+
+/**
+ * This enum defines whether parallel execution between epochs is allowed. 
We should keep this
+ * internal and away from API module for now, until we could see the 
concrete need for {@link
+ * #PARALLEL_BETWEEN_EPOCH} from average users.
+ */
+public enum ParallelMode {
+/**
+ * Subsequent epochs must wait until the previous epoch is completed 
before they can start.
+ */
+SERIAL_BETWEEN_EPOCH,
+/**
+ * Subsequent epochs can begin execution even if the previous epoch 
has not yet completed.
+ * Usually performs better than {@link #SERIAL_BETWEEN_EPOCH}.
+ */
+PARALLEL_BETWEEN_EPOCH
+}
+
+/**
+ * The reference to the {@link AsyncExecutionController}, used for {@link
+ * ParallelMode#SERIAL_BETWEEN_EPOCH}. Can be null when testing.
+ */
+final AsyncExecutionController asyncExecutionController;
+
+/** The number of epochs that have arrived. */
+long epochNum;
+
+/** The output queue to hold ongoing epochs. */
+LinkedList<Epoch> outputQueue;
+
+/** Current active epoch, only one active epoch at the same time. */
+Epoch activeEpoch;
+
+public EpochManager(AsyncExecutionController aec) {
+this.epochNum = 0;
+this.outputQueue = new LinkedList<>();
+this.asyncExecutionController = aec;
+// init an empty epoch, the epoch action will be updated when 
non-record is received.
+initNewActiveEpoch();
+}
+
+/**
+ * Add a record to the current epoch and return the current open epoch, 
the epoch will be
+ * associated with the {@link RecordContext} of this record. Must be 
invoked within task thread.
+ *
+ * @return the current open epoch.
+ */
+public Epoch onRecord() {
+activeEpoch.ongoingRecordCount++;
+return activeEpoch;
+}
+
+/**
+ * Add a non-record to the current epoch, close current epoch and open a 
new epoch. Must be
+ * invoked within task thread.
+ *
+ * @param action the action associated with this non-record.
+ * @param parallelMode the parallel mode for this epoch.
+ */
+public void onNonRecord(Runnable action, ParallelMode parallelMode) {
+if (parallelMode == ParallelMode.SERIAL_BETWEEN_EPOCH) {
+asyncExecutionController.drainInflightRecords(0);

Review Comment:
   When doing `drainInflightRecords`, will the `activeEpoch`'s 
`ongoingRecordCount` reach 0 and `completeOneRecord` be called?
   
   I'd suggest not doing the drain here. Instead, we'd better mark blocking in the `AEC`, 
and let the callback be what unblocks the `AEC`.



##
flink-runtime/src/main/java/org/apache/flink/runtime/asyncprocessing/AsyncExecutionController.java:
##
@@ -76,6 +77,9 @@ public class AsyncExecutionController implements 
StateRequestHandler {
  */
 private final MailboxExecutor mailboxExecutor;
 
+/** Exception handler to handle the exception thrown by asynchronous 
framework. */
+private AsyncFrameworkExceptionHandler exceptionHandler;

Review Comment:
   nit. make it `final`?



##

[jira] [Commented] (FLINK-20625) Refactor Google Cloud PubSub Source in accordance with FLIP-27

2024-05-15 Thread Lorenzo Affetti (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846554#comment-17846554
 ] 

Lorenzo Affetti commented on FLINK-20625:
-

[~clmccart] still actively working on this?

> Refactor Google Cloud PubSub Source in accordance with FLIP-27
> --
>
> Key: FLINK-20625
> URL: https://issues.apache.org/jira/browse/FLINK-20625
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Google Cloud PubSub
>Reporter: Jakob Edding
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>
> The Source implementation for Google Cloud Pub/Sub should be refactored in 
> accordance with [FLIP-27: Refactor Source 
> Interface|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95653748].
> *Split Enumerator*
> Pub/Sub doesn't expose any partitions to consuming applications. Therefore, 
> the implementation of the Pub/Sub Split Enumerator won't do any real work 
> discovery. Instead, a static Source Split is handed to Source Readers which 
> request a Source Split. This static Source Split merely contains details 
> about the connection to Pub/Sub and the concrete Pub/Sub subscription to use 
> but no Split-specific information like partitions/offsets because this 
> information can not be obtained.
> *Source Reader*
> A Source Reader will use Pub/Sub's pull mechanism to read new messages from 
> the Pub/Sub subscription specified in the SourceSplit. In the case of 
> parallel-running Source Readers in Flink, every Source Reader will be passed 
> the same Source Split from the Enumerator. Because of this, all Source 
> Readers use the same connection details and the same Pub/Sub subscription to 
> receive messages. In this case, Pub/Sub will automatically load-balance 
> messages between all Source Readers pulling from the same subscription. This 
> way, parallel processing can be achieved in the Source.
> *At-least-once guarantee*
> Pub/Sub itself guarantees at-least-once message delivery so it is the goal to 
> keep up this guarantee in the Source as well. A mechanism that can be used to 
> achieve this is that Pub/Sub expects a message to be acknowledged by the 
> subscriber to signal that the message has been consumed successfully. Any 
> message that has not been acknowledged yet will be automatically redelivered 
> by Pub/Sub once an ack deadline has passed.
> After a certain time interval has elapsed...
>  # all pulled messages are checkpointed in the Source Reader
>  # same messages are acknowledged to Pub/Sub
>  # same messages are forwarded to downstream Flink tasks
> This should ensure at-least-once delivery in the Source because in the case 
> of failure, non-checkpointed messages have not yet been acknowledged and will 
> therefore be redelivered by the Pub/Sub service.
> Because of the static Source Split, it appears like checkpointing is not 
> necessary in the Split Enumerator.
> *Possible exactly-once guarantee*
> It should even be possible to achieve exactly-once guarantees for the source. 
> The following requirements would have to be met to have an exactly-once mode 
> besides the at-least-once mode similar to how it is done in the [current 
> RabbitMQ 
> Source|https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/rabbitmq.html#rabbitmq-source]:
>  * The system which publishes messages to Pub/Sub must add an id to each 
> message so that messages can be deduplicated in the Source.
>  * The Source must run in a non-parallel fashion (with parallelism=1).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-24299) Make Google PubSub available as Source and Sink for Table API

2024-05-15 Thread Lorenzo Affetti (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-24299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846551#comment-17846551
 ] 

Lorenzo Affetti commented on FLINK-24299:
-

I see there is no pull request for this issue, so I volunteer for this one.

 

Is there any strong dependency between this and the other issue under the 
umbrella: https://issues.apache.org/jira/browse/FLINK-24296 ?

 

If so, I would wait for the other ones to be done, or contribute.

 

[~martijnvisser] could you please assign this to me?

> Make Google PubSub available as Source and Sink for Table API
> -
>
> Key: FLINK-24299
> URL: https://issues.apache.org/jira/browse/FLINK-24299
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Martijn Visser
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Rasmus Thygesen (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846549#comment-17846549
 ] 

Rasmus Thygesen commented on FLINK-35358:
-

I would like to add that we have a project where upgrading from Flink 1.18.1 to 
1.19.0 right now would mean we have to make a new major release; however, if 
this gets fixed, it will only be a minor one. We will wait a little bit to see what 
the Flink community agrees on :)

> Breaking change when loading artifacts
> --
>
> Key: FLINK-35358
> URL: https://issues.apache.org/jira/browse/FLINK-35358
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, flink-docker
>Affects Versions: 1.19.0
>Reporter: Rasmus Thygesen
>Priority: Not a Priority
>
> We have been using the following code snippet in our Dockerfiles for running 
> a Flink job in application mode
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar 
> /opt/flink/usrlib/artifacts/my-job.jar
> USER flink {code}
>  
> Which has been working since at least around Flink 1.14, but the 1.19 update 
> has broken our Dockerfiles. The fix is to put the jar file a step further out 
> so the code snippet becomes
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar
> USER flink  {code}
>  
> We have not spent too much time looking into what the cause is, but we get 
> the stack trace
>  
> {code:java}
> myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load 
> the provided entrypoint class.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
>  [flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   | Caused by: 
> org.apache.flink.client.program.ProgramInvocationException: The program's 
> entry point class 'my.company.job.MyJob' was not found in the jar file.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     ... 4 more
> myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
> my.company.job.MyJob
> myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown 
> Source) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
> myjob-jobmanager-1  

[jira] [Commented] (FLINK-20392) Migrating bash e2e tests to Java/Docker

2024-05-15 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846540#comment-17846540
 ] 

Matthias Pohl commented on FLINK-20392:
---

Thanks for the write-up. I'm just wondering whether we gain anything from only 
allowing one of the two approaches. What about allowing both options?

> Migrating bash e2e tests to Java/Docker
> ---
>
> Key: FLINK-20392
> URL: https://issues.apache.org/jira/browse/FLINK-20392
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Test Infrastructure, Tests
>Reporter: Matthias Pohl
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> starter
>
> This Jira issue serves as an umbrella ticket for single e2e test migration 
> tasks. This should enable us to migrate all bash-based e2e tests step-by-step.
> The goal is to utilize the e2e test framework (see 
> [flink-end-to-end-tests-common|https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common]).
>  Ideally, the test should use Docker containers as much as possible 
> disconnect the execution from the environment. A good source to achieve that 
> is [testcontainers.org|https://www.testcontainers.org/].
> The related ML discussion is [Stop adding new bash-based e2e tests to 
> Flink|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stop-adding-new-bash-based-e2e-tests-to-Flink-td46607.html].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Rasmus Thygesen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rasmus Thygesen updated FLINK-35358:

Description: 
We have been using the following code snippet in our Dockerfiles for running a 
Flink job in application mode

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/artifacts/my-job.jar

USER flink {code}
 

Which has been working since at least around Flink 1.14, but the 1.19 update 
has broken our Dockerfiles. The fix is to put the jar file a step further out 
so the code snippet becomes

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar

USER flink  {code}
 

We have not spent too much time looking into what the cause is, but we get the 
stack trace

 
{code:java}
myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load the 
provided entrypoint class.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
 [flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   | Caused by: 
org.apache.flink.client.program.ProgramInvocationException: The program's entry 
point class 'my.company.job.MyJob' was not found in the jar file.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 more
myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
my.company.job.MyJob
myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 more{code}
 


Re: [PR] [FLINK-33138] DataStream API implementation [flink-connector-prometheus]

2024-05-15 Thread via GitHub


hlteoh37 commented on code in PR #1:
URL: 
https://github.com/apache/flink-connector-prometheus/pull/1#discussion_r1600156895


##
README.md:
##
@@ -8,6 +8,14 @@ Apache Flink is an open source stream processing framework 
with powerful stream-
 
 Learn more about Flink at 
[https://flink.apache.org/](https://flink.apache.org/)
 
+## Modules
+
+This repository contains the following modules
+
+* [Prometheus Connector](./prometheus-connector): Flink Prometheus Connector 
implementation; supports optional request signer
+* [Sample application](./example-datastream-job): Sample application showing 
the usage of the connector with DataStream API. It also demonstrates how to 
configure the request signer. 

Review Comment:
   Can we move this to an integ test? Adding it as a separate module would make 
releases more difficult.



##
README.md:
##
@@ -8,6 +8,14 @@ Apache Flink is an open source stream processing framework 
with powerful stream-
 
 Learn more about Flink at 
[https://flink.apache.org/](https://flink.apache.org/)
 
+## Modules
+
+This repository contains the following modules
+
+* [Prometheus Connector](./prometheus-connector): Flink Prometheus Connector 
implementation; supports optional request signer

Review Comment:
   nit: can we name this `flink-connector-prometheus` to follow the repo / 
artifact name?



##
README.md:
##
@@ -8,6 +8,14 @@ Apache Flink is an open source stream processing framework 
with powerful stream-
 
 Learn more about Flink at 
[https://flink.apache.org/](https://flink.apache.org/)
 
+## Modules
+
+This repository contains the following modules
+
+* [Prometheus Connector](./prometheus-connector): Flink Prometheus Connector 
implementation; supports optional request signer
+* [Sample application](./example-datastream-job): Sample application showing 
the usage of the connector with DataStream API. It also demonstrates how to 
configure the request signer. 
+* [Amazon Managed Prometheus Request Signer](./amp-request-signer): 
Implementation of request signer for Amazon Managed Prometheus (AMP)

Review Comment:
   nit: Should we name this as `request-signer-amp` - this might read a little 
nicer



##
amp-request-signer/README.md:
##
@@ -0,0 +1,31 @@
+## Request Signer for Amazon Managed Prometheus (AMP)
+
+Request signer implementation for Amazon Managed Prometheus (AMP)
+
+The signer retrieves AWS credentials using 
`com.amazonaws.auth.DefaultAWSCredentialsProviderChain` and automatically
+supports session credentials.
+
+The Flink application requires `RemoteWrite` permissions to the AMP workspace 
(e.g. `AmazonPromethusRemoteWriteAccess`
+policy).
+
+### Sample usage
+
+To enable request signing for Amazon Managed Prometheus, and instance of 
`AmazonManagedPrometheusWriteRequestSigner`
+must

Review Comment:
   nit: why is this on its own line?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846536#comment-17846536
 ] 

Martijn Visser commented on FLINK-35358:


[~ferenc-csaky] [~mbalassi] Any thoughts on this one?

> Breaking change when loading artifacts
> --
>
> Key: FLINK-35358
> URL: https://issues.apache.org/jira/browse/FLINK-35358
> Project: Flink
>  Issue Type: New Feature
>  Components: Client / Job Submission, flink-docker
>Affects Versions: 1.19.0
>Reporter: Rasmus Thygesen
>Priority: Not a Priority
>
> We have been using the following code snippet in our Dockerfiles for running 
> a Flink job in application mode
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar 
> /opt/flink/usrlib/artifacts/my-job.jar
> USER flink {code}
>  
> Which has been working since at least around Flink 1.14, but the 1.19 update 
> has broken our Dockerfiles. The fix is to put the jar file a step further out 
> so the code snippet becomes
>  
> {code:java}
> FROM flink:1.18.1-scala_2.12-java17
> COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar
> USER flink  {code}
>  
> We have not spent too much time looking into what the cause is, but we get 
> the stack trace
>  
> {code:java}
> myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load 
> the provided entrypoint class.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
>  [flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   | Caused by: 
> org.apache.flink.client.program.ProgramInvocationException: The program's 
> entry point class 'my.company.job.MyJob' was not found in the jar file.
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     ... 4 more
> myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
> my.company.job.MyJob
> myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown 
> Source) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
> ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
> myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) ~[?:?]
> myjob-jobmanager-1   |     at 
> org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479)
>  ~[flink-dist-1.19.0.jar:1.19.0]
> 

Re: [PR] [FLINK-32092][tests] Integrate snapshot file-merging with existing ITCases [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24789:
URL: https://github.com/apache/flink/pull/24789#issuecomment-2111840165

   
   ## CI report:
   
   * c93a842f7d8ef102428d784f68771414066cd451 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35314][cdc] Add Flink CDC pipeline transform user document [flink-cdc]

2024-05-15 Thread via GitHub


PatrickRen commented on PR #3308:
URL: https://github.com/apache/flink-cdc/pull/3308#issuecomment-2111834224

   @aiwenmo Could you backport the commit to release-3.1? Thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35314][cdc] Add Flink CDC pipeline transform user document [flink-cdc]

2024-05-15 Thread via GitHub


PatrickRen merged PR #3308:
URL: https://github.com/apache/flink-cdc/pull/3308


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-32092][tests] Integrate snapshot file-merging with existing ITCases [flink]

2024-05-15 Thread via GitHub


fredia opened a new pull request, #24789:
URL: https://github.com/apache/flink/pull/24789

   
   
   ## What is the purpose of the change
   
   This PR integrates snapshot file-merging with existing ITCases by randomly 
enabling file-merging in TestStreamEnvironment.
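   
   A rough sketch of what such randomized enabling can look like (the option key and the exact hook in `TestStreamEnvironment` are assumptions here, not the actual change):
   
   ```java
   import org.apache.flink.configuration.Configuration;
   
   import java.util.concurrent.ThreadLocalRandom;
   
   class FileMergingRandomization {
       // Illustration only: randomly switch the (assumed) file-merging option on for a test run.
       static Configuration randomizeFileMerging(Configuration configuration) {
           if (ThreadLocalRandom.current().nextBoolean()) {
               configuration.setString("state.checkpoints.file-merging.enabled", "true");
           }
           return configuration;
       }
   }
   ```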
   
   ## Brief change log
   - Enable file-merging in `TestStreamEnvironment` randomly.
   
   
   ## Verifying this change
   
   
   This change is a trivial rework, already covered by existing IT tests.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-32092) Integrate snapshot file-merging with existing IT cases

2024-05-15 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32092:
---
Labels: pull-request-available  (was: )

> Integrate snapshot file-merging with existing IT cases
> --
>
> Key: FLINK-32092
> URL: https://issues.apache.org/jira/browse/FLINK-32092
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Checkpointing, Runtime / State Backends
>Affects Versions: 1.18.0
>Reporter: Zakelly Lan
>Assignee: Yanfei Lei
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Rasmus Thygesen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rasmus Thygesen updated FLINK-35358:

Description: 
We have been using the following code snippet in our Dockerfiles for running a 
Flink job in application mode

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/artifacts/my-job.jar

USER flink {code}
 

 

Which has been working since at least around Flink 1.14, but the 1.19 update 
has broken our Dockerfiles. The fix is to put the jar file a step further out 
so the code snippet becomes

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar

USER flink  {code}
 

 

We have not spent too much time looking into what the cause is, but we get the 
stack trace

 

 
{code:java}
myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load the 
provided entrypoint class.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
 [flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   | Caused by: 
org.apache.flink.client.program.ProgramInvocationException: The program's entry 
point class 'my.company.job.MyJob' was not found in the jar file.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 more
myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
my.company.job.MyJob
myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) 
~[?:?]myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 

[jira] [Updated] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Rasmus Thygesen (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rasmus Thygesen updated FLINK-35358:

Description: 
We have been using the following code snippet in our Dockerfiles for running a 
Flink job in application mode

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/artifacts/my-job.jar

USER flink {code}
 

 

Which has been working since at least around Flink 1.14, but the 1.19 update 
has broken our Dockerfiles. The fix is to put the jar file a step further out 
so the code snippet becomes

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar

USER flink  {code}
 

 

We have not spent too much time looking into what the cause is, but we get the 
stack trace

 

 
{code:java}
myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load the 
provided entrypoint class.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89)
 [flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   | Caused by: 
org.apache.flink.client.program.ProgramInvocationException: The program's entry 
point class 'my.company.job.MyJob' was not found in the jar file.
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 more
myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: 
my.company.job.MyJob
myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) 
~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:153)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:65) 
~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at 
org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228)
 ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 

[jira] [Commented] (FLINK-20392) Migrating bash e2e tests to Java/Docker

2024-05-15 Thread Lorenzo Affetti (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846535#comment-17846535
 ] 

Lorenzo Affetti commented on FLINK-20392:
-

[~m.orazow] and I found ourselves working on two issues under this umbrella:
 - https://issues.apache.org/jira/browse/FLINK-20398  (batch sql)
 - https://issues.apache.org/jira/browse/FLINK-20400  (streaming sql)

We contributed these 2 PRs:
 - [https://github.com/apache/flink/pull/24471]  (batch sql)
 - [https://github.com/apache/flink/pull/24776]  (streaming sql)

The "batch" PR makes use of the MiniClusterExtension while the "streaming" PR 
makes use of TestContainers and, specifically, FlinkContainers.
We think that the underlying purpose here (apart from porting tests from bash 
to Java) is to:
 - reflect their original bash behavior
 - adopt a homogeneous solution for (hopefully) every ported test.

[~m.orazow] and I had an offline discussion mainly focused on the nature of 
these end-to-end tests: to the best of our understanding, these tests are meant 
to test the end-to-end functionality of a "close-to-reality" Flink cluster. 
Indeed, their current behavior (in bash) consists of starting a local Flink 
cluster and issuing `flink run` commands against it. Moreover, some tests 
(e.g. flink-end-to-end-tests/test-scripts/test_queryable_state_restart_tm.sh) 
also test failure handling by killing a random TaskManager (as a side note, 
this umbrella does not have an issue per existing e2e test; we volunteer to 
sync up its state in the near future).

After that discussion, Muhammet and I believe that _the TestContainers 
approach is the one that best fits the idea behind this umbrella issue._

We list the PROs and CONs of both approaches below.

TestContainers :
 - PRO: the Flink cluster is as close as possible to reality (full TCP/IP stack 
involved, separate processes)
 - PRO: uses the Flink binaries in the Flink dist as the bash version does
 - PRO: handlers to control the cluster are available -> a random TM can be killed 
to check failure handling
 - CON: running tests requires Docker images to be built (time required)

MiniClusterExtension:
 - PRO: the code is very clean: no need to deploy a separate JAR for the Flink 
app, as it is coded in the test and the Flink cluster is registered in the 
current test environment
 - PRO: no further build steps required apart from the Java build (fast run)
 - CON: missing handlers for killing a TM (could be implemented by exposing 
MiniClusterResource from the extension)
 - CON: less "close-to-reality", as each service runs in a separate thread (not 
process); however, an in-depth analysis of the MiniClusterResource 
implementation should be conducted to understand how close this is to a real 
Flink cluster.
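
To make the comparison concrete, a minimal MiniClusterExtension-based JUnit 5 
test could look roughly like the sketch below (illustration only, not taken 
from either PR; it assumes the flink-test-utils and flink-streaming-java test 
dependencies, and the class/test names are made up):

{code:java}
import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.test.junit5.MiniClusterExtension;
import org.apache.flink.util.CloseableIterator;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;

import static org.junit.jupiter.api.Assertions.assertEquals;

class MiniClusterSketchITCase {

    // Registers an in-JVM Flink cluster shared by all tests in this class.
    @RegisterExtension
    static final MiniClusterExtension MINI_CLUSTER =
            new MiniClusterExtension(
                    new MiniClusterResourceConfiguration.Builder()
                            .setNumberTaskManagers(1)
                            .setNumberSlotsPerTaskManager(2)
                            .build());

    @Test
    void jobRunsOnMiniCluster() throws Exception {
        // getExecutionEnvironment() transparently targets the registered mini cluster.
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        long count = 0;
        try (CloseableIterator<Long> it = env.fromSequence(1, 100).executeAndCollect()) {
            while (it.hasNext()) {
                it.next();
                count++;
            }
        }
        assertEquals(100L, count);
    }
}
{code}

A FlinkContainers-based test would instead start JobManager/TaskManager 
containers built from the Flink dist and submit the packaged job jar to them, 
which is where both the extra realism and the Docker-build cost come from.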

*Wrapping up, we do think that if we want to mimic the current behavior of the 
tests as closely as possible, TestContainers is the way to go, at the cost of 
the Docker-build time of the containers in CI.*

We would like to gather agreement from at least 2 people on one of the 2 
solutions, without a strong opposing position from somebody else, so that we 
can adapt our PRs accordingly and continue with this massive port.

 

As they are active in the thread and experts on the topic, I suggest that 
[~mapohl] and [~jark] help us with this.

 

Thank you all for the support and helpful reviews so far.

> Migrating bash e2e tests to Java/Docker
> ---
>
> Key: FLINK-20392
> URL: https://issues.apache.org/jira/browse/FLINK-20392
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Test Infrastructure, Tests
>Reporter: Matthias Pohl
>Priority: Minor
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> starter
>
> This Jira issue serves as an umbrella ticket for single e2e test migration 
> tasks. This should enable us to migrate all bash-based e2e tests step-by-step.
> The goal is to utilize the e2e test framework (see 
> [flink-end-to-end-tests-common|https://github.com/apache/flink/tree/master/flink-end-to-end-tests/flink-end-to-end-tests-common]).
>  Ideally, the test should use Docker containers as much as possible 
> disconnect the execution from the environment. A good source to achieve that 
> is [testcontainers.org|https://www.testcontainers.org/].
> The related ML discussion is [Stop adding new bash-based e2e tests to 
> Flink|http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Stop-adding-new-bash-based-e2e-tests-to-Flink-td46607.html].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-35358) Breaking change when loading artifacts

2024-05-15 Thread Rasmus Thygesen (Jira)
Rasmus Thygesen created FLINK-35358:
---

 Summary: Breaking change when loading artifacts
 Key: FLINK-35358
 URL: https://issues.apache.org/jira/browse/FLINK-35358
 Project: Flink
  Issue Type: New Feature
  Components: Client / Job Submission, flink-docker
Affects Versions: 1.19.0
Reporter: Rasmus Thygesen


We have been using the following code snippet in our Dockerfiles for running a 
Flink job in application mode

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/artifacts/my-job.jar

USER flink {code}
 

 

This had been working since at least Flink 1.14, but the 1.19 update broke our 
Dockerfiles. The fix is to move the jar file one directory up, directly into 
usrlib, so the code snippet becomes

 

 
{code:java}
FROM flink:1.18.1-scala_2.12-java17

COPY --from=build /app/target/my-job*.jar /opt/flink/usrlib/my-job.jar

USER flink  {code}
 

 

We have not spent too much time looking into what the cause is, but we get the 
stack trace

 

 
{code:java}
myjob-jobmanager-1   | org.apache.flink.util.FlinkException: Could not load the provided entrypoint class.
myjob-jobmanager-1   |     at org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:230) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.getPackagedProgram(StandaloneApplicationClusterEntryPoint.java:149) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.lambda$main$0(StandaloneApplicationClusterEntryPoint.java:90) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.container.entrypoint.StandaloneApplicationClusterEntryPoint.main(StandaloneApplicationClusterEntryPoint.java:89) [flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   | Caused by: org.apache.flink.client.program.ProgramInvocationException: The program's entry point class 'my.company.job.MyJob' was not found in the jar file.
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:481) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:153) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:65) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     ... 4 more
myjob-jobmanager-1   | Caused by: java.lang.ClassNotFoundException: my.company.job.MyJob
myjob-jobmanager-1   |     at java.net.URLClassLoader.findClass(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:67) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:74) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:51) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.ClassLoader.loadClass(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.loadClass(FlinkUserCodeClassLoaders.java:197) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at java.lang.Class.forName0(Native Method) ~[?:?]
myjob-jobmanager-1   |     at java.lang.Class.forName(Unknown Source) ~[?:?]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:479) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:153) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:65) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:691) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |     at org.apache.flink.client.program.DefaultPackagedProgramRetriever.getPackagedProgram(DefaultPackagedProgramRetriever.java:228) ~[flink-dist-1.19.0.jar:1.19.0]
myjob-jobmanager-1   |

Re: [PR] [FLINK-34379][table] Fix adding catalogtable logic [flink]

2024-05-15 Thread via GitHub


flinkbot commented on PR #24788:
URL: https://github.com/apache/flink/pull/24788#issuecomment-2111798960

   
   ## CI report:
   
   * 5ceaf0e8216ebcdcb09ce4a0c9574962789ee9c3 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34379][table] Fix OutOfMemoryError with large queries [flink]

2024-05-15 Thread via GitHub


jeyhunkarimov commented on code in PR #24600:
URL: https://github.com/apache/flink/pull/24600#discussion_r1601116454


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java:
##
@@ -236,6 +238,9 @@ private void setTables(ContextResolvedTable catalogTable) {
 tables.add(catalogTable);
 } else {
 for (ContextResolvedTable thisTable : new ArrayList<>(tables)) 
{
+if (tables.contains(catalogTable)) {

Review Comment:
   Hi @mumuhhh, thanks for the ping and your suggestion! I think you are right. 
Now that I look at it in more detail, in addition to your suggestion, I think we 
can also remove the first `if` check in the method. I filed the patch: 
https://github.com/apache/flink/pull/24788



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34379][table] Fix OutOfMemoryError with large queries [flink]

2024-05-15 Thread via GitHub


jeyhunkarimov commented on code in PR #24600:
URL: https://github.com/apache/flink/pull/24600#discussion_r1601116454


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java:
##
@@ -236,6 +238,9 @@ private void setTables(ContextResolvedTable catalogTable) {
 tables.add(catalogTable);
 } else {
 for (ContextResolvedTable thisTable : new ArrayList<>(tables)) 
{
+if (tables.contains(catalogTable)) {

Review Comment:
   Hi @mumuhhh, thanks for the ping. I think you are right. Now that I look at 
it in more detail, in addition to your suggestion, I think we can also remove 
the first `if` check in the method. I filed the patch: 
https://github.com/apache/flink/pull/24788



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34379][table] Fix OutOfMemoryError with large queries [flink]

2024-05-15 Thread via GitHub


jeyhunkarimov commented on code in PR #24600:
URL: https://github.com/apache/flink/pull/24600#discussion_r1601116454


##
flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/utils/DynamicPartitionPruningUtils.java:
##
@@ -236,6 +238,9 @@ private void setTables(ContextResolvedTable catalogTable) {
 tables.add(catalogTable);
 } else {
 for (ContextResolvedTable thisTable : new ArrayList<>(tables)) 
{
+if (tables.contains(catalogTable)) {

Review Comment:
   Hi @mumuhhh, thanks for the ping. I think you are right. Now that I look at 
it in more detail, in addition to your suggestion, I think we can also remove 
the first `if` check in the method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35351][checkpoint] Fix fail during restore from unaligned chec… [flink]

2024-05-15 Thread via GitHub


ldadima commented on PR #24784:
URL: https://github.com/apache/flink/pull/24784#issuecomment-2111794984

   > ## CI report:
   > * 
[309cf5b](https://github.com/apache/flink/commit/309cf5b7c295a56527b71ac0a1869ce06cd3a244)
 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=59567)
   > 
   > Bot commands
   
   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Reopened] (FLINK-34379) table.optimizer.dynamic-filtering.enabled lead to OutOfMemoryError

2024-05-15 Thread Jeyhun Karimov (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeyhun Karimov reopened FLINK-34379:


Reopening because a patch is needed, as a result of the comment: 
https://github.com/apache/flink/pull/24600#discussion_r1600843684

> table.optimizer.dynamic-filtering.enabled lead to OutOfMemoryError
> --
>
> Key: FLINK-34379
> URL: https://issues.apache.org/jira/browse/FLINK-34379
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.17.2, 1.18.1
> Environment: 1.17.1
>Reporter: zhu
>Assignee: Jeyhun Karimov
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.17.3, 1.18.2, 1.20.0, 1.19.1
>
>
> When using batch computing, I union all about 50 tables and then join another 
> table. When compiling the execution plan, 
> an OutOfMemoryError: Java heap space is thrown, which was not a problem in 
> 1.15.2. However, both 1.17.2 and 1.18.1 throw the same error, which causes the 
> jobmanager to restart. Currently, it has been found that this is caused by 
> table.optimizer.dynamic-filtering.enabled, which defaults to true. When I set 
> table.optimizer.dynamic-filtering.enabled to false, it can be compiled and 
> executed normally
> code
> TableEnvironment.create(EnvironmentSettings.newInstance()
> .withConfiguration(configuration)
> .inBatchMode().build())
> sql=select att,filename,'table0' as mo_name from table0 UNION All select 
> att,filename,'table1' as mo_name from table1 UNION All select 
> att,filename,'table2' as mo_name from table2 UNION All select 
> att,filename,'table3' as mo_name from table3 UNION All select 
> att,filename,'table4' as mo_name from table4 UNION All select 
> att,filename,'table5' as mo_name from table5 UNION All select 
> att,filename,'table6' as mo_name from table6 UNION All select 
> att,filename,'table7' as mo_name from table7 UNION All select 
> att,filename,'table8' as mo_name from table8 UNION All select 
> att,filename,'table9' as mo_name from table9 UNION All select 
> att,filename,'table10' as mo_name from table10 UNION All select 
> att,filename,'table11' as mo_name from table11 UNION All select 
> att,filename,'table12' as mo_name from table12 UNION All select 
> att,filename,'table13' as mo_name from table13 UNION All select 
> att,filename,'table14' as mo_name from table14 UNION All select 
> att,filename,'table15' as mo_name from table15 UNION All select 
> att,filename,'table16' as mo_name from table16 UNION All select 
> att,filename,'table17' as mo_name from table17 UNION All select 
> att,filename,'table18' as mo_name from table18 UNION All select 
> att,filename,'table19' as mo_name from table19 UNION All select 
> att,filename,'table20' as mo_name from table20 UNION All select 
> att,filename,'table21' as mo_name from table21 UNION All select 
> att,filename,'table22' as mo_name from table22 UNION All select 
> att,filename,'table23' as mo_name from table23 UNION All select 
> att,filename,'table24' as mo_name from table24 UNION All select 
> att,filename,'table25' as mo_name from table25 UNION All select 
> att,filename,'table26' as mo_name from table26 UNION All select 
> att,filename,'table27' as mo_name from table27 UNION All select 
> att,filename,'table28' as mo_name from table28 UNION All select 
> att,filename,'table29' as mo_name from table29 UNION All select 
> att,filename,'table30' as mo_name from table30 UNION All select 
> att,filename,'table31' as mo_name from table31 UNION All select 
> att,filename,'table32' as mo_name from table32 UNION All select 
> att,filename,'table33' as mo_name from table33 UNION All select 
> att,filename,'table34' as mo_name from table34 UNION All select 
> att,filename,'table35' as mo_name from table35 UNION All select 
> att,filename,'table36' as mo_name from table36 UNION All select 
> att,filename,'table37' as mo_name from table37 UNION All select 
> att,filename,'table38' as mo_name from table38 UNION All select 
> att,filename,'table39' as mo_name from table39 UNION All select 
> att,filename,'table40' as mo_name from table40 UNION All select 
> att,filename,'table41' as mo_name from table41 UNION All select 
> att,filename,'table42' as mo_name from table42 UNION All select 
> att,filename,'table43' as mo_name from table43 UNION All select 
> att,filename,'table44' as mo_name from table44 UNION All select 
> att,filename,'table45' as mo_name from table45 UNION All select 
> att,filename,'table46' as mo_name from table46 UNION All select 
> att,filename,'table47' as mo_name from table47 UNION All select 
> att,filename,'table48' as mo_name from table48 UNION All select 
> att,filename,'table49' as mo_name from table49 UNION All select 
> att,filename,'table50' as mo_name from 

[PR] [FLINK-34379][table] Fix adding catalogtable logic [flink]

2024-05-15 Thread via GitHub


jeyhunkarimov opened a new pull request, #24788:
URL: https://github.com/apache/flink/pull/24788

   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   
   
   ## Brief change log
   
 - Fix the catalog table adding logic
   
   
   ## Verifying this change
   
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35109] Drop support for Flink 1.17 & 1.18 and fix tests for 1.20-SNAPSHOT [flink-connector-kafka]

2024-05-15 Thread via GitHub


MartijnVisser commented on code in PR #102:
URL: 
https://github.com/apache/flink-connector-kafka/pull/102#discussion_r1601097465


##
.github/workflows/push_pr.yml:
##
@@ -28,21 +28,16 @@ jobs:
   compile_and_test:
 strategy:
   matrix:
-flink: [ 1.17.2 ]
-jdk: [ '8, 11' ]
-include:
-  - flink: 1.18.1
-jdk: '8, 11, 17'
-  - flink: 1.19.0
-jdk: '8, 11, 17, 21'
+flink: [ 1.19.0, 1.20-SNAPSHOT ]
+jdk: [ '8, 11, 17, 21' ]

Review Comment:
   > May be I didn't get your message about java 21 support, however it is 
mentioned in 1.19 release notes
   
   I mixed up Flink 1.18 and 1.19, sorry. My comment was incorrect.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-20398][e2e] Migrate test_batch_sql.sh to Java e2e tests framework [flink]

2024-05-15 Thread via GitHub


XComp commented on code in PR #24471:
URL: https://github.com/apache/flink/pull/24471#discussion_r1600075906


##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/GeneratedRow.java:
##
@@ -0,0 +1,169 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.table.data.ArrayData;
+import org.apache.flink.table.data.DecimalData;
+import org.apache.flink.table.data.MapData;
+import org.apache.flink.table.data.RawValueData;
+import org.apache.flink.table.data.RowData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.types.RowKind;
+
+import java.io.Serializable;
+import java.time.LocalDateTime;
+import java.util.Objects;
+
+class GeneratedRow implements RowData, Serializable {

Review Comment:
   Was there a reason to implement a dedicated `RowData`subclass rather than 
using the `Row`class like it's done in the original test code? 樂 



##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.table.data.RowData;
+
+import org.jetbrains.annotations.NotNull;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+
+class Generator implements Iterator<RowData>, Iterable<RowData> {
+final int numKeys;

Review Comment:
   ```suggestion
   private final int numKeys;
   ```
   There is no real reason to make this field package-private, is there?



##
flink-end-to-end-tests/flink-batch-sql-test/src/test/java/org/apache/flink/sql/tests/Generator.java:
##
@@ -0,0 +1,83 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.sql.tests;
+
+import org.apache.flink.table.data.RowData;
+
+import org.jetbrains.annotations.NotNull;
+
+import java.time.Instant;
+import java.time.LocalDateTime;
+import java.time.ZoneOffset;
+import java.util.Iterator;
+import java.util.NoSuchElementException;
+
+class Generator implements Iterator<RowData>, Iterable<RowData> {
+final int numKeys;
+
+private int keyIndex = 0;
+
+private final long durationMs;
+private final long stepMs;
+private final long offsetMs;
+private long ms = 0;
+
+static Generator create(
+int numKeys, float rowsPerKeyAndSecond, int durationSeconds, int 
offsetSeconds) {
+int sleepMs = (int) (1000 / rowsPerKeyAndSecond);
+return 

Re: [PR] [FLINK-34615]Split `ExternalizedCheckpointCleanup` out of `Checkpoint… [flink]

2024-05-15 Thread via GitHub


spoon-lz commented on PR #24461:
URL: https://github.com/apache/flink/pull/24461#issuecomment-2111766768

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35030][runtime] Introduce Epoch Manager under async execution [flink]

2024-05-15 Thread via GitHub


fredia commented on code in PR #24748:
URL: https://github.com/apache/flink/pull/24748#discussion_r1601082456


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/asyncprocessing/AbstractAsyncStateStreamOperatorV2.java:
##
@@ -188,6 +190,35 @@ public  InternalTimerService 
getInternalTimerService(
 (AsyncExecutionController) asyncExecutionController);
 }
 
+@Override
+public void processWatermark(Watermark mark) throws Exception {
+if (!isAsyncStateProcessingEnabled()) {
+super.processWatermark(mark);
+return;
+}
+asyncExecutionController.processNonRecord(() -> 
super.processWatermark(mark));

Review Comment:
   `AbstractStreamOperatorV2` does not implement `processWatermark1`, 
`processWatermark2` or `processWatermark(Watermark, int)`; if those methods 
are added in the future, I will override them.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Comment Edited] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-15 Thread Roman Boyko (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846520#comment-17846520
 ] 

Roman Boyko edited comment on FLINK-34380 at 5/15/24 7:14 AM:
--

Hi [~xuyangzhong]! Thank you for your reply.

Yes, you're right - the RowKind is still not fixed in this example. But I think 
we should consider fixing the RowKind in a separate issue, because:

1) The incorrect RowKind in your example is a common problem of the MiniBatch 
functionality. It happens every time +I and -U records are assigned to the 
first batch and the +U record is assigned to the second batch. And it can't be 
fixed easily and only for the Join operator - we should try to reproduce the 
same for the Aggregate operator and fix it as well.

2) While an incorrect RowKind is not such a serious problem, the incorrect order 
of output records might be really critical because it leads to incorrect results.

So I suggest fixing only the incorrect order in this issue and creating a 
separate one for the incorrect RowKind.


was (Author: rovboyko):
Hi [~xuyangzhong]! Thank you for your reply.

Yes, you're right - the RowKind is still not fixed in this example. But I think 
we should consider fixing the RowKind in a separate issue, because:

1) The incorrect RowKind in your example is a common problem of the MiniBatch 
functionality. It happens every time +I and -U records are assigned to the 
first batch and the +U record is assigned to the second batch. And it can't be 
fixed easily and only for the Join operator - we should try to reproduce the 
same for the Aggregate operator as well.

2) While an incorrect RowKind is not such a serious problem, the incorrect order 
of output records might be really critical because it leads to incorrect results.

So I suggest fixing only the incorrect order in this issue and creating a 
separate one for the incorrect RowKind.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, 
> Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, 
> Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = 
> t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> ++-+-+-+-+
> | op |           a |               b |          a0 |      b0 |
> ++-+-+-+-+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> ++-+-+-+-+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-15 Thread Roman Boyko (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846520#comment-17846520
 ] 

Roman Boyko commented on FLINK-34380:
-

Hi [~xuyangzhong]! Thank you for your reply.

Yes, you're right - the RowKind is still not fixed in this example. But I think 
we should consider fixing the RowKind in a separate issue, because:

1) The incorrect RowKind in your example is a common problem of the MiniBatch 
functionality. It happens every time +I and -U records are assigned to the 
first batch and the +U record is assigned to the second batch. And it can't be 
fixed easily and only for the Join operator - we should try to reproduce the 
same for the Aggregate operator as well.

2) While an incorrect RowKind is not such a serious problem, the incorrect order 
of output records might be really critical because it leads to incorrect results.

So I suggest fixing only the incorrect order in this issue and creating a 
separate one for the incorrect RowKind.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, 
> Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, 
> Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = 
> t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> ++-+-+-+-+
> | op |           a |               b |          a0 |      b0 |
> ++-+-+-+-+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> ++-+-+-+-+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-15 Thread Roman Boyko (Jira)


[ https://issues.apache.org/jira/browse/FLINK-34380 ]


Roman Boyko deleted comment on FLINK-34380:
-

was (Author: rovboyko):
Hi [~xuyangzhong] ! Thank you for your reply.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, 
> Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, 
> Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = 
> t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> ++-+-+-+-+
> | op |           a |               b |          a0 |      b0 |
> ++-+-+-+-+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> ++-+-+-+-+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34380) Strange RowKind and records about intermediate output when using minibatch join

2024-05-15 Thread Roman Boyko (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846516#comment-17846516
 ] 

Roman Boyko commented on FLINK-34380:
-

Hi [~xuyangzhong] ! Thank you for your reply.

> Strange RowKind and records about intermediate output when using minibatch 
> join
> ---
>
> Key: FLINK-34380
> URL: https://issues.apache.org/jira/browse/FLINK-34380
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Runtime
>Affects Versions: 1.19.0
>Reporter: xuyang
>Priority: Major
> Fix For: 1.20.0
>
>
> {code:java}
> // Add it in CalcItCase
> @Test
>   def test(): Unit = {
> env.setParallelism(1)
> val rows = Seq(
>   changelogRow("+I", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("-U", java.lang.Integer.valueOf(1), "1"),
>   changelogRow("+U", java.lang.Integer.valueOf(1), "99"),
>   changelogRow("-D", java.lang.Integer.valueOf(1), "99")
> )
> val dataId = TestValuesTableFactory.registerData(rows)
> val ddl =
>   s"""
>  |CREATE TABLE t1 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl)
> val ddl2 =
>   s"""
>  |CREATE TABLE t2 (
>  |  a int,
>  |  b string
>  |) WITH (
>  |  'connector' = 'values',
>  |  'data-id' = '$dataId',
>  |  'bounded' = 'false'
>  |)
>""".stripMargin
> tEnv.executeSql(ddl2)
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ENABLED, 
> Boolean.box(true))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_ALLOW_LATENCY, 
> Duration.ofSeconds(5))
> tEnv.getConfig.getConfiguration
>   .set(ExecutionConfigOptions.TABLE_EXEC_MINIBATCH_SIZE, Long.box(3L))
> println(tEnv.sqlQuery("SELECT * from t1 join t2 on t1.a = 
> t2.a").explain())
> tEnv.executeSql("SELECT * from t1 join t2 on t1.a = t2.a").print()
>   } {code}
> Output:
> {code:java}
> ++-+-+-+-+
> | op |           a |               b |          a0 |      b0 |
> ++-+-+-+-+
> | +U |           1 |               1 |           1 |      99 |
> | +U |           1 |              99 |           1 |      99 |
> | -U |           1 |               1 |           1 |      99 |
> | -D |           1 |              99 |           1 |      99 |
> ++-+-+-+-+{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-33925) Extended failure handling for bulk requests (elasticsearch back port)

2024-05-15 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin updated FLINK-33925:

Fix Version/s: opensearch-2.0.0

> Extended failure handling for bulk requests (elasticsearch back port)
> -
>
> Key: FLINK-33925
> URL: https://issues.apache.org/jira/browse/FLINK-33925
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Opensearch
>Affects Versions: opensearch-1.0.1
>Reporter: Peter Schulz
>Assignee: Peter Schulz
>Priority: Major
>  Labels: pull-request-available
> Fix For: opensearch-1.2.0, opensearch-2.0.0
>
>
> This is a back port of the implementation for the elasticsearch connector, 
> see FLINK-32028, to achieve consistent APIs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34470][Connectors/Kafka] Fix indefinite blocking by adjusting stopping condition in split reader [flink-connector-kafka]

2024-05-15 Thread via GitHub


dongwoo6kim commented on code in PR #100:
URL: 
https://github.com/apache/flink-connector-kafka/pull/100#discussion_r1601015487


##
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java:
##
@@ -122,27 +122,21 @@ public RecordsWithSplitIds> fetch() throws IOExce
 KafkaPartitionSplitRecords recordsBySplits =
 new KafkaPartitionSplitRecords(consumerRecords, 
kafkaSourceReaderMetrics);
 List finishedPartitions = new ArrayList<>();
-for (TopicPartition tp : consumerRecords.partitions()) {
+for (TopicPartition tp : consumer.assignment()) {
 long stoppingOffset = getStoppingOffset(tp);
-final List> recordsFromPartition =
-consumerRecords.records(tp);
-
-if (recordsFromPartition.size() > 0) {
-final ConsumerRecord lastRecord =
-recordsFromPartition.get(recordsFromPartition.size() - 
1);
-
-// After processing a record with offset of "stoppingOffset - 
1", the split reader
-// should not continue fetching because the record with 
stoppingOffset may not
-// exist. Keep polling will just block forever.
-if (lastRecord.offset() >= stoppingOffset - 1) {
-recordsBySplits.setPartitionStoppingOffset(tp, 
stoppingOffset);
-finishSplitAtRecord(
-tp,
-stoppingOffset,
-lastRecord.offset(),
-finishedPartitions,
-recordsBySplits);
-}
+long consumerPosition = consumer.position(tp);
+// Stop fetching when the consumer's position reaches the 
stoppingOffset.
+// Control messages may follow the last record; therefore, using 
the last record's
+// offset as a stopping condition could result in indefinite 
blocking.
+if (consumerPosition >= stoppingOffset) {
+LOG.debug(
+"Position of {}: {}, has reached stopping offset: {}",
+tp,
+consumerPosition,
+stoppingOffset);
+recordsBySplits.setPartitionStoppingOffset(tp, stoppingOffset);
+finishSplitAtRecord(
+tp, stoppingOffset, consumerPosition, 
finishedPartitions, recordsBySplits);
 }
 // Track this partition's record lag if it never appears before
 kafkaSourceReaderMetrics.maybeAddRecordsLagMetric(consumer, tp);

Review Comment:
   @LinMingQiang, thanks for the review. 
   I've changed it to track the tp only when there is a record for that tp.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-30537) Add support for OpenSearch 2.3

2024-05-15 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-30537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin closed FLINK-30537.
---
Resolution: Fixed

Closed in favor of FLINK-33859

> Add support for OpenSearch 2.3
> --
>
> Key: FLINK-30537
> URL: https://issues.apache.org/jira/browse/FLINK-30537
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Opensearch
>Reporter: Martijn Visser
>Priority: Minor
>  Labels: auto-deprioritized-major, pull-request-available
>
> Create a version for Flink’s Opensearch connector that supports version 2.3.
> From the ASF Flink Slack: 
> https://apache-flink.slack.com/archives/C03GV7L3G2C/p1672339157102319



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-34942) Support Flink 1.19, 1.20-SNAPSHOT for OpenSearch connector

2024-05-15 Thread Sergey Nuyanzin (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-34942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Nuyanzin resolved FLINK-34942.
-
Fix Version/s: opensearch-1.2.0
   opensearch-2.0.0
   Resolution: Fixed

> Support Flink 1.19, 1.20-SNAPSHOT for OpenSearch connector
> --
>
> Key: FLINK-34942
> URL: https://issues.apache.org/jira/browse/FLINK-34942
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Opensearch
>Affects Versions: 3.1.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: opensearch-1.2.0, opensearch-2.0.0
>
>
> Currently it fails with similar issue as FLINK-33493



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-34942) Support Flink 1.19, 1.20-SNAPSHOT for OpenSearch connector

2024-05-15 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-34942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846511#comment-17846511
 ] 

Sergey Nuyanzin commented on FLINK-34942:
-

Merged as 
[00f1a5b13bfbadcb8efce8e16fb06ddea0d8e48e|https://github.com/apache/flink-connector-opensearch/commit/00f1a5b13bfbadcb8efce8e16fb06ddea0d8e48e]

> Support Flink 1.19, 1.20-SNAPSHOT for OpenSearch connector
> --
>
> Key: FLINK-34942
> URL: https://issues.apache.org/jira/browse/FLINK-34942
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Opensearch
>Affects Versions: 3.1.0
>Reporter: Sergey Nuyanzin
>Assignee: Sergey Nuyanzin
>Priority: Major
>  Labels: pull-request-available
> Fix For: opensearch-1.2.0, opensearch-2.0.0
>
>
> Currently it fails with similar issue as FLINK-33493



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [PR] [FLINK-34470][Connectors/Kafka] Fix indefinite blocking by adjusting stopping condition in split reader [flink-connector-kafka]

2024-05-15 Thread via GitHub


dongwoo6kim commented on code in PR #100:
URL: 
https://github.com/apache/flink-connector-kafka/pull/100#discussion_r1601015487


##
flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/source/reader/KafkaPartitionSplitReader.java:
##
@@ -122,27 +122,21 @@ public RecordsWithSplitIds> fetch() throws IOExce
 KafkaPartitionSplitRecords recordsBySplits =
 new KafkaPartitionSplitRecords(consumerRecords, 
kafkaSourceReaderMetrics);
 List finishedPartitions = new ArrayList<>();
-for (TopicPartition tp : consumerRecords.partitions()) {
+for (TopicPartition tp : consumer.assignment()) {
 long stoppingOffset = getStoppingOffset(tp);
-final List> recordsFromPartition =
-consumerRecords.records(tp);
-
-if (recordsFromPartition.size() > 0) {
-final ConsumerRecord lastRecord =
-recordsFromPartition.get(recordsFromPartition.size() - 
1);
-
-// After processing a record with offset of "stoppingOffset - 
1", the split reader
-// should not continue fetching because the record with 
stoppingOffset may not
-// exist. Keep polling will just block forever.
-if (lastRecord.offset() >= stoppingOffset - 1) {
-recordsBySplits.setPartitionStoppingOffset(tp, 
stoppingOffset);
-finishSplitAtRecord(
-tp,
-stoppingOffset,
-lastRecord.offset(),
-finishedPartitions,
-recordsBySplits);
-}
+long consumerPosition = consumer.position(tp);
+// Stop fetching when the consumer's position reaches the 
stoppingOffset.
+// Control messages may follow the last record; therefore, using 
the last record's
+// offset as a stopping condition could result in indefinite 
blocking.
+if (consumerPosition >= stoppingOffset) {
+LOG.debug(
+"Position of {}: {}, has reached stopping offset: {}",
+tp,
+consumerPosition,
+stoppingOffset);
+recordsBySplits.setPartitionStoppingOffset(tp, stoppingOffset);
+finishSplitAtRecord(
+tp, stoppingOffset, consumerPosition, 
finishedPartitions, recordsBySplits);
 }
 // Track this partition's record lag if it never appears before
 kafkaSourceReaderMetrics.maybeAddRecordsLagMetric(consumer, tp);

Review Comment:
   @LinMingQiang, thanks for the review. 
   I've changed it to track the tp only when the records for that tp are not empty.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35109] Drop support for Flink 1.17 & 1.18 and fix tests for 1.20-SNAPSHOT [flink-connector-kafka]

2024-05-15 Thread via GitHub


snuyanzin commented on code in PR #102:
URL: 
https://github.com/apache/flink-connector-kafka/pull/102#discussion_r1601012883


##
.github/workflows/push_pr.yml:
##
@@ -28,21 +28,16 @@ jobs:
   compile_and_test:
 strategy:
   matrix:
-flink: [ 1.17.2 ]
-jdk: [ '8, 11' ]
-include:
-  - flink: 1.18.1
-jdk: '8, 11, 17'
-  - flink: 1.19.0
-jdk: '8, 11, 17, 21'
+flink: [ 1.19.0, 1.20-SNAPSHOT ]
+jdk: [ '8, 11, 17, 21' ]

Review Comment:
   Maybe I didn't get your message about Java 21 support; however, it is 
mentioned in the 1.19 release notes [1]
   Also, we have nightly jdk21 test runs in the Flink main repo (same as for 
jdk17) starting with 1.19, e.g. [2]
   
   [1] 
https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/#beta-support-for-java-21
   [2] 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=59560=results



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-35109] Drop support for Flink 1.17 & 1.18 and fix tests for 1.20-SNAPSHOT [flink-connector-kafka]

2024-05-15 Thread via GitHub


snuyanzin commented on code in PR #102:
URL: 
https://github.com/apache/flink-connector-kafka/pull/102#discussion_r1601012883


##
.github/workflows/push_pr.yml:
##
@@ -28,21 +28,16 @@ jobs:
   compile_and_test:
 strategy:
   matrix:
-flink: [ 1.17.2 ]
-jdk: [ '8, 11' ]
-include:
-  - flink: 1.18.1
-jdk: '8, 11, 17'
-  - flink: 1.19.0
-jdk: '8, 11, 17, 21'
+flink: [ 1.19.0, 1.20-SNAPSHOT ]
+jdk: [ '8, 11, 17, 21' ]

Review Comment:
   Maybe I didn't get your message about Java 21 support; however, it is 
mentioned in the 1.19 release notes [1]
   Also, we have nightly jdk21 test runs (same as for jdk17) starting with 1.19, 
e.g. [2]
   
   [1] 
https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/#beta-support-for-java-21
   [2] 
https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=59560=results



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



<    1   2