Re: Adjusted frame length exceeds 2147483647

2022-03-17 Thread Matthias Pohl
Hi Ori, that looks odd. The message seems to exceed the maximum size of 2147483647 bytes (2GB). I couldn't find anything similar in the ML or in Jira that supports a bug in Flink. Could it be that there was some network issue? Matthias On Tue, Mar 15, 2022 at 6:52 AM Ori Popowski wrote: > I am

Re: how to set kafka sink ssl properties

2022-03-17 Thread Matthias Pohl
Could you share more details on what's not working? Is the ssl.truststore.location accessible from the Flink nodes? Matthias On Thu, Mar 17, 2022 at 4:00 PM HG wrote: > Hi all, > I am probably not the smartest but I cannot find how to set ssl-properties > for a Kafka Sink. > My assumption was tha
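
For the Kafka sink SSL question above, here is a minimal sketch of the producer properties involved. It uses only `java.util.Properties`. The property keys are standard Kafka client settings; the class name, the truststore path, and the password are placeholders and do not come from the thread. With Flink's `KafkaSink` builder, such properties would typically be passed via `setKafkaProducerConfig(...)`, assuming a connector version that offers that method.

```java
import java.util.Properties;

public class KafkaSslProps {

    /** Builds producer properties for an SSL-secured Kafka cluster (illustrative only). */
    public static Properties build(String truststorePath, String truststorePassword) {
        Properties props = new Properties();
        // Tell the Kafka client to talk TLS to the brokers.
        props.setProperty("security.protocol", "SSL");
        // The truststore file must be readable on every Flink node that runs the sink.
        props.setProperty("ssl.truststore.location", truststorePath);
        props.setProperty("ssl.truststore.password", truststorePassword);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; replace with the real path and secret.
        Properties props = build("/path/to/truststore.jks", "changeit");
        System.out.println(props.getProperty("ssl.truststore.location"));
    }
}
```

If the sink still fails with these settings, checking that the truststore path exists on each TaskManager is a reasonable first step, which is what the reply above asks about.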

Re: Adjusted frame length exceeds 2147483647

2022-03-17 Thread Matthias Pohl
> This issue did not repeat, so it may be a network issue > > On Thu, Mar 17, 2022 at 6:12 PM Matthias Pohl wrote: > >> Hi Ori, >> that looks odd. The message seems to exceed the maximum size >> of 2147483647 bytes (2GB). I couldn't find anything similar in the ML or

Re: Jobmanager trying to be registered for Zombie Job

2022-04-21 Thread Matthias Pohl
Hi Peter, thanks for sharing. That doesn't sound right. Could you provide the entire jobmanager logs? Best, Matthias On Thu, Apr 21, 2022 at 6:08 PM Peter Schrott wrote: > Hi Flink-Users, > > I am not sure if this does something to my cluster or not. But since > updating to Flink 1.15 (atm rc4) I

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
...if possible it would be good to get debug rather than only info logs. Did you encounter anything odd in the TaskManager logs as well? Sharing those might be of value, too. On Fri, Apr 22, 2022 at 8:57 AM Matthias Pohl wrote: > Hi Peter, > thanks for sharing. That doesn't sound

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
s) is that stopping the JobMaster didn't finish for some reason. For that it would be helpful to look at the logs to see whether there is some other issue that causes the JobMaster to stop entirely. On Fri, Apr 22, 2022 at 10:14 AM Matthias Pohl wrote: > ...if possible it would be good t

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
e.org/jira/browse/FLINK-27354 On Fri, Apr 22, 2022 at 11:54 AM Matthias Pohl wrote: > Just by looking through the code, it appears that these logs could be > produced while stopping the job. The ResourceManager sends a confirmation > of the JobMaster being disconnected at the end back t

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Matthias Pohl
It can be seen on jm 1 that > the job starts crashing and recovering a few times. This happens > until 2022-04-20 12:12:14,607. After that the above described behavior can > be seen. > > I hope this helps. > > Best, Peter > > On Fri, Apr 22, 2022 at 12:06 PM Matthias Poh

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Matthias Pohl
find more details on the investigation in FLINK-27354 [1] itself. Best, Matthias [1] https://issues.apache.org/jira/browse/FLINK-27354 On Mon, Apr 25, 2022 at 2:00 PM Matthias Pohl wrote: > Thanks Peter, we're looking into it... > > On Mon, Apr 25, 2022 at 11:54 AM Peter S

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Matthias Pohl
& thanks a lot for your help too! > > It's not quite clear to me, the bug was already there since 1.13.6 but not > reported yet (FLINK-27354 is a new ticket)? > > Best, Peter > > > On Mon, Apr 25, 2022 at 5:48 PM Matthias Pohl > wrote: > >> Thanks again

Re: Failing to maven compile install Flink 1.15

2022-08-22 Thread Matthias Pohl via user
Hi hjw, it would be interesting to know the exact Maven commands you used for the successful run (where you compiled the flink-client module individually) and the failed run (where you tried to build everything at once) and probably a more complete version of the Maven output. The path D:\learn\Co

Re: flink ci build run longer than the maximum time of 310 minutes.

2022-09-02 Thread Matthias Pohl via user
Not sure whether that applies to your case, but there was a recent issue [1] where the e2e_1_ci job ran into a timeout. If that's what you were observing, rebasing your branch might help. Best, Matthias [1] https://issues.apache.org/jira/browse/FLINK-29161 On Fri, Sep 2, 2022 at 10:51 AM Martijn

Re: flink ci build run longer than the maximum time of 310 minutes.

2022-09-05 Thread Matthias Pohl via user
1 > commits > <https://github.com/SwimSweet/flink/compare/release-1.15...apache:flink:release-1.15> > behind > <https://github.com/SwimSweet/flink/compare/release-1.15...apache:flink:release-1.15> > apache:release-1.15 > also appear in my pr change files. How can I

Re: Slow Tests in Flink 1.15

2022-09-06 Thread Matthias Pohl via user
Hi David, I guess, you're referring to [1]. But as Chesnay already pointed out in the previous thread: It would be helpful to get more insights into what exactly your tests are executing (logs, code, ...). That would help identifying the cause. > Can you give us a more complete stacktrace so we can

Re: New licensing for Akka

2022-09-07 Thread Matthias Pohl via user
There is some more discussion going on in the related PR [1]. Based on the current state of the discussion, akka 2.6.20 will be the last version under Apache 2.0 license. But, I guess, we'll have to see where this discussion is heading considering that it's kind of fresh. [1] https://github.com/ak

Re: New licensing for Akka

2022-09-09 Thread Matthias Pohl via user
Looks like there will be a bit of a grace period till Sep 2023 for vulnerability fixes in akka 2.6.x [1] [1] https://discuss.lightbend.com/t/2-6-x-maintenance-proposal/9949 On Wed, Sep 7, 2022 at 4:30 PM Robin Cassan via user wrote: > Thanks a lot for your answers, this is reassuring! > > Cheer

Re: Classloading issues with Flink Operator / Kubernetes Native

2022-09-16 Thread Matthias Pohl via user
Are you deploying the job in session or application mode? Could you provide the stacktrace? I'm wondering whether that would help to pin down a code location for further investigation. So far, I couldn't come up with a definite answer about placing the jar in the lib directory. Initially, I would

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
Hi Ramkrishna, thanks for reaching out to the Flink community. Could you share the JobManager logs to get a better understanding of what's going on? I'm wondering why the JobManager is failing when the actual problem is that the job is struggling to access a folder. It sounds like there are multipl

Re: JobManager restarts on job failure

2022-09-26 Thread Matthias Pohl via user
Thanks Evgeniy for reaching out to the community and Gyula for picking it up. I haven't looked into the k8s operator in much detail, yet. So, help me out if I miss something here. But I'm afraid that this is not something that would be fixed by upgrading to 1.15. The issue here is that we're recove

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
Sep 26, 2022 at 3:11 PM ramkrishna vasudevan < > ramvasu.fl...@gmail.com> wrote: > >> Thank you very much for the reply. I have lost the k8s cluster in this >> case before I could capture the logs. I will try to repro this and get back >> to you. >> >> Regards >

Re: JobManager restarts on job failure

2022-09-26 Thread Matthias Pohl via user
with the following configs > enabled: > > SHUTDOWN_ON_APPLICATION_FINISH = false > SUBMIT_FAILED_JOB_ON_APPLICATION_ERROR = true > > I think jobmanager pod would not restart but simply go to a terminal > failed state right? > > Gyula > > On Mon, Sep 26, 2022 at 12:31

Re: Jobmanager fails to come up if the job has an issue

2022-09-26 Thread Matthias Pohl via user
Yes, the JobManager will failover in HA mode and all jobs would be recovered. On Mon, Sep 26, 2022 at 2:06 PM ramkrishna vasudevan < ramvasu.fl...@gmail.com> wrote: > Thanks @Matthias Pohl . This is informative. So > generally in a session cluster if I have more than one job and

Re: Cancel a job in status INITIALIZING

2022-09-26 Thread Matthias Pohl via user
Can you provide the JobManager logs for this case? It sounds odd that the job was stuck in the INITIALIZING phase. Matthias On Wed, Sep 21, 2022 at 11:50 AM Christian Lorenz via user < user@flink.apache.org> wrote: > Hi, > > > > we’re running a Flink Cluster in standalone/session mode. During a

Re: jobmaster's fatal error will kill the session cluster

2022-10-14 Thread Matthias Pohl via user
Hi Jie Han, welcome to the community. Just a little side note: These kinds of questions are more suitable to be asked in the user mailing list. The dev mailing list is rather used for discussing feature development or project-related topics. See [1] for further details. About your question: The st

Re: Sometimes checkpoints to s3 fail

2022-10-14 Thread Matthias Pohl via user
Hi Evgeniy, is it Ceph which you're using as an S3 server? All the Google search entries point to Ceph when looking for the error message. Could it be that there's a problem with the version of the underlying system? The stacktrace you provided looks like Flink struggles to close the File and, there

Re: jobmaster's fatal error will kill the session cluster

2022-10-17 Thread Matthias Pohl via user
[?:?] > at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) > ~[flink-scala_2.12-1.15.0.jar:1.15.0] > at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) > ~[flink-scala_2.12-1.15.0.jar:1.15.0] > at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:

Re: Watermark generating mechanism in Flink SQL

2022-10-17 Thread Matthias Pohl via user
Hi Hunk, there is documentation about watermarking in FlinkSQL [1]. There is also a FlinkSQL cookbook entry about watermarking [2]. Essentially, you define the watermark strategy in your CREATE TABLE statement and specify the lateness for a given event (not the period in which watermarks are automa
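
As a rough illustration of defining a watermark strategy in a `CREATE TABLE` statement, the sketch below assembles such a DDL string in Java. The table name `events`, the columns, and the `datagen` connector are assumptions chosen for the example and do not come from the thread. The resulting string could be handed to `TableEnvironment.executeSql(...)` in a Flink Table API program.

```java
public class WatermarkDdlSketch {

    /**
     * Builds a CREATE TABLE statement whose watermark trails the event time
     * by the given number of seconds (hypothetical table and columns).
     */
    public static String createTableDdl(int maxLatenessSeconds) {
        return String.join("\n",
            "CREATE TABLE events (",
            "  user_id STRING,",
            "  event_time TIMESTAMP(3),",
            // The watermark is the event time minus the tolerated lateness.
            "  WATERMARK FOR event_time AS event_time - INTERVAL '"
                + maxLatenessSeconds + "' SECOND",
            ") WITH (",
            "  'connector' = 'datagen'",
            ")");
    }

    public static void main(String[] args) {
        System.out.println(createTableDdl(5));
    }
}
```

The interval in the `WATERMARK` clause expresses how late an event may arrive relative to the largest timestamp seen so far; it does not control how often watermarks are emitted.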

Re: [Security] - Critical OpenSSL Vulnerability

2022-11-01 Thread Matthias Pohl via user
The Docker image for Flink 1.12.7 uses an older base image which comes with openssl 1.1.1k. There was a previous post in the OpenSSL mailing list reporting a low vulnerability being fixed with 3.0.6 and 1.1.1r (both versions being explicitly mentioned) [1]. Therefore, I understand the post in a way

Re: How's JobManager bring up TaskManager in Application Mode or Session Mode?

2022-11-28 Thread Matthias Pohl via user
Hi Mark, the JobManager is not necessarily in charge of spinning up TaskManager instances. It depends on the resource provider configuration you choose. Flink differentiates between active and passive Resource Management (see the two available implementations of ResourceManager [1]). Active Resour

Re: Cleanup for high-availability.storageDir

2022-12-08 Thread Matthias Pohl via user
ds, and I believe that doesn't count as Cancelled, so > the artifacts for blobs and submitted job graphs are not cleaned up. I > imagine the same logic Gyula mentioned before applies, namely keep the > latest one and clean the older ones. > > Regards, > Alexis. > > Am D

Re: How does Flink plugin system work?

2023-01-02 Thread Matthias Pohl via user
Hi Ruibin, could you switch to using the currently supported way for instantiating reporters using the factory configuration parameter [1][2]? Based on the ClassNotFoundException, your suspicion might be right that the plugin didn't make it onto the classpath. Could you share the startup logs of t

Re: The use of zookeeper in flink

2023-01-02 Thread Matthias Pohl via user
And I screwed up the reply again. -.- Here's my previous response for the ML thread and not only spoon_lz: Hi spoon_lz, Thanks for reaching out to the community and sharing your use case. You're right about the fact that Flink's HA feature relies on the leader election. The HA backend not being re

Re: How does Flink plugin system work?

2023-01-02 Thread Matthias Pohl via user
.java#L457 > [3] > https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/filesystems/plugins/ > > Matthias Pohl via user 于2023年1月2日周一 20:27写道: > >> Hi Ruibin, >> could you switch to using the currently supported way for instantiating >> reporters using t

Re: Blob server connection problem

2023-01-24 Thread Matthias Pohl via user
We had issues like that in the past (e.g. FLINK-24923 [1], FLINK-10683 [2]). The error you're observing is caused by an unexpected byte being read from the socket. The BlobServer protocol expects either 0 (for put messages) or 1 (for get messages) being retrieved as a header for new message blocks
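
The header check described above can be illustrated with a small sketch. This is not Flink's actual BlobServer code: the class, enum, and method names are hypothetical, and only the two header values (0 for put, 1 for get) are taken from the message.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlobHeaderSketch {

    public enum Operation { PUT, GET }

    /** Maps a header byte to an operation; any other value is rejected. */
    public static Operation parse(int header) {
        switch (header) {
            case 0:
                return Operation.PUT;
            case 1:
                return Operation.GET;
            default:
                // An unexpected byte here usually means the peer does not speak this protocol.
                throw new IllegalArgumentException("Unknown operation " + header);
        }
    }

    /** Reads one header byte from the stream and parses it. */
    public static Operation readOperation(InputStream in) throws IOException {
        int header = in.read();
        if (header < 0) {
            throw new IOException("Stream closed before a header byte was read");
        }
        return parse(header);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readOperation(new ByteArrayInputStream(new byte[] {1})));
    }
}
```

A client that connects to the blob port with a different protocol, for example a port scanner or an HTTP health check, would send a first byte other than 0 or 1 and trigger the rejection branch.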

Re: Job Cancellation Failing

2023-02-20 Thread Matthias Pohl via user
What do you mean by "earlier it used to fail due to ExecutionGraphStore not existing in /tmp" folder? Did you get the error message "Could not create executionGraphStorage directory in /tmp." and creating this folder fixed the issue? It also looks like the stacktrace doesn't match any of the 1.15

Re: Job Cancellation Failing

2023-02-21 Thread Matthias Pohl via user
I noticed a test instability that sounds quite similar to what you're experiencing. I created FLINK-31168 [1] to follow-up on this one. [1] https://issues.apache.org/jira/browse/FLINK-31168 On Mon, Feb 20, 2023 at 4:50 PM Matthias Pohl wrote: > What do you mean by "earlier it use

Re: [ANNOUNCE] Apache Flink 1.17.0 released

2023-03-23 Thread Matthias Pohl via user
Thanks for getting this release over the finish line. One additional thing: Feel free to reach out to the release managers (or respond to this thread) with feedback on the release process. Our goal is to constantly improve the release process. Feedback on what could be improved or things th

Re: [ANNOUNCE] Apache Flink 1.17.0 released

2023-03-27 Thread Matthias Pohl via user
Here are a few things I noticed from the 1.17 release retrospectively which I want to share (other release managers might have a different view or might disagree): - Google Meet might not be the best choice for the release sync. We need to be able to invite attendees even if the creator of the mee

Re: Issue with the flink version 1.10.1

2023-03-27 Thread Matthias Pohl via user
Hi Kiran, it's really hard to come up with an answer based on your description. Usually, it helps to share some logs with the exact error that's appearing and a clear description of what you're observing and what you're expecting. A plain "no jobs are running" is too general to come up with a concl

Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Matthias Pohl via user
Congratulations and good luck with pushing the project forward. On Mon, Mar 27, 2023 at 2:35 PM Jing Ge via user wrote: > Congrats! > > Best regards, > Jing > > On Mon, Mar 27, 2023 at 2:32 PM Leonard Xu wrote: > >> Congratulations! >> >> >> Best, >> Leonard >> >> On Mar 27, 2023, at 5:23 PM, Y

Re: [DISCUSS][FLINK-33240] Document deprecated options as well

2023-10-30 Thread Matthias Pohl via user
Thanks for your proposal, Zhanghao Chen. I think it adds more transparency to the configuration documentation. +1 from my side on the proposal On Wed, Oct 11, 2023 at 2:09 PM Zhanghao Chen wrote: > Hi Flink users and developers, > > Currently, Flink won't generate doc for the deprecated options

Re: Java 17 as default

2023-11-29 Thread Matthias Pohl via user
The 1.18 Docker images were pushed on Oct 31. This also included Java 17 images [1]. [1] https://hub.docker.com/_/flink/tags?page=1&name=java17 On Wed, Nov 15, 2023 at 7:56 AM Tauseef Janvekar wrote: > Dear Team, > > I saw the documentation for 1.18 and Java 17 is not supported and the > image

Re: Doubts about state and table API

2023-11-29 Thread Matthias Pohl via user
Hi Oscar, could you provide the Java code to illustrate what you were doing? The difference between version A and B might be especially helpful. I assume you already looked into the FAQ about operator IDs [1]? Adding the JM and TM logs might help as well to investigate the issue, as Yu Chen mentio

Re: Profiling on flink jobs

2023-12-01 Thread Matthias Pohl via user
I missed the Reply All button in my previous message. Here's my previous email for the sake of transparency sent to the user ML once more: Hi Oscar, sorry for the late reply. I didn't see that you posted the question at the beginning of the month already. I used jmap [1] in the past to get some s
