Required built-in function [plus] could not be found in any catalog.

2021-09-07 Thread vtygoss
Hi, Flink Community! i met a problem using flink 1.12.0 standalone cluster with hive catalog. scene 1: - module: hive module - execute sql: select sum(1) from xxx - exception: org.apache.flink.table.api.TableException: Required built-in function [plus] could not be found in any catalog. s

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread JING ZHANG
Hi Maro, Do you wanna to do a cascade Tumbling windowed stream Job? Do you hope the data generated in the first Tumbling windowed stream could belongs to same window in the second stream? We could understand your demands better if you could share the code. I would try to explain some detailed prob

aws s3 configuring error for flink image

2021-09-07 Thread Dhiru
Need to configure aws S3 getting this error org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'. The scheme is directly supported by Flink through the following plugins: flink-s3-fs-hadoop, flink-s3-fs-presto. Please ensur

Re: When using the batch api, the sink task is always in the created state.

2021-09-07 Thread Caizhi Weng
Hi! In streaming / table / SQL API level Flink allows user to use the same code to run streaming or batch jobs, however in the actual runtime level these user code will be compiled to different operators for a more efficient execution. lec ssmi 于2021年9月7日周二 下午6:32写道: > Thanks for your reply! >

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Joel Edwards
Hi Dawid, Thanks again for replying and for confirming the behavior. Best regards, Joel On Tue, Sep 7, 2021 at 1:35 PM Dawid Wysakowicz wrote: > Huh, of course. Actually I was too quick with my answer. Even if it is > serialized with the JobGraph, the class is necessary on TMs to be > deseria

Re: Seeing Exception ClassNotFoundException: __wrapper while running in Kubernetes Cluster

2021-09-07 Thread Praneeth Ramesh
Hi Arvid and Nicolaus Thanks for the reply. I don't think this is an issue with the user code. I'm pretty sure that the code is serializable. I have not used any custom serializer. And the serialization configs are not overridden. My guess was the problem was with order of classloading. I am depl

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Dawid Wysakowicz
Huh, of course. Actually I was too quick with my answer. Even if it is serialized with the JobGraph, the class is necessary on TMs to be deserialized. That's how java serialization works after all. So the actual answer, it is serialized with the JobGraph. The class is mandatory for deserialization

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Joel Edwards
Hi Dawid, Thank you for the response. I create a remote StreamExecutionEnvironment, wrap a new TableExecutionEnvironment around it, build the job, then execute it via executeAsync(). It sounds like I cannot send the serialized job graph to the session cluster's JobManager via the web API. The ab

Re: User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Dawid Wysakowicz
Hi Joel, There is a few uncertainties in my reply so I am including Timo who should be able to confirm or deny what I wrote. Generally speaking once a JobGraph is created it should contain the UDF. I might be wrong here, but the UDF must be available while the JobGraph is created. I believe you'r

Re: Broadcast data to all keyed streams

2021-09-07 Thread Dawid Wysakowicz
Hi James, Can you elaborate why the "Broadcast State Pattern"[1] does not work for you? I'd definitely recommend that approach. I highly discourage this usage, but if you insist you could copy over the ConnectedStreams#transform method and remove the check that guards both sides of the operator

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread Marco Villalobos
I am guessing the answer is right infront of me. If the window has these attributes: window start: 00:30:00.000. window end: 00:45:00.000 max timestamp: 00:44:59.999 Then perhaps I need the timer to fire at 00:44:59.999 if I want an element that it generates too belong to this window. I'll

User defined function (UDF) not sent when submitting job to session cluster

2021-09-07 Thread Joel Edwards
Good day, I have been attempting to submit a job to a session cluster. This job involves a pair of dynamic tables and a SQL query. The SQL query is calling a UDF which I register via the table API's createTemporarySystemFunction() method. The job runs locally, but when I attempt to submit it to a

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread Marco Villalobos
Right. What confuses me is that data produced within a timer is not input data. It is synthesized within the timer and collected. I just realized that my original question was a bit of nonsense. KeyedProcessFuntion has timers, and it is not a window a function. What I meant to say is that the ope

Re: Cleaning old incremental checkpoint files

2021-09-07 Thread Robin Cassan
Hey Yun, thanks for the answer! How would you analyze the checkpoint metadata? Would you build a program with the State Processor API library, or is there a better way to do it? I believe the option you mention would indeed facilitate cleaning, it would still be manual (because we can't set a peri

Re: When using the batch api, the sink task is always in the created state.

2021-09-07 Thread lec ssmi
Thanks for your reply! But doesn't flink use stream to perform batch calculations? As you said above, to some extent, it is same as real batch computing eg.spark . Caizhi Weng 于2021年9月7日周二 下午2:53写道: > My previous mail intends to answer what is needed for all subtasks in a > batch job to run s

Re: Flink CEP in PyFlink

2021-09-07 Thread Dian Fu
Hi Kamil, It still doesn’t support CEP in PyFlink. However, as it supports SQL in PyFlink and so you could take a look at if CEP on SQL [1] could meet your requirements. If so, you could use CEP on SQL in PyFlink. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread JING ZHANG
Hi Marco, I'm not sure which API or SQL query do you use. If you use Windowed Stream API in DataStream [1]. The input data would be assigned to a Window based on which Window Assigner do you use. If using tumbling window, the input data would be assigned to a Window based on it's event-time and reg

Re: hdfs lease issues on flink retry

2021-09-07 Thread Matthias Pohl
Just for documentation purposes: I created FLINK-24147 [1] to cover this issue. [1] https://issues.apache.org/jira/browse/FLINK-24147 On Thu, Aug 26, 2021 at 6:14 PM Matthias Pohl wrote: > I see - I should have checked my mailbox before answering. I received the > email and was able to login. >

Flink CEP in PyFlink

2021-09-07 Thread Kamil ty
Hello all, I would like to use Flink CEP for my development requirements. Is Flink CEP supported in PyFlink? If not, are there any available workarounds? Kind regards Kamil

Re: High availability - leader election not working?

2021-09-07 Thread Chesnay Schepler
> How do I identify the leader job manager? (see the third point below on approach to date) Either through the config maps, or the logs. Your approach for the config maps _should_ be correct. There should be an "address" which in each ConfigMap pointing to the current leader. > How is the da

Re: FLINK-14316 happens on version 1.13.2

2021-09-07 Thread Matthias Pohl
Hi Xiangyu, thanks for reaching out to the community. Could you share the entire TaskManager and JobManager logs with us? That might help investigating what's going on. Best, Matthias On Fri, Sep 3, 2021 at 10:07 AM Xiangyu Su wrote: > Hi Yun, > Thanks alot. > I am running a test, and facing th