Re:Re: Flink SQL UDAF com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID

2020-08-14 Thread forideal
Hi Robert Metzger, I am very happy to share my code, public class ConcatString { public List list = new ArrayList<>(); public void add(String toString) { if (list != null) { if (list.size() < 100) { list.add(toString); } } } } > Are you registering

coordination of sinks

2020-08-14 Thread Marco Villalobos
Given a source that goes into a tumbling window with a process function that yields two side outputs, in addition to the main data stream, is it possible to coordinate the order of completion of sink 1, sink 2, and sink 3 as data leaves the tumbling window? source -> tumbling window -> process fun

Re: k8s job cluster using StatefulSet

2020-08-14 Thread Alexey Trenikhun
Thank you Arvid and Yang! From: Yang Wang Sent: Thursday, August 13, 2020 8:09:13 PM To: Arvid Heise Cc: Alexey Trenikhun ; user Subject: Re: k8s job cluster using StatefulSet Hi Alexey, Actually, StatefulSets could also be used to start the JobManager and Ta

Format for timestamp type in Flink SQL

2020-08-14 Thread 김영우
Hi, I'm trying to create a table using Flink SQL to query from a Kafka topic. Messages from Kafka look like following: (snip) "svc_mgmt_num":"7749b6a7e17127d43431e21b94f4eb0c116..." "log_ymdt":"2020-08-15T02:01:33.968Z" "snapshot_dt":"2020-08-13" "network_type":"LTE" I'd like to make the column

Re: [Flink-KAFKA-KEYTAB] Kafkaconsumer error Kerberos

2020-08-14 Thread Vijayendra Yadav
Thanks for your valuable inputs Robert, it helped me solve the issue. While I tried -yD from flink run, like you mentioned and many other combinations of the same, that didn't work out. Finally it worked when I passed it from flink-conf.yaml with relative path. Like below: env.java.opts.jobmanage

Re: Flink S3 Hadoop dependencies

2020-08-14 Thread Chesnay Schepler
Filesystems are supposed to be used as plugins (by putting the jars under plugins/ instead of lib/), in which case they are loaded separately from other classes, specifically user-code. https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/plugins.html On 14/08/2020 20:25, Satish Sa

Tracing and Flink

2020-08-14 Thread Aaron Levin
Hello Flink Friends! This is a long-shot, but I'm wondering if anyone is thinking or working on applying tracing to Streaming systems and in particular Flink. As far as I understand this is a fairly open problem and so I'm curious how folks are thinking about it and if anyone has considered how th

Re: Performance Flink streaming kafka consumer sink to s3

2020-08-14 Thread Vijayendra Yadav
Hi Robert, Thanks for information. payloads so far are 400KB (each record). To achieve high parallelism at the downstream operator do I rebalance the kafka stream ? Could you give me an example please. Regards, Vijay On Fri, Aug 14, 2020 at 12:50 PM Robert Metzger wrote: > Hi, > > Also, can w

Re: Performance Flink streaming kafka consumer sink to s3

2020-08-14 Thread Robert Metzger
Hi, Also, can we increase parallel processing, beyond the number of > kafka partitions that we have, without causing any overhead ? Yes, the Kafka sources produce a tiny bit of overhead, but the potential benefit of having downstream operators at a high parallelism might be much bigger. How lar

Re: [Flink-KAFKA-KEYTAB] Kafkaconsumer error Kerberos

2020-08-14 Thread Robert Metzger
Hi Vijayendra, I'm not sure if -yD is the right argument as you've posted it: It is meant to be used for Flink configuration keys, not for JVM properties. With the Flink configuration "env.java.opts", you should be able to pass JVM properties. This should work: -yD env.java.opts="-D java.security

Flink S3 Hadoop dependencies

2020-08-14 Thread Satish Saley
Hi team, Was there a reason for not shading hadoop-common https://github.com/apache/flink/commit/e1e7d7f7ecc080c850a264021bf1b20e3d27d373#diff-e7b798a682ee84ab804988165e99761cR38-R44 ? This is leaking lots of classes such as guava and causing issues in our flink application. I see that hadoop-comm

Re: Flink CPU load metrics in K8s

2020-08-14 Thread Bajaj, Abhinav
Awesome. This is exactly what I was going to look for. Thanks much. ~ Abhinav From: Arvid Heise Date: Thursday, August 13, 2020 at 12:33 AM To: "Bajaj, Abhinav" Cc: Xintong Song , "user@flink.apache.org" , Roman Grebennikov Subject: Re: Flink CPU load metrics in K8s Hi Abhinav, according to

Re: K8s operator for dlink

2020-08-14 Thread Austin Cawley-Edwards
Hey Narasimha, We use an operator at FinTech Studios[1] (built by me) to deploy Flink via the Ververica Platform[2]. We've been using it in production for the past 7 months with no "show-stopping" bugs, and know some others have been experimenting with bringing it to production as well. Best, Aus

K8s operator for dlink

2020-08-14 Thread narasimha
Hi all, Checking if anyone has deployed flink using k8s operator. If so what has been the experience and well it has eased the job updates. Also was there any comparison among other available operators like • lyft • Google cloud Thanks in advance, some insights into the above will save lot of

Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-14 Thread bat man
Hello Arvid, Thanks I’ll check my config and use the correct reporter and test it out. Thanks, Hemant On Fri, 14 Aug 2020 at 6:57 PM, Arvid Heise wrote: > Hi Hemant, > > according to the influx section of the 1.9 metric documentation [1], you > should use the reporter without a factory. The fa

Re: Flink SQL UDAF com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID

2020-08-14 Thread Robert Metzger
Hi Forideal, When using RocksDB, we need to serialize the data (to store it on disk), whereas when using the memory backend, the data (in this case RedConcat.ConcatString instances) is on the heap, thus we won't run into this issue. Are you registering your custom types in the ExecutionConfig? (I

Re: Pushing metrics to Influx from Flink 1.9 on AWS EMR(5.28)

2020-08-14 Thread Arvid Heise
Hi Hemant, according to the influx section of the 1.9 metric documentation [1], you should use the reporter without a factory. The factory was added later. metrics.reporter.influxdb.class: org.apache.flink.metrics.influxdb.InfluxdbReportermetrics.reporter.influxdb.host: localhostmetrics.reporter.

Re: Using managed keyed state with AsynIo

2020-08-14 Thread Arvid Heise
Hi Kristoff, the answer to your big questions is unfortunately no, two times. I see two options in general: 1) process function (as you proposed). On processElement, you'd read the state and invoke your async operation. You enqueue your result in some result queue where you emit it in the next ca

Flink SQL UDAF com.esotericsoftware.kryo.KryoException: Encountered unregistered class ID

2020-08-14 Thread forideal
Hi I wrote a UDAF referring to this article https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/functions/udfs.html#aggregation-functions, when using in-memory state, the task can run normally. However, When I chose rocksdb as the state backend, I encountere

Re: Using managed keyed state with AsynIo

2020-08-14 Thread KristoffSC
Thanks Arvid, I like your propositions in my case I wanted to use the state value to decide if I should do the Async Call to external system. The result of this call would be a state input. So having this: Process1(calcualteValue or take it from state) -> AsyncCall to External system to persist/Va

Re: k8s job cluster using StatefulSet

2020-08-14 Thread Jan Lukavský
Hi Alexey, I'm using StatefulSet for JM exactly as you describe (Deployment for TM is just fine). The main advantage is that you don't need distributed storage for JM fault tolerance, because you can use persistent volume mount (provided your cloud provider provides it as fault tolerant volum

Re: Flink issue in emitting data to same sideoutput from onTimer and processElement

2020-08-14 Thread Dawid Wysakowicz
It should work just fine. Can you share a minimal reproducible example we can debug if the problem still remains? Best, Dawid On 14/08/2020 11:39, Jaswin Shah wrote: > I am using KeyedCoProcessFunction > > Get Outlook for Android > > --

Re: Flink issue in emitting data to same sideoutput from onTimer and processElement

2020-08-14 Thread Jaswin Shah
I am using KeyedCoProcessFunction Get Outlook for Android From: Jaswin Shah Sent: Friday, August 14, 2020 3:09:21 PM To: user@flink.apache.org ; Dawid Wysakowicz ; Yun Tang Subject: Flink issue in emitting data to same sideoutput from onTi

Flink issue in emitting data to same sideoutput from onTimer and processElement

2020-08-14 Thread Jaswin Shah
Hi, I have a coProcessFunction which emits data to same side output from processElement1 method and on timer method.But, data is not getting emitted to sideoutput from onTimer. Is it like to the same sideoutput, we can not emit data from onTimer and processElement methods? Get Outlook for And

[no subject]

2020-08-14 Thread Jaswin Shah
Hi, I have a coProcessFunction which emits data to same side output from processElement1 method and on timer method.But, data is not getting emitted to sideoutput from onTimer. Is it like to the same sideoutput, we can not emit data from onTimer and processElement methods? Get Outlook for Andr

Re: Status of a job when a kafka source dies

2020-08-14 Thread Becket Qin
Hey Nick and Piotr, Sorry for the late reply. This email somehow failed to pass my mail filter. The KafkaConsumer in Apache Kafka itself does not throw any exception if the broker is down. There isn't any API in KafkaConsumer telling you that the brokers are not reachable. Instead, the consumer j

Re: Status of a job when a kafka source dies

2020-08-14 Thread Piotr Nowojski
Hey, But do you know what API is Kafka providing that Spring is using to provide this feature? Piotrek czw., 13 sie 2020 o 17:15 Nick Bendtner napisał(a): > Hi Piotr, > Sorry for the late reply. So the poll does not throw an exception when a > broker goes down. In spring they solve it by gener