Re: PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-29 Thread Dawid Wysakowicz
Hey, I am not sure which format you use, but if you work with JSON maybe this option[1] could help you. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/json.html#json-timestamp-format-standard On 30/03/2021 06:45, Sumeet Malhotra wrote:

Support for sending generic class

2021-03-29 Thread Le Xu
Hello! I'm trying to figure out whether Flink Statefun supports sending object with class that has generic parameter types (and potentially nested types). For example, I send a message that looks like this: context.send(SINK_EVENT, idString, new Tuple3<>(someLongObject, listOfLongObject, Long));

Re: PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-29 Thread Sumeet Malhotra
Thanks. Yes, that's a possibility. I'd still prefer something that can be done within the Table API. If it's not possible, then there's no other option but to use the DataStream API to read from Kafka, do the time conversion and create a table from it. ..Sumeet On Mon, Mar 29, 2021 at 10:41 PM Pi

Fw:A question about flink watermark illustration in official documents

2021-03-29 Thread 罗昊
Recently I read flink official documents for something about watermarks。 url:https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/event_time.html there are two pictures illustrating flink watermark mechanism, which puzzle me mush: The first picture is easy to understa

Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-03-29 Thread Yang Wang
Hi Fuyao, Thanks for trying the native Kubernetes integration. Just like you know, the Flink rest service could be exposed in following three types, configured via "kubernetes.rest-service.exposed.type". * ClusterIP, which means you could only access the Flink rest endpoint inside the K8s cluste

Re: SP with Drain and Cancel hangs after take a SP

2021-03-29 Thread Vishal Santoshi
More interested whether a StreamingFileSink without a drain negatively affects it's exactly-once semantics , given that I state on SP would have the offsets from kafka + the valid lengths of the part files at SP. To be honest not sure whether the flushed buffers on sink are included in the length

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-29 Thread David Anderson
Yes, since the two streams have the same type, you can union the two streams, key the resulting stream, and then apply something like a RichFlatMapFunction. Or you can connect the two streams (again, they'll need to be keyed so you can use state), and apply a RichCoFlatMapFunction. You can use whic

Failure detection in Flink

2021-03-29 Thread Sonam Mandal
Hello, I'm looking for some resources around failure detection in Flink between the various components such as Task Manager, Job Manager, Resource Manager, etc. For example, how does the Job Manager detect that a Task Manager is down (long GC pause or it just crashed)? There is some indication

Re: Install/Run Streaming Anomaly Detection R package in Flink

2021-03-29 Thread Robert Cullen
Wei, Thank you for pointing to those examples. Here is a code sample of how it's configured for me: env = StreamExecutionEnvironment.get_execution_environment() env.set_parallelism(1) env.set_stream_time_characteristic(TimeCharacteristic.EventTime) env.add_python_a

SP with Drain and Cancel hangs after take a SP

2021-03-29 Thread Vishal Santoshi
Is this a known issue. We do a stop + savepoint with drain. I see no back pressure on our operators. It essentially takes a SP and then the SInk ( StreamingFileSink to S3 ) just stays in the RUNNING state. Without drain i stop + savepoint works fine. I would imagine drain is important ( flush the

Re: Question about checkpoints and savepoints

2021-03-29 Thread Robert Metzger
Mh, did you also check the TaskManger logs? I'm not aware of any known or issues in the past in that direction, the codepaths for checkpoint / savepoint are fairly similar when it comes to storing the data. You could also try to run Flink on DEBUG log level, maybe that reveals something?! On Fri

DataStream from kafka topic

2021-03-29 Thread Maminspapin
Hi everyone. How can I get entry in GenericRecord format from kafka topic using SchemaRegistry? I read this: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html But can't to build it in my code... Is there some tutorials or examples to deserialise data using schema.

Re: Glob support on file access

2021-03-29 Thread Arvid Heise
Hi Etienne, In general, any small PR on this subject is very welcome. I don't think that the community as a whole will invest much into FileInputFormat as the whole DataSet API is phasing out. Afaik SQL and Table API are only using InputFormat for the legacy compatibility layer (e.g. when it come

Re: Flink failing to restore from checkpoint

2021-03-29 Thread Piotr Nowojski
Hi, What Flink version are you using and what is the scenario that's happening? It can be a number of things, most likely an issue that your filed mounted under: > /mnt/checkpoints/5dde50b6e70608c63708cbf979bce4aa/shared/47993871-c7eb-4fec-ae23-207d307c384a disappeared or stopped being accessible.

Re: PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-29 Thread Piotr Nowojski
Hi, I hope someone else might have a better answer, but one thing that would most likely work is to convert this field and define even time during DataStream to table conversion [1]. You could always pre process this field in the DataStream API. Piotrek [1] https://ci.apache.org/projects/flink/f

Re: Restore from Checkpoint from local Standalone Job

2021-03-29 Thread Piotr Nowojski
Hi Sandeep, I think it should work fine with `StandaloneCompletedCheckpointStore`. Have you checked if your directory /Users/test/savepoint is being populated in the first place? And if so, if the restarted job is not throwing some exceptions like it can not access those files? Also note, that

PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-29 Thread Sumeet Malhotra
Hi, Might be a simple, stupid question, but I'm not able to find how to convert/interpret a UTC datetime string like *2021-03-23T07:37:00.613910Z* as event-time using a DDL/Table API. I'm ingesting data from Kafka and can read this field as a string, but would like to mark it as event-time by defi

Re: Glob support on file access

2021-03-29 Thread Etienne Chauchot
But still this workaround would only work when you have access to the underlying /FileInputFormat/. For//SQL and Table APIs, you don't so you'll be unable to apply this workaround. So what we could do is make a PR to support glob at the FileInputFormat level to profit for all APIs. I'm gonna d

Flink State Query Server threads stuck in infinite loop with high GC activity on CopyOnWriteStateMap get

2021-03-29 Thread Aashutosh Swarnakar
Hi Folks, I've recently started using Flink for a pilot project where I need to aggregate event counts on per minute window basis. The state has been made queryable so that external services can query the state via Flink State Query API. I am using memory state backend with a keyed process funct

Flink failing to restore from checkpoint

2021-03-29 Thread Claude M
Hello, I executed a flink job in a Kubernetes Application cluster w/ four taskmanagers. The job was running fine for several hours but then crashed w/ the following exception which seems to be when restoring from a checkpoint.The UI shows the following for the checkpoint counts: Triggered: 6

Re: Kubernetes Application Cluster Not Working

2021-03-29 Thread Claude M
This issue was resolved by adding the following environment variable to both the jobmanager and taskmanager: - name: JOB_MANAGER_RPC_ADDRESS value: jobmanager On Wed, Mar 24, 2021 at 1:33 AM Yang Wang wrote: > Are you sure that the JobManager akka address is binded to > "flink-jobmanager"? >

Re: Do data streams partitioned by same keyBy but from different operators converge to the same sub-task?

2021-03-29 Thread vishalovercome
I've gone through the example as well as the documentation and I still couldn't understand whether my use case requires joining. 1. What would happen if I didn't join?2. As the 2 incoming data streams have the same type, if joining is absolutely necessary then just a union (oneStream.union(anotherS

Re: [BULK]Re: [SURVEY] Remove Mesos support

2021-03-29 Thread Robert Metzger
+1 On Mon, Mar 29, 2021 at 5:44 AM Yangze Guo wrote: > +1 > > Best, > Yangze Guo > > On Mon, Mar 29, 2021 at 11:31 AM Xintong Song > wrote: > > > > +1 > > It's already a matter of fact for a while that we no longer port new > features to the Mesos deployment. > > > > Thank you~ > > > > Xinton