Re: In Flink SQL for kafka avro based table , is there support for FORWARD_TRANSITIVE schema change?

2021-04-13 Thread Arvid Heise
Hi Agnelo, How is the writer schema encoded if you are using no schema registry? Or phrased differently: how does Flink know with which schema the data has been written so that it can map it to the new schema? On Wed, Apr 14, 2021 at 8:33 AM Agnelo Dcosta wrote: > Hi, we are using Flink SQL 1.1

In Flink SQL for kafka avro based table , is there support for FORWARD_TRANSITIVE schema change?

2021-04-13 Thread Agnelo Dcosta
Hi, we are using Flink SQL 1.12 and have a couple of tables created from kafka topics. Format is avro (not confluent avro) and no schema registry as such. In flink 1.11 we used to specify the schema, however in 1.12 the schema is derived from the message itself. Is it possible for the producers t

Re: NPE when aggregate window.

2021-04-13 Thread Si-li Liu
Thanks for your help. After I replaced com.google.common.base.Objects.hashCode with toString().hashCode(), the NPE problem is solved. Arvid Heise 于2021年4月13日周二 下午11:40写道: > To second Dawids question: are all fields final or is it possible that > their values are changing? > > On Tue, Apr 13, 20

Re: JSON source for pyflink stream

2021-04-13 Thread Yik San Chan
Hi Giacomo, I think you can try using Flink SQL connector. For JSON input such as {"a": 1, "b": {"c": 2, {"d": 3}}}, you can do: CREATE TABLE data ( a INT, b ROW> ) WITH (...) Let me know if that helps. Best, Yik San On Wed, Apr 14, 2021 at 2:00 AM wrote: > Hi, > I'm new to Flink and I a

Re: Clarification about Flink's managed memory and metric monitoring

2021-04-13 Thread Xintong Song
These metrics should also be available via REST. You can check the original design doc [1] for which metrics the UI is using. Thank you~ Xintong Song [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-102%3A+Add+More+Metrics+to+TaskManager On Tue, Apr 13, 2021 at 9:08 PM Alexis Sarda-

Re: [External] : Re: Conflict in the document - About native Kubernetes per job mode

2021-04-13 Thread Fuyao Li
Hello Yang, I also created a PR for this issue. Please take a look. Refer to https://github.com/apache/flink/pull/15602 Thanks, Fuyao From: Fuyao Li Date: Tuesday, April 13, 2021 at 18:23 To: Yang Wang Cc: user Subject: Re: [External] : Re: Conflict in the document - About native Kubernetes

Re: [External] : Re: Conflict in the document - About native Kubernetes per job mode

2021-04-13 Thread Fuyao Li
Hello Yang, I tried to create a ticket https://issues.apache.org/jira/browse/FLINK-22264 I just registered as a user and I can’t find a place to assign the task to myself… Any idea on this jira issue? Thanks. Best, Fuyao From: Yang Wang Date: Tuesday, April 13, 2021 at 03:01 To: Fuyao Li Cc:

Extract/Interpret embedded byte data from a record

2021-04-13 Thread Sumeet Malhotra
Hi, I'm reading data from Kafka, which is Avro encoded and has the following general schema: { "name": "SomeName", "doc": "Avro schema with variable embedded encodings", "type": "record", "fields": [ { "name": "Name", "doc": "My name", "type": "string" }, {

Re: Avro schema

2021-04-13 Thread Sumeet Malhotra
Hi Arvid, I certainly appreciate the points you make regarding schema evolution. Actually, I did end up writing an avro2sql script to autogen the DDL in the end. Thanks, Sumeet On Fri, Apr 9, 2021 at 12:13 PM Arvid Heise wrote: > Hi Sumeet, > > The beauty of Avro lies in having reader and writ

Re: Flink docker 1.11.3 actually runs 1.11.2

2021-04-13 Thread Flavio Pompermaier
Hi Chesnay, my tests were done using docker-compose (with the command 'docker-compose up --build -d flink-jobmanager flink-taskmanager'). These are the necessary files (./flink/db-libs/* contains the jdbc libraries I use while /opt/flink/data is used as a volume to share files with other dockers):

Re: Flink 1.11 job hit error "Job leader lost leadership" or "ResourceManager leader changed to new address null"

2021-04-13 Thread Lu Niu
FYI, my teammate Chen posted a similar question: ,*Apache Flink Mailing List archive. - handle SUSPENDED in ZooKeeperLeaderRetrievalService . That is the root cause of th

JSON source for pyflink stream

2021-04-13 Thread G . G . M . 5611
Hi, I'm new to Flink and I am trying to create a stream from locally downloaded tweets. The tweets are in json format, like in this example:   {"data":{"text":"Polsek Kakas Cegah Covid-19 https://t.co/ADjEgpt7bC","public_metrics":"retweet_count":0,"reply_count":0,"like_count":0,"quote_count":0}, "a

Re: Flink docker 1.11.3 actually runs 1.11.2

2021-04-13 Thread Chesnay Schepler
Please provide steps to reproduce the issue. I can't see anything wrong in the dockerfiles (they reference the correct release url), and the referenced release correctly identifies itself as 1.11.3 . I also started a container with the image, started a jobmanager, and the logs show 1.11.3 like

Flink docker 1.11.3 actually runs 1.11.2

2021-04-13 Thread Flavio Pompermaier
Hi to all, I've just build a docker that use the image flink:1.11.3-scala_2.12-java11 but the web UI (and logs too) display Flink 1.11.2 (Commit: fe36135). Was there an error with the release? Best, Flavio

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, Checkpointing is Flink's way of providing processing guarantees "at least once"/"exactly once". So your question is like asking if a car offers any safety without you wanting to use a built-in belt and airbags. Sure you could install your own safety features but chances are that your sol

Re: NPE when aggregate window.

2021-04-13 Thread Arvid Heise
To second Dawids question: are all fields final or is it possible that their values are changing? On Tue, Apr 13, 2021 at 4:41 PM Si-li Liu wrote: > Hi,Dawid, > > Thanks for your help. I use com.google.common.base.Objects.hashCode, pass > all fields to it and generate a hashcode, and the equal m

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Rahul Patwari
Hi Arvid, Thanks for your inputs. They are super helpful. Why do you need the window operator at all? Couldn't you just backpressure > on the async I/O by delaying the processing there? > I haven't explored this approach. Wouldn't the backpressure gets propagated upstream and the consumption rat

Running Apache Flink 1.12 on Apple Silicon M1 MacBook Pro

2021-04-13 Thread Klemens Muthmann
Hi, I've just tried to run the basic example for Apache Flink on an Apple Mac Pro with the new M1 Processor. I only need this for development purposes. The actual thing is going to run on a Linux se

Re: NPE when aggregate window.

2021-04-13 Thread Si-li Liu
Hi,Dawid, Thanks for your help. I use com.google.common.base.Objects.hashCode, pass all fields to it and generate a hashcode, and the equal method also compare all the fields. Dawid Wysakowicz 于2021年4月13日周二 下午8:10写道: > Hi, > > Could you check that your grouping key has a stable hashcode and equ

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, This pipeline should process millions of records per day with low latency. > I am avoiding Checkpointing, as the records in the Window operator and > in-flight records in the Async I/O operator are persisted along with the > Kafka offsets. But the records in Window and Async I/O operator

RE: Clarification about Flink's managed memory and metric monitoring

2021-04-13 Thread Alexis Sarda-Espinosa
Hi Xintong, Thanks for the info. Is there any way to access these metrics outside of the UI? I suppose Flink’s reporters might provide them, but will they also be available through the REST interface (or another interface)? Regards, Alexis. From: Xintong Song Sent: Tuesday, 13 April 2021 14:3

Re: Clarification about Flink's managed memory and metric monitoring

2021-04-13 Thread Xintong Song
Hi Alexis, First of all, I strongly recommend not to look into the JVM metrics. These metrics are fetched directly from JVM and do not well correspond to Flink's memory configurations. They were introduced a long time ago and are preserved mostly for compatibility. IMO, they bring more confusion t

Re: Python Integration with Ververica Platform

2021-04-13 Thread Dawid Wysakowicz
I'd recommend reaching out directly to Ververica. Ververica platform is not part of the open-source Apache Flink project. I can connect you with Konstantin who I am sure will be happy to answer your question ;) Best, Dawid On 12/04/2021 15:40, Robert Cullen wrote: > I've been using the Communit

Re: NPE when aggregate window.

2021-04-13 Thread Dawid Wysakowicz
Hi, Could you check that your grouping key has a stable hashcode and equals? It is very likely caused by an unstable hashcode and that a record with an incorrect key ends up on a wrong task manager. Best, Dawid On 13/04/2021 08:47, Si-li Liu wrote: > Hi,  > > I encounter a weird NPE when try to

Clarification about Flink's managed memory and metric monitoring

2021-04-13 Thread Alexis Sarda-Espinosa
Hello, I have a Flink TM configured with taskmanager.memory.managed.size: 1372m. There is a streaming job using RocksDB for checkpoints, so I assume some of this memory will indeed be used. I was looking at the metrics exposed through the REST interface, and I queried some of them: /taskmanag

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Rahul Patwari
Hi Arvid, Thanks for the reply. could you please help me to understand how the at least once guarantee > would work without checkpointing in your case? > This was the plan to maintain "at least once" guarantee: Logic at Sink: The DataStream on which Sink Function is applied, on the same DataStre

Re: Flink Metric isBackPressured not available

2021-04-13 Thread Claude M
Thanks for your reply. I'm using Flink 1.12. I'm checking in Datadog and the metric is not available there. It has other task/operator metrics such as numRecordsIn/numRecordsOut there but not the isBackPressured. On Mon, Apr 12, 2021 at 8:40 AM Roman Khachatryan wrote: > Hi, > > The metric is

Re: Flink 1.11.4?

2021-04-13 Thread Yuval Itzchakov
Roman, is there an ETA on 1.13? On Mon, Apr 12, 2021, 16:17 Roman Khachatryan wrote: > Hi Maciek, > > There are no specific plans for 1.11.4 yet as far as I know. > The official policy is to support the current and previous minor > release [1]. So 1.12 and 1.13 will be officially supported once

Re: [External] : Re: Conflict in the document - About native Kubernetes per job mode

2021-04-13 Thread Yang Wang
I think it makes sense to have such a simple fix. Could you please create a ticket and attach a PR? Best, Yang Fuyao Li 于2021年4月13日周二 下午2:24写道: > Hello Yang, > > > > It is very kind of you to give such a detailed explanation! Thanks for > clarification. > > > > For the small document fix I men

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, could you please help me to understand how the at least once guarantee would work without checkpointing in your case? Let's say you read records A, B, C. You use a window to delay processing, so let's say A passes and B, C are still in the window for the trigger. Now do you want to aut

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Roman Khachatryan
Hi Rahul, Right. There are no workarounds as far as I know. Regards, Roman On Mon, Apr 12, 2021 at 9:00 PM Rahul Patwari wrote: > > Hi Roman, Arvid, > > So, to achieve "at least once" guarantee, currently, automatic restart of > Flink should be disabled? > Is there any workaround to get "at le

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-04-13 Thread Arvid Heise
Hi Sihan, we managed to reproduce it, see [1]. It will be fixed in the next 1.12 and the upcoming 1.13 release. [1] https://issues.apache.org/jira/browse/FLINK-21992 On Tue, Apr 6, 2021 at 8:45 PM Roman Khachatryan wrote: > Hi Sihan, > > Unfortunately, we are unable to reproduce the issue so f