JDBC connector support for JSON

2021-03-30 Thread Fanbin Bu
Hi, For a streaming job that uses the Kafka connector, this doc https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/json.html#format-options shows that we can parse the JSON data format. However, it does not seem like the Flink JDBC connector supports the JSON data type, at least fr

Restoring from Flink Savepoint in Kubernetes not working

2021-03-30 Thread Claude M
Hello, I have Flink set up as an Application Cluster in Kubernetes, using Flink version 1.12. I created a savepoint using the curl command and the status indicated it was completed. I then tried to relaunch the job from that savepoint using the following arguments as indicated in the doc found h

Re: [External] : Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-03-30 Thread Yang Wang
Hi Fuyao, Thanks for sharing the progress. 1. The flink client is able to list/cancel jobs, based on logs shared > above, I should be able to ping 144.25.13.78, why I still can NOT ping such > address? I think this is an environment problem. Actually, not every IP address could be tested with "p

Organizing Flink Applications: Mono repo or polyrepo

2021-03-30 Thread Xinbin Huang
Hi community, I am curious about people's experience in structuring Flink applications. Do you use a mono repo structure (multiple applications in one single repo) or break each application down into its own repo? If possible, can you share some of your thoughts on the pros/cons of each approach?

Re: Checkpoint fail due to timeout

2021-03-30 Thread Alexey Trenikhun
Hi Piotrek, I can't reproduce the problem anymore. Before, the problem happened 2-3 times in a row; I turned off unaligned checkpoints and have now turned them back on, but the problem seems gone for now. When the problem happened there was no progress on source operators, I thought maybe it wa

Re: Checkpoint fail due to timeout

2021-03-30 Thread Alexey Trenikhun
I also expected an improvement in checkpointing at the cost of throughput, but in reality I didn't notice a difference in either checkpointing or throughput. The backlog was purged by Kafka, so I can't post a thread dump right now, but I doubt that the problem is gone, so I will have another chance during next pe

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-03-30 Thread Sihan You
Awesome. Let me know if you need any other information. Our application makes heavy use of event timers and keyed state. The load is very heavy, if that matters. On Mar 29, 2021, 05:50 -0700, Piotr Nowojski , wrote: > Hi Sihan, > > Thanks for the information. Previously I was not able to reproduc

Re: [External] : Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-03-30 Thread Fuyao Li
Hello Yang, Thank you so much for providing me the flink-client.yaml. I was able to make some progress. I didn’t realize I should create a new pod flink-client to list/cancel jobs. I was trying to do such a thing from my local laptop. Maybe that is the reason why it doesn’t work. However, I st

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Vishal Santoshi
Great, thanks! On Tue, Mar 30, 2021 at 11:00 AM Till Rohrmann wrote: > This is a good idea. I will add it to the section here [1]. > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#terminating-a-job > > Cheers, > Till > > On Tue, Mar 30, 2021 at 2:46 PM Vishal

IO benchmarking

2021-03-30 Thread deepthi Sridharan
Hi, I am trying to set up some benchmarking with a couple of IO options for saving checkpoints and have a couple of questions : 1. Does flink come with any IO benchmarking tools? I couldn't find any. I was hoping to use those to derive some insights about the storage performance and extrapolate i

Proper way to get DataStream

2021-03-30 Thread Maminspapin
Hi, I'm trying to read data from a topic. The topic holds Avro-formatted data. I wrote the following code: public static void main(String[] args) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); Schema schema

Re: Support for sending generic class

2021-03-30 Thread Le Xu
Hi Gordon and Till: Thanks for pointing me to the new version! The code I'm using is for a research project so it's not on any production deadline. However, I would like to know about any upcoming updates so there won't be any duplicated work. A couple of questions I have now: 1. Does 3.0 support context.sen

Re: Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Lehuede sebastien
Hi Till, That solved my issue! Many many thanks for the solution and for the useful StackOverflow link! ☺️ Cheers, Sébastien > On 30 March 2021 at 18:16, Till Rohrmann wrote: > > Hi Sebastien, > > I think the Scala compiler infers the most specific type for deepCopy() which > is Nothing

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
Hi Till, From the version I am using (1.12.0), getFieldNames is not available in Row ... See https://github.com/apache/flink/blob/release-1.12/flink-core/src/main/java/org/apache/flink/types/Row.java . Is there any workaround for this in version 1.12.0? Thanks. Best, Yik San On Wed, Mar 31, 20

Re: Failure detection in Flink

2021-03-30 Thread Sonam Mandal
Hi Till, This is really helpful, thanks for the detailed explanation about what happens. I'll reach out again if I have any further questions. For now I'm just trying to understand the various failure scenarios and how they are handled by Flink. Thanks, Sonam From

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Till Rohrmann
There is a method Row.getFieldNames. Cheers, Till On Tue, Mar 30, 2021 at 6:06 PM Yik San Chan wrote: > Hi Till, > > I look inside the Row class, it does contain a member `private final > Object[] fields;` though I wonder how to get column names out of the > member? > > Thanks! > > Best, > Yik

Re: Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Till Rohrmann
Hi Sebastien, I think the Scala compiler infers the most specific type for deepCopy() which is Nothing (Nothing is the subtype of every type) [1] because you haven't specified a type here. In order to make it work you have to specify the concrete type: event.get("value").deepCopy[ObjectNode]() [

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
Hi Till, I look inside the Row class, it does contain a member `private final Object[] fields;` though I wonder how to get column names out of the member? Thanks! Best, Yik San On Tue, Mar 30, 2021 at 11:45 PM Till Rohrmann wrote: > Hi Yik San, > > by converting the rows to a Tuple3 you effec

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Till Rohrmann
Hi Yik San, by converting the rows to a Tuple3 you effectively lose the information about the column names. You could also call `toRetractStream[Row]` which will give you a `DataStream[Row]` where you keep the column names. Cheers, Till On Tue, Mar 30, 2021 at 3:52 PM Yik San Chan wrote: > The

Re: Failure detection in Flink

2021-03-30 Thread Till Rohrmann
Well, the FLIP-6 documentation is probably the best resource albeit being a bit outdated. The components react a bit differently: JobMaster loses heartbeat with a TaskExecutor: If this happens, then the JobMaster will invalidate all slots from this TaskExecutor. This will then fail the tasks whic

Re: StateFun examples in scala

2021-03-30 Thread jose farfan
Hi Many thx for your quick answer. I will review the links. BR Jose On Tue, 30 Mar 2021 at 15:22, Tzu-Li (Gordon) Tai wrote: > Hi Jose! > > For Scala, we would suggest to wait until StateFun 3.0.0 is released, > which is actually happening very soon (likely within 1-2 weeks) as there is > an o

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Till Rohrmann
This is a good idea. I will add it to the section here [1]. [1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#terminating-a-job Cheers, Till On Tue, Mar 30, 2021 at 2:46 PM Vishal Santoshi wrote: > Got it. Is it possible to add this very important note to the > doc

Re: Failure detection in Flink

2021-03-30 Thread Sonam Mandal
Hi Till, Thanks, this helps! Yes, removing the AKKA related configs will definitely help to reduce confusion. One more question: I was going through FLIP-6 and it does talk about the behavior of various components when failures are detected via heartbeat timeouts etc. Is this the best referenc

Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/66872184/flink-table-to-datastream-how-to-access-column-name . I want to consume a Kafka topic into a table using Flink SQL, then convert it back to a DataStream. Here is the `SOURCE_DDL`: ``` CREATE TABLE kafka_s

Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Lehuede sebastien
Hi all, I’m currently trying to use Scala to set up a simple Kafka consumer that receives JSON formatted events and then just sends them to Elasticsearch. This is the first step, and afterwards I want to add some processing logic. My code works well but interesting fields from my JSON formatted events

Re: Support for sending generic class

2021-03-30 Thread Tzu-Li (Gordon) Tai
Hi Le, Thanks for reaching out with this question! It's actually a good segue to allow me to introduce you to StateFun 3.0.0 :) StateFun 3.0+ comes with a new type system that would eliminate this hassle. You can take a sneak peek here [1]. This is part 1 of a series of tutorials on fundamentals

Re: StateFun examples in scala

2021-03-30 Thread Tzu-Li (Gordon) Tai
Hi Jose! For Scala, we would suggest to wait until StateFun 3.0.0 is released, which is actually happening very soon (likely within 1-2 weeks) as there is an ongoing release candidate vote [1]. The reason for this is that version 3.0 adds a remote SDK for Java, which you should be able to use wit

Re: Evenly distribute task slots across task-manager

2021-03-30 Thread Till Rohrmann
Hi Vignesh, if I understand you correctly, then you have a job like: KafkaSources(parallelism = 64) => Mapper(parallelism = 16) => something else Moreover, you probably have slot sharing enabled which means that a KafkaSource and a Mapper can be deployed into the same slot. So what happens befo

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Vishal Santoshi
Got it. Is it possible to add this very important note to the documentation? Our case is the former, as in this is an infinite pipeline and we were establishing the CI/CD release process for when non-breaking changes (DAG-compatible changes) are made on a running pipe. Regards On Tue, Mar 30, 2021 at

Re: StateFun examples in scala

2021-03-30 Thread Till Rohrmann
Hi Jose, I am pulling in Gordon who will be able to help you with your question. Personally, I am not aware of any limitations which prohibit the usage of Scala. Cheers, Till On Tue, Mar 30, 2021 at 11:55 AM jose farfan wrote: > Hi > > I am trying to find some examples written in scala of Sta

Re: Support for sending generic class

2021-03-30 Thread Till Rohrmann
Hi Le, I am pulling in Gordon who might be able to help you with your question. Looking at the interface Context, it looks that you cannot easily specify a TypeHint for the message you want to send. Hence, I guess that you explicitly need to register these types. Cheers, Till On Tue, Mar 30, 20

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Till Rohrmann
Hi Vishal, The difference between stop-with-savepoint and stop-with-savepoint-with-drain is that the latter emits a max watermark before taking the snapshot. The idea is to trigger all pending timers and flush the content of some buffering operations like windowing. Semantically, you should use th

Re: Failure detection in Flink

2021-03-30 Thread Till Rohrmann
Hi Sonam, Flink uses its own heartbeat implementation to detect failures of components. This mechanism is independent of the used deployment model. The relevant configuration options can be found here [1]. The akka.transport.* options are only for configuring the underlying Akka system. Since we

Re: Flink State Query Server threads stuck in infinite loop with high GC activity on CopyOnWriteStateMap get

2021-03-30 Thread Till Rohrmann
Hi Aashutosh, The queryable state feature is no longer actively maintained by the community. What I would recommend is to output the aggregate counts via a sink to some key value store which you query to obtain the results. Looking at the implementation of CopyOnWriteStateMap, it does not look li

StateFun examples in scala

2021-03-30 Thread jose farfan
Hi, I am trying to find some examples of StateFun written in Scala, but I cannot find any. My questions are: 1. Is there any problem with using StateFun with Scala? 2. Is there any place with examples written in Scala? BR Jose

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Chesnay Schepler
This looks related to HDFS-12920; where Hadoop 2.X tries to read a duration from hdfs-default.xml expecting plain numbers, but in 3.x they also contain time units. On 3/30/2021 9:37 AM, Matthias Seiler wrote: Thank you all for the replies! I did as @Maminspapin suggested and indeed the prev
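
The mismatch described in this reply can be illustrated in plain Java. This is only a minimal sketch, not Hadoop's actual code path: the class `DurationParsing` and both helper methods are hypothetical names introduced for illustration. It shows that a strict numeric parse rejects a value such as "30s", while a parser that tolerates a trailing "s" unit accepts it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DurationParsing {
    // Hypothetical pattern: digits, optionally followed by a single "s" unit suffix.
    private static final Pattern SECONDS = Pattern.compile("(\\d+)s?");

    // Strict parse, as a reader expecting plain numbers would do.
    // Returns the exception message on failure, or null on success.
    public static String strictParseError(String value) {
        try {
            Long.parseLong(value);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    // Lenient parse (assumption for illustration): accepts "30" as well as "30s".
    public static long parseSecondsLenient(String value) {
        Matcher m = SECONDS.matcher(value.trim());
        if (!m.matches()) {
            throw new NumberFormatException("For input string: \"" + value + "\"");
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Strict parsing fails on a value that carries a time unit.
        System.out.println("strict error: " + strictParseError("30s"));
        // Lenient parsing handles both forms.
        System.out.println("lenient 30s:  " + parseSecondsLenient("30s"));
        System.out.println("lenient 30:   " + parseSecondsLenient("30"));
    }
}
```

If the diagnosis in this reply applies, the practical fix is to make sure the Hadoop client libraries and the configuration defaults on the classpath come from the same major version, rather than to patch the parsing.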

Re: With the checkpoint interval of the same size, the Flink 1.12 version of the job checkpoint time-consuming increase and production failure, the Flink1.9 job is running normally

2021-03-30 Thread Yingjie Cao
Hi Haihang, After scanning the user mailing list, I found that some users have reported checkpoint timeouts when using unaligned checkpoints. Can you share which checkpoint mode you use? (The information can be found in the log or the checkpoint -> configuration tab in the web UI) Best, Yingjie Yingjie Cao

Re: DataStream from kafka topic

2021-03-30 Thread Maminspapin
I tried this: 1. Schema (found in stackoverflow) class GenericRecordSchema implements KafkaDeserializationSchema { private String registryUrl; private transient KafkaAvroDeserializer deserializer; public GenericRecordSchema(String registryUrl) { this.registryUrl = registryUr

Re: With the checkpoint interval of the same size, the Flink 1.12 version of the job checkpoint time-consuming increase and production failure, the Flink1.9 job is running normally

2021-03-30 Thread Yingjie Cao
Hi Haihang, I think your issue is not related to FLINK-16404, because that change should have a small impact on checkpoint time; we already have a micro benchmark for that change (1s checkpoint interval) and no regression is seen. Could you share

Re: Evenly distribute task slots across task-manager

2021-03-30 Thread yidan zhao
I think Flink currently doesn't support your case. Another idea is to set the parallelism of all operators to 64; then they will be evenly distributed to the two taskmanagers. Vignesh Ramesh wrote on Thu, Mar 25, 2021 at 1:05 AM: > Hi Matthias, > > Thanks for your reply. In my case, yes the upstrea

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Matthias Seiler
Thank you all for the replies! I did as @Maminspapin suggested and indeed the previous error disappeared, but now the exception is ``` java.io.IOException: Cannot instantiate file system for URI: hdfs://node-1:9000/flink //... Caused by: java.lang.NumberFormatException: For input string: "30s" //