Re: Help needed to increase throughput of simple flink app

2020-11-12 Thread ashwinkonale
Hi Arvid, Thank you so much for your detailed reply. I think I will go with one schema per topic, using GenericRecordAvroTypeInfo for GenericRecords, and not do any custom magic. The approach of sending records as byte arrays also seems quite interesting. Right now I am deserializing Avro records so

Re: Help needed to increase throughput of simple flink app

2020-11-12 Thread ashwinkonale
So in this case, Flink will fall back to the default Kryo serializer, right?

Re: Help needed to increase throughput of simple flink app

2020-11-12 Thread ashwinkonale
Hi Arvid, Thanks a lot for your reply. And yes, we do use Confluent Schema Registry extensively. But the `ConfluentRegistryAvroDeserializationSchema` expects a reader schema to be provided. That means it reads the message using the writer schema and converts it to the reader schema. But this is not what I want

Re: Help needed to increase throughput of simple flink app

2020-11-12 Thread ashwinkonale
Hi, Thanks a lot for the reply. And you are both right. Serializing GenericRecord without specifying a schema was indeed a HUGE bottleneck in my app. I found this through JFR analysis and then read the blog post you mentioned. Now I am able to pump in a lot more data per second. (In my test setup a

Re: Help needed to increase throughput of simple flink app

2020-11-10 Thread ashwinkonale
Hey, I am reading messages with a schema ID and using Confluent Schema Registry to deserialize them to GenericRecord. After this point, the pipeline will have these objects moving through it. Can you give me some examples of the `special handling of avro messages` you mentioned?

Re: Help needed to increase throughput of simple flink app

2020-11-09 Thread ashwinkonale
Hi, Thanks a lot for the reply. I added some more metrics to the pipeline to understand the bottleneck. It seems that Avro deserialization introduces some delay. Using a histogram, I found that processing a single message takes ~300us (p99) and ~180us (p50), which means a single slot can output at most 3000 mes
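
A rough sketch of the arithmetic behind that per-slot estimate, assuming a slot handles messages strictly one at a time so that throughput is bounded by the inverse of the per-message latency. The function name `max_throughput_per_slot` is hypothetical and only illustrates the calculation; it is not part of Flink.

```python
def max_throughput_per_slot(latency_us: float) -> float:
    """Upper bound on messages per second for one slot, assuming each
    message is processed sequentially and takes latency_us microseconds."""
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    # One second is 1,000,000 microseconds.
    return 1_000_000 / latency_us


if __name__ == "__main__":
    # Latencies quoted in the message above: ~300us (p99), ~180us (p50).
    for label, latency_us in (("p99", 300), ("p50", 180)):
        bound = max_throughput_per_slot(latency_us)
        print(f"{label}: {latency_us}us per message -> at most ~{bound:.0f} msg/s")
```

At the p99 latency this gives a bound of roughly 3,300 messages per second per slot, in line with the "at most 3000" figure quoted above; at the p50 latency the bound is roughly 5,600.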

Re: Help needed to increase throughput of simple flink app

2020-11-09 Thread ashwinkonale
Hi Till, Thanks a lot for the reply. The problem I am facing is that as soon as I add a network hop (by removing chaining) to the discarding sink, I have a huge problem with throughput. Do you have any pointers on how I can go about debugging this? - Ashwin