Hi Francis,

Could you share the benchmark code you use?

Regards,
Dian

On Wed, Dec 22, 2021 at 11:31 AM Francis Conroy <
francis.con...@switchdin.com> wrote:

> I've just run an analysis using a similar example, which involves a single
> Python flatmap operator, and we're getting 100x less throughput using Python
> than Java. I'm interested to know if you can do such a comparison. I'm
> using Flink 1.14.0.
>
> Thanks,
> Francis
>
> On Thu, 18 Nov 2021 at 02:20, Thomas Portugal <thomasportug...@gmail.com>
> wrote:
>
>> Hello community,
>> My team is developing an application using PyFlink. We are using the
>> DataStream API. Basically, we read from a Kafka topic, do some maps, and
>> write to another Kafka topic. One restriction is that the first map
>> has to be serialized, with parallelism equal to one. This is
>> causing a throughput bottleneck, and we are achieving approximately
>> 2k msgs/sec. Monitoring the CPU usage and the number of records on each
>> operator, it seems that the first operator is causing the issue.
>> The first operator acts as a buffer that groups the messages from Kafka
>> and sends them to the next operators. We are using a deque from Python's
>> collections module. Since we are stuck on this issue, could you answer some
>> questions about this matter?
>>
>> 1 - Can using data structures from Python introduce latency or
>> increase CPU usage?
>> 2 - Are there alternatives to this approach? We were thinking about
>> Flink's window structure, but in our case it's not time-based, and we
>> didn't find an equivalent in the Python API.
>> 3 - Could using the Table API to read from the Kafka topic and do the
>> windowing improve our performance?
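
For reference, the count-based buffer described above might look roughly like the sketch below. The original operator code was not shared, so this is only an assumption about its shape: `CountBuffer`, `batches`, and `batch_size` are hypothetical names, and the sketch is plain Python with no PyFlink dependency so that it can be timed on its own.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


class CountBuffer:
    """Hypothetical helper: groups incoming messages into fixed-size batches."""

    def __init__(self, batch_size: int) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._pending: Deque[str] = deque()

    def add(self, message: str) -> Iterator[List[str]]:
        """Store one message; yield a batch once batch_size messages are pending."""
        self._pending.append(message)  # O(1) append on the right
        if len(self._pending) >= self.batch_size:
            # popleft() is O(1), so draining one batch costs O(batch_size).
            yield [self._pending.popleft() for _ in range(self.batch_size)]

    def flush(self) -> List[str]:
        """Return whatever is still pending (possibly an empty list)."""
        remaining = list(self._pending)
        self._pending.clear()
        return remaining


def batches(messages: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Drive a CountBuffer over an iterable, emitting a final partial batch."""
    buffer = CountBuffer(batch_size)
    for message in messages:
        yield from buffer.add(message)
    remaining = buffer.flush()
    if remaining:
        yield remaining


if __name__ == "__main__":
    for batch in batches(["a", "b", "c", "d", "e"], batch_size=2):
        print(batch)
```

Because `add` is a generator, it has the same shape as a flat-map style function that emits zero or more outputs per input. Timing `batches` alone on a large list is one way to check how much of the cost comes from the deque itself, as opposed to the rest of the pipeline.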
>>
>> We already set some parameters like python.fn-execution.bundle.time and
>> buffer.timeout to improve our performance.
>>
>> Thanks for your attention.
>> Best Regards
>>
>
