Hi Francis, Could you share the benchmark code you use?
Regards, Dian On Wed, Dec 22, 2021 at 11:31 AM Francis Conroy < francis.con...@switchdin.com> wrote: > I've just run an analysis using a similar example which involves a single > python flatmap operator and we're getting 100x less through by using python > over java. I'm interested to know if you can do such a comparison. I'm > using Flink 14.0. > > Thanks, > Francis > > On Thu, 18 Nov 2021 at 02:20, Thomas Portugal <thomasportug...@gmail.com> > wrote: > >> Hello community, >> My team is developing an application using Pyflink. We are using the >> Datastream API. Basically, we read from a kafka topic, do some maps, and >> write on another kafka topic. One restriction about it is the first map, >> that has to be serialized and with parallelism equals to one. This is >> causing a bottleneck on the throughput, and we are achieving approximately >> 2k msgs/sec. Monitoring the cpu usage and the number of records on each >> operator, it seems that the first operator is causing the issue. >> The first operator is like a buffer that groups the messages from kafka >> and sends them to the next operators. We are using a dequeue from python's >> collections. Since we are stuck on this issue, could you answer some >> questions about this matter? >> >> 1 - Using data structures from python can introduce some latency or >> increase the CPU usage? >> 2 - There are alternatives to this approach? We were thinking about >> Window structure, from Flink, but in our case it's not time based, and we >> didn't find an equivalent on python API. >> 3 - Using Table API to read from Kafka Topic and do the windowing can >> improve our performance? >> >> We already set some parameters like python.fn-execution.bundle.time and >> buffer.timeout to improve our performance. >> >> Thanks for your attention. >> Best Regards >> > > This email and any attachments are proprietary and confidential and are > intended solely for the use of the individual to whom it is addressed. Any > views or opinions expressed are solely those of the author and do not > necessarily reflect or represent those of SwitchDin Pty Ltd. If you have > received this email in error, please let us know immediately by reply email > and delete it from your system. You may not use, disseminate, distribute or > copy this message nor disclose its contents to anyone. > SwitchDin Pty Ltd (ABN 29 154893857) PO Box 1165, Newcastle NSW 2300 > Australia >