Performance Optimization for Apache Hop Pipeline Running on Flink

arjun s Mon, 17 Mar 2025 11:50:50 -0700

Hi team,

I am building a pipeline in Apache Hop and running it on Flink with the
flink run command to test its performance.


My Hop pipeline consists of:

   1. Reading messages from Kafka
   2. Mapping values using the *Split Fields* transform
   3. Executing a query in SingleStore using the *Database Join* transform
   to fetch corresponding values
   4. Sending the processed record to another Kafka topic

While testing with *100,000 records*, the query execution step is taking
significantly longer than expected, impacting overall performance. I have
also tested with *Database Lookup* and *Execute SQL* plugins, but the issue
persists.
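
To make the suspected bottleneck concrete, here is a minimal Python sketch of
the per-record lookup pattern described above. It is not Hop or Flink code:
an in-memory SQLite database stands in for SingleStore, and the table `ref`,
its columns `k` and `v`, and the function names are hypothetical. It only
counts how many queries are issued when every record triggers its own lookup,
compared with remembering results for keys that have already been seen
(which helps only if keys repeat in the input).

```python
import sqlite3


def build_reference_db(n_keys=10):
    """Create an in-memory lookup table (a stand-in for the real database)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ref (k TEXT PRIMARY KEY, v TEXT)")
    conn.executemany(
        "INSERT INTO ref VALUES (?, ?)",
        [(f"k{i}", f"v{i}") for i in range(n_keys)],
    )
    return conn


def enrich(messages, conn, use_cache):
    """Split each message, look up its key, and return (rows, query_count)."""
    cache = {}
    queries = 0
    rows = []
    for msg in messages:
        # Stand-in for the field-splitting step: "key,payload"
        key, payload = msg.split(",", 1)
        if use_cache and key in cache:
            value = cache[key]
        else:
            # One database round trip per record that reaches this branch
            hit = conn.execute("SELECT v FROM ref WHERE k = ?", (key,)).fetchone()
            queries += 1
            value = hit[0] if hit else None
            if use_cache:
                cache[key] = value
        rows.append((key, payload, value))
    return rows, queries


if __name__ == "__main__":
    conn = build_reference_db()
    messages = [f"k{i % 10},payload{i}" for i in range(1000)]
    _, uncached = enrich(messages, conn, use_cache=False)
    _, cached = enrich(messages, conn, use_cache=True)
    print("queries without cache:", uncached)
    print("queries with cache:", cached)
```

The sketch says nothing about absolute timings; it only shows that the number
of round trips grows with the record count unless repeated keys are reused.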

Could you suggest any best practices or tuning methods to optimize this
flow? Any guidance on improving query execution speed would be greatly
appreciated.

Please find the .hpl file attached.

Thanks in advance

TestTPS_Optimized.hpl
Description: Binary data
