Python is generally restricted to a single CPU core, and kafka-python will
max out that core well before it saturates a network card. I would
recommend other tools for bulk transfers. Otherwise, you may find that
partitioning your data set and running a separate Python process for each
partition increases the overall CPU available and therefore the throughput.
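
A rough sketch of the multi-process approach, assuming the consumer side
and manual partition assignment. The helper names (split_partitions,
consume, start_workers, handle) and whatever topic, partition ids, and
broker address you pass in are placeholders for illustration, not part of
kafka-python. Only KafkaConsumer, TopicPartition, and assign() come from
the library.

```python
import multiprocessing


def split_partitions(partitions, num_workers):
    """Deal partition ids round-robin into num_workers groups."""
    partitions = list(partitions)
    return [partitions[i::num_workers] for i in range(num_workers)]


def handle(record):
    """Placeholder for per-message work (hypothetical)."""
    pass


def consume(topic, partition_ids, bootstrap_servers):
    # Imported inside the worker so each process builds its own client.
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers)
    # Manual assignment: this process reads only its share of partitions.
    consumer.assign([TopicPartition(topic, p) for p in partition_ids])
    for record in consumer:
        handle(record)


def start_workers(topic, partitions, num_workers, bootstrap_servers):
    """Start one process per non-empty partition group."""
    procs = []
    for group in split_partitions(partitions, num_workers):
        if not group:
            continue
        proc = multiprocessing.Process(
            target=consume, args=(topic, group, bootstrap_servers)
        )
        proc.start()
        procs.append(proc)
    return procs
```

Call start_workers() from under an `if __name__ == "__main__":` guard and
join the returned processes. Each worker is its own interpreter, so each
can use a separate core.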

One day I will spend time improving the CPU performance of kafka-python,
but probably not in the near term.

-Dana
