Hello Dmitry,
Could you please elaborate on how you tuned
environment.addDefaultKryoSerializer(..)?
I'm interested in knowing what you did there to get a boost of about
50%.
A small, simple example would be very nice.
Thank you very much in advance.
Kind Regards,
Daniel Santos
On 02/17/2017 12:43 PM, Dmitry Golubets wrote:
Hi,
Unfortunately, my streaming job cannot benefit much from parallelization.
So I'm looking for things I can tune in Flink to make it process a
sequential stream faster.
In our current engine, based on Akka Streams (non-distributed, of
course), we get 20k msg/sec.
Ported to Flink, I'm getting 14k so far.
My observations are the following:
* if I chain operations together, they all execute in sequence, so I
basically sum up the time required to process one data item across
all my stream operators, which is not good
* if I split chains, they execute asynchronously to each other, but
there is serialization and network overhead
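
For illustration, the two variants above can be sketched with the
chaining controls of the Flink Scala DataStream API. This is only a
minimal sketch and not the actual job: the sequence source, the
arithmetic in the operators and the names ChainingSketch and
"chaining sketch" are assumptions made up for the example.

```scala
import org.apache.flink.streaming.api.scala._

object ChainingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    // Hypothetical input: a plain sequence of numbers.
    val source: DataStream[Long] = env.generateSequence(1L, 1000L)

    source
      // Eligible for chaining with its neighbours by default.
      .map((x: Long) => x * 2)
      // Starts a new chain at this operator, so it is not chained to
      // the previous map and records are handed over between tasks.
      .map((x: Long) => x + 1).startNewChain()
      // Never chained with the operator before or after it.
      .filter((x: Long) => x % 3 == 0).disableChaining()
      .print()

    // To split every chain in the job at once, one could instead call
    // env.disableOperatorChaining() before building the pipeline.
    env.execute("chaining sketch")
  }
}
```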
The second approach gives me better results, given that I have a
server with more than enough memory and cores to do all the side work
for serialization. But I want to reduce this serialization/data
transfer overhead to a minimum.
So this is what I have now:
// because it's Scala, we don't need unnecessary serialization
environment.getConfig.enableObjectReuse()
// it works faster with it, I'm not sure why
environment.getConfig.disableAutoTypeRegistration()
// custom Message Pack serialization for all message types, gives
// about 50% boost
environment.addDefaultKryoSerializer(..)
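
As a rough illustration of what such a setup could look like, here is
a minimal Scala sketch. It is not the serializer used in this job: the
Event class, the EventSerializer class and the field layout are
hypothetical, and the serializer writes fields directly through Kryo's
Output instead of using Message Pack. Note that a Kryo serializer is
only consulted for types that Flink treats as generic; case classes
and POJOs go through Flink's own serializers, so Event is deliberately
a plain class here.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.flink.streaming.api.scala._

// Hypothetical message type. Not a case class and not a POJO, so Flink
// falls back to Kryo for it.
final class Event(val id: Long, val payload: String)

// Hand-written Kryo serializer (assumption: two fields, written in a
// fixed order). A Message Pack based serializer would encode the same
// fields with a Message Pack packer instead.
class EventSerializer extends Serializer[Event] {
  override def write(kryo: Kryo, output: Output, event: Event): Unit = {
    output.writeLong(event.id)
    output.writeString(event.payload)
  }

  override def read(kryo: Kryo, input: Input, tpe: Class[Event]): Event = {
    // Fields must be read in the same order they were written.
    val id = input.readLong()
    val payload = input.readString()
    new Event(id, payload)
  }
}

object SerializerSketch {
  def main(args: Array[String]): Unit = {
    val environment = StreamExecutionEnvironment.getExecutionEnvironment

    environment.getConfig.enableObjectReuse()
    environment.getConfig.disableAutoTypeRegistration()
    // Registering by class requires a no-argument constructor on the
    // serializer, but the serializer need not be java.io.Serializable.
    environment.addDefaultKryoSerializer(classOf[Event], classOf[EventSerializer])

    environment
      .fromElements(new Event(1L, "a"), new Event(2L, "b"))
      .map((e: Event) => e.payload)
      .print()

    environment.execute("serializer sketch")
  }
}
```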
But that's it, I don't know what else to do.
I didn't find any interesting network/buffer settings in the docs.
Best regards,
Dmitry