Github user mjsax commented on the pull request:
https://github.com/apache/storm/pull/694#issuecomment-134315867
Hi, I did a few performance tests and unfortunately the impact of my
changes is quite high: running my branch with batching disabled (compared to
master) reduces throughput by 40%. :(
I dug into the problem and it seems that there are two critical points in
the code. First, I introduced the function `emit-msg` (in `executor.clj`). This
function uses a Java `HashMap` (`output-batch-buffer`) and it branches. Both
have a large negative impact. Can it be that access via `HashMap.get()` is
slow from Clojure? And what about the branching? To me the performance
impact seems ridiculously high (even taking into account that this code is
called a few hundred thousand times per second). In particular, branch
prediction should eliminate the branching overhead almost entirely: because
batching was disabled in the test, the else branch is taken every time...
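To make the pattern concrete, here is a minimal Java sketch of the lookup-plus-branch I mean. The names (`EmitSketch`, `transfer`, the buffer layout) are illustrative only and not the actual `executor.clj` code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EmitSketch {
    // one pending batch per target task id; entries exist only when batching is enabled
    private final Map<Integer, List<Object>> outputBatchBuffer = new HashMap<>();
    private final int batchSize;

    EmitSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    void emitMsg(int targetTask, Object tuple) {
        // HashMap lookup on every emit -- one of the two suspected hot spots
        List<Object> batch = outputBatchBuffer.get(targetTask);
        if (batch != null) {
            // batching enabled: buffer the tuple and flush once the batch is full
            batch.add(tuple);
            if (batch.size() >= batchSize) {
                transfer(targetTask, new ArrayList<>(batch));
                batch.clear();
            }
        } else {
            // batching disabled: this else branch is taken on every call in my test,
            // so branch prediction should make it nearly free
            transfer(targetTask, tuple);
        }
    }

    private void transfer(int targetTask, Object payload) {
        // stand-in for handing the tuple/batch to the transfer queue
    }
}
```

Even in this stripped-down form, the per-task `HashMap` lookup and the branch sit on the hot path of every single emit, which is why I suspect them.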
Furthermore, I changed the serialization by overloading
`KryoTuple(Batch)Serializer.serialize(...)` (I renamed this class). It seems
that this overloading makes the call to `.serialize(Tuple)` much more
expensive. I was thinking about changing the code so that, even if batching
is disabled, I just use batches of size 1 to eliminate the overload. Of course,
this comes with the drawback that a single tuple consumes more space, as I need
to write the batch size in front of each batch.
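As a rough illustration of the "always a batch, even of size 1" idea (just a sketch: `serializeBatch` and the exact framing are assumptions on my side, only Kryo's `Output` is the real API):

```java
import com.esotericsoftware.kryo.io.Output;
import java.util.List;

class TupleBatchSerializerSketch {
    // serializedTuples: each element is the already-serialized payload of one tuple
    byte[] serializeBatch(List<byte[]> serializedTuples) {
        Output out = new Output(64, -1);               // buffer grows as needed
        out.writeInt(serializedTuples.size(), true);   // batch-size prefix: the extra bytes per batch
        for (byte[] tuple : serializedTuples) {
            out.writeInt(tuple.length, true);          // length prefix per tuple
            out.writeBytes(tuple);
        }
        return out.toBytes();
    }
}
```

The extra `writeInt` for the batch size is the per-tuple space overhead I mentioned, but it would let a single tuple go through exactly the same code path as a real batch.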
Does this observation make sense? Or do you think I overlooked something?