Github user mjsax commented on the pull request:

    https://github.com/apache/storm/pull/694#issuecomment-134315867
  
    Hi, I did a few performance tests and, unfortunately, the impact of my 
changes is quite high. Not using batching in my branch (compared to master) 
reduces throughput by 40%. :(
    
    I dug into the problem and it seems that there are two critical points in 
the code. First, I introduced the function `emit-msg` (in `executor.clj`). This 
function uses a Java `HashMap` (`output-batch-buffer`) and it branches; both 
have a large negative impact. Can it be that access via `HashMap.get()` is 
slow in Clojure? What about the branching? To me, their performance impact 
seems ridiculously high (even taking into account that this code is called a 
few hundred thousand times per second). Branch prediction in particular should 
eliminate the branching overhead almost entirely, since batching was disabled 
in the test and the else branch is taken every time...
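    
    To make this concrete, here is a rough Java sketch of the per-tuple path 
described above (the names are made up and this is not the actual `emit-msg` 
code, just its shape): one lookup in the `output-batch-buffer` map keyed by 
the target task, followed by a branch on whether batching is enabled.
    
    ```java
    import java.util.ArrayList;
    import java.util.HashMap;
    
    // Sketch only: hypothetical names, roughly the shape of the per-tuple work
    // described above (one HashMap lookup plus a branch on the batching flag).
    class EmitSketch {
        private final HashMap<Integer, ArrayList<Object>> outputBatchBuffer = new HashMap<>();
        private final boolean batchingEnabled;
    
        EmitSketch(boolean batchingEnabled) {
            this.batchingEnabled = batchingEnabled;
        }
    
        void emitMsg(int targetTaskId, Object tuple) {
            if (batchingEnabled) {
                // buffer the tuple until the batch for this task is flushed
                ArrayList<Object> batch = outputBatchBuffer.get(targetTaskId);
                if (batch == null) {
                    batch = new ArrayList<Object>();
                    outputBatchBuffer.put(targetTaskId, batch);
                }
                batch.add(tuple);
            } else {
                // batching disabled: this branch is taken for every single tuple
                transfer(targetTaskId, tuple);
            }
        }
    
        private void transfer(int targetTaskId, Object tuple) {
            // stand-in for the executor's transfer function
        }
    }
    ```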
    
    Furthermore, I changed the serialization by overloading 
`KryoTuple(Batch)Serializer.serialize(...)` (I renamed this class). It seems 
that this overloading makes the call to `.serialize(Tuple)` much more 
expensive. I was thinking about changing the code so that, even if batching is 
disabled, I just use batches of size 1 to eliminate the overloading. Of 
course, this comes with the drawback that a single tuple consumes more space, 
as I need to write the batch size in front of each batch.
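    
    As a sketch of that idea (assuming Kryo's `Output` class; the names are 
made up and this is not the actual serializer code), a size-1 batch would just 
prefix the already-serialized tuple with the batch size, which is exactly the 
extra space per tuple mentioned above:
    
    ```java
    import com.esotericsoftware.kryo.io.Output;
    
    // Sketch only: wrap a single already-serialized tuple as a "batch" of size 1.
    // The batch-size prefix is the per-tuple space overhead mentioned above.
    class SizeOneBatchSketch {
        static byte[] serialize(byte[] serializedTuple) {
            Output output = new Output(serializedTuple.length + 8, -1); // grows if needed
            output.writeInt(1, true);            // batch size (always 1 here), as a varint
            output.writeBytes(serializedTuple);  // the tuple payload itself
            return output.toBytes();
        }
    }
    ```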
    
    Does this observation make sense? Or do you think I am overlooking something?

