TJ Giuli created SAMZA-342:
------------------------------

             Summary: Priority streams experience large latencies before being 
consumed by the stream processor
                 Key: SAMZA-342
                 URL: https://issues.apache.org/jira/browse/SAMZA-342
             Project: Samza
          Issue Type: Bug
          Components: kafka
    Affects Versions: 0.7.0
         Environment: ubuntu 13.10
            Reporter: TJ Giuli


I have a stream processor that takes inputs from multiple streams, some are 
more batch, non-latency sensitive and others are real-time, infrequently have 
traffic and should be low-latency.  The real-time stream helps me interpret the 
batch stream, so I would ideally like any real-time stream envelopes delivered 
within some maximum latency from the time the message enters into a Kafka 
topic.  

I have my stream processor configured to prioritize my real-time streams over 
the batch streams, but I consistently find that the real-time stream is delayed 
by traffic from the batch stream.  From tracing the Kafka consumer, it looks 
like my stream processor periodically fetches from Kafka, finds that the batch 
streams have a large chunk of messages waiting, doesn’t find anything on the 
real-time topics, and processes away the batch messages for a few minutes. 
During the batch processing, the Kafka consumer does not poll the real-time 
streams, so if a message is sent to a real-time topic, the message effectively 
doesn’t arrive until the next time the Kafka consumer does another fetch.  When 
a real-time message is consumed by the Kafka consumer, the 
TieredPriorityChooser correctly prioritizes traffic from the real-time streams 
over the batch streams.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to