[ 
https://issues.apache.org/jira/browse/SAMZA-342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074941#comment-14074941
 ] 

TJ Giuli commented on SAMZA-342:
--------------------------------

Hey, Chris:

1.)  This one is tough to gauge by eye. Watching the system in its normal run 
state and going by the logs of the process that sends out the real-time 
messages, a message shows up in Kafka almost instantaneously after it is sent, 
as observed with kafka-console-consumer.sh.

2.)  I do see those log messages, and once KafkaSystemConsumer gets the 
real-time message, there is some latency before my stream processor consumes it:
{noformat}
2014-07-25 13:41:46 KafkaSystemConsumer [TRACE] Incoming message [REALTIME,0]: 
MessageAndOffset(Message(magic = 0, attributes = 0, crc = 1154012242, key = 
null, payload = java.nio.HeapByteBuffer[pos=0 lim=1647 cap=1647]),5).

2014-07-25 13:41:49 TieredPriorityChooser [TRACE] Got prioritized envelope: 
IncomingMessageEnvelope [systemStreamPartition=SystemStreamPartition 
[partition=Partition [partition=0], system=kafka, stream=REALTIME], offset=4, 
key=null, message="XXX"
{noformat}

So it does appear that the KafkaSystemConsumer receives the message and takes 3 
seconds to deliver it, correct?
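
For what it's worth, the gap can be measured straight from the trace logs rather than by eye. Below is a minimal sketch in Python, assuming every log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp and reading the second timestamp above as 2014-07-25 13:41:49. The helper names (parse_timestamp, delivery_latency_seconds) are made up for illustration and are not part of Samza, and the two sample lines are abbreviated copies of the log lines quoted above.

```python
from datetime import datetime

# Assumption: each log line begins with a "YYYY-MM-DD HH:MM:SS" timestamp,
# as in the trace output quoted above.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = len("2014-07-25 13:41:46")


def parse_timestamp(log_line):
    """Parse the leading timestamp of a log line (hypothetical helper)."""
    return datetime.strptime(log_line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)


def delivery_latency_seconds(consumer_line, chooser_line):
    """Seconds between the consumer logging a message and the chooser returning it."""
    delta = parse_timestamp(chooser_line) - parse_timestamp(consumer_line)
    return delta.total_seconds()


# Abbreviated copies of the two log lines; the trailing "..." stands for the
# rest of each line, which is not needed for the calculation.
consumer_line = "2014-07-25 13:41:46 KafkaSystemConsumer [TRACE] Incoming message [REALTIME,0]: ..."
chooser_line = "2014-07-25 13:41:49 TieredPriorityChooser [TRACE] Got prioritized envelope: ..."

print(delivery_latency_seconds(consumer_line, chooser_line))
```

The logs only have one-second resolution, so this gives the delay to the nearest second at best.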

> Priority streams experience large latencies before being consumed by the 
> stream processor
> -----------------------------------------------------------------------------------------
>
>                 Key: SAMZA-342
>                 URL: https://issues.apache.org/jira/browse/SAMZA-342
>             Project: Samza
>          Issue Type: Bug
>          Components: kafka
>    Affects Versions: 0.7.0
>         Environment: ubuntu 13.10
>            Reporter: TJ Giuli
>
> I have a stream processor that takes inputs from multiple streams: some are 
> batch-oriented and not latency-sensitive, while others are real-time, carry 
> traffic infrequently, and should be low-latency.  The real-time stream helps me interpret 
> the batch stream, so I would ideally like any real-time stream envelopes 
> delivered within some maximum latency from the time the message enters into a 
> Kafka topic.  
> I have my stream processor configured to prioritize my real-time streams over 
> the batch streams, but I consistently find that the real-time stream is 
> delayed by traffic from the batch stream.  From tracing the Kafka consumer, 
> it looks like my stream processor periodically fetches from Kafka, finds that 
> the batch streams have a large chunk of messages waiting, doesn’t find 
> anything on the real-time topics, and processes away the batch messages for a 
> few minutes. During the batch processing, the Kafka consumer does not poll 
> the real-time streams, so if a message is sent to a real-time topic, the 
> message effectively doesn’t arrive until the next time the Kafka consumer 
> does another fetch.  When a real-time message is consumed by the Kafka 
> consumer, the TieredPriorityChooser correctly prioritizes traffic from the 
> real-time streams over the batch streams.
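
To put rough numbers on the behavior described in the issue, here is a toy model in Python. It is only a sketch of the mechanism as described above (no new fetch until the buffered batch messages are processed), not Samza's or Kafka's actual code. The function name and all the numbers are hypothetical and chosen purely for illustration.

```python
def realtime_wait_ms(batch_size, ms_per_message, arrival_offset_ms):
    """Estimate how long a real-time message waits before it is fetched.

    Toy model, assuming the consumer fetches once, buffers `batch_size`
    batch messages, and does not fetch again until all of them have been
    processed. `arrival_offset_ms` is how long after that fetch the
    real-time message lands in its Kafka topic.
    """
    drain_time_ms = batch_size * ms_per_message
    # A message arriving after the buffer has drained is picked up by the
    # next fetch without any extra wait in this model.
    return max(0, drain_time_ms - arrival_offset_ms)


# Hypothetical numbers: 6000 buffered batch messages at 20 ms each, with the
# real-time message arriving 5 seconds after the fetch.
print(realtime_wait_ms(6000, 20, 5000))
```

In this model the wait grows linearly with the size of the buffered batch, which would be consistent with real-time messages being held up for as long as the batch processing runs.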



--
This message was sent by Atlassian JIRA
(v6.2#6252)
