[
https://issues.apache.org/jira/browse/KAFKA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159545#comment-15159545
]
Jay Kreps edited comment on KAFKA-3262 at 2/23/16 7:58 PM:
-----------------------------------------------------------
This is a good catch; I ran into this issue too. Another issue is that the
clients default to debug logging, which is not ideal for development (it can be
kind of confusing whether something is happening or not, since all the action is
in the event loop).
I'm a little hesitant about fixing this issue by background threading, though. A
few things to be careful of:
1. Orchestrating work back and forth with the background thread is complicated.
2. If we use a blocking queue to pass data, it will be really important to batch
actions so we don't kill performance (or at least that was our finding before).
3. Having a single thread, with the debugger stepping into the consumer
itself, is actually more transparent (I think) and will make various failure
scenarios work the way we want (e.g. the contract in the consumer is that a
fatal error throws an exception, which should propagate to the caller and not
just kill the bg thread).
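To make points 2 and 3 concrete, here is a minimal sketch of what a batched
handoff with error propagation might look like. This is only an illustration:
BatchedHandoff and its methods are hypothetical names, not part of the Kafka
clients, and a real implementation would also have to deal with rebalances and
offset commits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these names exist in Kafka. */
public class BatchedHandoff {
    // Each queue element is a whole batch, so the queue's locking and
    // wake-up cost is paid once per batch rather than once per record.
    private final BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(16);
    // First fatal error reported by the background thread, if any.
    private final AtomicReference<RuntimeException> failure = new AtomicReference<>();

    /** Called by the background thread to hand over one batch of records. */
    public void publish(List<String> batch) {
        try {
            queue.put(new ArrayList<>(batch)); // blocks while the queue is full
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while publishing", ie);
        }
    }

    /** Called by the background thread when it hits a fatal error. */
    public void fail(RuntimeException e) {
        failure.compareAndSet(null, e); // keep only the first error
    }

    /** Called on the caller's thread; rethrows a background failure if one was recorded. */
    public List<String> poll(long timeoutMs) {
        RuntimeException e = failure.get();
        if (e != null) {
            throw e; // the error surfaces to the caller instead of silently killing the bg thread
        }
        try {
            List<String> batch = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
            return batch == null ? new ArrayList<String>() : batch;
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while polling", ie);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BatchedHandoff handoff = new BatchedHandoff();
        Thread fetcher = new Thread(() -> handoff.publish(Arrays.asList("a", "b", "c")));
        fetcher.start();
        System.out.println("batch: " + handoff.poll(1000));
        fetcher.join();

        handoff.fail(new IllegalStateException("fatal error in background thread"));
        try {
            handoff.poll(10);
        } catch (IllegalStateException ex) {
            System.out.println("propagated: " + ex.getMessage());
        }
    }
}
```

Even in this small form the caller only learns about a failure on its next
poll, which is part of the orchestration cost mentioned in point 1.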
I suppose an alternative to changing the threading model would be just to set
the session timeout really high in development. It occurs to me there are
several things you might want in development:
1. Large or infinite session timeout
2. More logging
3. Single threaded?
4. Re-start from the beginning of the inputs?
5. Recreate intermediate topics?
Dunno, maybe there should be some kind of overall "dev-mode" for all of this?
Just thinking out loud...
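As a rough sketch of what such a "dev-mode" could amount to on the config side,
the snippet below layers development overrides on top of a base config. The
class DevModeConfig and the devMode flag are hypothetical; session.timeout.ms
and auto.offset.reset are existing consumer settings. Logging and topic
re-creation (items 2 and 5) are not covered here.

```java
import java.util.Properties;

/** Hypothetical helper; no such "dev-mode" switch exists in Kafka. */
public final class DevModeConfig {
    private DevModeConfig() {}

    /** Returns a copy of base, with development overrides applied when devMode is true. */
    public static Properties withDevMode(Properties base, boolean devMode) {
        Properties merged = new Properties();
        merged.putAll(base);
        if (devMode) {
            // Item 1: a long session timeout (one hour here, an arbitrary value) so a
            // thread paused in the debugger is not dropped from the group. The broker
            // caps this via group.max.session.timeout.ms, so that may need raising too,
            // and depending on the client version other timeouts may need adjusting.
            merged.setProperty("session.timeout.ms", "3600000");
            // Item 4: start from the earliest offset. This only takes effect when the
            // group has no committed offset for a partition.
            merged.setProperty("auto.offset.reset", "earliest");
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("session.timeout.ms", "30000");
        System.out.println(withDevMode(base, true));
    }
}
```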
> Make KafkaStreams debugging friendly
> ------------------------------------
>
> Key: KAFKA-3262
> URL: https://issues.apache.org/jira/browse/KAFKA-3262
> Project: Kafka
> Issue Type: Sub-task
> Components: kafka streams
> Affects Versions: 0.9.1.0
> Reporter: Yasuhiro Matsuda
>
> Currently, KafkaStreams polls records in the same thread that processes the
> data. This makes debugging user code, as well as KafkaStreams itself,
> difficult. When the thread is suspended by the debugger, the next heartbeat
> of the consumer tied to the thread won't be sent until the thread is resumed.
> This often results in missed heartbeats and causes a group rebalance, so the
> thread may well be in a completely different context the next time it hits
> the breakpoint.
> We should consider using separate threads for polling and processing.
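The problem described above can be illustrated with a deliberately simplified
model, assuming heartbeats go out only when the single thread reaches its next
poll. HeartbeatGap and the numbers used are hypothetical; the real group
coordination logic is more involved.

```java
/** Hypothetical, simplified model of heartbeats tied to the polling thread. */
public class HeartbeatGap {
    /**
     * Returns true if any gap between consecutive polls exceeds the session
     * timeout, i.e. the group would treat the member as dead and rebalance.
     */
    public static boolean sessionExpires(long[] pollTimesMs, long sessionTimeoutMs) {
        for (int i = 1; i < pollTimesMs.length; i++) {
            if (pollTimesMs[i] - pollTimesMs[i - 1] > sessionTimeoutMs) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 30_000;
        long[] normalRun = {0, 100, 200, 300};
        // The same run with the thread held at a breakpoint for 60 seconds.
        long[] pausedRun = {0, 100, 60_100, 60_200};
        System.out.println("normal run expires: " + sessionExpires(normalRun, sessionTimeoutMs));
        System.out.println("paused run expires: " + sessionExpires(pausedRun, sessionTimeoutMs));
    }
}
```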