[ 
https://issues.apache.org/jira/browse/KAFKA-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642229#comment-14642229
 ] 

Neha Narkhede edited comment on KAFKA-2350 at 7/27/15 3:55 AM:
---------------------------------------------------------------

A few thoughts:
bq. 1. Add pause/unpause
bq. 2. Allow subscribe(topic) followed by unsubscribe(partition) to subscribe 
to a topic but suppress a partition
bq. 3. Either of the above but making the use of group management explicit 
using an enable.group.management flag

I'm in favor of option 1 for several reasons:
1. It keeps the API semantics clean. subscribe/unsubscribe indicates intent to 
consume data, while pause/resume indicates a *temporary* preference for the 
purposes of flow control.
2. It avoids the many permutations of subscribe/unsubscribe that we would 
otherwise need to worry about. Each one would have to make sense and be 
explained clearly to the user, and this discussion is confusing enough that 
I'm convinced that will not be easy.
3. pause/resume moves the consumer to a different state in its state diagram. 
Overloading the same API to represent two different states is unintuitive.

Also +1 on:
1. Renaming unpause to resume.
2. Not maintaining the pause/resume preference across consumer rebalances.
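
To make the distinction above concrete, here is a minimal sketch of the state 
model, assuming topic-level granularity. ConsumerStateSketch and its 
onRebalance() hook are hypothetical names used only for illustration; this is 
not the KafkaConsumer implementation. It tracks the subscription and the pause 
preference separately, and drops the pause preference on a rebalance.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for consumer state; not the real KafkaConsumer.
class ConsumerStateSketch {
    // Intent to consume data.
    private final Set<String> subscribed = new HashSet<>();
    // Temporary flow-control preference, kept apart from the subscription.
    private final Set<String> paused = new HashSet<>();

    void subscribe(String topic) {
        subscribed.add(topic);
    }

    void unsubscribe(String topic) {
        subscribed.remove(topic);
        paused.remove(topic);
    }

    void pause(String topic) {
        paused.add(topic);
    }

    void resume(String topic) {
        paused.remove(topic);
    }

    // Assumed rebalance hook: the pause preference is not carried over.
    void onRebalance() {
        paused.clear();
    }

    // Topics the consumer intends to consume, whether paused or not.
    Set<String> subscription() {
        return new TreeSet<>(subscribed);
    }

    // Topics that may be fetched right now: subscribed and not paused.
    Set<String> fetchable() {
        Set<String> result = new TreeSet<>(subscribed);
        result.removeAll(paused);
        return result;
    }
}
```

With "orders" and "payments" subscribed, pause("orders") leaves subscription() 
unchanged while fetchable() contains only "payments". After onRebalance(), 
both topics are fetchable again.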

I'm also not in favor of adding the enable.group.management config. I agree 
with Jay that adding the config will just complicate the semantics and reduce 
operational simplicity, increasing the number of ways in which the user's API 
calls could fail to behave as expected. 

There may be complications in the implementation of the above preferences that 
I may have overlooked, but I feel we should design APIs for the right behavior 
and then figure out the implementation-related issues that come up as a 
result. 


> Add KafkaConsumer pause capability
> ----------------------------------
>
>                 Key: KAFKA-2350
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2350
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>
> There are some use cases in stream processing where it is helpful to be able 
> to pause consumption of a topic. For example, when joining two topics, you 
> may need to delay processing of one topic while you wait for the consumer of 
> the other topic to catch up. The new consumer currently doesn't provide a 
> nice way to do this. If you skip poll() or if you unsubscribe, then a 
> rebalance will be triggered and your partitions will be reassigned.
> One way to achieve this would be to add two new methods to KafkaConsumer:
> {code}
> void pause(String... topics);
> void unpause(String... topics);
> {code}
> When a topic is paused, a call to KafkaConsumer.poll will not initiate any 
> new fetches for that topic. After it is unpaused, fetches will begin again.
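
The behavior described in the issue can be sketched as follows. This is a 
minimal in-memory model, not code from the Kafka client: 
PausableConsumerSketch, its broker map, and the produce() helper are 
hypothetical, and only the pause/unpause signatures are taken from the 
description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the proposal; it does not use the Kafka client.
class PausableConsumerSketch {
    // Records waiting per topic, standing in for data held by the brokers.
    private final Map<String, Deque<String>> broker = new LinkedHashMap<>();
    private final Set<String> subscribed = new LinkedHashSet<>();
    private final Set<String> paused = new HashSet<>();

    // Helper for the sketch: simulate a record arriving on a topic.
    void produce(String topic, String record) {
        broker.computeIfAbsent(topic, t -> new ArrayDeque<>()).add(record);
    }

    void subscribe(String... topics) {
        subscribed.addAll(Arrays.asList(topics));
    }

    void pause(String... topics) {
        paused.addAll(Arrays.asList(topics));
    }

    void unpause(String... topics) {
        paused.removeAll(Arrays.asList(topics));
    }

    // Returns waiting records for every subscribed topic that is not paused.
    List<String> poll() {
        List<String> records = new ArrayList<>();
        for (String topic : subscribed) {
            if (paused.contains(topic)) {
                continue; // no new fetch is initiated for a paused topic
            }
            Deque<String> queue = broker.get(topic);
            while (queue != null && !queue.isEmpty()) {
                records.add(queue.poll());
            }
        }
        return records;
    }
}
```

In this model a paused topic stays in the subscription, so poll() keeps being 
called and the consumer never has to unsubscribe. Records that arrive while 
the topic is paused are returned by the first poll() after unpause().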


