[ 
https://issues.apache.org/jira/browse/KAFKA-364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397471#comment-13397471
 ] 

Ross Black edited comment on KAFKA-364 at 6/20/12 1:22 PM:
-----------------------------------------------------------

In the API redesign, it would be nice to allow flexible/pluggable control over 
which [broker:partition] pairs producers write to and consumers read from when 
using ZooKeeper management.
(I was not certain whether "Manual partition assignment" covers this; it does 
not mention producer partition control.)

I currently use SyncProducer and SimpleConsumer to directly control the set of 
[broker:partition] that a producer writes to and that a consumer reads from.

I need this for a scenario where the consumer holds some state (like a cache) 
on local disk.  It is expensive to discard the local state: the consumer must 
then instead perform a remote lookup with very high latency (> 5 mins).  I 
need the partitioning performed by a producer to remain fixed until explicitly 
changed (the number of producers is relatively static, and each producer sends 
messages into a dedicated broker).  I need each consumer to keep fetching the 
same partitions unless it has been failed for more than some period of time 
(approx 5 mins), so that if it recovers quickly I have not wastefully discarded 
local state.
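
To make the consumer-side requirement concrete, here is a minimal sketch of the 
assignment rule I have in mind.  It is only an illustration: the function name, 
the "broker:partition" string keys and the heartbeat timestamps are all 
hypothetical and not part of any existing Kafka API.

```python
from typing import Dict, List

GRACE_SECONDS = 300.0  # assumption: roughly the 5 minutes mentioned above


def sticky_assign(
    current: Dict[str, str],
    last_seen: Dict[str, float],
    now: float,
    grace_seconds: float = GRACE_SECONDS,
) -> Dict[str, str]:
    """Return a new "broker:partition" -> consumer mapping.

    A partition stays with its current owner unless that owner has been
    silent for longer than grace_seconds.  Only then is it moved, to the
    eligible consumer that owns the fewest partitions so far.
    """
    # Consumers heard from within the grace period keep their partitions,
    # even if they are briefly down right now.
    eligible: List[str] = sorted(
        c for c, seen in last_seen.items() if now - seen <= grace_seconds
    )
    if not eligible:
        # Nobody to fail over to, so leave the assignment untouched.
        return dict(current)

    load = {c: 0 for c in eligible}
    result: Dict[str, str] = {}
    orphans: List[str] = []

    for partition in sorted(current):
        owner = current[partition]
        if owner in load:
            result[partition] = owner  # sticky: no movement
            load[owner] += 1
        else:
            orphans.append(partition)  # owner exceeded the grace period

    for partition in orphans:
        # Least-loaded eligible consumer, ties broken by name.
        target = min(eligible, key=lambda c: (load[c], c))
        result[partition] = target
        load[target] += 1

    return result
```

The point of the sketch is that a short outage changes nothing; partitions 
only move once their owner has been gone for longer than the grace period.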

Currently, if I use Producer with ZooKeeper, the Partitioner API allows me to 
partition messages, but Kafka code in the Producer then controls how the 
Partitioner result is mapped to a physical [broker:partition].  If I use 
Producer with fixed brokers, messages are allocated to random partitions.  If I 
use the high level consumer, Kafka code in ZookeeperConsumerConnector controls 
the allocation of [broker:partition] to available consumers.
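
As a rough sketch of what a pluggable producer-side allocation could look like, 
the step from the Partitioner result to a physical [broker:partition] could be 
a strategy supplied by the user.  All names below are hypothetical and do not 
correspond to an existing Kafka API.

```python
from typing import Callable, List, Tuple

BrokerPartition = Tuple[int, int]  # (broker id, partition id)
Partitioner = Callable[[int, int], int]  # (key, num_logical) -> logical index
Allocator = Callable[[int], BrokerPartition]  # logical index -> physical target


def fixed_allocator(table: List[BrokerPartition]) -> Allocator:
    """Build an allocator backed by a fixed, user-supplied table.

    The mapping never changes on its own; it only changes when the caller
    builds a new allocator from a new table.
    """
    frozen = list(table)  # copy so later edits to `table` have no effect

    def allocate(logical: int) -> BrokerPartition:
        if not 0 <= logical < len(frozen):
            raise ValueError(f"no physical partition for logical index {logical}")
        return frozen[logical]

    return allocate


def route(key: int, num_logical: int,
          partitioner: Partitioner, allocator: Allocator) -> BrokerPartition:
    """Pick the physical [broker:partition] for a message key."""
    logical = partitioner(key, num_logical)
    return allocator(logical)
```

With something like this, the Partitioner keeps its current role and only the 
final mapping step becomes user-controlled.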

I understand if this is too specialised a use case to cater for.  At a minimum, 
I would like the equivalent functionality of SyncProducer and SimpleConsumer to 
be preserved in a public API.

Thanks,
Ross


> Consumer re-design
> ------------------
>
>                 Key: KAFKA-364
>                 URL: https://issues.apache.org/jira/browse/KAFKA-364
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>
> We've received quite a lot of feedback on the consumer side features over the 
> past few months. Some of them are improvements to the current consumer design 
> and some are simply new feature/API requests. I have attempted to write up 
> the requirements that I've heard on this wiki -
> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Client+Re-Design
> This would involve some significant changes to the consumer APIs, so we would 
> like to collect feedback on the proposal from our community. Since the list 
> of changes is not small, we would like to understand if some features are 
> preferred over others, and more importantly, if some features are not 
> required at all. 
