Thanks, Jun.  We have considered doing message filtering in the consumer.  
However, the thrust of my question below is not filtering, but dispatching.  If 
we take Chris' recommendation and pump a small set of msg types, belonging to 
the same "class" of messages, such as Account History, through the same topic, 
we will want to process all the messages, but we will want to process each msg 
type within the "class" differently, so we will want to dispatch to different 
handlers.
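
To make the dispatching idea concrete, here is a minimal sketch in plain Python rather than against the Kafka consumer API. Everything in it is an assumption for illustration: the message envelope (a dict with a `type` field), the `Dispatcher` class, and the "deposit"/"withdrawal" type names standing in for msg types within an Account History class.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class Dispatcher:
    """Routes each message to the handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, msg_type: str, handler: Handler) -> None:
        self._handlers[msg_type] = handler

    def dispatch(self, message: Dict[str, Any]) -> None:
        # "type" is an assumed envelope field, not something Kafka provides.
        msg_type = message["type"]
        handler = self._handlers.get(msg_type)
        if handler is None:
            raise KeyError(f"no handler registered for type {msg_type!r}")
        handler(message)


# Hypothetical msg types within one "class" of messages on a single topic.
deposits: List[Dict[str, Any]] = []
withdrawals: List[Dict[str, Any]] = []

dispatcher = Dispatcher()
dispatcher.register("deposit", deposits.append)
dispatcher.register("withdrawal", withdrawals.append)

# Stand-in for the stream of messages read from the topic.
for msg in [
    {"type": "deposit", "amount": 100},
    {"type": "withdrawal", "amount": 40},
    {"type": "deposit", "amount": 25},
]:
    dispatcher.dispatch(msg)
```

Every message is processed, none is filtered out; an unknown type raises an error instead of being silently dropped.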

I totally see your point that if we only want to process a subset of the 
messages, then we really ought to filter in the producer and send the filtered 
message stream to its own topic.

I am leaning toward the architecture of having a different consumerConnector 
per topic, as there ARE plenty of ports.  This allows per topic control, which 
is useful.  Do you see any issues with this approach?
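
A rough sketch of what I mean by per topic control, again in plain Python. `TopicConsumer` is a hypothetical stand-in for one consumerConnector bound to one topic, and the topic names are made up; none of this is the Kafka API.

```python
from typing import Dict, List


class TopicConsumer:
    """Hypothetical stand-in for one consumerConnector bound to one topic."""

    def __init__(self, topic: str) -> None:
        self.topic = topic
        self.running = False

    def start(self) -> None:
        self.running = True

    def shutdown(self) -> None:
        self.running = False


class ConsumerRegistry:
    """Keeps one independent consumer per topic."""

    def __init__(self) -> None:
        self._consumers: Dict[str, TopicConsumer] = {}

    def add_topic(self, topic: str) -> TopicConsumer:
        # Adding a topic creates a new consumer; existing ones are untouched.
        if topic in self._consumers:
            raise ValueError(f"topic {topic!r} already has a consumer")
        consumer = TopicConsumer(topic)
        consumer.start()
        self._consumers[topic] = consumer
        return consumer

    def remove_topic(self, topic: str) -> None:
        # Shut down only this topic's consumer.
        self._consumers.pop(topic).shutdown()

    def running_topics(self) -> List[str]:
        return sorted(t for t, c in self._consumers.items() if c.running)


registry = ConsumerRegistry()
history = registry.add_topic("account-history")  # made-up topic names
orders = registry.add_topic("orders")
registry.remove_topic("orders")
```

After the last call, only the "account-history" consumer is still running; stopping "orders" did not require restarting anything else.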

Thanks,
rob 


-----Original Message-----
From: Jun Rao [mailto:jun...@gmail.com] 
Sent: Wednesday, May 29, 2013 9:58 AM
To: users@kafka.apache.org
Subject: Re: one consumerConnector or many?

Rob,

You are correct that each instance of consumer will use a single socket to 
connect to a broker, independent of # topics/partitions. One thing that's good 
to avoid is to read all data and filter in the consumer, especially when the 
data is consumed multiple times by different consumers. In this case, it's 
better to put the filtered data in a separate topic and let all consumers 
consume the filtered data directly.

Thanks,

Jun




On Wed, May 29, 2013 at 6:13 AM, Rob Withers <reefed...@gmail.com> wrote:

> In thinking about the design of consumption, we have in mind a generic 
> consumer server which would consume from more than one message type.  
> The handling of each type of message would be different.  I suppose we 
> could have upwards of say 50 different message types, eventually, 
> maybe 100+ different types.  Which of the following designs would be 
> best and why would the other options be bad?
>
>
>
> 1)      Have all message types go through one topic and use a dispatcher
> pattern to select the correct handler.  Use one consumerConnector.
>
> 2)      Use a different topic for each message type, but still use one
> consumerConnector and a dispatcher pattern.
>
> 3)      Use a different topic for each message type and have a separate
> consumerConnector for each topic.
>
>
>
> I am struggling with whether my assumptions are correct.  It seems 
> that a single connector for a topic would establish one socket to each 
> broker, as rebalancing assigns various partitions to that thread.  
> Option 2 would pull messages from more than one topic through a single 
> socket to a particular broker, is it so?  Would option 3 be 
> reasonable, establishing upwards of 100 sockets per broker?
>
>
>
> I am guesstimating that option 2 is the right way forward, to bound 
> socket use, and we'll need to figure out a way to parameterize stream 
> consumption with the right handlers for a particular msg type.  If we 
> add a topic, do you think we should create a new connector or restart 
> the original connector with the new topic in the map?
>
>
>
> Thanks,
>
> rob
>
>
