Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-13 Thread Zakee
Sorry, but I am still confused. Is that the maximum number of threads (fetchers) fetching from a leader, or the maximum number of threads within a follower broker? Thanks for clarifying, -Zakee On Mar 12, 2015, at 11:11 PM, tao xiao xiaotao...@gmail.com wrote: The number of fetchers is configurable via

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-13 Thread tao xiao
The number of fetchers is configurable via num.replica.fetchers. The description of num.replica.fetchers in the Kafka documentation is not quite accurate: num.replica.fetchers actually controls the max number of fetchers per broker. In your case, with num.replica.fetchers=8 and 5 brokers, that means no more than 8
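
One way to picture "max number of fetchers per broker" is the rough model below. It is a sketch, not Kafka's code: it assumes a follower keys its fetcher threads by (source broker, fetcher id) and picks the fetcher id by hashing the topic-partition modulo num.replica.fetchers. The CRC32 hash, the topic name "events", and the partition layout are all made up for illustration.

```python
import zlib
from collections import defaultdict


def fetcher_id(topic, partition, num_fetchers):
    # Stand-in hash (CRC32) so results are stable across runs.
    # Assumption: this is NOT the hash Kafka itself uses.
    return zlib.crc32(f"{topic}-{partition}".encode()) % num_fetchers


def fetcher_threads(leaders, num_fetchers):
    """leaders maps (topic, partition) -> leader broker id, as seen by one follower.

    Returns leader broker id -> set of fetcher ids in use for that source broker.
    """
    threads = defaultdict(set)
    for (topic, partition), leader in leaders.items():
        threads[leader].add(fetcher_id(topic, partition, num_fetchers))
    return dict(threads)


# Hypothetical follower in a 5-broker cluster: it follows 200 partitions
# whose leaders are spread over the other four brokers (ids 1 to 4).
leaders = {("events", p): 1 + p % 4 for p in range(200)}
per_source = fetcher_threads(leaders, num_fetchers=8)

for broker, ids in sorted(per_source.items()):
    print(f"source broker {broker}: {len(ids)} fetcher thread(s)")
```

Under these assumptions the count of fetcher threads per source broker can never exceed num.replica.fetchers, however many partitions are followed.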

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread Zakee
Is it always the case that there is only one fetcher per broker? Won’t setting num.replica.fetchers greater than the number of brokers cause more fetchers per broker? Let’s say I have 5 brokers and the number of replica fetchers is 8: will there be 2 fetcher threads pulling from each broker? Thanks

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread James Cheng
Ah, I understand now. I didn't realize that there was one fetcher thread per broker. Thanks, Tao and Guozhang! -James On Mar 11, 2015, at 5:00 PM, tao xiao xiaotao...@gmail.com wrote: Fetcher threads are on a per-broker basis, which ensures at least one fetcher thread per broker. Fetcher thread is

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread Guozhang Wang
Hi James, What I meant before is that a single fetcher may be responsible for putting fetched data into multiple queues, depending on how the streams are set up, where each queue may be consumed by a different thread. And the queues are actually bounded. Now say if there are two queues
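
The message is cut off in the archive, so the following is only one reading of the two-queue scenario. It is a minimal sketch assuming the single fetcher blocks as soon as one of its bounded queues is full; the queue names, sizes, and chunk labels are hypothetical, and "blocking" is modeled by stopping at the first full queue rather than by a real thread.

```python
import queue


def run_fetcher(chunks, queues):
    """Deliver (queue_name, chunk) pairs in order.

    Stops at the first full queue, which models the fetcher blocking there.
    Returns the delivered pairs and the name of the queue it blocked on (or None).
    """
    delivered = []
    for name, chunk in chunks:
        try:
            queues[name].put_nowait(chunk)
        except queue.Full:
            return delivered, name
        delivered.append((name, chunk))
    return delivered, None


# Two bounded queues fed by the same fetcher.
queues = {"q1": queue.Queue(maxsize=2), "q2": queue.Queue(maxsize=2)}

# The consumer of q1 is stalled and never drains it; the consumer of q2 is waiting.
chunks = [("q1", "a0"), ("q1", "a1"), ("q1", "a2"), ("q2", "b0")]
delivered, blocked_on = run_fetcher(chunks, queues)

print("delivered:", delivered)
print("blocked on:", blocked_on)
print("q2 size:", queues["q2"].qsize())
```

In this model the chunk destined for q2 is never delivered, even though q2 has room, because the shared fetcher is stuck behind the full q1.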

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
Fetcher threads are on a per-broker basis, which ensures at least one fetcher thread per broker. A fetcher thread sends a fetch request to its broker asking for all partitions. So if A, B, and C are on the same broker, the fetcher thread is still able to fetch data from A, B, C even though A returns no data.
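
The per-broker grouping described above can be sketched as follows. This is an illustrative model only: the function names, the broker id, and the in-memory "log" are assumptions, and no real fetch protocol is involved.

```python
from collections import defaultdict


def build_fetch_requests(partition_leaders):
    """Group partitions by leader broker: one fetch request per broker."""
    requests = defaultdict(list)
    for partition, broker in partition_leaders.items():
        requests[broker].append(partition)
    return {broker: sorted(parts) for broker, parts in requests.items()}


def fetch(requested_partitions, log):
    """Simulated broker response: every requested partition gets an entry, possibly empty."""
    return {p: list(log.get(p, [])) for p in requested_partitions}


# A, B and C all live on the same (hypothetical) broker 1; A currently has no data.
partition_leaders = {"A": 1, "B": 1, "C": 1}
log = {"A": [], "B": ["b0", "b1"], "C": ["c0"]}

requests = build_fetch_requests(partition_leaders)
response = fetch(requests[1], log)
print(response)
```

Because the single request covers all three partitions, an empty result for A does not hold back the data returned for B and C.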

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread James Cheng
On Mar 11, 2015, at 9:12 AM, Guozhang Wang wangg...@gmail.com wrote: Hi James, What I meant before is that a single fetcher may be responsible for putting fetched data into multiple queues, depending on how the streams are set up, where each queue may be consumed by a different

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
consumer.timeout.ms only affects how the stream reads data from the internal chunk queue that is used to buffer received data. The actual data fetching is done by a separate fetcher thread, kafka.consumer.ConsumerFetcherThread. The fetcher thread keeps reading data from the broker and puts it into the
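
The split between the fetcher thread and the stream's timed read can be sketched with a bounded queue. This is a simplified model, not the consumer's implementation: the queue size, chunk labels, and the read_next helper are hypothetical, and returning None stands in for however the real consumer signals a timeout.

```python
import queue
import threading

# Bounded internal buffer between the fetcher thread and the stream reader.
chunk_queue = queue.Queue(maxsize=10)


def fetcher(chunks):
    """Fetcher side: keeps pushing fetched chunks; put() blocks when the queue is full."""
    for chunk in chunks:
        chunk_queue.put(chunk)


def read_next(timeout_ms):
    """Stream side: wait up to timeout_ms for the next chunk, else give up."""
    try:
        return chunk_queue.get(timeout=timeout_ms / 1000.0)
    except queue.Empty:
        return None  # stand-in for the consumer's timeout signal


worker = threading.Thread(target=fetcher, args=(["c0", "c1", "c2"],))
worker.start()
worker.join()

# Three reads succeed; the fourth finds the queue empty and times out.
received = [read_next(timeout_ms=50) for _ in range(4)]
print(received)
```

The timeout only governs how long the reader waits on the queue; it has no effect on how or when the fetcher thread pulls data.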

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-11 Thread Guozhang Wang
The new consumer will be released in 0.9, which is targeted for end of this quarter. On Tue, Feb 10, 2015 at 7:11 PM, tao xiao xiaotao...@gmail.com wrote: Do you know when the new consumer API will be publicly available? On Wed, Feb 11, 2015 at 10:43 AM, Guozhang Wang wangg...@gmail.com

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Thank you, Guozhang, for your detailed explanation. In your example, createMessageStreamsByFilter(*C = 3), since threads are shared among topics, there may be a situation where all 3 threads get stuck on topic AC, e.g. the topic is empty, which will hold up the connecting threads (setting

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Do you know when the new consumer API will be publicly available? On Wed, Feb 11, 2015 at 10:43 AM, Guozhang Wang wangg...@gmail.com wrote: Yes, it can get stuck. For example, AC and BC are processed by two different processes and the AC processor gets stuck, hence AC messages will fill up in

createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Hi team, I am comparing the differences between ConsumerConnector.createMessageStreams and ConsumerConnector.createMessageStreamsByFilter. My understanding is that createMessageStreams creates x number of threads (x is the number of threads passed in to the method) dedicated to the specified

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread Guozhang Wang
I was not clear before: for createMessageStreamsByFilter, each matched topic will have num-threads threads, but they are shared, i.e. num-threads threads are created in total, and each thread is responsible for fetching all matched topics. A more concrete example: say you have topic AC: 3 partitions, topic

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread Guozhang Wang
createMessageStreams is used for consuming from specific topic(s), where you pass a map of [topic-name, num-threads] as its input parameter; createMessageStreamsByFilter is used for consuming from wildcard topics, where you pass a (regex, num-threads) pair as its input parameters, and for each
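
To make the difference in stream counts concrete, here is a small model of the two calls. It is not the Kafka API: the Python function names, the stream labels, and the topic XY are invented for illustration, and the model only tracks how many streams exist and which topics each one covers.

```python
import re


def create_message_streams(topic_count_map):
    """Model of the map form: each topic gets its own num-threads streams."""
    return {
        topic: [f"{topic}-stream-{i}" for i in range(count)]
        for topic, count in topic_count_map.items()
    }


def create_message_streams_by_filter(pattern, num_streams, all_topics):
    """Model of the filter form: num_streams streams in total, shared by all matched topics."""
    matched = sorted(t for t in all_topics if re.fullmatch(pattern, t))
    return [{"stream": i, "topics": matched} for i in range(num_streams)]


topics = ["AC", "BC", "XY"]

dedicated = create_message_streams({"AC": 3, "BC": 3})
shared = create_message_streams_by_filter(r".*C", 3, topics)

print(sum(len(s) for s in dedicated.values()), "dedicated streams")
print(len(shared), "shared streams, each covering", shared[0]["topics"])
```

In this model the map form yields six streams (three per topic), while the filter form yields three streams that each carry data for both AC and BC.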