Hi, this happens most times I restart the consumer group, but not every time. 
There are no errors in the logs and nothing indicates that a rebalance is 
occurring. Here are the ZK logs from one of the processes that isn’t 
receiving partitions.

2015-05-04 13:55:32,365 [main] INFO  org.apache.zookeeper.ZooKeeper:438 - Initiating client connection, connectString=lxpkfkdal01.nanigans.com sessionTimeout=400 watcher=org.I0Itec.zkclient.ZkClient@6971e8ba
2015-05-04 13:55:32,366 [main-SendThread(] INFO  org.apache.zookeeper.ClientCnxn:966 - Opening socket connection to server .8.44.121:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-04 13:55:32,367 [main-SendThread(] INFO  org.apache.zookeeper.ClientCnxn:849 - Socket connection established to 44.121:2181, initiating session
2015-05-04 13:55:32,371 [main-SendThread(] INFO  org.apache.zookeeper.ClientCnxn:1207 - Session establishment complete on server 121/, sessionid = 0x14691649cf75e2c, negotiated timeout = 4000
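
One detail in the lines above: the client requests sessionTimeout=400 while the server reports a negotiated timeout of 4000 (both in milliseconds). A ZooKeeper server bounds the session timeout it grants, by default to a multiple of its tickTime, so a negotiated value above the requested one suggests the request was below the server's minimum. Whether that is related to the unclaimed partitions is not established here. A minimal Python sketch for pulling both values out of client log text, assuming the phrasing shown in this log; the function name `session_timeouts` is hypothetical:

```python
import re


def session_timeouts(log_text):
    """Return (requested_ms, negotiated_ms) found in ZooKeeper client log text.

    Either value is None when the corresponding phrase is absent.
    Assumes wrapped log lines have already been joined.
    """
    requested = re.search(r"sessionTimeout=(\d+)", log_text)
    negotiated = re.search(r"negotiated timeout = (\d+)", log_text)
    return (
        int(requested.group(1)) if requested else None,
        int(negotiated.group(1)) if negotiated else None,
    )


# Abbreviated sample built from the two relevant log lines.
sample = (
    "Initiating client connection, sessionTimeout=400 watcher=...\n"
    "Session establishment complete, negotiated timeout = 4000\n"
)
print(session_timeouts(sample))
```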

Here is the output of the ConsumerOffsetChecker, note that 6 of the partitions 
are unclaimed:

Group                 Topic            Pid  Offset     logSize    Lag     Owner
rtb_targeting_server  compile_request  0    328831805  328832108  303
rtb_targeting_server  compile_request  1    328680629  328680761  132
rtb_targeting_server  compile_request  2    328322706  328626882  304176  none
rtb_targeting_server  compile_request  3    328397868  328703662  305794  none
rtb_targeting_server  compile_request  4    328393846  328393923  77
rtb_targeting_server  compile_request  5    329085299  329085385  86
rtb_targeting_server  compile_request  6    328667153  328667153  0
rtb_targeting_server  compile_request  7    328537143  328537272  129
rtb_targeting_server  compile_request  8    328613787  328913671  299884  none
rtb_targeting_server  compile_request  9    328212202  328516662  304460  none
rtb_targeting_server  compile_request  10   329370706  329370951  245
rtb_targeting_server  compile_request  11   328207478  328207705  227
rtb_targeting_server  compile_request  12   328564790  328564790  0
rtb_targeting_server  compile_request  13   328473600  328473672  72
rtb_targeting_server  compile_request  14   329088239  329088315  76
rtb_targeting_server  compile_request  15   328311986  328311986  0
rtb_targeting_server  compile_request  16   328615462  328615497  35
rtb_targeting_server  compile_request  17   327853920  327853949  29
rtb_targeting_server  compile_request  18   328196285  328497010  300725  none
rtb_targeting_server  compile_request  19   330429455  330733318  303863  none
rtb_targeting_server  compile_request  20   328678091  328678137  46
rtb_targeting_server  compile_request  21   328089585  328089585  0
rtb_targeting_server  compile_request  22   328235530  328235571  41
rtb_targeting_server  compile_request  23   328699002  328699041  39
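
For anyone who wants to pick the unclaimed partitions out of this kind of output automatically, here is a rough Python sketch. It assumes each row sits on a single line with whitespace-separated columns in the order Group, Topic, Pid, Offset, logSize, Lag, Owner, and that an unclaimed partition shows the literal owner `none` as above. The function name `unclaimed_partitions` is made up for illustration and is not part of any Kafka tool.

```python
def unclaimed_partitions(checker_output):
    """Return the partition ids whose Owner column reads 'none'.

    Assumes one row per line: Group Topic Pid Offset logSize Lag [Owner].
    """
    unclaimed = []
    for line in checker_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything without a numeric Pid.
        if len(fields) < 6 or not fields[2].isdigit():
            continue
        owner = fields[6] if len(fields) > 6 else ""
        if owner == "none":
            unclaimed.append(int(fields[2]))
    return unclaimed


# Three rows copied from the output above, plus the header.
sample = """\
Group Topic Pid Offset logSize Lag Owner
rtb_targeting_server compile_request 2 328322706 328626882 304176 none
rtb_targeting_server compile_request 4 328393846 328393923 77
rtb_targeting_server compile_request 8 328613787 328913671 299884 none
"""
print(unclaimed_partitions(sample))
```

Rows with an empty Owner column are treated as claimed here, matching how the output above is being read.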

Thanks for your help,

On 4/29/15, 11:30 PM, "Aditya Auradkar" <aaurad...@linkedin.com.INVALID> wrote:

>Hey Dave,
>It's hard to say why this is happening without more information. Even if there 
>are no errors in the log, is there anything to indicate that the rebalance 
>process on those hosts even started? Does this happen occasionally or every 
>time you start the consumer group? Can you paste the output of 
>ConsumerOffsetChecker and describe topic?
>From: Dave Hamilton [dhamil...@nanigans.com]
>Sent: Wednesday, April 29, 2015 6:46 PM
>To: users@kafka.apache.org; users@kafka.apache.org
>Subject: Re: Unclaimed partitions
>Hi, would anyone be able to help me with this issue? Thanks.
>- Dave
>On Tue, Apr 28, 2015 at 1:32 PM -0700, "Dave Hamilton" 
><dhamil...@nanigans.com<mailto:dhamil...@nanigans.com>> wrote:
>1. We’re using version
>2. No failures in the consumer logs
>3. We’re using the ConsumerOffsetChecker to see what partitions are assigned 
>to the consumer group and what their offsets are. 8 of the 12 processes each 
>have been assigned two partitions and they’re keeping up with the topic. The 
>other 4 do not get assigned partitions and no consumers in the group are 
>consuming those 8 partitions.
>Thanks for your help,
>On 4/28/15, 1:40 PM, "Aditya Auradkar" <aaurad...@linkedin.com.INVALID> wrote:
>>Couple of questions:
>>- What version of the consumer API are you using?
>>- Are you seeing any rebalance failures in the consumer logs?
>>- How do you determine that some partitions are unassigned? Just confirming 
>>that you have partitions that are not being consumed from as opposed to 
>>consumer threads that aren't assigned any partitions.
>>From: Dave Hamilton [dhamil...@nanigans.com]
>>Sent: Tuesday, April 28, 2015 10:19 AM
>>To: users@kafka.apache.org
>>Subject: Re: Unclaimed partitions
>>I’m sorry, I forgot to specify that these processes are in the same consumer 
>>group.
>>On 4/28/15, 1:15 PM, "Aditya Auradkar" <aaurad...@linkedin.com.INVALID> wrote:
>>>Hi Dave,
>>>The simple consumer doesn't do any state management across consumer 
>>>instances. So I'm not sure how you are assigning partitions in your 
>>>application code. Did you mean to say that you are using the high level 
>>>consumer API?
>>>From: Dave Hamilton [dhamil...@nanigans.com]
>>>Sent: Tuesday, April 28, 2015 7:58 AM
>>>To: users@kafka.apache.org
>>>Subject: Unclaimed partitions
>>>Hi, I am trying to consume a 24-partition topic across 12 processes. Each 
>>>process is using the simple consumer API, and each is being assigned two 
>>>consumer threads. I have noticed when starting these processes that 
>>>sometimes some of my processes are not being assigned any partitions, and no 
>>>rebalance seems to ever be triggered, leaving some of the partitions 
>>>unclaimed.
>>>When I first tried deploying this yesterday, I noticed 8 of the 24 
>>>partitions, for 4 of the consumer processes, went unclaimed. Redeploying 
>>>shortly later corrected this issue. I tried deploying again today, and now I 
>>>see a different set of 4 processes not getting assigned partitions. The 
>>>processes otherwise appear to be running normally; they are currently 
>>>running in production and we are working to get the consumers quietly 
>>>running before enabling them to do any work. I’m not sure if we might be 
>>>looking at some sort of timing issue.
>>>Does anyone know what might be causing the issues we’re observing?
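
For reference, the even split that the original question expects (24 partitions over 12 processes with two consumer threads each) can be sketched as a range-style assignment. This is only an illustration under assumptions: it assumes the group uses a range-style strategy over sorted consumer thread ids, the thread id format is invented, and the function `range_assign` is not Kafka's actual code.

```python
def range_assign(partitions, consumer_threads):
    """Split sorted partitions into contiguous ranges over sorted thread ids.

    Earlier threads receive one extra partition when the split is uneven.
    """
    parts = sorted(partitions)
    threads = sorted(consumer_threads)
    per_thread, extra = divmod(len(parts), len(threads))
    assignment = {}
    start = 0
    for i, thread in enumerate(threads):
        count = per_thread + (1 if i < extra else 0)
        assignment[thread] = parts[start:start + count]
        start += count
    return assignment


# Hypothetical thread ids: 12 processes, 2 consumer threads each.
threads = [f"process{p:02d}-thread{t}" for p in range(12) for t in range(2)]
assignment = range_assign(range(24), threads)
for thread, parts in assignment.items():
    print(thread, parts)
```

With 24 threads and 24 partitions every thread ends up with exactly one partition, so each process would own two. Any partition without an owner therefore means at least one thread never registered or never took part in a rebalance.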
