[ 
https://issues.apache.org/jira/browse/KAFKA-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392017#comment-14392017
 ] 

Evan Huus commented on KAFKA-2082:
----------------------------------

I have pushed another branch called {{test-jira-kafka-2082}} which is identical 
to the previous branch except it also sets the virtualbox memory to 3GB 
(otherwise it defaults to 512MB). You will likely have to {{vagrant destroy}} 
and recreate for this setting to stick.

With this branch, I have been able to reproduce from scratch using VMware on an 
OSX host and VirtualBox on an Ubuntu host - the process is identical. The 
relevant test fragment when it fails is as follows:

{noformat}
=== RUN TestReliableProducing
[sarama] 2015/04/01 22:26:06 Initializing new client
[sarama] 2015/04/01 22:26:06 Fetching metadata for all topics from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9091
[sarama] 2015/04/01 22:26:06 Registered new broker #9091 at 192.168.100.67:9091
[sarama] 2015/04/01 22:26:06 Registered new broker #9095 at 192.168.100.67:9095
[sarama] 2015/04/01 22:26:06 Registered new broker #9094 at 192.168.100.67:9094
[sarama] 2015/04/01 22:26:06 Registered new broker #9093 at 192.168.100.67:9093
[sarama] 2015/04/01 22:26:06 Registered new broker #9092 at 192.168.100.67:9092
[sarama] 2015/04/01 22:26:06 Successfully initialized new client
[sarama] 2015/04/01 22:26:06 producer/flusher/9091 starting up
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9091
[sarama] 2015/04/01 22:26:06 producer/flusher/9092 starting up
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9092
[sarama] 2015/04/01 22:26:06 producer/flusher/9095 starting up
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9095
[sarama] 2015/04/01 22:26:06 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9093
[sarama] 2015/04/01 22:26:06 producer/flusher/9094 starting up
[sarama] 2015/04/01 22:26:06 Connected to broker 192.168.100.67:9094
[sarama] 2015/04/01 22:26:09 zk1 disabled
[sarama] 2015/04/01 22:26:12 zk3 disabled
[sarama] 2015/04/01 22:26:19 producer/flusher/9091 state change to [retrying] 
on many_partition/25 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:19 producer/flusher/9091 state change to [retrying] 
on many_partition/5 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:19 producer/leader state change to [retrying-1] on 
many_partition/25
[sarama] 2015/04/01 22:26:19 producer/leader abandoning broker 9091 on 
many_partition/25
[sarama] 2015/04/01 22:26:19 producer/flusher/9091 state change to [retrying] 
on many_partition/30 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:19 producer/leader state change to [retrying-1] on 
many_partition/5
[sarama] 2015/04/01 22:26:19 producer/leader abandoning broker 9091 on 
many_partition/5
[sarama] 2015/04/01 22:26:19 producer/leader state change to [retrying-1] on 
many_partition/30
[sarama] 2015/04/01 22:26:19 producer/leader abandoning broker 9091 on 
many_partition/30
[sarama] 2015/04/01 22:26:19 producer/flusher/9091 state change to [normal] on 
many_partition/25
[sarama] 2015/04/01 22:26:19 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:19 producer/leader state change to [retrying-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:19 producer/leader abandoning broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:21 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:21 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:21 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:21 producer/leader selected broker 9091 on 
many_partition/25
[sarama] 2015/04/01 22:26:21 producer/leader state change to [normal-1] on 
many_partition/25
[sarama] 2015/04/01 22:26:21 producer/leader state change to [flushing-0] on 
many_partition/25
[sarama] 2015/04/01 22:26:21 producer/leader state change to [normal-0] on 
many_partition/25
[sarama] 2015/04/01 22:26:21 producer/leader selected broker 9091 on 
many_partition/5
[sarama] 2015/04/01 22:26:21 producer/leader selected broker 9091 on 
many_partition/30
[sarama] 2015/04/01 22:26:21 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:21 producer/leader selected broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [retrying] 
on many_partition/20 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:24 producer/leader state change to [retrying-1] on 
many_partition/20
[sarama] 2015/04/01 22:26:24 producer/leader abandoning broker 9091 on 
many_partition/20
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [retrying] 
on many_partition/0 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:24 producer/leader state change to [retrying-1] on 
many_partition/0
[sarama] 2015/04/01 22:26:24 producer/leader abandoning broker 9091 on 
many_partition/0
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [retrying] 
on many_partition/10 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:24 producer/leader state change to [retrying-1] on 
many_partition/10
[sarama] 2015/04/01 22:26:24 producer/leader abandoning broker 9091 on 
many_partition/10
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [retrying] 
on many_partition/15 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:24 producer/leader state change to [retrying-1] on 
many_partition/15
[sarama] 2015/04/01 22:26:24 producer/leader abandoning broker 9091 on 
many_partition/15
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/5
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/30
[sarama] 2015/04/01 22:26:24 producer/leader state change to [normal-1] on 
many_partition/5
[sarama] 2015/04/01 22:26:24 producer/leader state change to [flushing-0] on 
many_partition/5
[sarama] 2015/04/01 22:26:24 producer/leader state change to [normal-1] on 
many_partition/30
[sarama] 2015/04/01 22:26:24 producer/leader state change to [flushing-0] on 
many_partition/30
[sarama] 2015/04/01 22:26:24 producer/leader state change to [normal-0] on 
many_partition/5
[sarama] 2015/04/01 22:26:24 producer/leader state change to [normal-0] on 
many_partition/30
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/20
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/0
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/10
[sarama] 2015/04/01 22:26:24 producer/flusher/9091 state change to [normal] on 
many_partition/15
[sarama] 2015/04/01 22:26:25 zk3 enabled
[sarama] 2015/04/01 22:26:26 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:26 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:26 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:26 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:26 producer/leader selected broker 9091 on 
many_partition/20
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-1] on 
many_partition/20
[sarama] 2015/04/01 22:26:26 producer/leader state change to [flushing-0] on 
many_partition/20
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-0] on 
many_partition/20
[sarama] 2015/04/01 22:26:26 producer/leader selected broker 9091 on 
many_partition/0
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-1] on 
many_partition/0
[sarama] 2015/04/01 22:26:26 producer/leader state change to [flushing-0] on 
many_partition/0
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-0] on 
many_partition/0
[sarama] 2015/04/01 22:26:26 producer/leader selected broker 9091 on 
many_partition/10
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-1] on 
many_partition/10
[sarama] 2015/04/01 22:26:26 producer/leader state change to [flushing-0] on 
many_partition/10
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-0] on 
many_partition/10
[sarama] 2015/04/01 22:26:26 producer/leader selected broker 9091 on 
many_partition/15
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-1] on 
many_partition/15
[sarama] 2015/04/01 22:26:26 producer/leader state change to [flushing-0] on 
many_partition/15
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-0] on 
many_partition/15
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/27 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/2 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/17 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/22 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/12 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/leader state change to [flushing-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/leader state change to [normal-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/27
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/2
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/17
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/22
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/12
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:26 producer/leader state change to [retrying-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/leader abandoning broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 state change to [normal] on 
many_partition/7
[sarama] 2015/04/01 22:26:26 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/27 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/2 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/17 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/22 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/27
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/2
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/17
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/22
[sarama] 2015/04/01 22:26:28 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/12 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/12
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:28 producer/leader selected broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/leader state change to [flushing-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/leader state change to [normal-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:28 producer/leader state change to [retrying-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/leader abandoning broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 state change to [normal] on 
many_partition/7
[sarama] 2015/04/01 22:26:28 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [retrying] 
on many_partition/25 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:29 producer/leader state change to [retrying-2] on 
many_partition/25
[sarama] 2015/04/01 22:26:29 producer/leader abandoning broker 9091 on 
many_partition/25
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [retrying] 
on many_partition/5 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:29 producer/leader state change to [retrying-2] on 
many_partition/5
[sarama] 2015/04/01 22:26:29 producer/leader abandoning broker 9091 on 
many_partition/5
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [retrying] 
on many_partition/30 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:29 producer/leader state change to [retrying-2] on 
many_partition/30
[sarama] 2015/04/01 22:26:29 producer/leader abandoning broker 9091 on 
many_partition/30
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [normal] on 
many_partition/25
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [normal] on 
many_partition/5
[sarama] 2015/04/01 22:26:29 producer/flusher/9091 state change to [normal] on 
many_partition/30
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/27 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-3] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/27
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/2 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-3] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:30 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:30 producer/leader selected broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [flushing-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader state change to [normal-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/17 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-3] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/22 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-3] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/12 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-3] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/2
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/17
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/22
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/12
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:30 producer/leader state change to [retrying-4] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/leader abandoning broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 state change to [normal] on 
many_partition/7
[sarama] 2015/04/01 22:26:30 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:31 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:31 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:31 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:31 producer/leader selected broker 9091 on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-2] on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-1] on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-1] on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-0] on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-0] on 
many_partition/25
[sarama] 2015/04/01 22:26:31 producer/leader selected broker 9091 on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-2] on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-1] on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-1] on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-0] on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-0] on 
many_partition/5
[sarama] 2015/04/01 22:26:31 producer/leader selected broker 9091 on 
many_partition/30
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-2] on 
many_partition/30
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-1] on 
many_partition/30
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-1] on 
many_partition/30
[sarama] 2015/04/01 22:26:31 producer/leader state change to [flushing-0] on 
many_partition/30
[sarama] 2015/04/01 22:26:31 producer/leader state change to [normal-0] on 
many_partition/30
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/27 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-4] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/27
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/2 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-4] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/2
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/17 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-4] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/17
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader selected broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-4] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [flushing-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader state change to [normal-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/22 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-4] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/12 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-4] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:32 producer/leader state change to [retrying-5] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/leader abandoning broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/22
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/12
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 state change to [normal] on 
many_partition/7
[sarama] 2015/04/01 22:26:32 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [retrying] 
on many_partition/20 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-2] on 
many_partition/20
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9091 on 
many_partition/20
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [retrying] 
on many_partition/0 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-2] on 
many_partition/0
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9091 on 
many_partition/0
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [retrying] 
on many_partition/10 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-2] on 
many_partition/10
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9091 on 
many_partition/10
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [retrying] 
on many_partition/15 because kafka server: Request exceeded the user-specified 
time limit in the request.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-2] on 
many_partition/15
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9091 on 
many_partition/15
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [normal] on 
many_partition/20
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [normal] on 
many_partition/0
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [normal] on 
many_partition/10
[sarama] 2015/04/01 22:26:34 producer/flusher/9091 state change to [normal] on 
many_partition/15
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/27 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-5] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9093 on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [normal] on 
many_partition/27
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/2 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-5] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9093 on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [normal] on 
many_partition/2
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/17 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-5] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9093 on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [normal] on 
many_partition/17
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 shut down
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 starting up
[sarama] 2015/04/01 22:26:34 Fetching metadata for [many_partition] from broker 
192.168.100.67:9091
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/22 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-5] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9093 on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/leader selected broker 9093 on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-5] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-4] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-4] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-3] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-2] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-1] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [flushing-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/leader state change to [normal-0] on 
many_partition/7
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/12 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
[sarama] 2015/04/01 22:26:34 producer/leader state change to [retrying-5] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/leader abandoning broker 9093 on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [normal] on 
many_partition/22
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [normal] on 
many_partition/12
[sarama] 2015/04/01 22:26:34 producer/flusher/9093 state change to [retrying] 
on many_partition/7 because kafka server: Tried to send a message to a replica 
that is not the leader for some partition. Your metadata is out of date.
--- FAIL: TestReliableProducing (28.06 seconds)
        functional_producer_test.go:217: kafka server: Tried to send a message 
to a replica that is not the leader for some partition. Your metadata is out of 
date.
{noformat}

The particularly important lines are the ones with {{zk* enabled}} and {{zk* 
disabled}} since those indicate the cutting and restoring of zookeeper 
connections. In this scenario, it is reliably kafka-9093 (talking to "zk3" from 
the test's perspective) which ends up in the bad state spewing logs.

> Kafka Replication ends up in a bad state
> ----------------------------------------
>
>                 Key: KAFKA-2082
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2082
>             Project: Kafka
>          Issue Type: Bug
>          Components: replication
>    Affects Versions: 0.8.2.1
>            Reporter: Evan Huus
>            Assignee: Neha Narkhede
>            Priority: Critical
>
> While running integration tests for Sarama (the go client) we came across a 
> pattern of connection losses that reliably puts kafka into a bad state: 
> several of the brokers start spinning, chewing ~30% CPU and spamming the logs 
> with hundreds of thousands of lines like:
> {noformat}
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111094 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,1] failed due to Leader not local for partition 
> [many_partition,1] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111094 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,6] failed due to Leader not local for partition 
> [many_partition,6] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,070] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,21] failed due to Leader not local for partition 
> [many_partition,21] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,26] failed due to Leader not local for partition 
> [many_partition,26] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,1] failed due to Leader not local for partition 
> [many_partition,1] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,071] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111095 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,6] failed due to Leader not local for partition 
> [many_partition,6] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,072] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111096 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,21] failed due to Leader not local for partition 
> [many_partition,21] on broker 9093 (kafka.server.ReplicaManager)
> [2015-04-01 13:08:40,072] WARN [Replica Manager on Broker 9093]: Fetch 
> request with correlation id 111096 from client ReplicaFetcherThread-0-9093 on 
> partition [many_partition,26] failed due to Leader not local for partition 
> [many_partition,26] on broker 9093 (kafka.server.ReplicaManager)
> {noformat}
> This can be easily and reliably reproduced using the {{toxiproxy-final}} 
> branch of https://github.com/Shopify/sarama which includes a vagrant script 
> for provisioning the appropriate cluster: 
> - {{git clone https://github.com/Shopify/sarama.git}}
> - {{git checkout toxiproxy-final}}
> - {{vagrant up}}
> - {{TEST_SEED=1427917826425719059 DEBUG=true go test -v}}
> After the test finishes (it fails because the cluster ends up in a bad 
> state), you can log into the cluster machine with {{vagrant ssh}} and inspect 
> the bad nodes. The vagrant script provisions five zookeepers and five brokers 
> in {{/opt/kafka-9091/}} through {{/opt/kafka-9095/}}.
> Additional context: the test produces continually to the cluster while 
> randomly cutting and restoring zookeeper connections (all connections to 
> zookeeper are run through a simple proxy on the same vm to make this easy). 
> The majority of the time this works very well and does a good job exercising 
> our producer's retry and failover code. However, under certain patterns of 
> connection loss (the {{TEST_SEED}} in the instructions is important), kafka 
> gets confused. The test never cuts more than two connections at a time, so 
> zookeeper should always have quorum, and the topic (with three replicas) 
> should always be writable.
> Completely restarting the cluster via {{vagrant reload}} seems to put it back 
> into a sane state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to