Neil, what you are seeing is probably KAFKA-1407
<https://issues.apache.org/jira/browse/KAFKA-1407>.

On Tue, Oct 21, 2014 at 12:03 PM, Gwen Shapira <gshap...@cloudera.com>
wrote:

> Consumers always read from the leader replica, which is always in sync
> by definition. So you are good there.
> The concern would be if the leader crashes during this period.
>
>
>
> On Tue, Oct 21, 2014 at 2:56 PM, Neil Harkins <nhark...@gmail.com> wrote:
> > Hi. I've got a 5 node cluster running Kafka 0.8.1,
> > with 4697 partitions (2 replicas each) across 564 topics.
> > I'm sending it about 1% of our total messaging load now,
> > and several times a day there is a period where 1~1500
> > partitions have one replica not in sync. Is this normal?
> > If a consumer is reading from a replica that gets deemed
> > "not in sync", does it get redirected to the good replica?
> > Is there a #partitions over which maintenance tasks
> > become infeasible?
> >
> > Relevant config bits:
> > auto.leader.rebalance.enable=true
> > leader.imbalance.per.broker.percentage=20
> > leader.imbalance.check.interval.seconds=30
> > replica.lag.time.max.ms=10000
> > replica.lag.max.messages=4000
> > num.replica.fetchers=4
> > replica.fetch.max.bytes=10485760
> >
> > Not necessarily correlated to those periods,
> > I see a lot of these errors in the logs:
> >
> > [2014-10-20 21:23:26,999] 21963614 [ReplicaFetcherThread-3-1] ERROR
> > kafka.server.ReplicaFetcherThread  - [ReplicaFetcherThread-3-1], Error
> > in fetch Name: FetchRequest; Version: 0; CorrelationId: 77423;
> > ClientId: ReplicaFetcherThread-3-1; ReplicaId: 2; MaxWait: 500 ms;
> > MinBytes: 1 bytes; RequestInfo: ...
> >
> > And a few of these:
> >
> > [2014-10-20 21:23:39,555] 3467527 [kafka-scheduler-2] ERROR
> > kafka.utils.ZkUtils$  - Conditional update of path
> > /brokers/topics/foo.bar/partitions/3/state with data
> > {"controller_epoch":11,"leader":3,"version":1,"leader_epoch":109,"isr":[3]}
> > and expected version 197 failed due to
> > org.apache.zookeeper.KeeperException$BadVersionException:
> > KeeperErrorCode = BadVersion for
> > /brokers/topics/foo.bar/partitions/3/state
> >
> > And this one I assume is a client closing the connection non-gracefully,
> > thus should probably be a warning, not an error?:
> >
> > [2014-10-20 21:54:15,599] 23812214 [kafka-processor-9092-3] ERROR
> > kafka.network.Processor  - Closing socket for /10.31.0.224 because of
> > error
> >
> > -neil
>



-- 
-- Guozhang
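
For readers following the thread: the two replica.lag.* settings quoted
above are what decide when a follower is dropped from the ISR in Kafka 0.8.x.
The following is only a rough sketch of that rule, not Kafka's actual code.
The names FollowerState, is_in_sync and is_under_replicated are hypothetical
and exist only for illustration; the thresholds and the partition state JSON
are copied from Neil's message, and the replica count of 2 is the
"2 replicas each" he mentions.

```python
import json
from dataclasses import dataclass

# Thresholds copied from the config quoted in the thread.
REPLICA_LAG_TIME_MAX_MS = 10000
REPLICA_LAG_MAX_MESSAGES = 4000


@dataclass
class FollowerState:
    """Hypothetical snapshot of a follower as seen by the leader."""
    ms_since_last_fetch: int
    messages_behind_leader: int


def is_in_sync(follower: FollowerState,
               lag_time_max_ms: int = REPLICA_LAG_TIME_MAX_MS,
               lag_max_messages: int = REPLICA_LAG_MAX_MESSAGES) -> bool:
    """Simplified model of the 0.8.x rule: a follower leaves the ISR if it
    is stuck (no fetch within the time window) or slow (too many messages
    behind the leader)."""
    stuck = follower.ms_since_last_fetch > lag_time_max_ms
    slow = follower.messages_behind_leader > lag_max_messages
    return not (stuck or slow)


def is_under_replicated(state_json: str, replication_factor: int) -> bool:
    """True if the ISR in a partition state document (the JSON stored under
    /brokers/topics/<topic>/partitions/<n>/state) is smaller than the
    number of assigned replicas."""
    state = json.loads(state_json)
    return len(state["isr"]) < replication_factor


# Partition state taken verbatim from the ZkUtils error in the thread.
STATE = ('{"controller_epoch":11,"leader":3,"version":1,'
         '"leader_epoch":109,"isr":[3]}')

if __name__ == "__main__":
    print(is_in_sync(FollowerState(500, 100)))
    print(is_in_sync(FollowerState(100, 5000)))
    print(is_under_replicated(STATE, replication_factor=2))
```

With replica.lag.max.messages=4000, a short burst of more than 4000
messages to one partition is enough for the "slow" branch to trip in this
model, even if the follower is fetching normally.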
