These configs mainly depend on your publish throughput, since the
replication throughput is upper-bounded by the publish throughput. If the
publish throughput is not high, then setting lower threshold values in
these two configs will cause churn in shrinking / expanding ISRs.
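As a rough illustration of how the message-lag threshold interacts with
publish throughput, here is a back-of-the-envelope sketch. The function and
parameter names are mine, and the 0.5 s fetch cycle is just an assumption
based on the 0.8-era replica.fetch.wait.max.ms default of 500 ms, not
something measured from this cluster:

```python
# Illustrative sketch (not an official formula): a healthy follower that is
# one fetch cycle behind lags by roughly the number of messages produced
# during that cycle, so replica.lag.max.messages should sit well above that
# normal backlog, or the follower will keep falling in and out of the ISR.

def isr_churn_risk(publish_msgs_per_sec, fetch_interval_sec,
                   lag_max_messages, safety_factor=2):
    """Return True if the lag threshold is close enough to the normal
    per-fetch backlog that ISR shrink/expand churn is likely."""
    backlog_per_fetch = publish_msgs_per_sec * fetch_interval_sec
    return lag_max_messages < backlog_per_fetch * safety_factor

# Low publish throughput: the threshold is far above the backlog, no churn.
print(isr_churn_risk(100, 0.5, 4000))     # False
# High publish throughput with the same threshold: churn is likely.
print(isr_churn_risk(50000, 0.5, 4000))   # True
```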

Guozhang

On Mon, Apr 4, 2016 at 11:55 PM, Yifan Ying <nafan...@gmail.com> wrote:

> Thanks for replying, Guozhang. We did increase both settings:
>
> replica.lag.max.messages=20000
>
> replica.lag.time.max.ms=20000
>
>
> But not sure if these are good enough. And yes, that's a good suggestion to
> monitor ZK performance.
>
>
> Thanks.
>
> On Mon, Apr 4, 2016 at 8:58 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> Hmm, it seems like your broker configs "replica.lag.max.messages" and "
>> replica.lag.time.max.ms" are misconfigured relative to your replication
>> traffic, and the deletion of the topic actually pushed it below the
>> threshold. What are the config values for these two? And could you try
>> increasing these configs to see if that helps?
>>
>> In 0.8.2.1, kafka-consumer-offset-checker.sh accesses ZK to query the
>> consumer offsets one by one, so if your ZK read latency is high it
>> could take a long time. You may want to monitor your ZK cluster's
>> performance to check its read / write latencies.
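One quick way to spot-check those ZooKeeper latencies is the "mntr"
four-letter command (available in ZooKeeper 3.4+). The host and port below
are placeholders for your own ensemble; the sample output is inlined so the
parsing step is self-contained:

```shell
# Against a live ensemble you would run (host/port are placeholders):
#   echo mntr | nc zk-host 2181
#
# which reports tab-separated fields including zk_avg_latency,
# zk_max_latency, and zk_min_latency. Sample data inlined here:
sample=$(printf 'zk_avg_latency\t2\nzk_max_latency\t85\nzk_min_latency\t0\n')

# Pull out just the latency fields as name=value pairs.
printf '%s\n' "$sample" | awk -F'\t' '/latency/ {print $1 "=" $2}'
```

Sustained high zk_avg_latency would line up with the slow offset-checker
behavior described above.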
>>
>>
>> Guozhang
>>
>>
>> On Mon, Apr 4, 2016 at 10:59 AM, Yifan Ying <nafan...@gmail.com> wrote:
>>
>>> Hi Guozhang,
>>>
>>> It's 0.8.2.1. So it should be fixed? We also tried to start from scratch
>>> by wiping out the data directories on both Kafka and ZooKeeper. And it's
>>> odd that the constant shrinking and expanding happened after a fresh
>>> restart, with high request latency as well. The brokers are using the
>>> same config as before the topic deletion.
>>>
>>> Another observation is that running kafka-consumer-offset-checker.sh
>>> is extremely slow. Any suggestion would be appreciated! Thanks.
>>>
>>> On Sun, Apr 3, 2016 at 2:29 PM, Guozhang Wang <wangg...@gmail.com>
>>> wrote:
>>>
>>>> Yifan,
>>>>
>>>> Are you on 0.8.0 or 0.8.1/2? There are some issues with zkVersion
>>>> checking
>>>> in 0.8.0 that are fixed in later minor releases of 0.8.
>>>>
>>>> Guozhang
>>>>
>>>> On Fri, Apr 1, 2016 at 7:46 PM, Yifan Ying <nafan...@gmail.com> wrote:
>>>>
>>>> > Hi All,
>>>> >
>>>> > We deleted a deprecated topic on a Kafka cluster (0.8) and started
>>>> > observing constant 'Expanding ISR for partition' and 'Shrinking ISR
>>>> > for partition' messages for other topics. As a result, we saw a huge
>>>> > number of under-replicated partitions and very high request latency
>>>> > from Kafka. And the cluster doesn't seem to be able to recover by
>>>> > itself.
>>>> >
>>>> > Anyone knows what caused this issue and how to resolve it?
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> -- Guozhang
>>>>
>>>
>>>
>>>
>>> --
>>> Yifan
>>>
>>>
>>>
>>
>>
>> --
>> -- Guozhang
>>
>
>
>
> --
> Yifan
>
>
>


-- 
-- Guozhang
