I just wanted to follow up on some fallout caused by the issue
mentioned below. After the offset manager moved, consumer offsets
started going to a different partition of the offsets topic.

However, the previous partition of the offsets topic can still have
the old offsets.  Those won't necessarily get compacted out (if the
dirtiness threshold is not met).

Now suppose multiple leader changes occur and both the new (correct)
offsets partition and the old offsets partition happen to move to the
same broker. We load the offsets into the offsets cache on a leader
change. So if the order of leader changes is new partition followed by
old partition, then the stale (most likely out-of-range) offsets from
the old partition end up overwriting the correct offsets in the cache.
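
To make the ordering problem concrete, here is a toy model of the
cache load. It is only a sketch: the function name, the key layout and
the offset values are made up for illustration and are not the
broker's actual code. It assumes the cache is a plain map where the
last write for a key wins.

```python
# Toy model of the offsets cache; all names and values are hypothetical.
def load_partition(cache, partition_log):
    """Replay one offsets-topic partition into the cache (last write wins)."""
    for key, offset in partition_log:
        cache[key] = offset
    return cache


group_key = ("my-group", "my-topic", 0)   # (group, topic, partition)
old_partition_log = [(group_key, 100)]    # stale entry, never compacted out
new_partition_log = [(group_key, 5000)]   # current, correct offset

# Leader changes arrive as: new partition first, then old partition.
# The stale entry is replayed last and clobbers the correct one.
bad_cache = load_partition(load_partition({}, new_partition_log),
                           old_partition_log)

# Opposite order: the correct offset is replayed last and survives.
good_cache = load_partition(load_partition({}, old_partition_log),
                            new_partition_log)
```

Nothing in this model distinguishes a stale entry from a current one,
which is why the result depends purely on the load order.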

The above issue affected some of our consumers (especially ones that
have auto.offset.reset set to smallest).

In order to fix the issue completely I ended up writing this tool to
purge bad offsets:

https://gist.github.com/jjkoshy/a3f64d67fe494da3c3a6

In order to produce the tombstones, the broker needs to allow producer
requests to the __consumer_offsets topic. Fortunately we had not yet
picked up KAFKA-1580, so the above worked for us.

Thanks,

Joel


On Mon, Sep 22, 2014 at 03:36:46PM -0700, Joel Koshy wrote:
> I just wanted to send this out as an FYI but it does not affect any
> released versions.
> 
> This only affects those who release off trunk and use Kafka-based
> consumer offset management.  KAFKA-1469 fixes an issue in our
> Utils.abs code. Since we use this method in determining the offset
> manager for a consumer group, the fix can yield a different offset
> manager if you happen to run off trunk and upgrade across the fix.
> This won't affect all groups, but those that happen to hash to a value
> that is affected by the bug fixed in KAFKA-1469.
> 
> (Sort of related - we may want to consider not using hashcode on the
> group and switch to a more standard hashing algorithm but I highly
> doubt that hashcode values on a string will change in the future.)
> 
> Thanks,
> 
> -- 
> Joel
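
For illustration, here is a rough sketch of why only some groups move.
It assumes the offsets partition is chosen as abs(group.hashCode)
modulo the number of offsets-topic partitions, that the pre-fix abs
simply masked off the sign bit, and that the post-fix abs is a true
absolute value with Integer.MIN_VALUE mapped to 0. The function names
and the partition count of 50 are assumptions for the example, not
taken from the Kafka source.

```python
def java_string_hashcode(s):
    """Emulate Java's String.hashCode() as a signed 32-bit int.

    Uses ord() per character, which matches Java for BMP characters.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - (1 << 32) if h >= (1 << 31) else h


def abs_before_fix(n):
    # Assumed pre-fix behaviour: clear the sign bit. This is not a true
    # absolute value for negative inputs.
    return n & 0x7FFFFFFF


def abs_after_fix(n):
    # Assumed post-fix behaviour: true absolute value, MIN_VALUE -> 0.
    return 0 if n == -(1 << 31) else abs(n)


def offsets_partition(group, abs_fn, num_partitions=50):
    # num_partitions=50 is an arbitrary value chosen for this example.
    return abs_fn(java_string_hashcode(group)) % num_partitions


def moves_across_fix(group, num_partitions=50):
    """True if the group maps to a different partition after the fix."""
    return (offsets_partition(group, abs_before_fix, num_partitions)
            != offsets_partition(group, abs_after_fix, num_partitions))
```

Under these assumptions a group whose hashcode is non-negative maps to
the same partition either way, so moves_across_fix() returns False for
it. Only groups with a negative hashcode can end up on a different
partition.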
