Jason Gustafson created KAFKA-3766:
--------------------------------------

             Summary: Unhandled "not enough replicas" errors in SyncGroup
                 Key: KAFKA-3766
                 URL: https://issues.apache.org/jira/browse/KAFKA-3766
             Project: Kafka
          Issue Type: Bug
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson


Caught by [~ijuma]. We seem to be missing at least a couple of error codes when
handling the log append response while writing group metadata to the offsets
topic in the SyncGroup handler. In particular, we are missing checks for
NOT_ENOUGH_REPLICAS and NOT_ENOUGH_REPLICAS_AFTER_APPEND. Currently these
errors are returned directly in the SyncGroup response and cause an exception
to be raised to the user.

There are two options to fix this problem:
1. We can continue to return these error codes in the SyncGroup response and
add a handler on the client side to retry.
2. We can convert the errors on the server to something like 
COORDINATOR_NOT_AVAILABLE, which will cause the client to retry with the 
existing logic.

The second option seems a little nicer since it avoids exposing the internal
implementation of the SyncGroup request (i.e. that we write group metadata to a
partition). It also has the nice side effect of fixing old clients
automatically when the server is upgraded.
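
As a rough illustration of the second option, the server-side conversion could
look something like the sketch below. This is not the actual coordinator code:
the class SyncGroupErrorMapping, the ErrorCode enum and the toSyncGroupError
method are hypothetical names made up for this example, and only the three
error code names are taken from the description above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not the actual Kafka coordinator code.
final class SyncGroupErrorMapping {

    // Illustrative subset of error codes; the names follow the issue text.
    enum ErrorCode {
        NONE,
        NOT_ENOUGH_REPLICAS,
        NOT_ENOUGH_REPLICAS_AFTER_APPEND,
        COORDINATOR_NOT_AVAILABLE
    }

    // Append errors that reveal the group metadata is written to a partition.
    private static final Set<ErrorCode> REPLICATION_ERRORS = EnumSet.of(
            ErrorCode.NOT_ENOUGH_REPLICAS,
            ErrorCode.NOT_ENOUGH_REPLICAS_AFTER_APPEND);

    private SyncGroupErrorMapping() {
    }

    // Converts the error from the log append into the error placed in the
    // SyncGroup response. Replication errors become a retriable coordinator
    // error; every other code is passed through unchanged.
    static ErrorCode toSyncGroupError(ErrorCode appendError) {
        return REPLICATION_ERRORS.contains(appendError)
                ? ErrorCode.COORDINATOR_NOT_AVAILABLE
                : appendError;
    }
}
```

With a mapping along these lines, the client only ever sees an error it already
retries on, so no client-side change should be needed.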



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
