Re: devops/admin/client question: What do you do when you rollback?

Vishal Kher Sun, 07 Aug 2011 12:01:51 -0700

Hi Camille,

Can you share what kind of problems you were facing on the servers that
forced you to roll back the cluster?


Thanks.
-Vishal

On Thu, Aug 4, 2011 at 1:29 PM, Fournier, Camille F. <
[email protected]> wrote:

> We had an issue here the other day where the ZK servers were running
> poorly, and in an effort to get them healthy again we ended up rolling back
> the cluster state. While this was, in retrospect, not the right solution to
> the problem we were facing, it brought up another problem. Namely, that many
> of our clients couldn't reconnect with their sessions because their zxid was
> too high (expected), but that the error they got when trying to do that
> reconnection was just a vanilla disconnected error. The result was that most
> of our clients had to be bounced.
>
> Aside from trying hard to avoid ever rolling back the cluster state, does
> anyone have a way they deal with this situation if it occurs? Should we
> consider enhancing the error message to the client so we could track the
> fact that we were ahead of the quorum zxid and react sensibly? Alternately,
> since we were sending a sessionId along with the zxid, perhaps it would be
> nice to check to see if the sessionId exists before checking the zxid, which
> would send an expired state signal which my client code could handle
> cleanly.
>
> Any ideas or suggestions would be welcome.
>
> C
>
>
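
[Editorial illustration, not part of the original messages.] The check
ordering proposed in the quoted message can be sketched as a small
standalone model. This is not ZooKeeper's actual server code: the class
name, the Outcome enum, and the decide() method are all hypothetical, and
the server's state is reduced to a set of live session ids plus its last
zxid. It only shows why looking up the sessionId first would let a client
whose session vanished in the rollback see an expiry instead of a plain
disconnect.

```java
import java.util.Set;

public class ReconnectCheck {
    // Hypothetical outcomes of a reconnect attempt (not ZooKeeper's real types).
    enum Outcome { SESSION_EXPIRED, REFUSED_ZXID_AHEAD, RECONNECTED }

    // Assumed model: the server knows its live session ids and its last zxid.
    // The session lookup runs BEFORE the zxid comparison, as the message proposes.
    static Outcome decide(Set<Long> liveSessions, long serverLastZxid,
                          long sessionId, long clientLastZxid) {
        if (!liveSessions.contains(sessionId)) {
            // Session unknown to the rolled-back cluster: report expiry,
            // which client code can handle by creating a new session.
            return Outcome.SESSION_EXPIRED;
        }
        if (clientLastZxid > serverLastZxid) {
            // Session exists, but the client has seen a newer state than the server.
            return Outcome.REFUSED_ZXID_AHEAD;
        }
        return Outcome.RECONNECTED;
    }

    public static void main(String[] args) {
        Set<Long> live = Set.of(1L);
        // Client with an unknown session and a zxid ahead of the server.
        System.out.println(decide(live, 100L, 2L, 500L));
        // Client with a known session but a zxid ahead of the server.
        System.out.println(decide(live, 100L, 1L, 500L));
        // Client with a known session and an older zxid.
        System.out.println(decide(live, 100L, 1L, 50L));
    }
}
```

With the two checks in the opposite order, the first call would be refused
for its zxid before the missing session was ever noticed, which matches the
behavior described in the quoted message.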
