[ https://issues.apache.org/jira/browse/HBASE-22057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Elser updated HBASE-22057:
-------------------------------
    Attachment: HBASE-22057.003.patch

> Impose upper-bound on size of ZK ops sent in a single multi()
> -------------------------------------------------------------
>
>                 Key: HBASE-22057
>                 URL: https://issues.apache.org/jira/browse/HBASE-22057
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>            Priority: Major
>             Fix For: 3.0.0, 1.6.0, 2.2.0
>
>         Attachments: HBASE-22057.001.patch, HBASE-22057.002.patch,
> HBASE-22057.003.patch
>
>
> In {{ZKUtil#multiOrSequential}}, we accept a list of {{ZKUtilOp}}s to pass
> down to the {{ZooKeeper#multi(Iterable<Op>)}} method.
> One problem with this approach is that we may generate a large list of ZNodes
> to mutate in one batch which exceeds the allowable client packet length,
> specified by {{jute.maxbuffer}}.
> This problem can manifest when we have a large number of WALs to replicate,
> queued in ZooKeeper, from a disabled peer. When that peer is dropped, the RS
> would submit deletes of those queued WALs. The RS will see ConnectionLoss for
> the resulting {{multi()}} calls it tries to make, because we are sending too
> large a client message (we're trying to delete too many WALs at once). The
> result (at least in branch-1-ish versions) is that the RS aborts after
> exceeding the ZK retries (as this operation will never succeed).
> A simple fix would be to impose a maximum number of Ops to run in a single
> batch inside ZKUtil, and split apart the caller-submitted batch into smaller
> chunks. Before we make such a change, I do need to make sure that we don't
> have any expectations on atomicity of the operations. I'm not sure what ZK
> provides here. For the above example, splitting up batches of deletes is
> not an issue, but there could be issues with batches of creates where we only
> apply some.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
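
The chunking idea in the description (cap the number of Ops per batch and
split the caller-submitted list) could be sketched roughly as below. This is
only an illustration and not the contents of the attached patches: the class
name OpBatcher, the method partition, and the maxOps parameter are
hypothetical, and the helper is generic so it runs without ZooKeeper on the
classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ZKUtil: splits a caller-submitted batch
// into consecutive chunks that each hold at most maxOps elements.
final class OpBatcher {
  private OpBatcher() {}

  static <T> List<List<T>> partition(List<T> ops, int maxOps) {
    if (maxOps <= 0) {
      throw new IllegalArgumentException("maxOps must be positive: " + maxOps);
    }
    List<List<T>> chunks = new ArrayList<>();
    for (int start = 0; start < ops.size(); start += maxOps) {
      int end = Math.min(start + maxOps, ops.size());
      // Copy the sublist so each chunk is independent of the input list.
      chunks.add(new ArrayList<>(ops.subList(start, end)));
    }
    return chunks;
  }
}
```

Each chunk would then go to its own multi() call. ZooKeeper applies the ops
of a single multi() atomically (all succeed or none do), so after splitting,
that guarantee holds only within a chunk and no longer across the whole
caller-submitted batch. This is the atomicity concern raised above.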