szepet commented on a change in pull request #899: ZOOKEEPER-3354: Improve efficiency of DeleteAllCommand
URL: https://github.com/apache/zookeeper/pull/899#discussion_r274944586
##########
File path: zookeeper-server/src/main/java/org/apache/zookeeper/ZKUtil.java
##########
@@ -45,20 +48,67 @@
      *
      * @throws IllegalArgumentException if an invalid path is specified
      */
-    public static void deleteRecursive(ZooKeeper zk, final String pathRoot)
+    public static boolean deleteRecursive(ZooKeeper zk, final String pathRoot)
         throws InterruptedException, KeeperException
     {
         PathUtils.validatePath(pathRoot);

         List<String> tree = listSubTreeBFS(zk, pathRoot);
         LOG.debug("Deleting " + tree);
         LOG.debug("Deleting " + tree.size() + " subnodes ");
-        for (int i = tree.size() - 1; i >= 0 ; --i) {
-            //Delete the leaves first and eventually get rid of the root
-            zk.delete(tree.get(i), -1); //Delete all versions of the node with -1.
+
+        int asyncReqRateLimit = 10;
+        // Try deleting the tree nodes in batches of size 1000.
+        // If some batch failed, try again with batches of size 1 to delete as
+        // many nodes as possible.

Review comment:
I am not sure this is the right decision. Except when a 1000-node `multiOp` exceeds `JuteMaxBufferSize`, I believe the retry would fail to delete the tree as well. As @eolivelli pointed out, it would be great to have an early exit in the `deleteInBatch` function; otherwise the retry would hit NoNodeExceptions too.

In my opinion, it is better to return as soon as possible and warn the user that the `deleteAll` was not successful. If the problem was only a NoNodeException or some concurrent change we were not aware of, it would probably still be faster to rerun the batched version than to retry with smaller batches. (The exception is a `Len error`, where `JuteMaxBufferSize` is exceeded.) What do you think?

In any case, I believe this error-handling logic is worth a test case as well.
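A minimal sketch of the early-exit behaviour suggested in the review comment, for illustration only. `FailFastBatchDelete`, `BatchSubmitter`, and this `deleteInBatches` signature are hypothetical stand-ins, not the PR's code or the ZooKeeper API; the submitter abstracts whatever actually sends one batch (for example a multi-op) and is assumed to report failure by returning `false`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: none of these names come from ZKUtil or the PR. */
final class FailFastBatchDelete {

    /** Stand-in for whatever submits one batch; returns false if the batch failed. */
    interface BatchSubmitter {
        boolean submit(List<String> batch);
    }

    /**
     * Walks a BFS-ordered path list backwards (deepest nodes first), groups the
     * paths into batches, and stops at the first batch that fails.
     *
     * @return true if every batch succeeded, false as soon as one batch fails
     */
    static boolean deleteInBatches(List<String> bfsOrderedPaths, int batchSize,
                                   BatchSubmitter submitter) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> batch = new ArrayList<>(batchSize);
        for (int i = bfsOrderedPaths.size() - 1; i >= 0; --i) {
            batch.add(bfsOrderedPaths.get(i));
            boolean batchFull = batch.size() == batchSize;
            boolean lastPath = i == 0;
            if (batchFull || lastPath) {
                // Hand the submitter its own copy so clearing the buffer is safe.
                if (!submitter.submit(new ArrayList<>(batch))) {
                    return false; // early exit: remaining batches are never sent
                }
                batch.clear();
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> tree = Arrays.asList("/a", "/a/b", "/a/c", "/a/b/d", "/a/b/e");
        List<List<String>> submitted = new ArrayList<>();
        // Simulate a failure on the second batch.
        boolean ok = deleteInBatches(tree, 2, batch -> {
            submitted.add(batch);
            return submitted.size() < 2;
        });
        System.out.println("success=" + ok + ", batches submitted=" + submitted.size());
    }
}
```

With this shape the caller gets a single boolean to act on: it can warn the user and let them rerun the whole batched delete, rather than falling back to per-node retries that would likely hit the same missing nodes.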