tisonkun commented on code in PR #1966:
URL: https://github.com/apache/zookeeper/pull/1966#discussion_r1115748846


##########
zookeeper-docs/src/main/resources/markdown/zookeeperSnapshotAndRestore.md:
##########
@@ -0,0 +1,67 @@
+<!--
+Copyright 2002-2004 The Apache Software Foundation
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+//-->
+
+# ZooKeeper Snapshot and Restore Guide
+
+Zookeeper is designed to withstand machine failures. A Zookeeper cluster can 
automatically recover 
+from temporary failures such as machine reboot. It can also tolerate up to 
(N-1)/2 permanent 
+failures for a cluster of N members due to hardware failures or disk 
corruption, etc. When a member 
+permanently fails, it loses access to the cluster. If the cluster permanently 
loses more than 
+(N-1)/2 members, it disastrously fails and loses quorum. Once the quorum is 
lost, the cluster 
+cannot reach consensus and therefore cannot continue to accept updates.
+
+To recover from such disastrous failures, Zookeeper provides snapshot and 
restore functionalities to
+restore a cluster from a snapshot.
+
+1. Snapshot and restore operate on the connected server via Admin Server APIs
+1. Snapshot and restore are rate limited to protect the server from being 
overloaded
+1. Snapshot and restore require authentication and authorization on the root 
path with ALL permission.
+The supported auth schemas are digest, x509 and IP.
+   
+* [Snapshot](#zookeeper_snapshot)
+* [Restore](#zookeeper_restore)  
+
+<a name="zookeeper_snapshot"></a>
+
+## Snapshot
+Recovering a cluster needs a snapshot from a ZooKeeper cluster. Users can 
periodically take
+snapshots from a live server which has the highest zxid and stream out data to 
a local 
+or external storage/file system (e.g., S3).
+
+ ```bash
+ # The snapshot command takes snapshot from the server it connects to and rate 
limited to once every 5 mins by default
+ curl -H 'Authorization: digest root:root_passwd' 
http://hostname:adminPort/commands/snapshot?streaming=true --output 
snapshotFileName
+ ```
+
+<a name="zookeeper_restore"></a>
+## Restore
+
+Restoring a cluster needs a single snapshot as input stream. Restore can be 
used for recovering a 
+cluster for quorum lost or building a brand-new cluster with seed data. 
+
+All members should restore using the same snapshot. The following are the 
recommended steps:
+ 
+- Blocking client traffic before restore starts
+- For each server
+  - Moving the files in dataDir and dataLogDir to different location to 
prevent the restored database 

Review Comment:
   Is it better to take a snapshot of the latest state? I think blocking client 
traffic doesn't prohibit admin request?



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/ZooKeeperServer.java:
##########
@@ -2069,7 +2069,7 @@ public void dumpMonitorValues(BiConsumer<String, Object> 
response) {
 
     /**
      * Grant or deny authorization to an operation on a node as a function of:
-     * @param cnxn :    the server connection
+     * @param cnxn :    the server connection or null for admin server commands

Review Comment:
   Actually, I found `cnxn` is not required for this method since `ServerObjs 
serverObjs` is not effectively used in the method call.
   
   But it can be out of the scope so this comment is just a hint.



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/CommandBase.java:
##########
@@ -28,22 +28,31 @@ public abstract class CommandBase implements Command {
     private final Set<String> names;
     private final String doc;
     private final boolean serverRequired;
+    private final AuthRequest authRequest;
 
     /**
      * @param names The possible names of this command, with the primary name 
first.
      */
     protected CommandBase(List<String> names) {
-        this(names, true, null);
+        this(names, true);
     }
     protected CommandBase(List<String> names, boolean serverRequired) {
         this(names, serverRequired, null);
     }
 
-    protected CommandBase(List<String> names, boolean serverRequired, String 
doc) {
+    protected CommandBase(List<String> names, boolean serverRequired, 
AuthRequest authRequest) {
+        this(names, serverRequired, null, authRequest);
+    }
+
+    protected CommandBase(List<String> names, boolean serverRequired, String 
doc, AuthRequest authRequest) {

Review Comment:
   nit: It seems `doc` is always null and can be safely removed.



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/StreamOutputter.java:
##########
@@ -46,7 +46,7 @@ public void output(final CommandResponse response, final 
OutputStream os) {
         try (final InputStream is = response.getInputStream()){
             IOUtils.copyBytes(is, os, 1024, true);
         } catch (final IOException e) {
-            LOG.error("Exception occurred when streaming out data to {}", 
clientIP, e);
+            LOG.warn("Exception streaming out data to {}", clientIP, e);

Review Comment:
   Why do you make this change?



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/auth/AuthenticationProvider.java:
##########
@@ -49,6 +52,19 @@ public interface AuthenticationProvider {
      */
     KeeperException.Code handleAuthentication(ServerCnxn cnxn, byte[] 
authData);
 
+    /**
+     * This method is called when admin server command passes authentication 
data for this
+     * scheme.
+     *
+     * @param request
+     *                the request that contains the authentication information.
+     * @param authData
+     *                the authentication data received.
+     * @return Ids
+     *                the list of Id. Empty list means not authenticated
+     */
+    List<Id> handleAuthentication(HttpServletRequest request, byte[] authData);

Review Comment:
   `default` to empty list and thus reduce some impls?



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/JettyAdminServer.java:
##########
@@ -89,17 +90,20 @@ public JettyAdminServer() throws AdminServerException, 
IOException, GeneralSecur
             System.getProperty("zookeeper.admin.commandURL", 
DEFAULT_COMMAND_URL),
             Integer.getInteger("zookeeper.admin.httpVersion", 
DEFAULT_HTTP_VERSION),
             Boolean.getBoolean("zookeeper.admin.portUnification"),
-            Boolean.getBoolean("zookeeper.admin.forceHttps"));
+            Boolean.getBoolean("zookeeper.admin.forceHttps"),
+            Boolean.getBoolean("zookeeper.admin.needClientAuth"));
     }
 
+    @SuppressWarnings("deprecation")

Review Comment:
   Here is a patch to reduce complex:
   
   ```diff
   diff --git 
a/zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/JettyAdminServer.java
 
b/zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/JettyAdminServer.java
   index 2c3bb723d..a237e4c3b 100644
   --- 
a/zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/JettyAdminServer.java
   +++ 
b/zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/JettyAdminServer.java
   @@ -94,7 +94,6 @@ public JettyAdminServer() throws AdminServerException, 
IOException, GeneralSecur
   Boolean.getBoolean("zookeeper.admin.needClientAuth"));
   }
   
   -    @SuppressWarnings("deprecation")
   public JettyAdminServer(
   String address,
   int port,
   @@ -148,15 +147,12 @@ public JettyAdminServer(
   throw e;
   }
   
   -                SslContextFactory sslContextFactory = new 
SslContextFactory.Server();
   +                SslContextFactory.Server sslContextFactory = new 
SslContextFactory.Server();
   sslContextFactory.setKeyStore(keyStore);
   sslContextFactory.setKeyStorePassword(privateKeyPassword);
   sslContextFactory.setTrustStore(trustStore);
   sslContextFactory.setTrustStorePassword(certAuthPassword);
   -                if (needClientAuth) {
   -                    // setNeedClientAuth() of SslContextFactory.Server is 
not a deprecated API
   -                    sslContextFactory.setNeedClientAuth(true);
   -                }
   +                sslContextFactory.setNeedClientAuth(needClientAuth);
   
   if (forceHttps) {
   connector = new ServerConnector(server,
   ```



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/Commands.java:
##########
@@ -532,23 +615,21 @@ public CommandResponse runGet(ZooKeeperServer zkServer, 
Map<String, String> kwar
      */
     public static class RestoreCommand extends PostCommand {
         static final String RESPONSE_DATA_LAST_ZXID = "last_zxid";
-
         static final String ADMIN_RESTORE_ENABLED = 
"zookeeper.admin.restore.enabled";
 
-
         private RateLimiter rateLimiter;
 
         public RestoreCommand() {
-            super(Arrays.asList("restore", "rest"));
-            rateLimiter = new RateLimiter(1, rateLimiterInterval, 
TimeUnit.MICROSECONDS);
+            super(Arrays.asList("restore", "rest"), true, new 
AuthRequest(ZooDefs.Perms.ALL, ROOT_PATH));
+            rateLimiter = new RateLimiter(1, rateLimiterInterval, 
TimeUnit.MILLISECONDS);

Review Comment:
   And here we fix a bug?



##########
zookeeper-server/src/main/java/org/apache/zookeeper/server/admin/Commands.java:
##########
@@ -76,6 +85,8 @@ public class Commands {
     static final Logger LOG = LoggerFactory.getLogger(Commands.class);
     static final String ADMIN_RATE_LIMITER_INTERVAL = 
"zookeeper.admin.rateLimiterIntervalInMS";
     private static final long rateLimiterInterval = 
Integer.parseInt(System.getProperty(ADMIN_RATE_LIMITER_INTERVAL, "300000"));
+    static final String AUTH_INFO_SEPARATOR = " ";
+    static final String ROOT_PATH = "/";

Review Comment:
   VisibleForTesting?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to