[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-23 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r493463257



##
File path: 
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/OMConfigKeys.java
##
@@ -246,4 +246,9 @@ private OMConfigKeys() {
   "ozone.om.enable.filesystem.paths";
   public static final boolean OZONE_OM_ENABLE_FILESYSTEM_PATHS_DEFAULT =
   false;
+
+  public static final long OZONE_OM_MAX_TIME_TO_WAIT_FLUSH_TXNS =
+  TimeUnit.MINUTES.toSeconds(5);
+  public static final long OZONE_OM_FLUSH_TXNS_RETRY_INTERVAL_SECONDS =
+  TimeUnit.SECONDS.toMillis(5);

Review comment:
   For OZONE_OM_FLUSH_TXNS_RETRY_INTERVAL_SECONDS , here should be 5. On 
RatisUpgradeUtils#waitForAllTxnsApplied, we already do the convert:
   >long intervalTime = TimeUnit.SECONDS.toMillis(timeBetweenRetryInSeconds);





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-22 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r492798607



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -994,6 +1005,45 @@ public static boolean omInit(OzoneConfiguration conf) 
throws IOException,
 }
   }
 
+  public boolean applyAllPendingTransactions()
+  throws InterruptedException, IOException {
+
+if (!isRatisEnabled) {
+  LOG.info("Ratis not enabled. Nothing to do.");
+  return true;
+}
+
+String purgeConfig = omRatisServer.getServer()
+.getProperties().get(PURGE_UPTO_SNAPSHOT_INDEX_KEY);
+if (!Boolean.parseBoolean(purgeConfig)) {
+  throw new IllegalStateException("Cannot prepare OM for Upgrade since  " +
+  "raft.server.log.purge.upto.snapshot.index is not true");
+}
+
+waitForAllTxnsApplied(omRatisServer.getOmStateMachine(),
+omRatisServer.getRaftGroup(),
+(RaftServerProxy) omRatisServer.getServer(),
+TimeUnit.MINUTES.toSeconds(5));

Review comment:
   >Also in 5 minutes I would expect in all cases that the unapplied 
transactions can be applied, as the number of this kind of transactions should 
not be too much as far as I know, or if it is then the system is not healthy 
anyway.
   
   I'm okay to let 5 mins as current threshold wait time, only one minor 
comment: can we define a variable for this time value rather hard-coded in 
method here? 
   

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -1179,15 +1229,22 @@ public void start() throws IOException {
   // Allow OM to start as Http Server failure is not fatal.
   LOG.error("OM HttpServer failed to start.", ex);
 }
-omRpcServer.start();
-isOmRpcServerRunning = true;
 
+if (!prepareForUpgrade) {
+  omRpcServer.start();
+  isOmRpcServerRunning = true;
+}

Review comment:
   Okay, enable RPC server via next startup makes sense to me.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-22 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r492801087



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -1179,15 +1229,22 @@ public void start() throws IOException {
   // Allow OM to start as Http Server failure is not fatal.
   LOG.error("OM HttpServer failed to start.", ex);
 }
-omRpcServer.start();
-isOmRpcServerRunning = true;
 
+if (!prepareForUpgrade) {
+  omRpcServer.start();
+  isOmRpcServerRunning = true;
+}

Review comment:
   Okay, enable RPC server via next startup makes sense to me.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-22 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r492798607



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -994,6 +1005,45 @@ public static boolean omInit(OzoneConfiguration conf) 
throws IOException,
 }
   }
 
+  public boolean applyAllPendingTransactions()
+  throws InterruptedException, IOException {
+
+if (!isRatisEnabled) {
+  LOG.info("Ratis not enabled. Nothing to do.");
+  return true;
+}
+
+String purgeConfig = omRatisServer.getServer()
+.getProperties().get(PURGE_UPTO_SNAPSHOT_INDEX_KEY);
+if (!Boolean.parseBoolean(purgeConfig)) {
+  throw new IllegalStateException("Cannot prepare OM for Upgrade since  " +
+  "raft.server.log.purge.upto.snapshot.index is not true");
+}
+
+waitForAllTxnsApplied(omRatisServer.getOmStateMachine(),
+omRatisServer.getRaftGroup(),
+(RaftServerProxy) omRatisServer.getServer(),
+TimeUnit.MINUTES.toSeconds(5));

Review comment:
   >Also in 5 minutes I would expect in all cases that the unapplied 
transactions can be applied, as the number of this kind of transactions should 
not be too much as far as I know, or if it is then the system is not healthy 
anyway.
   
   I'm okay to let 5 mins as current threshold wait time, only one minor 
comment: can we define a variable for this time value rather hard-coded in 
method here? 
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-17 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r490064398



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManagerStarter.java
##
@@ -98,6 +98,28 @@ public void initOm()
 }
   }
 
+
+  /**
+   * This function implements a sub-command to allow the OM to be
+   * "prepared for upgrade".
+   */
+  @CommandLine.Command(name = "--prepareForUpgrade",
+  aliases = {"--prepareForDowngrade", "--flushTransactions"},

Review comment:
   Will prepareForUpgrade command send multiple OMs simultaneously here? Or 
we should trigger prepareForUpgrade command for each OM service. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-17 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r490048885



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -1179,15 +1229,22 @@ public void start() throws IOException {
   // Allow OM to start as Http Server failure is not fatal.
   LOG.error("OM HttpServer failed to start.", ex);
 }
-omRpcServer.start();
-isOmRpcServerRunning = true;
 
+if (!prepareForUpgrade) {
+  omRpcServer.start();
+  isOmRpcServerRunning = true;
+}

Review comment:
   During prepareForUpgrade, the RPC server is not stared. So we should 
also have the corresponding command to trigger to restart RPC server. Otherwise 
after all txns applied, new requests still cannot get in.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] linyiqun commented on a change in pull request #1430: HDDS-4227. Implement a 'Prepare For Upgrade' step in OM that applies all committed Ratis transactions.

2020-09-17 Thread GitBox


linyiqun commented on a change in pull request #1430:
URL: https://github.com/apache/hadoop-ozone/pull/1430#discussion_r490045948



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -994,6 +1005,45 @@ public static boolean omInit(OzoneConfiguration conf) 
throws IOException,
 }
   }
 
+  public boolean applyAllPendingTransactions()
+  throws InterruptedException, IOException {
+
+if (!isRatisEnabled) {
+  LOG.info("Ratis not enabled. Nothing to do.");
+  return true;
+}
+
+String purgeConfig = omRatisServer.getServer()
+.getProperties().get(PURGE_UPTO_SNAPSHOT_INDEX_KEY);
+if (!Boolean.parseBoolean(purgeConfig)) {
+  throw new IllegalStateException("Cannot prepare OM for Upgrade since  " +
+  "raft.server.log.purge.upto.snapshot.index is not true");
+}
+
+waitForAllTxnsApplied(omRatisServer.getOmStateMachine(),
+omRatisServer.getRaftGroup(),
+(RaftServerProxy) omRatisServer.getServer(),
+TimeUnit.MINUTES.toSeconds(5));

Review comment:
   Can we make maxTimeToWaitSeconds configurable?

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -1179,15 +1229,22 @@ public void start() throws IOException {
   // Allow OM to start as Http Server failure is not fatal.
   LOG.error("OM HttpServer failed to start.", ex);
 }
-omRpcServer.start();
-isOmRpcServerRunning = true;
 
+if (!prepareForUpgrade) {
+  omRpcServer.start();
+  isOmRpcServerRunning = true;
+}

Review comment:
   During prepareForUpgrade, the RPC server is not stared. So we should 
also have the corresponding command to trigger to restart RPC server.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org