saintstack commented on a change in pull request #2501:
URL: https://github.com/apache/hbase/pull/2501#discussion_r501434229



##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
##########
@@ -1905,8 +1906,8 @@ public boolean isOnline() {
   private void setupWALAndReplication() throws IOException {
     boolean isMasterNoTableOrSystemTableOnly = this instanceof HMaster &&
         !LoadBalancer.isMasterCanHostUserRegions(conf);
-    WALFactory factory =
-        new WALFactory(conf, serverName.toString(), 
!isMasterNoTableOrSystemTableOnly);
+    WALFactory factory = new WALFactory(conf, serverName.toString(), (Server) 
this,

Review comment:
       Why the cast? Isn't HRegionServer a Server?

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
##########
@@ -329,6 +336,11 @@ public WalProps(Map<byte[], Long> 
encodedName2HighestSequenceId, long logSize) {
 
   protected final AtomicBoolean rollRequested = new AtomicBoolean(false);
 
+  private final ExecutorService logArchiveExecutor = 
Executors.newSingleThreadExecutor(
+    new 
ThreadFactoryBuilder().setDaemon(true).setNameFormat("Log-Archiver-%d").build());

Review comment:
       s/Log/WAL/g

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
##########
@@ -185,6 +190,8 @@
    */
   protected final Configuration conf;
 
+  protected final Server server;

Review comment:
       Is it a good idea adding 'Server' in here? I think been trying to keep 
it so this is apart from need for a 'Server'?

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
##########
@@ -715,11 +738,39 @@ private void cleanOldLogs() throws IOException {
         regionsBlockingThisWal.clear();
       }
     }
+
     if (logsToArchive != null) {
-      for (Pair<Path, Long> logAndSize : logsToArchive) {
-        this.totalLogSize.addAndGet(-logAndSize.getSecond());
-        archiveLogFile(logAndSize.getFirst());
-        this.walFile2Props.remove(logAndSize.getFirst());
+      final List<Pair<Path, Long>> localLogsToArchive = logsToArchive;
+      // make it async
+      for (Pair<Path, Long> log : localLogsToArchive) {
+        logArchiveExecutor.execute(() -> {
+          archiveRetriable(log);
+        });
+        this.walFile2Props.remove(log.getFirst());
+      }
+    }
+  }
+
+  protected void archiveRetriable(final Pair<Path, Long> log) {
+    int retry = 1;
+    while (true) {
+      try {
+        archiveLogFile(log.getFirst());
+        totalLogSize.addAndGet(-log.getSecond());
+        // successful
+        break;
+      } catch (Throwable e) {
+        if (retry > archiveRetries) {
+          LOG.error("Failed log archiving for the log {},", log.getFirst(), e);
+          if (this.server != null) {
+            this.server.abort("Failed log archiving", e);

Review comment:
       This the only reason for passing server? There is an Abortable 
Interface. Pass in an Abortable Interface instead (the server is an 
implementation of an Abortable so you could pass server).

##########
File path: 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestFailedAppendAndSync.java
##########
@@ -122,6 +127,18 @@ public DodgyFSLog(FileSystem fs, Path root, String logDir, 
Configuration conf)
       return regions;
     }
 
+    @Override
+    protected void archiveLogFile(Path p) throws IOException {
+      if (throwArchiveException) {
+        throw new IOException("throw archival exception");
+      }
+    }
+
+    @Override
+    protected void archiveRetriable(Pair<Path, Long> localLogsToArchive) {

Review comment:
       s/Retriable/Retryable/g

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
##########
@@ -482,6 +503,8 @@ protected SyncFuture initialValue() {
     this.walTooOldNs = TimeUnit.SECONDS.toNanos(conf.getInt(
             SURVIVED_TOO_LONG_SEC_KEY, SURVIVED_TOO_LONG_SEC_DEFAULT));
     this.useHsync = conf.getBoolean(HRegion.WAL_HSYNC_CONF_KEY, 
HRegion.DEFAULT_WAL_HSYNC);
+    archiveRetries = 
this.conf.getInt("hbase.regionserver.logroll.archive.retries", 0);

Review comment:
       s/logroll/walroll/g

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
##########
@@ -907,6 +959,9 @@ public void shutdown() throws IOException {
     rollWriterLock.lock();
     try {
       doShutdown();
+      if (logArchiveExecutor != null) {

Review comment:
       Why not put this inside the doShutdown?

##########
File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WALFactory.java
##########
@@ -86,6 +87,7 @@
   public static final String WAL_ENABLED = "hbase.regionserver.hlog.enabled";
 
   final String factoryId;
+  final Server server;

Review comment:
       Yeah, too much if it is only being used to abort. Just pass an Abortable.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to