ndimiduk commented on code in PR #6868:
URL: https://github.com/apache/hbase/pull/6868#discussion_r2057989697


##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -25,15 +27,33 @@
  */
 @InterfaceAudience.Private
 public abstract class AbstractClientScanner implements ResultScanner {
-  protected ScanMetrics scanMetrics;
+  protected ScanMetrics scanMetrics = null;
+  protected List<ScanMetrics> scanMetricsByRegion;

Review Comment:
   Over-all question: why do you always add this `List<ScanMetrics> 
scanMetricsByRegion` instead of using the existing instance `ScanMetrics 
scanMetrics` as an accumulator and fold in the new metrics as they arrive into 
this instance? You could initialize it once and have less null-checking going 
on. You do this in all the classes that interact with this new feature.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -44,6 +64,24 @@ protected void initScanMetrics(Scan scan) {
    */
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {
+      if (scanMetricsByRegion.isEmpty()) {
+        return null;

Review Comment:
   nit: imho, it's easier to read code that exits early, as opposed to deeply 
indented if/else blocks. Just my style preference.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -44,6 +64,24 @@ protected void initScanMetrics(Scan scan) {
    */
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {
+      if (scanMetricsByRegion.isEmpty()) {
+        return null;
+      } else if (scanMetricsByRegion.size() == 1) {
+        return scanMetricsByRegion.get(0);
+      }
+      ScanMetrics overallScanMetrics = new ScanMetrics();
+      for (ScanMetrics otherScanMetrics : scanMetricsByRegion) {
+        overallScanMetrics.combineMetrics(otherScanMetrics);
+      }
+      return overallScanMetrics;
+    } else {
+      return scanMetrics;
+    }
+  }
+
+  @Override
+  public List<ScanMetrics> getScanMetricsByRegion() {

Review Comment:
   Why do we have this method at all? Is it not sufficient to return the 
"overall" ScanMetics object?



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/FromClientSideBase.java:
##########
@@ -280,6 +282,19 @@ protected Result getSingleScanResult(Table ht, Scan scan) 
throws IOException {
     return result;
   }
 
+  protected Pair<ScanMetrics, List<ScanMetrics>> getScanMetrics(Table ht, Scan 
scan)
+    throws IOException {
+    ScanMetrics scanMetrics;
+    List<ScanMetrics> scanMetricsByRegion;
+    try (ResultScanner scanner = ht.getScanner(scan)) {
+      while ((scanner.next()) != null) {
+      }
+      scanMetrics = scanner.getScanMetrics();
+      scanMetricsByRegion = scanner.getScanMetricsByRegion();
+    }
+    return Pair.newPair(scanMetrics, scanMetricsByRegion);

Review Comment:
   Yeah, this emphisized to me how it's weird that the new feature is not 
folded into the existing feature, and is instead a "bolt-on". It's not very 
nice to use like this.



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestTableSnapshotScanner.java:
##########
@@ -244,6 +245,68 @@ public void testWithOfflineHBaseMultiRegion() throws 
Exception {
     testScanner(UTIL, "testWithMultiRegion", 20, true);
   }
 
+  @Test
+  public void testScanMetricsByRegionForTableSnapshotScanner() throws 
Exception {
+    TableName tableName = TableName.valueOf(name.getMethodName() + "_TABLE");
+    String snapshotName = name.getMethodName() + "_SNAPSHOT";
+    try {
+      createTableAndSnapshot(UTIL, tableName, snapshotName, 50);
+      Path restoreDir = UTIL.getDataTestDirOnTestFS(snapshotName);
+      Scan scan = new Scan().withStartRow(bbb).withStopRow(yyy); // limit the 
scan
+      scan.setScanMetricsEnabled(true);
+      scan.setEnableScanMetricsByRegion(true);
+      Scan scanCopy = new Scan(scan);
+
+      Configuration conf = UTIL.getConfiguration();
+
+      // Scan table snapshot and get combined scan metrics

Review Comment:
   nit: each of these little blocks could be its own test method, and then this 
test would be a lot nicer to reason about when there are failures.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/metrics/ServerSideScanMetrics.java:
##########
@@ -117,4 +121,64 @@ public Map<String, Long> getMetricsMap(boolean reset) {
     // Build the immutable map so that people can't mess around with it.
     return builder.build();
   }
+
+  public ServerName getServerName() {
+    return this.serverName;
+  }
+
+  public void setServerName(ServerName serverName) {
+    if (this.serverName == null) {
+      this.serverName = serverName;
+    }
+  }
+
+  public void setEncodedRegionName(String encodedRegionName) {
+    if (this.encodedRegionName == null) {
+      this.encodedRegionName = encodedRegionName;
+    }
+  }
+
+  public String getEncodedRegionName() {
+    return this.encodedRegionName;
+  }
+
+  @Override
+  public String toString() {
+    StringBuilder sb = new StringBuilder();
+    sb.append("ServerName:");
+    sb.append(this.serverName);
+    sb.append(",EncodedRegion:");
+    sb.append(this.encodedRegionName);
+    sb.append(",[");
+    boolean isFirstMetric = true;
+    for (Map.Entry<String, AtomicLong> e : this.counters.entrySet()) {
+      if (isFirstMetric) {
+        isFirstMetric = false;
+      } else {
+        sb.append(",");
+      }
+      sb.append(e.getKey());
+      sb.append(":");
+      sb.append(e.getValue().get());
+    }
+    sb.append("]");
+    return sb.toString();
+  }
+
+  public void combineMetrics(ScanMetrics other) {
+    for (Map.Entry<String, AtomicLong> entry : other.counters.entrySet()) {
+      String counterName = entry.getKey();
+      AtomicLong counter = entry.getValue();
+      this.addToCounter(counterName, counter.get());
+    }
+    if (
+      this.encodedRegionName != null
+        && !Objects.equals(this.encodedRegionName, 
other.getEncodedRegionName())
+    ) {
+      this.encodedRegionName = null;
+    }
+    if (this.serverName != null && !Objects.equals(this.serverName, 
other.getServerName())) {
+      this.serverName = null;

Review Comment:
   I agree that this is not immediately obvious. Please add a comment 
explaining your thought process.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultScanner.java:
##########
@@ -115,4 +115,8 @@ default Result[] next(int nbRows) throws IOException {
 
   /** Returns the scan metrics, or {@code null} if we do not enable metrics. */
   ScanMetrics getScanMetrics();
+
+  default List<ScanMetrics> getScanMetricsByRegion() {
+    throw new UnsupportedOperationException("Scan Metrics by region is not 
supported");

Review Comment:
   Can't we return null (or empty list, whatever we decide on) instead of 
throwing? Wouldn't that be more consistent with the existing interfaces?
   
   I'm also not clear on whether we need this method at all.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -44,6 +64,24 @@ protected void initScanMetrics(Scan scan) {
    */
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {
+      if (scanMetricsByRegion.isEmpty()) {
+        return null;
+      } else if (scanMetricsByRegion.size() == 1) {
+        return scanMetricsByRegion.get(0);
+      }
+      ScanMetrics overallScanMetrics = new ScanMetrics();
+      for (ScanMetrics otherScanMetrics : scanMetricsByRegion) {
+        overallScanMetrics.combineMetrics(otherScanMetrics);
+      }
+      return overallScanMetrics;
+    } else {
+      return scanMetrics;
+    }
+  }
+
+  @Override
+  public List<ScanMetrics> getScanMetricsByRegion() {

Review Comment:
   I'm surprised that this method with this name returns a list and not a 
`Map<encodedRegionName -> ScanMetrics>`.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScanResultConsumerBase.java:
##########
@@ -38,11 +39,22 @@ public interface ScanResultConsumerBase {
   void onComplete();
 
   /**
-   * If {@code scan.isScanMetricsEnabled()} returns true, then this method 
will be called prior to
-   * all other methods in this interface to give you the {@link ScanMetrics} 
instance for this scan
-   * operation. The {@link ScanMetrics} instance will be updated on-the-fly 
during the scan, you can
-   * store it somewhere to get the metrics at any time if you want.
+   * If {@code scan.isScanMetricsEnabled()} returns true and
+   * {@code scan.isScanMetricsByRegionEnabled()} returns false, then this 
method will be called
+   * prior to all other methods in this interface to give you the {@link 
ScanMetrics} instance for
+   * this scan operation. The {@link ScanMetrics} instance will be updated 
on-the-fly during the
+   * scan, you can store it somewhere to get the metrics at any time if you 
want.

Review Comment:
   So is this object underneath this method no longer maintained when the 
scan.isScanMetricsByRegionEnabled() returns true? The javadoc now implies to me 
that it is NOT maintained, which is a backward compatibility problem (and, 
imho, an ergonomics problem).



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/FromClientSideBase.java:
##########
@@ -280,6 +282,19 @@ protected Result getSingleScanResult(Table ht, Scan scan) 
throws IOException {
     return result;
   }
 
+  protected Pair<ScanMetrics, List<ScanMetrics>> getScanMetrics(Table ht, Scan 
scan)
+    throws IOException {
+    ScanMetrics scanMetrics;
+    List<ScanMetrics> scanMetricsByRegion;
+    try (ResultScanner scanner = ht.getScanner(scan)) {
+      while ((scanner.next()) != null) {

Review Comment:
   Doesn't your IDE complain about an empty while loop? Is there a Guava 
utility or something that can consume an iterator to completion?



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncTableScanMetrics.java:
##########
@@ -70,7 +72,7 @@ public class TestAsyncTableScanMetrics {
 
   @FunctionalInterface
   private interface ScanWithMetrics {
-    Pair<List<Result>, ScanMetrics> scan(Scan scan) throws Exception;
+    Pair<List<Result>, Pair<ScanMetrics, List<ScanMetrics>>> scan(Scan scan) 
throws Exception;

Review Comment:
   A Pair of Pairs? It's time to make a static POJO class or use record type (I 
guess the former because want to backport to branch-2 lineages).



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/SimpleScanResultConsumerImpl.java:
##########
@@ -71,6 +72,29 @@ public synchronized List<Result> getAll() throws Exception {
 
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {

Review Comment:
   We've seen this logical structure repeated over and over. Is it worth 
putting into a utility method on a base-class?



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestAsyncTableScanMetrics.java:
##########
@@ -110,56 +112,101 @@ public static void tearDown() throws Exception {
     UTIL.shutdownMiniCluster();
   }
 
-  private static Pair<List<Result>, ScanMetrics> doScanWithRawAsyncTable(Scan 
scan)
-    throws IOException, InterruptedException {
+  private static Pair<List<Result>, Pair<ScanMetrics, List<ScanMetrics>>>
+    doScanWithRawAsyncTable(Scan scan) throws IOException, 
InterruptedException {
     BufferingScanResultConsumer consumer = new BufferingScanResultConsumer();
     CONN.getTable(TABLE_NAME).scan(scan, consumer);
     List<Result> results = new ArrayList<>();
     for (Result result; (result = consumer.take()) != null;) {
       results.add(result);
     }
-    return Pair.newPair(results, consumer.getScanMetrics());
+    return Pair.newPair(results,
+      Pair.newPair(consumer.getScanMetrics(), 
consumer.getScanMetricsByRegion()));
   }
 
-  private static Pair<List<Result>, ScanMetrics> doScanWithAsyncTableScan(Scan 
scan)
-    throws Exception {
+  private static Pair<List<Result>, Pair<ScanMetrics, List<ScanMetrics>>>
+    doScanWithAsyncTableScan(Scan scan) throws Exception {
     SimpleScanResultConsumerImpl consumer = new SimpleScanResultConsumerImpl();
     CONN.getTable(TABLE_NAME, ForkJoinPool.commonPool()).scan(scan, consumer);
-    return Pair.newPair(consumer.getAll(), consumer.getScanMetrics());
+    return Pair.newPair(consumer.getAll(),
+      Pair.newPair(consumer.getScanMetrics(), 
consumer.getScanMetricsByRegion()));
   }
 
-  private static Pair<List<Result>, ScanMetrics> 
doScanWithAsyncTableScanner(Scan scan)
-    throws IOException {
+  private static Pair<List<Result>, Pair<ScanMetrics, List<ScanMetrics>>>
+    doScanWithAsyncTableScanner(Scan scan) throws IOException {
     try (ResultScanner scanner =
       CONN.getTable(TABLE_NAME, ForkJoinPool.commonPool()).getScanner(scan)) {
       List<Result> results = new ArrayList<>();
       for (Result result; (result = scanner.next()) != null;) {
         results.add(result);
       }
-      return Pair.newPair(results, scanner.getScanMetrics());
+      return Pair.newPair(results,
+        Pair.newPair(scanner.getScanMetrics(), 
scanner.getScanMetricsByRegion()));
     }
   }
 
   @Test
   public void testNoScanMetrics() throws Exception {
-    Pair<List<Result>, ScanMetrics> pair = method.scan(new Scan());
+    Pair<List<Result>, Pair<ScanMetrics, List<ScanMetrics>>> pair = 
method.scan(new Scan());
     assertEquals(3, pair.getFirst().size());
-    assertNull(pair.getSecond());
+    // Assert no scan metrics
+    assertNull(pair.getSecond().getFirst());
+    // Assert no per region scan metrics
+    assertNull(pair.getSecond().getSecond());
   }
 
   @Test
   public void testScanMetrics() throws Exception {
-    Pair<List<Result>, ScanMetrics> pair = method.scan(new 
Scan().setScanMetricsEnabled(true));

Review Comment:
   Can you please bust up this method into two variants, one with and one 
without the per-region accounting enabled? Ideally the latter is a pure 
super-set of the former, but as things are currently implemented, I think that 
won't be the case...



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestScanMetricsByRegion.java:
##########
@@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.List;
+import org.apache.hadoop.hbase.HBaseClassTestRule;
+import org.apache.hadoop.hbase.HBaseTestingUtil;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.hbase.TableNameTestRule;
+import org.apache.hadoop.hbase.client.metrics.ScanMetrics;
+import org.apache.hadoop.hbase.testclassification.ClientTests;
+import org.apache.hadoop.hbase.testclassification.MediumTests;
+import org.apache.hadoop.hbase.util.Pair;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runners.Parameterized.Parameter;
+import org.junit.runners.Parameterized.Parameters;
+
+@Category({ ClientTests.class, MediumTests.class })
+public class TestScanMetricsByRegion extends FromClientSideBase {
+  @ClassRule
+  public static final HBaseClassTestRule CLASS_RULE =
+    HBaseClassTestRule.forClass(TestScanMetricsByRegion.class);
+
+  @Rule
+  public TableNameTestRule name = new TableNameTestRule();
+
+  private static final HBaseTestingUtil TEST_UTIL = new HBaseTestingUtil();
+
+  @Parameters(name = "{index}: scanner={0}")
+  public static List<Object[]> params() {
+    return Arrays.asList(new Object[] { "ForwardScanner", new Scan() },
+      new Object[] { "ReverseScanner", new Scan().setReversed(true) });
+  }
+
+  @Parameter(0)
+  public String scannerName;
+
+  @Parameter(1)
+  public Scan originalScan;
+
+  @BeforeClass
+  public static void setUp() throws Exception {
+    // Start the minicluster
+    TEST_UTIL.startMiniCluster(2);
+  }
+
+  @AfterClass
+  public static void tearDown() throws Exception {
+    TEST_UTIL.shutdownMiniCluster();
+  }
+
+  private Scan getScan(byte[][] ROWS, boolean isSingleRegionScan) throws 
IOException {
+    Scan scan = new Scan(originalScan);
+    if (isSingleRegionScan) {
+      scan.withStartRow(ROWS[0], true);
+      scan.withStopRow(ROWS[0], true);
+    } else if (scan.isReversed()) {
+      scan.withStartRow(ROWS[1], true);
+      scan.withStopRow(ROWS[0], true);
+    } else {
+      scan.withStartRow(ROWS[0], true);
+      scan.withStopRow(ROWS[1], true);
+    }
+    return scan;
+  }
+
+  @Test
+  public void testScanMetricsByRegion() throws Exception {
+    final TableName tableName = name.getTableName();
+    byte[][] ROWS = makeN(ROW, 2);
+    byte[][] FAMILIES = makeNAscii(FAMILY, 2);
+    byte[][] QUALIFIERS = makeN(QUALIFIER, 1);
+    byte[][] VALUES = makeN(VALUE, 2);
+    byte[][] splitKeys = new byte[][] { ROWS[1] };
+    try (Table ht = TEST_UTIL.createTable(tableName, FAMILIES, splitKeys)) {
+      // Add two rows in two separate regions
+      Put put1 = new Put(ROWS[0]);
+      put1.addColumn(FAMILIES[0], QUALIFIERS[0], VALUES[0]);
+      Put put2 = new Put(ROWS[1]);
+      put2.addColumn(FAMILIES[0], QUALIFIERS[0], VALUES[1]);
+      List<Put> puts = Arrays.asList(put1, put2);
+      ht.batch(puts, null);
+      // Verify we have two regions
+      Admin admin = TEST_UTIL.getAdmin();
+      Assert.assertEquals(2, admin.getRegions(tableName).size());
+
+      // Test scan metrics disabled

Review Comment:
   nit: each of these little blocks could be its own test method, and then this 
test would be a lot nicer to reason about when there are failures.



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientSideRegionScanner.java:
##########
@@ -179,6 +181,72 @@ public void testContinuesToScanIfHasMore() throws 
IOException {
     }
   }
 
+  @Test
+  public void testScanMetricsByRegion() throws IOException {
+    Scan scan = new Scan();
+    scan.setScanMetricsEnabled(true);
+    scan.setEnableScanMetricsByRegion(true);
+
+    Configuration copyConf = new Configuration(conf);
+    // Test without providing scan metric at prior and scan metrics by region 
enabled

Review Comment:
   nit: each of these little blocks could be its own test method, and then this 
test would be a lot nicer to reason about when there are failures.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -44,6 +64,24 @@ protected void initScanMetrics(Scan scan) {
    */
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {
+      if (scanMetricsByRegion.isEmpty()) {
+        return null;
+      } else if (scanMetricsByRegion.size() == 1) {
+        return scanMetricsByRegion.get(0);
+      }
+      ScanMetrics overallScanMetrics = new ScanMetrics();
+      for (ScanMetrics otherScanMetrics : scanMetricsByRegion) {
+        overallScanMetrics.combineMetrics(otherScanMetrics);
+      }
+      return overallScanMetrics;
+    } else {
+      return scanMetrics;
+    }
+  }
+
+  @Override
+  public List<ScanMetrics> getScanMetricsByRegion() {

Review Comment:
   Oh wait, it's because you haven't added the per-region scan metrics to the 
existing object. Why not?



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AbstractClientScanner.java:
##########
@@ -44,6 +64,24 @@ protected void initScanMetrics(Scan scan) {
    */
   @Override
   public ScanMetrics getScanMetrics() {
-    return scanMetrics;
+    if (scanMetricsByRegion != null) {
+      if (scanMetricsByRegion.isEmpty()) {
+        return null;

Review Comment:
   We need to maintain consistency in our client-side metrics interfaces. I 
believe the patch we just merged on #6623 returns null.
   
   Actually, please rebase onto that commit (if you haven't already) so that we 
can easily get the complete picture of these features.



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/client/metrics/ServerSideScanMetrics.java:
##########
@@ -117,4 +121,64 @@ public Map<String, Long> getMetricsMap(boolean reset) {
     // Build the immutable map so that people can't mess around with it.
     return builder.build();
   }
+
+  public ServerName getServerName() {
+    return this.serverName;
+  }
+
+  public void setServerName(ServerName serverName) {
+    if (this.serverName == null) {
+      this.serverName = serverName;
+    }
+  }
+
+  public void setEncodedRegionName(String encodedRegionName) {
+    if (this.encodedRegionName == null) {
+      this.encodedRegionName = encodedRegionName;
+    }
+  }
+
+  public String getEncodedRegionName() {
+    return this.encodedRegionName;
+  }
+
+  @Override
+  public String toString() {
+    StringBuilder sb = new StringBuilder();
+    sb.append("ServerName:");
+    sb.append(this.serverName);
+    sb.append(",EncodedRegion:");
+    sb.append(this.encodedRegionName);
+    sb.append(",[");
+    boolean isFirstMetric = true;
+    for (Map.Entry<String, AtomicLong> e : this.counters.entrySet()) {
+      if (isFirstMetric) {
+        isFirstMetric = false;
+      } else {
+        sb.append(",");
+      }
+      sb.append(e.getKey());
+      sb.append(":");
+      sb.append(e.getValue().get());
+    }
+    sb.append("]");
+    return sb.toString();
+  }

Review Comment:
   I generally agree with using the ToStringBuilder. the only issue is it's 
based on reflection and is not very efficient. If this gets called in a hot 
path, it's pretty bad. I don't think this gets called in a hot-path, but FYI.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to