[GitHub] [hbase] Apache9 commented on a diff in pull request #4967: HBASE-27558 Scan quotas and limits should account for total block IO

GitBox Wed, 18 Jan 2023 01:49:13 -0800


Apache9 commented on code in PR #4967:
URL: https://github.com/apache/hbase/pull/4967#discussion_r1073268854



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java:
##########
@@ -140,4 +140,10 @@ public interface HFileScanner extends Shipper, Closeable {
    */
   @Override
   void close();
+
+  /**
+   * Returns the block size in bytes for the current block. Will only return a 
value once per block,
+   * otherwise 0. Used for calculating block IO in ScannerContext.
+   */
+  int getCurrentBlockSizeOnce();

Review Comment:
   This API design is a bit strange... Let me take a look on the usage...



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/ScannerContext.java:
##########
@@ -158,10 +167,32 @@ void setKeepProgress(boolean keepProgress) {
     this.keepProgress = keepProgress;
   }
 
+  /**
+   * In this mode, only block size progress is tracked, and limits are 
ignored. We set this mode
+   * when skipping to next row, in which case all cells returned a thrown away 
so should not count

Review Comment:
   all cells returned [a] thrown -> all cells returned [are] thrown away?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java:
##########
@@ -608,6 +610,11 @@ private boolean nextInternal(List<Cell> results, 
ScannerContext scannerContext)
           return 
scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
         }
         if (!shouldStop) {
+          // Read nothing as the cells were filtered, but still need to check 
time limit.

Review Comment:
   Why we need to add a check here?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionScannerImpl.java:
##########
@@ -705,13 +712,21 @@ public int size() {
 
   protected boolean nextRow(ScannerContext scannerContext, Cell curRowCell) 
throws IOException {
     assert this.joinedContinuationRow == null : "Trying to go to next row 
during joinedHeap read.";
+
+    // Enable skipping row mode, which disables limits and skips tracking 
progress for all
+    // but block size. We keep tracking block size because skipping a row in 
this way
+    // might involve reading blocks along the way.
+    scannerContext.setSkippingRow(true);
+
     Cell next;
     while ((next = this.storeHeap.peek()) != null && 
CellUtil.matchingRows(next, curRowCell)) {
       // Check for thread interrupt status in case we have been signaled from
       // #interruptRegionOperation.
       region.checkInterrupt();
-      this.storeHeap.next(MOCKED_LIST);
+      this.storeHeap.next(MOCKED_LIST, scannerContext);

Review Comment:
   So this is why we need the skippingRow flag? In the past we do not passing 
ScannerContext in, and now we want to record the block size so we need to pass 
it down, but we do not want to increase the size other than block?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java:
##########
@@ -3398,7 +3398,6 @@ private void scan(HBaseRpcController controller, 
ScanRequest request, RegionScan
             }
             boolean mayHaveMoreCellsInRow = 
scannerContext.mayHaveMoreCellsInRow();
             Result r = Result.create(values, null, stale, 
mayHaveMoreCellsInRow);
-            lastBlock.setValue(addSize(rpcCall, r, lastBlock.getValue()));

Review Comment:
   We do not need to count lastBlock any more?



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java:
##########
@@ -612,6 +612,9 @@ public boolean next(List<Cell> outResult, ScannerContext 
scannerContext) throws
           // here, we still need to scan all the qualifiers before returning...
           scannerContext.returnImmediately();
         }
+
+        scannerContext.incrementBlockProgress(heap.getCurrentBlockSizeOnce());

Review Comment:
   So the problem here is we do not know if this is a new block?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [hbase] Apache9 commented on a diff in pull request #4967: HBASE-27558 Scan quotas and limits should account for total block IO

Reply via email to