keith-turner commented on code in PR #5341:
URL: https://github.com/apache/accumulo/pull/5341#discussion_r1970440558


##########
server/manager/src/main/java/org/apache/accumulo/manager/tableOps/bulkVer2/LoadFiles.java:
##########
@@ -342,12 +341,22 @@ private long loadFiles(TableId tableId, Path bulkDir, LoadMappingIterator loadMa
     loader.start(bulkDir, manager, tid, bulkInfo.setTime);
 
     long t1 = System.currentTimeMillis();
+    KeyExtent prevLastExtent = null; // KeyExtent of last tablet from prior loadMapEntry
     while (lmi.hasNext()) {
       loadMapEntry = lmi.next();
-      List<TabletMetadata> tablets =
-          findOverlappingTablets(fmtTid, loadMapEntry.getKey(), tabletIter);
+      KeyExtent loadMapKey = loadMapEntry.getKey();
+      if (prevLastExtent != null && !loadMapKey.isPreviousExtent(prevLastExtent)) {

Review Comment:
   In some cases this strategy could make performance worse, for example when
importing into every 3rd tablet. The underlying scanner has already made an RPC
and fetched a batch of key values. I'm not sure of the best way to do this, but
ideally we would reset the scanner only if the needed data is not already
sitting in the batch of key values that was already read.
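
   To make the concern concrete, here is a minimal sketch of the "reset only
when the target is outside the buffered batch" idea. It is a toy model, not
Accumulo code: `BatchAwareCursor`, `seekAlwaysReset`, `seekBatchAware`,
`countRpcs`, and the use of plain integers in place of tablet extents are all
hypothetical names and simplifications chosen for illustration. Each simulated
reset counts as one RPC that fetches `batchSize` entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical toy model (not Accumulo API): a cursor over sorted tablets read in batches. */
class BatchAwareCursor {
  private final List<Integer> tablets; // sorted integers standing in for tablet extents
  private final int batchSize; // entries fetched per simulated RPC
  private List<Integer> batch = new ArrayList<>(); // entries from the last simulated RPC
  private int pos = 0; // next unread index within batch
  private int rpcCount = 0;

  BatchAwareCursor(List<Integer> tablets, int batchSize) {
    this.tablets = tablets;
    this.batchSize = batchSize;
  }

  /** Simulates resetting the scanner: one RPC fetching a batch that starts at target. */
  private void reset(int target) {
    rpcCount++;
    int start = 0;
    while (start < tablets.size() && tablets.get(start) < target) {
      start++;
    }
    int end = Math.min(start + batchSize, tablets.size());
    batch = new ArrayList<>(tablets.subList(start, end));
    pos = 0;
  }

  /** Returns the next buffered entry, or -1 if the batch is exhausted. */
  private int next() {
    return pos < batch.size() ? batch.get(pos++) : -1;
  }

  /** Always resets before reading, even if the target is already buffered. */
  int seekAlwaysReset(int target) {
    reset(target);
    return next();
  }

  /** Resets only when the target is not in the unread part of the buffered batch. */
  int seekBatchAware(int target) {
    boolean buffered = pos < batch.size()
        && target >= batch.get(pos)
        && target <= batch.get(batch.size() - 1);
    if (buffered) {
      while (batch.get(pos) < target) {
        pos++; // skip entries already fetched; no new RPC needed
      }
    } else {
      reset(target);
    }
    return next();
  }

  /** Counts simulated RPCs when visiting every stride-th tablet out of numTablets. */
  static int countRpcs(boolean batchAware, int numTablets, int batchSize, int stride) {
    List<Integer> tablets = new ArrayList<>();
    for (int i = 0; i < numTablets; i++) {
      tablets.add(i);
    }
    BatchAwareCursor cursor = new BatchAwareCursor(tablets, batchSize);
    for (int target = 0; target < numTablets; target += stride) {
      int found = batchAware ? cursor.seekBatchAware(target) : cursor.seekAlwaysReset(target);
      if (found != target) {
        throw new IllegalStateException("expected " + target + " but read " + found);
      }
    }
    return cursor.rpcCount;
  }

  public static void main(String[] args) {
    // Every 3rd tablet out of 30, with 10 entries per simulated RPC.
    System.out.println("always reset: " + countRpcs(false, 30, 10, 3));
    System.out.println("batch aware:  " + countRpcs(true, 30, 10, 3));
  }
}
```

   In this model, visiting every 3rd of 30 tablets with a batch of 10 costs 10
simulated RPCs when always resetting and 3 when skipping within the buffered
batch. The real scanner's batch boundaries are not exposed this directly, so
how to detect "already buffered" in `LoadFiles` remains the open question.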


