keith-turner commented on issue #5520:
URL: https://github.com/apache/accumulo/issues/5520#issuecomment-2844982515

   One way this could possibly be optimized is by using an iterator like the following. An existing iterator may already do this; I am not sure. The hope is that it could quickly skip entire metadata tablets that have no migration column family, using the information each rfile keeps about which column families it contains.
   
   ```java
   /**
    * An iterator that only returns rows where a primary column family is present.
    * The returned rows can contain a secondary set of columns.
    */
   class RowSubsetFilter implements SortedKeyValueIterator<Key,Value> {
      // primary iterator that looks for only the migration column family
      SortedKeyValueIterator<Key,Value> primary;
      // secondary iterator created by using deep copy, used to get all columns
      // for the row when the migration column is found
      SortedKeyValueIterator<Key,Value> secondary;
   
      Key getTopKey(){
          return secondary.getTopKey();
      }
   
      Value getTopValue(){
          return secondary.getTopValue();
      }
       
   
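      boolean hasTop(){
         // the top key and value always come from the secondary iterator
         return secondary.hasTop();
      }

      void next() throws IOException {
         if(secondary.hasTop()){
            secondary.next();
         }
         // once the current row is exhausted, move to the next row that has a migration column
         while(!secondary.hasTop() && primary.hasTop()){
            var row = primary.getTopKey().getRow();
            // seek on the migrations column plus the other columns of interest like prevrow
            secondary.seek(new Range(row), allColumns, true);
            // skip any remaining migration columns in this row so the row is returned only once
            while(primary.hasTop() && primary.getTopKey().getRow().equals(row)){
               primary.next();
            }
         }
      }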
   }
   
   ```
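
To make the intended row selection concrete, here is a minimal self-contained model of the idea. It is only a sketch and does not use the Accumulo iterator API: a nested `TreeMap` stands in for a sorted metadata tablet, and the class name `RowSubsetSketch`, the method `select`, and the sample rows and values are all hypothetical. It also does not model the rfile-level skipping, only which rows and columns would come back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class RowSubsetSketch {

  /**
   * Returns "row family value" strings for every wanted family of each row
   * that contains primaryFamily. Rows without primaryFamily are skipped whole.
   */
  static List<String> select(TreeMap<String, TreeMap<String, String>> table,
      String primaryFamily, Set<String> wantedFamilies) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, TreeMap<String, String>> row : table.entrySet()) {
      TreeMap<String, String> families = row.getValue();
      // Role of the "primary" iterator: does this row have the family at all?
      if (!families.containsKey(primaryFamily)) {
        continue;
      }
      // Role of the "secondary" iterator: re-read the same row for the wanted families.
      for (Map.Entry<String, String> col : families.entrySet()) {
        if (wantedFamilies.contains(col.getKey())) {
          out.add(row.getKey() + " " + col.getKey() + " " + col.getValue());
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Hypothetical sample data: only the middle row has a migration family.
    TreeMap<String, TreeMap<String, String>> table = new TreeMap<>();
    table.put("1;a", new TreeMap<>(Map.of("prevrow", "none")));
    table.put("1;m", new TreeMap<>(Map.of("migration", "tserver1", "prevrow", "a")));
    table.put("1<", new TreeMap<>(Map.of("prevrow", "m")));
    for (String line : select(table, "migration", Set.of("migration", "prevrow"))) {
      System.out.println(line);
    }
  }
}
```

With this sample data only row `1;m` is returned, with both its migration and prevrow entries; the other two rows are dropped because they have no migration family.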

