keith-turner commented on issue #5520:
URL: https://github.com/apache/accumulo/issues/5520#issuecomment-2844982515
One way this could possibly be optimized is by using an iterator like the
following. Maybe an existing iterator could already do this, not sure. The hope
is that it could quickly skip entire metadata tablets that do not have a
migration column family present, using the information in the rfile about which
families are present in the rfile.
```java
/**
 * An iterator that only returns rows where a primary column family is
 * present. The returned rows can contain a secondary set of columns.
 * Sketch only: init(), seek(), and deepCopy() are omitted.
 */
class RowSubsetFilter implements SortedKeyValueIterator<Key,Value> {
  // primary iterator that looks only for the migration column family
  SortedKeyValueIterator<Key,Value> primary;
  // secondary iterator created by using deep copy, used to get all columns
  // for the row when the migration column is found
  SortedKeyValueIterator<Key,Value> secondary;

  public Key getTopKey() {
    return secondary.getTopKey();
  }

  public Value getTopValue() {
    return secondary.getTopValue();
  }

  public boolean hasTop() {
    return primary.hasTop() || secondary.hasTop();
  }

  public void next() throws IOException {
    if (secondary.hasTop()) {
      secondary.next();
    } else if (primary.hasTop()) {
      var seekRange = new Range(primary.getTopKey().getRow());
      // seek on the migration column plus the other columns of interest,
      // like prevrow
      secondary.seek(seekRange, allColumns, true);
      primary.next();
    }
  }
}
```
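To make the intended row-level behavior concrete, here is a minimal, self-contained sketch in plain Java with no Accumulo dependency. `Entry`, `filterRows`, and the family names `migration`, `prev`, and `loc` are placeholders invented for this illustration, not Accumulo types or actual metadata column names. It models only which entries come back for a sorted input; it does not model the rfile-level skipping that would be the actual optimization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class RowSubsetSketch {

    /** Simplified stand-in for a key/value pair (hypothetical type). */
    public static final class Entry {
        final String row, family, qualifier, value;

        public Entry(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public String toString() {
            return row + " " + family + ":" + qualifier + "=" + value;
        }
    }

    /**
     * Returns entries from rows that contain primaryFamily, restricted to the
     * families of interest. The input must be sorted so that all entries of a
     * row are adjacent.
     */
    public static List<Entry> filterRows(List<Entry> sorted, String primaryFamily,
                                         Set<String> families) {
        List<Entry> out = new ArrayList<>();
        int start = 0;
        while (start < sorted.size()) {
            String row = sorted.get(start).row;
            int end = start;
            boolean hasPrimary = false;
            // "primary" pass: find the end of the row and check for the family
            while (end < sorted.size() && sorted.get(end).row.equals(row)) {
                hasPrimary |= sorted.get(end).family.equals(primaryFamily);
                end++;
            }
            // "secondary" pass: re-read the row only when the family was found
            if (hasPrimary) {
                for (int j = start; j < end; j++) {
                    if (families.contains(sorted.get(j).family)) {
                        out.add(sorted.get(j));
                    }
                }
            }
            start = end;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> data = List.of(
            new Entry("a", "loc", "s1", "host1"),
            new Entry("a", "prev", "row", ""),
            new Entry("b", "loc", "s2", "host1"),
            new Entry("b", "migration", "dest", "host2"),
            new Entry("b", "prev", "row", "a"));
        filterRows(data, "migration", Set.of("migration", "prev"))
            .forEach(System.out::println);
    }
}
```

In this model only row `b` has the placeholder `migration` family, so only its `migration` and `prev` entries are returned; rows without the primary family contribute nothing.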