Github user joshelser commented on a diff in the pull request:

    https://github.com/apache/accumulo/pull/260#discussion_r119144555
  
    --- Diff: 
server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/Tablet.java ---
    @@ -715,47 +722,54 @@ Batch nextBatch(SortedKeyValueIterator<Key,Value> 
iter, Range range, int num, Se
     
         long maxResultsSize = 
tableConfiguration.getAsBytes(Property.TABLE_SCAN_MAXMEM);
     
    -    if (columns.size() == 0) {
    -      iter.seek(range, LocalityGroupUtil.EMPTY_CF_SET, false);
    -    } else {
    -      iter.seek(range, LocalityGroupUtil.families(columns), true);
    -    }
    -
         Key continueKey = null;
         boolean skipContinueKey = false;
     
         boolean endOfTabletReached = false;
    -    while (iter.hasTop()) {
     
    -      value = iter.getTopValue();
    -      key = iter.getTopKey();
    +    try {
    +      if (columns.size() == 0) {
    +        iter.seek(range, LocalityGroupUtil.EMPTY_CF_SET, false);
    --- End diff --
    
    > The underlying iterator would presumably seek its source with the same 
range. Now the seek is iterating through keys past the initial start of the 
range and realizes at some point that it has been passing by many of the 
underlying keys for whatever reason without returning control to the Tablet. 
Now it throws a ScanYieldException specifying where it left off in the seek.
    
    Yup, right.
    
    > Later the Tablet will come back and rebuild the iterator and seek it 
using the new start as specified. The seek can now continue doing its job until 
it actually finds a key that is to be returned. Where in this scenario is a key 
missed?
    
    The caveat here is that _no_ keys from this Range were returned; the SKVI 
would return a "position" that is the same as the startKey of the Range is was 
passed.
    
    ```java
    +    } catch (ScanYieldException sye) {
    +      continueKey = new Key(sye.getPosition());
    +      skipContinueKey = true;
    +    }
    ```
    
    I may be grasping at straws, but, the little bit I've read about how this 
method is invoked, the fact that `skipContinueKey` is set to true means that 
the next time we return to the Range, we construct a new Range which is 
non-inclusive of the original start key.
    
    Concretely:
    
    `seek([a, z])` throwing `ScanYieldException` would net a call `seek((a, 
z])` (instead of `seek([a,z])` again). This worried me because your test case 
obviously didn't cover this path. If you are sure this is not an issue, I trust 
you.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

Reply via email to