[ 
https://issues.apache.org/jira/browse/HBASE-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206869#comment-16206869
 ] 

Duo Zhang commented on HBASE-19001:
-----------------------------------

OK, the problem for Tephra is in flush and compaction. It does two things 
there: first, it sets the scan to read all versions; second, it adds a Filter.

I think the first one is not a problem for flush/compaction, since we always 
read all versions during flush/compaction. Flush/compaction for MOB may be 
different, but I think that is OK? The MOB file works like external storage.

For the filter, the code is
{code}
  static class IncludeInProgressFilter extends FilterBase {
    private final long visibilityUpperBound;
    private final Set<Long> invalidIds;
    private final Filter txFilter;

    public IncludeInProgressFilter(long upperBound, Collection<Long> invalids,
        Filter transactionFilter) {
      this.visibilityUpperBound = upperBound;
      this.invalidIds = Sets.newHashSet(invalids);
      this.txFilter = transactionFilter;
    }

    @Override
    public ReturnCode filterKeyValue(Cell cell) throws IOException {
      // include all cells visible to in-progress transactions, except for
      // those already marked as invalid
      long ts = cell.getTimestamp();
      if (ts > visibilityUpperBound) {
        // include everything that could still be in-progress except invalids
        if (invalidIds.contains(ts)) {
          return ReturnCode.SKIP;
        }
        return ReturnCode.INCLUDE;
      }
      return txFilter.filterKeyValue(cell);
    }
  }
{code}

It only implements filterKeyValue, so I think it is easy to change it to wrap 
the InternalScanner and do the filtering on the Cell list returned by 
InternalScanner.next. There is an example:

https://github.com/apache/hbase/blob/master/hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example/ZooKeeperScanPolicyObserver.java

{code}
  private InternalScanner wrap(InternalScanner scanner) {
    OptionalLong optExpireBefore = getExpireBefore();
    if (!optExpireBefore.isPresent()) {
      return scanner;
    }
    long expireBefore = optExpireBefore.getAsLong();
    return new DelegatingInternalScanner(scanner) {

      @Override
      public boolean next(List<Cell> result, ScannerContext scannerContext)
          throws IOException {
        boolean moreRows = scanner.next(result, scannerContext);
        result.removeIf(c -> c.getTimestamp() < expireBefore);
        return moreRows;
      }
    };
  }
{code}
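
As a rough sketch of how the IncludeInProgressFilter decision above might move into such a wrapper: the code below uses plain Java with no HBase dependency, so SketchCell, SketchScanner, InProgressFilteringScanner and InProgressFilterDemo are all hypothetical stand-ins, not HBase or Tephra classes. The Predicate stands in for the wrapped transaction Filter; it can only keep or drop a cell, so a real port would still need to decide how to handle the other Filter.ReturnCode values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical stand-in for an HBase Cell; only the timestamp matters here. */
final class SketchCell {
  final String row;
  final long timestamp;

  SketchCell(String row, long timestamp) {
    this.row = row;
    this.timestamp = timestamp;
  }
}

/** Hypothetical stand-in for InternalScanner: appends cells, returns whether more rows remain. */
interface SketchScanner {
  boolean next(List<SketchCell> result);
}

/** Wraps a scanner and applies the include-in-progress decision to each returned cell. */
final class InProgressFilteringScanner implements SketchScanner {
  private final SketchScanner delegate;
  private final long visibilityUpperBound;
  private final Set<Long> invalidIds;
  // Assumption: a boolean predicate replaces the wrapped transaction Filter (true = keep).
  private final Predicate<SketchCell> txFilter;

  InProgressFilteringScanner(SketchScanner delegate, long upperBound,
      Collection<Long> invalids, Predicate<SketchCell> txFilter) {
    this.delegate = delegate;
    this.visibilityUpperBound = upperBound;
    this.invalidIds = new HashSet<>(invalids);
    this.txFilter = txFilter;
  }

  @Override
  public boolean next(List<SketchCell> result) {
    boolean moreRows = delegate.next(result);
    // Filter the batch after the delegate has produced it, instead of injecting a Filter.
    result.removeIf(c -> !keep(c));
    return moreRows;
  }

  private boolean keep(SketchCell c) {
    long ts = c.timestamp;
    if (ts > visibilityUpperBound) {
      // Could still be in progress: keep unless already marked invalid.
      return !invalidIds.contains(ts);
    }
    return txFilter.test(c);
  }
}

final class InProgressFilterDemo {
  /** Runs one batch through the wrapper and returns the timestamps that survive. */
  static List<Long> run() {
    List<SketchCell> batch = Arrays.asList(
        new SketchCell("r1", 50L), new SketchCell("r1", 90L),
        new SketchCell("r1", 120L), new SketchCell("r1", 150L));
    SketchScanner source = result -> {
      result.addAll(batch);
      return false; // single batch, no more rows
    };
    // Upper bound 100, transaction 150 is invalid, and the stand-in tx filter hides ts 50.
    SketchScanner wrapped = new InProgressFilteringScanner(
        source, 100L, Arrays.asList(150L), c -> c.timestamp != 50L);
    List<SketchCell> out = new ArrayList<>();
    wrapped.next(out);
    List<Long> kept = new ArrayList<>();
    for (SketchCell c : out) {
      kept.add(c.timestamp);
    }
    return kept;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

In this sketch the cells with timestamps 90 and 120 survive: 50 is dropped by the stand-in transaction filter and 150 is dropped as invalid.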

Thanks.

> Remove the hooks in RegionObserver which are designed to construct a 
> StoreScanner which is marked as IA.Private
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19001
>                 URL: https://issues.apache.org/jira/browse/HBASE-19001
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0-alpha-4
>
>         Attachments: HBASE-19001.patch
>
>
> There are three methods here
> {code}
> KeyValueScanner preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, Scan scan, NavigableSet<byte[]> targetCols, KeyValueScanner s, long readPt)
>       throws IOException;
> InternalScanner preFlushScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<KeyValueScanner> scanners, InternalScanner s, long readPoint)
>       throws IOException;
> InternalScanner preCompactScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>       Store store, List<? extends KeyValueScanner> scanners, ScanType scanType, long earliestPutTs,
>       InternalScanner s, CompactionLifeCycleTracker tracker, CompactionRequest request,
>       long readPoint) throws IOException;
> {code}
> For the flush and compact ones, as we've discussed many times, it is not safe 
> to let users inject a Filter or even implement their own InternalScanner 
> using the store file scanners, as our correctness depends heavily on the 
> complicated logic in SQM and StoreScanner. CP users are expected to wrap the 
> original InternalScanner (it is a StoreScanner anyway) in the 
> preFlush/preCompact methods to do filtering or something else.
> preStoreScannerOpen even returns a KeyValueScanner, which is marked as 
> IA.Private... This does less harm, but still, we've decided not to expose 
> StoreScanner to CP users, so this method is useless. CP users can use the 
> preGetOp and preScannerOpen methods to modify the Get/Scan object passed in 
> to inject into the scan operation.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
