[ https://issues.apache.org/jira/browse/HBASE-19001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206869#comment-16206869 ]
Duo Zhang commented on HBASE-19001:
-----------------------------------

OK, the problem for Tephra is with flush and compaction. It does two things: first, it sets the scan to read all versions; second, it adds a Filter.

I think the first one is not a problem for flush/compaction, since we always read all versions during flush/compaction. The flush/compaction for MOB may be different, but I think that is OK? The MOB file works like an external storage.

For the filter, the code is

{code}
static class IncludeInProgressFilter extends FilterBase {
  private final long visibilityUpperBound;
  private final Set<Long> invalidIds;
  private final Filter txFilter;

  public IncludeInProgressFilter(long upperBound, Collection<Long> invalids, Filter transactionFilter) {
    this.visibilityUpperBound = upperBound;
    this.invalidIds = Sets.newHashSet(invalids);
    this.txFilter = transactionFilter;
  }

  @Override
  public ReturnCode filterKeyValue(Cell cell) throws IOException {
    // include all cells visible to in-progress transactions, except for those already marked as invalid
    long ts = cell.getTimestamp();
    if (ts > visibilityUpperBound) {
      // include everything that could still be in-progress except invalids
      if (invalidIds.contains(ts)) {
        return ReturnCode.SKIP;
      }
      return ReturnCode.INCLUDE;
    }
    return txFilter.filterKeyValue(cell);
  }
}
{code}

It only implements filterKeyValue, so I think it is easy to change it to wrap the InternalScanner and do the filtering on the Cell list returned by InternalScanner.next.
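As a rough sketch of that change (not Tephra's or HBase's actual code), the filter's decision could be applied to the list a scanner returns. Cell and InternalScanner below are cut-down stand-ins for the HBase interfaces so that the snippet compiles on its own; IncludeInProgressScanner and txVisible are hypothetical names. Reducing txFilter to a keep/drop predicate is an assumption: it ignores return codes that skip ahead to the next column or row.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Stand-ins for org.apache.hadoop.hbase.Cell and
// org.apache.hadoop.hbase.regionserver.InternalScanner, reduced to what the
// sketch needs. The real interfaces have more methods.
interface Cell {
  long getTimestamp();
}

interface InternalScanner {
  boolean next(List<Cell> result) throws IOException;
}

// Hypothetical wrapper: makes the same decision as IncludeInProgressFilter,
// but on the cells returned by the delegate scanner instead of as a Filter.
class IncludeInProgressScanner implements InternalScanner {
  private final InternalScanner delegate;
  private final long visibilityUpperBound;
  private final Set<Long> invalidIds;
  // Assumption: the transaction filter is reduced to keep (true) or drop (false).
  private final Predicate<Cell> txVisible;

  IncludeInProgressScanner(InternalScanner delegate, long upperBound,
      Collection<Long> invalids, Predicate<Cell> txVisible) {
    this.delegate = delegate;
    this.visibilityUpperBound = upperBound;
    this.invalidIds = new HashSet<>(invalids);
    this.txVisible = txVisible;
  }

  @Override
  public boolean next(List<Cell> result) throws IOException {
    // Let the original scanner fill the list, then drop what is not wanted.
    boolean moreRows = delegate.next(result);
    result.removeIf(c -> !keep(c));
    return moreRows;
  }

  private boolean keep(Cell cell) {
    long ts = cell.getTimestamp();
    if (ts > visibilityUpperBound) {
      // could still be in-progress: keep everything except invalids
      return !invalidIds.contains(ts);
    }
    return txVisible.test(cell);
  }
}
```

The wrapper never touches the store file scanners; it only post-processes what the delegate returns.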
There is an example:

https://github.com/apache/hbase/blob/master/hbase-examples/src/main/java/org/apache/hadoop/hbase/coprocessor/example/ZooKeeperScanPolicyObserver.java

{code}
private InternalScanner wrap(InternalScanner scanner) {
  OptionalLong optExpireBefore = getExpireBefore();
  if (!optExpireBefore.isPresent()) {
    return scanner;
  }
  long expireBefore = optExpireBefore.getAsLong();
  return new DelegatingInternalScanner(scanner) {

    @Override
    public boolean next(List<Cell> result, ScannerContext scannerContext) throws IOException {
      boolean moreRows = scanner.next(result, scannerContext);
      result.removeIf(c -> c.getTimestamp() < expireBefore);
      return moreRows;
    }
  };
}
{code}

Thanks.

> Remove the hooks in RegionObserver which are designed to construct a
> StoreScanner which is marked as IA.Private
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-19001
>                 URL: https://issues.apache.org/jira/browse/HBASE-19001
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Coprocessors
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>             Fix For: 2.0.0-alpha-4
>
>         Attachments: HBASE-19001.patch
>
>
> There are three methods here
> {code}
> KeyValueScanner preStoreScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>     Store store, Scan scan, NavigableSet<byte[]> targetCols, KeyValueScanner s, long readPt)
>     throws IOException;
>
> InternalScanner preFlushScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>     Store store, List<KeyValueScanner> scanners, InternalScanner s, long readPoint)
>     throws IOException;
>
> InternalScanner preCompactScannerOpen(ObserverContext<RegionCoprocessorEnvironment> c,
>     Store store, List<? extends KeyValueScanner> scanners, ScanType scanType,
>     long earliestPutTs, InternalScanner s, CompactionLifeCycleTracker tracker,
>     CompactionRequest request, long readPoint) throws IOException;
> {code}
> For the flush and compact ones, we've discussed many times that it is not safe
> to let users inject a Filter or even implement their own InternalScanner using
> the store file scanners, as our correctness highly depends on the complicated
> logic in SQM and StoreScanner. CP users are expected to wrap the original
> InternalScanner (it is a StoreScanner anyway) in the preFlush/preCompact
> methods to do filtering or something else.
> For preStoreScannerOpen, it even returns a KeyValueScanner, which is marked as
> IA.Private... This is less harmful, but still, we've decided not to expose
> StoreScanner to CP users, so this method is useless here. CP users can use the
> preGetOp and preScannerOpen methods to modify the Get/Scan object passed in,
> and so inject into the scan operation.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)