[ 
https://issues.apache.org/jira/browse/PHOENIX-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032238#comment-15032238
 ] 

Ankit Singhal commented on PHOENIX-2249:
----------------------------------------

[~jamestaylor], [~chrajeshbab...@gmail.com]
How about deleting the stale stats for the two regions that were merged 
immediately at the start of the compaction? Stats for the new merged region 
will be added automatically during the compaction run. 

Otherwise, the stale stats will cause duplicate results to be returned to the user.

In UngroupedAggregateRegionObserver.java
  
{code}
    @Override
    public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> c, final Store store,
            InternalScanner scanner, final ScanType scanType) throws IOException {
        TableName table = c.getEnvironment().getRegion().getRegionInfo().getTable();
        ClusterConnection conn = c.getEnvironment().getRegionServerServices().getConnection();
        InternalScanner internalScanner = scanner;
        if (scanType.equals(ScanType.COMPACT_DROP_DELETES)) {
            try {
                Pair<HRegionInfo, HRegionInfo> mergeRegions = MetaTableAccessor.getRegionsFromMergeQualifier(conn,
                        c.getEnvironment().getRegion().getRegionInfo().getRegionName());

                boolean useCurrentTime = c.getEnvironment().getConfiguration().getBoolean(
                        QueryServices.STATS_USE_CURRENT_TIME_ATTRIB,
                        QueryServicesOptions.DEFAULT_STATS_USE_CURRENT_TIME);
                // Provides a means of clients controlling their timestamps to not use current time
                // when background tasks are updating stats. Instead we track the max timestamp of
                // the cells and use that.
                long clientTimeStamp = useCurrentTime ? TimeKeeper.SYSTEM.getCurrentTime()
                        : StatisticsCollector.NO_TIMESTAMP;
                StatisticsCollector stats = new StatisticsCollector(c.getEnvironment(), table.getNameAsString(),
                        clientTimeStamp, store.getFamily().getName());
                if (mergeRegions != null) {
                    ImmutableBytesPtr fam = new ImmutableBytesPtr(store.getFamily().getName());
                    stats.deleteStatistic(mergeRegions.getFirst().getRegionName(), fam);
                    stats.deleteStatistic(mergeRegions.getSecond().getRegionName(), fam);
                }

                internalScanner = stats.createCompactionScanner(c.getEnvironment().getRegion(), store, scanner);
            } catch (IOException e) {
                // If we can't reach the stats table, don't interrupt the normal
                // compaction operation, just log a warning.
                if (logger.isWarnEnabled()) {
                    logger.warn("Unable to collect stats for " + table, e);
                }
            }
        }
        return internalScanner;
    }
{code}
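
To illustrate why the stale rows matter, here is a rough, self-contained toy model. It is not Phoenix code: the class, the map standing in for SYSTEM.STATS, the region names and the guide post values are all made up for illustration, and guide posts are plain strings rather than byte arrays. It only assumes that stats rows are read back in region-name order and their guide posts concatenated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these names are Phoenix or HBase APIs.
class StaleStatsSketch {
    // Hypothetical stand-in for SYSTEM.STATS: region name -> guide posts of one column family.
    // A TreeMap keeps the rows ordered by region name, mimicking a row-key-ordered table.
    final TreeMap<String, List<String>> statsByRegion = new TreeMap<>();

    // Concatenates the guide posts of every stats row, in row order.
    List<String> readGuidePosts() {
        List<String> all = new ArrayList<>();
        for (List<String> posts : statsByRegion.values()) {
            all.addAll(posts);
        }
        return all;
    }

    // Models the compaction of a merged region. deleteStale toggles the proposed cleanup
    // of the rows that belong to the two regions that were merged.
    void compactMergedRegion(String merged, String first, String second,
                             List<String> newPosts, boolean deleteStale) {
        if (deleteStale) {
            statsByRegion.remove(first);
            statsByRegion.remove(second);
        }
        statsByRegion.put(merged, newPosts);
    }

    // Builds two regions with stats, merges them, and compacts the merged region.
    static StaleStatsSketch afterMerge(boolean deleteStale) {
        StaleStatsSketch s = new StaleStatsSketch();
        s.statsByRegion.put("T,a,1", Arrays.asList("b", "f"));
        s.statsByRegion.put("T,m,1", Arrays.asList("n", "t"));
        s.compactMergedRegion("T,a,2", "T,a,1", "T,m,1",
                Arrays.asList("b", "f", "n", "t"), deleteStale);
        return s;
    }

    public static void main(String[] args) {
        // Without the cleanup, every guide post shows up twice.
        System.out.println(afterMerge(false).readGuidePosts()); // [b, f, b, f, n, t, n, t]
        // With the cleanup, only the merged region's guide posts remain.
        System.out.println(afterMerge(true).readGuidePosts());  // [b, f, n, t]
    }
}
```

In this model the duplicated guide posts are what would translate into overlapping scans, and so into duplicate rows, on the client side.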

> SYSTEM.STATS not update after region merge occurs.
> --------------------------------------------------
>
>                 Key: PHOENIX-2249
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2249
>             Project: Phoenix
>          Issue Type: Bug
>         Environment: Ubuntu server 14.04
> Hadoop 2.6.0
> HBase 1.0.0
> Phoenix 4.4.0-HBase-1.0.0
>            Reporter: Kuan-Po Tseng
>            Assignee: Ankit Singhal
>
> When a region merge occurs, SYSTEM.STATS is not updated, which leaves stale 
> information behind. If the merged region is later split, this may 
> cause 
> "org.apache.phoenix.schema.StaleRegionBoundaryCacheException: ERROR 1108 
> (XCL08): Cache of region boundaries are out of date" 
> after the parallel scans are created, because the stale information in SYSTEM.STATS 
> leaves the guide post list not sorted in ascending order, and this causes 
> scans that span regions.
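
The ordering problem described above can be sketched with a small check. This is only an illustration, not Phoenix code: the class and method names are hypothetical, guide posts are modelled as plain strings instead of byte arrays, and the sample data assumes that stale rows for the merged-away regions are read alongside the row for the merged region.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: these names are hypothetical and not part of Phoenix.
class GuidePostOrderSketch {
    // Returns the index of the first guide post that is not strictly greater
    // than its predecessor, or -1 if the list is strictly ascending.
    static int firstOutOfOrder(List<String> guidePosts) {
        for (int i = 1; i < guidePosts.size(); i++) {
            if (guidePosts.get(i).compareTo(guidePosts.get(i - 1)) <= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Guide posts of the merged region only.
        List<String> clean = Arrays.asList("b", "f", "n", "t");
        // The same guide posts interleaved with stale copies from the merged-away regions.
        List<String> withStale = Arrays.asList("b", "f", "b", "f", "n", "t", "n", "t");

        System.out.println(firstOutOfOrder(clean));     // -1
        System.out.println(firstOutOfOrder(withStale)); // 2
    }
}
```

A list that fails this kind of check can no longer be cut into non-overlapping, ascending key ranges, which is the situation the report attributes to the stale stats.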



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
