[ https://issues.apache.org/jira/browse/HBASE-11979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201330#comment-14201330 ]
Esteban Gutierrez commented on HBASE-11979:
-------------------------------------------
My bad [~apurtell], I got sidetracked by other things. Yeah, my guess is that
the problem is in how we calculate the key count for the StoreFiles:
{code}
// NOTE: getFilterEntries could cause under-sized blooms if the user
//       switches bloom type (e.g. from ROW to ROWCOL)
long keyCount = (r.getBloomFilterType() ==
    store.getFamily().getBloomFilterType())
    ? r.getFilterEntries() : r.getEntries();
fd.maxKeyCount += keyCount;
{code}
where the StoreFile can report an underestimated number of keys:
{code}
/**
 * The number of Bloom filter entries in this store file, or an estimate
 * thereof, if the Bloom filter is not loaded. This always returns an upper
 * bound of the number of Bloom filter entries.
 *
 * @return an estimate of the number of Bloom filter entries in this file
 */
public long getFilterEntries() {
  return generalBloomFilter != null ? generalBloomFilter.getKeyCount()
      : reader.getEntries();
}
{code}
So as the compaction makes progress, the running count ends up above the value
from BloomFilter.keycount, causing the percentage to run above 100%.
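To make the arithmetic concrete, here is a minimal standalone sketch (not HBase code) using the two numbers from the log line in the issue description. The class name, the method names, and the idea of clamping the value are all hypothetical and only for illustration; the sketch just assumes that the reported percentage is the processed count divided by the estimated total.

```java
import java.util.Locale;

// Standalone illustration only; none of these names exist in HBase.
public class CompactionProgressSketch {

  // Raw ratio as a percentage. It exceeds 100 whenever the estimated
  // total is smaller than the count actually processed.
  static double rawPercent(long processed, long estimatedTotal) {
    if (estimatedTotal <= 0) {
      return 0.0; // avoid dividing by zero when no estimate is available
    }
    return 100.0 * processed / estimatedTotal;
  }

  // Hypothetical mitigation: never report more than 100%.
  // This hides the symptom; it does not fix the underestimate.
  static double clampedPercent(long processed, long estimatedTotal) {
    return Math.min(100.0, rawPercent(processed, estimatedTotal));
  }

  public static void main(String[] args) {
    long processed = 22683625L;     // numerator from the logged progress
    long estimatedTotal = 6808179L; // denominator from the logged progress
    System.out.println(String.format(Locale.ROOT,
        "raw=%.2f%% clamped=%.2f%%",
        rawPercent(processed, estimatedTotal),
        clampedPercent(processed, estimatedTotal)));
  }
}
```

With those inputs the raw value matches the 333.18% seen in the log. Clamping would only be a cosmetic guard; a real fix would need a total that is not an underestimate.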
> Compaction progress reporting is wrong
> --------------------------------------
>
> Key: HBASE-11979
> URL: https://issues.apache.org/jira/browse/HBASE-11979
> Project: HBase
> Issue Type: Bug
> Reporter: Andrew Purtell
> Assignee: Esteban Gutierrez
> Priority: Minor
> Fix For: 2.0.0, 0.98.8, 0.99.2
>
>
> This is a long-standing problem that previously could be observed in
> regionserver metrics, but we recently added logging for long-running
> compactions, and this has exposed the issue in a new way, e.g.
> {noformat}
> 2014-09-15 14:20:59,450 DEBUG [regionserver8120-largeCompactions-1410813534627] compactions.Compactor: Compaction progress: 22683625/6808179 (333.18%), rate=162.08 kB/sec
> {noformat}
> The 'rate' reported in such logging is consistent and what we were really
> after, but the progress indication is clearly broken and should be fixed.