We added some logging in our production cluster:

2016-06-06,21:37:28,443 INFO org.apache.hadoop.hbase.regionserver.HRegion: scannerReadPoints size is 0 and smallest read point is 4037995
2016-06-06,21:37:51,429 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Failed openScanner
java.io.IOException: Could not seek StoreFileScanner
2016-06-06,21:38:02,103 INFO org.apache.hadoop.hbase.regionserver.HRegion: scannerReadPoints size is 1 and smallest read point is 4038037
2016-06-13,17:17:10,516 INFO org.apache.hadoop.hbase.regionserver.HRegion: scannerReadPoints size is 1 and smallest read point is 4038037
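Note how, after the failed openScanner at 21:37:51, the smallest read point gets stuck at 4038037 and is still 4038037 a week later on 2016-06-13. The lines above come from diagnostic logging we added inside HRegion. Roughly, it is the sketch below (an illustration, not the exact patch; it assumes the 0.94-era HRegion members scannerReadPoints and getSmallestReadPoint() are in scope where the log statement is placed):

  // Illustrative sketch of the added diagnostic log, not the exact patch.
  // getSmallestReadPoint() folds the per-scanner read points kept in
  // scannerReadPoints into the region's memstore read point, so a leaked
  // scanner entry shows up as a smallest read point that never advances.
  LOG.info("scannerReadPoints size is " + scannerReadPoints.size()
      + " and smallest read point is " + getSmallestReadPoint());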
When a new RegionScanner is created, it adds a scanner read point to the region's scannerReadPoints. But if we hit an exception after the read point has been added, the read point stays in the region server, and deletes newer than this mvcc number will never be compacted away. I have created an issue for this: https://issues.apache.org/jira/browse/HBASE-16012

2016-06-10 3:08 GMT+08:00 Stack <[email protected]>:

> Tian-ying:
>
> Can you try the suggestion below to see if it helps?
>
> Do you see the lease expired logs?
>
> St.Ack
>
> On Thu, Jun 2, 2016 at 7:03 PM, hao zhang <[email protected]> wrote:
>
> > Hi, Stack
> >
> > We found this in our production cluster, too. I took a look at the
> > code and found one case which makes major compaction stop working:
> >
> > 1. Put some rows.
> > 2. Scan.
> > 3. Delete.
> > 4. The scanner is not closed properly, so its read point stays in the
> >    region's scannerReadPoints.
> > 5. Major compact. It cannot drop the deletes because there is a scanner
> >    whose read point is smaller than the delete's mvcc number.
> >
> > But if we move the region to a new RS, the read point is reset to the
> > biggest memstoreTS read from the store files, so major compaction works
> > again. I think the try...catch block in the Scanner.leaseExpired() method
> > needs a finally block to close the region scanner properly.
> >
> > public void leaseExpired() {
> >   RegionScannerHolder rsh = scanners.remove(this.scannerName);
> >   if (rsh != null) {
> >     RegionScanner s = rsh.s;
> >     LOG.info("Scanner " + this.scannerName + " lease expired on region "
> >         + s.getRegionInfo().getRegionNameAsString());
> >     try {
> >       Region region =
> >           regionServer.getRegion(s.getRegionInfo().getRegionName());
> >       if (region != null && region.getCoprocessorHost() != null) {
> >         region.getCoprocessorHost().preScannerClose(s);
> >       }
> >       s.close();
> >       if (region != null && region.getCoprocessorHost() != null) {
> >         region.getCoprocessorHost().postScannerClose(s);
> >       }
> >     } catch (IOException e) {
> >       LOG.error("Closing scanner for "
> >           + s.getRegionInfo().getRegionNameAsString(), e);
> >     }
> >   } else {
> >     LOG.warn("Scanner " + this.scannerName + " lease expired, but no related" +
> >         " scanner found, hence no chance to close that related scanner!");
> >   }
> > }
> >
> > 2016-06-02 2:50 GMT+08:00 Stack <[email protected]>:
> >
> > > On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <[email protected]> wrote:
> > >
> > > > Hi, Stack
> > > >
> > > > After moving the region and issuing a major compact on that region, its
> > > > size shrank from 99G down to 24G. So it looks like the region was in a
> > > > bad state that it could not recover from, and close/open fixed the
> > > > issue. And from the region size metric graph, we can see major
> > > > compaction stopped working since March 31, so some bug caused the
> > > > region to enter the bad state... Unfortunately, we don't have DEBUG
> > > > enabled and that was the last region with the issue, so it is hard to
> > > > figure out which bug caused the bad state...
> > > >
> > >
> > > Interesting. So moving it to another RS makes it major-compactable? That
> > > would seem to indicate some state kept in RS memory is preventing the
> > > major compaction from running. Is moving the region a workaround for you
> > > until we figure out what it is, Tian-Ying?
> > >
> > > St.
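To make the suggestion in the Jun 2 mail above concrete, here is a rough sketch of what closing the scanner in a finally block could look like, so the scanner (and with it its entry in the region's scannerReadPoints) is released even when a coprocessor hook throws. This is only an illustration of the idea, not the patch attached to HBASE-16012, and the error handling is deliberately simplified:

  public void leaseExpired() {
    RegionScannerHolder rsh = scanners.remove(this.scannerName);
    if (rsh != null) {
      RegionScanner s = rsh.s;
      LOG.info("Scanner " + this.scannerName + " lease expired on region "
          + s.getRegionInfo().getRegionNameAsString());
      Region region = null;
      try {
        region = regionServer.getRegion(s.getRegionInfo().getRegionName());
        if (region != null && region.getCoprocessorHost() != null) {
          region.getCoprocessorHost().preScannerClose(s);
        }
      } catch (IOException e) {
        LOG.error("Closing scanner for "
            + s.getRegionInfo().getRegionNameAsString(), e);
      } finally {
        try {
          // Always close the scanner, even if preScannerClose() threw, so its
          // read point is removed from the region's scannerReadPoints.
          s.close();
          if (region != null && region.getCoprocessorHost() != null) {
            region.getCoprocessorHost().postScannerClose(s);
          }
        } catch (IOException e) {
          LOG.error("Closing scanner for "
              + s.getRegionInfo().getRegionNameAsString(), e);
        }
      }
    } else {
      LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
          + " scanner found, hence no chance to close that related scanner!");
    }
  }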
> > > > Thanks
> > > > Tian-Ying
> > > >
> > > > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <[email protected]> wrote:
> > > >
> > > > > Hi, Stack
> > > > >
> > > > > Based on the log, the major compaction was run, and it took 5+ hours.
> > > > > I also manually ran major_compact from the hbase shell explicitly to
> > > > > verify.
> > > > >
> > > > > I just moved the region to a different RS and issued a major_compact
> > > > > on that region again; let me see if the major compaction succeeds and
> > > > > I will report back.
> > > > >
> > > > > Thanks
> > > > > Tian-Ying
> > > > >
> > > > > On Sun, May 29, 2016 at 4:35 PM, Stack <[email protected]> wrote:
> > > > >
> > > > > > On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <[email protected]> wrote:
> > > > > >
> > > > > > > Yes, it is 94.26. At a quick glance, I didn't see any put that is
> > > > > > > older than the delete marker's TS, which could go back as far as a
> > > > > > > couple of weeks, since major compaction seems not to have worked
> > > > > > > on it for a long time.
> > > > > > >
> > > > > > > Also it is really strange that if the region is split, then
> > > > > > > everything seems to work as expected. Also we noticed that the
> > > > > > > same region replicated on the slave side is totally normal, i.e.
> > > > > > > at 20+G....
> > > > > > >
> > > > > >
> > > > > > If you move the region to another server, does that work?
> > > > > >
> > > > > > Looking in the 0.94 codebase, I see this in Compactor#compact:
> > > > > >
> > > > > >   // For major compactions calculate the earliest put timestamp
> > > > > >   // of all involved storefiles. This is used to remove
> > > > > >   // family delete marker during the compaction.
> > > > > >   if (majorCompaction) {
> > > > > >     tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
> > > > > >     if (tmp == null) {
> > > > > >       // There's a file with no information, must be an old one
> > > > > >       // assume we have very old puts
> > > > > >       earliestPutTs = HConstants.OLDEST_TIMESTAMP;
> > > > > >     } else {
> > > > > >       earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
> > > > > >     }
> > > > > >   }
> > > > > >
> > > > > > The above is followed by this log line:
> > > > > >
> > > > > >   if (LOG.isDebugEnabled()) {
> > > > > >     LOG.debug("Compacting " + file +
> > > > > >         ", keycount=" + keyCount +
> > > > > >         ", bloomtype=" + r.getBloomFilterType().toString() +
> > > > > >         ", size=" + StringUtils.humanReadableInt(r.length()) +
> > > > > >         ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
> > > > > >         (majorCompaction ? ", earliestPutTs=" + earliestPutTs : ""));
> > > > > >   }
> > > > > >
> > > > > > This prints out earliestPutTs. Do you see that in the logs? Are you
> > > > > > running with DEBUG? Does the earliest put ts preclude our dropping
> > > > > > DeleteFamily?
> > > > > >
> > > > > > Looking more into the code, we retain deletes in the following
> > > > > > circumstances:
> > > > > >
> > > > > >   this.retainDeletesInOutput = scanType == ScanType.MINOR_COMPACT || scan.isRaw();
> > > > > >
> > > > > > So, are we sure we are running a major compaction?
> > > > > >
> > > > > > Otherwise, I'll have to dig in a bit more here.. This stuff is a
> > > > > > little involved.
> > > > > > St.Ack
> > > > > >
> > > > > > > On Fri, May 27, 2016 at 3:13 PM, Stack <[email protected]> wrote:
> > > > > > >
> > > > > > > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > We saw a very strange case in one of our production clusters.
> > > > > > > > > A couple of regions cannot get their deleted rows or delete
> > > > > > > > > markers removed even after major compaction. However, when the
> > > > > > > > > region triggered a split (we set 100G for auto split), the
> > > > > > > > > deletion worked: the 100G region became two 10G daughter
> > > > > > > > > regions, and all the delete markers were gone.
> > > > > > > > >
> > > > > > > > > Also, the same region in the slave cluster (through
> > > > > > > > > replication) has a normal size at about 20+G.
> > > > > > > > >
> > > > > > > > > BTW, the delete markers in the regions are mostly
> > > > > > > > > DeleteFamily, if it matters.
> > > > > > > > >
> > > > > > > > > This is really weird. Does anyone have any clue about this
> > > > > > > > > strange behavior?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Tian-Ying
> > > > > > > > >
> > > > > > > >
> > > > > > > > This is 0.94, Tian-Ying?
> > > > > > > >
> > > > > > > > It looks like only the DeleteFamily is retained; do you see
> > > > > > > > instances where there may have been versions older than the
> > > > > > > > DeleteFamily that are also retained post-major-compaction?
> > > > > > > >
> > > > > > > > St.Ack
> > > > > > > >
> > > > > > > > > A snippet of the HFile generated by the major compaction:
> > > > > > > > >
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
> > > > > > > > > V:
> > > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@ \x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
> > > > > > > > > V:
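To tie the thread together, here is a toy model, not HBase code (the class and every name in it are made up for illustration). The idea, as described in the first mail above, is that a delete marker can only be dropped by a major compaction once its mvcc number is below the region's smallest scanner read point; a single read point leaked by a failed openScanner therefore pins that minimum and keeps DeleteFamily markers like the ones above alive through every major compaction.

  import java.util.Map;
  import java.util.concurrent.ConcurrentHashMap;

  // Toy model of the failure mode discussed in this thread; not HBase code.
  public class LeakedReadPointDemo {

    // Stand-in for HRegion's scannerReadPoints: scanner name -> read point.
    static final Map<String, Long> scannerReadPoints = new ConcurrentHashMap<String, Long>();

    // Smallest read point across the memstore read point and all open scanners.
    static long smallestReadPoint(long memstoreReadPoint) {
      long smallest = memstoreReadPoint;
      for (long readPoint : scannerReadPoints.values()) {
        smallest = Math.min(smallest, readPoint);
      }
      return smallest;
    }

    // A delete marker may be purged only if every open scanner already sees it.
    static boolean canPurgeDeleteMarker(long markerMvcc, long memstoreReadPoint) {
      return markerMvcc < smallestReadPoint(memstoreReadPoint);
    }

    public static void main(String[] args) {
      // openScanner registered a read point, then failed with an IOException
      // and never cleaned up; the entry is leaked, as in the logs above.
      scannerReadPoints.put("leaked-scanner", 4038037L);

      long memstoreReadPoint = 5000000L;  // the region has long since moved on
      long deleteMarkerMvcc = 4500000L;   // a delete written after the leak

      // Prints false: the leaked read point (4038037) pins the smallest read
      // point, so major compaction keeps retaining the delete marker.
      System.out.println(canPurgeDeleteMarker(deleteMarkerMvcc, memstoreReadPoint));

      // Once the leaked entry goes away (scanner closed properly, or the
      // region is moved to another RegionServer), the marker can be purged.
      scannerReadPoints.remove("leaked-scanner");
      System.out.println(canPurgeDeleteMarker(deleteMarkerMvcc, memstoreReadPoint));
    }
  }

This also matches the observed workaround: moving the region to another RegionServer rebuilds the scanner read points from scratch, so nothing pins the minimum any more and the next major compaction can drop the markers.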
