Tian-ying: Can you try the suggestion below to see if it helps?
Do you see the lease expired logs?
St.Ack

On Thu, Jun 2, 2016 at 7:03 PM, hao zhang <[email protected]> wrote:

> Hi, Stack
>
> We found this in our production cluster, too. I took a look at the code and found one case which will make the major compact not work.
>
> 1. put some rows
> 2. scan
> 3. delete
> 4. the scanner doesn't get closed properly, so it keeps its read point in the region's scannerReadPoints.
> 5. major compact. It can't work because we have a scanner with a smaller read point than the delete.
>
> But if we move the region to a new RS, the scannerReadPoints will update to the biggest memstoreTs from the storefiles, so major compact will work. I think the try...catch in the Scanner.leaseExpired() method needs a finally block to close the region scanner properly.
>
>   public void leaseExpired() {
>     RegionScannerHolder rsh = scanners.remove(this.scannerName);
>     if (rsh != null) {
>       RegionScanner s = rsh.s;
>       LOG.info("Scanner " + this.scannerName + " lease expired on region "
>           + s.getRegionInfo().getRegionNameAsString());
>       try {
>         Region region = regionServer.getRegion(s.getRegionInfo().getRegionName());
>         if (region != null && region.getCoprocessorHost() != null) {
>           region.getCoprocessorHost().preScannerClose(s);
>         }
>         s.close();
>         if (region != null && region.getCoprocessorHost() != null) {
>           region.getCoprocessorHost().postScannerClose(s);
>         }
>       } catch (IOException e) {
>         LOG.error("Closing scanner for "
>             + s.getRegionInfo().getRegionNameAsString(), e);
>       }
>     } else {
>       LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
>           + " scanner found, hence no chance to close that related scanner!");
>     }
>   }
>
> 2016-06-02 2:50 GMT+08:00 Stack <[email protected]>:
>
> > On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <[email protected]> wrote:
> >
> > > Hi, Stack
> > >
> > > After moving the region and issuing a major compact on that region, its size shrank from 99G down to 24G. So it looks like the region was in a bad state that it could not recover from, and close/open fixed the issue. And from the region size metric graph, we can see major compaction stopped working since March 31, so some bug caused the region to enter the bad state... Unfortunately, we don't have DEBUG enabled and that was the last region with the issue, so it is hard to figure out what bug caused the bad state...
> >
> > Interesting. So moving it to another RS makes it major-compactable? That would seem to indicate some state kept in the RS memory is preventing the major compaction from running. Is moving the region a workaround for you until we figure out what it is, Tian-Ying?
> >
> > St.
> >
> > > Thanks
> > > Tian-Ying
> > >
> > > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <[email protected]> wrote:
> > >
> > > > Hi, Stack
> > > >
> > > > Based on the log, the major compaction was run, and it took 5+ hours. And I also manually ran major_compact from the hbase shell explicitly to verify.
> > > >
> > > > I just moved the region to a different RS and issued a major_compact on that region again; let me see if the major compaction can succeed and I will report back.
> > > >
> > > > Thanks
> > > > Tian-Ying
> > > >
> > > > On Sun, May 29, 2016 at 4:35 PM, Stack <[email protected]> wrote:
> > > >
> > > > > On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <[email protected]> wrote:
> > > > >
> > > > > > Yes, it is 94.26. At a quick glance, I didn't see any put that is older than the delete marker's TS, which could go as far back as a couple of weeks, since major compaction seems not to have run on it for a long time.
> > > > > >
> > > > > > Also it is really strange that if the region is split, then everything seems to work as expected. Also we noticed the same region replicated on the slave side is totally normal, i.e. at 20+G....
> > > > >
> > > > > If you move the region to another server, does that work?
> > > > >
> > > > > Looking in the 0.94 codebase, I see this in Compactor#compact:
> > > > >
> > > > >   // For major compactions calculate the earliest put timestamp
> > > > >   // of all involved storefiles. This is used to remove
> > > > >   // family delete marker during the compaction.
> > > > >   if (majorCompaction) {
> > > > >     tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
> > > > >     if (tmp == null) {
> > > > >       // There's a file with no information, must be an old one
> > > > >       // assume we have very old puts
> > > > >       earliestPutTs = HConstants.OLDEST_TIMESTAMP;
> > > > >     } else {
> > > > >       earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
> > > > >     }
> > > > >   }
> > > > >
> > > > > The above is followed by this log line:
> > > > >
> > > > >   if (LOG.isDebugEnabled()) {
> > > > >     LOG.debug("Compacting " + file +
> > > > >         ", keycount=" + keyCount +
> > > > >         ", bloomtype=" + r.getBloomFilterType().toString() +
> > > > >         ", size=" + StringUtils.humanReadableInt(r.length()) +
> > > > >         ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
> > > > >         (majorCompaction ? ", earliestPutTs=" + earliestPutTs : ""));
> > > > >   }
> > > > >
> > > > > This prints out earliestPutTs. Do you see that in the logs? Are you running with DEBUG? Does the earliest put ts preclude our dropping the delete family?
> > > > >
> > > > > Looking more in the code, we retain deletes in the following circumstances:
> > > > >
> > > > >   this.retainDeletesInOutput = scanType == ScanType.MINOR_COMPACT || scan.isRaw();
> > > > >
> > > > > So, for sure we are running major compaction?
> > > > >
> > > > > Otherwise, I have to dig in a bit more here.. This stuff is a little involved.
> > > > > St.Ack
> > > > >
> > > > > > On Fri, May 27, 2016 at 3:13 PM, Stack <[email protected]> wrote:
> > > > > >
> > > > > > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > We saw a very strange case in one of our production clusters. A couple of regions cannot get their deleted rows or delete markers removed even after major compaction. However, when the region triggered a split (we set 100G for auto split), the deletion worked. The 100G region became two 10G daughter regions, and all the delete markers are gone.
> > > > > > > >
> > > > > > > > Also, the same region in the slave cluster (through replication) has a normal size at about 20+G.
> > > > > > > >
> > > > > > > > BTW, the delete markers in the regions are mostly DeleteFamily, if it matters.
> > > > > > > >
> > > > > > > > This is really weird. Anyone have a clue about this strange behavior?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > > Tian-Ying
> > > > > > >
> > > > > > > These 0.94, Tian-Ying?
> > > > > > >
> > > > > > > It looks like only the DeleteFamily is retained; do you see instances where there may have been versions older than the DeleteFamily that are also retained post-major-compaction?
> > > > > > >
> > > > > > > St.Ack
> > > > > > >
> > > > > > > > A snippet of the HFile generated by the major compaction:
> > > > > > > >
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
> > > > > > > > V:
> > > > > > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@\x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
> > > > > > > > V:
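For illustration only: the finally-based close that hao zhang suggests above might be shaped roughly like the sketch below. It is a reshaping of the same method quoted in his mail, reusing the names from that snippet (scanners, RegionScannerHolder, regionServer, Region) and not verified against any released HBase version; the only point it makes is that the scanner is closed even when the region lookup or the pre-close coprocessor hook throws, so the read point it holds in the region's scannerReadPoints is released.

  public void leaseExpired() {
    RegionScannerHolder rsh = scanners.remove(this.scannerName);
    if (rsh == null) {
      LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
          + " scanner found, hence no chance to close that related scanner!");
      return;
    }
    RegionScanner s = rsh.s;
    LOG.info("Scanner " + this.scannerName + " lease expired on region "
        + s.getRegionInfo().getRegionNameAsString());
    Region region = null;
    try {
      region = regionServer.getRegion(s.getRegionInfo().getRegionName());
      if (region != null && region.getCoprocessorHost() != null) {
        region.getCoprocessorHost().preScannerClose(s);
      }
    } catch (IOException e) {
      LOG.error("Pre-close hook failed for "
          + s.getRegionInfo().getRegionNameAsString(), e);
    } finally {
      // Always close the scanner so the read point it pins in the region's
      // scannerReadPoints is dropped and major compaction can discard deletes.
      try {
        s.close();
        if (region != null && region.getCoprocessorHost() != null) {
          region.getCoprocessorHost().postScannerClose(s);
        }
      } catch (IOException e) {
        LOG.error("Closing scanner for "
            + s.getRegionInfo().getRegionNameAsString(), e);
      }
    }
  }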
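The five reproduction steps hao zhang lists (put, scan, delete, leave the scanner unclosed, major compact) can be exercised from the client side. A minimal sketch, assuming the 0.94-era client API and a hypothetical table "test_table" with family "d"; the table, row, and column names are made up for illustration, and leaving the scanner unclosed is the deliberate point of the exercise:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Delete;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.util.Bytes;

  public class UnclosedScannerRepro {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HTable table = new HTable(conf, "test_table");   // hypothetical table

      // 1. put some rows
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      table.put(put);

      // 2. scan ... 4. but never close the scanner: its read point stays
      // registered in the region's scannerReadPoints until the lease expires
      // (and, per the thread, possibly beyond if leaseExpired() fails early).
      ResultScanner scanner = table.getScanner(new Scan());
      scanner.next();
      // scanner.close() intentionally omitted

      // 3. delete
      table.delete(new Delete(Bytes.toBytes("row1")));

      // 5. major compact: deletes newer than the oldest live read point
      // cannot be discarded, so the markers (and the data) stick around.
      HBaseAdmin admin = new HBaseAdmin(conf);
      admin.majorCompact("test_table");
      admin.close();
    }
  }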
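The workaround discussed up-thread (move the region off its current regionserver, then major-compact it) can also be driven through the admin API rather than the shell. A rough sketch with placeholder region and server names, again assuming the 0.94-era HBaseAdmin signatures:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  public class MoveAndMajorCompact {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      HBaseAdmin admin = new HBaseAdmin(conf);

      // Placeholders: the region's encoded name (the hex suffix of the
      // region name) and a destination server as "host,port,startcode".
      byte[] encodedRegionName = Bytes.toBytes("ENCODED_REGION_NAME");
      byte[] destServer = Bytes.toBytes("HOST,PORT,STARTCODE");

      // Moving the region closes and reopens it, which rebuilds the in-memory
      // scannerReadPoints state the thread suspects is pinning the deletes.
      admin.move(encodedRegionName, destServer);

      // Then request a major compaction of the region (full region name here).
      admin.majorCompact("FULL_REGION_NAME");

      admin.close();
    }
  }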
