So you feel the call to s.close() would have thrown an exception, and so
the removal of the entry from scannerReadPoints would not have happened.
Hmm.. sounds possible
public synchronized void close() {
  if (storeHeap != null) {
    storeHeap.close();
    storeHeap = null;
  }
  if (joinedHeap != null) {
    joinedHeap.close();
    joinedHeap = null;
  }
  // no need to synchronize here.
  scannerReadPoints.remove(this);
  this.filterClosed = true;
}
We may need to change this close() impl and make sure the removal from
scannerReadPoints happens in a finally block.
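
Something like this rough, untested sketch of the change:

public synchronized void close() {
  try {
    if (storeHeap != null) {
      storeHeap.close();
      storeHeap = null;
    }
    if (joinedHeap != null) {
      joinedHeap.close();
      joinedHeap = null;
    }
  } finally {
    // Always drop this scanner's entry, even if closing a heap throws,
    // so the region's smallest read point can advance and major
    // compaction can drop the deletes.
    scannerReadPoints.remove(this);
    this.filterClosed = true;
  }
}
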
-Anoop-
On Fri, Jun 3, 2016 at 7:33 AM, hao zhang <[email protected]> wrote:
> Hi, Stack
>
> We ran into this in our production cluster, too. I took a look at the
> code and found one case that can make major compaction not work.
>
> 1. put some rows
> 2. scan
> 3. delete
> 4. the scanner doesn't get closed properly, so it keeps its read point in
> the region's scannerReadPoints.
> 5. major compact. It can't work because we still have a scanner with a
> smaller read point than the delete (a rough sketch of the sequence is
> below).
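>
> Just to illustrate the sequence with the client API (conf, table and row
> names are made up; the key point is that the scanner from step 4 never
> gets closed on the server side):
>
> HTable table = new HTable(conf, "t1");
> // 1. put some rows
> Put put = new Put(Bytes.toBytes("row1"));
> put.add(Bytes.toBytes("d"), Bytes.toBytes("q"), Bytes.toBytes("v"));
> table.put(put);
> // 2. scan; the scanner is opened on the RS and records its read point
> //    in the region's scannerReadPoints
> ResultScanner scanner = table.getScanner(new Scan());
> scanner.next();
> // 3. delete
> table.delete(new Delete(Bytes.toBytes("row1")));
> // 4. the scanner is never closed properly (e.g. the close on the RS
> //    fails), so the old read point stays behind
> // 5. major compact: the delete cannot be dropped because of that read point
> new HBaseAdmin(conf).majorCompact("t1");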
>
> But if we move the region to a new RS, the scannerReadPoints will be
> re-initialized to the biggest memstoreTS from the storefiles, so major
> compaction will work again. I think the try...catch block in the
> Scanner.leaseExpired() method needs a finally block so the region scanner
> always gets closed (current code below, followed by a rough sketch of
> what I mean).
>
> public void leaseExpired() {
>   RegionScannerHolder rsh = scanners.remove(this.scannerName);
>   if (rsh != null) {
>     RegionScanner s = rsh.s;
>     LOG.info("Scanner " + this.scannerName + " lease expired on region "
>         + s.getRegionInfo().getRegionNameAsString());
>     try {
>       Region region = regionServer.getRegion(s.getRegionInfo().getRegionName());
>       if (region != null && region.getCoprocessorHost() != null) {
>         region.getCoprocessorHost().preScannerClose(s);
>       }
>       s.close();
>       if (region != null && region.getCoprocessorHost() != null) {
>         region.getCoprocessorHost().postScannerClose(s);
>       }
>     } catch (IOException e) {
>       LOG.error("Closing scanner for "
>           + s.getRegionInfo().getRegionNameAsString(), e);
>     }
>   } else {
>     LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
>         + " scanner found, hence no chance to close that related scanner!");
>   }
> }
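>
> Something along these lines, just as an untested sketch of what I mean:
>
> public void leaseExpired() {
>   RegionScannerHolder rsh = scanners.remove(this.scannerName);
>   if (rsh == null) {
>     LOG.warn("Scanner " + this.scannerName + " lease expired, but no related"
>         + " scanner found, hence no chance to close that related scanner!");
>     return;
>   }
>   RegionScanner s = rsh.s;
>   LOG.info("Scanner " + this.scannerName + " lease expired on region "
>       + s.getRegionInfo().getRegionNameAsString());
>   Region region = null;
>   try {
>     region = regionServer.getRegion(s.getRegionInfo().getRegionName());
>     if (region != null && region.getCoprocessorHost() != null) {
>       region.getCoprocessorHost().preScannerClose(s);
>     }
>   } catch (IOException e) {
>     LOG.error("Closing scanner for "
>         + s.getRegionInfo().getRegionNameAsString(), e);
>   } finally {
>     try {
>       // close the scanner no matter what, so its entry is removed
>       // from the region's scannerReadPoints
>       s.close();
>       if (region != null && region.getCoprocessorHost() != null) {
>         region.getCoprocessorHost().postScannerClose(s);
>       }
>     } catch (IOException e) {
>       LOG.error("Closing scanner for "
>           + s.getRegionInfo().getRegionNameAsString(), e);
>     }
>   }
> }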
>
>
> 2016-06-02 2:50 GMT+08:00 Stack <[email protected]>:
>
>> On Wed, Jun 1, 2016 at 10:56 AM, Tianying Chang <[email protected]> wrote:
>>
>> > Hi, Stack
>> >
>> > After moving the region and issuing a major compaction on that region,
>> > its size shrank from 99G down to 24G. So it looks like the region was in
>> > a bad state it could not recover from, and close/open fixed the issue.
>> > From the region size metric graph, we can see major compaction stopped
>> > working on March 31, so some bug caused the region to enter a bad
>> > state... Unfortunately, we don't have DEBUG enabled and that was the
>> > last region with the issue, so it is hard to figure out what bug caused
>> > the bad state...
>> >
>> >
>> Interesting. So moving it to another RS makes it major-compactable? That
>> would seem to indicate some state kept in the RS memory is preventing the
>> major compaction from running. Is moving the region a workaround for you
>> until we figure out what it is, Tian-Ying?
>>
>> St.
>>
>>
>>
>> > Thanks
>> > Tian-Ying
>> >
>> > On Tue, May 31, 2016 at 3:43 PM, Tianying Chang <[email protected]>
>> wrote:
>> >
>> > > Hi, Stack
>> > >
>> > > Based on the log, the major compaction was run, and it took 5+ hours.
>> > > I also manually ran major_compact from the hbase shell to verify.
>> > >
>> > > I just moved the region to a different RS and issued a major_compact on
>> > > that region again; let me see if the major compaction succeeds, and I
>> > > will report back.
>> > >
>> > > Thanks
>> > > Tian-Ying
>> > >
>> > > On Sun, May 29, 2016 at 4:35 PM, Stack <[email protected]> wrote:
>> > >
>> > >> On Fri, May 27, 2016 at 3:17 PM, Tianying Chang <[email protected]>
>> > >> wrote:
>> > >>
>> > >> > Yes, it is 0.94.26. At a quick glance, I didn't see any put that is
>> > >> > older than the delete marker's TS, which could go back about a couple
>> > >> > of weeks, since major compaction hasn't worked on it for a long time.
>> > >> >
>> > >> > Also, it is really strange that if the region is split, everything
>> > >> > seems to work as expected. Also, we noticed the same region replicated
>> > >> > on the slave side is totally normal, i.e. at 20+G....
>> > >> >
>> > >> >
>> > >> If you move the region to another server, does that work?
>> > >>
>> > >> Looking in the 0.94 codebase, I see this in Compactor#compact:
>> > >>
>> > >>
>> > >> // For major compactions calculate the earliest put timestamp
>> > >> // of all involved storefiles. This is used to remove
>> > >> // family delete marker during the compaction.
>> > >> if (majorCompaction) {
>> > >>   tmp = fileInfo.get(StoreFile.EARLIEST_PUT_TS);
>> > >>   if (tmp == null) {
>> > >>     // There's a file with no information, must be an old one
>> > >>     // assume we have very old puts
>> > >>     earliestPutTs = HConstants.OLDEST_TIMESTAMP;
>> > >>   } else {
>> > >>     earliestPutTs = Math.min(earliestPutTs, Bytes.toLong(tmp));
>> > >>   }
>> > >> }
>> > >>
>> > >>
>> > >> The above is followed by this log line:
>> > >>
>> > >>
>> > >> if (LOG.isDebugEnabled()) {
>> > >>   LOG.debug("Compacting " + file +
>> > >>     ", keycount=" + keyCount +
>> > >>     ", bloomtype=" + r.getBloomFilterType().toString() +
>> > >>     ", size=" + StringUtils.humanReadableInt(r.length()) +
>> > >>     ", encoding=" + r.getHFileReader().getEncodingOnDisk() +
>> > >>     (majorCompaction? ", earliestPutTs=" + earliestPutTs: ""));
>> > >> }
>> > >>
>> > >> This prints out earliestPutTs. Do you see that in the logs? Are you
>> > >> running with DEBUG? Does the earliest put ts preclude our dropping the
>> > >> delete family markers?
>> > >>
>> > >>
>> > >> Looking more at the code, we retain deletes in the following circumstance:
>> > >>
>> > >> this.retainDeletesInOutput = scanType == ScanType.MINOR_COMPACT ||
>> > >>     scan.isRaw();
>> > >>
>> > >>
>> > >> So, for sure we are running major compaction?
>> > >>
>> > >> Otherwise, have to dig in a bit more here.. This stuff is a little
>> > >> involved.
>> > >> St.Ack
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> > On Fri, May 27, 2016 at 3:13 PM, Stack <[email protected]> wrote:
>> > >> >
>> > >> > > On Fri, May 27, 2016 at 2:32 PM, Tianying Chang <
>> [email protected]>
>> > >> > wrote:
>> > >> > >
>> > >> > > > Hi,
>> > >> > > >
>> > >> > > > We saw a very strange case in one of our production clusters. A
>> > >> > > > couple of regions cannot get their deleted rows or delete markers
>> > >> > > > removed even after major compaction. However, when the region
>> > >> > > > triggered a split (we set 100G for auto split), the deletion
>> > >> > > > worked. The 100G region became two 10G daughter regions, and all
>> > >> > > > the delete markers are gone.
>> > >> > > >
>> > >> > > > Also, the same region in the slave cluster (through replication)
>> > >> > > > has a normal size of about 20+G.
>> > >> > > >
>> > >> > > > BTW, the delete markers in the regions are mostly DeleteFamily, if
>> > >> > > > it matters.
>> > >> > > >
>> > >> > > > This is really weird. Does anyone have a clue about this strange
>> > >> > > > behavior?
>> > >> > > >
>> > >> > > > Thanks
>> > >> > > > Tian-Ying
>> > >> > > >
>> > >> > > These 0.94, Tian-Ying?
>> > >> > >
>> > >> > > It looks like only the DeleteFamily is retained; do you see instances
>> > >> > > where there may have been versions older than the DeleteFamily that
>> > >> > > are also retained post-major-compaction?
>> > >> > >
>> > >> > > St.Ack
>> > >> > >
>> > >> > >
>> > >> > >
>> > >> > > > A snippet of the HFile generated by the major compaction:
>> > >> > > >
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459802643246/DeleteFamily/vlen=0/ts=2290248786
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459799913003/DeleteFamily/vlen=0/ts=2289446916
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459797181831/DeleteFamily/vlen=0/ts=2288670451
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459794447388/DeleteFamily/vlen=0/ts=2287911443
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459791708803/DeleteFamily/vlen=0/ts=2287213792
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459788978387/DeleteFamily/vlen=0/ts=2286488738
>> > >> > > > V:
>> > >> > > > K: \xA0\x00\x00L\x1A@\x1CBe\x00\x00\x08m\x03\x1A@
>> > >> > > > \x10\x00?PF/d:/1459786243642/DeleteFamily/vlen=0/ts=2285778927
>> > >> > > > V:
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>