[ 
https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545873#comment-16545873
 ] 

Francis Liu edited comment on HBASE-20704 at 7/17/18 12:14 AM:
---------------------------------------------------------------

{quote}expecting eventual GC to call a finalizer that cleans things up
{quote}
AFAIK it should get cleaned up either via another next() RPC (which fails 
because the region is closed) or via scanner lease expiration processing. The 
readers won't become garbage until the scanner state is cleaned up. In any case, 
these would be objects that give GC more work, though it doesn't sound like it 
would be significant, and it is generally just part of normal operation, i.e. 
scanner leases expiring and pauses between next() RPC calls. 

The trade-off, though, is that concurrent threads now have to access a map 
during StoreFileScanner creation and close for streaming scans. The overhead may 
be negligible, assuming streaming scans are meant for doing large scans. I've 
attached a rough patch showing how it would look. Let me know what you think. 
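
To make the trade-off concrete, here is a minimal sketch of the kind of shared structure involved. It is not taken from the attached patch: the names StreamReaderRegistry and StreamReader are hypothetical stand-ins, and the concurrent set (backed by a ConcurrentHashMap) is only an assumption about how per-scanner readers might be tracked so that region close can find and close them.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not HBase code: tracks readers opened for streaming scans. */
final class StreamReaderRegistry {
    /** Stand-in for a per-scanner store file reader. */
    static final class StreamReader {
        private volatile boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    // Concurrent set backed by a ConcurrentHashMap; scanner threads touch it
    // on every streaming scanner creation and close.
    private final Set<StreamReader> openReaders = ConcurrentHashMap.newKeySet();

    /** Called when a streaming scanner is created. */
    StreamReader open() {
        StreamReader reader = new StreamReader();
        openReaders.add(reader);
        return reader;
    }

    /** Called when a scanner closes normally. */
    void close(StreamReader reader) {
        if (openReaders.remove(reader)) {
            reader.close();
        }
    }

    /** Called on region close: closes whatever scanners left behind. */
    int closeAll() {
        int closedCount = 0;
        for (StreamReader reader : openReaders) {
            // remove() returns true for exactly one caller, so a reader is
            // never closed twice even if a scanner closes it concurrently.
            if (openReaders.remove(reader)) {
                reader.close();
                closedCount++;
            }
        }
        return closedCount;
    }

    int openCount() { return openReaders.size(); }
}
```

The extra cost per scan is one add and one remove on the shared set, which is why it should matter little if streaming scans are long-running.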


was (Author: toffer):
{quote}expecting eventual GC to call a finalizer that cleans things up
{quote}
AFAIK it should get cleaned up either via another next() RPC (which fails 
because the region is closed) or via scanner lease expiration processing. The 
readers won't become garbage until the scanner state is cleaned up. In any case, 
these would be objects that give GC more work.

The trade-off, though, is that concurrent threads now have to access a map 
during StoreFileScanner creation and close for streaming scans. The overhead may 
be negligible, assuming streaming scans are meant for doing large scans. I've 
attached a rough patch showing how it would look. Let me know what you think. 

> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
>                 Key: HBASE-20704
>                 URL: https://issues.apache.org/jira/browse/HBASE-20704
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>            Priority: Critical
>         Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, 
> HBASE-20704.003.patch, HBASE-20704.004.draft.patch
>
>
> During region close, compacted files which have not yet been archived by the 
> discharger are archived as part of the region closing process. It is 
> important that these files are wholly archived to ensure data consistency. 
> Otherwise, a storefile containing delete tombstones can be archived while older 
> storefiles containing cells that were supposed to be deleted are left 
> unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped during archiving if it has 
> read references (i.e. open scanners). This behavior is correct when the 
> discharger chore runs, but on region close consistency is more 
> important, so we should add a special case to ignore any references on the 
> storefile and go ahead and archive it. 
> The attached patch contains a unit test that reproduces the problem and the 
> proposed fix.
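
The special case described in the issue can be illustrated with a small sketch. It is not the code from the attached patches: CompactedFileArchiver, CompactedFile, refCount and the regionClosing flag are hypothetical names chosen for illustration, and the real logic presumably lives in HBase's store and discharger code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not HBase code: picks compacted files to archive. */
final class CompactedFileArchiver {
    /** Stand-in for a compacted store file with a count of open scanners. */
    static final class CompactedFile {
        final String name;
        final int refCount;

        CompactedFile(String name, int refCount) {
            this.name = name;
            this.refCount = refCount;
        }
    }

    /**
     * The chore path (regionClosing == false) skips files that still have
     * read references. The region-close path ignores references so that
     * the whole set of compacted files is archived together.
     */
    static List<String> selectForArchive(List<CompactedFile> files, boolean regionClosing) {
        List<String> toArchive = new ArrayList<>();
        for (CompactedFile file : files) {
            if (regionClosing || file.refCount == 0) {
                toArchive.add(file.name);
            }
        }
        return toArchive;
    }
}
```

With one unreferenced file and one file still held by a scanner, the chore path selects only the first, while the region-close path selects both, which avoids the partial archive that could undelete cells.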



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
