[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110224#comment-15110224 ]

deepankar commented on HBASE-15101:

It is on branch-1 only. (I have ported other patches as well, so when I pulled this in there were no significant merge conflicts.)

> Leaked References to StoreFile.Reader after HBASE-13082
> ---
>
> Key: HBASE-15101
> URL: https://issues.apache.org/jira/browse/HBASE-15101
> Project: HBase
> Issue Type: Bug
> Components: HFile, io
> Affects Versions: 2.0.0
> Reporter: deepankar
> Assignee: deepankar
> Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-15101-v1.patch, HBASE-15101-v2.patch, HBASE-15101-v3.patch, HBASE-15101-v4.patch, HBASE-15101.patch
>
>
> We observed in production that after a region server dies there is a huge number of hfiles in that region for the region server running the version with HBASE-13082. The doc says that this is expected to happen, but we found one place where scanners are not being closed. If the scanners are not closed, their references are not decremented, and that leads to a huge number of store files not being finalized.
> All I was able to find is in selectScannersFrom, where we discard some of the scanners and do not close them. I am attaching a patch for that.
> Also, to avoid these issues, should the files that are done be logged and finalized (moved to archive) as part of the region close operation? This would solve any leaks that can happen and does not cause any dire consequences?

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
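The leak described in the issue text can be illustrated with a minimal sketch. The classes below (`Reader`, `Scanner`, `select`) are hypothetical stand-ins, not the real HBase classes or the actual patch; they only assume what the description states: opening a scanner takes a reference on its reader, closing it releases the reference, and a selection step that drops scanners without closing them leaves the count above zero.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the leak; none of these types are HBase classes.
final class ScannerLeakSketch {

  /** Stand-in for a store file reader that counts the scanners using it. */
  static final class Reader {
    private final AtomicInteger refCount = new AtomicInteger();

    int refCount() {
      return refCount.get();
    }
  }

  /** Stand-in for a scanner: opening takes a reference, closing releases it. */
  static final class Scanner implements AutoCloseable {
    private final Reader reader;
    private final boolean relevant;
    private boolean closed;

    Scanner(Reader reader, boolean relevant) {
      this.reader = reader;
      this.relevant = relevant;
      reader.refCount.incrementAndGet();
    }

    boolean isRelevant() {
      return relevant;
    }

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        reader.refCount.decrementAndGet();
      }
    }
  }

  /** Keeps the relevant scanners; discarded ones are closed only if asked. */
  static List<Scanner> select(List<Scanner> all, boolean closeDiscarded) {
    List<Scanner> kept = new ArrayList<>();
    for (Scanner s : all) {
      if (s.isRelevant()) {
        kept.add(s);
      } else if (closeDiscarded) {
        s.close(); // release the reference held by the discarded scanner
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    Reader reader = new Reader();
    List<Scanner> all = Arrays.asList(new Scanner(reader, true), new Scanner(reader, false));
    for (Scanner s : select(all, true)) {
      s.close(); // normal close of the scanners that were actually used
    }
    System.out.println("references left on reader: " + reader.refCount());
  }
}
```

With `closeDiscarded` set to false, the discarded scanner's reference is never released, which is the situation the report attributes to selectScannersFrom.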
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110198#comment-15110198 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

[~dvdreddy] I have a question: are you observing the reference leak described above in your patched version or in trunk? If it is in your patched version, can you tell me the branch onto which HBASE-13082 was backported? I have some doubts about one of the scan flows with this patch in the branch-1 case.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108406#comment-15108406 ]

Hudson commented on HBASE-15101:

FAILURE: Integrated in HBase-Trunk_matrix #645 (See [https://builds.apache.org/job/HBase-Trunk_matrix/645/])
HBASE-15101 Leaked References to StoreFile.Reader after HBASE-13082 (ramkrishna: rev 93e200d52b29d35ad5a98eed9eea05783960f6b2)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108295#comment-15108295 ]

Hadoop QA commented on HBASE-15101:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 5m 8s | master passed |
| +1 | compile | 1m 44s | master passed with JDK v1.8.0 |
| +1 | compile | 1m 4s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 0m 32s | master passed |
| +1 | mvneclipse | 0m 35s | master passed |
| -1 | findbugs | 3m 22s | hbase-server in master has 1 extant Findbugs warnings. |
| +1 | javadoc | 0m 35s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 39s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 55s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 55s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | checkstyle | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 27m 13s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 3m 27s | the patch passed |
| +1 | javadoc | 0m 46s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 51s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 129m 49s | hbase-server in the patch failed with JDK v1.8.0. |
| +1 | unit | 95m 20s | hbase-server in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 14s | Patch does not generate ASF License warnings. |
| | | 274m 42s | |

|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.master.balancer.TestStochasticLoadBalancer |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12783280/HBASE-15101-v4.patch |
| JIRA Issue | HBASE-15101 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108082#comment-15108082 ]

Anoop Sam John commented on HBASE-15101:

+1
Yes, adding close(false) will not impact anything, and it is of no use either. It will just add the scanner to the delayedClosing Set. In the upper layer, KVHeap, we will anyway add this StoreScanner itself to its delayedClosing Set based on the returned boolean. The shipped() call comes first to KVHeap, which will then close() this StoreScanner fully (closing and clearing everything), so there won't be any shipped() call on this StoreScanner at all. The first addition to the delayedClosing Set in StoreScanner is therefore of no value, but it does no harm either, so go for commit. In any case we no longer need the Set in StoreScanner, and that can be done in another IA jira.
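The delayed-close flow discussed in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the KVHeap or StoreScanner code: the names `Heap`, `Resource`, `close(resource, immediately)` and `pending()` are invented for the example, and the only behavior taken from the thread is that a close(false)-style call parks the scanner in a delayed-closing collection and a later shipped() call closes everything parked there.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a delayed-close set; not the HBase implementation.
final class DelayedCloseSketch {

  /** Stand-in for a scanner whose underlying resources must be released. */
  static final class Resource {
    private boolean closed;

    void close() {
      closed = true;
    }

    boolean isClosed() {
      return closed;
    }
  }

  /** Stand-in for the layer that owns the delayed-closing collection. */
  static final class Heap {
    private final List<Resource> delayedClose = new ArrayList<>();

    /** A deferred close only parks the resource; an immediate one releases it. */
    void close(Resource r, boolean immediately) {
      if (immediately) {
        r.close();
      } else {
        delayedClose.add(r);
      }
    }

    /** Called once results have been handed off; parked resources are released. */
    void shipped() {
      for (Resource r : delayedClose) {
        r.close();
      }
      delayedClose.clear();
    }

    int pending() {
      return delayedClose.size();
    }
  }

  public static void main(String[] args) {
    Heap heap = new Heap();
    Resource scanner = new Resource();
    heap.close(scanner, false);
    System.out.println("closed before shipped: " + scanner.isClosed());
    heap.shipped();
    System.out.println("closed after shipped: " + scanner.isClosed());
  }
}
```

In this model a second layer keeping its own parked set adds nothing once the outer layer closes the resource fully on shipped(), which is the redundancy the comment points out.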
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108080#comment-15108080 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

I will discuss and check further with [~dvdreddy] whether there is still a leak, and will raise further JIRAs to track down the issue if it still exists in their branch.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108073#comment-15108073 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

Further cleanup of the delayed-close handling in StoreScanner can be done in a follow-up JIRA. Will commit v4 as is to trunk.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108063#comment-15108063 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

bq. Adding those extra close(false) might not be really needed
Yes, no problem. It is only for consistency with the other code changes and will definitely not impact this JIRA. But the other changes, where the unused scanners are closed, may be helpful.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108061#comment-15108061 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

bq. I think we need some cleanup in this area.
In my previous patches for HBASE-13082 I added a TODO to clean it up. We can check that and do the necessary cleanup.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108051#comment-15108051 ]

Anoop Sam John commented on HBASE-15101:

Adding those extra close(false) calls might not really be needed. Based on the boolean returned from these scanners (hasMore), the scanner is lazily closed within KVHeap anyway.
One more thing: maybe within StoreScanner we no longer need the delayed closing set and related code (?). That was needed to handle concurrently running compactions, and now we continue to use the same compacted-away files for these scanners. The reader observer is for flushes only, right? I think we need some cleanup in this area.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108038#comment-15108038 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

bq. I added it to all places where we are returning the NO_MORE_VALUES, is there any other place I am missing ?
I was the one who missed it. It's fine now. I can commit this. +1
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108036#comment-15108036 ]

deepankar commented on HBASE-15101:

Adding the close did not help before, but I thought it should follow convention. I added it in all places where we return NO_MORE_VALUES; is there any other place I am missing?
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108030#comment-15108030 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

Once you update the patch, I can commit to trunk.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108029#comment-15108029 ]

ramkrishna.s.vasudevan commented on HBASE-15101:

Then you will need to add it in 2 more places. It makes sense to add close(false) there, since there are no more rows at that point anyway. Does adding close in those places help you? I am not very sure about that.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108017#comment-15108017 ]

deepankar commented on HBASE-15101:

Attached a patch with the close calls as well.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108013#comment-15108013 ]

deepankar commented on HBASE-15101:

Should I add the close calls before the return statement, as I mentioned above?
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108007#comment-15108007 ]

ramkrishna.s.vasudevan commented on HBASE-15101:
---
Shall I commit this patch to trunk and leave further analysis to subsequent JIRAs? [~dvdreddy], what do you say?
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106464#comment-15106464 ]

ramkrishna.s.vasudevan commented on HBASE-15101:
---
Can you try coming up with a test case to see whether this problem is reproducible as an FT? Start 2 region servers, insert some data, make sure a major compaction runs, and count the number of files after the compacted-file discharger chore runs.

Also, in your experimental setup, are none of the compacted files getting finalized, or are some of them getting finalized while others are missed? If nothing is getting finalized, it would be great to check the status of those files by adding some debug logs in HStore.closeAndArchiveStoreFiles() (I am not sure of the exact method name, but something like this was added to do the finalization).
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106413#comment-15106413 ]

deepankar commented on HBASE-15101:
---
My understanding is that before HBASE-13082, while a compaction is running the new files sit in the .tmp directory (of the region folder) and are finalized only once it completes. That leaves a very small window (between moving the new files in from .tmp and moving the old files out of the RegionServer) in which all the files are present.

This is not the case after HBASE-13082, because both sets of files are present in the folder for a longer period of time, and if there is any leak in the reference counting then all the files coexist, which can lead to a region size explosion.

This is exactly what happened to us. Without this patch we were running one regionserver with HBASE-13082, and almost all the regions on that server had all the files from the time that regionserver started or the region moved to that server (movement rarely happens). Worse, we force a major compaction of regions daily, which led to the region data being repeated over 7 times. When in a panic we shut down this server (gracefully), the other regionservers that then hosted these regions kept compacting for the whole next day (as each region contained 5-7x the data of a normal region).

We then applied this patch and hosted only two regions on this experimental regionserver for 2 days, and the same thing repeated: when we again shut down the regionserver (again gracefully), all the files remained in the directory, and it led to a longer compaction the next time.

If we can come up with a patch for the leak, I could take a stab at testing again. I will also go through close() to see if I am missing anything.

Thanks
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15106396#comment-15106396 ]

ramkrishna.s.vasudevan commented on HBASE-15101:
---
The patch looks good to me. Yes, having seen the code the patch touches, I think the shorter version is the easiest way.

Regarding your point that a Region Server dies and you end up seeing a lot of uncollected files: in the version without HBASE-13082, suppose a set of compactions started but the RS was killed or died before they completed. All those store files will be used again by subsequent readers when the regions are opened on a new RS, and another compaction needs to run to move the files to the archive dir.

Even after HBASE-13082 the scenario is similar: once the compaction is done, if the RS dies before the compaction discharger thread kicks in, you end up with the store files available for reads again, and you need one more round of compaction on the new RS to move the files to the archive dir.

After applying the patch, do you still see cases where the compacted files are not cleared? Do you mean there is always a set of files that is not cleared?

As for doing the cleanup during close(), I think we already do it in HStore.close(), ensuring that the compacted files are moved to the archive dir. It would be good to check this before we commit the updated patch to branch-1.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102936#comment-15102936 ]

Ted Yu commented on HBASE-15101:
---
bq. not being preceded with close

I think close() should be added in those places. I ran the test suite with the above addition of close() and the latest patch. It looks like there was no repeatable test failure.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102926#comment-15102926 ]

Hadoop QA commented on HBASE-15101:
---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 2m 34s | master passed |
| +1 | compile | 0m 46s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 35s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 4m 42s | master passed |
| +1 | mvneclipse | 0m 17s | master passed |
| -1 | findbugs | 2m 6s | hbase-server in master has 81 extant Findbugs warnings. |
| +1 | javadoc | 0m 30s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 35s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | compile | 0m 38s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 38s | the patch passed |
| +1 | checkstyle | 4m 52s | the patch passed |
| +1 | mvneclipse | 0m 17s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 23m 8s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 11s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 36s | the patch passed with JDK v1.7.0_79 |
| -1 | unit | 95m 53s | hbase-server in the patch failed with JDK v1.8.0. |
| +1 | unit | 89m 29s | hbase-server in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 13s | Patch does not generate ASF License warnings. |
| | | 231m 55s | |

|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.security.token.TestZKSecretWatcher |
| JDK v1.8.0 Timed out junit tests | org.apache.hadoop.hbase.client.replication.TestReplicationAdminWithClusters |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782620/HBASE-15101-v3.patch |
| JIRA Issue | HBASE-15101 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf903.gq1.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102569#comment-15102569 ]

deepankar commented on HBASE-15101:
---
In StoreScanner, can close(false) not being called lead to these references getting leaked? I am seeing a couple of places where the statement
{code}
return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
{code}
is not preceded by close, and some places where it is.

Also, the shipped() method in StoreFileScanner does not contain the decrement. Is that because, in the case of Scan.next(), we want to keep scanning the same files?
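To make the concern above concrete, here is a minimal, self-contained sketch of how an exit path that skips close() can leave a reference count stuck above zero. It is not HBase code: `CountedReader`, `CountedScanner`, and the `closeOnExhaustion` flag are hypothetical names invented for illustration, and the real StoreScanner logic is considerably more involved.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a store file reader that counts its open scanners.
final class CountedReader {
    private final String name;
    private final AtomicInteger refCount = new AtomicInteger();

    CountedReader(String name) { this.name = name; }

    // Opening a scanner takes a reference on the reader.
    CountedScanner openScanner() {
        refCount.incrementAndGet();
        return new CountedScanner(this);
    }

    void release() { refCount.decrementAndGet(); }
    int refCount() { return refCount.get(); }
    String name() { return name; }
}

// Hypothetical scanner. close() is idempotent, so closing twice
// cannot drive the count below zero.
final class CountedScanner implements AutoCloseable {
    private final CountedReader reader;
    private boolean closed;

    CountedScanner(CountedReader reader) { this.reader = reader; }

    // Returns false when there is nothing more to read. The flag models
    // whether this "no more values" exit path calls close() first.
    boolean next(boolean hasData, boolean closeOnExhaustion) {
        if (!hasData) {
            if (closeOnExhaustion) {
                close();
            }
            return false;
        }
        return true;
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            reader.release();
        }
    }
}

final class ScannerLeakDemo {
    public static void main(String[] args) {
        CountedReader leakyReader = new CountedReader("hfile-a");
        leakyReader.openScanner().next(false, false);   // exits without close()

        CountedReader cleanReader = new CountedReader("hfile-b");
        cleanReader.openScanner().next(false, true);    // closes before returning

        System.out.println(leakyReader.name() + " refCount = " + leakyReader.refCount());
        System.out.println(cleanReader.name() + " refCount = " + cleanReader.refCount());
    }
}
```

In this toy model the first reader is left with one outstanding reference and the second with none, which is the kind of difference that would decide whether a compacted file can ever be finalized.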
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102559#comment-15102559 ]

deepankar commented on HBASE-15101:
---
Ah sorry, fixed it
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101816#comment-15101816 ]

Ted Yu commented on HBASE-15101:
---
{code}
 * Returns a pair of lists of scanners where the first element is the
 * selected scanners and the second element is the ignored scanners
{code}
The above no longer matches code in patch v2.
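For readers following the javadoc discussion, the sketch below shows one possible shape for a selection method that returns only the selected scanners and closes the ignored ones immediately, instead of handing back a pair of lists. This is an illustration under assumptions, not the contents of any attached patch: `TrackedScanner`, `ScannerSelection.selectAndCloseRest`, and the predicate-based filter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical minimal scanner that only remembers whether it was closed.
final class TrackedScanner implements AutoCloseable {
    final String file;
    private boolean closed;

    TrackedScanner(String file) { this.file = file; }

    boolean isClosed() { return closed; }

    @Override
    public void close() { closed = true; }
}

final class ScannerSelection {
    // Keeps the scanners accepted by the filter and closes every rejected
    // scanner right away, so no rejected scanner can keep holding a
    // reference on its underlying file.
    static List<TrackedScanner> selectAndCloseRest(List<TrackedScanner> all,
                                                   Predicate<TrackedScanner> wanted) {
        List<TrackedScanner> selected = new ArrayList<>(all.size());
        for (TrackedScanner scanner : all) {
            if (wanted.test(scanner)) {
                selected.add(scanner);
            } else {
                scanner.close();
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<TrackedScanner> all = Arrays.asList(
            new TrackedScanner("f1"), new TrackedScanner("f2"), new TrackedScanner("f3"));

        // Assumed filter for the demo: skip the scanner over file "f2".
        List<TrackedScanner> selected = selectAndCloseRest(all, s -> !s.file.equals("f2"));

        for (TrackedScanner s : all) {
            System.out.println(s.file + " selected=" + selected.contains(s)
                + " closed=" + s.isClosed());
        }
    }
}
```

With a method shaped like this, the javadoc would describe a single returned list, which is consistent with the remark that the "pair of lists" wording had gone stale.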
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101476#comment-15101476 ]

Hadoop QA commented on HBASE-15101:
---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 1s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 2m 35s | master passed |
| +1 | compile | 0m 47s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 31s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 4m 25s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| -1 | findbugs | 2m 6s | hbase-server in master has 83 extant Findbugs warnings. |
| +1 | javadoc | 0m 36s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 38s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 58s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 58s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | checkstyle | 4m 37s | the patch passed |
| +1 | mvneclipse | 0m 18s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 23m 15s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 34s | the patch passed |
| +1 | javadoc | 0m 33s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 45s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 103m 44s | hbase-server in the patch passed with JDK v1.8.0. |
| +1 | unit | 91m 23s | hbase-server in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 14s | Patch does not generate ASF License warnings. |
| | | 242m 1s | |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782436/HBASE-15101-v2.patch |
| JIRA Issue | HBASE-15101 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098982#comment-15098982 ]

Hadoop QA commented on HBASE-15101:
---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 3m 10s | master passed |
| +1 | compile | 0m 47s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 36s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 4m 44s | master passed |
| +1 | mvneclipse | 0m 17s | master passed |
| -1 | findbugs | 2m 12s | hbase-server in master has 83 extant Findbugs warnings. |
| +1 | javadoc | 0m 34s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 38s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 50s | the patch passed |
| +1 | compile | 0m 51s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 51s | the patch passed |
| +1 | compile | 0m 41s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 41s | the patch passed |
| -1 | checkstyle | 4m 58s | Patch generated 1 new checkstyle issues in hbase-server (total was 146, now 147). |
| +1 | mvneclipse | 0m 16s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 24m 34s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 16s | the patch passed |
| +1 | javadoc | 0m 25s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 32s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 87m 24s | hbase-server in the patch passed with JDK v1.8.0. |
| -1 | unit | 85m 1s | hbase-server in the patch failed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 14s | Patch does not generate ASF License warnings. |
| | | 221m 25s | |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782329/HBASE-15101-v1.patch |
| JIRA Issue | HBASE-15101 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenk
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098603#comment-15098603 ]

deepankar commented on HBASE-15101:
---

Should we also finalize the files when a region is closed (maybe with logging to find the store references)? I think that would prevent adverse effects.
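For discussion only, the region-close idea in the comment above might look roughly like the sketch below. It is not HBase code: `CompactedFileSketch`, `RegionCloseSketch`, and `archiveOnClose` are hypothetical names, and the reference count is modeled as a plain integer. The assumption is that on region close every already-compacted file is archived, and any file that still has a non-zero reference count is logged first so the leak can be traced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical model of a compacted store file and its outstanding reader references.
final class CompactedFileSketch {
  final String name;
  final int refCount;

  CompactedFileSketch(String name, int refCount) {
    this.name = name;
    this.refCount = refCount;
  }
}

final class RegionCloseSketch {
  private static final Logger LOG = Logger.getLogger(RegionCloseSketch.class.getName());

  private RegionCloseSketch() {}

  // Assumption: at region close no scanner can still legitimately be reading,
  // so a non-zero count indicates a leaked reference. Log it, then archive anyway.
  static List<String> archiveOnClose(List<CompactedFileSketch> compactedFiles) {
    List<String> archived = new ArrayList<>(compactedFiles.size());
    for (CompactedFileSketch file : compactedFiles) {
      if (file.refCount != 0) {
        LOG.warning("Compacted file " + file.name + " still has " + file.refCount
            + " reference(s) at region close; archiving anyway");
      }
      // Stand-in for moving the file to the archive directory.
      archived.add(file.name);
    }
    return archived;
  }
}
```

Whether archiving a file that still shows references is actually safe is exactly the open question in the comment; the sketch only shows where the logging would sit.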
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098566#comment-15098566 ]

deepankar commented on HBASE-15101:
---

I haven't backported HBASE-15027 yet; I will backport it now. With just this patch (without HBASE-15027) the problem was not solved, as the compacted files are still being retained. Is there any other place where we are missing references? Thanks for the help. I will update you with test results after backporting HBASE-15027.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098560#comment-15098560 ]

deepankar commented on HBASE-15101:
---

Done.
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098213#comment-15098213 ]

Anoop Sam John commented on HBASE-15101:
---

The analysis seems reasonable. Are there any other places we missed like this? We need to check, but in any case that can be a separate JIRA.

{code}
if (kvs.shouldUseScanner(scan, store, expiredTimestampCutoff)) {
  selectedScanners.add(kvs);
} else {
  ignoredScanners.add(kvs);
}
{code}

Can we just change the existing method selectScannersFrom so that, instead of adding the scanner to a new list, it closes the scanner directly there? There is no need for the overhead of creating another ArrayList and a Pair. It can be a simple, direct change.
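A minimal sketch of the "close directly" suggestion above, assuming simplified stand-in types rather than the real HBase classes: `SketchScanner`, `FlagScanner`, and `ScannerSelection` are hypothetical, and `shouldUse()` stands in for `kvs.shouldUseScanner(scan, store, expiredTimestampCutoff)`. A scanner that is not selected is closed inside the loop, so no second list and no Pair are needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a store scanner; not the HBase interface.
interface SketchScanner {
  boolean shouldUse(); // stands in for shouldUseScanner(scan, store, expiredTimestampCutoff)
  void close();        // releases the reference held on the underlying reader
}

// Trivial implementation that records whether close() was called.
final class FlagScanner implements SketchScanner {
  private final boolean use;
  boolean closed;

  FlagScanner(boolean use) {
    this.use = use;
  }

  @Override
  public boolean shouldUse() {
    return use;
  }

  @Override
  public void close() {
    closed = true;
  }
}

final class ScannerSelection {
  private ScannerSelection() {}

  // Keeps the scanners that should be used and closes the rest immediately,
  // instead of collecting them in a separate "ignored" list.
  static List<SketchScanner> selectScannersFrom(List<? extends SketchScanner> allScanners) {
    List<SketchScanner> selectedScanners = new ArrayList<>(allScanners.size());
    for (SketchScanner kvs : allScanners) {
      if (kvs.shouldUse()) {
        selectedScanners.add(kvs);
      } else {
        kvs.close();
      }
    }
    return selectedScanners;
  }
}
```

The trade-off is that the method now has a side effect on its input, so callers must not reuse the discarded scanners afterwards.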
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098058#comment-15098058 ]

ramkrishna.s.vasudevan commented on HBASE-15101:
---

Please give me a couple of days to check this patch. First of all, thanks for testing this. I hope you have also backported HBASE-15027. The change looks good to me, but did you see your problem solved after applying this patch?
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097438#comment-15097438 ]

Hadoop QA commented on HBASE-15101:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 2m 49s | master passed |
| +1 | compile | 0m 46s | master passed with JDK v1.8.0 |
| +1 | compile | 0m 31s | master passed with JDK v1.7.0_79 |
| +1 | checkstyle | 4m 21s | master passed |
| +1 | mvneclipse | 0m 15s | master passed |
| -1 | findbugs | 1m 51s | hbase-server in master has 83 extant Findbugs warnings. |
| +1 | javadoc | 0m 25s | master passed with JDK v1.8.0 |
| +1 | javadoc | 0m 32s | master passed with JDK v1.7.0_79 |
| +1 | mvninstall | 0m 41s | the patch passed |
| +1 | compile | 0m 34s | the patch passed with JDK v1.8.0 |
| +1 | javac | 0m 34s | the patch passed |
| +1 | compile | 0m 31s | the patch passed with JDK v1.7.0_79 |
| +1 | javac | 0m 31s | the patch passed |
| -1 | checkstyle | 4m 5s | Patch generated 1 new checkstyle issues in hbase-server (total was 146, now 147). |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | hadoopcheck | 21m 2s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed with JDK v1.8.0 |
| +1 | javadoc | 0m 32s | the patch passed with JDK v1.7.0_79 |
| +1 | unit | 82m 59s | hbase-server in the patch passed with JDK v1.8.0. |
| +1 | unit | 79m 47s | hbase-server in the patch passed with JDK v1.7.0_79. |
| +1 | asflicense | 0m 13s | Patch does not generate ASF License warnings. |
| | | 205m 5s | |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782139/HBASE-15101.patch |
| JIRA Issue | HBASE-15101 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/j
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097147#comment-15097147 ]

deepankar commented on HBASE-15101:
---

I can add a basic unit test that checks whether dummy scanners are closed. I did not have time to add a full-fledged test that reproduces the bug.
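As a rough illustration of what such a basic check could assert, the sketch below pairs a dummy reader that counts references with a dummy scanner that releases its reference on close. `DummyReader` and `DummyScanner` are hypothetical names invented here, not classes from the patch or from HBase. The idea is that a test can compare the reader's reference count before and after the code under test discards scanners.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reader that counts how many open scanners still reference it.
final class DummyReader {
  private final AtomicInteger refCount = new AtomicInteger();

  DummyScanner openScanner() {
    refCount.incrementAndGet();
    return new DummyScanner(this);
  }

  void release() {
    refCount.decrementAndGet();
  }

  int getRefCount() {
    return refCount.get();
  }
}

// Hypothetical scanner that gives its reference back exactly once.
final class DummyScanner implements AutoCloseable {
  private final DummyReader reader;
  private boolean closed;

  DummyScanner(DummyReader reader) {
    this.reader = reader;
  }

  boolean isClosed() {
    return closed;
  }

  @Override
  public void close() {
    if (!closed) { // idempotent: a second close must not decrement again
      closed = true;
      reader.release();
    }
  }
}
```

A test built on these would open several scanners, run the selection logic, and assert that the reader's count equals the number of scanners that were kept; a leaked (unclosed) scanner shows up as a count that is too high.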
[jira] [Commented] (HBASE-15101) Leaked References to StoreFile.Reader after HBASE-13082
[ https://issues.apache.org/jira/browse/HBASE-15101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15097133#comment-15097133 ]

Ted Yu commented on HBASE-15101:
---

Is it possible to add a test for this bug?