[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589965#comment-16589965
 ] 

Hudson commented on HBASE-21031:


Results for branch master
[build #450 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/450/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/450//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/450//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/450//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.1.1, 2.0.2
>
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-23 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589814#comment-16589814
 ] 

Hudson commented on HBASE-21031:


Results for branch branch-2
[build #1148 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1148/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1148//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1148//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1148//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
-- Something went wrong with this stage, [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1148//console].


> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.1.1, 2.0.2
>
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589621#comment-16589621
 ] 

Hudson commented on HBASE-21031:


Results for branch branch-2.1
[build #226 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/226//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.1.1, 2.0.2
>
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589594#comment-16589594
 ] 

Hudson commented on HBASE-21031:


Results for branch branch-2.0
[build #715 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/715/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/715//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/715//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/715//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16588076#comment-16588076
 ] 

Hadoop QA commented on HBASE-21031:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
48s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
47s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
55s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
52s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 18s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}130m  
6s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 38s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936503/HBASE-21031.branch-2.0.006.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4193c8128c80 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / dcf8a23183 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14119/testReport/ |
| Max. process+thread count | 4579 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14119/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-21 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587891#comment-16587891
 ] 

stack commented on HBASE-21031:
---

Retry while [~allan163] is offline.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, HBASE-21031.branch-2.0.006.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-21 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587558#comment-16587558
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 1s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
57s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
12s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m  2s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12936434/HBASE-21031.branch-2.0.006.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2cc12df77480 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / efa54012b4 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14111/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14111/testReport/ |
| Max. process+thread count | 4149 (vs. ulimit of 1) |
| modules 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-21 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587483#comment-16587483
 ] 

Mike Drob commented on HBASE-21031:
---

+1 assuming QA agrees, thanks for working on this!

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-21 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587350#comment-16587350
 ] 

Allan Yang commented on HBASE-21031:


{quote}
+ LOG.error("replay failed, that's expected", t);
I think the test is better without this since we already fail the test if the 
replay doesn't fail, so this is unneeded lines in the log. 
{quote}
Done!
{quote}
please try to put line breaks in between words instead of in the middle of 
words.
{quote}
Done!
{quote}
Also, I responded to your other question about my comments for the 
try/catch/finally directly on RB. Let me know if that doesn't make sense or if 
you think it's better to leave them as it is.
{quote}
I think we'd better leave them there, since we only catch IOException here, but 
we need to abort the status whatever exception is thrown.
Thanks for reviewing,[~mdrob]. Can you take a look at HBASE-21041 too, it is 
another bug you mentioned in the review board.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> HBASE-21031.branch-2.0.006.patch, memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-15 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581501#comment-16581501
 ] 

Mike Drob commented on HBASE-21031:
---

{quote}
+  LOG.warn("Failed drop memstore of region= {}, s"
+  + "ome chunks may not released forever since MSLAB is 
enabled",
{quote}
please try to put line breaks in between words instead of in the middle of 
words.

Also, I responded to your other question about my comments for the 
try/catch/finally directly on RB. Let me know if that doesn't make sense or if 
you think it's better to leave them as it is.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-15 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581499#comment-16581499
 ] 

Mike Drob commented on HBASE-21031:
---

bq. +LOG.error("replay failed, that's expected", t);
I think the test is better without this since we already fail the test if the 
replay doesn't fail, so this is unneeded lines in the log. Especially stack 
trace at error will make it stick out and harder to diagnose other failures.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577798#comment-16577798
 ] 

Allan Yang commented on HBASE-21031:


Thanks for reviewing, [~yuzhih...@gmail.com].
Any other comments? [~mdrob]

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, HBASE-21031.branch-2.0.005.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577686#comment-16577686
 ] 

Hadoop QA commented on HBASE-21031:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
23s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
8s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
49s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
32s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 43s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m 
25s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935287/HBASE-21031.branch-2.0.005.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 5c4c8edeed01 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / aa83594b84 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14013/testReport/ |
| Max. process+thread count | 4177 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14013/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577592#comment-16577592
 ] 

Ted Yu commented on HBASE-21031:


Looks good overall. 

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> HBASE-21031.branch-2.0.004.patch, memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577584#comment-16577584
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
50s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 5s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
13s{color} | {color:red} hbase-server: The patch generated 1 new + 250 
unchanged - 0 fixed = 251 total (was 250) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}108m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}147m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935281/HBASE-21031.branch-2.0.004.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2ce66c307bf7 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / aa83594b84 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14010/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14010/testReport/ |
| Max. process+thread count | 4088 (vs. ulimit of 1) |
| modules | C: hbase-server U: 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577496#comment-16577496
 ] 

Allan Yang commented on HBASE-21031:


{quote}
Can you come up with a test which fails if dropMemStoreContents only rolls back 
single region (the region which encounters Throwable) ?
Thanks
{quote}
Sorry, maybe the comment in the patch is misleading, we only rollback the 
problematic region's memtore if opening fails, not all region's.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-12 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16577482#comment-16577482
 ] 

Allan Yang commented on HBASE-21031:


[~yuzhih...@gmail.com], just checked, master branch and branch-2 has 
HBASE-20542, so the problem has fixed. I think we can check in this one only to 
branch-2.0 and branch-2.1? 

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576494#comment-16576494
 ] 

Ted Yu commented on HBASE-21031:


Can you come up with a test which fails if {{dropMemStoreContents}} only rolls 
back single region (the region which encounters Throwable) ?

Thanks

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576492#comment-16576492
 ] 

Allan Yang commented on HBASE-21031:


{quote}
TestRecoveredEidtsReplayAndAbort passes with the above change.
{quote}
Sorry, [~yuzhih...@gmail.com], I missed your point.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576466#comment-16576466
 ] 

Ted Yu commented on HBASE-21031:


I modified the dropMemStoreContents() method by passing it the region which 
encounters Throwable.
{code}
  public MemStoreSize dropMemStoreContents(HRegion r) throws IOException {
...
  for (HStore s : stores.values()) {
if (!s.getHRegion().equals(r)) continue;
{code}
TestRecoveredEidtsReplayAndAbort passes with the above change.

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, HBASE-21031.branch-2.0.003.patch, 
> memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576432#comment-16576432
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
23s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
58s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
27s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
15s{color} | {color:red} hbase-server: The patch generated 1 new + 250 
unchanged - 0 fixed = 251 total (was 250) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m  
7s{color} | {color:red} patch has 10 errors when building our shaded downstream 
artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}101m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
19s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935134/HBASE-21031.branch-2.0.003.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2ad2799203a4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| shadedjars | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/artifact/patchprocess/patch-shadedjars.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/14002/testReport/ |
| 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576025#comment-16576025
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
52s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
19s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
21s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
14s{color} | {color:red} hbase-server: The patch generated 1 new + 250 
unchanged - 0 fixed = 251 total (was 250) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 
11s{color} | {color:red} patch has 10 errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
30s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m 
30s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Redundant nullcheck of region which is known to be null in 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion()  
Redundant null check at OpenRegionHandler.java:is known to be null in 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion()  
Redundant null check at OpenRegionHandler.java:[line 307] |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12935089/HBASE-21031.branch-2.0.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 72a3918f2f0b 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 7ee4aa459c |
| maven | version: Apache Maven 3.5.4 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-10 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575858#comment-16575858
 ] 

Allan Yang commented on HBASE-21031:


[~mdrob], uploaded the new patch to the review board, feel free to give 
advices. Thanks!

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, 
> HBASE-21031.branch-2.0.002.patch, memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575027#comment-16575027
 ] 

Hadoop QA commented on HBASE-21031:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
51s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
10s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
15s{color} | {color:red} hbase-server: The patch generated 14 new + 303 
unchanged - 0 fixed = 317 total (was 303) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 6s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 23s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m 
13s{color} | {color:red} hbase-server generated 1 new + 0 unchanged - 0 fixed = 
1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 11s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}143m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
|  |  Null pointer dereference of region in 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion() on 
exception path  Dereferenced at OpenRegionHandler.java:in 
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion() on 
exception path  Dereferenced at OpenRegionHandler.java:[line 308] |
| Failed junit tests | hadoop.hbase.regionserver.TestHRegion |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21031 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12934976/HBASE-21031.branch-2.0.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b022565dbae1 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 
10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2.0 / 

[jira] [Commented] (HBASE-21031) Memory leak if replay edits failed during region opening

2018-08-09 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16574986#comment-16574986
 ] 

Mike Drob commented on HBASE-21031:
---

Very interesting failure scenario, Allan. Great job diagnosing it.

I think I have feedback for the patch, would you mind uploading to review board?

> Memory leak if replay edits failed during region opening
> 
>
> Key: HBASE-21031
> URL: https://issues.apache.org/jira/browse/HBASE-21031
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-21031.branch-2.0.001.patch, memoryleak.png
>
>
> Due to HBASE-21029, when replaying edits with a lot of same cells, the 
> memstore won't flush,  a exception will throw when all heap space was used:
> {code}
> 2018-08-06 15:52:27,590 ERROR 
> [RS_OPEN_REGION-regionserver/hb-bp10cw4ejoy0a2f3f-009:16020-2] 
> handler.OpenRegionHandler(302): Failed open of 
> region=hbase_test,dffa78,1531227033378.cbf9a2daf3aaa0c7e931e9c9a7b53f41., 
> starting to roll back the global memstore size.
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
> at 
> org.apache.hadoop.hbase.regionserver.OnheapChunk.allocateDataBuffer(OnheapChunk.java:41)
> at org.apache.hadoop.hbase.regionserver.Chunk.init(Chunk.java:104)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:226)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.ChunkCreator.getChunk(ChunkCreator.java:163)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.getOrMakeChunk(MemStoreLABImpl.java:273)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:148)
> at 
> org.apache.hadoop.hbase.regionserver.MemStoreLABImpl.copyCellInto(MemStoreLABImpl.java:111)
> at 
> org.apache.hadoop.hbase.regionserver.Segment.maybeCloneWithAllocator(Segment.java:178)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.maybeCloneWithAllocator(AbstractMemStore.java:287)
> at 
> org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:107)
> at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.restoreEdit(HRegion.java:5494)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEdits(HRegion.java:4608)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.replayRecoveredEditsIfAny(HRegion.java:4404)
> {code}
> After this exception, the memstore did not roll back, and since MSLAB is 
> used, all the chunk allocated won't release for ever. Those memory is leak 
> forever...
> We need to rollback the memory if open region fails(For now, only global 
> memstore size is decreased after failure).
> Another problem is that we use replayEditsPerRegion in RegionServerAccounting 
> to record how many memory used during replaying. And decrease the global 
> memstore size if replay fails. This is not right, since during replaying, we 
> may also flush the memstore, the size in the map of replayEditsPerRegion is 
> not accurate at all! 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)